Save & Serve tfrs that does not recommend items previously interacted with using BruteForce layer #400

Open
yrianderreumaux opened this issue Nov 4, 2021 · 5 comments

@yrianderreumaux

I am working with a relatively small data set and am noticing that users are being recommended many items they have previously interacted with. I would like to pre-filter these items out before saving and serving the model on AI Platform. I realize this could also be done after the list has been generated, but the way our app works makes it rather difficult (and slow) to filter post hoc on the client side.

I have seen a few other issues addressing related questions (e.g., 307, 113). However, I have not seen a definitive solution, and these seem to deal with either excluding a set of items for all users (rather than a unique set for each user), or excluding items from test recommendations.

Currently, I generate an index with 80 recommended items: index = tfrs.layers.factorized_top_k.BruteForce(model.user_model, k = 80)

and then remove any duplicates that existed in the training df: index.index_from_dataset( tf.data.Dataset.zip((unique_recipe_id_pred.batch(80), unique_recipe_id_pred.batch(80).map(model.recipe_model))))

This works well, but how could I also exclude items in the index that users have interacted with? Could the query with exclusions function be a possible solution?

Apologies if I have missed something here and thanks in advance for any advice!

@msvensson222

msvensson222 commented Nov 4, 2021

I'm not sure why you would not use the query with exclusions feature; it seems to fit your use case. However, if for some reason you cannot do that, one idea that might work is to make your own version of an "index", e.g. get the user and recipe embeddings and do the math yourself.

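import numpy as np
import tensorflow as tf

# First, select all recipes that your user(s) have not interacted with previously
possible_candidates_for_a_user = ...

# Keep the candidate ids as an array so they can be indexed later
# (assumes possible_candidates_for_a_user is a tf.data.Dataset of recipe ids)
candidates = np.array(list(possible_candidates_for_a_user.as_numpy_iterator()))

# Get candidate embeddings for these recipes
candidate_embeddings = recipe_model.predict(possible_candidates_for_a_user.batch(256))

# Get user embeddings
user_dict = {"customer_no": customer}  # Add any other user features you have as well
user_embeddings = user_model.predict(tf.data.Dataset.from_tensor_slices(user_dict).batch(256))

# Get top-k
k = 80

# Get a score for each recipe: shape [num_users, num_candidates]
scores = np.dot(user_embeddings, candidate_embeddings.T)
# Indices of the k highest scores per user, best first
indices = np.argpartition(scores, range(-k, 0), axis=1)[:, :-(k + 1):-1]

top_k_recipes_to_recommend = candidates[indices]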

Thoughts?

@yrianderreumaux
Author

Thanks @msvensson222 for engaging with the question!
Your proposed solution is a creative one and something I would not have considered.

Two quick follow-up questions: (1) I can see how this would work with a single user, but it's still not clear how this would work with a set of users who each interacted with a different list of recipes. For the latter, would possible_candidates_for_a_user be a dictionary with each unique user id referencing a list of all previously seen recipe ids? (2) It's important for me to save the model as a graph that takes in raw JSON user ids so that it can be hosted on AI Platform for on-demand predictions. How could this version of an index be saved in such a way?
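On question (1), one way the "do the math yourself" idea might extend to many users at once is to score every user against every candidate and then mask out each user's own history before taking the top k. The following is only a minimal NumPy sketch under assumptions: the function name `top_k_with_exclusions`, the `seen_by_user` structure (one set of seen ids per user row), and the toy embeddings are all hypothetical and not part of TFRS.

```python
import numpy as np


def top_k_with_exclusions(user_embeddings, candidate_embeddings,
                          candidate_ids, seen_by_user, k):
    """Top-k candidate ids per user, skipping ids that user has already seen.

    seen_by_user is a hypothetical structure: one set of candidate ids per
    row of user_embeddings.
    """
    users = np.asarray(user_embeddings, dtype=float)
    cands = np.asarray(candidate_embeddings, dtype=float)
    ids = np.asarray(candidate_ids)

    # Dot-product scores, shape [num_users, num_candidates].
    scores = users @ cands.T

    # Map each candidate id to its column so histories can be masked.
    id_to_col = {cid: col for col, cid in enumerate(candidate_ids)}
    for row, seen in enumerate(seen_by_user):
        cols = [id_to_col[cid] for cid in seen if cid in id_to_col]
        if cols:
            scores[row, cols] = -np.inf  # seen items can never win

    # Sort descending per user and keep the first k columns.
    order = np.argsort(-scores, axis=1, kind="stable")[:, :k]
    return ids[order], np.take_along_axis(scores, order, axis=1)


# Toy data (made up for illustration): 2 users, 3 recipes.
toy_users = [[1.0, 0.0], [0.0, 1.0]]
toy_recipes = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
toy_ids = ["a", "b", "c"]
toy_seen = [{"a"}, {"c"}]

top_ids, top_scores = top_k_with_exclusions(
    toy_users, toy_recipes, toy_ids, toy_seen, k=2)
print(top_ids.tolist())
```

If a user has seen so many items that fewer than k unseen candidates remain, masked items (with score `-inf`) would fill the tail of the result, so a caller would probably want to drop those.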

@patrickorlando

You need to track the state of what users have interacted with separately from the model.
If for each user you maintain a list of item_ids that the user has interacted with, you can then filter them out in your caller API, or wrap the BruteForce Index in a model with some logic to remove them.
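The caller-API variant of this could look roughly like the sketch below: keep a per-user set of seen item ids, request a few more recommendations than needed, and drop the seen ones. Everything here is hypothetical and for illustration only: `recommend_unseen`, the `fetch_top_n` callable (a stand-in for whatever call reaches the served model), and the in-memory `seen_items` dictionary (a stand-in for a real interaction store).

```python
from typing import Callable, Dict, List, Set


def recommend_unseen(user_id: str,
                     k: int,
                     fetch_top_n: Callable[[str, int], List[str]],
                     seen_items: Dict[str, Set[str]]) -> List[str]:
    """Return up to k recommended item ids the user has not interacted with."""
    seen = seen_items.get(user_id, set())
    # Over-fetch by the size of the history so that k items can still
    # remain even if every seen item appears in the raw ranking.
    raw = fetch_top_n(user_id, k + len(seen))
    return [item for item in raw if item not in seen][:k]


# Stand-in for the served model: returns the same fixed ranking for everyone.
FAKE_RANKING = ["r1", "r2", "r3", "r4", "r5"]


def fake_fetch_top_n(user_id: str, n: int) -> List[str]:
    return FAKE_RANKING[:n]


history = {"u1": {"r1", "r3"}}
print(recommend_unseen("u1", 2, fake_fetch_top_n, history))
print(recommend_unseen("u2", 2, fake_fetch_top_n, history))
```

Over-fetching by the history size assumes the served model can return that many items; with an index built for a fixed k, the index would presumably need to be built with a k large enough to cover the desired list plus the longest expected history.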

@maciejkula
Collaborator

maciejkula commented Nov 18, 2021

Patrick's answer is spot on. Ultimately, every real production system has a business logic layer that sits between the model and the user. This layer will normally keep track of things like past interactions and filter those out of recommendations if that makes sense for the product in question.

@yrianderreumaux
Author

@maciejkula & @patrickorlando thank you both for your thoughts on this. Could you point me to any code for the second option, wrapping the BruteForce index in a model with logic? I fear my technical skills are not sufficient to figure it out alone. In the meantime, following your suggestion, we have added a middle (business) layer using React Native to filter out previously interacted-with items. It slows us down by ~0.5 seconds per call, but it has solved the problem.
