[Feature Request] Create demo for using dense, sparse and embedded features. #31
Would it be possible to learn a ranker from pairwise data where the features are latent factors (w/o any hand-made features)? Like a matrix factorization model? So the input to the pairwise loss would be the respective embeddings for the two documents you are ranking...
Yes, it is possible.
Yes, I was thinking about jointly training a document embedding. I have pairwise labels (A > B, etc.). For each labeled pair (A, B), I'll look up their embeddings (A_emb, B_emb) and use those as the document features. This would replace the classical LTR query-document features (I don't have any queries in my context anyway). Not sure what you mean w/ the …
Here's my example (modified from the TF-Ranking example) of using an embedding to learn a latent factor model:
I modified my input function to only return an … This gets good results on my ranking task for MRR: baseline .54 …
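The code itself was not preserved in this thread, but a minimal sketch of the feature-column side of such a latent factor model might look like the following. The `rid` key matches a later snippet in this thread; the bucket count and embedding size are illustrative assumptions, not taken from the original gist.

```python
import tensorflow as tf

# Illustrative sizes; in practice these would come from your data / hparams.
_NUM_ITEMS = 10000
_EMBEDDING_DIM = 32


def context_feature_columns():
  # No query/context features: ranking is driven purely by item embeddings.
  return {}


def example_feature_columns():
  # The only per-example feature is an item id. Its embedding acts as the
  # latent factor vector and is trained jointly with the pairwise loss.
  item_id = tf.feature_column.categorical_column_with_identity(
      key="rid", num_buckets=_NUM_ITEMS)
  return {"rid": tf.feature_column.embedding_column(item_id, _EMBEDDING_DIM)}
```

With this setup the scoring network never sees hand-made features; it only sees the learned embedding of each document id, which is exactly the matrix-factorization-style model described above.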
Thanks for sharing this example, Alex. This looks great. If you wish, you could define the feature columns outside, so that you can also use them to make the …
When you say "define your feature columns outside", do you mean like in the notebook example, where there is a …? Also, I don't understand what the …
Hello, I would just like to chime in that having an example of using feature columns where group_size and the feature dimension are not 1 would be helpful. I can build a groupwise feature tensor directly with dimension greater than 1, but this call raises an error:

```python
scores = _score_fn(
    large_batch_context_features, large_batch_group_features, reuse=False)
```

I think it has something to do with the shape expected by the feature column, but I'm unsure what the issue is here.
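For what it's worth, shape errors in this situation typically come from feeding the 3-D `[batch, group_size, embedding_dim]` tensors straight into dense layers. A sketch of a groupwise score function that flattens each feature first; the signature follows the TF-Ranking examples, while the group size and hidden sizes here are illustrative assumptions:

```python
import tensorflow as tf

_GROUP_SIZE = 2          # Illustrative; must match the ranking head's group_size.
_HIDDEN_UNITS = [64, 32] # Illustrative hidden layer sizes.


def _score_fn(context_features, group_features, mode, params, config):
  """Scores one group of examples jointly (a sketch, not library code).

  With an embedding column of dimension d, each entry of group_features has
  shape [batch, group_size, d]; flattening it to [batch, group_size * d]
  makes it a valid input to dense layers.
  """
  del mode, params, config  # Unused in this sketch.
  input_list = [
      tf.compat.v1.layers.flatten(context_features[name])
      for name in sorted(context_features)
  ] + [
      tf.compat.v1.layers.flatten(group_features[name])
      for name in sorted(group_features)
  ]
  cur = tf.concat(input_list, axis=1)
  for units in _HIDDEN_UNITS:
    cur = tf.compat.v1.layers.dense(cur, units=units, activation=tf.nn.relu)
  # One logit per example in the group.
  return tf.compat.v1.layers.dense(cur, units=_GROUP_SIZE)
```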
FWIW, I ran into the same issue as @darlliu.
Please check out the demo on using sparse features and embeddings in TF-Ranking. You can click on the Colab link to start executing the content of the notebook. TensorBoard is integrated into the notebook, and you can use it to visualize the eval and loss curves. Feel free to post your feedback by responding on this issue.
Thanks for posting some concrete examples in this new notebook. Some questions: …
And also, I think it would be nice to have the hyper-parameters passed through to the feature columns and the transform function:

```python
def example_feature_columns(params):
  rest_id = tf.feature_column.categorical_column_with_identity(
      key='rid', num_buckets=item_buckets)
  rest_emb = tf.feature_column.embedding_column(rest_id, params.K)
  return {"rid": rest_emb}


def make_transform_fn():
  def _transform_fn(features, mode, params):
    """Defines transform_fn."""
    example_name = next(six.iterkeys(example_feature_columns(params)))
    input_size = tf.shape(input=features[example_name])[1]
    context_features, example_features = tfr.feature.encode_listwise_features(
        features=features,
        input_size=input_size,
        context_feature_columns=context_feature_columns(),
        example_feature_columns=example_feature_columns(params),
        mode=mode,
        scope="transform_layer")
    return context_features, example_features
  return _transform_fn
```

See how I added `params`? The use case for this is the common one where the embedding dimensions are a hyper-parameter. Would you accept a PR from me for this?
Hi Alex, great set of questions. Please find my replies inline.
I like this PR suggestion. Please go ahead with it. One thing to keep in mind is that you will need to change the model_fn builder, which expects only the (features, mode) arguments. See this line for more details.
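Until such a change lands, a closure gives much the same effect without touching the model_fn builder: bind `params` when the transform_fn is constructed, so the function TF-Ranking sees keeps the expected (features, mode) signature. A sketch, reusing the `example_feature_columns(params)` and `context_feature_columns()` helpers from the snippet above:

```python
import tensorflow as tf
import tensorflow_ranking as tfr


def make_transform_fn(params):
  # `params` is captured by the closure, so _transform_fn itself keeps the
  # (features, mode) signature that the model_fn builder expects.
  def _transform_fn(features, mode):
    example_columns = example_feature_columns(params)
    example_name = next(iter(example_columns))
    input_size = tf.shape(input=features[example_name])[1]
    return tfr.feature.encode_listwise_features(
        features=features,
        input_size=input_size,
        context_feature_columns=context_feature_columns(),
        example_feature_columns=example_columns,
        mode=mode,
        scope="transform_layer")
  return _transform_fn
```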