replicate precision@k score with predict method (rather than predict_rank) #568
Comments
Hello! I believe the problem is in the order of the applied operations in the lines …

The conversion of …
Oh, of course. Thanks for pointing that out.
A follow-up question, if anyone still follows this thread: why does measuring performance with the `predict`-based solution given above yield the same results as the `predict_rank` method without setting the `train_interactions` argument? The way I understand it, passing the `train_interactions` argument to the `precision_at_k` function should be equivalent to the "exclude known positives from recommendations" step, but apparently it is not.
By the way, I made a change to the code and now it works as I expected: in the original code, the list-comprehension filter was not working because `known_positives_ids` is a `pd.Series` and not a list.
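For illustration: membership tests with `in` on a `pd.Series` check the index labels, not the values, which is why the filter silently failed. A small self-contained example (the variable name is taken from the snippet; the values are made up):

```python
import pandas as pd

known_positives_ids = pd.Series([11, 42, 97])  # made-up item IDs

# `in` on a Series tests the *index labels* (0, 1, 2 here), not the values:
print(42 in known_positives_ids)  # False - 42 is a value, not an index label
print(1 in known_positives_ids)   # True  - 1 is an index label

# Converting to a list makes the comprehension filter behave as intended:
candidates = [42, 500]
print([i for i in candidates if i not in known_positives_ids.tolist()])  # [500]
```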
Is precision@10 : 0.004322766792029142 good enough for the model? I'm actually working on another model, and its best precision is about 0.012.
Hi,

I tried to replicate the precision@k score resulting from the `precision_at_k` method using the `predict` method. The `precision_at_k` method is based on `predict_rank`, but since I have many items to rank for each user, the `predict` method is more suitable/faster. Clearly, whether one uses `predict_rank` or `predict` should not change the precision@k score, but I was unable to replicate the score I get from `precision_at_k` (based on `predict_rank`) with the `predict` method. In fact, the evaluation scores from the `predict` method are always worse than the evaluation scores produced by the `precision_at_k` method included in the package. Why is that?

Below is an example using open-source data. For simplicity, I'm using only a fraction of the data and a basic model without features, and known positives are not removed (the `train_interactions` argument is not passed to `precision_at_k`).

Why is this important: the `predict` method is more suitable in cases where many items need to be ranked, which is my use case as well. Also, I want to calculate NDCG for evaluation, and if I can replicate the precision@k score with `predict`, I know the post-processing of the predictions is set up correctly and I can just change the metric.

This gives precision@10 : 0.004322766792029142.
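A minimal sketch of this kind of setup, with MovieLens standing in for the open-source data (the dataset, hyperparameters, and `min_rating` threshold here are my assumptions, so the exact score will differ):

```python
from lightfm import LightFM
from lightfm.datasets import fetch_movielens
from lightfm.evaluation import precision_at_k

# MovieLens is a stand-in: the text above only says "open-source data".
data = fetch_movielens(min_rating=4.0)
train, test = data["train"], data["test"]

# Basic model, no user/item features.
model = LightFM(loss="warp", random_state=42)
model.fit(train, epochs=10)

# train_interactions is deliberately not passed, so known positives
# are NOT excluded from the evaluation.
print("precision@10:", precision_at_k(model, test, k=10).mean())
```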
Under the hood, `precision_at_k` uses the `predict_rank` method and derives the precision@k score from the resulting ranks.
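Roughly, that computation looks like the following sketch, simplified from LightFM's evaluation module (illustrative, not the verbatim source):

```python
import numpy as np

def precision_at_k_via_ranks(model, test_interactions, k=10):
    # predict_rank returns a sparse matrix holding, for every test positive,
    # its 0-based rank among all items for that user.
    ranks = model.predict_rank(test_interactions)
    # A test positive is a hit if it ranks inside the top k.
    ranks.data = np.less(ranks.data, k)
    precision = np.squeeze(np.array(ranks.sum(axis=1))) / k
    # Average only over users with at least one test positive.
    return precision[test_interactions.getnnz(axis=1) > 0].mean()
```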
Just to demonstrate: this again gives precision@10 : 0.004322766792029142.
My `predict`-based replication, by contrast, gives 0.0005763688760806917.
So, in summary: `predict_rank` gives a precision@k score of 0.004322766792029142, while the `predict` method gives 0.0005763688760806917.
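For reference, a `predict`-based precision@k that mirrors what `predict_rank` measures could look like the following sketch. It assumes integer user/item indices matching the interaction matrix and no ties in the scores, and like the setup above it does not exclude known positives; with the order of operations handled correctly (see the comments above), the two scores should agree up to ties:

```python
import numpy as np

def precision_at_k_via_predict(model, test_interactions, k=10):
    test = test_interactions.tocsr()
    n_users, n_items = test.shape
    item_ids = np.arange(n_items)
    precisions = []
    for user_id in range(n_users):
        # Test positives of this user, straight from the CSR structure.
        positives = test.indices[test.indptr[user_id]:test.indptr[user_id + 1]]
        if len(positives) == 0:
            continue  # precision_at_k also skips users without test positives
        scores = model.predict(user_id, item_ids)
        # Indices of the k highest-scored items (argsort is ascending).
        top_k = np.argsort(scores)[-k:]
        precisions.append(len(np.intersect1d(top_k, positives)) / k)
    return np.mean(precisions)
```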