Problem with r-precision #2
Comments
Hi Olivier, thanks for your interest in rank_eval! Yesterday, when I did the last commit, I noticed something was off in the code there. Thanks for your feedback. Have a good one, Elias
@osf9018 the issue is now fixed. Thanks again for your feedback! Closing.
Hi Elias,
Thanks for the fix.
Until recently, I focused mainly on recall, MAP, R-precision and ***@***.***, but since one of my students is using hits@k in the context of question answering, I am also considering including it in the measures I use. If I am not mistaken, you define hits@k as "the number of relevant documents retrieved". It is not so easy to find a reference definition for this measure, but I have the feeling that it is most likely defined as the fraction of queries for which at least one relevant document is found among the top-k retrieved documents. At the query level, it is a binary measure: 0 if no relevant document is found among the top-k retrieved documents and 1 if at least one relevant document is found.
I agree that in Information Retrieval we can speak of the number of hits to refer to the number of relevant documents among the documents retrieved for a query, but I have the feeling that hits@k generally refers to a measure with values in the range [0, 1]. Did you define your hits@k measure with a specific reference in mind?
Best regards,
Olivier
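To make the definition discussed here concrete, the following is a minimal sketch of a per-query hits@k with the "at least one relevant document in the top k" semantics. The function name hits_at_k is only illustrative and not part of the library; the sketch assumes that, for a single query, qrels maps relevant document ids to positive relevance judgments and run maps retrieved document ids to retrieval scores.
def hits_at_k(qrels, run, k):
    # Rank retrieved documents by descending score and keep the top k
    top_k = sorted(run, key=run.get, reverse=True)[:k]
    # Binary per-query outcome: 1 if at least one relevant document is retrieved, else 0
    return int(any(qrels.get(doc_id, 0) > 0 for doc_id in top_k))
Averaging this 0/1 value over all queries gives the fraction of queries with at least one relevant document among their top-k results, i.e., a value in [0, 1].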
Sorry Olivier, I do not understand what ***@***.*** means. Could you clarify, please?
Hi Elias,
I am not sure I understand what you don't understand, since I don't see any ***@*** in my message :-) Do you mean ***@***.***? More generally, my message was just about the definition of the hits measure.
Olivier
Sorry Olivier, I am confused: I see ***@***.*** in your messages on four different browsers on two devices.
I think I probably called that measure Hits because it is the (mean) number of relevant documents retrieved for each query. It is an integer value for each query, not a boolean one. I am sure I saw it in some paper, but I do not recall which one at the moment.
It is a sub-function of other metrics, so I decided to expose it in case someone wants to use it, but maybe I should hide it, as it is not that useful anyway.
For example, precision is something like:
def precision(qrels, run, k):
    # If k is 0, use the number of retrieved documents
    k = k if k != 0 else len(run)
    return hits(qrels, run, k) / k
while recall is something like:
def recall(qrels, run, k):
    # If k is 0, use the number of retrieved documents
    # In this case, k is used just to avoid useless computations,
    # as we divide the number of retrieved relevant documents (hits)
    # by the number of relevant documents later
    k = k if k != 0 else len(run)
    return hits(qrels, run, k) / len(qrels)
You can use it for analysis purposes if you want, but I suggest you stick with the other metrics for scientific evaluation / comparison.
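For completeness, a hits helper consistent with the two snippets above could look roughly like the following. This is only an illustrative sketch, not the library's actual implementation; it assumes that, for a single query, qrels maps relevant document ids to positive relevance judgments and run maps retrieved document ids to retrieval scores.
def hits(qrels, run, k):
    # If k is 0, use the number of retrieved documents
    k = k if k != 0 else len(run)
    # Rank retrieved documents by descending score and keep the top k
    top_k = sorted(run, key=run.get, reverse=True)[:k]
    # Count the retrieved documents that are judged relevant
    return sum(1 for doc_id in top_k if qrels.get(doc_id, 0) > 0)
Under these assumptions, precision@k divides this count by k, while recall divides it by the total number of relevant documents, exactly as in the two snippets above.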
Hi Elias,
Funny. I guess the message perhaps passed through a GitHub email address, and @ is interpreted as something special in that context.
I see your point, and it is what I had already understood from your code. Perhaps it would be less confusing to change the name of this metric in
https://github.com/AmenRa/rank_eval/blob/76b6e241b4c8a860e72c305d95204e5bc04d20bf/rank_eval/meta_functions.py#L29
into something like n_hits:
if metric == "hits":
    return hits
-->
if metric == "n_hits":
    return hits
This is only a suggestion to avoid misunderstandings from people who know "hits at k" as a metric with the definition I mentioned, but it is not essential.
Best regards,
Olivier
Hi Olivier, thanks again for your feedback. I would also like to let you know that my tool is going to change its name soon because of naming similarities with other tools. Best regards, Elias
Hi,
I tested your code and found that it was easy to use and integrate. Moreover, the results I got are fully coherent with those I previously obtained with a personal implementation of trec_eval, and the computation of the measures is fast. This is clearly an interesting piece of software, and its presentation in the demo session of ECIR 2022 is a good thing.
I only had a problem with the R-precision measure. The main issue is that if you replace "ndcg@5" with "r-precision" in the 4th cell of the overview.ipynb notebook, you get:
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_28676/2318072837.py in <module>
      1 # Compute NDCG@5
----> 2 evaluate(qrels, run, "r-precision")

/vol/data/ferret/tools-distrib/_research_code/rank_eval/rank_eval/meta_functions.py in evaluate(qrels, run, metrics, return_mean, threads, save_results_in_run)
    149     for m, scores in metric_scores_dict.items():
    150         for i, q_id in enumerate(run.get_query_ids()):
--> 151             run.scores[m][q_id] = scores[i]
    152     # Prepare output -----------------------------------------------------------
    153     if return_mean:

TypeError: 'numpy.float64' object does not support item assignment
I first detected the problem while integrating your code and obtained the same error. Looking at the file meta_functions.py, where the problem arises, I saw your recent update of this part of the code, but there is still a problem: for R-precision, the mean of the scores is stored in run.score and not in run.mean_scores. As a consequence, using run.scores to store the score of each query raises a problem if both the return_mean and save_results_in_run flags are set to True. More generally, I am not sure I understand why you treat R-precision differently from the other measures when computing the mean score.
Thank you in advance for your efforts in fixing the issue.
Olivier
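To illustrate the kind of failure reported above, here is a minimal sketch that reproduces the same TypeError. The Run class below is only a simplified, hypothetical stand-in for the library's run object; the point is that if run.scores["r-precision"] holds a single float (the mean) instead of a per-query dictionary, the later per-query assignment fails exactly as in the traceback.
import numpy as np

class Run:
    # Hypothetical, simplified stand-in for the library's run object
    def __init__(self):
        self.scores = {}

run = Run()

# For most metrics, per-query scores live in a nested dict, so item assignment works
run.scores["map"] = {}
run.scores["map"]["q_1"] = 0.5

# If one metric instead stores its mean score (a numpy scalar) under the same key,
# the same assignment pattern breaks
run.scores["r-precision"] = np.float64(0.42)
try:
    run.scores["r-precision"]["q_1"] = 0.3
except TypeError as e:
    print(e)  # 'numpy.float64' object does not support item assignment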