Do you have any plans to release the frequency of the entity? #9
Comments
Yes, we just counted the number of hyperlinks as the frequency.
Thanks for your prompt reply!
I'm trying to extract entities from existing queries and look up their frequencies, but a large number of entities have no corresponding frequency. Is something wrong with my code, or are some entities really missing from the file? Below is my code:
We used the For example, the label of the Wikidata item Q138518 is "Princeton", while the corresponding Wikipedia title is "Princeton, New Jersey". When building the dataset, we extracted (subject, relation, object) triples from TREx and used the Let me know if you have any further questions!
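To illustrate the label/title mismatch in the reply above, here is a minimal sketch that looks up a frequency by Wikipedia title instead of Wikidata label. The function name `lookup_frequency`, the two dictionaries, and the counts in them are hypothetical placeholders, not the project's actual files or numbers.

```python
def lookup_frequency(qid, qid_to_title, title_counts):
    """Return the hyperlink count for a Wikidata item, keyed by Wikipedia title.

    Returns None when the item has no known Wikipedia title, so that a
    missing mapping can be told apart from a title with zero links.
    """
    title = qid_to_title.get(qid)
    if title is None:
        return None
    return title_counts.get(title, 0)


# Hypothetical mapping from Wikidata ID to Wikipedia title (assumption).
qid_to_title = {"Q138518": "Princeton, New Jersey"}

# Placeholder counts for illustration only, not real frequencies.
title_counts = {"Princeton, New Jersey": 3}

# Looking up the label "Princeton" directly would miss the entry:
print(title_counts.get("Princeton"))                               # None
print(lookup_frequency("Q138518", qid_to_title, title_counts))     # 3
```

If many entities come back without a frequency, checking whether the lookup key is a label or a title is a cheap first diagnostic.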
The paper says: “In our analysis, we use the Wikipedia hyperlink count as a proxy for an entity’s frequency.”
My understanding is that this means traversing all of Wikipedia, counting the hyperlinks that point to each entity's page, and using that count as the entity's frequency.
This can be difficult, and different implementations can introduce bias. Do you have any plans to release a file with the entity frequencies, or the code that counts them?
Thank you so much!
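For reference, the counting procedure described in this question could be sketched roughly as follows. This is an assumption about one possible implementation, not the authors' code: it expects raw wikitext strings as input, the names `WIKILINK`, `normalize_title`, and `count_hyperlinks` are made up for illustration, and it does not resolve redirects or filter namespaces such as `File:` or `Category:`, which is exactly where implementations may diverge.

```python
import re
from collections import Counter

# Matches [[Target]], [[Target|anchor text]] and [[Target#Section|anchor text]].
# Only the target page title is captured.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def normalize_title(raw):
    """Underscores become spaces, whitespace is collapsed, first letter is uppercased."""
    title = " ".join(raw.replace("_", " ").split())
    return title[:1].upper() + title[1:]


def count_hyperlinks(pages):
    """Count how many wikilinks point to each page title.

    `pages` is any iterable of wikitext strings, one per article.
    """
    counts = Counter()
    for wikitext in pages:
        for match in WIKILINK.finditer(wikitext):
            target = normalize_title(match.group(1))
            if target:
                counts[target] += 1
    return counts


# Tiny made-up sample, only to show the call shape.
sample_pages = [
    "[[Princeton, New Jersey|Princeton]] is near [[New York City]].",
    "She moved to [[princeton,_New_Jersey]] and [[New York City#History|NYC]].",
]
counts = count_hyperlinks(sample_pages)
print(counts["Princeton, New Jersey"])  # 2
```

Choices such as whether to follow redirects or count repeated links within one article change the resulting numbers, so a released frequency file would still be the more reliable reference.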