A question about OpenTapioca #41

Open
pmitankin opened this issue Jun 2, 2022 · 0 comments

Dear Antonin Delpeuch,

I read your paper "OpenTapioca: Lightweight Entity Linking for Wikidata" https://arxiv.org/abs/1904.09131.
I like the paper a lot.
I looked at the code https://github.com/wetneb/opentapioca.
I would like to ask you a question about the paper and the implementation.
In the paper, on page 6, the last term in the equation for s(e, e') is
(1 - beta)^2 |l(e) intersection l(e')| / (|l(e)| |l(e')|).
As far as I understand this term is implemented here:
https://github.com/wetneb/opentapioca/blob/master/opentapioca/similarities.py#L67
My question is: why is it not proba += (1-beta)*(1-beta)*(len_common/(len(edges_a)*len(edges_b))) ?
In the paper, |l(e) intersection l(e')| is not squared, but in the implementation len_common is squared.
As far as I understand, len_common should not be squared, because the term
(1 - beta)^2 |l(e) intersection l(e')| / (|l(e)| |l(e')|)
is the probability of reaching the same vertex v with one hop from e and one hop from e' if v does not belong to l(e) and l(e').
Here (1 - beta)^2 is the probability of not staying on e and e',
|l(e) intersection l(e')| / |l(e)| is the probability of selecting, from e, some vertex v in the intersection, and
1 / |l(e')| is the probability of reaching that selected vertex v from e'.
It seems that the formula in the paper is correct but the implementation does not match it. Is that right?
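To make the argument concrete, here is a minimal sketch that compares the two variants of the term against a brute-force enumeration of one-hop moves. It is not taken from the OpenTapioca code: the function names, the example vertex ids and the value of beta are all hypothetical, and it assumes that l(e) is a set of distinct neighbours and that each hop picks a neighbour uniformly at random.

```python
from fractions import Fraction
from itertools import product


def paper_term(edges_a, edges_b, beta):
    """(1 - beta)^2 * |A ∩ B| / (|A| * |B|), as written in the paper."""
    common = len(edges_a & edges_b)
    return (1 - beta) ** 2 * Fraction(common, len(edges_a) * len(edges_b))


def squared_term(edges_a, edges_b, beta):
    """Same term with len_common squared, as described in this issue."""
    common = len(edges_a & edges_b)
    return (1 - beta) ** 2 * Fraction(common * common, len(edges_a) * len(edges_b))


def meeting_probability(edges_a, edges_b, beta):
    """Brute force: both walkers leave their vertex (probability (1 - beta)^2),
    each picks a neighbour uniformly, and we count the pairs that coincide."""
    pair_prob = Fraction(1, len(edges_a) * len(edges_b))
    hits = sum(1 for u, v in product(edges_a, edges_b) if u == v)
    return (1 - beta) ** 2 * hits * pair_prob


# Hypothetical example data (not real Wikidata neighbourhoods).
edges_a = {"Q1", "Q2", "Q3"}
edges_b = {"Q2", "Q3", "Q4", "Q5"}
beta = Fraction(1, 5)  # arbitrary value chosen for illustration

print("paper term:  ", paper_term(edges_a, edges_b, beta))
print("squared term:", squared_term(edges_a, edges_b, beta))
print("brute force: ", meeting_probability(edges_a, edges_b, beta))
```

Under these assumptions the un-squared term agrees with the brute-force count, while the squared variant is larger by a factor of |l(e) intersection l(e')|.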
Thank you.

Regards,
Petar Mitankin
Software developer
Sirma AI, trading as Ontotext, http://www.ontotext.com/
