1vsAll objective and reciprocal triples #19

Closed
loreloc opened this issue Feb 28, 2023 · 4 comments

Comments

@loreloc

loreloc commented Feb 28, 2023

Hi,
I have noticed that in your experiments the flag `--score_lhs` is not enabled. This flag adds the term $-\log P_\theta(s\mid p,o)$ to the loss, and the 1vsAll objective does include this conditional likelihood, so there seems to be a discrepancy between the objective function in the paper (which includes the term predicting the subject) and the one used here.

Is it because you augment the data set with reciprocal triples? If so, is this equivalent to assuming that $P_\theta(S=s\mid R=p,O=o) = P_\theta(O=s\mid R=p^{-1},S=o)$, where $p^{-1}$ denotes the inverse relation?
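
For reference, the augmentation I have in mind is roughly the following (just a sketch with made-up names, not your actual code):

```python
import torch

def add_reciprocal_triples(triples: torch.Tensor, n_relations: int) -> torch.Tensor:
    # triples: LongTensor of shape (N, 3) with columns (subject, relation, object).
    # For every (s, p, o) we also add (o, p + n_relations, s), i.e. the inverse
    # relation p^{-1} is encoded with the id p + n_relations.
    s, p, o = triples[:, 0], triples[:, 1], triples[:, 2]
    reciprocal = torch.stack([o, p + n_relations, s], dim=1)
    return torch.cat([triples, reciprocal], dim=0)
```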

Thank you

@yihong-chen
Contributor

yihong-chen commented Feb 28, 2023

@loreloc Hi Lorenzo, you are right. We did augment the data set with reciprocal triples, as we found empirically that it works better than using `--score_lhs`. Let me know if you have further questions.

@loreloc
Author

loreloc commented Mar 1, 2023

Hi @yihong-chen,
thank you for your answer.

So do you confirm that the loss stated in the paper (Eq. 2) is not exactly the one used in the experiments, but is actually the one shown in Lacroix et al. (2018) (Eq. 7) with the addition of the relation prediction auxiliary?

@yihong-chen
Contributor

Hi @loreloc We have two implementations in our codebase, with and without reciprocal triples. The `--score_lhs` flag should be turned on if you are not using reciprocal triples. We also derive our objective (Eq. 2) in this setting, as it makes the underlying idea of "perturbing every position" clearer. This view of "perturbing every position" is very similar to masked language modelling in NLP, if you treat each position (subject/predicate/object) as one token and mask it.
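
To make the "perturb every position" view concrete, the non-reciprocal objective with `--score_lhs` roughly corresponds to the sketch below (hypothetical method names, not our exact code):

```python
import torch.nn.functional as F

def perturb_every_position_loss(model, s, p, o):
    # Each term masks one position of the triple and predicts it from the other two,
    # analogous to masked language modelling over (subject, relation, object) tokens.
    loss_s = F.cross_entropy(model.score_subjects(p, o), s)   # -log P(s | p, o), the --score_lhs term
    loss_p = F.cross_entropy(model.score_relations(s, o), p)  # relation prediction auxiliary
    loss_o = F.cross_entropy(model.score_objects(s, p), o)    # -log P(o | s, p)
    return loss_s + loss_p + loss_o
```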

Our reported results are with reciprocal triples. So you are right: it is Lacroix et al. (2018) (Eq. 7) plus the relation prediction auxiliary. In general, using reciprocal triples is a very useful trick, as observed in both Dettmers et al. (2018) and Lacroix et al. (2018).
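
With reciprocal triples already added to the training set, subject prediction is covered by object prediction over the inverse relations, so the reported objective is roughly this sketch (again with hypothetical names, and some weight on the auxiliary term):

```python
import torch.nn.functional as F

def reciprocal_loss(model, s, p, o, aux_weight=1.0):
    # Assumes the batch comes from the data set augmented with reciprocal triples,
    # so predicting the object also handles subject prediction via p^{-1}.
    loss_o = F.cross_entropy(model.score_objects(s, p), o)    # Lacroix et al. (2018), Eq. 7
    loss_p = F.cross_entropy(model.score_relations(s, o), p)  # relation prediction auxiliary
    return loss_o + aux_weight * loss_p
```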

Let me know if there is anything else I can help with.

@loreloc
Author

loreloc commented Mar 2, 2023

Thank you! I think this can be closed.

@loreloc loreloc closed this as completed Mar 2, 2023