Reopening issue #43 on data augmentation with reversed triples #45
Sorry, I misunderstood you the first time. You are not talking about reversing entities (this is standard across all work), but about concrete, additional parameters for inverse relationships. Yes, ConvE uses the same data augmentation method, but this is because ConvE is uni-directional and proper evaluation requires inverse relationships. Thus, not using inverse relationships could be considered cheating in ConvE, since it would give an advantage if relationships are invertible. For a comparison of how ConvE does with and without inverse relationships, please see #18. Since that issue, I have also worked on YAGO3-10, on which results are worse if a separate invertible relationship is used; all results in the ConvE paper are up-to-date with the correct procedure. Does that help?
Ah, by construction ConvE cannot predict the x of a triple (x, r, y), right?
Does that make sense?
I was having a closer look at Lacroix et al., 2018. They are using inverse relations, whereas ConvE is using reverse relations -- I have been unclear in my terminology (I get confused by it all the time). Reverse relationships are, for example, (A, father of->, B) and (B, <-father of, A); inverse relationships are, for example, (A, father of, B) and (B, child of, A). It is very clear that you should get large gains on WN18, FB15k, WN18RR, and FB15k-237 if you augment your data with inverse relationships. I would consider this cheating because the model cannot know inverse relationships a priori, unless it uses a component that models inverse relationships, like the inverse model in this repository. Just to be explicit: ConvE does not use inverse relationships! As for reverse relationships, we have data for (1) and (2) for ConvE and it does not really help (better for Hits@10, worse for MRR and Hits@1). However, we only have (1) for the other models, and I agree that proper baselines would have been better here. However, we only became aware of this bug in my code-base a couple of months after the publication, and currently I am a bit short on time to run grid searches for reverse relationship baselines. Could you do this? You can run this code-base for DistMult and ComplEx, then run the same baselines for 7b38eaf, and compare the results.
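To make the distinction concrete, here is a minimal sketch of the two augmentation schemes; the function names, integer relation ids, and the `inverse_map` are illustrative, not from the ConvE code base:

```python
def add_reverse_triples(triples, num_relations):
    """Reverse relations: for every (h, r, t), add (t, r + num_relations, h).
    The new relation id is a fresh, unnamed symbol ("<-r")."""
    return triples + [(t, r + num_relations, h) for (h, r, t) in triples]

def add_inverse_triples(triples, inverse_map):
    """Inverse relations: map r to a semantically inverse relation that may
    already occur in the data, e.g. father_of -> child_of."""
    return triples + [(t, inverse_map[r], h) for (h, r, t) in triples if r in inverse_map]

triples = [("A", 0, "B")]  # (A, father_of, B), with relation ids as integers
print(add_reverse_triples(triples, num_relations=1))
# -> [('A', 0, 'B'), ('B', 1, 'A')]
```

The key difference: the reverse scheme introduces a new symbol with no semantics, while the inverse scheme may hand the model a relation that already exists in the test set.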
Is it okay if I edit my posts about this? In some of them I mention that ConvE uses inverse relationships, which is incorrect.
Hi Tim,
I am not quite sure about the difference between the relations "<-father of" and "child of". They are just symbols without any meaning, aren't they? Or do you mean "child of" is a bijective function that is the inverse of the function "father of"? I think [Lacroix et al. 2018] augmented the training set with reverse relations by concatenating two independent tensors along the relation axis (cf. Fig 1c in the paper). I think they don't make use of any inverse relationships.
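Here is a rough sketch of how I read Fig 1c (the shapes and names are my own, not from their code): the reversed triples get their own, independently trained relation embeddings, concatenated along the relation axis, so the reversed relation is just an offset index with no attached meaning.

```python
import numpy as np

num_relations, dim = 3, 4
rng = np.random.default_rng(0)
rel_fwd = rng.normal(size=(num_relations, dim))  # embeddings for r
rel_rev = rng.normal(size=(num_relations, dim))  # independent embeddings for reversed r
rel_table = np.concatenate([rel_fwd, rel_rev], axis=0)  # shape (2R, d)

def reversed_id(r):
    # The reversed relation is just an offset index -- a symbol with no meaning.
    return r + num_relations

assert rel_table.shape == (2 * num_relations, dim)
```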
I'll do this some time later. In general, I think the field of knowledge graph embeddings needs a proper benchmark framework where one can switch between different losses (margin ranking loss, binary cross-entropy, and other new losses) and switch between datasets with and without reversed triples. Perhaps, then, the baselines may strike back again.
Sure! But please first clarify what you mean by inverse and reverse relations (see above).
I'm pretty sure Lacroix et al., 2018 use reverse relations the same way as ConvE does and don't add any extra information, i.e. their relations aren't in any way named.
Yes, you are both right. I checked the trail of references, and the final description comes from Lin et al., 2016. [Lacroix et al. 2018] cites a work that cites this work for their method. Indeed, the terminology for inverse vs. reverse is confusing across these papers (I was confused about this too), but the correct usage is: if you reverse the direction of a relationship, those are reverse relationships; if you map it to another relationship that explicitly describes the inverse, then that is adding inverse relationships. Inverse relationships yield an advantage because the dataset might already contain such relationships (WN18, WN18RR, FB15k, FB15k-237). Reverse relationships are usually not contained in a training set and should give a smaller advantage, if any. I agree that KBC needs a general framework which is correctly implemented and covers all the quirks and nuances across current work. The main difficulty is really getting the evaluation right. Many papers, including "Baselines Strike Back", are non-replicable, and I think this is mainly due to the evaluation procedure. But yeah, it would be great if you could run reverse relationship baselines for DistMult and ComplEx, @dschaehi! Let me know if you need anything to do that!
Yes, I also agree about running a proper summary-paper-like evaluation across all models. Actually, we've recently run reverse relations with 1-N scoring for DistMult and ComplEx: see Table 7 in our paper Hypernetwork Knowledge Graph Embeddings.
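For readers unfamiliar with the term, a hedged sketch of 1-N scoring with DistMult (this is my own toy illustration, not the HypER or ConvE implementation): a single (head, relation) query is scored against all entities simultaneously instead of one triple at a time.

```python
import numpy as np

rng = np.random.default_rng(0)
num_entities, dim = 5, 4
E = rng.normal(size=(num_entities, dim))  # entity embeddings
R = rng.normal(size=(3, dim))             # relation embeddings

def score_1_to_N(h, r):
    """DistMult scores for (h, r, ?) against every entity: (e_h * w_r) @ E^T."""
    return (E[h] * R[r]) @ E.T  # shape (num_entities,)

scores = score_1_to_N(h=0, r=1)
assert scores.shape == (num_entities,)
```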
Hi both, I am thinking about implementing a more thorough evaluation framework using PyTorch for popular baselines (TransE, RESCAL, ComplEx, DistMult, ...). Once I have the basic framework done, I'll let you know. Perhaps you can join the project as contributors, or fork the project and make pull requests.
That sounds great, thanks!
Thank you, @ibalazevic, for the pointer to the DistMult/ComplEx results — also a really nice paper! @dschaehi, I do not know if I would be able to contribute to the project, but I would be happy to check implementations for correctness. Just ping me via email! Closing this as it has been settled.
Hi both, I just came across a new ICLR submission (You CAN Teach an Old Dog New Tricks! On Training Knowledge Graph Embeddings) that does more than what I planned to do. So let's just wait until the authors release the source code.
That looks great! Thanks for posting this! |
Thanks for pointing this out, I will have a look! |
Thanks for the answer in #43 (comment), but I don't quite get the point. As pointed out in [1], adding inverse relations to the training set affects the performance of the model. To cite their paper:
So does ConvE add inverse relations as [1] did in their paper? Then, according to [1], one can conclude that ConvE has profited from this data augmentation, unless an ablation study shows there is no difference, right? I think this is an important point concerning a fair comparison against other existing models; it can decide the acceptance/rejection of future knowledge graph embedding papers!
[1] Lacroix, Timothee, Nicolas Usunier, and Guillaume Obozinski. “Canonical Tensor Decomposition for Knowledge Base Completion.” In International Conference on Machine Learning, 2863–72, 2018. http://proceedings.mlr.press/v80/lacroix18a.html.