Upgrade sampler capability #209
Conversation
@lvermue sorry if I am commenting on work in progress 😅
🙄
@PyKEEN-bot Would you mind testing this PR?
Trigger CI
@abstractmethod
def sample(self, positive_batch: torch.LongTensor) -> torch.LongTensor:
    """Generate negative samples from the positive batch."""
    raise NotImplementedError

def _filter_negative_triples(self, negative_batch: torch.LongTensor) -> torch.LongTensor:
is this necessary? I thought we argued in the paper that we could skip this because it's low probability
At least add a note to the docstring about when this assumption doesn't hold.
That depends. In typical datasets it's low probability. However, there are some aspects to it.
E.g., skipping the filter is expected to act a bit like regularization, since the false negatives add a noise signal to the training data. To make things more complicated, the added noise depends on the ratio of true triples for a given entity-relation or entity-entity pair. The effects are therefore hard to disentangle, and a researcher might want to rule out false negatives among the proposed negative triples altogether.
sounds like something that could go in the module docstring ;)
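For context, a minimal sketch of what such a filter could look like; the function name and the brute-force all-pairs comparison are illustrative assumptions, not PyKEEN's actual implementation:

```python
import torch

def filter_negative_triples(
    negative_batch: torch.LongTensor,        # (num_negatives, 3) as (h, r, t)
    all_positive_triples: torch.LongTensor,  # (num_triples, 3)
) -> torch.LongTensor:
    """Drop proposed negatives that occur among the known positive triples."""
    # Broadcast-compare every negative against every positive:
    # (num_negatives, 1, 3) vs. (1, num_triples, 3) -> (num_negatives, num_triples)
    collisions = (negative_batch[:, None, :] == all_positive_triples[None, :, :]).all(dim=-1)
    # Keep a negative only if it matches no positive triple.
    return negative_batch[~collisions.any(dim=-1)]
```

Note the all-pairs comparison needs O(num_negatives × num_triples) memory, so a real implementation would more likely use a set- or hash-based lookup.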
In my opinion, no, because heads and tails are corrupted based on the properties of the relations. Uniformly sampling the relations would not reflect the original motivation behind the Bernoulli sampler.
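To make that motivation concrete, here is a hedged sketch of the relation-dependent corruption probabilities from Bernoulli negative sampling (Wang et al., 2014), where the head is corrupted with probability tph / (tph + hpt); names and structure are illustrative, not PyKEEN's code:

```python
import torch

def bernoulli_corruption_probs(triples: torch.LongTensor, num_relations: int) -> torch.FloatTensor:
    """Per relation, the probability of corrupting the head rather than the tail."""
    probs = torch.full((num_relations,), 0.5)  # fallback for unseen relations
    for r in range(num_relations):
        mask = triples[:, 1] == r
        count = int(mask.sum())
        if count == 0:
            continue
        # tph: average number of tails per head; hpt: average heads per tail
        tph = count / triples[mask, 0].unique().numel()
        hpt = count / triples[mask, 2].unique().numel()
        probs[r] = tph / (tph + hpt)
    return probs
```

During sampling one would then, for each positive (h, r, t), draw corrupt-head ~ Bernoulli(probs[r]) and replace h or t accordingly, which is why uniformly picking the corrupted side would defeat the purpose.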
@PyKEEN-bot test
…to upgrade_sampler
Co-authored-by: Charles Tapley Hoyt <cthoyt@gmail.com>
:param num_negs_per_pos: Number of negative samples to make per positive triple. Defaults to 1.
:param filtered: Whether proposed corrupted triples that are in the training data should be filtered.
    Defaults to False.
:param corruption_scheme: What sides ('h', 'r', 't') should be corrupted. Defaults to head and tail ('h', 't').
Did you have a specific use case in mind where you might want to do something other than the default?
No, but there could be reasonable situations, e.g., when working with images where you already know the objects/entities but are interested in training a model to infer the most likely relation.
I cannot find a reference right now, but IIRC visual relation detection usually first detects all objects in the image (`h` / `t`) and then predicts the relations between them (`r`). One argument is that the distribution of visual representations of relations is more of a long-tail distribution and might include representations of the objects: e.g., in `(man, rides, bike)`, the visual representation of `rides` likely (partially) shows both the man and the bike.
Far better explanations can be given by @sharifza 🙂
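To illustrate the relation-corruption use case discussed in this thread, a hypothetical usage sketch; the sampler class and keyword arguments follow the docstring quoted above, but the exact import paths and constructor signature may differ across PyKEEN versions:

```python
from pykeen.datasets import Nations
from pykeen.sampling import BasicNegativeSampler

# Assumed setup: a small benchmark dataset shipped with PyKEEN.
training = Nations().training

# Corrupt only the relation, e.g. when the entities (detected objects)
# are fixed and only the relation between them should be predicted.
sampler = BasicNegativeSampler(
    triples_factory=training,
    num_negs_per_pos=4,
    filtered=True,
    corruption_scheme=('r',),  # the default would be ('h', 't')
)
```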
This allows better documentation
Trigger CI
The new API looks good and seems to be well tested and documented.
There are two remaining comments from my side. One is about possibly adding more documentation on a use case for non-default corruption schemes: if there are good reasons a person should know this feature exists besides satisfying the reviewer, it might be nice to include them in this PR. The second is just a cosmetic thing. After CI is done, feel free to address those or just merge. Thanks for the good work.
@mberr can hopefully check the more methodological aspects, and then we can get this one done. Thanks Laurent!
@PyKEEN-bot Here's the merge contestant 🎉
besides these comments, lgtm
@@ -59,7 +59,7 @@ def setUp(self) -> None:

     def test_sample(self) -> None:
         # Generate negative sample
-        negative_batch = self.negative_sampler.sample(positive_batch=self.positive_batch)
+        negative_batch, _ = self.negative_sampler.sample(positive_batch=self.positive_batch)
TL;DR: Here, this comment is unnecessary.
`_` is actually a variable name, although by code convention it is normally not used for "proper" variables. It does mean, however, that the second return value is bound to a variable and kept in memory until it goes out of scope. Thus, in particular in PyTorch code, it is usually preferable to use
negative_batch = self.negative_sampler.sample(positive_batch=self.positive_batch)[0]
As said in the TL;DR, this is only a unit test, so minimizing memory utilization shouldn't be of primary interest here 😅
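A small illustration of that point (an assumed example, not code from the PR):

```python
import torch

def sample():
    # The second return value stands in for a large auxiliary tensor.
    return torch.randn(3), torch.randn(10_000, 10_000)

negative_batch, _ = sample()   # the large tensor stays referenced via `_`
negative_batch = sample()[0]   # the large tensor is immediately collectable
```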
@PyKEEN-bot What do you think about the latest commit?
This PR will: