add bert base class and restructure encoding extractor #383
Conversation
@adelavega @tyarkoni this is now at a stage where it would benefit from some input, both on the general structure and on a few more specific points.
Looks good overall; see comments.
Thoughts on your questions:

1. Looks pretty sensible to me. One thing you could potentially do is add a slimmer, abstract (i.e., uses
2. That would be neat if it's very straightforward. If it isn't, we can hold off until after we've used these classes a bit and have a sense of how we like the API.
3. Yeah, this problem has come up elsewhere. Passing additional arguments to
4. That seems like a good idea (though probably not a high priority); feel free to open an issue.

Per our discussion a couple of weeks ago, I think I'm leaning towards defining a new
Pretty much agree with everything Tal already said. I have some thoughts about the postprocessing issue; I'll open an issue to discuss.
@adelavega @tyarkoni There are a bunch of new tests compared to the previous Bert implementations, as we now support more extractors and more pretrained models. However, the tests always crash on Travis, most often with a mysterious error, and I've tried a few things to solve this.

When I run only one test, disabling all the others, tests tend to pass, which suggests that it is not an issue with the memory requirements of the individual tests. What's weird is that sometimes, even when choosing a subset of the tests and succeeding in running all of them, I still get out-of-memory errors afterwards. I've also tried to put all the Bert tests in a separate test file and run it on Travis as a separate entry in the build config.

There's also one more issue, which seems to be more easily solvable: some of the tests require downloading pre-trained models and tokenizers, which takes a few minutes, sometimes even more than 10 (it's much faster locally), and Travis has a 10-minute no-output timeout.

Sorry for the long comment and the bother; I've really tried my best and have now run out of ideas.
No worries, thanks for making an effort! My guess is we're hitting the limits of Travis's free tier. I'll look into pricing etc. and see if it makes sense to upgrade (likely not), but for the time being let's not worry about this.

What would probably be helpful is to create a new pytest tag called "high_mem" or something like that, and then mark any tests known to cause failure on Travis and make sure they're skipped in the pytest configuration. We already do this for the paid tests, you can just adapt from that. Obviously this isn't ideal, but for the time being we can just run the tests locally, and we'll figure something out later.

(This is also still an issue for the paid services; I disabled them because of cost/bugs > 1 year ago, and haven't re-enabled them. At some point we might just need to rethink our testing approach/infrastructure.)
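For concreteness, here's a minimal sketch of what that marker setup could look like in a `conftest.py`. The `high_mem` name comes from the comment above, while the option name and hook bodies are illustrative rather than the repo's actual configuration:

```python
# conftest.py -- illustrative sketch of a "high_mem" marker, modeled on
# the kind of opt-in setup commonly used for expensive (e.g. paid-API) tests.
import pytest

def pytest_configure(config):
    # Register the marker so pytest doesn't warn about an unknown mark.
    config.addinivalue_line(
        "markers", "high_mem: test needs more memory than CI provides")

def pytest_addoption(parser):
    # Hypothetical opt-in flag for running the memory-hungry tests locally.
    parser.addoption("--run-high-mem", action="store_true", default=False,
                     help="run tests marked as high_mem")

def pytest_collection_modifyitems(config, items):
    # Skip high_mem tests unless explicitly requested.
    if config.getoption("--run-high-mem"):
        return
    skip = pytest.mark.skip(reason="needs --run-high-mem to run")
    for item in items:
        if "high_mem" in item.keywords:
            item.add_marker(skip)
```

Individual tests would then be decorated with `@pytest.mark.high_mem`; Travis runs a plain `pytest`, and the full suite runs locally with `pytest --run-high-mem`.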
Thanks for the quick reply, Tal!
LGTM! Just one minor comment about a missing docstring.
self.return_softmax = return_softmax
self.return_masked_word = return_masked_word

def update_mask(self, new_mask):
This is publicly exposed, so needs a docstring.
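For reference, a sketch of what the requested docstring could look like; the parameter semantics are inferred from the method name and the surrounding diff, not taken from the actual implementation:

```python
def update_mask(self, new_mask):
    ''' Updates the mask token for subsequent extractions.

    Args:
        new_mask (int or str): the new mask. If an int, it is read as
            the positional index of the word to mask in the input
            sequence; if a str, as the word to mask itself. (These
            semantics are an assumption based on the method name and
            may differ from the real implementation.)
    '''
    self.mask = new_mask
```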
Thanks Tal! Docstring updated. If everything else is good, this should be ready to merge.
@rbroc this is ready to merge, right? If so, I can go ahead and do so.
Ready to merge, @adelavega, unless @tyarkoni has further remarks.
Restructuring the BERT extractors so as to have a general PretrainedBert class from which the encoding extractor and the masked language model extractor both inherit.
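As a rough illustration of that structure (class and method names below are placeholders except for `PretrainedBert`, since the actual extractor names aren't spelled out here):

```python
from abc import ABC, abstractmethod

class PretrainedBert(ABC):
    ''' Shared base: would centralize model/tokenizer loading and tokenization. '''
    def __init__(self, pretrained_model='bert-base-uncased'):
        self.pretrained_model = pretrained_model
        # Actual loading of the model/tokenizer (e.g. via the
        # transformers library) would happen here.

    @abstractmethod
    def postprocess(self, model_output):
        ''' Each subclass turns raw model output into its own result type. '''

class EncodingExtractor(PretrainedBert):
    def postprocess(self, model_output):
        # Would return token-level encodings (hidden states).
        return model_output

class MaskedLMExtractor(PretrainedBert):
    def postprocess(self, model_output):
        # Would return masked-word predictions, optionally softmaxed
        # (cf. the return_softmax / return_masked_word flags in the diff).
        return model_output
```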