[CLOSED] Complete pytorch transformers interface, deprecate old GPT implementation #881
Comments
Comment by sleepinyourhat Thanks for taking this on! I'm interested in adding RoBERTa as soon as it comes out in pytorch-transformers, so if this isn't done by then, we may have competing PRs.
Comment by pep8speaks Hello @HaokunLiu! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
You can repair most issues by installing black and running:
Comment last updated at 2019-08-26 16:26:29 UTC
Comment by HaokunLiu Mostly done. I'll run some tests to see if there is any decline in performance. In the meantime, feel free to check the code, implementation decisions, naming, notation, etc., and tell me what needs to change.
Comment by sleepinyourhat RoBERTa is out in pytorch_transformers! We should try to merge it in soon. Mind either trying to finish this within a day or two so we can start a new RoBERTa PR, or else adding RoBERTa here directly (which might require other changes to match the pytorch_transformers 1.0 -> 1.1 update)?
Comment by HaokunLiu The tuning and testing haven't finished running, so I'm not sure everything can be in place within one or two days. But I can add RoBERTa here.
Comment by sleepinyourhat Thanks for the update! I'll take a closer look when this is all done. I'm away tonight–Sunday, but if you need help/feedback sooner, bug Yada/Alex/Ian. (If it's obviously done before then, they can dismiss my requests and merge.)
Comment by HaokunLiu GLUE val results are out.
Comment by sleepinyourhat Thanks again for taking this on. I took a somewhat quick look at the newer changes, and everything seems reasonable to me. The new results are good enough that I'm not worried about major bugs/regressions, and I think it'll be hard to catch smaller performance problems through this kind of experiment. I left a couple of small suggestions. Let me know if there's anything that you want to do, or anything that I should look at closely, before we merge.
Comment by HaokunLiu I think I have addressed all the suggestions on this code from the last round and this one. I'll make a PR to the site repo in a day or two. One last thing: although pytorch_transformer_interface provides get_pretrained_lm_head and apply_lm_boundary_tokens (a small part of which is used in the NPI and Blimp projects, but not in this PR), I didn't update LanguageModelingTask and MultiTaskModel._lm_forward.
Comment by sleepinyourhat For the LM code, would you mind adding some asserts to prevent people from using the untested code? Once that's done, it sounds like this is ready to merge.
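[Editor's note] The guard asserts requested above might look something like the following sketch. All names here (`assert_for_log`, `check_lm_supported`, the allow-list contents) are illustrative stand-ins, not jiant's actual API:

```python
# Sketch of a guard assert for an untested code path. Names are hypothetical.
def assert_for_log(condition: bool, message: str) -> None:
    """Raise with a clear message if an untested code path is entered."""
    if not condition:
        raise AssertionError(message)

def check_lm_supported(input_module: str) -> None:
    # Hypothetical allow-list: only embedder types the LM path was tested with.
    tested_modules = ("scratch", "elmo", "gpt")
    assert_for_log(
        input_module in tested_modules,
        f"Language modeling is untested with input_module='{input_module}'; "
        "verify the LM head wiring before enabling it.",
    )
```

Calling such a check at the top of the untested forward pass fails loudly with an actionable message instead of silently producing wrong results.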
Comment by HaokunLiu I don't know yet; I'll let you know when the results come out.
Comment by HaokunLiu I was thinking about including the tokenizer and indexer inside model_preprocessing_interface as well, and changing the main process from create_tasks -> preprocess -> create model to create_tasks -> create model -> preprocess, so that members of model_preprocessing_interface can be passed from the model instead of being created from args once in preprocess and again in the model. But this is unnecessarily radical for the sake of #881. Since you are one of the major developers of jiant, maybe you can consider this idea and, when the time comes, figure out a well-rounded overall architecture for jiant.
Comment by HaokunLiu Some results are different, but considering how the old and new GPT implementations differ, I think this meets expectations.
Comment by sleepinyourhat Thanks! That's not the most informative comparison, but since you're getting numbers in the same ballpark as what OpenAI published, I think that's enough. I agree that it's okay to leave in some awkward abstractions for now: better to get this out there and refactor later than to put too much burden on you. There are some new merge conflicts (we moved the config dir), BTW.
Comment by sleepinyourhat Ready to merge?
Comment by HaokunLiu Yes, it's ready.
Comment by sleepinyourhat Great, I'll make a proper release tomorrow unless someone beats me to it.
Issue by HaokunLiu
Thursday Aug 08, 2019 at 15:20 GMT
Originally opened as nyu-mll/jiant#881
HaokunLiu included the following code: https://github.com/nyu-mll/jiant/pull/881/commits