add pegasus support #11

Merged
merged 2 commits into from Sep 5, 2020

Conversation

HenryDashwood
Contributor

Pegasus has the same parameter groups as Bart, so the splitter works for both.

When Pegasus decodes, it leaves these <n> symbols in the output, so I added a line to take them out.
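
As a rough illustration of that cleanup step, here is a minimal sketch. It is not the line added in this PR: the function name `strip_pegasus_newlines` and the choice to replace each `<n>` with a space are assumptions made for the example.

```python
def strip_pegasus_newlines(text: str, replacement: str = " ") -> str:
    """Remove the literal "<n>" markers that Pegasus leaves in decoded text.

    Hypothetical helper for illustration; the name and the default
    replacement (a single space) are assumptions, not the PR's code.
    """
    # Swap every "<n>" marker for the replacement string.
    cleaned = text.replace("<n>", replacement)
    # Collapse any runs of whitespace left behind and trim the ends.
    return " ".join(cleaned.split())


summary = strip_pegasus_newlines("First sentence.<n>Second sentence.")
print(summary)
```

Note that the whitespace collapse also flattens any real newlines in the text, which may or may not be wanted for multi-line summaries.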

@HenryDashwood
Contributor Author

See #10

@ohmeow
Owner

ohmeow commented Sep 5, 2020

Thanks @HenryDashwood. Going to work thru this after the fastcore PR. Was going to start looking at Pegasus support this weekend after adding in tests for the question answering bits but I'm glad to see you beat me to it :)

@ohmeow ohmeow merged commit 13452dc into ohmeow:master Sep 5, 2020
@ohmeow
Owner

ohmeow commented Sep 8, 2020

Hey @HenryDashwood ... how are you testing this with Pegasus? That model is a beast! Training on Colab right now, but I can only get away with a batch size of 2 and max_length=256 (for the text to summarize).

@HenryDashwood
Contributor Author

Yeah it’s a bit of a nightmare to work with. My advice, if you haven’t already seen it and need more memory, is https://datacrunch.io/products/

I’ve moved all the GPU work I do, in and outside of work, over to there since it’s shockingly cheap compared to Amazon, which we were using before: https://aws.amazon.com/sagemaker/pricing/

Obviously not free though. Let me know if you have any tests you would like run and I’ll see what I can do!

@ohmeow
Owner

ohmeow commented Sep 9, 2020

Ah, never heard of it ... pricing looks nice.

You guys just using it for training I assume? Are you just running everything on the instance itself or are you training through something like Docker? Back in the old days I remember it being easiest just spinning up an EC2 instance, running a bash script that did all my installs, and training directly on it ... anyways, curious to know how you all are using it. Thanks for the info.

btw, added T5 and Pegasus support to library now. I couldn't run any of my tests for Pegasus cuz I'm running everything on my local 1080 TI. Anyhow, if you check the docs you'll see how I have the tests set up ... if you get a chance, yah, it would be great if you could test Pegasus and lmk how things go. I have tests for both the data and modeling so curious to know if the tokenization is right for the architecture and if it trains :). Lmk if you can. Thanks.

@HenryDashwood
Contributor Author

HenryDashwood commented Sep 9, 2020

Will do!

Re DataCrunch: yeah, we only use it for training. All our models still get served on CPUs.

They actually offer a one-click Jupyter notebook image with things like fastai preinstalled. That's how I came across them. However, I prefer to spin up a plain Ubuntu image, SSH in, run a shell script to set it up the way I like, and then do all my work through the terminal and VSCode. VSCode's notebook and SSH support is amazing now, and DataCrunch is what finally gave me a reason to mould my workflow around it. I'm thinking of writing a blog post or making a video about it because it's so much better than anything I've seen anywhere else.

@ohmeow
Owner

ohmeow commented Sep 10, 2020 via email
