Model training #4

Closed
malteos opened this issue Feb 10, 2022 · 13 comments

@malteos

malteos commented Feb 10, 2022

Thanks for your awesome work @lucidrains !

Are you aware of any efforts on reproducing the actual model training?

@lucidrains
Owner

lucidrains commented Feb 10, 2022

yup, @enijkamp is planning on doing so, but he will first test this repo and then port it over to jax

potentially 7B parameters i'm told, and he is going to push for open source 🥳

@malteos
Author

malteos commented Feb 10, 2022

Glad to hear that. I'd be happy to contribute. Are there any particular issues that need some help?

@lucidrains
Owner

@malteos best to reach out to Erik, as he is doing the training ;)

@paraschopra

paraschopra commented Feb 11, 2022 via email

@lucidrains
Owner

@paraschopra Yup, someone at Eleuther is eyeing the paper (and probably going to use my repo), so if Erik's effort falls through, there's them

@ronald-d-rogers

ronald-d-rogers commented Feb 12, 2022

Do you all plan on open sourcing the world knowledge somehow as well?

@lucidrains
Owner

@ronald-d-rogers do you mean the retrieval database?

@ronald-d-rogers

ronald-d-rogers commented Feb 14, 2022 via email

@lucidrains
Owner

@ronald-d-rogers I don't know what their specific plans are. Just ran into someone working close to the eleuther founders who was also working on retro

@lucidrains
Owner

ok, i'm closing this, feel free to reach out to Erik or Kip (at Eleuther) if you are interested in contributing towards an open sourced model

@enijkamp

@lucidrains Yes, working on retro-fitting CodeGen, but may take a few more weeks:
https://mobile.twitter.com/arankomatsuzaki/status/1508246117351362560

@ronald-d-rogers

> @lucidrains Yes, working on retro-fitting CodeGen, but may take a few more weeks: https://mobile.twitter.com/arankomatsuzaki/status/1508246117351362560

@enijkamp I think what you all are doing is great. One difference between this and other models, though, is that it's a two-part system: one part is the model and the other is the retrieval database. Have y'all thought about whether or not you'd open source the retrieval database as well? My understanding is that it would be quite large (~93TB for the database built from MassiveText, which is itself 10.5TB on disk, so maybe ~8TB for The Pile?).
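
For anyone who wants to check that back-of-envelope figure, here is a rough sketch of the arithmetic. It assumes the retrieval database grows linearly with the on-disk corpus size, which is a simplification (chunk length, embedding width, and index compression would all change the real number), and it assumes The Pile is about 825 GiB, a figure that is not stated in this thread. The function and constant names are made up for illustration.

```python
# Back-of-envelope estimate of retrieval database size, assuming it scales
# linearly with on-disk corpus size (a simplifying assumption).

# Figures quoted in the comment above.
MASSIVETEXT_DB_TB = 93.0        # retrieval database size for MassiveText
MASSIVETEXT_CORPUS_TB = 10.5    # MassiveText size on disk

# Assumption not taken from this thread: The Pile is about 825 GiB.
PILE_CORPUS_TB = 825 * 2**30 / 1e12   # GiB converted to decimal TB


def estimate_db_size_tb(corpus_tb: float,
                        ref_db_tb: float = MASSIVETEXT_DB_TB,
                        ref_corpus_tb: float = MASSIVETEXT_CORPUS_TB) -> float:
    """Scale the reference database size by the ratio of corpus sizes."""
    return corpus_tb * ref_db_tb / ref_corpus_tb


if __name__ == "__main__":
    expansion = MASSIVETEXT_DB_TB / MASSIVETEXT_CORPUS_TB
    estimate = estimate_db_size_tb(PILE_CORPUS_TB)
    print(f"expansion factor: {expansion:.2f}x")
    print(f"estimated database size for The Pile: {estimate:.2f} TB")
```

Under those assumptions the estimate comes out a little under 8 TB, which is in line with the guess above.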

@paraschopra

@enijkamp great work with codegen. Looking forward to the open source version of RETRO.
