
Finetuning a Pretrained Model Using MuP #31

Closed · zanussbaum opened this issue Dec 7, 2022 · 3 comments

Comments

@zanussbaum (Contributor)

Somewhat of a naive question, but say we have pretrained a model and now want to finetune it on a downstream task. Is there any reason we shouldn't replace the muP layers with the equivalent torch layers? I have to imagine we don't need muP here, but I want to make sure that swapping them out doesn't break anything.
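For concreteness, here is a rough sketch of what I mean by replacing them. It assumes the only muP-specific module left in the checkpoint is `MuReadout`, and that `MuReadout` behaves like `nn.Linear` with its input scaled by `output_mult / width_mult()` (that is how the mup README describes it, but it is worth double-checking against the installed version). The helper names below are mine, just for illustration:

```python
# Hypothetical helpers for illustration; these are not part of the mup library.
import torch
import torch.nn as nn
from mup import MuReadout

def mureadout_to_linear(module: MuReadout) -> nn.Linear:
    """Build a plain nn.Linear that reproduces a MuReadout's forward pass.

    Assumes MuReadout scales its input by output_mult / width_mult()
    before the usual affine map; verify this against your mup version.
    """
    linear = nn.Linear(module.in_features, module.out_features,
                       bias=module.bias is not None)
    scale = module.output_mult / module.width_mult()
    with torch.no_grad():
        # Folding the multiplier into the weight keeps the output identical,
        # since W @ (s * x) + b == (s * W) @ x + b.
        linear.weight.copy_(module.weight * scale)
        if module.bias is not None:
            linear.bias.copy_(module.bias)
    return linear

def strip_mup(model: nn.Module) -> nn.Module:
    """Recursively replace every MuReadout in the model with nn.Linear."""
    for name, child in model.named_children():
        if isinstance(child, MuReadout):
            setattr(model, name, mureadout_to_linear(child))
        else:
            strip_mup(child)
    return model
```

The idea would be to run `strip_mup(model)` once on the pretrained model and then finetune it as an ordinary torch model.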

@thegregyang (Contributor)

thegregyang commented Dec 7, 2022 via email

@zanussbaum (Contributor, Author)

zanussbaum commented Dec 7, 2022 via email

@thegregyang (Contributor)

thegregyang commented Dec 7, 2022 via email
