Hi, I was wondering if you guys could tell me how to generate weights for distilgpt2 for the mend/efk baselines, similar to what you have for gpt2-xl here: https://rome.baulab.info/data/weights/. I'm trying to run these baselines but don't have the saved weights. I tried simply loading and re-saving Hugging Face's weights for distilgpt2, but it looks like the code expects something a bit different. If you guys have a script or suggestions, that would be great.
Thanks!
The MEND authors have instructions for training a MEND baseline on a variety of GPT models. I don't recall whether DistilGPT is supported out of the box, but I'm sure Eric can help you with this if not! After training a model, simply place the trained .pt file in baselines/mend/weights. You can consult mend_main.py for the naming conventions, so that the code picks up the new model file.
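The placement step above can be sketched roughly as follows. This is only a minimal sketch, assuming you already have a checkpoint trained with the MEND code. The helper name install_mend_checkpoint and its arguments are hypothetical (they are not part of the repo), and the target file name has to match whatever mend_main.py looks for, which this sketch does not try to guess.

```python
import shutil
from pathlib import Path


def install_mend_checkpoint(trained_ckpt, target_name,
                            weights_dir="baselines/mend/weights"):
    """Copy a trained MEND checkpoint into the baseline's weights directory.

    Hypothetical helper, not part of the repo. `target_name` must follow
    the naming convention used in mend_main.py; check that file for the
    exact name expected for your model.
    """
    src = Path(trained_ckpt)
    if not src.is_file():
        raise FileNotFoundError(f"No checkpoint found at {src}")
    if src.suffix != ".pt":
        raise ValueError(f"Expected a .pt checkpoint, got {src.name}")

    # Create the weights directory if it does not exist yet.
    dest_dir = Path(weights_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Copy (rather than move) so the original training output is preserved.
    dest = dest_dir / target_name
    shutil.copy2(src, dest)
    return dest
```

You would call it from the repo root with the path to your trained checkpoint and the file name that mend_main.py expects for distilgpt2. Note that re-saving the plain Hugging Face distilgpt2 weights under that name is not a substitute for a checkpoint produced by MEND training.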