How to fine-tune the existing weights on new data? #50

Closed
asiddhant opened this issue Jun 14, 2018 · 4 comments

Comments

@asiddhant

I converted the hdf5 file back into a ckpt file (using the custom_getter method in bilm/model.py) and tried to use it with the architecture in bilm/training.py, but the loaded weights give very bad perplexity on held-out data when I run run_test.py. Are the architectures in bilm/model.py and bilm/training.py compatible? If I am doing something wrong, would it be possible for you to share the ckpt file corresponding to the given hdf5 file?

Thanks
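
For reference, a minimal sketch of the hdf5-to-ckpt conversion described above, assuming the training graph has already been built with bilm/training.py and that you have worked out the variable-name-to-hdf5-path mapping used by the custom getter in bilm/model.py (the `name_map` argument below stands in for that mapping; it is not part of the repo's API):

```python
import h5py
import numpy as np
import tensorflow as tf

def convert_hdf5_to_ckpt(weight_file, ckpt_path, name_map):
    """Copy arrays from the ELMo hdf5 weight file into the matching
    variables of an already-built bilm/training.py graph, then save a ckpt.

    name_map maps each TF variable name to its hdf5 dataset path; that
    mapping mirrors the custom getter in bilm/model.py and is assumed
    to be supplied by the caller.
    """
    with h5py.File(weight_file, 'r') as fin, tf.Session() as sess:
        # Start from random init so variables absent from the hdf5 file
        # (e.g. the softmax) still have values.
        sess.run(tf.global_variables_initializer())
        for var in tf.trainable_variables():
            if var.op.name not in name_map:
                continue
            value = np.asarray(fin[name_map[var.op.name]])
            sess.run(var.assign(value))
        tf.train.Saver().save(sess, ckpt_path)
```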

@matt-peters
Contributor

matt-peters commented Jun 14, 2018

The architectures are the same. However, the hdf5 file doesn't contain the softmax weights as they aren't needed to compute the ELMo representations, and they significantly increase the size of the file. I'll make the checkpoint file for the pre-trained model with the softmax weights available shortly.
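
A quick way to confirm this is to list the contents of the weight file; a minimal sketch, assuming h5py is installed (the filename is a placeholder):

```python
import h5py

# Print every group/dataset path in the released weight file; the softmax
# parameters do not appear among them.
with h5py.File('elmo_weights.hdf5', 'r') as fin:
    fin.visit(print)
```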

@asiddhant
Author

Thanks a lot, I missed that part. It would still be great to have the ckpt file, but since the vocab will be new anyway, the pretrained softmax would not be required for fine-tuning either. So I am closing this issue.
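
That fine-tuning setup (new vocab, fresh softmax) amounts to restoring everything except the softmax variables from the checkpoint. A minimal sketch; the `lm/softmax` scope name is an assumption about the naming in bilm/training.py, so check it against the variables in your graph:

```python
import tensorflow as tf

# Restore all variables except the softmax, which is re-initialized for
# the new vocabulary and trained from scratch.
restore_vars = [v for v in tf.global_variables()
                if not v.op.name.startswith('lm/softmax')]
saver = tf.train.Saver(restore_vars)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())      # fresh init, incl. softmax
    saver.restore(sess, '/path/to/pretrained.ckpt')  # overwrite the rest
    # ... continue training on the new data from here
```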

asiddhant reopened this Jul 18, 2018
@asiddhant
Author

Hi, would it be possible for you to share the checkpoint file as well?

@matt-peters
Contributor
