[Bug]: Shared layers in multi-task model are no longer shared after loading the model from a checkpoint #3446
Comments

For this specific example, I think either of the following two methods works. (Please let me know if you see any problem with these two methods.) I was also wondering whether this bug can be fixed inside `MultitaskModel.load` itself.

Method 1: assign the embedding layers of one task to the other tasks.

Method 2: create each component in the same way it was created initially and load the state dicts separately.

Both methods are sketched below.
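A minimal sketch of both workarounds. It assumes Flair's default task ids (`Task_0`, `Task_1`), under which each task model is reachable as an attribute of the `MultitaskModel`; the checkpoint path, corpora, and transformer name are hypothetical stand-ins (the same ones used in the reconstruction under "To Reproduce" below):

```python
from flair.datasets import NER_ENGLISH_RESTAURANT, WNUT_17
from flair.embeddings import TransformerWordEmbeddings
from flair.models import MultitaskModel, SequenceTagger

CHECKPOINT = "resources/taggers/multitask/model_epoch_1.pt"  # hypothetical path

# ---- Method 1: re-tie the embedding layers after loading ----------------
loaded = MultitaskModel.load(CHECKPOINT)

# "Task_0"/"Task_1" are Flair's default task ids; each task model is
# registered as an attribute of the MultitaskModel under its id.
loaded.Task_1.embeddings = loaded.Task_0.embeddings
assert loaded.Task_0.embeddings is loaded.Task_1.embeddings

# ---- Method 2: rebuild with sharing intact, then copy the weights -------
# Recreate each component exactly as it was created for the original
# training run: ONE embedding instance passed to BOTH taggers. The corpora
# and transformer name here are example stand-ins.
corpus_a, corpus_b = WNUT_17(), NER_ENGLISH_RESTAURANT()
shared = TransformerWordEmbeddings("xlm-roberta-base", fine_tune=True)
tagger_a = SequenceTagger(
    hidden_size=256,
    embeddings=shared,
    tag_dictionary=corpus_a.make_label_dictionary(label_type="ner"),
    tag_type="ner",
)
tagger_b = SequenceTagger(
    hidden_size=256,
    embeddings=shared,  # same object -> layers are shared
    tag_dictionary=corpus_b.make_label_dictionary(label_type="ner"),
    tag_type="ner",
)
fresh = MultitaskModel([tagger_a, tagger_b])

# Copy the trained weights over. Because the fresh model shares a single
# embedding module, both "Task_0.embeddings.*" and "Task_1.embeddings.*"
# checkpoint entries are written into that one module (the second write
# wins if the two copies diverged during training).
fresh.load_state_dict(MultitaskModel.load(CHECKPOINT).state_dict())
assert fresh.Task_0.embeddings is fresh.Task_1.embeddings
```

Method 1 is the lighter fix, since it simply discards one of the two diverged embedding copies; Method 2 restores the original object graph before any further fine-tuning.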
Describe the bug
Thank you for developing and maintaining this invaluable module!
We would like to train a multi-task model on two NER tasks that share a transformer word embedding.
We fine-tuned the model for several epochs and saved a checkpoint after every epoch by passing `save_model_each_k_epochs=1` to the `fine_tune` function.
Now assume we would like to continue fine-tuning from a previously saved checkpoint. We loaded the model from that checkpoint by calling `MultitaskModel.load`. However, the transformer word embedding is no longer shared between the two tasks.

To Reproduce
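A minimal reconstruction of the described setup. The corpora (`WNUT_17`, `NER_ENGLISH_RESTAURANT`), the transformer name, the output path, and the per-epoch checkpoint filename are assumptions, not the reporter's exact code:

```python
from flair.datasets import NER_ENGLISH_RESTAURANT, WNUT_17
from flair.embeddings import TransformerWordEmbeddings
from flair.models import MultitaskModel, SequenceTagger
from flair.nn.multitask import make_multitask_model_and_corpus
from flair.trainers import ModelTrainer

# Two example NER corpora standing in for the actual datasets.
corpus_a, corpus_b = WNUT_17(), NER_ENGLISH_RESTAURANT()

# ONE embedding instance passed to BOTH taggers -> shared transformer.
shared = TransformerWordEmbeddings("xlm-roberta-base", fine_tune=True)

tagger_a = SequenceTagger(
    hidden_size=256,
    embeddings=shared,
    tag_dictionary=corpus_a.make_label_dictionary(label_type="ner"),
    tag_type="ner",
)
tagger_b = SequenceTagger(
    hidden_size=256,
    embeddings=shared,
    tag_dictionary=corpus_b.make_label_dictionary(label_type="ner"),
    tag_type="ner",
)

multitask_model, multicorpus = make_multitask_model_and_corpus(
    [(tagger_a, corpus_a), (tagger_b, corpus_b)]
)

# Before training, both tasks point at the very same module.
assert multitask_model.Task_0.embeddings is multitask_model.Task_1.embeddings

trainer = ModelTrainer(multitask_model, multicorpus)
trainer.fine_tune(
    "resources/taggers/multitask",  # hypothetical output path
    max_epochs=2,
    save_model_each_k_epochs=1,  # writes a model_epoch_<n>.pt each epoch
)

# After reloading a per-epoch checkpoint, the embeddings are two
# independent copies -> the reported bug.
loaded = MultitaskModel.load("resources/taggers/multitask/model_epoch_1.pt")
assert loaded.Task_0.embeddings is not loaded.Task_1.embeddings
```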
Expected behavior
Shared layers between tasks are still shared after loading from a checkpoint.
Logs and Stack traces
No response
Screenshots
No response
Additional Context
No response
Environment
Versions:
Flair: 0.13.1
PyTorch: 2.0.0+cu117
Transformers: 4.40.0
GPU: True