Missing models for training #18

Closed
pengshancai opened this issue Mar 8, 2022 · 2 comments

Comments

@pengshancai

Dear author, I tried to load the fluency_news_model_file model but failed. It seems that "news_gpt2_bs32.bin" is not provided in the release.

I tried replacing it with "fluency_news_bs32.bin", but it does not seem to match GeneTransformer. That is, when I load the fluency model using
modelf = GeneTransformer(max_output_length=args.max_output_length, device=args.device, starter_model=fluency_news_model_file)
it shows
IncompatibleKeys(missing_keys=['transformer.h.0.attn.masked_bias', 'transformer.h.1.attn.masked_bias', 'transformer.h.2.attn.masked_bias', 'transformer.h.3.attn.masked_bias', 'transformer.h.4.attn.masked_bias', 'transformer.h.5.attn.masked_bias', 'transformer.h.6.attn.masked_bias', 'transformer.h.7.attn.masked_bias', 'transformer.h.8.attn.masked_bias', 'transformer.h.9.attn.masked_bias', 'transformer.h.10.attn.masked_bias', 'transformer.h.11.attn.masked_bias'], unexpected_keys=[])

Is this fine?

In addition, when I tried to load the keyword coverage model, the keys did not match either. That is, when running
modelc = KeywordCoverage(args.device, keyword_model_file=coverage_keyword_model_file, model_file=coverage_model_file)
it shows
IncompatibleKeys(missing_keys=['bert.embeddings.position_ids', 'cls.predictions.decoder.bias'], unexpected_keys=[])

I am wondering how to deal with this situation.

@tingofurro
Collaborator

Hey Pengshan,

Thanks for reaching out. The missing keys come from a version mismatch in Transformers: the library made very minor modifications to the model, so some parameters moved around. From experience, the models still work; you typically have to load with strict=False so that the program keeps going.
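A minimal sketch of how the result of a strict=False load could be sanity-checked, so that only the keys reported in this thread are tolerated and anything else is flagged. The helper name `suspicious_keys` and the `HARMLESS_SUFFIXES` list are hypothetical and not part of the repo. The list is an assumption drawn from the two messages above, on the understanding that these entries are buffers or a tied bias that newer Transformers versions create themselves rather than trained weights.

```python
from typing import Iterable, List, Tuple

# Assumption: key suffixes that are safe to be missing from an older checkpoint.
# They are copied from the IncompatibleKeys messages quoted in this thread.
HARMLESS_SUFFIXES: Tuple[str, ...] = (
    ".attn.masked_bias",
    "embeddings.position_ids",
    "cls.predictions.decoder.bias",
)


def suspicious_keys(
    missing_keys: Iterable[str],
    unexpected_keys: Iterable[str],
    harmless_suffixes: Tuple[str, ...] = HARMLESS_SUFFIXES,
) -> List[str]:
    """Return the keys that are NOT explained by the harmless suffixes.

    Every unexpected key is treated as suspicious; a missing key is
    suspicious unless it ends with one of the harmless suffixes.
    """
    leftover = [k for k in missing_keys if not k.endswith(harmless_suffixes)]
    return leftover + list(unexpected_keys)


# Typical use with PyTorch (not executed here; `model` and `path` are placeholders):
#   state_dict = torch.load(path, map_location="cpu")
#   result = model.load_state_dict(state_dict, strict=False)
#   assert not suspicious_keys(result.missing_keys, result.unexpected_keys)

if __name__ == "__main__":
    missing = [f"transformer.h.{i}.attn.masked_bias" for i in range(12)]
    print(suspicious_keys(missing, []))
    print(suspicious_keys(["transformer.wte.weight"], []))
```

If the returned list is empty, the mismatch is limited to the keys already discussed here. A non-empty list suggests the checkpoint and the model class really do differ and the load should not be trusted.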

We had a follow-up paper called Keep it Simple (on text simplification), and its Coverage and Fluency models are more recent and slightly better. I would use those; they have the same APIs, so they should be easy to swap in. (In particular, I simplified the Coverage model.)

Let me know if it doesn't work on your end!

@pengshancai
Author

Thanks, the new repo helps!
