Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Base Model Name of LlamaForQuestionAnswering #29258

Merged
merged 3 commits into from Mar 1, 2024

Conversation

lenglaender
Copy link
Contributor

@lenglaender lenglaender commented Feb 23, 2024

What does this PR do?

The LlamaForQuestionAnswering currently has the LlamaModel in the transformer variable. This does not match the base_model_prefix set in LlamaPreTrainedModel, which is "model".

This Pull Request changes the name from transformer to model in LlamaForQuestionAnswering

Who can review?

@lenglaender lenglaender marked this pull request as ready for review February 23, 2024 23:20
Copy link
Contributor

@younesbelkada younesbelkada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR ! Unfortunately this is a breakign change - you could overwrite the base_model_prefix only for that class though, what do you think?

@lenglaender
Copy link
Contributor Author

lenglaender commented Feb 27, 2024

True, I didn't think about whether renaming the variable would be a breaking change. In this case, setting base_model_prefix to the right value is sufficient. I changed it.

Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be "breaking" but since it was not reported it means it was not used as you mentioned you cannot save + load

@ArthurZucker
Copy link
Collaborator

@younesbelkada feel free to merge if it is alright with you

Copy link
Contributor

@younesbelkada younesbelkada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks !

@younesbelkada younesbelkada merged commit 2858d6c into huggingface:main Mar 1, 2024
19 checks passed
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

calpt added a commit to adapter-hub/adapters that referenced this pull request Apr 6, 2024
Changes:
- HF changed parts of the Llama model implementation
- HF added a `LlamaForQuestionAnswering`. However, this model has a
wrong base model name. I added a workaround that solves this problem
until this is fixed in Transformers
(huggingface/transformers#29258)

---------

Co-authored-by: calpt <calpt@mail.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants