Idefics-2-base model fine-tuning throws indexing error #30464
Comments
I have the same issue.
Hi @rabiulcste @BiliBraker thanks for reporting! cc @VictorSanh in case you have an immediate idea why this is happening?
The problem you are running into is that the tokenizer for the base model is incorrect and contains the <end_of_utterance> token (probably it's exactly the same as the chat model's), but the base model's embedding layer doesn't have it. So if you reuse dataset/collator code written for fine-tuning the chat model and call the processor.apply_chat_template function, it will add a token id that doesn't exist in the model, and the model's embedding layer will fail.
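The failure mode described above can be reproduced in miniature with a toy embedding table. The sizes below are hypothetical, chosen only to illustrate the situation where the tokenizer emits an id one past the last row of the embedding matrix:

```python
import torch

# Hypothetical sizes: the embedding table has N rows, but the tokenizer
# can emit id N (e.g. for <end_of_utterance>), which is out of range.
num_embedding_rows = 32002   # rows actually present in the base model (assumed)
extra_token_id = 32002       # id the tokenizer assigns to the extra token (assumed)

embedding = torch.nn.Embedding(num_embedding_rows, 8)
input_ids = torch.tensor([[1, 5, extra_token_id]])

try:
    embedding(input_ids)  # tries to look up row 32002 -> out of range
except IndexError as err:
    print("IndexError:", err)
```

A quick sanity check on the real checkpoints is to compare `len(processor.tokenizer)` with `model.get_input_embeddings().num_embeddings`; if the tokenizer is larger, some token ids can never be embedded.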
@jjkjkj That's a good find. So, for now I just removed the token and it seems to be working.
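The workaround mentioned above amounts to stripping the marker from the templated text before tokenization. A minimal sketch, where the prompt string is a made-up stand-in for the output of processor.apply_chat_template:

```python
# Stand-in for the text produced by processor.apply_chat_template (assumed shape).
prompt = "User: What do you see in the image?<end_of_utterance>\nAssistant:"

# Drop the marker that the base model's embedding layer cannot handle.
clean_prompt = prompt.replace("<end_of_utterance>", "")
print(clean_prompt)
```

An alternative, if you would rather keep the token, is to grow the embedding matrix to match the tokenizer with `model.resize_token_embeddings(len(processor.tokenizer))`, though that leaves the newly added row untrained.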
I wanted to mention another issue in the same script. It doesn't occur when QLoRA is set to True.
@rabiulcste Can you open a new issue with this info? This helps us keep better track of what has and hasn't been resolved, as well as find similar issues.
Does not ring a bell unfortunately :/ I need to focus on the idefics2 2nd release wave, but will for sure allocate time to dig into this this week if it's not solved by then.
Sure, I'll create a new issue then. I have a couple more issues though :) Is it suggested to create a separate issue for each?
@rabiulcste Yes please, as long as they're independent.
@VictorSanh No need to dig! The issue was found and explained by @jjkjkj here: it was to do with the presence of the <end_of_utterance> token. In fact, we can now close this issue :)
System Info
transformers version: 4.40.0.dev0

Who can help?
@amyeroberts
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
Use HuggingFaceM4/idefics2-8b-base as the model name. This behavior doesn't appear with the instruction-tuned checkpoint HuggingFaceM4/idefics2-8b.
https://colab.research.google.com/drive/1NtcTgRbSBKN7pYD3Vdx1j9m8pt3fhFDB?usp=sharing

Expected behavior
The model should work fine, just like the instruction-tuned one. I suppose it might be a tokenization issue.