
Embed Token Missing #18

Closed · Youho99 opened this issue Apr 10, 2024 · 3 comments


Youho99 commented Apr 10, 2024

I use LLaVA-1.5-13B

I want to add a LoRA to it (any one, just to test), like this one for example.
However, I run into an error telling me that the embed tokens do not exist.

After extensive research, here is what I found on the internet:

For certain models, the embedding layers are also trained during LoRA fine-tuning (for example, when new tokens appear only in the fine-tuning dataset):
TUDB-Labs/mLoRA#122

LLaMA (and therefore, by extension, LLaVA) is one of these models.
According to the comment on lines 1334 to 1337 of this code: https://github.com/FartyPants/Training_PRO/blob/main/script.py

# modules_to_save = ["lm_head", "embed_tokens"]
# If you added new tokens to the tokenizer, you may need to save some LoRA modules because they need to know the new tokens.
# For LLaMA and Mistral, you need to save `embed_tokens` and `lm_head`. It may vary for other models.
# `embed_tokens` converts tokens to embeddings, and `lm_head` converts embeddings to token probabilities.

You must then list ["lm_head", "embed_tokens"] in modules_to_save (see the sketch below).

In your LoRA these modules are not configured, so loading it raises an error: the retrained embedding layer and the regenerated tokens are not included in the LoRA you provide.
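
For illustration, here is a minimal PEFT LoraConfig sketch showing where modules_to_save goes; the rank, alpha, and target modules are placeholder values, not ChartLlama's actual training settings:

from peft import LoraConfig

# Hypothetical training-time config: r, lora_alpha, and target_modules
# are illustrative placeholders, not ChartLlama's actual settings.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    # Save the full embedding and output layers with the adapter so
    # any newly added tokens can be resolved when the LoRA is loaded.
    modules_to_save=["embed_tokens", "lm_head"],
)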

Don't hesitate to tell me if I'm wrong about something.

@tingxueronghua (Owner)

Hi, thanks for your detailed explanation.

I did not fine-tune lm_head or embed_tokens, so ChartLlama does not contain their weights. The script https://github.com/tingxueronghua/ChartLlama-code/blob/main/model_vqa_lora.py should work well for inference.

I am not sure how you load LLaVA-1.5-13B, but this link https://huggingface.co/liuhaotian/llava-v1.5-13b/tree/main should contain all the modules needed.
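
For reference, the generic PEFT pattern for attaching a LoRA at inference time looks roughly like this. This is only a sketch: LLaVA checkpoints normally go through LLaVA's own model builder (as model_vqa_lora.py does) rather than plain transformers, and the adapter path is a placeholder:

from transformers import AutoModelForCausalLM
from peft import PeftModel

# Sketch only: LLaVA usually needs its custom loader, and the adapter
# directory below is a placeholder, not an actual released path.
base = AutoModelForCausalLM.from_pretrained("liuhaotian/llava-v1.5-13b")
model = PeftModel.from_pretrained(base, "path/to/chartllama-lora")
model.eval()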


Youho99 commented Apr 11, 2024

I use the LLaVA model via text-generation-webui (https://github.com/oobabooga/text-generation-webui).

The multimodal model part works, but when a LoRA is applied, I hit the problem described above. Here is an issue describing it: oobabooga/text-generation-webui#5826

I therefore suppose there is a bug in the multimodal implementation of the text-generation-webui repo.

So I will try this repo to use your LoRA.

@tingxueronghua (Owner)


I have not used that repository before, so I cannot provide suggestions... sorry for that. Maybe you need to do some work to transfer ChartLlama's LoRA weights to it. Feel free to reopen this issue if you still have other questions.
