fix gemma4 num attention head bugs #7975
mingxiang1006 wants to merge 2 commits into deepspeedai:master from
Conversation
Hi @mingxiang1006, thanks for the fix. I have two questions:
Hi @delock, it was a temporary fix. I agree with your suggestion: we should fall back to the text config if hf_model_config does not have the key. Yes, it needs further thought on when to trigger the text config versus the vision config.
Hi, we can start by making text_config a fallback path. For picking between the text config and the vision config, does the modeling code know which one is being used? It might be okay to stay with text_config for the time being, because Ulysses SP is more likely to be used on text than vision, but I want a better understanding of the mechanism behind Gemma4.
The error occurs because num attention heads is not exposed directly on Gemma4Config; it lives in a sub-config under it (the Gemma4 text, visual, or audio config), so grabbing it from the top-level config fails. This causes a runtime error during DeepSpeed launch.
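A minimal sketch of the fallback path discussed above. This is not the actual DeepSpeed code; the helper name and the stand-in configs are assumptions, illustrating only the lookup order: top-level attribute first, then the nested text_config that multimodal HF configs typically carry.

```python
from types import SimpleNamespace


def get_num_attention_heads(hf_model_config):
    """Hypothetical helper: return num_attention_heads, falling back to
    the text sub-config for multimodal configs (e.g. a Gemma-style
    config that nests text_config / vision_config)."""
    # Direct attribute on the top-level config (most text-only models).
    heads = getattr(hf_model_config, "num_attention_heads", None)
    if heads is not None:
        return heads
    # Fallback: multimodal configs keep it under text_config.
    text_config = getattr(hf_model_config, "text_config", None)
    if text_config is not None:
        heads = getattr(text_config, "num_attention_heads", None)
        if heads is not None:
            return heads
    raise AttributeError(
        "num_attention_heads not found on the config or its text_config")


# Quick check with stand-in configs (not real HF config objects):
flat = SimpleNamespace(num_attention_heads=32)
nested = SimpleNamespace(text_config=SimpleNamespace(num_attention_heads=16))
```

Picking between text and vision would need an extra signal (e.g. which sub-model Ulysses SP is sharding), which is the open question in the thread above.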