Firstly, thank you so much for building this. I am really looking forward to using it with LangChain to get chat functions into Slack. Hopefully they integrate it soon! When my Python is a bit more up to scratch, I'll hopefully be able to get involved!
Secondly, I'm experiencing an issue when using the GPU containers for my models using safetensors. The output is long but here is a snippet:
llm-api-app | File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
llm-api-app | raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
llm-api-app | RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM:
llm-api-app | Missing key(s) in state_dict: "model.layers.0.self_attn.k_proj.g_idx", "model.layers.0.self_attn.o_proj.g_idx", "model.layers.0.self_attn.q_proj.g_idx", "model.layers.0.self_attn.v_proj.g_idx", "model.layers.0.mlp.down_proj.g_idx", "model.layers.0.mlp.gate_proj.g_idx", "model.layers.0.mlp.up_proj.g_idx", "model.layers.1.self_attn.k_proj.g_idx", "model.layers.1.self_attn.o_proj.g_idx", "model.layers.1.self_attn.q_proj.g_idx", "model.layers.1.self_attn.v_proj.g_idx", "model.layers.1.mlp.down_proj.g_idx", "model.layers.1.mlp.gate_proj.g_idx", "model.layers.1.mlp.up_proj.g_idx", "model.layers.2.self_attn.k_proj.g_idx", "model.layers.2.self_attn.o_proj.g_idx", "model.layers.2.self_attn.q_proj.g_idx", "model.layers.2.self_attn.v_proj.g_i ... snip ...
I have tried a few models, all of which exhibit this issue. If you want to test with one, I reliably get the error with the following model: TheBloke/wizard-mega-13B-GPTQ
After running a safetensors model, I then cannot run other models either, e.g. anon8231489123/gpt4-x-alpaca-13b-native-4bit-128g
I am using your upstream image for this (not building locally) via the provided compose file.
Is there anything I should be doing differently when using GPTQ models? Please let me know if you need more information.