
Support for LLAMA models #4

Closed

parth-chudasama opened this issue May 21, 2023 · 5 comments

Comments

@parth-chudasama

Hi, is there any plan to support LLaMA-based models?

@aws-rhsoln

We are working on adding support for LLaMA in an upcoming release. We will update this issue once we have it.

@romanserg

Hi, I am trying to use the current implementation of the LlamaForSampling class. I am running this code:

from transformers import AutoModelForCausalLM
from transformers_neuronx.module import save_pretrained_split
from transformers_neuronx.llama.model import LlamaForSampling

model = AutoModelForCausalLM.from_pretrained("openlm-research/open_llama_7b_700bt_preview")
save_pretrained_split(model, 'llama-split')
model_neuron = LlamaForSampling.from_pretrained('llama-split', batch_size=1, tp_degree=2, n_positions=256, amp='f32', unroll=None)
model_neuron.to_neuron()

and I am getting this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[22], line 1
----> 1 model_neuron.to_neuron()

File ~/anaconda3/envs/llm310/lib/python3.10/site-packages/transformers_neuronx/llama/model.py:76, in LlamaForSampling.to_neuron(self)
     73 new_layer.add_pre_mlp_layer_norm(layer.post_attention_layernorm.weight.detach(), None)
     75 # Note: Automatic MLP padding is safe since zeros are *only* introduced to intermediary state
---> 76 new_layer.add_parameter(mlp.gate_proj.weight.T, sharding=1, allow_pad=True)
     77 new_layer.add_parameter(mlp.up_proj.weight.T, sharding=1, allow_pad=True)
     78 new_layer.add_parameter(mlp.down_proj.weight.T, sharding=0, allow_pad=True)

File ~/anaconda3/envs/llm310/lib/python3.10/site-packages/torch/nn/modules/module.py:1269, in Module.__getattr__(self, name)
   1267     if name in modules:
   1268         return modules[name]
-> 1269 raise AttributeError("'{}' object has no attribute '{}'".format(
   1270     type(self).__name__, name))

AttributeError: 'DecoderLayer' object has no attribute 'add_parameter'

Do you have any ideas regarding this issue?
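
For readers hitting the same traceback: the later comments indicate Llama support was still under development at this time, so one plausible cause is an installed transformers-neuronx release that predates the add_parameter API on DecoderLayer. A minimal diagnostic sketch, assuming the package was installed with pip under the distribution name transformers-neuronx and that DecoderLayer lives in transformers_neuronx.decoder (both assumptions, not confirmed in this thread):

import importlib.metadata

# Report the installed transformers-neuronx release
# (pip distribution name assumed).
print(importlib.metadata.version("transformers-neuronx"))

# Check whether this release's DecoderLayer exposes add_parameter at all;
# the module path transformers_neuronx.decoder is an assumption.
from transformers_neuronx.decoder import DecoderLayer
print(hasattr(DecoderLayer, "add_parameter"))

If the second print shows False, upgrading the package (or installing from the main branch linked below) should resolve the AttributeError.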

@AWSGH

AWSGH commented Jul 10, 2023

Llama is still under development; please follow progress here: https://github.com/aws-neuron/transformers-neuronx/tree/main/src/transformers_neuronx/llama

@aws-donkrets

Hi parth-chudasama, SDK releases > 2.14 offer support for Llama 2 models. Can you download one and let us know if it works for you? We plan to continue improving accuracy and performance in subsequent releases.
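
For reference, loading a Llama 2 checkpoint follows the same pattern as the snippet earlier in this thread. A minimal sketch; the checkpoint ID, tp_degree, amp, and sequence settings below are illustrative assumptions, not values confirmed in this thread:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
from transformers_neuronx.module import save_pretrained_split
from transformers_neuronx.llama.model import LlamaForSampling

# Illustrative checkpoint; any Llama 2 causal-LM checkpoint should follow the same flow.
checkpoint = "meta-llama/Llama-2-7b-hf"
model = AutoModelForCausalLM.from_pretrained(checkpoint)
save_pretrained_split(model, "llama-2-split")

# Compile for Neuron; tp_degree / amp / n_positions here are example values
# to be sized for the actual instance and workload.
model_neuron = LlamaForSampling.from_pretrained(
    "llama-2-split", batch_size=1, tp_degree=2, n_positions=256, amp="f16"
)
model_neuron.to_neuron()

# Quick smoke test: sample a short continuation.
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
input_ids = tokenizer("Hello, my name is", return_tensors="pt").input_ids
with torch.inference_mode():
    generated = model_neuron.sample(input_ids, sequence_length=64)
print(tokenizer.decode(generated[0]))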

@mrnikwaws
Contributor

Closing since LlamaV2 support has now been added.
