Support for LLaMA #841

Merged · 12 commits merged into EleutherAI:main on May 2, 2023
Conversation

zphang (Contributor) commented on Mar 18, 2023

No description provided.

zphang requested a review from a team as a code owner on March 18, 2023

Review threads: megatron/model/transformer.py (2, resolved); megatron/neox_arguments/neox_args.py (2, outdated, resolved)
StellaAthena (Member) commented

@zphang where are we on this?

zphang (Contributor, Author) commented on Apr 12, 2023

Other than the above note on LLaMAMLP, I can incorporate the necessary changes and update the PR.

zphang (Contributor, Author) commented on Apr 18, 2023

Addressed comments. Please take another look. For further additions (e.g. guide to tuning a LLaMA model), I can do a separate PR.

StellaAthena (Member) commented

I'm confused by the "make_vocab_size_divisible_by" argument in the config file. Can you explain what it does? We generally assume that the data has been preprocessed, including pre-tokenized, and that's the point at which it would make sense to have such a thing (and we do).

Otherwise, this looks good to me. @Quentin-Anthony, any thoughts?

zphang (Contributor, Author) commented on Apr 20, 2023

make_vocab_size_divisible_by is a pre-existing NeoX argument. If I understand correctly, NeoX determines the vocab size from the tokenizer and then pads it to a multiple of make_vocab_size_divisible_by for computational-efficiency reasons. Setting it to 1 effectively disables the padding. The feature isn't LLaMA-specific, but I think this makes sense as a default in the LLaMA configs.
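For illustration, a minimal sketch of the rounding involved (the exact formula is an assumption; the real logic lives in the NeoX/Megatron code and also folds in the model-parallel size):

```python
def pad_vocab_size(vocab_size: int,
                   make_vocab_size_divisible_by: int,
                   model_parallel_size: int = 1) -> int:
    """Round the tokenizer's vocab size up to the nearest multiple of
    make_vocab_size_divisible_by * model_parallel_size (sketch only)."""
    multiple = make_vocab_size_divisible_by * model_parallel_size
    return ((vocab_size + multiple - 1) // multiple) * multiple

# LLaMA's tokenizer has 32,000 tokens, already a round number, so with
# make_vocab_size_divisible_by = 1 no extra embedding rows are added.
print(pad_vocab_size(32000, 1))    # 32000
print(pad_vocab_size(32000, 128))  # 32000
print(pad_vocab_size(50257, 128))  # 50304 (a GPT-2-style vocab gets padded)
```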

Quentin-Anthony (Member) commented

> I'm confused by the "make_vocab_size_divisible_by" argument in the config file. Can you explain what it does? We generally assume that the data has been preprocessed, including pre-tokenized, and that's the point at which it would make sense to have such a thing (and we do).
>
> Otherwise, this looks good to me. @Quentin-Anthony, any thoughts?

Looking good to me. I'm gonna run some tests then merge.

Quentin-Anthony self-assigned this on Apr 21, 2023
Quentin-Anthony added the merge-queue label (this PR is next on the queue to merge) on Apr 21, 2023
CLAassistant commented on Apr 23, 2023

CLA assistant check
All committers have signed the CLA.

StellaAthena self-requested a review on April 25, 2023

StellaAthena (Member) commented

@zphang We need you to sign the CLA before merging this PR :)

zphang (Contributor, Author) commented on Apr 25, 2023

Signed!

Quentin-Anthony dismissed StellaAthena's stale review on May 2, 2023: "all comments appear resolved now"

Quentin-Anthony (Member) commented

Works for all of my tests. Merging.

Quentin-Anthony merged commit 299b68c into EleutherAI:main on May 2, 2023
DaoD commented on May 4, 2023

@zphang Thanks for your work! It seems that there is no params.json in the llama checkpoint. Where can I get it? Thanks!

wiio12 commented on May 4, 2023

@zphang Hi, thank you for your work! Can you provide a README explaining how to run LLaMA in NeoX?

wiio12 commented on May 4, 2023

> @zphang Thanks for your work! It seems that there is no params.json in the llama checkpoint. Where can I get it? Thanks!

Hi @DaoD, the params.json file can be found in the official checkpoint (it requires filling out the Google form).

DaoD commented on May 4, 2023

> > @zphang Thanks for your work! It seems that there is no params.json in the llama checkpoint. Where can I get it? Thanks!
>
> Hi @DaoD, the params.json file can be found in the official checkpoint (it requires filling out the Google form).

Thanks so much!

DaoD commented on May 4, 2023

> @zphang Hi, thank you for your work! Can you provide a README explaining how to run LLaMA in NeoX?

Yes. I have converted the checkpoint and tried to use the 6-7B config to load it, but there are some missing keys and unexpected keys in the state dict. Could you please provide a README for using it?

wiio12 commented on May 4, 2023

Hi @DaoD, I think LLaMA should be loaded with the llama/7B.yml config file.

I posted an issue that discusses problems I came across when loading LLaMA in NeoX.

DaoD commented on May 5, 2023

@zphang I think we need a new convert_sequential_to_hf.py to convert the obtained model into HF style. Have you done something about this? Thanks!

borghives commented on May 5, 2023

@zphang In tools/convert_raw_llama_weights_to_neox.py, how does one convert the LLaMA tokenizer?

  • I have tried using the documented flag in the tool, "--model_size tokenizer_only" (got the error "assert model_size in NUM_SHARDS" when running convert_raw_llama_weights_to_neox.py).
  • I have tried using the Hugging Face LLaMA tokenizer.json file (got an error about a missing "merge-file" when trying to run generate).

DaoD commented on May 6, 2023

> @zphang In tools/convert_raw_llama_weights_to_neox.py, how does one convert the LLaMA tokenizer?
>
>   • I have tried using the documented flag in the tool, "--model_size tokenizer_only" (got the error "assert model_size in NUM_SHARDS" when running convert_raw_llama_weights_to_neox.py).
>   • I have tried using the Hugging Face LLaMA tokenizer.json file (got an error about a missing "merge-file" when trying to run generate).

I think you do not need to convert the LLaMA tokenizer. Just set

"tokenizer_type": "SPMTokenizer",
"vocab-file": "/llama-7b-hf/tokenizer.model",

The tokenizer.model can be obtained from this link.
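As a quick sanity check that tokenizer.model is the SentencePiece model that SPMTokenizer expects, a minimal sketch (assuming the sentencepiece package and a hypothetical local path):

```python
import sentencepiece as spm

# Hypothetical path; point this at the tokenizer.model from the LLaMA release.
sp = spm.SentencePieceProcessor(model_file="/llama-7b-hf/tokenizer.model")

print(sp.vocab_size())            # LLaMA's tokenizer should report 32000
ids = sp.encode("Hello, world!")  # encode text to token ids
print(ids)
print(sp.decode(ids))             # should round-trip back to the original text
```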

DaoD commented on May 6, 2023

I found another problem. The eod token id of the SentencePieceTokenizer (eos_token_id=0) is different from that of the original LlamaTokenizer (eos_token_id=1), which may cause some problems in training and inference.
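One way to see the discrepancy is to print the special-token ids from both sides. A hedged sketch (assuming the sentencepiece and transformers packages and hypothetical local paths; the exact ids you see depend on the tokenizer versions involved):

```python
import sentencepiece as spm
from transformers import LlamaTokenizer

# Hypothetical paths to the same underlying SentencePiece model.
sp = spm.SentencePieceProcessor(model_file="/llama-7b-hf/tokenizer.model")
hf_tok = LlamaTokenizer.from_pretrained("/llama-7b-hf")

# Raw SentencePiece special-token ids.
print("sentencepiece:", {"unk": sp.unk_id(), "bos": sp.bos_id(),
                         "eos": sp.eos_id(), "pad": sp.pad_id()})

# Hugging Face LlamaTokenizer special-token ids.
print("transformers :", {"bos": hf_tok.bos_token_id,
                         "eos": hf_tok.eos_token_id,
                         "pad": hf_tok.pad_token_id})

# If the end-of-document id NeoX uses does not match the id the model was
# trained or converted with, data packing and generation stopping can both
# misbehave.
```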

wiio12 commented on May 8, 2023

> I found another problem. The eod token id of the SentencePieceTokenizer (eos_token_id=0) is different from that of the original LlamaTokenizer (eos_token_id=1), which may cause some problems in training and inference.

Hi @DaoD, have you tried to do inference with this model? Can you generate reasonable text with it?

DaoD commented on May 8, 2023

> > I found another problem. The eod token id of the SentencePieceTokenizer (eos_token_id=0) is different from that of the original LlamaTokenizer (eos_token_id=1), which may cause some problems in training and inference.
>
> Hi @DaoD, have you tried to do inference with this model? Can you generate reasonable text with it?

Yes, but I do not use the inference code provided by gpt-neox. I just convert the model into Hugging Face format and use the HF generation function. It seems correct.
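For anyone taking the same route, a minimal generation sketch against the Hugging Face API (the checkpoint path is hypothetical and assumed to contain both the converted weights and the tokenizer files):

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

path = "/path/to/converted-llama-7b-hf"  # hypothetical location of the converted checkpoint

tokenizer = LlamaTokenizer.from_pretrained(path)
model = LlamaForCausalLM.from_pretrained(path, torch_dtype=torch.float16).to("cuda")
model.eval()

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```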

wiio12 commented on May 8, 2023

> > > I found another problem. The eod token id of the SentencePieceTokenizer (eos_token_id=0) is different from that of the original LlamaTokenizer (eos_token_id=1), which may cause some problems in training and inference.
> >
> > Hi @DaoD, have you tried to do inference with this model? Can you generate reasonable text with it?
>
> Yes, but I do not use the inference code provided by gpt-neox. I just convert the model into Hugging Face format and use the HF generation function. It seems correct.

Thx! This helps a lot :)

Quentin-Anthony removed the merge-queue label on May 11, 2023
sxthunder commented

Hello, can you convert a gpt-neox LLaMA model to HF format?
