
Add llama3 #30334

Merged
merged 12 commits into from Apr 24, 2024

Conversation

ArthurZucker
Collaborator

@ArthurZucker ArthurZucker commented Apr 19, 2024

TODOs

  • docs/source/en/model_doc/llama3.md

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@devdevgoat

devdevgoat commented Apr 23, 2024

Managed to get this commit working, but it requires setting the flag --llama_version 3 when calling convert_llama_weights_to_hf.py:

python src/transformers/models/llama/convert_llama_weights_to_hf.py \
--input_dir /mnt/nvme1n1/Models/llama3/Meta-Llama-3-8B-Instruct \
--model_size 8B \
--output_dir /mnt/nvme1n1/Models/llama3/Meta-Llama-3-8B-Instruct-hf \
--llama_version 3

Additionally, it tries to delete the tmp folder before it is empty, so it throws an error at the end even though the conversion itself is successful.
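For what it's worth, a minimal sketch of the kind of defensive cleanup that would sidestep that error; the path is hypothetical and this is not the script's actual code:

import shutil

# Hypothetical temporary directory used during conversion
tmp_model_path = "/mnt/nvme1n1/Models/llama3/tmp"
# ignore_errors avoids failing if files are still present when the directory is removed
shutil.rmtree(tmp_model_path, ignore_errors=True)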

@ArthurZucker
Collaborator Author

it requires setting the flag --llama_version 3 when calling convert_llama_weights_to_hf.py

That is expected since you are converting version 3, but the script should still work for earlier versions.

@kisseternity

Hello, I noticed that Meta's Llama 3 blog describes a masking procedure that ensures self-attention does not cross document boundaries.

Should Llama 3 have an extra attention mask for this?

@ArthurZucker
Collaborator Author

ArthurZucker commented Apr 24, 2024

In transformers we usually let the user create the custom mask; the Trainer might even support it. But no, it's not a modeling code change we want 😉 you can already pass any 4D or 2D mask to Llama.
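For anyone who wants to try it, here is a minimal sketch (not part of this PR) of building such a mask by hand; it assumes recent transformers versions accept a user-supplied 4D additive mask (0 where attention is allowed, the dtype minimum where it is blocked) and use it as-is, and the checkpoint id is illustrative:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B"  # illustrative checkpoint
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

ids = tok("first document second document", return_tensors="pt").input_ids
seq_len = ids.shape[1]

# Pretend the second half of the packed sequence belongs to a different document
doc_ids = torch.zeros(seq_len, dtype=torch.long)
doc_ids[seq_len // 2 :] = 1

# A position may attend to another only if it is causal AND in the same document
causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
same_doc = doc_ids[:, None] == doc_ids[None, :]
allowed = causal & same_doc

# Additive mask: 0.0 where allowed, dtype minimum where blocked
mask = torch.zeros(seq_len, seq_len, dtype=model.dtype)
mask[~allowed] = torch.finfo(model.dtype).min

# Pass the mask as [batch, 1, seq_len, seq_len]
outputs = model(ids, attention_mask=mask[None, None])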

Member

@LysandreJik LysandreJik left a comment


LGTM

Comment on lines +33 to +42
<Tip warning={true}>

The `Llama3` models were trained using `bfloat16`, but the original inference uses `float16`. The checkpoints uploaded on the Hub use `torch_dtype = 'float16'`, which will be
used by the `AutoModel` API to cast the checkpoints from `torch.float32` to `torch.float16`.

The `dtype` of the online weights is mostly irrelevant unless you are using `torch_dtype="auto"` when initializing a model with `model = AutoModelForCausalLM.from_pretrained("path", torch_dtype="auto")`. The reason is that the model will first be downloaded (using the `dtype` of the checkpoints online), then cast to the default `dtype` of `torch` (`torch.float32`), and finally, if a `torch_dtype` is provided in the config, it will be used.

Training the model in `float16` is not recommended and is known to produce `nan`; as such, the model should be trained in `bfloat16`.

</Tip>
Member


Good!
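For readers following along, a small sketch of the dtype behaviour the tip describes (the checkpoint id is illustrative):

import torch
from transformers import AutoModelForCausalLM

# Default behaviour: the weights are loaded and end up in torch.float32,
# regardless of the dtype stored in the checkpoint.
model_fp32 = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
print(model_fp32.dtype)  # torch.float32

# With torch_dtype="auto", the torch_dtype recorded with the checkpoint
# (float16 for the Llama 3 checkpoints on the Hub) is used instead.
model_half = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B", torch_dtype="auto")
print(model_half.dtype)  # torch.float16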

@ArthurZucker
Collaborator Author

Failing tests are unrelated

@ArthurZucker ArthurZucker merged commit 89c510d into main Apr 24, 2024
15 of 23 checks passed
@ArthurZucker ArthurZucker deleted the add-llama3 branch April 24, 2024 08:11
@lhanchao777

I met this error when using convert_llama_weights_to_hf.py: ImportError: cannot import name 'TikTokenConverter' from 'transformers.convert_slow_tokenizer'. How can I solve it? My transformers version is 4.40.1.

@LysandreJik
Member

Hello @lhanchao777, conversion scripts should be used with a source install of transformers. You can install from source with the following:

pip install git+https://github.com/huggingface/transformers

You could also clone the repo and add it as an editable installation:

git clone https://github.com/huggingface/transformers
pip install -e ./transformers
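A quick way to confirm the source install is the one being picked up (the exact dev version string will vary):

import transformers

# A source install reports a .dev0 version, e.g. 4.41.0.dev0
print(transformers.__version__)

# This is the import that fails on the released 4.40.1
from transformers.convert_slow_tokenizer import TikTokenConverter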

@ZJL0111

ZJL0111 commented Apr 29, 2024

I met this error when using convert_llama_weights_to_hf.py: ImportError: cannot import name 'TikTokenConverter' from 'transformers.convert_slow_tokenizer'. How can I solve it? My transformers version is 4.40.1.

Hi @lhanchao777, I downloaded the model from ModelScope and hit the same problem when converting the model format.
To solve this, you need to install transformers from source: the source code is already at 4.41.0, while the released version is only 4.40.1.
I installed it with conda rather than pip:

Transformers can be installed using conda as follows:
conda install conda-forge::transformers
NOTE: Installing transformers from the huggingface channel is deprecated.
(See https://github.com/huggingface/transformers/tree/main)

Even after working around this, I still hit other bugs I had no clue about, so in the end I just loaded Llama 3 directly from Hugging Face:

import transformers
import torch
from huggingface_hub import login

login(token="your access token")

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.float16},
    device="cuda",
)
pipeline("Hey how are you doing today?")


Hope it helps!

@danieltmeta

Hi there! Not sure if this is the right place, but I'm trying to convert the 8B model to llama2. Is that possible by changing the flag in this fashion:

python src/transformers/models/llama/convert_llama_weights_to_hf.py \
--input_dir /path/to/downloaded/llama/weights \
--model_size 8B \
--output_dir /output/path \
--llama_version 2

--llama_version 3 works fine, but that result is not what I'm looking for in my application.

I keep getting this error:

RuntimeError: Internal: could not parse ModelProto from ./Meta-Llama-3-8B/original/tokenizer.model

Thanks!

@ZJL0111

ZJL0111 commented Apr 30, 2024 via email

@NekoMimiUnagi

Hi there! I followed the above instructions to convert Meta-Llama-3-8B to hf format but still got errors as follows:

Traceback (most recent call last):
  File "/data/models/origin-format/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py", line 407, in <module>
    main()
  File "/data/models/origin-format/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py", line 394, in main
    vocab_size = len(write_tokenizer(args.output_dir, spm_path, llama_version=args.llama_version))
  File "/data/models/origin-format/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py", line 358, in write_tokenizer
    tokenizer = Llama3Converter(input_tokenizer_path).tokenizer
  File "/data/models/origin-format/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py", line 319, in __init__
    tokenizer = self.converted()
  File "/data/models/origin-format/transformers/src/transformers/convert_slow_tokenizer.py", line 1534, in converted
    tokenizer = self.tokenizer()
  File "/data/models/origin-format/transformers/src/transformers/convert_slow_tokenizer.py", line 1527, in tokenizer
    vocab_scores, merges = self.extract_vocab_merges_from_model(self.vocab_file)
  File "/data/models/origin-format/transformers/src/transformers/convert_slow_tokenizer.py", line 1503, in extract_vocab_merges_from_model
    bpe_ranks = load_tiktoken_bpe(tiktoken_url)
  File "/home/xxx/miniconda3/envs/pytorch/lib/python3.8/site-packages/tiktoken/load.py", line 115, in load_tiktoken_bpe
    return {
  File "/home/xxx/miniconda3/envs/pytorch/lib/python3.8/site-packages/tiktoken/load.py", line 117, in <dictcomp>
    for token, rank in (line.split() for line in contents.splitlines() if line)
ValueError: too many values to unpack (expected 2)

I installed transformers from source under the folder /data/models/origin-format/transformers/; the version is 4.41.0.dev0. By the way, I followed the instructions above to install it.

Hello @lhanchao777, conversion scripts should be used with a source install of transformers. You can install from source with the following:

pip install git+https://github.com/huggingface/transformers

You could also clone the repo and add it as an editable installation:

git clone https://github.com/huggingface/transformers
pip install -e ./transformers

@ArthurZucker
Collaborator Author

You are most probably not using the correct original tokenizer.model. We proof-tested the script many times 😉

@danieltmeta

@ArthurZucker You are correct, I was not using this correctly. Thanks!
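For anyone hitting the "could not parse ModelProto" or "too many values to unpack" errors above, here is a hypothetical helper (not part of transformers) that checks whether a tokenizer.model file is a tiktoken BPE rank file, as shipped with Llama 3, rather than the SentencePiece protobuf that --llama_version 2 expects:

import base64

def looks_like_tiktoken_bpe(path):
    # A tiktoken rank file has one "base64-token<space>rank" pair per line
    with open(path, "rb") as f:
        first_line = f.readline().split()
    if len(first_line) != 2:
        return False
    token_b64, rank = first_line
    try:
        base64.b64decode(token_b64, validate=True)  # token must be valid base64
        int(rank)                                   # rank must be an integer
        return True
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False

# Path taken from the report above
print(looks_like_tiktoken_bpe("./Meta-Llama-3-8B/original/tokenizer.model"))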
