
Llama 3.1 8B and 70B checkpoints #1619

Merged: 4 commits merged into main on Jul 24, 2024
Conversation

@rasbt (Collaborator) commented on Jul 23, 2024

Adds the new Llama 3.1 8B and 70B checkpoints.

405B will be done separately as it requires tensor parallelism.

  • Update config files
  • Download and convert models
  • Update checkpoint conversion (if applicable)
  • Update prompt style (if applicable)
  • Test models for inference (generate, chat, Python API; see the sketch after this list)
  • Add unit tests
  • Try fine-tuning, add config_hub file
  • Update README
  • Update download tutorial page
  • Find out about / implement new RoPE (works great without it, but we should still look into this for correctness)
  • Make new release
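
To make the "Test models for inference" item above concrete, here is a minimal sketch using the litgpt Python API. The repo id "meta-llama/Meta-Llama-3.1-8B" and the prompt are assumptions until the download tutorial is updated:

# Minimal sketch, assuming the litgpt Python API and that the new 8B checkpoint
# is available under the (assumed) repo id "meta-llama/Meta-Llama-3.1-8B".
from litgpt import LLM

llm = LLM.load("meta-llama/Meta-Llama-3.1-8B")
print(llm.generate("What do llamas eat?", max_new_tokens=50))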

@rasbt rasbt marked this pull request as draft July 23, 2024 15:29
@rasbt (Collaborator, Author) commented on Jul 23, 2024

Finetuning works fine, but there's something weird about the RoPE scaling when evaluating with lm_eval. Haven't seen this before:

~ litgpt evaluate /teamspace/studios/this_studio/out/finetune/qlora-llama3.1-8b/final --tasks mmlu 
{'access_token': None,
 'batch_size': 1,
 'checkpoint_dir': PosixPath('/teamspace/studios/this_studio/out/finetune/qlora-llama3.1-8b/final'),
 'device': None,
 'dtype': None,
 'force_conversion': False,
 'limit': None,
 'num_fewshot': None,
 'out_dir': None,
 'save_filepath': None,
 'seed': 1234,
 'tasks': 'mmlu'}
{'checkpoint_dir': PosixPath('/teamspace/studios/this_studio/out/finetune/qlora-llama3.1-8b/final'),
 'output_dir': PosixPath('/teamspace/studios/this_studio/out/finetune/qlora-llama3.1-8b/final/evaluate')}
2024-07-23:20:12:02,098 INFO     [huggingface.py:170] Using device 'cuda'
Traceback (most recent call last):
  File "/home/zeus/miniconda3/envs/cloudspace/bin/litgpt", line 8, in <module>
    sys.exit(main())
  File "/teamspace/studios/this_studio/litgpt2/litgpt/__main__.py", line 71, in main
    CLI(parser_data)
  File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/jsonargparse/_cli.py", line 119, in CLI
    return _run_component(component, init.get(subcommand))
  File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/jsonargparse/_cli.py", line 204, in _run_component
    return component(**cfg)
  File "/teamspace/studios/this_studio/litgpt2/litgpt/eval/evaluate.py", line 106, in convert_and_evaluate
    model = HFLM(pretrained=str(out_dir.resolve()), device=device, batch_size=batch_size, dtype=dtype)
  File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/lm_eval/models/huggingface.py", line 196, in __init__
    self._get_config(
  File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/lm_eval/models/huggingface.py", line 470, in _get_config
    self._config = transformers.AutoConfig.from_pretrained(
  File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 989, in from_pretrained
    return config_class.from_dict(config_dict, **unused_kwargs)
  File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/transformers/configuration_utils.py", line 772, in from_dict
    config = cls(**config_dict)
  File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/transformers/models/llama/configuration_llama.py", line 161, in __init__
    self._rope_scaling_validation()
  File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/transformers/models/llama/configuration_llama.py", line 182, in _rope_scaling_validation
    raise ValueError(
ValueError: `rope_scaling` must be a dictionary with two fields, `type` and `factor`, got {'factor': 8.0, 'low_freq_factor': 1.0, 'high_freq_factor': 4.0, 'original_max_position_embeddings': 8192, 'rope_type': 'llama3'}

They must have changed something in the Llama 3.1 RoPE. I guess I have to buckle up and read the 92-page paper tonight.
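
For reference, here is a sketch of how the llama3 rope_type in that rope_scaling dict is commonly described as rescaling the RoPE inverse frequencies. This is an assumption based on public descriptions of the scheme, not something implemented in this PR:

import math

import torch


def llama3_scale_inv_freq(
    inv_freq: torch.Tensor,
    factor: float = 8.0,
    low_freq_factor: float = 1.0,
    high_freq_factor: float = 4.0,
    original_max_position_embeddings: int = 8192,
) -> torch.Tensor:
    # Wavelength of each RoPE frequency component.
    wavelen = 2 * math.pi / inv_freq
    low_freq_wavelen = original_max_position_embeddings / low_freq_factor
    high_freq_wavelen = original_max_position_embeddings / high_freq_factor
    # Long wavelengths (low frequencies) are scaled down by `factor`;
    # short wavelengths (high frequencies) are left untouched.
    scaled = torch.where(wavelen > low_freq_wavelen, inv_freq / factor, inv_freq)
    # Wavelengths in between are smoothly interpolated between the two regimes.
    smooth = (original_max_position_embeddings / wavelen - low_freq_factor) / (
        high_freq_factor - low_freq_factor
    )
    medium = (wavelen <= low_freq_wavelen) & (wavelen >= high_freq_wavelen)
    scaled = torch.where(medium, (1 - smooth) / factor * inv_freq + smooth * inv_freq, scaled)
    return scaled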

@rasbt rasbt marked this pull request as ready for review July 23, 2024 20:23
@rasbt rasbt requested review from Andrei-Aksionov and removed request for williamFalcon July 23, 2024 21:48
@Andrei-Aksionov (Collaborator) commented:
I'm not sure whether we need to add tests for version 3.1 by adding it here:

litgpt/tests/test_model.py

Lines 208 to 217 in 5ff6343

@pytest.mark.parametrize(
"ours_kwargs",
[
{"name": "Llama-2-7b-hf"},
{"name": "CodeLlama-7b-hf"},
{"name": "Llama-2-70b-chat-hf", "n_query_groups": 1},
{"name": "Llama-3-8B"},
{"name": "Llama-3-8B-Instruct"},
],
)

There are no architectural changes, and this test overrides some of the params anyway.
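
If we do decide to add them, a minimal sketch of the extended parametrization could look like the following; the "Llama-3.1-*" config names are assumptions based on the naming pattern for Llama 3, and the test function is just a placeholder:

import pytest


@pytest.mark.parametrize(
    "ours_kwargs",
    [
        {"name": "Llama-2-7b-hf"},
        {"name": "CodeLlama-7b-hf"},
        {"name": "Llama-2-70b-chat-hf", "n_query_groups": 1},
        {"name": "Llama-3-8B"},
        {"name": "Llama-3-8B-Instruct"},
        # Hypothetical additions for the 3.1 checkpoints:
        {"name": "Llama-3.1-8B"},
        {"name": "Llama-3.1-8B-Instruct"},
    ],
)
def test_llama_config_overrides(ours_kwargs):
    # Placeholder body: the real test compares litgpt outputs against the HF
    # reference implementation using these config overrides.
    ...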

@Andrei-Aksionov (Collaborator) commented:
As for the RoPE issue during evaluation: either something changed in this particular part of the Llama 3.1 architecture, or it's something in the eval that is specific to this model, since the eval test uses pythia-14m as the model and I don't see any failures on CI.

Or the test is incorrect.

@rasbt (Collaborator, Author) commented on Jul 24, 2024

I've now read the complete paper and couldn't find anything in particular about the RoPE scaling in Llama 3.1. I also did some research online, and it seems there's nothing special about it. And when I tried it, it works fine with the standard Llama 3 RoPE (see the discussion via https://news.ycombinator.com/item?id=41053201).

HF transformers may have added something RoPE-specific to the Llama 3 model, which causes lm_eval to fail. I guess we have to wait for an Evaluation Harness update here, but this shouldn't hold up the PR.

@rasbt rasbt merged commit fd71063 into main Jul 24, 2024
9 checks passed
@rasbt rasbt deleted the llama3.1-small branch July 24, 2024 13:59