Llama 3 & Mistral LoRA Examples Error (needs eval_sample_packing: False) #1644

VelocityRa opened this issue May 21, 2024 · 1 comment
VelocityRa commented May 21, 2024

Please check that this issue hasn't been reported before.

  • I searched previous Bug Reports and didn't find any similar reports.

Expected Behavior

I'm running:

CUDA_VISIBLE_DEVICES="" python -m axolotl.cli.preprocess examples/llama-3/lora-8b.yml

The command should complete successfully, outputting the preprocessed dataset.

Current behaviour

It errors out:

...
[2024-05-21 07:42:19,020] [DEBUG] [axolotl.calculate_total_num_steps:299] [PID:15484] [RANK:0] total_num_tokens: 18_827
[2024-05-21 07:42:19,020] [DEBUG] [axolotl.calculate_total_num_steps:312] [PID:15484] [RANK:0] `total_supervised_tokens: 14_240`
[2024-05-21 07:42:21,265] [INFO] [axolotl.utils.samplers.multipack._len_est:184] [PID:15484] [RANK:0] packing_efficiency_estimate: 1.0 total_num_tokens per device: 18827
[2024-05-21 07:42:21,266] [DEBUG] [axolotl.calculate_total_num_steps:365] [PID:15484] [RANK:0] data_loader_len: 0
[2024-05-21 07:42:21,266] [INFO] [axolotl.calc_sample_packing_eff_est:371] [PID:15484] [RANK:0] sample_packing_eff_est across ranks: [0.7660725911458334]
[2024-05-21 07:42:21,266] [DEBUG] [axolotl.calculate_total_num_steps:383] [PID:15484] [RANK:0] sample_packing_eff_est: None
[2024-05-21 07:42:21,266] [DEBUG] [axolotl.calculate_total_num_steps:391] [PID:15484] [RANK:0] total_num_steps: 0
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/velo/axolotl/src/axolotl/cli/preprocess.py", line 82, in <module>
    fire.Fire(do_cli)
  File "/home/velo/axolotl/venv/lib/python3.10/site-packages/fire/core.py", line 143, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/velo/axolotl/venv/lib/python3.10/site-packages/fire/core.py", line 477, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/velo/axolotl/venv/lib/python3.10/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/velo/axolotl/src/axolotl/cli/preprocess.py", line 72, in do_cli
    load_datasets(cfg=parsed_cfg, cli_args=parsed_cli_args)
  File "/home/velo/axolotl/src/axolotl/cli/__init__.py", line 403, in load_datasets
    train_dataset, eval_dataset, total_num_steps, prompters = prepare_dataset(
  File "/home/velo/axolotl/src/axolotl/utils/data/sft.py", line 107, in prepare_dataset
    raise ValueError(
ValueError: eval dataset split is too small for sample_packing. You should set `eval_sample_packing: False`.
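
For context, the guard raising this (prepare_dataset in src/axolotl/utils/data/sft.py, per the traceback) seems to fire because packing the small eval split leaves it with zero batches (note data_loader_len: 0 and total_num_steps: 0 in the log). A minimal sketch of that kind of check, with hypothetical function and parameter names chosen for illustration rather than taken from the axolotl source:

# Hypothetical sketch of the guard behind the ValueError above; the names
# are illustrative assumptions, not the actual code in utils/data/sft.py.
def guard_eval_sample_packing(eval_batches: int, eval_sample_packing: bool) -> None:
    # Packing a very small eval split can yield zero packed batches,
    # so the guard refuses to continue rather than evaluate on nothing.
    if eval_sample_packing and eval_batches == 0:
        raise ValueError(
            "eval dataset split is too small for sample_packing. "
            "You should set `eval_sample_packing: False`."
        )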

Steps to reproduce

  1. Set up axolotl
  2. Run CUDA_VISIBLE_DEVICES="" python -m axolotl.cli.preprocess examples/llama-3/lora-8b.yml

Config yaml

No response

Possible solution

Set eval_sample_packing: false in the example config, as the error message suggests? But sample_packing is explicitly set to true there, so I'm not sure whether something else is wrong. (A minimal sketch of this workaround follows below.)

Similar issue: #999

Edit: The issue happens with mistral/lora.yml too.
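
A minimal workaround sketch, assuming the rest of the stock example config stays unchanged: add the flag the error message suggests to examples/llama-3/lora-8b.yml (and mistral/lora.yml), keeping packing on for the train split:

sample_packing: true        # unchanged: keep packing for the train split
eval_sample_packing: false  # workaround: the eval split is too small to pack

With eval_sample_packing: false, preprocessing and evaluation run unpacked on the eval split while training still packs samples.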

Which Operating Systems are you using?

  • Linux
  • macOS
  • Windows

Python Version

3.10

axolotl branch-commit

main/22ae21a

Acknowledgements

  • My issue title is concise, descriptive, and in title casing.
  • I have searched the existing issues to make sure this bug has not been reported yet.
  • I am using the latest version of axolotl.
  • I have provided enough information for the maintainers to reproduce and diagnose the issue.
RodriMora (Contributor) commented

Submitted a PR: #1716
