Start a RunPod container with the PyTorch 2.0.1 template and plenty of disk space.
Run the sample command on a properly formatted dataset:
python -m llamatune.train \
    --model_name meta-llama/Llama-2-13b-chat-hf \
    --data_path master_qa.json \
    --training_recipe lora \
    --batch_size 8 \
    --gradient_accumulation_steps 4 \
    --learning_rate 1e-4 \
    --output_dir chat_llama2_13b \
    --use_auth_token xxxzzz
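(As a quick pre-flight check, something like the snippet below can confirm the dataset file at least parses. It is a hypothetical helper, not part of llamatune: the assumption that master_qa.json is a JSON array of dicts is mine, and the keys should be adjusted to whatever llamatune's data loader actually expects.)

# sanity_check.py -- hypothetical pre-flight check for master_qa.json
import json

with open("master_qa.json") as f:
    records = json.load(f)

# assumption: the file is a JSON array of example dicts
assert isinstance(records, list), "expected a JSON array of examples"
print(f"{len(records)} records loaded")
for i, rec in enumerate(records[:3]):
    # truncate long field values so the preview stays readable
    print(i, {k: str(v)[:60] for k, v in rec.items()})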
The result of the training command is:
Model ready for training!
trainable params: 250347520 || all params: 6922337280 || trainable%: 3.616517223500557
WARNING:root:Loading data...
WARNING:root:Tokenizing inputs... This may take some time...
config TrainingConfig(model_name='meta-llama/Llama-2-13b-chat-hf', data_path='master_qa.json', output_dir='chat_llama2_13b', training_recipe='lora', optim='paged_adamw_8bit', batch_size=8, gradient_accumulation_steps=4, n_epochs=3, weight_decay=0.0, learning_rate=0.0001, max_grad_norm=0.3, gradient_checkpointing=True, do_train=True, lr_scheduler_type='cosine', warmup_ratio=0.03, logging_steps=1, group_by_length=True, save_strategy='epoch', save_total_limit=3, fp16=True, tokenizer_type='llama', trust_remote_code=False, compute_dtype=torch.float16, max_tokens=4096, do_eval=True, evaluation_strategy='epoch', use_auth_token='xxxzzz', use_fast=False, bits=4, double_quant=True, quant_type='nf4', lora_r=64, lora_alpha=16, lora_dropout=0.0)
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/dist-packages/llamatune/train.py", line 50, in <module>
    trainer.train()
  File "/usr/local/lib/python3.10/dist-packages/llamatune/trainer.py", line 25, in train
    self.model_engine.train(data_module=self.data_module)
  File "/usr/local/lib/python3.10/dist-packages/llamatune/model_engines/llama_model_engine.py", line 33, in train
    trainer = Trainer(
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 405, in __init__
    raise ValueError(
ValueError: The model you want to train is loaded in 8-bit precision. if you want to fine-tune an 8-bit model, please make sure that you have installed bitsandbytes>=0.41.1.
Steps to reproduce:
python -m llamatune.train \
    --model_name meta-llama/Llama-2-13b-chat-hf \
    --data_path master_qa.json \
    --training_recipe lora \
    --batch_size 8 \
    --gradient_accumulation_steps 4 \
    --learning_rate 1e-4 \
    --output_dir chat_llama2_13b \
    --use_auth_token xxxzzz
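A likely workaround, untested here, is to upgrade bitsandbytes to the version the error message names and re-run its bundled diagnostic. This is an assumption on my part: the config requests 4-bit NF4 rather than 8-bit, but Transformers applies the same bitsandbytes version gate to any quantized model.

pip install -U "bitsandbytes>=0.41.1"
python -m bitsandbytes    # bundled diagnostic: prints the detected version and CUDA setup
pip show transformers accelerate peft    # worth confirming these are recent as well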