Maintainer reply: do not use the `llama3` template with a base (non-instruct) model; use `template: default` instead.
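For reference, a minimal sketch of the suggested change to the training config (key names as in the reproduction below; whether `default` is the right template for your dataset is worth verifying against the LLaMA-Factory documentation):

```yaml
# For a base (non-instruct) checkpoint such as Meta-Llama-3-8B,
# train with the plain template rather than the llama3 chat template:
template: default
```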
Reminder
Reproduction
# model
model_name_or_path: /home/ubuntu/Meta-Llama-3-8B

# method
stage: sft
do_train: true
finetuning_type: freeze
template: llama3

# ddp
ddp_timeout: 180000000
deepspeed: examples/deepspeed/ds_z2_config.json

# dataset
dataset: alpaca_gpt4_zh,alpaca_zh,firefly
cutoff_len: 1024
overwrite_cache: true
preprocessing_num_workers: 16

# output
output_dir: ../saves/llama3-8b/sft
save_total_limit: 2
logging_steps: 20
save_steps: 500
plot_loss: true
overwrite_output_dir: false

# train
per_device_train_batch_size: 16
gradient_accumulation_steps: 1
learning_rate: 0.00005
num_train_epochs: 2.0
lr_scheduler_type: cosine
warmup_ratio: 0.1  # warmup_steps expects an integer step count; a 0.1 fraction belongs in warmup_ratio
bf16: true

# eval
val_size: 0.001
per_device_eval_batch_size: 1
evaluation_strategy: steps
eval_steps: 500
#!/bin/bash
NPROC_PER_NODE=2
NNODES=1
RANK=0
MASTER_ADDR=127.0.0.1
MASTER_PORT=29500

CUDA_VISIBLE_DEVICES=0,1 torchrun \
  --nproc_per_node $NPROC_PER_NODE \
  --nnodes $NNODES \
  --node_rank $RANK \
  --master_addr $MASTER_ADDR \
  --master_port $MASTER_PORT \
  src/train.py llama3_sft_freeze.yaml
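The torchrun invocation above can also be expressed through the project's own launcher; a sketch assuming a recent LLaMA-Factory version (the `FORCE_TORCHRUN` variable and the `llamafactory-cli train` subcommand are taken from the project README and may differ across versions):

```shell
# Equivalent single-node, 2-GPU launch via the bundled CLI;
# FORCE_TORCHRUN=1 asks the CLI to spawn torchrun itself.
CUDA_VISIBLE_DEVICES=0,1 FORCE_TORCHRUN=1 llamafactory-cli train llama3_sft_freeze.yaml
```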
Expected behavior
llamafactory-cli chat --model_name_or_path ~/saves/llama3-8b/sft --template llama3
System Info
No response
Others
No response