run finetune llama2-7B error #77

Closed
13416157913 opened this issue Oct 18, 2023 · 2 comments

@13416157913

Help please.
Error:
```
    output_tensor = forward_step(forward_step_func, data_iterator,
  File "/home/dengkaibiao/Megatron-LLM/megatron/schedules.py", line 117, in forward_step
    output_tensor, loss_func = forward_step_func(data_iterator, model)
  File "/home/dengkaibiao/Megatron-LLM/finetune.py", line 223, in forward_step
    batch = get_batch(data_iterator)
  File "/home/dengkaibiao/Megatron-LLM/finetune.py", line 101, in get_batch
    tokenizer = get_tokenizer()
  File "/home/dengkaibiao/Megatron-LLM/megatron/global_vars.py", line 45, in get_tokenizer
    _ensure_var_is_initialized(_GLOBAL_TOKENIZER, 'tokenizer')
  File "/home/dengkaibiao/Megatron-LLM/megatron/global_vars.py", line 198, in _ensure_var_is_initialized
    assert var is not None, '{} is not initialized.'.format(name)
AssertionError: tokenizer is not initialized.
```

This is my finetune script:

```bash
LOG_ARGS="--log_interval 1 --save_interval 100 --eval_interval 50"
TRAIN_ARGS="--train_iters 100 --lr_decay_style cosine --lr_warmup_iters 50 --lr 3e-4 --min_lr 1e-6"
DISTRIBUTED_ARGS="--nproc_per_node 8 --nnodes 1 --node_rank 0 --master_addr localhost --master_port 8000"
COMMON_ARGS="--num_layers 32 --num_attention_heads 32 --seq_length 4096 --max_position_embeddings 4096 --ffn_hidden_size 11008
    --hidden_dropout 0.0 --position_embedding_type rotary --no_bias_gelu_fusion
    --no_bias_dropout_fusion --use_checkpoint_args
    --attention_dropout 0.0 --adam_beta1 0.9 --adam_beta2 0.95 --adam_eps 1e-5
    --layernorm_epsilon 1e-6
    --weight_decay 0.1 --sequence_parallel --recompute_granularity selective
    --log_timers_to_tensorboard
    --rope_scaling_factor 1.0"

torchrun $DISTRIBUTED_ARGS finetune.py \
    --tensor_model_parallel_size 2 \
    --pipeline_model_parallel_size 1 \
    --load /Megatron-LLM-sharded-weights \
    --save /Megatron-LLM-sharded-weights \
    --tensorboard_dir /Megatron-LLM-sharded-weights/tensorboard/ \
    --data_path /Megatron-LLM/corpus_indexed/china_text_document \
    --split 100,0,0 \
    --model_name llama2 \
    --tokenizer_type SentencePieceTokenizer \
    --make_vocab_size_divisible_by 1 \
    --bf16 \
    --global_batch_size 1000 \
    --micro_batch_size 2 \
    --use_checkpoint_args \
    $COMMON_ARGS $LOG_ARGS $TRAIN_ARGS
```
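
For reference, the call above selects --tokenizer_type SentencePieceTokenizer but never passes a tokenizer model file, so the tokenizer global presumably never gets built, which is what the assertion complains about. Below is a minimal sanity check, assuming a hypothetical tokenizer.model location, that the SentencePiece model the run needs actually loads:

```bash
# Sanity check, not part of the original script: verify that the Llama-2
# SentencePiece model loads and reports a vocab size. The path is an
# assumption; adjust it to wherever tokenizer.model lives in your setup.
VOCAB_FILE=/Megatron-LLM-sharded-weights/tokenizer.model

ls -lh "$VOCAB_FILE"
python -c "
import sentencepiece as spm
sp = spm.SentencePieceProcessor(model_file='$VOCAB_FILE')
print('vocab size:', sp.get_piece_size())
print(sp.encode('hello world'))
"
```

If this prints a vocab size (32000 for the stock Llama-2 tokenizer), the same path can be handed to the training script.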

@kylematoba
Collaborator

I don't see a --vocab_file argument? Please check your call against the example at https://epfllm.github.io/Megatron-LLM/guide/instruction_tuning.html#training.
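
Concretely, and following the linked example, the fix would most likely be adding --vocab_file to the torchrun call. The sketch below repeats the call from the issue with that one flag added; the tokenizer.model path is an assumption, not something stated in the thread:

```bash
# Sketch of the call with --vocab_file added (everything else as in the issue).
# The tokenizer.model path is hypothetical; use the SentencePiece model that
# matches the converted/sharded Llama-2 checkpoint.
torchrun $DISTRIBUTED_ARGS finetune.py \
    --tensor_model_parallel_size 2 \
    --pipeline_model_parallel_size 1 \
    --load /Megatron-LLM-sharded-weights \
    --save /Megatron-LLM-sharded-weights \
    --tensorboard_dir /Megatron-LLM-sharded-weights/tensorboard/ \
    --data_path /Megatron-LLM/corpus_indexed/china_text_document \
    --split 100,0,0 \
    --model_name llama2 \
    --tokenizer_type SentencePieceTokenizer \
    --vocab_file /Megatron-LLM-sharded-weights/tokenizer.model \
    --make_vocab_size_divisible_by 1 \
    --bf16 \
    --global_batch_size 1000 \
    --micro_batch_size 2 \
    --use_checkpoint_args \
    $COMMON_ARGS $LOG_ARGS $TRAIN_ARGS
```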

@13416157913
Author

> I don't see a --vocab_file argument? Please check your call against the example at https://epfllm.github.io/Megatron-LLM/guide/instruction_tuning.html#training.
