run finetune llama2-7B error #77

Closed
13416157913 opened this issue Oct 18, 2023 · 2 comments

@13416157913

Help please.
Error:
```
    output_tensor = forward_step(forward_step_func, data_iterator,
  File "/home/dengkaibiao/Megatron-LLM/megatron/schedules.py", line 117, in forward_step
    output_tensor, loss_func = forward_step_func(data_iterator, model)
  File "/home/dengkaibiao/Megatron-LLM/finetune.py", line 223, in forward_step
    batch = get_batch(data_iterator)
  File "/home/dengkaibiao/Megatron-LLM/finetune.py", line 101, in get_batch
    tokenizer = get_tokenizer()
  File "/home/dengkaibiao/Megatron-LLM/megatron/global_vars.py", line 45, in get_tokenizer
    _ensure_var_is_initialized(_GLOBAL_TOKENIZER, 'tokenizer')
  File "/home/dengkaibiao/Megatron-LLM/megatron/global_vars.py", line 198, in _ensure_var_is_initialized
    assert var is not None, '{} is not initialized.'.format(name)
AssertionError: tokenizer is not initialized.
```

This is my finetune script:

```bash
LOG_ARGS="--log_interval 1 --save_interval 100 --eval_interval 50"
TRAIN_ARGS="--train_iters 100 --lr_decay_style cosine --lr_warmup_iters 50 --lr 3e-4 --min_lr 1e-6"
DISTRIBUTED_ARGS="--nproc_per_node 8 --nnodes 1 --node_rank 0 --master_addr localhost --master_port 8000"
COMMON_ARGS="--num_layers 32 --num_attention_heads 32 --seq_length 4096 --max_position_embeddings 4096 --ffn_hidden_size 11008
    --hidden_dropout 0.0 --position_embedding_type rotary --no_bias_gelu_fusion
    --no_bias_dropout_fusion --use_checkpoint_args
    --attention_dropout 0.0 --adam_beta1 0.9 --adam_beta2 0.95 --adam_eps 1e-5
    --layernorm_epsilon 1e-6
    --weight_decay 0.1 --sequence_parallel --recompute_granularity selective
    --log_timers_to_tensorboard
    --rope_scaling_factor 1.0"

torchrun $DISTRIBUTED_ARGS finetune.py \
    --tensor_model_parallel_size 2 \
    --pipeline_model_parallel_size 1 \
    --load /Megatron-LLM-sharded-weights \
    --save /Megatron-LLM-sharded-weights \
    --tensorboard_dir /Megatron-LLM-sharded-weights/tensorboard/ \
    --data_path /Megatron-LLM/corpus_indexed/china_text_document \
    --split 100,0,0 \
    --model_name llama2 \
    --tokenizer_type SentencePieceTokenizer \
    --make_vocab_size_divisible_by 1 \
    --bf16 \
    --global_batch_size 1000 \
    --micro_batch_size 2 \
    --use_checkpoint_args \
    $COMMON_ARGS $LOG_ARGS $TRAIN_ARGS
```
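
For reference, the call above selects --tokenizer_type SentencePieceTokenizer but never passes a tokenizer model file, so the tokenizer global presumably never gets built, which is what the assertion complains about. Below is a minimal sanity check, assuming a hypothetical tokenizer.model location, that the SentencePiece model the run needs actually loads:

```bash
# Sanity check, not part of the original script: verify that the Llama-2
# SentencePiece model loads and reports a vocab size. The path is an
# assumption; adjust it to wherever tokenizer.model lives in your setup.
VOCAB_FILE=/Megatron-LLM-sharded-weights/tokenizer.model

ls -lh "$VOCAB_FILE"
python -c "
import sentencepiece as spm
sp = spm.SentencePieceProcessor(model_file='$VOCAB_FILE')
print('vocab size:', sp.get_piece_size())
print(sp.encode('hello world'))
"
```

If this prints a vocab size (32000 for the stock Llama-2 tokenizer), the same path can be handed to the training script.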

@kylematoba
Collaborator

I don't see a --vocab_file argument? Please check your call against the example at https://epfllm.github.io/Megatron-LLM/guide/instruction_tuning.html#training.
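
Concretely, and following the linked example, the fix would most likely be adding --vocab_file to the torchrun call. The sketch below repeats the call from the issue with that one flag added; the tokenizer.model path is an assumption, not something stated in the thread:

```bash
# Sketch of the call with --vocab_file added (everything else as in the issue).
# The tokenizer.model path is hypothetical; use the SentencePiece model that
# matches the converted/sharded Llama-2 checkpoint.
torchrun $DISTRIBUTED_ARGS finetune.py \
    --tensor_model_parallel_size 2 \
    --pipeline_model_parallel_size 1 \
    --load /Megatron-LLM-sharded-weights \
    --save /Megatron-LLM-sharded-weights \
    --tensorboard_dir /Megatron-LLM-sharded-weights/tensorboard/ \
    --data_path /Megatron-LLM/corpus_indexed/china_text_document \
    --split 100,0,0 \
    --model_name llama2 \
    --tokenizer_type SentencePieceTokenizer \
    --vocab_file /Megatron-LLM-sharded-weights/tokenizer.model \
    --make_vocab_size_divisible_by 1 \
    --bf16 \
    --global_batch_size 1000 \
    --micro_batch_size 2 \
    --use_checkpoint_args \
    $COMMON_ARGS $LOG_ARGS $TRAIN_ARGS
```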

@13416157913
Author

> I don't see a --vocab_file argument? Please check your call against the example at https://epfllm.github.io/Megatron-LLM/guide/instruction_tuning.html#training.
