Code stuck infinitely when performing Fine-Tuning #25
Comments
It seems that this warning has no effect on fine-tuning, so I am not sure why the code gets stuck indefinitely. Maybe you can update your torch and transformers with "pip install --upgrade torch transformers".
So this happens only when performing fine-tuning for Python. The process gets Killed after a long time, with the epoch at either 0/8 or 1/8.
@fengzhangyin Hi Zhangyin, do you have any idea about this issue?
The same thing happens when running the run.py script in code2nl.
Only for the Python language in code2nl? Can you share the full code2nl log with me?
For code2nl it happens for all languages. Here are the terminal logs for the run on JavaScript:

02/19/2021 21:17:44 - INFO - main - Namespace(adam_epsilon=1e-08, beam_size=10, config_name='', dev_filename='../data/code2nl/CodeSearchNet//php/valid.jsonl', do_eval=True, do_lower_case=False, do_test=False, do_train=True, eval_batch_size=64, eval_steps=600, gradient_accumulation_steps=1, learning_rate=5e-05, load_model_path=None, local_rank=-1, max_grad_norm=1.0, max_source_length=256, max_steps=-1, max_target_length=128, model_name_or_path='microsoft/codebert-base', model_type='roberta', no_cuda=True, num_train_epochs=3.0, output_dir='model/php', seed=42, test_filename=None, tokenizer_name='', train_batch_size=64, train_filename='../data/code2nl/CodeSearchNet//php/train.jsonl', train_steps=30000, warmup_steps=0, weight_decay=0.0)
02/19/2021 21:17:45 - INFO - transformers.tokenization_utils - Model name 'microsoft/codebert-base' not found in model shortcut name list (roberta-base, roberta-large, roberta-large-mnli, distilroberta-base, roberta-base-openai-detector, roberta-large-openai-detector). Assuming 'microsoft/codebert-base' is a path, a model identifier, or url to a directory containing tokenizer files.
It's hard for me to figure out the problem from the log alone. Can you try printing something at each line here and find which line is blocked? (Lines 328 to 342 in 3aafd05.)
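For example, with instrumentation along these lines; the commented-out training-loop calls are placeholders standing in for the repository's actual lines, not its real code:

```python
# A minimal print-debugging sketch; the commented-out training-loop calls
# are placeholders, not the repository's actual lines 328 to 342.
import sys
import time

def trace(msg):
    # flush=True matters here: if the process later hangs or is killed,
    # buffered output may never be written, hiding the last line reached.
    print(f"[{time.strftime('%H:%M:%S')}] {msg}", file=sys.stderr, flush=True)

trace("before moving batch to device")
# batch = tuple(t.to(device) for t in batch)
trace("before forward pass")
# loss = model(source_ids, source_mask, target_ids, target_mask)
trace("before backward")
# loss.backward()
trace("after backward")
```

Whichever trace is the last one printed tells you which call is blocking.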
I guess there is a problem with your packages or environment. Have you tried updating torch and transformers with "pip install --upgrade torch transformers"?
Updated PyTorch and transformers. Still have the same issue.
I am not sure whether your memory is limited. Can you try train_batch_size=1? Meanwhile, you can use a GPU instead of the CPU.
I get this error when I set the batch size to 1:

Traceback (most recent call last):
For code2nl, the code only supports GPU; please remove the --no_cuda parameter.
For the code2nl part, it is an issue with CUDA running out of memory. With batch_size=1, it does run.
For code search, I guess the problem is also the CPU running out of memory. Have you tried using a GPU with batch_size=1 for code search fine-tuning?
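As a quick sanity check before retrying, something like the sketch below (plain torch calls only, nothing repo-specific) confirms whether a GPU is visible and how much memory it has. A bare "Killed" with no Python traceback usually means the Linux out-of-memory killer ended the process, which fits the memory theory.

```python
# Environment sanity check before retrying fine-tuning; uses only
# standard torch APIs, nothing specific to this repository.
import torch

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    # Total GPU memory bounds the feasible batch size; if it is small,
    # starting from train_batch_size=1 and increasing gradually is safest.
    print(f"GPU: {props.name}, {props.total_memory / 1024**3:.1f} GiB")
```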
When running the fine-tune operation, the script gets stuck at the following warning:

Epoch: 0%| | 0/8 [00:00<?, ?it/s]
/home/akash/.local/lib/python3.8/site-packages/torch/optim/lr_scheduler.py:224: UserWarning: To get the last learning rate computed by the scheduler, please use get_last_lr().
warnings.warn("To get the last learning rate computed by the scheduler, "
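For context, this UserWarning comes from torch's LR scheduler, which emits it whenever get_lr() is called outside scheduler.step(); as the first reply notes, it is harmless and does not itself cause the hang. A minimal sketch that reproduces it (the tiny model and optimizer below are illustrative, not the fine-tuning code):

```python
# Reproduces the UserWarning quoted above (torch >= 1.4); the model and
# optimizer are illustrative stand-ins, not the fine-tuning script's.
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lambda step: 1.0)

optimizer.step()
scheduler.step()
print(scheduler.get_lr())       # emits the "please use get_last_lr()" warning
print(scheduler.get_last_lr())  # recommended accessor; no warning
```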