
[Code-To-Text] Issues loading a re-trained model #30

Closed
lapplislazuli opened this issue Jan 14, 2021 · 6 comments
@lapplislazuli
Hello,

I ran the model training as described in the repository and everything worked fine.
But once I try to run inference with my newly trained model, I get the following error:

01/14/2021 11:32:57 - INFO - main -   Namespace(adam_epsilon=1e-08, beam_size=10, config_name='', dev_filename=None, do_eval=False, do_lower_case=False, do_test=True, do_train=False, eval_batch_size=32, eval_steps=-1, gradient_accumulation_steps=1, learning_rate=5e-05, load_model_path=None, local_rank=-1, max_grad_norm=1.0, max_source_length=256, max_steps=-1, max_target_length=128, model_name_or_path='/models/pytorch_model.bin', model_type='roberta', no_cuda=True, num_train_epochs=10, output_dir='/experiment/output', seed=42, test_filename='/dataset/test.jsonl', tokenizer_name='', train_batch_size=32, train_filename=None, train_steps=-1, warmup_steps=0, weight_decay=0.0)
01/14/2021 11:32:57 - WARNING - main -   Process rank: -1, device: cpu, n_gpu: 0, distributed training: False
Traceback (most recent call last):
  File "./run.py", line 518, in <module>
    main()
  File "./run.py", line 255, in main
    config = config_class.from_pretrained(args.config_name if args.config_name else args.model_name_or_path)
  File "/root/anaconda/envs/code-to-text/lib/python3.7/site-packages/transformers/configuration_utils.py", line 347, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/root/anaconda/envs/code-to-text/lib/python3.7/site-packages/transformers/configuration_utils.py", line 391, in get_config_dict
    config_dict = cls._dict_from_json_file(resolved_config_file)
  File "/root/anaconda/envs/code-to-text/lib/python3.7/site-packages/transformers/configuration_utils.py", line 474, in _dict_from_json_file
    text = reader.read()
  File "/root/anaconda/envs/code-to-text/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
ERROR conda.cli.main_run:execute(33): Subprocess for 'conda run ['python3.7', './run.py', '--do_test', '--model_type', 'roberta', '--model_name_or_path', '/models/pytorch_model.bin', '--test_filename', '/dataset/test.jsonl', '--output_dir', '/experiment/output', '--max_source_length', '256', '--no_cuda','--max_target_length', '128', '--beam_size', '10', '--train_batch_size', '32', '--eval_batch_size', '32', '--learning_rate', '5e-5', '--num_train_epochs', '10']' command failed.  (See above for error)

At the moment I am only trying to run the Java-specific model.

Inference using microsoft/codebert-base works fine.

I am using the following Anaconda environment; maybe something is wrong there:

name: code-to-text
channels:
  - conda-forge
  - defaults
dependencies:
  - _r-xgboost-mutex=2.0=cpu_0
  - idna=2.10
  - pip=20.3
  - pycparser=2.20
  - pyopenssl=20.0.0
  - python_abi=3.7
  - requests=2.25.0
  - six=1.15.0
  - tqdm=4.51.0
  - wheel=0.35.1
  - pytorch=1.4.0
  - pip:
    - click==7.1.2
    - filelock==3.0.12
    - joblib==0.17.0
    - numpy==1.19.3
    - packaging==20.4
    - protobuf==3.14.0
    - pyparsing==2.4.7
    - regex==2020.11.13
    - sacremoses==0.0.43
    - sentencepiece==0.1.91
    - tokenizers==0.9.3
    - transformers==3.5.0
    - urllib3==1.26.2

I would appreciate it if you could point out anything wrong there.
If you want, I can open a PR for the conda environment if that seems appropriate.

guody5 (Contributor) commented Jan 17, 2021

You also need to set these variables (the same ones used for the training command) when running inference:

lang=ruby #programming language
lr=5e-5
batch_size=32
beam_size=10
source_length=256
target_length=128
data_dir=../dataset
output_dir=model/$lang
train_file=$data_dir/$lang/train.jsonl
dev_file=$data_dir/$lang/valid.jsonl
epochs=10 
pretrained_model=microsoft/codebert-base #Roberta: roberta-base
batch_size=64
dev_file=$data_dir/$lang/valid.jsonl
test_file=$data_dir/$lang/test.jsonl
test_model=$output_dir/checkpoint-best-bleu/pytorch_model.bin #checkpoint for test

@lapplislazuli (Author)

@guody5 thank you for your answer, but if I run it with all those parameters, won't it redo the training?

I want to run just the inference, as in the Readme.
Training + inference works, but inference alone with a trained model fails.

guoday (Contributor) commented Jan 18, 2021

lang=ruby #programming language
lr=5e-5
batch_size=32
beam_size=10
source_length=256
target_length=128
data_dir=../dataset
output_dir=model/$lang
train_file=$data_dir/$lang/train.jsonl
dev_file=$data_dir/$lang/valid.jsonl
epochs=10 
pretrained_model=microsoft/codebert-base #Roberta: roberta-base
batch_size=64
dev_file=$data_dir/$lang/valid.jsonl
test_file=$data_dir/$lang/test.jsonl
test_model=$output_dir/checkpoint-best-bleu/pytorch_model.bin #checkpoint for test
python run.py --do_test --model_type roberta --model_name_or_path microsoft/codebert-base --load_model_path $test_model --dev_filename $dev_file --test_filename $test_file --output_dir $output_dir --max_source_length $source_length --max_target_length $target_length --beam_size $beam_size --eval_batch_size $batch_size

guoday (Contributor) commented Jan 18, 2021

In your error logs, model_name_or_path should be "microsoft/codebert-base", but it got '/models/pytorch_model.bin'.
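For context, this mismatch explains the UnicodeDecodeError above: from_pretrained looks for a JSON config at model_name_or_path, and a torch checkpoint is a binary pickle stream whose first byte is 0x80. A minimal stdlib-only sketch (file names hypothetical) of that failure mode:

```python
# Stdlib-only reproduction (hypothetical file name): pointing the config
# loader at a torch checkpoint makes it read pickle bytes as UTF-8 JSON.
import json
import os
import tempfile

workdir = tempfile.mkdtemp()
fake_ckpt = os.path.join(workdir, "pytorch_model.bin")

# torch.save writes a binary stream; a pickle protocol-2 payload starts with
# byte 0x80 -- the same byte reported at "position 0" in the traceback.
with open(fake_ckpt, "wb") as f:
    f.write(b"\x80\x02}q\x00.")

error_name = None
try:
    # Mirrors _dict_from_json_file: open as UTF-8 text, read, parse as JSON.
    with open(fake_ckpt, "r", encoding="utf-8") as reader:
        json.loads(reader.read())
except UnicodeDecodeError:
    error_name = "UnicodeDecodeError"

print(error_name)
```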

guoday (Contributor) commented Jan 18, 2021

According to your log, you should run the following command:

python run.py --do_test --model_type roberta --model_name_or_path microsoft/codebert-base --load_model_path /models/pytorch_model.bin --test_filename /dataset/test.jsonl --output_dir /experiment/output --max_source_length 256 --max_target_length 128 --beam_size 10 --eval_batch_size 32
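The contract between the two flags can be sketched with a stdlib-only mock (names and file layout are hypothetical): --model_name_or_path supplies the architecture/config, while --load_model_path supplies only the fine-tuned weights, read as a binary blob.

```python
# Hypothetical sketch of the two-flag contract: the config comes from the
# base model location, the fine-tuned weights from a separate checkpoint file.
import json
import os
import pickle
import tempfile

workdir = tempfile.mkdtemp()

# Plays the role of "microsoft/codebert-base": a location that carries the
# UTF-8 JSON config which from_pretrained needs.
base_dir = os.path.join(workdir, "codebert-base")
os.makedirs(base_dir)
with open(os.path.join(base_dir, "config.json"), "w") as f:
    json.dump({"model_type": "roberta", "hidden_size": 768}, f)

# Plays the role of /models/pytorch_model.bin: a binary blob of weights,
# with no config inside.
ckpt = os.path.join(workdir, "pytorch_model.bin")
with open(ckpt, "wb") as f:
    pickle.dump({"encoder.weight": [0.0] * 4}, f)

def load_for_inference(model_name_or_path, load_model_path):
    """Mimic the order run.py uses: config first, then overwrite weights."""
    with open(os.path.join(model_name_or_path, "config.json")) as f:
        config = json.load(f)        # works: it is UTF-8 JSON text
    with open(load_model_path, "rb") as f:
        weights = pickle.load(f)     # works: opened in binary mode
    return config, weights

config, weights = load_for_inference(base_dir, ckpt)
print(config["model_type"], sorted(weights))
```

With the real dependencies, the analogous steps are roughly config_class.from_pretrained(args.model_name_or_path) followed by loading the checkpoint into the built model via torch.load on --load_model_path.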

@lapplislazuli (Author)

@guoday Thank you very much!
Works like a charm. I had not read carefully enough to notice that the pretrained model and load_model were separate things.
