
bug in run_flue.py #22

Closed
keloemma opened this issue Mar 30, 2020 · 7 comments
Comments

@keloemma

keloemma commented Mar 30, 2020

Hi, I got this error when running run_flue.py:

    from transformers import flue_compute_metrics as compute_metrics
    ImportError: cannot import name 'flue_compute_metrics'

I had already installed the requirements and updated the transformers directory.

@formiel
Contributor

formiel commented Mar 30, 2020

Hi @keloemma ,

Please try reinstalling with the following command:

pip install --upgrade --force-reinstall git+https://github.com/formiel/transformers.git@flue

@keloemma
Author

keloemma commented Mar 31, 2020

Thanks. Now I get this error:

Flaubert$ bash finetuning_flue.sh
usage: run_flue.py [-h] --data_dir DATA_DIR --model_type MODEL_TYPE
                   --model_name_or_path MODEL_NAME_OR_PATH --task_name TASK_NAME
                   --output_dir OUTPUT_DIR [--config_name CONFIG_NAME]
                   [--tokenizer_name TOKENIZER_NAME] [--cache_dir CACHE_DIR]
                   [--max_seq_length MAX_SEQ_LENGTH] [--do_train] [--do_eval]
                   [--do_lower_case]
                   [--per_gpu_train_batch_size PER_GPU_TRAIN_BATCH_SIZE]
                   [--per_gpu_eval_batch_size PER_GPU_EVAL_BATCH_SIZE]
                   [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS]
                   [--learning_rate LEARNING_RATE] [--weight_decay WEIGHT_DECAY]
                   [--adam_epsilon ADAM_EPSILON] [--max_grad_norm MAX_GRAD_NORM]
                   [--num_full_passes NUM_FULL_PASSES] [--max_steps MAX_STEPS]
                   [--warmup_steps WARMUP_STEPS] [--no_cuda]
                   [--overwrite_output_dir] [--overwrite_cache] [--seed SEED]
                   [--fp16] [--fp16_opt_level FP16_OPT_LEVEL]
                   [--local_rank LOCAL_RANK] [--server_ip SERVER_IP]
                   [--server_port SERVER_PORT] [--val_metrics {acc,f1,acc_and_f1}]
                   [--early_stopping_patience EARLY_STOPPING_PATIENCE]
                   [--steps_per_epoch STEPS_PER_EPOCH] [--do_test]
                   [--scheduler {constant,constant-warmup,linear-warmup,cosine-warmup,None}]
run_flue.py: error: unrecognized arguments: --num_train_epochs 30 --save_steps 50000

It seems those two arguments are not recognised, so I commented out num_train_epochs in run_flue.py, but there was no --save_steps there either.

Should I remove it too? It is listed among the parameters in the FLUE evaluation example.

@formiel
Contributor

formiel commented Mar 31, 2020

Hi @keloemma ,

There are no num_train_epochs and save_steps arguments in the run_flue.py script. The num_train_epochs parameter was replaced by epochs, and a steps_per_epoch parameter was added to control the number of steps per epoch, in case you do not want to pass over the whole dataset in each epoch; it replaces save_steps as well.

Please refer to the script for more details and a description of each parameter. You will need to update the command that you previously used for run_glue.py accordingly.
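For what it's worth, the "unrecognized arguments" error above is standard argparse behavior: parse_args() aborts as soon as the command line contains a flag the script no longer declares. A minimal sketch (the --epochs flag here is illustrative, not the script's full interface):

```python
import argparse

# A parser that, like the updated run_flue.py, no longer declares --num_train_epochs
parser = argparse.ArgumentParser(prog="run_flue.py")
parser.add_argument("--epochs", type=int, help="replaces --num_train_epochs")

# parse_known_args() separates recognized flags from leftovers instead of exiting,
# which is a quick way to spot stale flags in an old command line
args, unknown = parser.parse_known_args(["--epochs", "30", "--num_train_epochs", "30"])
print(args.epochs)   # 30
print(unknown)       # ['--num_train_epochs', '30']
```

Passing the same list to parser.parse_args() instead would print the usage message and exit with "error: unrecognized arguments: --num_train_epochs 30", exactly as in the output above.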

@keloemma
Author

keloemma commented Apr 6, 2020

Hello @formiel

I ran into another problem when running the script:

04/06/2020 10:44:08 - WARNING - __main__ - Process rank: -1, device: cuda, n_gpu: 2, distributed training: False, 16-bits training: True
Traceback (most recent call last):
  File "/home/transformers/examples/run_flue.py", line 795, in <module>
    main()
  File "/home/transformers/examples/run_flue.py", line 750, in main
    cache_dir=args.cache_dir if args.cache_dir else None,
  File "/home/anaconda3/envs/env/lib/python3.6/site-packages/transformers/configuration_utils.py", line 188, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/anaconda3/envs/env/lib/python3.6/site-packages/transformers/configuration_utils.py", line 240, in get_config_dict
    config_dict = cls._dict_from_json_file(resolved_config_file)
  File "/home/anaconda3/envs/env/lib/python3.6/site-packages/transformers/configuration_utils.py", line 329, in _dict_from_json_file
    text = reader.read()
  File "/home/anaconda3/envs/env/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
This is the command I used to launch it:

python ~/eXP/Flaubert/transformers/examples/run_flue.py \
    --data_dir $data_dir \
    --model_type flaubert \
    --model_name_or_path $model_name_or_path \  # best-*.pth
    --task_name $task_name \
    --output_dir $output_dir \
    --max_seq_length 512 \
    --do_train \
    --do_eval \
    --max_steps $epochs \  # In the script, it says this parameter overwrites num_train_epochs
    --learning_rate $lr \
    --fp16 \
    --fp16_opt_level O1 \
    |& tee output.log

When using it on another server, I got this error in the output log file:

  File "/home/transformers/run_flue.py", line 357
    print(json.dumps({**logs, **{"step": global_step}}))
                       ^
SyntaxError: invalid syntax

Do you perhaps know how I can solve it?

@formiel
Contributor

formiel commented Apr 8, 2020

Hi @keloemma,

Can you please replace the file run_flue.py with the new version that I just pushed and try running the code again?
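As a side note on the SyntaxError above: the {**logs, **{"step": global_step}} expression uses dict unpacking inside a literal (PEP 448), which only parses on Python 3.5 and later, so that error usually means the script was launched with an older interpreter (e.g. Python 2). A minimal example of the syntax in question:

```python
# Dict unpacking in a literal (PEP 448, Python >= 3.5);
# on older interpreters this line is a SyntaxError at the first **
logs = {"loss": 0.25, "lr": 1e-5}
merged = {**logs, **{"step": 100}}
print(merged)  # {'loss': 0.25, 'lr': 1e-05, 'step': 100}
```

If that is the cause, pointing the launch command at a Python 3.6 environment (as in the tracebacks above) avoids the error without touching the script.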

@keloemma
Author

keloemma commented Apr 23, 2020

I re-downloaded it and retried, but I still get the same error:

04/23/2020 10:19:32 - WARNING - __main__ - Process rank: -1, device: cuda, n_g, 16-bits training: True
Traceback (most recent call last):
  File "transformers/examples/run_flue.py", line 782, in <module>
    main()
  File "transformers/examples/run_flue.py", line 737, in main
    cache_dir=args.cache_dir if args.cache_dir else None,
  File "/home/getalp/kelodjoe/anaconda3/envs/env/lib/python3.6/site-packages/tra, line 188, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **k
  File "/home/getalp/kelodjoe/anaconda3/envs/env/lib/python3.6/site-packages/tra, line 240, in get_config_dict
    config_dict = cls._dict_from_json_file(resolved_config_file)
  File "/home/getalp/kelodjoe/anaconda3/envs/env/lib/python3.6/site-packages/tra, line 329, in _dict_from_json_file
    text = reader.read()
  File "/home/getalp/kelodjoe/anaconda3/envs/env/lib/python3.6/codecs.py", line
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid

@loic-vial
Contributor

Hi @keloemma, I am not sure, but I think your error occurs because the data you are using is not encoded in UTF-8. Maybe check your data to make sure it contains only valid UTF-8 characters?
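One quick way to check is simply to try decoding each file as UTF-8. The 0x80 byte at position 0 in the traceback is also what the start of a torch pickle looks like, so it may be worth double-checking that a .pth checkpoint was not passed where a JSON config or text file was expected. A small sketch, with a hypothetical is_utf8 helper:

```python
import json
import tempfile

def is_utf8(path):
    """Return True if the file at `path` decodes cleanly as UTF-8."""
    try:
        with open(path, "rb") as f:
            f.read().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# A binary file starting with byte 0x80 reproduces the error from the traceback
with tempfile.NamedTemporaryFile(suffix=".pth", delete=False) as f:
    f.write(b"\x80\x02binary payload")
    bad = f.name

# A plain JSON config file decodes fine
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"model_type": "flaubert"}, f)
    good = f.name

print(is_utf8(bad))   # False
print(is_utf8(good))  # True
```

Running is_utf8 over the data directory (and over whatever model_name_or_path points at) should pinpoint the offending file.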

formiel closed this as completed Jun 17, 2020
3 participants