[Text-to-Code] Error during evaluation #60

Closed
micheletufano opened this issue Jun 1, 2021 · 6 comments

@micheletufano

Hello, (cc @celbree)

When I try to replicate the results for Text-Code/text-to-code, I get an error on the evaluation step.

File: CodeXGLUE/Text-Code/text-to-code/code/run.py

Traceback (most recent call last):
  File "run.py", line 634, in <module>
    main()
  File "run.py", line 621, in main
    global_step, tr_loss = train(args, train_dataset, model, tokenizer, fh, pool)
  File "run.py", line 223, in train
    dev_bleu, dev_EM = eval_bleu(args, model, tokenizer, file_type='dev', num=100)
  File "run.py", line 359, in eval_bleu
    past_hidden = [x[:, i:i+1].expand(-1, beam_size, -1, -1, -1) for x in outputs]
  File "run.py", line 359, in <listcomp>
    past_hidden = [x[:, i:i+1].expand(-1, beam_size, -1, -1, -1) for x in outputs]
TypeError: tuple indices must be integers or slices, not tuple
@celbree
Member

celbree commented Jun 2, 2021

Hi,

I couldn't reproduce your error.

I suspect it is related to multi-GPU training. Did you use multi-GPU training with DataParallel instead of DistributedDataParallel? We recommend running eval_bleu on a single GPU; evaluating BLEU during multi-GPU training can cause errors in the BLEU score.

I have committed a change so that only ppl is calculated when evaluating during training. You can evaluate BLEU scores after training has finished.
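
For reference, here is a minimal sketch of running eval_bleu on a single device, reusing the model, tokenizer, args, and eval_bleu names from run.py; unwrapping a DataParallel wrapper this way is only an illustration of the advice above, not the committed change:

    import torch

    # Sketch only: unwrap a possible DataParallel wrapper and evaluate
    # BLEU on one GPU, as recommended above.
    eval_model = model.module if isinstance(model, torch.nn.DataParallel) else model
    eval_model = eval_model.to("cuda:0")

    # Signature taken from the traceback above.
    dev_bleu, dev_EM = eval_bleu(args, eval_model, tokenizer, file_type='dev', num=100)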

@micheletufano
Author

Hi,

Yes, I was training with multiple GPUs. I pulled the latest version and your change fixes the issue.
Thanks for the quick reply and fix. I'm closing the issue.

@micheletufano
Author

Hi @celbree,

I'm reopening this because the issue is still present during evaluation (not training), even on a single GPU.

If I run the evaluation script with export CUDA_VISIBLE_DEVICES=0, I still get this error:

Traceback (most recent call last):
  File "run.py", line 655, in <module>
    main()
  File "run.py", line 646, in main
    dev_bleu, dev_EM = eval_bleu(args, model, tokenizer, file_type='dev', num=2000)
  File "run.py", line 380, in eval_bleu
    past_hidden = [x[:, i:i+1].expand(-1, beam_size, -1, -1, -1) for x in outputs]
  File "run.py", line 380, in <listcomp>
    past_hidden = [x[:, i:i+1].expand(-1, beam_size, -1, -1, -1) for x in outputs]
TypeError: tuple indices must be integers or slices, not tuple

nvidia-smi confirms that only one GPU was active during that evaluation.
I'm running on a Tesla V100.

micheletufano reopened this Jun 3, 2021
@celbree
Member

celbree commented Jun 4, 2021

Could you please provide your versions of Python, PyTorch, and transformers?

@micheletufano
Author

python: 3.6.9
torch: 1.8.0+cu111
transformers: 4.3.2

@celbree
Member

celbree commented Jun 7, 2021

It seems that in transformers 4.3.2 the format of the GPT-2 model outputs has changed. You could try downgrading to 3.3.0; it should work then.
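
For illustration only (a sketch, not a change in the repository): in transformers 4.x, GPT-2 returns past_key_values as a tuple of per-layer (key, value) tuples, each tensor shaped (batch, num_heads, seq_len, head_dim), whereas 3.3.0 returns one stacked tensor of shape (2, batch, num_heads, seq_len, head_dim) per layer, which is what the slicing at run.py line 380 expects. Staying on 4.3.2 would therefore require adapting the cache expansion roughly as follows, reusing the outputs, beam_size, and i names from run.py:

    # Assumes each element of `outputs` is a (key, value) tuple of tensors
    # shaped (batch, num_heads, seq_len, head_dim), as in transformers 4.x.
    past_hidden = [
        tuple(kv[i:i + 1].expand(beam_size, -1, -1, -1) for kv in layer_past)
        for layer_past in outputs
    ]

The rest of the beam-search loop would need matching changes, so downgrading as suggested above is the simpler option.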
