RuntimeError when trying to train xlnet-large-cased for the PDTB2_L1 task #4

Closed
arxaqapi opened this issue Apr 28, 2021 · 1 comment

@arxaqapi

arxaqapi commented Apr 28, 2021

Subject

When trying to run the run_pdtb.py script to train an xlnet-large-cased model for the PDTB2_LEVEL1 task, a RuntimeError occurs.

Environment

  • OS: Pop!_OS 20.10 (Ubuntu based)
  • Python: 3.8.6
  • PyTorch (with CUDA): 1.8.1+cu111

Expected behaviour

The script should start the training process and then evaluate at the end.

Steps to reproduce

After downloading the PDTB2.0 corpus in CSV format, I ran your preprocessing script as follows:

$ python preprocess/preprocess_pdtb2.py --data_file data/pdtb2.csv \
    --output_dir data/pdtb2_patterson_L1 \
    --split single \
    --split_name patterson \
    --label_type L1
row 40600
Label count:  {'Expansion': 8861, 'Comparison': 2503, 'Contingency': 4255, 'Temporal': 950}

After preprocessing, I ran src/pytorch-transformers/examples/run_pdtb.py with the following parameters:

python ../src/pytorch-transformers/examples/run_pdtb.py \
    --model_type xlnet \
    --task_name PDTB2_LEVEL1 \
    --model_name_or_path xlnet-large-cased \
    --do_train \
    --evaluate_during_training \
    --do_eval \
    --data_dir ../data/pdtb2_patterson_L1/ \
    --max_seq_length 128 \
    --per_gpu_eval_batch_size 32 \
    --per_gpu_train_batch_size 8 \
    --learning_rate 2e-6 \
    --num_train_epochs 10.0 \
    --output_dir output/output_xlnet_patterson_L1  \
    --save_steps 500 \
    --logging_steps 500 \
    --seed 1 \
    --validation_metric acc \
    --n_gpu 1 \
    --cuda_no 0 \
    --deterministic

And the error message is the following:

Traceback (most recent call last):
  File "../src/pytorch-transformers/examples/run_pdtb.py", line 592, in <module>
    main()
  File "../src/pytorch-transformers/examples/run_pdtb.py", line 518, in main
    model = model_class.from_pretrained(args.model_name_or_path, from_tf=bool('.ckpt' in args.model_name_or_path), config=config)
  File "/home/aq/.local/lib/python3.8/site-packages/pytorch_transformers/modeling_utils.py", line 536, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/home/aq/.local/lib/python3.8/site-packages/pytorch_transformers/modeling_xlnet.py", line 1110, in __init__
    self.transformer = XLNetModel(config)
  File "/home/aq/.local/lib/python3.8/site-packages/pytorch_transformers/modeling_xlnet.py", line 731, in __init__
    self.word_embedding = nn.Embedding(config.n_token, config.d_model)
  File "/home/aq/.local/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 137, in __init__
    self.weight = Parameter(torch.Tensor(num_embeddings, embedding_dim))
RuntimeError: Trying to create tensor with negative dimension -1: [-1, 1024]
@najoungkim
Owner

From a quick Google search it looks like it's a Hugging Face issue (though I haven't looked into it in depth). huggingface/transformers#4094 might help.
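For what it's worth, the traceback shows `config.n_token` reaching `nn.Embedding` as `-1`, which is the kind of thing that happens when an older library reads a config file written for a newer one and falls back to a default for a missing key. Below is a minimal, dependency-free sketch of that failure mode; the key names (`n_token`, `vocab_size`, `d_model`) mirror the traceback, but the config contents and the `-1` fallback are assumptions for illustration, not the real files:

```python
# Hypothetical illustration of the failure mode: a newer config file uses a
# renamed vocabulary-size key, the older loading code defaults the expected
# key to -1, and the embedding allocation then fails on a negative dimension.
downloaded_config = {"vocab_size": 32000, "d_model": 1024}  # assumed newer key name

# Older code expects "n_token" and (hypothetically) defaults to -1 when missing.
n_token = downloaded_config.get("n_token", -1)
d_model = downloaded_config["d_model"]

def make_embedding_shape(num_embeddings, embedding_dim):
    """Mimics the size check torch performs when allocating the weight tensor."""
    if num_embeddings < 0:
        raise RuntimeError(
            f"Trying to create tensor with negative dimension "
            f"{num_embeddings}: [{num_embeddings}, {embedding_dim}]"
        )
    return (num_embeddings, embedding_dim)

try:
    make_embedding_shape(n_token, d_model)
except RuntimeError as e:
    print(e)  # reproduces the shape of the error message in the traceback
```

If that is indeed the cause, upgrading from the archived `pytorch_transformers` package to a `transformers` release that understands the current hub config format (as discussed in the linked issue) would be the direction to look in.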
