Picking max_sequence_length in run_classifier.py CoLA task #106

Closed
artemlos opened this issue Dec 10, 2018 · 2 comments

Comments

@artemlos

Is there an upper bound for the max_sequence_length parameter when using run_classifier.py with the CoLA task?

When I tested with the default max_sequence_length of 128, everything worked fine, but once I changed it to something else, e.g. 1024, training started and then failed on the first iteration with the error shown below:

Traceback (most recent call last):
  File "run_classifier.py", line 643, in <module>
    main()
  File "run_classifier.py", line 551, in main
    loss = model(input_ids, segment_ids, input_mask, label_ids)
  File "/jet/var/python/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/jet/var/python/lib/python3.6/site-packages/pytorch_pretrained_bert/modeling.py", line 868, in forward
    _, pooled_output = self.bert(input_ids, token_type_ids, attention_mask, output_all_encoded_layers=False)
  File "/jet/var/python/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/jet/var/python/lib/python3.6/site-packages/pytorch_pretrained_bert/modeling.py", line 609, in forward
    embedding_output = self.embeddings(input_ids, token_type_ids)
  File "/jet/var/python/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/jet/var/python/lib/python3.6/site-packages/pytorch_pretrained_bert/modeling.py", line 199, in forward
    embeddings = self.dropout(embeddings)
  File "/jet/var/python/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/jet/var/python/lib/python3.6/site-packages/torch/nn/modules/dropout.py", line 53, in forward
    return F.dropout(input, self.p, self.training, self.inplace)
  File "/jet/var/python/lib/python3.6/site-packages/torch/nn/functional.py", line 595, in dropout
    return _functions.dropout.Dropout.apply(input, p, training, inplace)
  File "/jet/var/python/lib/python3.6/site-packages/torch/nn/_functions/dropout.py", line 40, in forward
    ctx.noise.bernoulli_(1 - ctx.p).div_(1 - ctx.p)
RuntimeError: Creating MTGP constants failed. at /jet/tmp/build/aten/src/THC/THCTensorRandom.cu:34

The command I ran is:

python run_classifier.py \
  --task_name CoLA \
  --do_train \
  --do_eval \
  --do_lower_case \
  --data_dir $GLUE_DIR/Test/ \
  --bert_model bert-base-uncased \
  --max_seq_length 128 \
  --train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3.0 \
  --output_dir /tmp/BERT/test1
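
A quick way to check the ceiling before picking --max_seq_length (a minimal sketch, assuming the pytorch_pretrained_bert package from the traceback keeps the loaded config on the model object):

from pytorch_pretrained_bert import BertModel

# The longest sequence the encoder accepts is fixed by the pretrained
# position-embedding table, reported as max_position_embeddings in the config.
model = BertModel.from_pretrained("bert-base-uncased")
print(model.config.max_position_embeddings)  # 512 for bert-base-uncased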
@rodgzilla
Contributor

As mentioned in #89, the maximum value of max_sequence_length is 512.
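
To make the failure concrete, here is a toy illustration (not the library's actual code) of why sequences longer than 512 break: bert-base-uncased ships a learned position-embedding table with exactly 512 rows, so a 1024-token input indexes past it, and on GPU the out-of-range lookup can surface as an opaque CUDA error like the one above.

import torch
import torch.nn as nn

# Stand-in for BERT-base's position embeddings: 512 learned positions, hidden size 768.
position_embeddings = nn.Embedding(512, 768)

position_ids = torch.arange(1024)   # positions for a 1024-token sequence
position_embeddings(position_ids)   # raises an index-out-of-range error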

@artemlos
Author

@rodgzilla thanks!
