Bug - TFBertForSequenceClassification on SQUaD data #4453

Closed
2 tasks
yonatanbitton opened this issue May 19, 2020 · 11 comments

Comments

@yonatanbitton

yonatanbitton commented May 19, 2020

🐛 Bug

Information

I'm using TFBertForSequenceClassification on SQuAD v1 data.

The problem arises when using:

  • Both official example scripts and my own modified scripts

The task I am working on is:

  • the official SQuAD v1 data and my own SQuAD v1 data.

To reproduce

Try 1 - with official squad via tensorflow_datasets.load("squad"), trying to mimic the following official reference -

https://github.com/huggingface/transformers#quick-tour-tf-20-training-and-pytorch-interoperability

import tensorflow as tf
from transformers import TFBertForSequenceClassification, BertTokenizer, \
    squad_convert_examples_to_features, SquadV1Processor
import tensorflow_datasets

model = TFBertForSequenceClassification.from_pretrained("bert-base-cased")
tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

data = tensorflow_datasets.load("squad")
processor = SquadV1Processor()
examples = processor.get_examples_from_dataset(data, evaluate=False)

dataset_features = squad_convert_examples_to_features(examples=examples, tokenizer=tokenizer, max_seq_length=384, doc_stride=128, max_query_length=64, is_training=True, return_dataset='tf')
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(reduction=tf.keras.losses.Reduction.NONE, from_logits=True)
opt = tf.keras.optimizers.Adam(learning_rate=3e-5)

model.compile(optimizer=opt,
              loss={'start_position': loss_fn, 'end_position': loss_fn},
              loss_weights={'start_position': 1., 'end_position': 1.},
              metrics=['accuracy'])

model.fit(dataset_features, epochs=3)

Stacktrace: - the bug is at the squad_convert_examples_to_features part

convert squad examples to features:   0%|             | 0/10570 [00:00<?, ?it/s]
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/data/processors/squad.py", line 95, in squad_convert_example_to_features
    cleaned_answer_text = " ".join(whitespace_tokenize(example.answer_text))
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/tokenization_bert.py", line 112, in whitespace_tokenize
    text = text.strip()
AttributeError: 'NoneType' object has no attribute 'strip'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ec2-user/yonatab/ZeroShot/transformers_experiments/src/examples_git/huggingface_tf_example_squad.py", line 18, in <module>
    dataset_features = squad_convert_examples_to_features(examples=examples, tokenizer=tokenizer, max_seq_length=384, doc_stride=128, max_query_length=64, is_training=True, return_dataset='tf')
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/data/processors/squad.py", line 327, in squad_convert_examples_to_features
    disable=not tqdm_enabled,
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tqdm/std.py", line 1129, in __iter__
    for obj in iterable:
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/multiprocessing/pool.py", line 320, in <genexpr>
    return (item for chunk in result for item in chunk)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/multiprocessing/pool.py", line 735, in next
    raise value
AttributeError: 'NoneType' object has no attribute 'strip'
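
Note on the trace above: `whitespace_tokenize` received `None` for `example.answer_text`. The progress bar counts 10570 examples, which matches the size of the SQuAD v1.1 dev split, so one possible reading (an assumption, not confirmed in this thread) is that the examples were loaded in evaluation mode, where no single answer string is set, while `is_training=True` expects one. The sketch below reproduces the failure with a hypothetical stand-in `Example` class (not the transformers class) and shows a simple filter for examples that lack an answer string.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Example:
    """Hypothetical stand-in for a SQuAD example; not the transformers class."""
    question_text: str
    answer_text: Optional[str]


def clean_answer(example: Example) -> str:
    # Imitates the failing line in the traceback: None has no .strip().
    return " ".join(example.answer_text.strip().split())


def split_trainable(examples: List[Example]) -> Tuple[List[Example], List[Example]]:
    """Separate examples that carry an answer string from those that do not."""
    usable = [ex for ex in examples if ex.answer_text is not None]
    skipped = [ex for ex in examples if ex.answer_text is None]
    return usable, skipped


examples = [
    Example("Who wrote Hamlet?", "  William   Shakespeare "),
    Example("Dev-style example without a single answer string", None),
]

usable, skipped = split_trainable(examples)
cleaned = [clean_answer(ex) for ex in usable]

# Calling the cleaner on an example without an answer raises the same error type.
try:
    clean_answer(skipped[0])
    error_name = None
except AttributeError as err:
    error_name = type(err).__name__
```

If this reading is right, loading the examples in training mode (or passing `is_training=False` for evaluation examples) would avoid the `None` value; that has not been verified here.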

Try 2 - reading data from file, trying to mimic the following official reference - https://colab.research.google.com/github/huggingface/nlp/blob/master/notebooks/Overview.ipynb

import tensorflow as tf
from transformers import TFBertForSequenceClassification, BertTokenizer, \
    squad_convert_examples_to_features, SquadV1Processor
import tensorflow_datasets

model = TFBertForSequenceClassification.from_pretrained("bert-base-cased")
tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

data = tensorflow_datasets.load("squad", data_dir='/data/users/yonatab/zero_shot_data/datasets_refs')
processor = SquadV1Processor()
examples = processor.get_examples_from_dataset(data, evaluate=True)

dataset_features = squad_convert_examples_to_features(examples=examples, tokenizer=tokenizer, max_seq_length=384, doc_stride=128, max_query_length=64, is_training=True, return_dataset='tf')
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(reduction=tf.keras.losses.Reduction.NONE, from_logits=True)
opt = tf.keras.optimizers.Adam(learning_rate=3e-5)

model.compile(optimizer=opt,
              loss={'start_position': loss_fn, 'end_position': loss_fn},
              loss_weights={'start_position': 1., 'end_position': 1.},
              metrics=['accuracy'])

model.fit(dataset_features, epochs=3)

Stacktrace: - the bug is at the fit method

Traceback (most recent call last):
  File "/home/ec2-user/yonatab/ZeroShot/transformers_experiments/src/examples_git/minimal_example_for_git.py", line 97, in <module>
    main()
  File "/home/ec2-user/yonatab/ZeroShot/transformers_experiments/src/examples_git/minimal_example_for_git.py", line 69, in main
    history = model.fit(tfdataset, epochs=1, steps_per_epoch=3)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 819, in fit
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 235, in fit
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 593, in _process_training_inputs
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 706, in _process_inputs
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/data_adapter.py", line 702, in __init__
    x = standardize_function(x)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 660, in standardize_function
    standardize(dataset, extract_tensors_from_dataset=False)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2360, in _standardize_user_data
    self._compile_from_inputs(all_inputs, y_input, x, y)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2580, in _compile_from_inputs
    target, self.outputs)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_utils.py", line 1341, in cast_if_floating_dtype_and_mismatch
    if target.dtype != out.dtype:
AttributeError: 'str' object has no attribute 'dtype'
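
Note on the trace above: the failing check reads `target.dtype` on something that is a string. One plausible explanation (an assumption; the thread does not confirm it) is that the dataset yields the labels as a dict, and a loop that pairs targets with outputs then iterates over the dict keys instead of the tensors. The sketch below imitates that situation with hypothetical stand-ins (`FakeTensor`, `first_dtype_mismatch`); it is not the Keras code.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor; it only carries a dtype attribute."""

    def __init__(self, dtype):
        self.dtype = dtype


def first_dtype_mismatch(targets, outputs):
    """Simplified imitation of a dtype check that walks targets and outputs in pairs."""
    for target, out in zip(targets, outputs):
        if target.dtype != out.dtype:
            return target, out
    return None


outputs = [FakeTensor("float32"), FakeTensor("float32")]
dict_targets = {
    "start_position": FakeTensor("int64"),
    "end_position": FakeTensor("int64"),
}

# Iterating a dict yields its keys, so the check sees the string "start_position".
try:
    first_dtype_mismatch(dict_targets, outputs)
    message = None
except AttributeError as err:
    message = str(err)

# Passing the values as a sequence gives the check real tensor-like objects.
list_targets = [dict_targets["start_position"], dict_targets["end_position"]]
mismatch = first_dtype_mismatch(list_targets, outputs)
```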

Try 3

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

processor = SquadV1Processor()
examples = processor.get_train_examples(args.data_dir, filename=args.train_file)
dataset_features = squad_convert_examples_to_features(examples=examples, tokenizer=tokenizer, max_seq_length=384,
                                                      doc_stride=128, max_query_length=64, is_training=True,
                                                      return_dataset='tf')

model = TFBertForQuestionAnswering.from_pretrained("bert-base-cased")

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(reduction=tf.keras.losses.Reduction.NONE, from_logits=True)
opt = tf.keras.optimizers.Adam(learning_rate=3e-5)

model.compile(optimizer=opt,
              loss={'start_position': loss_fn, 'end_position': loss_fn},
              loss_weights={'start_position': 1., 'end_position': 1.},
              metrics=['accuracy'])

history = model.fit(dataset_features, epochs=1)

Stacktrace: - the bug is at the fit method

Traceback (most recent call last):
  File "/home/ec2-user/yonatab/ZeroShot/transformers_experiments/src/examples_git/reading_from_file.py", line 39, in <module>
    main()
  File "/home/ec2-user/yonatab/ZeroShot/transformers_experiments/src/examples_git/reading_from_file.py", line 32, in main
    history = model.fit(dataset_features, epochs=1)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 819, in fit
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 235, in fit
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 593, in _process_training_inputs
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 706, in _process_inputs
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/data_adapter.py", line 702, in __init__
    x = standardize_function(x)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 660, in standardize_function
    standardize(dataset, extract_tensors_from_dataset=False)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2360, in _standardize_user_data
    self._compile_from_inputs(all_inputs, y_input, x, y)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2580, in _compile_from_inputs
    target, self.outputs)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_utils.py", line 1341, in cast_if_floating_dtype_and_mismatch
    if target.dtype != out.dtype:
AttributeError: 'str' object has no attribute 'dtype'

Try 4 - (after first comment here)

I'm using the code of run_tf_squad.py, and instead of the TFTrainer I'm trying to use fit.
This is the only change I made - same dataset, same examples, same features. Just trying to use fit.

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(reduction=tf.keras.losses.Reduction.NONE, from_logits=True)
opt = tf.keras.optimizers.Adam(learning_rate=3e-5)

model.compile(optimizer=opt,
              loss={'output_1': loss_fn, 'output_2': loss_fn},
              loss_weights={'output_1': 1., 'output_2': 1.},
              metrics=['accuracy'])

history = model.fit(train_dataset, validation_data=eval_dataset, epochs=1)

And it's the same problem that occurs:

Traceback (most recent call last):
  File "/home/ec2-user/yonatab/ZeroShot/transformers_experiments/src/run_squad_tf.py", line 257, in <module>
    main()
  File "/home/ec2-user/yonatab/ZeroShot/transformers_experiments/src/run_squad_tf.py", line 242, in main
    history = model.fit(train_dataset, validation_data=eval_dataset, epochs=1)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 819, in fit
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 235, in fit
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 593, in _process_training_inputs
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 706, in _process_inputs
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/data_adapter.py", line 702, in __init__
    x = standardize_function(x)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 660, in standardize_function
    standardize(dataset, extract_tensors_from_dataset=False)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2360, in _standardize_user_data
    self._compile_from_inputs(all_inputs, y_input, x, y)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2580, in _compile_from_inputs
    target, self.outputs)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_utils.py", line 1341, in cast_if_floating_dtype_and_mismatch
    if target.dtype != out.dtype:
AttributeError: 'str' object has no attribute 'dtype'

Expected behavior

I want to be able to use fit on my own squad data.

Environment info

  • transformers version: 2.9.1
  • Platform: Linux
  • Python version: 3.6.6
  • PyTorch version (GPU?): - Using tensorflow
  • Tensorflow version (GPU?): 2.1.0
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

Edit:
Keras has a new tutorial for it:
https://keras.io/examples/nlp/text_extraction_with_bert/

@jplu
Contributor

jplu commented May 23, 2020

Hello!

If you want to train on SQuAD, I suggest using the run_tf_squad.py example, which uses the TF Trainer, or checking the following Colab, which uses the new nlp framework with a .fit() method.

@yonatanbitton
Author

yonatanbitton commented May 24, 2020

Hey.
Did you see my examples?
At "Try 2" I explained the problems using the new nlp framework with a .fit() method
I need to use a custom dataset.

Regarding run_tf_squad.py, I still have problems with it.
I want to use the fit method.
I'm using this code instead of the TFTrainer in the same file, run_tf_squad.py.
This is the only change I made - same dataset, same examples, same features. Just trying to use fit.

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(reduction=tf.keras.losses.Reduction.NONE, from_logits=True)
opt = tf.keras.optimizers.Adam(learning_rate=3e-5)

model.compile(optimizer=opt,
              loss={'output_1': loss_fn, 'output_2': loss_fn},
              loss_weights={'output_1': 1., 'output_2': 1.},
              metrics=['accuracy'])

history = model.fit(train_dataset, validation_data=eval_dataset, epochs=1)

And it's the same problem that occurs:

Traceback (most recent call last):
  File "/home/ec2-user/yonatab/ZeroShot/transformers_experiments/src/run_squad_tf.py", line 257, in <module>
    main()
  File "/home/ec2-user/yonatab/ZeroShot/transformers_experiments/src/run_squad_tf.py", line 242, in main
    history = model.fit(train_dataset, validation_data=eval_dataset, epochs=1)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 819, in fit
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 235, in fit
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 593, in _process_training_inputs
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 706, in _process_inputs
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/data_adapter.py", line 702, in __init__
    x = standardize_function(x)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 660, in standardize_function
    standardize(dataset, extract_tensors_from_dataset=False)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2360, in _standardize_user_data
    self._compile_from_inputs(all_inputs, y_input, x, y)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2580, in _compile_from_inputs
    target, self.outputs)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_utils.py", line 1341, in cast_if_floating_dtype_and_mismatch
    if target.dtype != out.dtype:
AttributeError: 'str' object has no attribute 'dtype'

I will add this to the post as a failing example - Try 4

@jplu
Contributor

jplu commented May 24, 2020

Sorry for the misunderstanding. What I meant is that you should check how the features are built: if you want to use .fit(), the features have to be built differently than in squad_convert_examples_to_features, and you also have to use TF 2.2. Otherwise, if you want to keep using that function, you have to go through the trainer.

Also why using TFBertForSequenceClassification instead of TFBertForQuestionAnswering?
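
As a rough illustration of "built differently": one way to reshape each dataset element for `.fit()` could be to keep only the two span targets and return them as an ordered tuple rather than a dict. This is a sketch under assumptions, not the method used in the Colab: the label keys `start_position` and `end_position` are taken from the `compile` calls earlier in this thread, `to_fit_pair` is a hypothetical helper, and it has not been verified that this alone makes `.fit()` work.

```python
def to_fit_pair(features, labels):
    """Keep only the two span targets and return them as an ordered tuple."""
    return features, (labels["start_position"], labels["end_position"])


# Plain Python values stand in for tensors here; the function only indexes the dict,
# so the same callable could in principle be passed to tf.data.Dataset.map.
features = {"input_ids": [101, 2054, 102], "attention_mask": [1, 1, 1]}
labels = {"start_position": 3, "end_position": 5, "cls_index": 0}

pair = to_fit_pair(features, labels)
```

With targets in tuple form, the losses would be matched to the model outputs by position, so the `loss` argument to `compile` would likely need to be a list or a single loss rather than a dict keyed by label name.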

@yonatanbitton
Author

yonatanbitton commented May 24, 2020

Thank you for the answer. I prefer to use fit; don't you support it?
Anyway, this is the status with the TFTrainer:

I was using TensorFlow 2.1.0 and have now upgraded to 2.2.0.
I still have problems:

trainer = TFTrainer(model=model, args=training_args, train_dataset=train_dataset, eval_dataset=dev_dataset)
print("Created TFTrainer")
trainer.train()

It does create the TFTrainer, but when getting to the .train() cmd it fails:

Created TFTrainer
WARNING:tensorflow:From /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/trainer_tf.py:364: StrategyBase.experimental_run_v2 (from tensorflow.python.distribute.distribute_lib) is deprecated and will be removed in a future version.
Instructions for updating:
renamed to `run`
WARNING:tensorflow:From /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py:1817: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/framework/indexed_slices.py:434: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Traceback (most recent call last):
  File "/home/ec2-user/yonatab/ZeroShot/transformers_experiments/src/my_squad_tf_with_trainer.py", line 112, in <module>
    main()
  File "/home/ec2-user/yonatab/ZeroShot/transformers_experiments/src/my_squad_tf_with_trainer.py", line 34, in main
    trainer.train()
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/trainer_tf.py", line 277, in train
    for training_loss in self._training_steps():
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/trainer_tf.py", line 323, in _training_steps
    self._apply_gradients()
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 580, in __call__
    result = self._call(*args, **kwds)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 627, in _call
    self._initialize(args, kwds, add_initializers_to=initializers)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 506, in _initialize
    *args, **kwds))
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2446, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2777, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2667, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 981, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 441, in wrapped_fn
    return weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 3299, in bound_method_wrapper
    return wrapped_fn(*args, **kwargs)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 968, in wrapper
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/trainer_tf.py:329 _apply_gradients  *
        self.args.strategy.experimental_run_v2(self._step)
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/trainer_tf.py:343 _step  *
        self.optimizer.apply_gradients(list(zip(gradients, vars)))
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/optimization_tf.py:135 apply_gradients  *
        return super(AdamWeightDecay, self).apply_gradients(zip(grads, tvars), name=name,)
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:478 apply_gradients  **
        self._create_all_weights(var_list)
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:663 _create_all_weights
        self._create_slots(var_list)
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/keras/optimizer_v2/adam.py:156 _create_slots
        self.add_slot(var, 'm')
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:716 add_slot
        .format(strategy, var))

    ValueError: Trying to create optimizer slot variable under the scope for tf.distribute.Strategy (<tensorflow.python.distribute.one_device_strategy.OneDeviceStrategy object at 0x7f2d141aec50>), which is different from the scope used for the original variable (<tf.Variable 'tf_bert_for_question_answering/bert/embeddings/word_embeddings/weight:0' shape=(28996, 768) dtype=float32, numpy=
    array([[-0.00054784, -0.04156886,  0.01308366, ..., -0.0038919 ,
            -0.0335485 ,  0.0149841 ],
           [ 0.01688265, -0.03106827,  0.0042053 , ..., -0.01474032,
            -0.03561099, -0.0036223 ],
           [-0.00057234, -0.02673604,  0.00803954, ..., -0.01002474,
            -0.0331164 , -0.01651673],
           ...,
           [-0.00643814,  0.01658491, -0.02035619, ..., -0.04178825,
            -0.049201  ,  0.00416085],
           [-0.00483562, -0.00267701, -0.02901638, ..., -0.05116647,
             0.00449265, -0.01177113],
           [ 0.03134822, -0.02974372, -0.02302896, ..., -0.01454749,
            -0.05249038,  0.02843569]], dtype=float32)>). Make sure the slot variables are created under the same strategy scope. This may happen if you're restoring from a checkpoint outside the scope

Thank you

@jplu
Contributor

jplu commented May 24, 2020

This error means that you haven't created the model in the proper scope. Did you use the scope created in the TrainerArgs?

What does the following command line give you, without touching the initial code?

python examples/question-answering/run_tf_squad.py \
    --model_name_or_path bert-base-uncased \
    --output_dir model \
    --max_seq_length 384 \
    --num_train_epochs 2 \
    --per_gpu_train_batch_size 8 \
    --per_gpu_eval_batch_size 16 \
    --do_train \
    --logging_dir logs \
    --mode question-answering \
    --logging_steps 10 \
    --learning_rate 3e-5 \
    --doc_stride 128 \
    --optimizer_name adamw
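
Regarding the scope remark above: the error message asks that the model variables and the optimizer slot variables be created under the same strategy scope. A minimal sketch of that pattern follows. `build_in_scope` and `RecordingStrategy` are hypothetical names used only for illustration; the stand-in strategy just records when its scope is entered and left, so the snippet runs without TensorFlow.

```python
from contextlib import contextmanager


def build_in_scope(strategy, build_fn):
    """Call build_fn inside strategy.scope() so whatever it creates belongs to that scope."""
    with strategy.scope():
        return build_fn()


class RecordingStrategy:
    """Hypothetical stand-in for a distribution strategy; records scope entry and exit."""

    def __init__(self):
        self.events = []
        self.active = False

    @contextmanager
    def scope(self):
        self.events.append("enter")
        self.active = True
        try:
            yield
        finally:
            self.active = False
            self.events.append("exit")


strategy = RecordingStrategy()
# The builder reports whether it ran while the scope was active.
built_inside = build_in_scope(strategy, lambda: strategy.active)
```

With the real objects, the call would presumably look like `build_in_scope(training_args.strategy, lambda: TFBertForQuestionAnswering.from_pretrained("bert-base-cased"))`, where `training_args.strategy` is the attribute that appears as `self.args.strategy` in the traceback.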

@yonatanbitton
Author

yonatanbitton commented May 24, 2020

That code works, but I need one extra thing: evaluation/prediction on test dataset, and it doesn't work for me.

I took the run_tf_squad.py and added simple changes:

test_examples = processor.get_dev_examples(data_args.data_dir, filename='test-v1.1.json')
test_dataset = squad_convert_examples_to_features(
    examples=test_examples,
    tokenizer=tokenizer,
    max_seq_length=data_args.max_seq_length,
    doc_stride=data_args.doc_stride,
    max_query_length=data_args.max_query_length,
    is_training=False,
    return_dataset="tf",
)

That is, only adding the test dataset.
Now I want to evaluate my final model on it. I tried both predict and evaluate, and neither works.

Try 1 -

results = trainer.evaluate(test_dataset)

Trace:

05/24/2020 10:55:39 - INFO - transformers.trainer_tf -   ***** Running Evaluation *****
05/24/2020 10:55:39 - INFO - transformers.trainer_tf -     Batch size = 16
Traceback (most recent call last):
  File "/home/ec2-user/yonatab/ZeroShot/transformers_experiments/src/my_run_squad_tf_with_trainer.py", line 208, in <module>
    main()
  File "/home/ec2-user/yonatab/ZeroShot/transformers_experiments/src/my_run_squad_tf_with_trainer.py", line 203, in main
    # results = trainer.evaluate(test_dataset)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/trainer_tf.py", line 246, in evaluate
    output = self._prediction_loop(eval_dataset, description="Evaluation")
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/trainer_tf.py", line 195, in _prediction_loop
    loss, logits = self._evaluate_steps(features, labels)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 580, in __call__
    result = self._call(*args, **kwds)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 627, in _call
    self._initialize(args, kwds, add_initializers_to=initializers)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 506, in _initialize
    *args, **kwds))
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2446, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2777, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2667, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 981, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 441, in wrapped_fn
    return weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 3299, in bound_method_wrapper
    return wrapped_fn(*args, **kwargs)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 968, in wrapper
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/trainer_tf.py:171 _evaluate_steps  *
        per_replica_loss, per_replica_logits = self.args.strategy.experimental_run_v2(
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/trainer_tf.py:400 _run_model  *
        logits = self.model(features, training=training)
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/modeling_tf_bert.py:1163 call  *
        outputs = self.bert(inputs, **kwargs)
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/modeling_tf_bert.py:548 call  *
        extended_attention_mask = attention_mask[:, tf.newaxis, tf.newaxis, :]
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py:984 _slice_helper
        name=name)
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py:1150 strided_slice
        shrink_axis_mask=shrink_axis_mask)
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py:10179 strided_slice
        shrink_axis_mask=shrink_axis_mask, name=name)
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:744 _apply_op_helper
        attrs=attr_protos, op_def=op_def)
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py:595 _create_op_internal
        compute_device)
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py:3327 _create_op_internal
        op_def=op_def)
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py:1817 __init__
        control_input_ops, op_def)
    /home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py:1657 _create_c_op
        raise ValueError(str(e))

    ValueError: Index out of range using input dim 1; input has only 1 dims for '{{node tf_bert_for_question_answering/bert/strided_slice}} = StridedSlice[Index=DT_INT32, T=DT_INT32, begin_mask=9, ellipsis_mask=0, end_mask=9, new_axis_mask=6, shrink_axis_mask=0](per_replica_features, tf_bert_for_question_answering/bert/strided_slice/stack, tf_bert_for_question_answering/bert/strided_slice/stack_1, tf_bert_for_question_answering/bert/strided_slice/stack_2)' with input shapes: [128], [4], [4], [4] and with computed input tensors: input[3] = <1 1 1 1>.
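
The ValueError above reports an input of shape [128] being sliced as if it had at least two dimensions. One plausible reading (an assumption, not confirmed anywhere in this thread) is that the attention mask reached the model without a batch dimension. NumPy applies the same indexing rule, so the effect can be reproduced in a small sketch. The helper name `extend_attention_mask` is hypothetical and only mirrors the slicing expression from the trace:

```python
import numpy as np


def extend_attention_mask(attention_mask):
    """Mirror the slicing from the trace: (batch, seq) -> (batch, 1, 1, seq)."""
    return attention_mask[:, np.newaxis, np.newaxis, :]


# A batched mask of shape (8, 128) slices cleanly.
batched = np.ones((8, 128), dtype=np.int32)
print(extend_attention_mask(batched).shape)

# A single unbatched mask of shape (128,) has only one axis, so the
# second ":" has nothing to index and NumPy raises IndexError.
unbatched = np.ones(128, dtype=np.int32)
try:
    extend_attention_mask(unbatched)
except IndexError as err:
    print("rank-1 input fails:", err)

# Adding a leading batch axis first makes the same slicing valid.
print(extend_attention_mask(unbatched[np.newaxis, :]).shape)
```

If this reading is right, batching the dataset before handing it to the model would be the first thing to check.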

Try 2:

predictions = trainer.predict(test_dataset)

Trace:

05/24/2020 11:06:50 - INFO - transformers.trainer_tf -   ***** Running Prediction *****
05/24/2020 11:06:50 - INFO - transformers.trainer_tf -     Batch size = 16
Traceback (most recent call last):
  File "/home/ec2-user/yonatab/ZeroShot/transformers_experiments/src/my_run_squad_tf_with_trainer.py", line 208, in <module>
    main()
  File "/home/ec2-user/yonatab/ZeroShot/transformers_experiments/src/my_run_squad_tf_with_trainer.py", line 201, in main
    predictions = trainer.predict(test_dataset)
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/trainer_tf.py", line 430, in predict
    return self._prediction_loop(test_dataset, description="Prediction")
  File "/home/ec2-user/anaconda3/envs/yonatan_env_tf2/lib/python3.6/site-packages/transformers/trainer_tf.py", line 213, in _prediction_loop
    preds = logits.numpy()
AttributeError: 'tuple' object has no attribute 'numpy'
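
The AttributeError above most likely comes from the question-answering head returning two tensors (start logits and end logits) as a tuple, so `.numpy()` is called on the tuple rather than on a tensor. Below is a minimal sketch of a conversion helper that accepts either form. `outputs_to_numpy` and `FakeTensor` are hypothetical names used for illustration only; they are not part of transformers or TensorFlow, and `FakeTensor` merely stands in for an eager tensor so the sketch runs without either library:

```python
class FakeTensor:
    """Stand-in for an eager tensor; it only mimics the .numpy() method."""

    def __init__(self, values):
        self._values = list(values)

    def numpy(self):
        return list(self._values)


def outputs_to_numpy(outputs):
    """Convert one tensor, or a tuple/list of tensors, by calling .numpy() on each."""
    if isinstance(outputs, (tuple, list)):
        return tuple(tensor.numpy() for tensor in outputs)
    return outputs.numpy()


# A QA-style output: (start_logits, end_logits)
start_logits, end_logits = outputs_to_numpy(
    (FakeTensor([0.1, 2.0]), FakeTensor([1.5, 0.3]))
)
print(start_logits, end_logits)

# A classification-style output: a single tensor
print(outputs_to_numpy(FakeTensor([0.7, 0.3])))
```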

@jplu
Contributor

jplu commented May 24, 2020

That's normal: evaluation and prediction are not implemented yet. I have to make the example compliant with the SQuAD metric from the nlp framework. It means that for now only training is possible.

But if you want to do this integration yourself and open a PR, you are very welcome to do it :) Otherwise I expect to be able to do it in the next two weeks. Really sorry for that.

@yonatanbitton
Author

Thank you for the answers.

That's why I tried to use the normal tensorflow fit and predict methods as shown here https://blog.tensorflow.org/2019/11/hugging-face-state-of-art-natural.html.

Basically I just want to do training and evaluation during training, and then testing on the test dataset.
I succeeded in doing it with the PyTorch model (run_squad.py), and I have now tried to do it with the TensorFlow model as well. If it is implemented in the future, that will be great; I will wait.

Thanks :) 👍

@jplu
Contributor

jplu commented May 24, 2020

I coded this very quickly, so it is not really tested, but it can give you an idea of how to use the .fit() method. It is based on the Colab version proposed for the nlp framework.

from transformers import (
    BertTokenizerFast,
    TFBertForQuestionAnswering,
)
import tensorflow_datasets as tfds
import tensorflow as tf

ds = tfds.load("squad")

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
model = TFBertForQuestionAnswering.from_pretrained("bert-base-cased")

def get_correct_alignement(context, gold_text, start_idx):
    end_idx = start_idx + len(gold_text)
    if context[start_idx:end_idx] == gold_text:
        return start_idx, end_idx       # When the gold label position is good
    elif context[start_idx-1:end_idx-1] == gold_text:
        return start_idx-1, end_idx-1   # When the gold label is off by one character
    elif context[start_idx-2:end_idx-2] == gold_text:
        return start_idx-2, end_idx-2   # When the gold label is off by two characters
    else:
        raise ValueError("Gold answer not found at or just before the given start index")

def convert_to_tf_features(example, training=True):
    # Decode the byte tensors once and reuse them below
    context = example["context"].numpy().decode("utf-8")
    question = example["question"].numpy().decode("utf-8")
    encodings = tokenizer.encode_plus(context, question, pad_to_max_length=True, max_length=512)
    start_positions, end_positions = [], []

    if training:
        # Training uses only the first gold answer
        start_idx, end_idx = get_correct_alignement(context, example["answers"]["text"][0].numpy().decode("utf-8"), example["answers"]["answer_start"][0].numpy())
        start = encodings.char_to_token(0, start_idx)
        end = encodings.char_to_token(0, end_idx-1)

        # Skip examples whose answer does not map to a token (e.g. lost to truncation)
        if start is None or end is None:
            return None, None

        start_positions.append(start)
        end_positions.append(end)
    else:
        # Evaluation keeps every gold answer
        for answer_start, text in zip(example["answers"]["answer_start"], example["answers"]["text"]):
            start_idx, end_idx = get_correct_alignement(context, text.numpy().decode("utf-8"), answer_start.numpy())

            start = encodings.char_to_token(0, start_idx)
            end = encodings.char_to_token(0, end_idx-1)

            if start is None or end is None:
                return None, None

            start_positions.append(start)
            end_positions.append(end)

    if start_positions and end_positions:
        encodings.update({'output_1': start_positions,
                          'output_2': end_positions})

    return encodings, {'output_1': start_positions, 'output_2': end_positions}

train_features = {}
train_labels = {}
for item in ds["train"]:
  feature, label = convert_to_tf_features(item)
  if feature is not None and label is not None:
    for k, v in feature.items():
      train_features.setdefault(k, []).append(v)
    for k, v in label.items():
      train_labels.setdefault(k, []).append(v)

train_tfdataset = tf.data.Dataset.from_tensor_slices((train_features, train_labels)).batch(8)

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(reduction=tf.keras.losses.Reduction.NONE, from_logits=True)
opt = tf.keras.optimizers.Adam(learning_rate=3e-5)
model.compile(optimizer=opt,
              loss={'output_1': loss_fn, 'output_2': loss_fn},
              loss_weights={'output_1': 1., 'output_2': 1.},
              metrics=['accuracy'])

model.fit(train_tfdataset, epochs=1, steps_per_epoch=3)
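
To see what the alignment helper in the code above does on its own, here is a small standalone sketch. The helper is repeated so the block runs without TensorFlow or transformers, with an error message added; the sample sentence and offsets are made up for illustration:

```python
def get_correct_alignement(context, gold_text, start_idx):
    """Return (start, end) character offsets of gold_text, tolerating a
    recorded start index that is up to two characters too far right."""
    end_idx = start_idx + len(gold_text)
    for shift in (0, 1, 2):
        if context[start_idx - shift:end_idx - shift] == gold_text:
            return start_idx - shift, end_idx - shift
    raise ValueError("gold answer not found near the given start index")


context = "The cat sat on the mat."

# Exact start index: returned unchanged.
print(get_correct_alignement(context, "cat", 4))

# Start index recorded one or two characters too late: shifted back.
print(get_correct_alignement(context, "cat", 5))
print(get_correct_alignement(context, "cat", 6))

# Anything further off is rejected instead of silently mislabelled.
try:
    get_correct_alignement(context, "cat", 10)
except ValueError as err:
    print("rejected:", err)
```

The returned end offset is exclusive, which is why the conversion code passes `end_idx-1` to `char_to_token` to get the token holding the last answer character.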

@yonatanbitton
Author

Thanks for the help :)

I've succeeded in using your code as a reference with my dataset, converting examples to features:

def get_tf_dataset(args, processor, tokenizer, dataset_type):
    filename_by_case = {'train': args.train_file, 'dev': args.dev_file, 'test': args.test_file}
    func_by_case = {'train': processor.get_train_examples, 'dev': processor.get_dev_examples, 'test': processor.get_dev_examples}
    examples = func_by_case[dataset_type](args.data_dir, filename=filename_by_case[dataset_type])

    train_features = {}
    train_labels = {}
    for item in examples:
        feature, label = convert_to_tf_features(item, tokenizer)
        if feature is not None and label is not None:
            for k, v in feature.items():
                train_features.setdefault(k, []).append(v)
            for k, v in label.items():
                train_labels.setdefault(k, []).append(v)

    tfdataset = tf.data.Dataset.from_tensor_slices((train_features, train_labels)).batch(8)
    return tfdataset


def convert_to_tf_features(example, tokenizer, training=True):
    context = example.context_text # example["context"].numpy().decode("utf-8")
    question = example.question_text # example["question"].numpy().decode("utf-8")
    encodings = tokenizer.encode_plus(context, question, pad_to_max_length=True, max_length=512)
    start_positions, end_positions = [], []

    first_answer = example.answers[0] if len(example.answers) > 0 else "" # example["answers"]["text"][0].numpy().decode("utf-8")
    first_answer_start = example.start_position # example["answers"]["answer_start"][0].numpy()
    start_idx, end_idx = get_correct_alignement(context,
                                                first_answer,
                                                first_answer_start)
    start = encodings.char_to_token(0, start_idx)
    end = encodings.char_to_token(0, end_idx - 1) if end_idx > 0 else 0

    if start is None or end is None:
        return None, None

    start_positions.append(start)
    end_positions.append(end)

    if start_positions and end_positions:
        encodings.update({'output_1': start_positions,
                          'output_2': end_positions})

    return encodings, {'output_1': start_positions, 'output_2': end_positions}

I will check other references for how to deal with impossible answers. In this example, when there is no answer, the answer text is the empty string "" and end_position = 0. Thanks.
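
One common convention for unanswerable questions, used for SQuAD v2-style data, is to point both the start and end labels at the [CLS] token. A minimal sketch of that idea follows. The function name `label_positions` and the assumption that [CLS] sits at token index 0 are illustrative, not taken from the code in this thread:

```python
def label_positions(answer_text, start_token, end_token, cls_index=0):
    """Pick (start, end) token labels for one example.

    - Empty answer text: treat as "no answer" and label the [CLS] token.
    - Answer that could not be mapped to tokens: return None so the
      caller can skip the example.
    - Otherwise: keep the mapped token span.
    """
    if not answer_text:
        return cls_index, cls_index
    if start_token is None or end_token is None:
        return None
    return start_token, end_token


print(label_positions("", None, None))        # unanswerable question
print(label_positions("cat", 2, 2))           # normal answer span
print(label_positions("cat", None, None))     # answer lost, e.g. to truncation
```

This keeps "no answer" distinct from "answer could not be located", which the empty-string shortcut above does not.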

@Vanpesy

Vanpesy commented Jun 15, 2020

Hi, how did you solve the Try 1 problem?
AttributeError: 'NoneType' object has no attribute 'strip'
