run_mlm.py Issue | MODEL_FOR_MASKED_LM_MAPPING is None #14366

fansiawang · 2021-11-11T08:13:24Z

Environment info

transformers version: 4.13.0.dev0
Platform: Linux-3.10.0-1160.25.1.el7.x86_64-x86_64-with-centos-7.6.1810-Core
Python version: 3.7.3
PyTorch version (GPU?): not installed (NA)
Tensorflow version (GPU?): 2.7.0 (True)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Using GPU in script?: Yes
Using distributed or parallel set-up in script?: No

Who can help

@sgugger @Rocketknight1 @Elysium1436

Information

Model I am using: BERT

The problem arises when using:

the official example script run_mlm.py

The task I am working on is:

language-modeling

To reproduce

Steps to reproduce the behavior:

Prepare the environment:

python3 -m venv venv 

git clone https://github.com/huggingface/transformers
cd transformers
pip install .

pip install tensorflow
pip install datasets
pip install scikit-learn

Run the script:

python run_mlm.py \
--model_name_or_path distilbert-base-cased \
--output_dir output \
--dataset_name wikitext \
--dataset_config_name wikitext-103-raw-v1

The script fails with the following error:

Traceback (most recent call last):
  File "run_mlm.py", line 63, in <module>
    MODEL_CONFIG_CLASSES = list(MODEL_FOR_MASKED_LM_MAPPING.keys())
AttributeError: 'NoneType' object has no attribute 'keys'

Expected behavior

I want to judge whether two lines of text should be merged into one line.
For example:

input: 

The preparations for the Beijing
Winter Olympics are progressing smoothly and are
fully recognized by the International Olympic
Committee, he said. 

output:

The preparations for the Beijing Winter Olympics are progressing smoothly and are fully recognized by the International Olympic Committee, he said.

I think a masked language model may be able to do this. I insert a [MERGE] or [SPLIT] special token at each line break and, when constructing the masked input, mask only these tokens, like this:

Source input:
The preparations for the Beijing [MERGE] Winter Olympics are progressing smoothly and are [MERGE] fully recognized by the International Olympic [MERGE] Committee, he said.

Masked input:
The preparations for the Beijing [MASK] Winter Olympics are progressing smoothly and are [MASK] fully recognized by the International Olympic [MASK] Committee, he said.

But when I try to run the original run_mlm.py script from the tutorials, I get the error above. What do I need to do to run the training correctly? And do you think this line-merging task can be solved with a language model?
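
The marker scheme described above can be sketched in plain Python. This is only an illustration under stated assumptions: it splits on whitespace instead of using a real tokenizer, and the function name build_example and the merge_flags argument are hypothetical names that do not come from run_mlm.py.

```python
# Hypothetical sketch: insert [MERGE]/[SPLIT] markers at line breaks and
# mask only those markers. Whitespace splitting stands in for a tokenizer.
MERGE, SPLIT, MASK = "[MERGE]", "[SPLIT]", "[MASK]"
MARKERS = (MERGE, SPLIT)


def build_example(lines, merge_flags):
    """Return (source_text, masked_text, labels).

    lines:       the physical lines of text.
    merge_flags: merge_flags[i] is True when line i and line i + 1
                 belong together, False when they should stay split.
    """
    if len(merge_flags) != len(lines) - 1:
        raise ValueError("need exactly one flag per gap between lines")

    source = []
    for i, line in enumerate(lines):
        source.extend(line.split())
        if i < len(merge_flags):
            source.append(MERGE if merge_flags[i] else SPLIT)

    # Only the marker positions are masked; every other token stays visible.
    masked = [MASK if tok in MARKERS else tok for tok in source]
    # Labels keep the marker at masked positions and None elsewhere.
    labels = [tok if tok in MARKERS else None for tok in source]
    return " ".join(source), " ".join(masked), labels


lines = [
    "The preparations for the Beijing",
    "Winter Olympics are progressing smoothly and are",
    "fully recognized by the International Olympic",
    "Committee, he said.",
]
source_text, masked_text, labels = build_example(lines, [True, True, True])
print(source_text)
print(masked_text)
```

With a real model, the two markers would presumably also have to be registered as special tokens in the tokenizer so that they are not split into word pieces.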


fansiawang · 2021-11-11T08:33:28Z

I solved the error by running pip install -r examples/pytorch/language-modeling/requirements.txt. Why do I need to install the requirements of the PyTorch example in order to run the TensorFlow example?

sgugger · 2021-11-11T13:19:30Z

Indeed the TensorFlow examples should use the TF mappings, cc @Rocketknight1
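
The failure fits the environment in the report: PyTorch is not installed, and the script reads the PyTorch-side mapping, which is None in that case. Below is a minimal sketch of a guard that makes this explicit. It uses only the standard library; the helper names backend_installed and model_config_classes are hypothetical, and pairing the TensorFlow backend with TF_MODEL_FOR_MASKED_LM_MAPPING is my reading of "the TF mappings", not a quote from the fix.

```python
import importlib.util

# Assumed pairing of backend package -> mapping name exported by transformers.
MLM_MAPPING_NAME = {
    "torch": "MODEL_FOR_MASKED_LM_MAPPING",
    "tensorflow": "TF_MODEL_FOR_MASKED_LM_MAPPING",
}


def backend_installed(package_name):
    """Return True if the package can be found, without importing it."""
    return importlib.util.find_spec(package_name) is not None


def model_config_classes(mapping, backend):
    """Return the mapping's keys, or fail with a readable message.

    `mapping` is whatever object the script imported for `backend`;
    it is None when that backend is not installed.
    """
    name = MLM_MAPPING_NAME[backend]
    if mapping is None:
        raise RuntimeError(
            f"{name} is None: the '{backend}' backend does not appear to be "
            "installed, so this mapping is unavailable."
        )
    return list(mapping.keys())


# Report which backends are visible in the current environment.
for backend in MLM_MAPPING_NAME:
    print(backend, "installed:", backend_installed(backend))
```

A guard like this replaces the bare AttributeError with a message that names the missing backend, which would have pointed directly at the mismatch between the TensorFlow example and the PyTorch mapping.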

Rocketknight1 · 2021-11-11T14:47:56Z

Thank you for this bug report! We've opened a PR to fix it; hopefully it will be merged soon.

Rocketknight1 linked a pull request Nov 11, 2021 that will close this issue

Fixing requirements for TF LM models and use correct model mappings #14372

Merged

Rocketknight1 closed this as completed in #14372 Nov 11, 2021
