[BUG] superGLUE example bug #53

Closed
1 of 2 tasks
xuanricheng opened this issue Jun 8, 2022 · 1 comment · Fixed by #57

@xuanricheng
Contributor

xuanricheng commented Jun 8, 2022

Describe the bug

Running the superGLUE example script below raises KeyError: 'MASK'.

task_name = 'qqp'
trainer = Trainer(env_type='pytorch',
                  pytorch_device="cuda",
                  epochs=2,
                  batch_size=1,
                  eval_interval=1000,
                  checkpoint_activations=False,
                  fp16=True,
                  log_interval=1,
                  save_dir="./glm_superglue_en",
                  # master_ip='127.0.0.1',
                  # master_port=17755,
                  # num_nodes=1,
                  # num_gpus=2,
                  # hostfile='./hostfile',
                  model_parallel_size=2,
                  deepspeed_config='./deepspeed.json',
                  training_script=__file__)

model = GLMForSingleTokenCloze.from_pretrain(download_path="/mnt/test_10b_models",
                                             model_name="GLM-large-en")

tokenizer = GLM10bENBPETokenizer()

train_dataset = SuperGlueDataset(task_name=task_name,
                                 data_dir='./datasets/',
                                 dataset_type='train',
                                 tokenizer=tokenizer,
                                 cloze_eval=True)
valid_dataset = SuperGlueDataset(task_name=task_name,
                                 data_dir='./datasets/',
                                 dataset_type='dev',
                                 tokenizer=tokenizer,
                                 cloze_eval=True)

cl_args = CollateArguments()
cl_args.cloze_eval = True

if task_name in ['copa', 'wsc', 'record']:
    cl_args.multi_token = True

from flagai.data.dataset import ConstructSuperglueStrategy

collate_fn = ConstructSuperglueStrategy(cl_args,
                                        tokenizer,
                                        task_name=task_name)
trainer.train(model,
              train_dataset=train_dataset,
              valid_dataset=valid_dataset,
              collate_fn=collate_fn,
              metric_methods=[["acc", accuracy_metric]])
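
According to the traceback under To Reproduce, the failure happens the first time the DataLoader calls collate_fn, well after the model and both dataset splits have been loaded. A minimal sketch of a pre-flight check that collates one small batch before training starts is shown here. The helper name smoke_test_collate and the toy dataset and collate function are hypothetical stand-ins, not part of FlagAI; the sketch assumes only that the dataset supports len() and integer indexing.

```python
def smoke_test_collate(dataset, collate_fn, batch_size=1):
    """Collate the first few examples once, outside the training loop.

    Hypothetical helper: any exception raised by collate_fn surfaces here,
    before the trainer is started.
    """
    if len(dataset) == 0:
        raise ValueError("dataset is empty")
    samples = [dataset[i] for i in range(min(batch_size, len(dataset)))]
    return collate_fn(samples)


# Stand-in data and collate function, used only to make the sketch runnable.
toy_dataset = [
    {"text_a": "How do I learn Python?",
     "text_b": "What is the best way to learn Python?",
     "label": "1"},
]


def toy_collate(samples):
    # Convert the string labels of a list of examples into integers.
    return {"labels": [int(s["label"]) for s in samples]}


batch = smoke_test_collate(toy_dataset, toy_collate)
print(batch)
```

In the script above, the equivalent call would be smoke_test_collate(train_dataset, collate_fn) placed just before trainer.train(...), assuming SuperGlueDataset can be indexed this way.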

Tasks

  • An officially supported task in the examples folder (such as GLUE/Title-generation, ...)
  • My own task or dataset

To Reproduce

Creating qqp dataset from file at ./datasets/ (split=train)
Returning 363846 train examples with label dist.: [('0', 229468), ('1', 134378)]
Creating qqp dataset from file at ./datasets/ (split=dev)
Returning 40430 dev examples with label dist.: [('0', 25545), ('1', 14885)]
Optimizer = Adam
[2022-06-08 17:54:06,911] [INFO] [logger.py:70:log_dist] [Rank -1] loading checkpoints form checkpoints/99
[2022-06-08 17:54:06,912] [INFO] [logger.py:70:log_dist] [Rank -1] WARNING: could not find the metadata file checkpoints/99/latest_checkpointed_iteration.txt
[2022-06-08 17:54:06,912] [INFO] [logger.py:70:log_dist] [Rank -1]     will not load any checkpoints and will start from random
[2022-06-08 17:54:06,912] [INFO] [logger.py:70:log_dist] [Rank -1] working on epoch 0 ...
Traceback (most recent call last):
  File "train_10b_superglue.py", line 59, in <module>
    trainer.train(model,
  File "/opt/conda/lib/python3.8/site-packages/flagai-1.0.1-py3.8.egg/flagai/trainer.py", line 448, in train
    for iteration_, batch in enumerate(train_dataloader):
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 570, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    return self.collate_fn(data)
  File "/opt/conda/lib/python3.8/site-packages/flagai-1.0.1-py3.8.egg/flagai/data/dataset/data_collator/collate_fn.py", line 105, in __call__
    sample = self.pvp.encode(example, {})
  File "/opt/conda/lib/python3.8/site-packages/flagai-1.0.1-py3.8.egg/flagai/data/dataset/superglue/pvp.py", line 195, in encode
    raw_parts_a, raw_parts_b = self.get_parts(example)
  File "/opt/conda/lib/python3.8/site-packages/flagai-1.0.1-py3.8.egg/flagai/data/dataset/superglue/pvp.py", line 1493, in get_parts
    return [text_a], [" Do you mean ", text_b, [self.mask], "."]
  File "/opt/conda/lib/python3.8/site-packages/flagai-1.0.1-py3.8.egg/flagai/data/dataset/superglue/pvp.py", line 99, in mask
    return self.tokenizer.get_command('MASK').Id
  File "/opt/conda/lib/python3.8/site-packages/flagai-1.0.1-py3.8.egg/flagai/data/tokenizer/tokenizer.py", line 172, in get_command
    return self.command_name_map[name]
KeyError: 'MASK'
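
The last frame shows get_command doing a plain dictionary lookup in command_name_map, so the error means the tokenizer instance has no command registered under the name 'MASK'. The sketch below reproduces that failure mode with a stand-in class and adds a check that reports the missing names up front. StubTokenizer and require_commands are hypothetical names, not FlagAI APIs; only command_name_map and get_command are taken from the traceback, and the stub returns a bare integer where the real method returns an object with an Id attribute.

```python
class StubTokenizer:
    """Stand-in for a tokenizer; NOT the FlagAI class."""

    def __init__(self, command_names):
        # Maps command name -> token id, mirroring the dict lookup in the traceback.
        self.command_name_map = {name: i for i, name in enumerate(command_names)}

    def get_command(self, name):
        # A missing name raises KeyError, as in the report.
        return self.command_name_map[name]


def require_commands(tokenizer, names):
    """Raise a descriptive error if any required command token is missing."""
    missing = [n for n in names if n not in tokenizer.command_name_map]
    if missing:
        available = sorted(tokenizer.command_name_map)
        raise ValueError(
            f"tokenizer lacks command tokens {missing}; available: {available}")


# Assumed command names for illustration; note there is no 'MASK' entry.
tok = StubTokenizer(["pad", "eos", "sop"])
try:
    tok.get_command("MASK")
except KeyError as err:
    print("reproduced:", repr(err))
```

Calling require_commands(tokenizer, ["MASK"]) right after constructing the tokenizer would turn the late KeyError into an early message that lists which command names are actually available.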

Expected behavior
Training should start on the qqp task without raising KeyError: 'MASK'.

Screenshots

[Screenshot: flagAI_bug]

OS (please complete the following information):

  • Version: v1.0.1

@marscrazy
Contributor

Fixed by #57.