
🐛 Trainer on TPU : KeyError '__getstate__' #4327

Closed
2 of 4 tasks
astariul opened this issue May 13, 2020 · 7 comments

@astariul
Contributor

astariul commented May 13, 2020

🐛 Bug

Information

Model I am using: ELECTRA base

Language I am using the model on: English

The problem arises when using:

  • the official example scripts
  • my own modified scripts

The task I am working on is:

  • an official GLUE/SQUaD task
  • my own task or dataset

To reproduce

I'm trying to fine-tune a model on Colab TPU, using the new Trainer API. But I'm struggling.

Here is a self-contained Colab notebook to reproduce the error (it's a dummy example).

When running the notebook, I get the following error:

File "/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils.py", line 199, in __getattr__
    return self.data[item]
KeyError: '__getstate__'

Full stack trace:

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch_xla/distributed/parallel_loader.py", line 172, in _worker
    batch = xm.send_cpu_data_to_device(batch, device)
  File "/usr/local/lib/python3.6/dist-packages/torch_xla/core/xla_model.py", line 624, in send_cpu_data_to_device
    return ToXlaTensorArena(convert_fn, select_fn).transform(data)
  File "/usr/local/lib/python3.6/dist-packages/torch_xla/core/xla_model.py", line 307, in transform
    return self._replace_tensors(inputs)
  File "/usr/local/lib/python3.6/dist-packages/torch_xla/core/xla_model.py", line 301, in _replace_tensors
    convert_fn)
  File "/usr/local/lib/python3.6/dist-packages/torch_xla/utils/utils.py", line 199, in for_each_instance_rewrite
    return _for_each_instance_rewrite(value, select_fn, fn, rwmap)
  File "/usr/local/lib/python3.6/dist-packages/torch_xla/utils/utils.py", line 179, in _for_each_instance_rewrite
    result.append(_for_each_instance_rewrite(x, select_fn, fn, rwmap))
  File "/usr/local/lib/python3.6/dist-packages/torch_xla/utils/utils.py", line 187, in _for_each_instance_rewrite
    result = copy.copy(value)
  File "/usr/lib/python3.6/copy.py", line 96, in copy
    rv = reductor(4)
  File "/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils.py", line 199, in __getattr__
    return self.data[item]
KeyError: '__getstate__'

Any hint on how to make this dummy example work is welcome.
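For context on why the copy/pickle machinery trips here: `BatchEncoding.__getattr__` forwards attribute lookups to `self.data` and raises `KeyError` instead of `AttributeError` for missing names, which breaks anything that probes attributes and only expects `AttributeError` (on Python 3.6, the copy/pickle protocol probes `__getstate__` exactly this way, producing the `KeyError: '__getstate__'` above). A minimal stand-in shows the failure mode; `DictBacked` below is a hypothetical mimic, not the real transformers class:

```python
# Hypothetical stand-in for BatchEncoding's dict-forwarding __getattr__:
# it raises KeyError for missing names, where Python's attribute protocol
# expects AttributeError.
class DictBacked:
    def __init__(self, data):
        self.data = data

    def __getattr__(self, item):
        return self.data[item]  # KeyError, not AttributeError, when missing

obj = DictBacked({"input_ids": [1, 2, 3]})

# hasattr() only swallows AttributeError, so the KeyError leaks out -- the
# same leak happens when copy.copy() probes for '__getstate__' on Python 3.6.
try:
    hasattr(obj, "some_missing_attr")
    leaked = None
except KeyError as e:
    leaked = e
print("leaked:", leaked)
```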

Environment info

  • transformers version: 2.9.0
  • Platform: Linux-4.19.104+-x86_64-with-Ubuntu-18.04-bionic
  • Python version: 3.6.9
  • PyTorch version (GPU?): 1.6.0a0+cf82011 (False)
  • Tensorflow version (GPU?): 2.2.0 (False)
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

@jysohn23 @julien-c

@astariul astariul changed the title from 🐛 Trainer on TPU : to 🐛 Trainer on TPU : KeyError '__getstate__' May 13, 2020
@astariul
Contributor Author

As a temporary work-around, I made the BatchEncoding object picklable:

from transformers.tokenization_utils import BatchEncoding

def red(self):
    return BatchEncoding, (self.data, )

BatchEncoding.__reduce__ = red

Not closing yet, as this seems to be just a work-around and not a real solution.
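The effect of this patch can be checked in isolation. The sketch below uses `DictBacked`, a hypothetical mimic of `BatchEncoding`'s dict-forwarding `__getattr__` (not the real class), to show that defining `__reduce__` lets pickle rebuild the object without ever hitting the `__getstate__` probe:

```python
import pickle

# Hypothetical mimic of BatchEncoding's dict-forwarding __getattr__.
class DictBacked:
    def __init__(self, data):
        self.data = data

    def __getattr__(self, item):
        return self.data[item]

# The work-around pattern: __reduce__ tells pickle/copy how to rebuild the
# object from its data dict, so the '__getstate__' attribute probe that
# would trigger __getattr__ is never reached.
def red(self):
    return DictBacked, (self.data,)

DictBacked.__reduce__ = red

obj = DictBacked({"input_ids": [1, 2, 3]})
clone = pickle.loads(pickle.dumps(obj))
print(clone.data)  # {'input_ids': [1, 2, 3]}
```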

@julien-c
Member

Cc @mfuntowicz

@stale

stale bot commented Jul 13, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jul 13, 2020
@wenfeixiang1991

Hi, I got the same error with transformers 2.9.0 and 2.9.1:


(the same traceback is printed by each DataLoader worker process; shown once here)

Traceback (most recent call last):
  File "/Users/kiwi/anaconda/python.app/Contents/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/Users/kiwi/anaconda/python.app/Contents/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
  File "/Users/kiwi/anaconda/python.app/Contents/lib/python3.6/site-packages/transformers/tokenization_utils.py", line 203, in __getattr__
    return self.data[item]
KeyError: '__getstate__'

My code pieces:

dl = DataLoader(data, batch_size=self.batch_size, shuffle=self.shuffle,
                collate_fn=partial(self.tok_collate), num_workers=2)
return dl

def tok_collate(self, batch_data):

    encoded = self.tokenizer.batch_encode_plus(
        [x[0] for x in batch_data],
        add_special_tokens=True,
        #return_tensors='pt',
        pad_to_max_length=True)

    for i in range(len(encoded['input_ids'])):
        print("tokens         : {}".format([self.tokenizer.convert_ids_to_tokens(s) for s in encoded['input_ids'][i]]))
        print("input_ids      : {}".format(encoded['input_ids'][i]))
        print("token_type_ids : {}".format(encoded['token_type_ids'][i]))
        print("attention_mask : {}".format(encoded['attention_mask'][i]))
        print('------------')

    if self.predict:
        return encoded
    else:
        labels = torch.tensor([x[1] for x in batch_data])
        # print('labels: ', labels)
        return encoded, labels
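One way to dodge the error in a `collate_fn` like the one above, without patching transformers, is to copy the `batch_encode_plus` output into a plain `dict` before returning it, since built-in dicts pickle fine across DataLoader worker processes. A sketch with a hypothetical `DictBacked` stand-in for `BatchEncoding` (which, like the real class, supports `.items()`):

```python
import pickle

# Hypothetical stand-in for BatchEncoding: dict-forwarding __getattr__ plus
# the .items() view that the real class also exposes.
class DictBacked:
    def __init__(self, data):
        self.data = data

    def __getattr__(self, item):
        return self.data[item]

    def items(self):
        return self.data.items()

encoded = DictBacked({"input_ids": [[1, 2], [3, 4]],
                      "attention_mask": [[1, 1], [1, 1]]})

# Copy into a plain dict before it crosses the worker/process boundary;
# plain dicts of lists pickle without ever touching __getattr__.
plain = {k: v for k, v in encoded.items()}
clone = pickle.loads(pickle.dumps(plain))
print(sorted(clone))  # ['attention_mask', 'input_ids']
```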

@stale stale bot removed the wontfix label Jul 16, 2020
@wenfeixiang1991

@colanim How did you solve it?

@astariul
Contributor Author

I think it was fixed in the latest version of transformers.

If you need to work with an older version of transformers, the work-around I mentioned earlier worked for me:

As a temporary work-around, I made the BatchEncoding object picklable:

from transformers.tokenization_utils import BatchEncoding

def red(self):
    return BatchEncoding, (self.data, )

BatchEncoding.__reduce__ = red

@LysandreJik
Member

Indeed, this should have been fixed in the versions v3+. Thanks for opening an issue @colanim.

Labels: None yet
Projects: None yet
Development: No branches or pull requests
4 participants