'torch._C.PyTorchFileReader' object has no attribute 'seek' #994

deadsoul44 · 2021-06-10T15:48:45Z

Hello,

I am using the following model for sentence similarity

https://huggingface.co/sentence-transformers/stsb-xlm-r-multilingual/tree/main

word_embedding_model = models.Transformer(bert_model_dir)  # , max_seq_length=512
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model], device=device_str)

But, I get this error:

Traceback (most recent call last):

  File "/home/work/anaconda/lib/python3.6/site-packages/torch/serialization.py", line 306, in _check_seekable

    f.seek(f.tell())

AttributeError: 'torch._C.PyTorchFileReader' object has no attribute 'seek'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "/home/work/anaconda/lib/python3.6/site-packages/transformers/modeling_utils.py", line 1205, in from_pretrained

    state_dict = torch.load(resolved_archive_file, map_location="cpu")

  File "/home/work/anaconda/lib/python3.6/site-packages/torch/serialization.py", line 584, in load

    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)

  File "/home/work/anaconda/lib/python3.6/site-packages/moxing/framework/file/file_io_patch.py", line 200, in _load

    _check_seekable(f)

  File "/home/work/anaconda/lib/python3.6/site-packages/torch/serialization.py", line 309, in _check_seekable

    raise_err_msg(["seek", "tell"], e)

  File "/home/work/anaconda/lib/python3.6/site-packages/torch/serialization.py", line 302, in raise_err_msg

    raise type(e)(msg)

AttributeError: 'torch._C.PyTorchFileReader' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "code/similarity.py", line 118, in <module>

    word_embedding_model = models.Transformer(bert_model_dir) #, max_seq_length=512

  File "/home/work/anaconda/lib/python3.6/site-packages/sentence_transformers/models/Transformer.py", line 30, in __init__

    self.auto_model = AutoModel.from_pretrained(model_name_or_path, config=config, cache_dir=cache_dir)

  File "/home/work/anaconda/lib/python3.6/site-packages/transformers/models/auto/auto_factory.py", line 381, in from_pretrained

    return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)

  File "/home/work/anaconda/lib/python3.6/site-packages/transformers/modeling_utils.py", line 1208, in from_pretrained

    f"Unable to load weights from pytorch checkpoint file for '{pretrained_model_name_or_path}' "

OSError: Unable to load weights from pytorch checkpoint file for '/home/work/user-job-dir/input/pretrained_models/stsb-xlm-r-multilingual/' at '/home/work/user-job-dir/input/pretrained_models/stsb-xlm-r-multilingual/pytorch_model.bin'If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.

I searched the web but could not find a solution. What could be the problem? Thank you.


nreimers · 2021-06-10T16:20:44Z

Which PyTorch version are you using? Have you tried to update it to some recent version (>= 1.6.0)?

deadsoul44 · 2021-06-10T16:22:50Z

I am using 1.6.0

nreimers · 2021-06-10T16:31:14Z

Do other models work?

deadsoul44 · 2021-06-10T16:40:08Z

I will try this one:

https://huggingface.co/sentence-transformers/paraphrase-xlm-r-multilingual-v1/tree/main

I am renaming the downloaded model zip file to pytorch_model.bin

nreimers · 2021-06-10T18:14:09Z

Hi,
not sure what you are doing. You can either provide the model name directly, and the code will download the model, or you must download the zip file from here:
https://sbert.net/models/

And unzip it by yourself. No renaming of files.

deadsoul44 · 2021-06-10T19:27:52Z

I get this when I do not rename the file:

OSError: Error no file named ['pytorch_model.bin', 'tf_model.h5', 'model.ckpt.index', 'flax_model.msgpack'] found in directory /home/work/user-job-dir/input/pretrained_models/paraphrase-xlm-r-multilingual-v1/ or from_tf and from_flax set to False.

Unzipping extracts an archive folder and there are these files inside:

I will try to download from your link.

nreimers · 2021-06-10T19:38:35Z

What do you download?

As mentioned, you must use the zip files from here:
https://sbert.net/models/

It should have a 0_Transformer folder, a 1_Pooling folder and several json files included.

Also, you can just load it with:
model = SentenceTransformer('path/to/unzipped/folder')

deadsoul44 · 2021-06-10T20:37:27Z

Previously, I downloaded from huggingface. Now, I downloaded from sbert.net

I still get the same error:

I0611 04:23:01.643567 140559902119680 SentenceTransformer.py:39] Load pretrained SentenceTransformer: /home/work/user-job-dir/input/pretrained_models/paraphrase-xlm-r-multilingual-v1/

I0611 04:23:01.644223 140559902119680 SentenceTransformer.py:100] Load SentenceTransformer from folder: /home/work/user-job-dir/input/pretrained_models/paraphrase-xlm-r-multilingual-v1/

Traceback (most recent call last):

  File "/home/work/anaconda/lib/python3.6/site-packages/torch/serialization.py", line 306, in _check_seekable

    f.seek(f.tell())

AttributeError: 'torch._C.PyTorchFileReader' object has no attribute 'seek'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "/home/work/anaconda/lib/python3.6/site-packages/transformers/modeling_utils.py", line 1205, in from_pretrained

    state_dict = torch.load(resolved_archive_file, map_location="cpu")

  File "/home/work/anaconda/lib/python3.6/site-packages/torch/serialization.py", line 584, in load

    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)

  File "/home/work/anaconda/lib/python3.6/site-packages/moxing/framework/file/file_io_patch.py", line 200, in _load

    _check_seekable(f)

  File "/home/work/anaconda/lib/python3.6/site-packages/torch/serialization.py", line 309, in _check_seekable

    raise_err_msg(["seek", "tell"], e)

  File "/home/work/anaconda/lib/python3.6/site-packages/torch/serialization.py", line 302, in raise_err_msg

    raise type(e)(msg)

AttributeError: 'torch._C.PyTorchFileReader' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "code/similarity.py", line 128, in <module>

    model = SentenceTransformer(bert_model_dir, device=device_str)

  File "/home/work/anaconda/lib/python3.6/site-packages/sentence_transformers/SentenceTransformer.py", line 114, in __init__

    module = module_class.load(os.path.join(model_path, module_config['path']))

  File "/home/work/anaconda/lib/python3.6/site-packages/sentence_transformers/models/Transformer.py", line 105, in load

    return Transformer(model_name_or_path=input_path, **config)

  File "/home/work/anaconda/lib/python3.6/site-packages/sentence_transformers/models/Transformer.py", line 30, in __init__

    self.auto_model = AutoModel.from_pretrained(model_name_or_path, config=config, cache_dir=cache_dir)

  File "/home/work/anaconda/lib/python3.6/site-packages/transformers/models/auto/auto_factory.py", line 381, in from_pretrained

    return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)

  File "/home/work/anaconda/lib/python3.6/site-packages/transformers/modeling_utils.py", line 1208, in from_pretrained

    f"Unable to load weights from pytorch checkpoint file for '{pretrained_model_name_or_path}' "

OSError: Unable to load weights from pytorch checkpoint file for '/home/work/user-job-dir/input/pretrained_models/paraphrase-xlm-r-multilingual-v1/0_Transformer' at '/home/work/user-job-dir/input/pretrained_models/paraphrase-xlm-r-multilingual-v1/0_Transformer/pytorch_model.bin'If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.

I tried it now on my local machine also and it is working fine.
But, I get this error on company cloud although the same package versions are installed.

nreimers · 2021-06-11T07:15:33Z

Ah ok, that is the problem.

The torch load function requires that the file system supports seek (e.g. see https://www.tutorialspoint.com/python/file_seek.htm)

Apparently, your company cloud file system does not support this elementary file system operation. Hence, torch.load cannot load any file.

The real solution would be to get a better company cloud with a file system that supports basic I/O operations.
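The error message above suggests pre-loading the data into an io.BytesIO buffer. A minimal sketch of that idea follows, using only the standard library. The names is_seekable, load_via_buffer and NoSeekReader are hypothetical helpers introduced for illustration, and whether this avoids the error on the cloud platform from this thread has not been tested.

```python
import io


def is_seekable(f):
    """Return True if f supports the seek/tell calls that torch.load checks for."""
    try:
        f.seek(f.tell())
        return True
    except (AttributeError, io.UnsupportedOperation):
        return False


def load_via_buffer(path, loader):
    """Read the whole file into memory, then pass a seekable buffer to loader."""
    with open(path, "rb") as fh:
        buffer = io.BytesIO(fh.read())
    return loader(buffer)


class NoSeekReader:
    """Hypothetical stand-in for a reader object that has tell() but no seek()."""

    def tell(self):
        return 0
```

With PyTorch installed, the loader argument could be something like `lambda buf: torch.load(buf, map_location="cpu")`, since torch.load accepts file-like objects. Note that this holds the entire checkpoint in memory.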

In PyTorch 1.6, they changed the file format when models are saved. Maybe on your company cloud the old file format works? You can try it with this model:
https://public.ukp.informatik.tu-darmstadt.de/reimers/sentence-transformers/v0.2/bert-base-nli-cls-token.zip

It still has the pre-PyTorch 1.6 file format.
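To see which format a given checkpoint uses, one rough check is whether the file is a zip archive, since the format introduced in PyTorch 1.6 is zip-based while the older one is not. The sketch below uses only the standard library; uses_zip_format is a hypothetical helper name, and the check is a heuristic rather than the test PyTorch itself performs.

```python
import zipfile


def uses_zip_format(path):
    """Heuristic: True if the checkpoint at path looks like a zip archive."""
    return zipfile.is_zipfile(path)
```

For example, `uses_zip_format("pytorch_model.bin")` returning True would suggest the file was saved with the newer zip-based serialization.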

deadsoul44 · 2021-06-11T10:31:55Z

It seems to be working. Can this model be used for multilingual similarity calculation? Is there any alternative? Thank you.

nreimers · 2021-06-11T10:58:16Z

Sadly not.

But you can convert from the new torch format to the old format like this:

import torch
model = torch.load('pytorch_model.bin')
torch.save('pytorch_model.bin', model, _use_new_zipfile_serialization=False)

Run this on your local machine and then you can push it to your cloud

deadsoul44 · 2021-06-11T12:49:18Z

I get this error when saving the model.

Traceback (most recent call last):
  File "lib\site-packages\torch\serialization.py", line 366, in save
    _legacy_save(obj, opened_file, pickle_module, pickle_protocol)
  File "lib\site-packages\torch\serialization.py", line 426, in _legacy_save
    pickle_module.dump(MAGIC_NUMBER, f, protocol=pickle_protocol)
TypeError: file must have a 'write' attribute

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "code/similarity.py", line 85, in <module>
    torch.save(
  File "lib\site-packages\torch\serialization.py", line 366, in save
    _legacy_save(obj, opened_file, pickle_module, pickle_protocol)
  File "lib\site-packages\torch\serialization.py", line 224, in __exit__
    self.file_like.flush()
AttributeError: 'collections.OrderedDict' object has no attribute 'flush'

Process finished with exit code 1

deadsoul44 · 2021-06-16T13:15:53Z

torch.save(model, 'pytorch_model.bin', _use_new_zipfile_serialization=False)

This is working. Thank you.
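For reference, the working conversion can be sketched as a small script. The earlier traceback came from the swapped arguments: torch.save expects the object first and the destination second, so the OrderedDict was treated as the file. The function name convert_to_legacy_format and the map_location="cpu" argument are additions for illustration and were not part of the commands posted in this thread.

```python
import torch


def convert_to_legacy_format(src_path, dst_path):
    """Re-save a checkpoint in the pre-1.6 (non-zip) serialization format."""
    # Load on the CPU so the script also runs on a machine without a GPU.
    state_dict = torch.load(src_path, map_location="cpu")
    # Object first, destination second; the flag selects the old file format.
    torch.save(state_dict, dst_path, _use_new_zipfile_serialization=False)
```

Run it on a machine where loading works, for example `convert_to_legacy_format("pytorch_model.bin", "pytorch_model_legacy.bin")`, then upload the converted file under the name pytorch_model.bin.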

deadsoul44 closed this as completed Jun 16, 2021
