finetune issue #24

Closed
Celestial-Bai opened this issue Apr 24, 2021 · 2 comments

@Celestial-Bai

Hi!
I tried to run the fine-tuning process with the DNABERT6 pre-trained model, using the provided examples. However, an error occurred:
Traceback (most recent call last):
File "run_finetune.py", line 1281, in <module>
main()
File "run_finetune.py", line 1095, in main
train_dataset = load_and_cache_examples(args, args.task_name, tokenizer, evaluate=False)
File "run_finetune.py", line 704, in load_and_cache_examples
features = torch.load(cached_features_file)
File "/.local/lib/python3.6/site-packages/torch/serialization.py", line 527, in load
with _open_zipfile_reader(f) as opened_zipfile:
File "/.local/lib/python3.6/site-packages/torch/serialization.py", line 224, in __init__
super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: version_ <= kMaxSupportedFileFormatVersion INTERNAL ASSERT FAILED at /pytorch/caffe2/serialize/inline_container.cc:132, please report a bug to PyTorch. Attempted to read a PyTorch file with version 3, but the maximum supported version for reading is 2. Your PyTorch installation may be too old. (init at /pytorch/caffe2/serialize/inline_container.cc:132)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x2b7d94ae6193 in /.local/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: caffe2::serialize::PyTorchStreamReader::init() + 0x1f5b (0x2b7d4ce169eb in /.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #2: caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader(std::string const&) + 0x64 (0x2b7d4ce17c04 in /.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x6c53a6 (0x2b7d493cd3a6 in /.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x2961c4 (0x2b7d48f9e1c4 in /.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #39: __libc_start_main + 0xf5 (0x2b7bb9c17555 in /lib64/libc.so.6)
frame #40: python() [0x400e02]

Our PyTorch version is 1.4.0, but if we upgrade PyTorch, a "StopIteration" error occurs instead.
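
The error message indicates that the cached features file was written by a newer PyTorch than the 1.4.0 used to read it. One possible way to stay on 1.4.0 is to delete the stale cache so it is rebuilt with the installed version. The sketch below is only an illustration under assumptions: `is_zip_serialized` and `drop_stale_cache` are hypothetical helpers, not part of run_finetune.py, and it assumes the script regenerates the cache when the file is missing. Treating every zip-format file as stale is a rough heuristic, since newer PyTorch releases save in a zip-based format by default.

```python
import os
import zipfile


def is_zip_serialized(path):
    """Return True if the file at `path` is a zip archive.

    Newer PyTorch releases write files as zip archives by default,
    while the legacy format is a plain pickle stream.
    """
    return zipfile.is_zipfile(path)


def drop_stale_cache(path):
    """Delete a zip-serialized cache file so it can be rebuilt.

    Returns True if the file was removed, False otherwise.
    """
    if os.path.exists(path) and is_zip_serialized(path):
        os.remove(path)
        return True
    return False
```

Calling `drop_stale_cache` on the cached features path before training would force the features to be recomputed, at the cost of repeating the preprocessing step.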

@hjgwak
Contributor

hjgwak commented Apr 26, 2021

As a temporary workaround, you can restrict the run to a single GPU by adding the following code to run_finetune.py, before any CUDA call:

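import os  # harmless if run_finetune.py already imports it
# Expose only GPU 0 to this process; set before CUDA is initialized.
os.environ['CUDA_DEVICE_ORDER'] = "PCI_BUS_ID"
os.environ['CUDA_VISIBLE_DEVICES'] = "0"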

The StopIteration error occurs only with multi-GPU processing.
If you use a single GPU, it does not occur.
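
The single-GPU restriction can also be wrapped in a small helper, together with a check of what the process will see. This is a minimal sketch: `restrict_to_single_gpu` and `visible_gpu_ids` are hypothetical names introduced here for illustration, and the helper only has an effect if it is called before CUDA is initialized.

```python
import os


def restrict_to_single_gpu(gpu_index=0):
    """Expose only one GPU to this process; call before CUDA is initialized."""
    # Order devices by PCI bus ID so the index matches nvidia-smi.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_index)


def visible_gpu_ids():
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES, or None if unset."""
    value = os.environ.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    return [part.strip() for part in value.split(",") if part.strip()]


restrict_to_single_gpu(0)
print(visible_gpu_ids())  # prints ['0']
```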

Celestial-Bai changed the title from "fintune issue" to "finetune issue" on Apr 26, 2021
@Celestial-Bai
Author

It works! Thank you for your prompt reply! I hope this will be a great help to our research!
