save model issue #39

Closed
szhsjsu opened this issue Jul 8, 2021 · 10 comments
Labels
bug Something isn't working

Comments

@szhsjsu

szhsjsu commented Jul 8, 2021

I get an error when I try to train the model on my dataset.

Traceback (most recent call last):
  File "train.py", line 29, in <module>
    auto_device=True  # Auto choose CUDA or CPU
  File "/home/hsz/.conda/envs/win_env/lib/python3.6/site-packages/pyabsa/functional.py", line 157, in train_atepc
    model_path.append(train4atepc(t_config))
  File "/home/hsz/.conda/envs/win_env/lib/python3.6/site-packages/pyabsa/tasks/atepc/training/atepc_trainer.py", line 382, in train4atepc
    return trainer.train()
  File "/home/hsz/.conda/envs/win_env/lib/python3.6/site-packages/pyabsa/tasks/atepc/training/atepc_trainer.py", line 204, in train
    self._save_model(self.opt, self.model, save_path, mode=0)
  File "/home/hsz/.conda/envs/win_env/lib/python3.6/site-packages/pyabsa/tasks/atepc/training/atepc_trainer.py", line 354, in _save_model
    torch.save(model_to_save.cpu(), save_path + args_to_save.model_name + '.model')  # save the whole model
  File "/home/hsz/.conda/envs/win_env/lib/python3.6/site-packages/torch/serialization.py", line 379, in save
    _save(obj, opened_zipfile, pickle_module, pickle_protocol)
  File "/home/hsz/.conda/envs/win_env/lib/python3.6/site-packages/torch/serialization.py", line 484, in _save
    pickler.dump(obj)
TypeError: can't pickle _thread.RLock objects

It seems that one or more objects in self.model cannot be dumped by pickle.
However, if I change the saving mode to any value other than 0, the _save_model function works fine. So I guess the difference is the type of file being saved: 'pytorch.bin' can be generated, but the '.model' file cannot.

Could you help me out? Thanks.
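
The failure described above can be reproduced without PyTorch. The following is a minimal sketch, not PyABSA code: `ToyModel` and `unpicklable_attributes` are hypothetical names introduced only for illustration, and the assumption is that some attribute reachable from the model holds a `threading.RLock`.

```python
import pickle
import threading


class ToyModel:
    """Hypothetical stand-in for a model object that carries a lock."""

    def __init__(self):
        self.weights = [0.1, 0.2, 0.3]  # plain data: picklable
        self.lock = threading.RLock()   # thread lock: not picklable


def unpicklable_attributes(obj):
    """Return the names of attributes that pickle refuses to serialize."""
    bad = []
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception:
            bad.append(name)
    return bad


model = ToyModel()

try:
    pickle.dumps(model)
    whole_object_ok = True
except (TypeError, pickle.PicklingError):
    # The exact message differs between Python versions.
    whole_object_ok = False

print(whole_object_ok)                # False
print(unpicklable_attributes(model))  # ['lock']
```

A helper like this only inspects top-level attributes; for a real model the offending object may be nested deeper, so it is a starting point for narrowing things down rather than a complete diagnosis.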

@yangheng95
Owner

Hi,

I noticed this issue report earlier. I tested the code on Windows and CentOS but could not reproduce the problem. I suspect the cause is the version of pickle. I will fix this as soon as I can; please watch for the new release on PyPI.

@szhsjsu
Author

szhsjsu commented Jul 8, 2021

Great! Appreciate it.

@yangheng95
Owner

As a suggestion, you can uncomment the line

# torch.save(model_to_save.cpu().state_dict(),

to save only the state_dict (and comment out line 354 of atepc_trainer.py accordingly). If this works, you will need to download the BERT-BASE model from Hugging Face, if you have not downloaded it before, which can be hard on a poor network connection.
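
To illustrate the state_dict suggestion, here is a minimal sketch using a tiny `nn.Linear` as a stand-in for the trained model. The file name and the model are assumptions made for the example; this is not the PyABSA `_save_model` implementation.

```python
import os
import tempfile

import torch
from torch import nn

# Stand-in for the trained model; the real ATEPC model is much larger.
model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    save_path = os.path.join(tmp_dir, "demo.state_dict")

    # Save only the parameter tensors, not the Python object graph around them.
    torch.save(model.cpu().state_dict(), save_path)

    # Loading means rebuilding the architecture first, then filling in the weights.
    restored = nn.Linear(4, 2)
    restored.load_state_dict(torch.load(save_path, map_location="cpu"))

weights_match = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
print(weights_match)
```

Because only tensors are written, any non-picklable attribute hanging off the model object never reaches pickle. The trade-off is the one mentioned above: the architecture (here, the BERT base model) has to be available again at load time.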

@yangheng95
Owner

The model loading code is compatible with the model saved in state_dict mode. Please let me know if this works or not, many thanks.

@szhsjsu
Author

szhsjsu commented Jul 8, 2021

Got it. I'll try it later and send you the feedback. Thank you for the quick reply and solutions.

@yangheng95
Owner

You are welcome! Can you export the package list to help reproduce the problem?

@szhsjsu
Author

szhsjsu commented Jul 8, 2021

Package Version


anyio 2.2.0
argon2-cffi 20.1.0
async-generator 1.10
attrs 20.3.0
Babel 2.9.0
backcall 0.2.0
backports.functools-lru-cache 1.6.1
bleach 3.2.1
blis 0.7.4
boto3 1.17.98
botocore 1.20.98
brotlipy 0.7.0
bson 0.5.10
catalogue 2.0.4
certifi 2020.12.5
cffi 1.14.5
chardet 4.0.0
click 7.1.2
colorama 0.4.4
contextvars 2.4
cryptography 3.4.6
cymem 2.0.5
dataclasses 0.8
decorator 4.4.2
defusedxml 0.6.0
elasticsearch 7.12.0
en-core-web-sm 3.0.0
entrypoints 0.3
filelock 3.0.12
gitdb 4.0.7
GitPython 3.1.18
googledrivedownloader 0.4
huggingface-hub 0.0.12
idna 2.10
immutables 0.15
importlib-metadata 3.4.0
ipykernel 5.3.4
ipython 5.8.0
ipython-genutils 0.2.0
Jinja2 2.11.2
jmespath 0.10.0
joblib 1.0.1
json5 0.9.5
jsonschema 3.2.0
jupyter-client 6.1.11
jupyter-core 4.7.1
jupyter-server 1.4.1
jupyterlab 3.0.5
jupyterlab-pygments 0.1.2
jupyterlab-server 2.1.2
kafka 1.3.5
mariadb 1.0.6
MarkupSafe 1.1.1
mistune 0.8.4
murmurhash 1.0.5
nbclassic 0.2.6
nbclient 0.5.1
nbconvert 6.0.7
nbformat 5.1.2
nest-asyncio 1.4.3
networkx 2.5.1
notebook 6.3.0
numpy 1.19.5
packaging 21.0
pandas 1.1.5
pandocfilters 1.4.2
parso 0.8.1
pathy 0.6.0
pexpect 4.8.0
pickleshare 0.7.5
Pillow 8.2.0
pip 21.0.1
preshed 3.0.5
prometheus-client 0.9.0
prompt-toolkit 1.0.15
ptyprocess 0.7.0
pyabsa 0.8.8.0
pycparser 2.20
pydantic 1.7.4
Pygments 2.8.1
pyOpenSSL 20.0.1
pyparsing 2.4.7
pyrsistent 0.17.3
PySocks 1.7.1
python-dateutil 2.8.1
pytorch-transformers 1.2.0
pytz 2020.5
PyYAML 5.4.1
pyzmq 20.0.0
regex 2021.4.4
requests 2.25.1
s3transfer 0.4.2
sacremoses 0.0.45
scikit-learn 0.24.2
scipy 1.5.4
Send2Trash 1.5.0
sentencepiece 0.1.96
seqeval 1.2.2
setuptools 52.0.0.post20210125
simplegeneric 0.8.1
six 1.15.0
sklearn 0.0
smart-open 5.1.0
smmap 4.0.0
sniffio 1.2.0
spacy 3.0.6
spacy-legacy 3.0.6
srsly 2.4.1
termcolor 1.1.0
terminado 0.9.3
testpath 0.4.4
thinc 8.0.7
threadpoolctl 2.1.0
tokenizers 0.10.3
torch 1.9.0+cu111
torchaudio 0.9.0
torchvision 0.10.0+cu111
tornado 6.1
tqdm 4.61.1
traitlets 4.3.3
transformers 4.8.2
typer 0.3.2
typing-extensions 3.7.4.3
update-checker 0.18.0
urllib3 1.26.4
wasabi 0.8.2
wcwidth 0.2.5
webencodings 0.5.1
wheel 0.36.2
xlrd 1.2.0
zipp 3.4.0

@yangheng95
Owner

I still can't reproduce the problem. However, I found some answers suggesting it may be caused by a logger holding a lock while the model is being saved. I revised some of the logger code in v0.8.8.2; please help me check whether it works.
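
The actual change made in 0.8.8.2 is not shown in this thread. As a general illustration of the logger hypothesis, the sketch below shows one common pattern: excluding a logging handler from the pickled state via `__getstate__`. `Trainer` is a hypothetical class, not part of PyABSA. For context, `logging.Logger` objects only became picklable in Python 3.7, and the traceback above comes from Python 3.6; whether that explains the difference between environments is not confirmed here.

```python
import logging
import pickle


class Trainer:
    """Hypothetical object that keeps a log handler next to its real state."""

    def __init__(self, model_name):
        self.model_name = model_name
        # Handlers create a threading.RLock internally, so they cannot be pickled.
        self.handler = logging.StreamHandler()

    def __getstate__(self):
        # Copy the attribute dict and drop the handler before pickling.
        state = self.__dict__.copy()
        state.pop("handler", None)
        return state

    def __setstate__(self, state):
        # Restore the plain attributes and build a fresh handler.
        self.__dict__.update(state)
        self.handler = logging.StreamHandler()


trainer = Trainer("demo")
restored = pickle.loads(pickle.dumps(trainer))
print(restored.model_name)  # demo
```

An alternative with the same effect is to keep loggers and handlers out of the object that gets saved in the first place, for example by looking them up with `logging.getLogger(name)` where they are needed.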

@yangheng95
Owner

According to a test report, this problem, which occurs in some situations, has been solved in 0.8.8.2. Thanks for your help.

yangheng95 added the bug label Jul 8, 2021

@szhsjsu
Author

szhsjsu commented Jul 9, 2021

It works perfectly. Brilliant!

szhsjsu closed this as completed Jul 9, 2021