
FlairEmbeddings function gives ValueError in python 3.6 (latest Nvidia Pytorch Docker container) #1744

Closed
tylerlekang opened this issue Jul 6, 2020 · 7 comments · Fixed by #1745
Labels
bug Something isn't working

Comments

@tylerlekang

Describe the bug
Simply running the code FlairEmbeddings('news-forward') gives a ValueError in Python 3.6.10 (Conda), the Python environment included in the most recent PyTorch Docker container from Nvidia. (https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel_20-06.html#rel_20-06)

Here is the error message:

Traceback (most recent call last):
  File "fineTune_langModel.py", line 10, in <module>
    language_model = FlairEmbeddings('news-forward').lm
  File "/opt/conda/lib/python3.6/site-packages/flair/embeddings/token.py", line 578, in __init__
    self.lm: LanguageModel = LanguageModel.load_language_model(model)
  File "/opt/conda/lib/python3.6/site-packages/flair/models/language_model.py", line 202, in load_language_model
    dropout=state["dropout"],
  File "/opt/conda/lib/python3.6/site-packages/flair/models/language_model.py", line 63, in __init__
    self.to(flair.device)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 465, in to
    return self._apply(convert)
  File "/opt/conda/lib/python3.6/site-packages/flair/models/language_model.py", line 404, in _apply
    for info in torch.__version__.replace("+",".").split('.') if info.isdigit())
ValueError: not enough values to unpack (expected at least 3, got 2)
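
The failure can be reproduced in isolation: the generator expression in language_model.py keeps only purely numeric version components, so a version string like 1.6.0a0+9907a3e yields fewer than the three integers the unpacking expects (a minimal sketch of the same logic, without importing torch or flair):

```python
version = "1.6.0a0+9907a3e"  # torch.__version__ reported by the Nvidia container
parts = [int(info) for info in version.replace("+", ".").split(".") if info.isdigit()]
print(parts)  # [1, 6] -- "0a0" and "9907a3e" fail isdigit() and are dropped

try:
    major, minor, build, *_ = (int(p) for p in parts)
except ValueError as e:
    print(e)  # not enough values to unpack (expected at least 3, got 2)
```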

To Reproduce
Run this code in the container (after pip installing flair):
from flair.embeddings import FlairEmbeddings
FlairEmbeddings('news-forward')

Expected behavior
The call should succeed with no errors (in particular, the .lm is intended to be passed to the LanguageModelTrainer).

Environment (please complete the following information):

  • Docker container running Linux (all details of Ubuntu, Python, PyTorch, etc. versions are in the Nvidia link above)
  • Flair version is 0.5

Additional context
Running the code in Python 3.7.7 (Conda) gives no problems.

@tylerlekang tylerlekang added the bug Something isn't working label Jul 6, 2020
@alanakbik
Collaborator

@tylerlekang thanks for reporting this. Could you print the torch version you get with torch.__version__? Also, can you try updating to Flair 0.5.1?

@tylerlekang
Author

@alanakbik torch.__version__ reports 1.6.0a0+9907a3e, which matches what is shown in https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel_20-06.html#rel_20-06

Used pip install --upgrade flair to upgrade to 0.5.1. The same error persists:

>>> flair.__version__
'0.5.1'
>>>
>>> from flair.embeddings import FlairEmbeddings
>>> FlairEmbeddings('news-forward')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/conda/lib/python3.6/site-packages/flair/embeddings/token.py", line 586, in __init__
    self.lm: LanguageModel = LanguageModel.load_language_model(model)
  File "/opt/conda/lib/python3.6/site-packages/flair/models/language_model.py", line 202, in load_language_model
    dropout=state["dropout"],
  File "/opt/conda/lib/python3.6/site-packages/flair/models/language_model.py", line 63, in __init__
    self.to(flair.device)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 465, in to
    return self._apply(convert)
  File "/opt/conda/lib/python3.6/site-packages/flair/models/language_model.py", line 404, in _apply
    for info in torch.__version__.replace("+",".").split('.') if info.isdigit())
ValueError: not enough values to unpack (expected at least 3, got 2)

Did you test with this container? It seems like a common and important container to verify, as it is Nvidia's official container optimized for PyTorch applications.

Thank you very much for your support! :)

@tylerlekang
Author

tylerlekang commented Jul 7, 2020

@alanakbik In the models/language_model.py code, the first line of _apply is (starts at line 402):

major, minor, build, *_ = (int(info)
                                for info in torch.__version__.replace("+",".").split('.') if info.isdigit())

If I simply run torch.__version__.replace("+",".").split('.') in the container (Python 3.6.10), it returns ['1', '6', '0a0', '9907a3e']. Then, if I run for i in (int(info) for info in torch.__version__.replace("+",".").split('.') if info.isdigit()): print(i), it prints:

1
6

However, on my local machine running vanilla Python 3.7.7, the torch version is just 1.5.0, so this is likely the problem.

I have no idea why Nvidia chose a PyTorch version with letters in the version number, but they did make this choice, and this container is supposed to be an easy solution for highly optimized GPU runs on their hardware.

Do you have any workaround ideas?
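
One possible workaround would be to parse the version leniently, taking the leading digit runs instead of discarding any component that is not purely numeric. A sketch only; parse_torch_version is a hypothetical helper, not part of Flair:

```python
import re

def parse_torch_version(version):
    # Hypothetical lenient parser: take the first three digit runs,
    # padding with zeros if fewer than three are found.
    numbers = re.findall(r"\d+", version)[:3]
    numbers += ["0"] * (3 - len(numbers))
    return tuple(int(n) for n in numbers)

print(parse_torch_version("1.6.0a0+9907a3e"))  # (1, 6, 0)
print(parse_torch_version("1.5.0"))            # (1, 5, 0)
```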

@tylerlekang
Author

@alanakbik could I just hardcode the major, minor, and build numbers in my local version of language_model.py if there is no workaround?

major = 1
minor = 6
build = 0

It seems the code just checks that major.minor is >= 1.4? But I don't want to break any other parts of the code.

@alanakbik
Collaborator

Yes, I guess you could just overwrite torch.__version__ as a quick fix by calling this before your script:

import torch

torch.__version__ = '1.5.0'

Meanwhile, I will put in a PR to fix the error.
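
With the version string overwritten, the generator expression in language_model.py unpacks cleanly again (simulated below without importing torch or flair; '1.5.0' is just a stand-in string above 1.4.0):

```python
fake_version = "1.5.0"  # what torch.__version__ reports after the overwrite
major, minor, build, *_ = (int(info)
                           for info in fake_version.replace("+", ".").split(".")
                           if info.isdigit())
print(major, minor, build)  # 1 5 0
```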

@tylerlekang
Author

@alanakbik just wanting to triple-confirm: that shouldn't cause any problems with the rest of the FlairEmbeddings or LanguageModelTrainer code? Thank you!

@alanakbik
Collaborator

It shouldn't cause any problems on the flair side. We only use the string to determine whether an old version of torch (< 1.4.0) is being used, so changing it to any other string above 1.4.0 won't change anything.
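
In other words, the parsed version only feeds a comparison along these lines (an illustration of the described check; the function name is hypothetical, not Flair's actual code):

```python
def is_old_torch(major, minor, build):
    # Hypothetical sketch: anything below torch 1.4.0 counts as "old"
    return (major, minor, build) < (1, 4, 0)

print(is_old_torch(1, 5, 0))  # False -- the overwritten '1.5.0' passes the check
print(is_old_torch(1, 3, 1))  # True
```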
