Changes to msgpack cause models to fail to load #2996

Closed
jnardone opened this issue Nov 30, 2018 · 7 comments
Labels
feat / serialize (Feature: Serialization, saving and loading), 🔮 thinc (spaCy's machine learning library Thinc), third-party (Third-party packages and services)

Comments

@jnardone

How to reproduce the behaviour

With the release of msgpack 0.6.0 (earlier today), and this change specifically:
msgpack/msgpack-python#295

you can't do something like:
nlp = spacy.load('en_core_web_lg')

which gives:

File "msgpack/_unpacker.pyx", line 187, in msgpack._cmsgpack.unpackb
ValueError: 684830 exceeds max_map_len(32768)

which fails here in spacy:

File "vectors.pyx", line 370, in spacy.vectors.Vectors.from_disk.load_key2row

The lines that interact with msgpack/msgpack-numpy need to be passed non-default limits in order to process larger models, at https://github.com/explosion/spaCy/blob/master/spacy/vectors.pyx#L370 and elsewhere.

OR
you need to restrict msgpack to less than 0.6.0 until this is fixed.
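
For the first option, here is a rough sketch of what "non-default values" could look like. It assumes the fix is to pass a larger `max_map_len` to `msgpack.unpackb`; `max_map_len` is a real msgpack keyword argument, but the helper names and the way the limit is chosen are made up for illustration and are not how spaCy itself does it.

```python
# Limit quoted in the traceback above ("exceeds max_map_len(32768)").
DEFAULT_MAX_MAP_LEN = 32768


def exceeds_default(n_entries, limit=DEFAULT_MAX_MAP_LEN):
    """Return True if a map with n_entries keys would trip the given limit."""
    return n_entries > limit


def unpack_with_limit(data, max_map_len):
    """Unpack msgpack bytes while allowing maps of up to max_map_len entries.

    Hypothetical helper: only msgpack.unpackb and its max_map_len keyword
    come from msgpack; everything else here is an assumption.
    """
    import msgpack  # imported lazily; requires the msgpack package

    return msgpack.unpackb(data, max_map_len=max_map_len)
```

The map in the traceback has 684830 entries, so `exceeds_default(684830)` is true and a call such as `unpack_with_limit(packed_bytes, max_map_len=1000000)` would be needed, where `packed_bytes` stands in for the serialized key-to-row table.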

Your Environment

  • Operating System: MacOS, Ubuntu
  • Python Version Used: 3.6.7
  • spaCy Version Used: 2.0.17
  • Environment Information:
@KaNuNSuZoFLu

same issue here

@honnibal
Member

I pushed a new version of Thinc pinned to msgpack <0.6.0. I think that should take care of the problem?

@ines added the third-party, 🔮 thinc, and feat / serialize labels Nov 30, 2018
@ines
Member

ines commented Nov 30, 2018

Also see #2995!

@imatiach-msft

+1 seeing this as well

@daniel347x

spacy.load(u'en_core_web_lg') gives the same error, triggered from within msgpack_numpy.py - I think maybe msgpack needs to be updated in spaCy as well?

Thanks ahead of time as I work around this myself temporarily right now so I can get back to work while waiting for an update.

@honnibal
Member

Reopening #2995 , as I think it's clearer. tl;dr: Fresh installs should work fine. If you're having problems, you can fix your installation with python -m pip install "msgpack<0.6.0"
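
To check whether an existing environment already satisfies the `msgpack<0.6.0` pin before reinstalling, something like the following might help. It is only a sketch: the function names are hypothetical, the version parsing is deliberately simplified (it reads the leading digits of the first three components and ignores pre-release ordering), and it relies on `importlib.metadata`, which needs Python 3.8 or later.

```python
from importlib import metadata


def parse_version(text):
    """Turn a string like '0.6.0' into a tuple of ints such as (0, 6, 0)."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def violates_pin(version_text):
    """Return True if the version does not satisfy the 'msgpack<0.6.0' pin."""
    return parse_version(version_text) >= (0, 6, 0)


def installed_msgpack_version():
    """Return the installed msgpack version string, or None if not installed."""
    try:
        return metadata.version("msgpack")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_msgpack_version()
    if found is None:
        print("msgpack is not installed")
    elif violates_pin(found):
        print("msgpack", found, 'does not satisfy the pin; run: python -m pip install "msgpack<0.6.0"')
    else:
        print("msgpack", found, "satisfies the pin")
```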

@lock

lock bot commented Dec 30, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock bot locked this issue as resolved and limited conversation to collaborators Dec 30, 2018
6 participants