PyTorchWrapper causes error on deserializing #8319

Open
polm opened this issue Jun 9, 2021 · 3 comments
Labels
bug (Bugs and behaviour differing from documentation) · feat / serialize (Feature: Serialization, saving and loading) · feat / transformer (Feature: Transformer) · 🔮 thinc (spaCy's machine learning library Thinc)

Comments

@polm
Contributor

polm commented Jun 9, 2021

How to reproduce the behaviour

Add a PyTorchWrapper component to a language pipeline and then do this:

nlp.to_disk("something")
nlp2 = spacy.load("something")
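
For context, the kind of setup that triggers this might look as follows — a minimal sketch, where the architecture name, the wrapped torch module and the dimensions are illustrative rather than taken from #8291:

import torch
from thinc.api import PyTorchWrapper
from spacy import registry

# Illustrative registered architecture: any architecture that returns a
# PyTorchWrapper-backed Thinc model and is used by a pipeline component
# is enough to reproduce the behaviour described below.
@registry.architectures("demo.TorchLinear.v1")
def create_torch_linear(nI: int, nO: int):
    # Wrap a plain torch module so spaCy/Thinc can drive it.
    return PyTorchWrapper(torch.nn.Linear(nI, nO))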

The motivating case is covered in #8291. This issue touches code in both Thinc and spacy-transformers.

The issue is that the model has fewer layers at construction time than it does after initialize is called. When deserializing, Thinc detects this mismatch and throws an error.
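
Roughly, the mismatch can be pictured like this (a hedged illustration; "my_torch_component" is a hypothetical component backed by the wrapper above, and the exact layer counts depend on the architecture):

import spacy

nlp = spacy.blank("en")
nlp.add_pipe("my_torch_component")                # hypothetical component name
model = nlp.get_pipe("my_torch_component").model
print(len(model.layers))                          # fewer layers before initialize
nlp.initialize()
print(len(model.layers))                          # more layers after initialize
# A freshly constructed pipeline therefore has a different layer structure
# than the serialized, initialized one, and spacy.load / nlp.from_disk fails
# when Thinc compares the two.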

Your Environment

  • Operating System:
  • Python Version Used:
  • spaCy Version Used:
  • Environment Information:
@polm added the feat / serialize (Feature: Serialization, saving and loading), feat / transformer (Feature: Transformer) and 🔮 thinc (spaCy's machine learning library Thinc) labels on Jun 9, 2021
@svlandeg added the bug (Bugs and behaviour differing from documentation) label on Jun 21, 2021
@svlandeg
Member

Related issue, cf. the code in the IO tests of explosion/spacy-transformers#277:

nlp.to_disk(file_path)
nlp2 = util.load_model_from_config(orig_config, auto_fill=True, validate=True)
nlp2.initialize(lambda: train_examples)
nlp2.from_disk(file_path)

--> It shouldn't be necessary to call initialize here, should it?

Can this be fixed by implementing an "empty shim" when constructing the Transformer to avoid the shape mismatch?
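
For what it's worth, one way to picture the "empty shim" idea (purely a conceptual sketch, not the actual fix that was merged): construct the wrapper with a shim whose underlying torch module isn't loaded yet, so a freshly built model already has the same layer/shim structure as a serialized, initialized one, and deserialization can fill in the weights later.

from thinc.api import Model
from thinc.shims import PyTorchShim

def empty_pytorch_wrapper() -> Model:
    # Placeholder forward pass; the real torch module would be attached to
    # the shim during initialization, after which from_disk/from_bytes can
    # restore its weights without a structure mismatch.
    def forward(model, X, is_train):
        return X, lambda dY: dY

    return Model("pytorch", forward, shims=[PyTorchShim(None)])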

@svlandeg
Member

Update: the spacy-transformers part of this, at least, is fixed by explosion/spacy-transformers#285.

@oliviercwa

Additional repro steps in #9317 can be used to regression-test the fix.
