TokenVectorEncoder object is not iterable when running example in 2.0 alpha #1444

Closed
mikeatm opened this issue Oct 21, 2017 · 9 comments
Labels
examples: Code examples in /examples
🌙 nightly: Discussion and contributions related to nightly builds

Comments

@mikeatm

mikeatm commented Oct 21, 2017

I'm trying to run one of the examples in the 2.0.0 alpha, the one for extending
a pre-existing model with custom NER tags, available here [1].
Here is the error I get:

$ python train_new_entity_type.py  en  othersame 
Creating initial model en
Traceback (most recent call last):
  File "train_new_entity_type.py", line 124, in <module>
    plac.call(main)
  File "/home/data/experim/spc/sp2env/lib/python2.7/site-packages/plac_core.py", line 328, in call
    cmd, result = parser.consume(arglist)
  File "/home/data/experim/spc/sp2env/lib/python2.7/site-packages/plac_core.py", line 207, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "train_new_entity_type.py", line 106, in main
    train_ner(nlp, train_data, output_directory)
  File "train_new_entity_type.py", line 53, in train_ner
    optimizer = nlp.begin_training(lambda: [])
  File "/home/data/experim/spc/sp2env/lib/python2.7/site-packages/spacy/language.py", line 410, in begin_training
    for name, proc in self.pipeline:
TypeError: 'TokenVectorEncoder' object is not iterable

I expected this to work, as it's already documented here [2].
The models and the spaCy install are all recent, fresh installs (21 October).

Your Environment

    Info about spaCy

    Python version     2.7.13         
    Platform           Linux-4.11.12-100.fc24.x86_64-x86_64-with-fedora-24-Twenty_Four
    spaCy version      2.0.0a17       
    Location           /home/data/experim/spc/sp2env/lib/python2.7/site-packages/spacy
    Models             en_core_web_sm, en_core_web_lg
  • Operating System: Fedora Linux
  • Python Version Used: Python 2.7.13 (also reproducible with 3.5.3)
  • spaCy Version Used: 2.0.0a17
  • Environment Information:

[1] https://github.com/explosion/spaCy/blob/develop/examples/training/train_new_entity_type.py
[2] https://alpha.spacy.io/usage/training#example-new-entity-type

@ines
Member

ines commented Oct 21, 2017

I think you might be using an outdated model that still has the tensorizer in the pipeline. The latest alpha version now has a handy command that lets you check that all models are compatible and up to date, and shows you which ones need to be upgraded:

spacy validate

So simply downloading the latest en_core_web_sm or en_core_web_lg model should hopefully fix this.

@ines ines added the 🌙 nightly Discussion and contributions related to nightly builds label Oct 21, 2017
@mikeatm
Author

mikeatm commented Oct 21, 2017

I hope this is the case, but here is the output of validate:

$ spacy validate  

    Installed models (spaCy v2.0.0a17)
    /home/data/experim/spc/sp2env/lib/python2.7/site-packages/spacy

    TYPE        NAME                  MODEL                 VERSION                                   
    package     en-core-web-sm        en_core_web_sm        2.0.0a7  ✔      
    package     en-core-web-lg        en_core_web_lg        2.0.0a1  ✔      
    link        en_core_web_lg        en_core_web_lg        2.0.0a1  ✔      
    link        en_core_web_sm        en_core_web_sm        2.0.0a7  ✔ 

I've been really looking forward to adding custom NER tags, so I'm eager
to get it working.

@ines
Member

ines commented Oct 22, 2017

Thanks for updating! I think I found the issue – try removing this line:

nlp.pipeline.append(TokenVectorEncoder(nlp.vocab))

I think we may have forgotten to push the updated version of the example for the latest alpha release and models, sorry about that.

Edit: Since the pipeline architecture has changed and nlp.pipeline entries are now (name, func) tuples, this line also has to be adjusted:

- nlp.pipeline.append(NeuralEntityRecognizer(nlp.vocab))
+ nlp.add_pipe(NeuralEntityRecognizer(nlp.vocab))

Will test this as soon as we have time and adjust it accordingly!
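
For later readers, here is a minimal sketch of why the traceback above ends in a TypeError. It does not use spaCy at all: `FakeComponent` and `iterate_pipeline` are hypothetical stand-ins, and the loop only mimics the `for name, proc in self.pipeline` line quoted from language.py. The point is that a bare object appended to a list of (name, func) tuples cannot be unpacked into two names.

```python
class FakeComponent(object):
    """Hypothetical stand-in for a pipeline component (not a spaCy class)."""

    def __init__(self, label):
        self.label = label


def iterate_pipeline(pipeline):
    """Mimic the `for name, proc in self.pipeline` loop from the traceback."""
    names = []
    for name, proc in pipeline:  # each entry must unpack into two values
        names.append(name)
    return names


# New-style entry: a (name, component) tuple, which unpacks cleanly.
good_pipeline = [("ner", FakeComponent("ner"))]

# Old-style entry: the bare component, as pipeline.append(Component(...)) leaves it.
bad_pipeline = [FakeComponent("tensorizer")]

print(iterate_pipeline(good_pipeline))

error_message = None
try:
    iterate_pipeline(bad_pipeline)
except TypeError as err:
    # The exact wording of the message differs between Python versions.
    error_message = str(err)
    print("TypeError:", error_message)
```

This is only an illustration of the unpacking failure under those assumptions, not a description of how spaCy builds its pipeline internally.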

@ines ines added the examples Code examples in /examples label Oct 22, 2017
@jerbob92

Thanks @ines
I also had to change the add_label line to:

nlp.pipeline[nlp.pipe_names.index('ner')][1].add_label('ANIMAL')

Not quite sure that's how it's supposed to be done, but it works for me.

@ines
Member

ines commented Oct 22, 2017

Ah, thanks! 👍 You should be able to simply use nlp.get_pipe() to get a pipeline component, e.g.:

ner = nlp.get_pipe('ner')
ner.add_label('ANIMAL')

Or, probably cleaner:

ner = NeuralEntityRecognizer(nlp.vocab)
ner.add_label('ANIMAL')
nlp.add_pipe(ner)
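
A rough sketch of why the index-based lookup and the name-based lookup end up at the same component, assuming the pipeline is a plain list of (name, component) tuples as described earlier in the thread. `MiniLanguage` and `MiniComponent` are hypothetical stand-ins written for this illustration; they are not spaCy's classes, and the `add_pipe(component, name)` signature here is an assumption of the sketch, not the library's.

```python
class MiniComponent(object):
    """Hypothetical stand-in for an entity recognizer that stores labels."""

    def __init__(self):
        self.labels = []

    def add_label(self, label):
        self.labels.append(label)


class MiniLanguage(object):
    """Hypothetical stand-in holding a list of (name, component) tuples."""

    def __init__(self):
        self.pipeline = []

    @property
    def pipe_names(self):
        return [name for name, _ in self.pipeline]

    def add_pipe(self, component, name):
        # Assumed signature for this sketch only.
        self.pipeline.append((name, component))

    def get_pipe(self, name):
        for pipe_name, component in self.pipeline:
            if pipe_name == name:
                return component
        raise KeyError(name)


nlp = MiniLanguage()
nlp.add_pipe(MiniComponent(), name="ner")

# Index-based lookup, in the style of the workaround posted above.
by_index = nlp.pipeline[nlp.pipe_names.index("ner")][1]

# Name-based lookup, in the style of get_pipe().
by_name = nlp.get_pipe("ner")
by_name.add_label("ANIMAL")

same_object = by_index is by_name
print(same_object, by_index.labels)
```

Both routes return the same object in this sketch, so a label added through one is visible through the other; the name-based call just hides the tuple indexing.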

@jerbob92

That works too, thanks!

@ines
Member

ines commented Oct 22, 2017

Yay! Thanks for your help and feedback. If it's all working for you now, feel free to submit a PR to develop btw (otherwise, we're happy to take care of this later as well).

jerbob92 pushed a commit to jerbob92/spaCy that referenced this issue Oct 22, 2017
ines added a commit that referenced this issue Oct 22, 2017
…y-type-example

Fix #1444: fix training new entity type example
@ines ines closed this as completed Oct 22, 2017
@mikeatm
Author

mikeatm commented Oct 22, 2017

I can confirm that the fix works.

@lock

lock bot commented May 8, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators May 8, 2018