Named Entity Recognizer missing predict function #2029

Closed
wpm opened this issue Feb 23, 2018 · 5 comments

@wpm

commented Feb 23, 2018

The documentation says that named entity recognizer objects have a predict function which returns scores from the model. However, this doesn't appear to be the case.

>>> import spacy
>>> nlp = spacy.load("en")
>>> from spacy.pipeline import EntityRecognizer
>>> ner = EntityRecognizer(nlp.vocab)
>>> ner.predict([nlp("I live in Florida.")])
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-49-4c86d46d1a7a> in <module>()
----> 1 ner.predict([nlp("I live in Florida.")])

AttributeError: 'spacy.pipeline.EntityRecognizer' object has no attribute 'predict'

If this interface has changed, the documentation should be updated.

Also, I can't figure out how to get the probability of a hypothesized named entity without predict. How do I get NER probabilities?

Info about spaCy

  • spaCy version: 2.0.7
  • Platform: Darwin-17.4.0-x86_64-i386-64bit
  • Python version: 3.6.3
  • Models: en
@thetravelingsalesman

commented Mar 13, 2018

Try this:
# EntityRecognizer here is Prodigy's (prodigy.models.ner), not spaCy's
model = EntityRecognizer(spacy.load(spacy_model))
for eg in model.make_best(DB.get_dataset(dataset)):
    print(json.dumps(eg))

@ines ines added the feat / ner label Mar 27, 2018

@wpm

Author

commented Apr 5, 2018

Let's be really clear here, since spaCy and Prodigy each have their own EntityRecognizer class.

>>> import spacy
>>> from prodigy.models.ner import EntityRecognizer
>>> nlp = spacy.load("en")
>>> ner = EntityRecognizer(nlp)
>>> list(ner.make_best([{"text":"I love Florida"}]))
[{'_input_hash': 2067731791,
  '_task_hash': -1375766369,
  'spans': [{'end': 14,
    'label': 'GPE',
    'rank': 0,
    'score': 0.9993259674355696,
    'start': 7,
    'text': 'Florida'}],
  'text': 'I love Florida'}]

That definitely returns an entity with a score. Thanks.
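
For downstream use, the task dictionaries yielded by make_best can be flattened into (text, label, score) tuples. This is a minimal sketch, assuming each task has the shape printed above; the helper name iter_scored_spans is hypothetical and not part of Prodigy.

```python
def iter_scored_spans(tasks):
    """Yield (text, label, score) for every span in make_best-style task dicts.

    Assumes each task looks like the output shown above: a dict with an
    optional 'spans' list whose items carry 'text', 'label' and 'score'.
    """
    for task in tasks:
        for span in task.get("spans", []):
            yield span["text"], span["label"], span["score"]


# Task copied from the output above (hash fields omitted for brevity).
tasks = [{
    "text": "I love Florida",
    "spans": [{"start": 7, "end": 14, "label": "GPE", "rank": 0,
               "score": 0.9993259674355696, "text": "Florida"}],
}]

for text, label, score in iter_scored_spans(tasks):
    print(f"{text}\t{label}\t{score:.4f}")
```

Tasks without a 'spans' key simply yield nothing, so the helper can be run over a mixed stream.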

@ines, is make_best a documented Prodigy feature? I can't find it in the Prodigy README.

What is the official best way to get scores with your proposed entities? I see this (undocumented?) make_best feature, then I also see @honnibal's discussion on the Prodigy support board about calculating the probabilities by digging through the alternatives in a beam search.

I suspect there isn't a single "right" way to do this, that the answer is "it depends" and is all bound up with the fact that the spaCy parser decoder can either be greedy or do a beam search. But I'm still unclear on the details. Maybe this would be a good topic for an Explosion.ai blog post.
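
The beam-search idea mentioned above can be illustrated without touching spaCy internals. This is a rough sketch, assuming the alternative parses have already been pulled out of a beam as (probability, entities) pairs, where entities are (start, end, label) tuples. That input format and the function name entity_confidences are hypothetical, and how the parses are extracted from the beam depends on the spaCy version. The confidence of an entity is taken to be the share of the beam's total probability that falls on parses containing it.

```python
from collections import defaultdict


def entity_confidences(parses):
    """Aggregate beam parses into per-entity confidence scores.

    parses: iterable of (prob, entities) pairs, where entities is a list of
    (start, end, label) tuples. This format is an assumption for illustration.
    Returns {(start, end, label): confidence}, with confidence in [0, 1].
    """
    totals = defaultdict(float)
    mass = 0.0
    for prob, entities in parses:
        mass += prob
        # Use a set so an entity counts at most once per parse.
        for ent in set(entities):
            totals[ent] += prob
    if mass == 0.0:
        return {}
    # Normalise in case the beam probabilities do not sum to 1.
    return {ent: score / mass for ent, score in totals.items()}


# Made-up beam for "I love Florida": three competing analyses.
beam = [
    (0.90, [(2, 3, "GPE")]),
    (0.08, [(2, 3, "PERSON")]),
    (0.02, []),
]
for ent, conf in sorted(entity_confidences(beam).items()):
    print(ent, round(conf, 3))
```

With greedy decoding there is only a single parse, so under this scheme every predicted entity would get a confidence of 1.0. That is why the approach only becomes informative when the decoder keeps a beam of alternatives.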

@ines

Member

commented May 7, 2018

@wpm Sorry for only getting to this now – the "official" user-facing way of getting the highest scoring parses in Prodigy is the ner.print-best recipe: https://prodi.gy/docs/recipes#ner-print-best (but of course, you can also reuse the recipe code and call into the model directly if you prefer different usage)

> I suspect there isn't a single "right" way to do this, that the answer is "it depends" and is all bound up with the fact that the spaCy parser decoder can either be greedy or do a beam search. But I'm still unclear on the details. Maybe this would be a good topic for an Explosion.ai blog post.

Thanks for the suggestion! There are still a few inconsistencies and things that we would like to iron out before we make recommendations. Some of the API details might also change again in the future. Being able to test the functionality in Prodigy has been incredibly helpful, though – and the results show that it's working pretty well.

Merging this issue with #2149 btw!

@ines ines closed this May 7, 2018

@arlinajsk

commented Jun 29, 2018

Regarding the original comment from @wpm: the documentation clearly says that the spaCy EntityRecognizer object has a predict method, but it isn't there: https://spacy.io/api/entityrecognizer#predict
This really caused a lot of confusion. If it only works with the EntityRecognizer in the Prodigy library, then please update the documentation.

@lock

commented Jul 29, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Jul 29, 2018
