
Fixing value of sys_candidate feature when only one duckling entity is found #317

Merged
vembar merged 2 commits into cisco:master on Jun 1, 2021

Conversation

vembar
Contributor

@vembar vembar commented May 28, 2021

For queries in which duckling identifies only one numeric entity, the sys_candidate|type:{}|granularity:{}|pos:{} feature ends up with a value of 0 (i.e., log(1)). This feature indicates the number of sys_entity matches of a particular type at a particular position. A value of 0 is problematic because DictVectorizer also fills in 0 for features not found in a query, so a query with exactly one match looks the same to the model as a query with no match.

This PR ensures we have a non-zero value when this feature should be active for a query.
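
The collision described above can be reproduced with a small stdlib-only sketch. Everything in it is illustrative: `vectorize` is a hand-written stand-in for the zero-filling behaviour of DictVectorizer, the feature name is a hypothetical instance of the template, and `shifted_log` (log(count + 1)) is just one assumed way to keep the value non-zero, not necessarily what this PR does.

```python
import math


def vectorize(feature_dicts):
    """Stand-in for a zero-filling vectorizer: absent features become 0.0."""
    vocab = sorted({name for feats in feature_dicts for name in feats})
    rows = [[float(feats.get(name, 0.0)) for name in vocab] for feats in feature_dicts]
    return vocab, rows


# Hypothetical instance of the sys_candidate|type:{}|granularity:{}|pos:{} template.
FEATURE = "sys_candidate|type:sys_number|granularity:None|pos:0"


def candidate_features(match_count, transform):
    """Emit the feature only when at least one system entity matched."""
    return {FEATURE: transform(match_count)} if match_count > 0 else {}


def plain_log(count):
    # A single match yields log(1) == 0.0.
    return math.log(count)


def shifted_log(count):
    # Assumed fix for illustration: shift by one so a single match is non-zero.
    return math.log(count + 1)


counts = [0, 1, 2]  # number of matches in three toy queries
_, before = vectorize([candidate_features(c, plain_log) for c in counts])
_, after = vectorize([candidate_features(c, shifted_log) for c in counts])

print("plain log:  ", before)
print("shifted log:", after)
```

With the plain log, the rows for zero matches and one match are identical; with the shifted value they differ, which is the property the PR is after.
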
Here is some validation done with Webex Assistant. Context: before integrating the new mindmeld release candidate, the entity recognizer accuracies on the dev and test sets of the select intent were 100% and 98.91%, respectively.

With mindmeld version 4.3.5rc5

>>> er = nlp.domains['general'].intents['select'].entity_recognizer
>>> er.fit()
True
>>> er.evaluate(label_set='dev')
<EntityModelEvaluation score: 97.65%, 83 of 85 examples correct>
>>> er.evaluate(label_set='test')
<EntityModelEvaluation score: 97.83%, 90 of 92 examples correct>

With this change

>>> er = nlp.domains['general'].intents['select'].entity_recognizer
>>> er.fit()
True
>>> er.evaluate(label_set='dev')
<EntityModelEvaluation score: 100.00%, 85 of 85 examples correct>
>>> er.evaluate(label_set='test')
<EntityModelEvaluation score: 98.91%, 91 of 92 examples correct>

@vembar vembar requested review from serapio, jjjacksn and vijay120 and removed request for jjjacksn May 28, 2021 23:12
@vembar vembar merged commit 611f390 into cisco:master Jun 1, 2021