Fixing value of sys_candidate feature when only one duckling entity is found #317
For queries that have only one numeric entity identified by duckling, we end up setting the value of the `sys_candidate|type:{}|granularity:{}|pos:{}` feature to 0 (i.e., log(1)). This feature indicates the number of sys_entity matches we had of a particular type and at a particular position. A value of 0 is problematic because DictVectorizer also fills in 0 for features not found in a query, so a single match becomes indistinguishable from no match. This PR ensures we have a non-zero value when this feature should be active for a query.
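The collision described above can be reproduced with a small sketch. Everything here is illustrative: `vectorize` is a plain-Python stand-in for a DictVectorizer-style dense encoding (not the library class itself), the filled-in feature name values are hypothetical, and `log(n + 1)` is only one possible way to get a non-zero value, assumed for the example rather than taken from this PR's diff.

```python
import math


def vectorize(feature_dicts):
    """Dense stand-in for a DictVectorizer-style encoder.

    Features missing from a query's dict are filled in with 0.0.
    """
    names = sorted({name for d in feature_dicts for name in d})
    rows = [[float(d.get(name, 0.0)) for name in names] for d in feature_dicts]
    return names, rows


# Hypothetical instantiation of the sys_candidate|type:{}|granularity:{}|pos:{} template.
FEATURE = "sys_candidate|type:sys_number|granularity:None|pos:0"


def log_count(n):
    # Behaviour described in the PR: a single match yields log(1) == 0.
    return math.log(n)


def shifted_log_count(n):
    # Assumed remedy for illustration only: keeps the value above zero for n >= 1.
    return math.log(n + 1)


def extract(match_count, scale):
    """Emit the feature only when at least one system entity matched."""
    return {FEATURE: scale(match_count)} if match_count > 0 else {}


# Three queries with one, zero, and two matching entities respectively.
match_counts = [1, 0, 2]
_, before = vectorize([extract(n, log_count) for n in match_counts])
_, after = vectorize([extract(n, shifted_log_count) for n in match_counts])

print(before[0] == before[1])  # one match and no match encode identically
print(after[0] == after[1])    # with a non-zero value they differ
```

With the unshifted log, the first two rows are identical, so a downstream model cannot tell a single duckling match from none. Any strictly positive value for an active feature removes that ambiguity.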
Here is some validation done with Webex Assistant. Context: before integrating the new mindmeld release candidate, the accuracies of the entity recognizer on the dev and test sets of the `select` intent were 100 and 98.91 respectively.
With mindmeld version 4.3.5rc5
With this change