Rename TextFeaturizer to NaturalLanguageFeaturizer #3030
This looks great! It's encouraging to me how simple of a change this is.
I have two main things moving forward:
- There are a few other places in the documentation that should be updated to add the new primitives: the explanatory part of the text tutorial here, as well as the docstring where it lists the primitives (marked in another comment)
- I'm very interested to see how the performance tests change with this update. I did a bunch of performance testing way back when originally adding this component to determine the best primitives to use, and I know there were some I explicitly removed to improve performance. I hope these will reflect real improvement!
Other than that, just left a couple small nits. Looks great though!
evalml/pipelines/components/transformers/preprocessing/natural_language_featurizer.py
@jeremyliweishih Thanks for this! I think this looks good. For completeness we may want to rename TextTransformer to NaturalLanguageTransformer but I don't think that's a hard requirement to close out this issue.
docs/source/release_notes.rst
@@ -22,6 +22,7 @@ Release Notes
* Added an algorithm to ``DelayedFeatureTransformer`` to select better lags :pr:`3005`
* Added test to ensure pickling pipelines preserves thresholds :pr:`3027`
* Added AutoML function to access ensemble pipeline's input pipelines IDs :pr:`3011`
* Added ``NumWords`` and ``NumCharacters`` primitives to ``TextFeaturizer`` and renamed ``TextFeaturizer`` to ``NaturalLanguageFeaturizer`` :pr:`3030`
nit-pick: Move to next release section.
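For context on the release-note entry above, here is a minimal sketch of the kind of features the two added primitives are understood to produce: a character count and a word count per text entry. This is plain Python for illustration only, not the primitives' actual implementation; the helper name `count_text_features`, the whitespace-based word split, and the treatment of missing values as empty strings are all assumptions.

```python
from typing import Dict, Iterable, List, Optional


def count_text_features(texts: Iterable[Optional[str]]) -> List[Dict[str, int]]:
    """Hypothetical helper: per-entry character and word counts.

    Assumptions (not taken from the PR): words are whitespace-delimited,
    and a missing entry (None) is treated as an empty string.
    """
    rows = []
    for text in texts:
        value = text or ""  # assumption: treat None as empty text
        rows.append(
            {
                "num_characters": len(value),
                # str.split() with no argument splits on runs of whitespace
                "num_words": len(value.split()),
            }
        )
    return rows


features = count_text_features(["hello world", "", None, "a  b c"])
```

Each input entry yields one row of counts, which is roughly the shape of output a featurizer adds as new numeric columns alongside its other text features.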
Will update with perf tests soon!
Looks great! Looking forward to seeing the perf test results.
@eccabay uploaded the perf test results in the description. Lmk what you think!
Fixes #2914.
Perf tests look good! Almost identical results, with a slight blip in init time for one dataset, which should just be noise.
nlp_primatives.html.zip