-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make TextFeaturizer dependencies optional (tensorflow) #976
Conversation
…he unit tests still work but not included in packaging
…primitives are installed
@@ -3,7 +3,6 @@ pandas>=0.25.0 | |||
scipy>=1.2.1,<1.5.0 # remove upper limit once tensorflow supports 1.5.x | |||
scikit-learn>=0.21.3,!=0.22,<0.23.0 | |||
scikit-optimize | |||
featuretools[nlp_primitives] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to test-requirements.txt
@@ -1,6 +1,5 @@ | |||
catboost==0.23.2 | |||
cloudpickle==1.5.0 | |||
distributed==2.21.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was added by tensorflow
from nlp_primitives
. This job doesn't install test-requirements.txt
so this is no longer needed.
@@ -5,3 +5,5 @@ nbval==0.9.3 | |||
graphviz>=0.13 | |||
codecov==2.1.0 | |||
category_encoders>=2.0.0 | |||
featuretools[nlp_primitives] | |||
nlp_primitives |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason, I needed to have nlp_primitives
listed here in addition to the featuretools
line above it. Otherwise, one unit test test_featurization_only_text
fails on the python 3.8 core-deps job... mysterious ✨🔮
@@ -4,6 +4,9 @@ | |||
|
|||
from evalml.pipelines.components import TextFeaturizer | |||
|
|||
pytest.importorskip('featuretools', reason='Skipping test because featuretools not installed') | |||
pytest.importorskip('nlp_primitives', reason='Skipping test because nlp_primitives not installed') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redundant since we always install test-dependencies.txt
when we run the unit tests, but including since this is our treatment for xgboost, catboost, etc
Codecov Report
@@ Coverage Diff @@
## main #976 +/- ##
=======================================
Coverage 99.87% 99.87%
=======================================
Files 174 174
Lines 8987 8990 +3
=======================================
+ Hits 8976 8979 +3
Misses 11 11
Continue to review full report at Codecov.
|
Ok this is ready for review! Sorry, pushed before a meeting and forgot to set to draft while I ironed out the tests. Will re-request review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM :jeremy-approve:
Thanks @jeremyliweishih . We need to get you a custom emoji for that 😂 |
Fix #957
Adding
nlp_primitives
as a core requirement has meant that we now havetensorflow
as a core requirement. This is bad because that package is >500MB. Since the text featurizer is still under development, let's kickfeaturetools
andnlp_primitives
out of the requirements for now. Will keep in test reqs so that our unit test coverage is maintained.After this PR is merged, users who run
pip install evalml
won't be installingtensorflow
. Users who want to use the text featurizer can then runpip install nlp_primitives
.Moving forward we do have a long-term plan to have
nlp_primitives
be an optional requirement if not a core requirement. I'll be filing that separately.@rwedge FYI since we were just discussing this.
Changes
featuretools
andnlp_primitives
out of the requirements which are packaged with evalml releases. This is fine since this is a feature under development, and isn't currently used in automl, the pipelines or elsewhere in our codebase.import_or_raise
toTextFeaturizer.__init__
to warn any prospective users of missing depsimportorskip
to the text featurizer tests. Should be unnecessary since we havefeaturetools
andnlp_primitives
intest-requirements.txt
now, meaning we'll continue to run the unit test coverage in the CI jobs.text_featurization.py
totext_featurizer.py
to match class name