Fix index bug for TextFeaturizer and LSA #1644
Codecov Report
@@            Coverage Diff            @@
##             main    #1644     +/-   ##
=========================================
+ Coverage   100.0%   100.0%   +0.1%
=========================================
  Files         240      240
  Lines       18390    18401     +11
=========================================
+ Hits        18382    18393     +11
  Misses          8        8
    return X, y


@pytest.fixture()
Moved this here, since LSA and TextFeaturizer both use it :)
  * Added multiclass check to ``InvalidTargetDataCheck`` for two examples per class :pr:`1596`
  * Fixes
- * Fix thresholding for pipelines in AutoMLSearch to only threshold binary classification pipelines :pr:`1622` :pr:`1626`
+ * Fixed thresholding for pipelines in ``AutoMLSearch`` to only threshold binary classification pipelines :pr:`1622` :pr:`1626`
Just standardizing :d
Thanks for the fix @angela97lin !
@angela97lin and I tried to figure out why we're only running into this problem now (the repro on the ticket is just a vanilla use case of AutoML with text). It must have been introduced after the 0.17.0 release, because the text demo is "failing" on latest but "passing" on stable.
Nothing in our release notes is related to the text featurizer, so this may have something to do with the latest featuretools release (released two days after 0.17.0). That said, this fixes the docs locally, so let's get it merged!
Closes #1643
Fixes an index bug where NaNs are backfilled if the original input DataFrame has custom indices. In the original repro, we don't set indices explicitly, but AutoML passes in data whose indices come from our data split.
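
For readers unfamiliar with this class of bug, here is a minimal sketch of how it can arise in plain pandas. It is an illustration only, not the actual TextFeaturizer or LSA code: the `num_words` column and the way `features` is built are hypothetical stand-ins for a component that computes new columns from raw values and so ends up with a fresh default index.

```python
import pandas as pd

# Input rows with a non-default index, as a data split might produce.
X = pd.DataFrame(
    {"text": ["red apple", "a green pear", "plum"]},
    index=[10, 20, 30],
)

# Hypothetical featurizer output: built from plain Python values,
# so pandas assigns a fresh RangeIndex (0, 1, 2).
features = pd.DataFrame({"num_words": [len(t.split()) for t in X["text"]]})

# join() aligns on index labels, not on row position. None of the
# labels 10, 20, 30 exist in `features`, so every new value is NaN.
misaligned = X.join(features)

# One possible fix: carry the input's index over before combining.
features.index = X.index
aligned = X.join(features)

print(misaligned)
print(aligned)
```

Restoring the original index on the computed columns before combining them keeps the rows matched up regardless of how the caller indexed the input.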