Update estimators to keep track of input feature names#1794
Update estimators to keep track of input feature names#1794angela97lin merged 20 commits intomainfrom
Conversation
…yx/evalml into 1757_estimator_input_feature_names
Codecov Report
@@ Coverage Diff @@
## main #1794 +/- ##
=========================================
+ Coverage 100.0% 100.0% +0.1%
=========================================
Files 252 252
Lines 20047 20061 +14
=========================================
+ Hits 20039 20053 +14
Misses 8 8
Continue to review full report at Codecov.
|
| X_encoded = self._encode_categories(X, fit=True) | ||
| y_encoded = self._encode_labels(y) | ||
| return super().fit(X_encoded, y_encoded) | ||
| self._component_obj.fit(X_encoded, y_encoded) |
There was a problem hiding this comment.
Updating to just call here, because Estimator will call self.input_feature_names = list(X.columns) but since we've changed the feature names in _encode_categories (since LGBM isn't compatible with weird names), this would override the original names.
| random_state=random_state) | ||
|
|
||
| def fit(self, X, y=None): | ||
| X = _convert_to_woodwork_structure(X) |
There was a problem hiding this comment.
Similar to LGBM, updating to just call implementation, because Estimator will call self.input_feature_names = list(X.columns) but since we've changed the feature names in _encode_categories (since XGBoost isn't compatible with weird names), this would override the original names.
| predictions = clf.predict(X).to_series() | ||
| assert len(predictions) == len(y) | ||
| assert not np.isnan(predictions).all() | ||
| assert (clf.input_feature_names == col_names) |
There was a problem hiding this comment.
Rather than create a whole new test, tacking it on here, where we test passing weird names to estimators :d
chukarsten
left a comment
There was a problem hiding this comment.
I like this PR. I like moving the functionality up into the parent class. I think the convert X/y to woodwork can be moved out and reused a little bit more. I submitted a PR to your PR, check it out and let me know if that works. If not, no big deal.
Refactor to add _manage_woodwork() helper function.
Closes #1757