Update ComponentGraph to enforce needing .x and .y for each component specified in the graph #2563
…hs_make_pipelines
…eryx/evalml into 2493_component_graphs_make_pipelines
…eryx/evalml into 2493_component_graphs_make_pipelines
    y_train=y,
    problem_type="binary",
    max_iterations=1,
    allowed_component_graphs=dummy_classifier_linear_component_graph,
allowed_component_graphs plays no role in the validity of this test, so removing :)
    @pytest.fixture
    def dummy_classifier_dict_component_graph(dummy_classifier_estimator_class):
Removed these in favor of example_graph and example_regression_graph since they were confusing; they used dummy classifier but were labeled as Random Forest. Didn't provide much, if any, value over the example_graph fixture
        return TransformerA, TransformerB, TransformerC, EstimatorA, EstimatorB, EstimatorC


    @pytest.fixture
Moved to conftest so that it could be used in other files
@chukarsten I know you keep hoping for a large PR but hopefully, after splitting out the original PR, this is as big as it gets 😂
chukarsten
left a comment
Awesome, this is a really great PR and I really appreciate your cleanup-as-you-go philosophy. Looks really good. Thank you.
        self._i = 0
        self._compute_order = self.generate_order(self.component_dict)

    def _validate_component_dict(self):
Like this function - super readable.
evalml/pipelines/component_graph.py
Outdated
    import pdb

    pdb.set_trace()
NoooooooOOOooooooOOoooo
    "Imputer": {"numeric_impute_strategy": Categorical(["most_frequent", "mean"])},
    "Imputer_1": {"numeric_impute_strategy": Categorical(["median", "mean"])},
    "Random Forest Classifier": {"n_estimators": Categorical([50, 100])},
How'd you get this past black without it exploding into 20+ lines?
LOL I just double checked: my line is 88 chars long, and black's limit is 88 chars per line--so I just bareeeely made it 😅
        if is_linear
        else dummy_classifier_dict_component_graph
    )
    component_graph = {"CG": example_graph}
nice

Closes #2481, based on #2490 :)
The output of estimators (and all other components) must be accessed via .x or .y; simply using the component name will no longer suffice. The convention of defaulting to .x if not specified was chosen when we did not have many (if any) components that outputted an appropriate .y value. Now that we do have more target transformers, I think enforcing explicit edges is helpful for clarity.

This PR also deletes some fixtures and does some cleanup for our tests.
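
For reviewers who want a concrete picture of the explicit-edge rule, here is a minimal, self-contained sketch. It is not the `_validate_component_dict` method from this PR. The helper name `check_explicit_edges`, the dict layout (component first, parent edges after), and the use of "X" and "y" for the original inputs are assumptions made for illustration only.

```python
def check_explicit_edges(component_dict):
    """Hypothetical validator: reject parent edges that omit .x / .y."""
    raw_inputs = {"X", "y"}  # assumed names for the original features/target
    for name, spec in component_dict.items():
        # Assumed layout: spec[0] is the component, spec[1:] are parent edges.
        for edge in spec[1:]:
            if edge in raw_inputs:
                continue
            parent, _, suffix = edge.rpartition(".")
            if suffix not in ("x", "y") or parent not in component_dict:
                raise ValueError(
                    f"{name!r}: edge {edge!r} must be '<component>.x' or '<component>.y'"
                )


# Every edge names the output it consumes, so this graph is accepted.
explicit_graph = {
    "Imputer": ["Imputer", "X", "y"],
    "Random Forest Classifier": ["Random Forest Classifier", "Imputer.x", "y"],
}

# "Imputer" alone used to default to its .x output; here it is rejected.
implicit_graph = {
    "Imputer": ["Imputer", "X", "y"],
    "Random Forest Classifier": ["Random Forest Classifier", "Imputer", "y"],
}

check_explicit_edges(explicit_graph)  # passes silently

try:
    check_explicit_edges(implicit_graph)
except ValueError as err:
    print(f"rejected: {err}")
```

The sketch only checks edge syntax and that the referenced parent exists; whether a given component actually produces a meaningful .y output is a separate question that it does not attempt to answer.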