Merged
68 commits
943a155
init: moving code around
angela97lin Jul 9, 2021
f87ec33
merge
angela97lin Jul 12, 2021
85f32a9
init
angela97lin Jul 12, 2021
b558919
cap sktime
angela97lin Jul 12, 2021
64d94af
release notes
angela97lin Jul 12, 2021
0929b3b
add test
angela97lin Jul 12, 2021
71c6fb1
Merge branch '2494_component_supports_lists' into 2482_deprecate_list…
angela97lin Jul 12, 2021
948abe9
cleanup: lint and rls notes
angela97lin Jul 12, 2021
f2997f7
moving away from linear_component_graph
angela97lin Jul 13, 2021
55e0e7f
more slow updates
angela97lin Jul 13, 2021
90470ad
init
angela97lin Jul 13, 2021
3f57291
Merge branch '2494_component_supports_lists' into 2493_component_grap…
angela97lin Jul 13, 2021
c390b18
hmmm
angela97lin Jul 13, 2021
01f85e4
init
angela97lin Jul 13, 2021
33bc7eb
Merge branch 'main' into 2493_component_graphs_make_pipelines
angela97lin Jul 13, 2021
a7408e0
push a different version
angela97lin Jul 13, 2021
892426b
init
angela97lin Jul 14, 2021
3e36890
release notes
angela97lin Jul 14, 2021
6865c47
oops fix
angela97lin Jul 14, 2021
e1d8fa5
move out
angela97lin Jul 14, 2021
8144b1a
Merge branch 'main' into 2493_component_graphs_make_pipelines
angela97lin Jul 14, 2021
ee8c9d4
init and merge
angela97lin Jul 14, 2021
ee0f3cb
Merge branch '2493_component_graphs_make_pipelines' of github.com:alt…
angela97lin Jul 14, 2021
d3c0158
clean up and add tests
angela97lin Jul 15, 2021
f36cf6b
update private vars as abstract attrs
angela97lin Jul 15, 2021
ff74f9a
Merge branch 'main' into 2493_component_graphs_make_pipelines
angela97lin Jul 15, 2021
fa5e3ba
add attributes to tests
angela97lin Jul 15, 2021
e002e20
Merge branch '2493_component_graphs_make_pipelines' of github.com:alt…
angela97lin Jul 15, 2021
224209c
more cleanup
angela97lin Jul 15, 2021
a4fc850
add more test cases
angela97lin Jul 15, 2021
6e95729
Merge branch 'main' into 2493_component_graphs_make_pipelines
angela97lin Jul 15, 2021
692a86d
Merge branch 'main' into 2482_deprecate_list_API
angela97lin Jul 15, 2021
feca82e
progress on tests
angela97lin Jul 19, 2021
d438e27
more cleanup tests
angela97lin Jul 19, 2021
4ae856f
Merge branch 'main' into 2482_deprecate_list_API
angela97lin Jul 19, 2021
c062eac
more test cleanup, more to go L:
angela97lin Jul 19, 2021
3bf54bf
merging
angela97lin Jul 20, 2021
e2e6544
fix some tests with fixture, let tests run
angela97lin Jul 20, 2021
69a5c46
:)
angela97lin Jul 20, 2021
9dd26b7
clean up tests
angela97lin Jul 20, 2021
aad921c
Merge branch 'main' into 2482_deprecate_list_API
angela97lin Jul 20, 2021
2ae3d4c
fix other test
angela97lin Jul 20, 2021
1147126
Merge branch '2482_deprecate_list_API' of github.com:alteryx/evalml i…
angela97lin Jul 20, 2021
5fb648f
Merge branch 'main' into 2482_deprecate_list_API
angela97lin Jul 20, 2021
b9facdf
merge
angela97lin Jul 21, 2021
900e374
pipelinebase cleanup and testing docs
angela97lin Jul 21, 2021
d6eeef2
more cleanup and moving methods to avoid circular dep
angela97lin Jul 21, 2021
28d81b6
add tests for component graph to enforce x, y
angela97lin Jul 22, 2021
1d5f3b5
merging
angela97lin Jul 26, 2021
39ee3ac
merge main
angela97lin Jul 26, 2021
699d00e
fix tests from merge
angela97lin Jul 26, 2021
b81aa45
merge and fix tests
angela97lin Jul 26, 2021
d95c5dd
revert moving methods around and do some cleanup
angela97lin Jul 27, 2021
7edac1d
merging main
angela97lin Jul 27, 2021
52024dc
clean up test and release notes
angela97lin Jul 27, 2021
d595244
revert unnecessary changes
angela97lin Jul 27, 2021
a36f835
some general cleanup with tests and fixtures
angela97lin Jul 28, 2021
5b1f51d
init
angela97lin Jul 28, 2021
1e22794
cleanup impl
angela97lin Jul 29, 2021
158bc40
remove unnecessary test
angela97lin Jul 29, 2021
07ad029
merging
angela97lin Jul 29, 2021
f3a3352
oops remove pdb
angela97lin Jul 29, 2021
ab76746
fix not connected case and add test
angela97lin Jul 29, 2021
b193d71
oops forgot to uncomment component_graphs
angela97lin Jul 29, 2021
94f15e8
move test to other test with bad init graph
angela97lin Jul 29, 2021
423b3e0
Merge branch 'main' into 2482_x_y
angela97lin Jul 29, 2021
637c177
Merge branch 'main' into 2482_x_y
angela97lin Jul 30, 2021
f797653
fix test
angela97lin Jul 30, 2021
2 changes: 2 additions & 0 deletions docs/source/release_notes.rst
@@ -13,6 +13,7 @@ Release Notes
         * Moved ``get_hyperparameter_ranges`` to ``PipelineBase`` class from automl/utils module :pr:`2546`
         * Renamed ``ComponentGraph``'s ``get_parents`` to ``get_inputs`` :pr:`2540`
         * Removed ``ComponentGraph.linearized_component_graph`` and ``ComponentGraph.from_list`` :pr:`2556`
+        * Updated ``ComponentGraph`` to require ``.x`` and ``.y`` inputs for each component in the graph :pr:`2563`
     * Documentation Changes
         * Improved detail of ``TextFeaturizer`` docstring and tutorial :pr:`2568`
     * Testing Changes
@@ -24,6 +25,7 @@ Release Notes
         * Moved ``get_hyperparameter_ranges`` to ``PipelineBase`` class from automl/utils module :pr:`2546`
         * Renamed ``ComponentGraph``'s ``get_parents`` to ``get_inputs`` :pr:`2540`
         * Removed ``ComponentGraph.linearized_component_graph`` and ``ComponentGraph.from_list`` :pr:`2556`
+        * Updated ``ComponentGraph`` to require ``.x`` and ``.y`` inputs for each component in the graph :pr:`2563`


 **v0.29.0 Jul. 21, 2021**
22 changes: 11 additions & 11 deletions docs/source/user_guide/pipelines.ipynb
@@ -52,11 +52,11 @@
    "outputs": [],
    "source": [
     "component_graph_as_dict = {\n",
-    "    'Imputer': ['Imputer'],\n",
-    "    'Encoder': ['One Hot Encoder', 'Imputer'],\n",
-    "    'Random Forest Clf': ['Random Forest Classifier', 'Encoder'],\n",
-    "    'Elastic Net Clf': ['Elastic Net Classifier', 'Encoder'],\n",
-    "    'Final Estimator': ['Logistic Regression Classifier', 'Random Forest Clf', 'Elastic Net Clf']\n",
+    "    'Imputer': ['Imputer', 'X', 'y'],\n",
+    "    'Encoder': ['One Hot Encoder', 'Imputer.x', 'y'],\n",
+    "    'Random Forest Clf': ['Random Forest Classifier', 'Encoder.x', 'y'],\n",
+    "    'Elastic Net Clf': ['Elastic Net Classifier', 'Encoder.x', 'y'],\n",
+    "    'Final Estimator': ['Logistic Regression Classifier', 'Random Forest Clf.x', 'Elastic Net Clf.x', 'y']\n",
     "}\n",
     "\n",
     "MulticlassClassificationPipeline(component_graph=component_graph_as_dict)"
@@ -205,11 +205,11 @@
    "outputs": [],
    "source": [
     "component_graph_as_dict = {\n",
-    "    'Imputer': ['Imputer'],\n",
-    "    'Encoder': ['One Hot Encoder', 'Imputer'],\n",
-    "    'Random Forest Clf': ['Random Forest Classifier', 'Encoder'],\n",
-    "    'Elastic Net Clf': ['Elastic Net Classifier', 'Encoder'],\n",
-    "    'Final Estimator': ['Logistic Regression Classifier', 'Random Forest Clf', 'Elastic Net Clf']\n",
+    "    'Imputer': ['Imputer', 'X', 'y'],\n",
+    "    'Encoder': ['One Hot Encoder', 'Imputer.x', 'y'],\n",
+    "    'Random Forest Clf': ['Random Forest Classifier', 'Encoder.x', 'y'],\n",
+    "    'Elastic Net Clf': ['Elastic Net Classifier', 'Encoder.x', 'y'],\n",
+    "    'Final Estimator': ['Logistic Regression Classifier', 'Random Forest Clf.x', 'Elastic Net Clf.x', 'y']\n",
     "}\n",
     "\n",
     "nonlinear_pipeline = MulticlassClassificationPipeline(component_graph=component_graph_as_dict)\n",
@@ -477,4 +477,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 4
-}
+}
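The updated notebook cells spell out every edge explicitly: the first list entry names the component, and the remaining entries are feature inputs (`X` or `<name>.x`) and target inputs (`y` or `<name>.y`). As a rough sketch of how such a dictionary can be read, using plain Python only and no evalml imports; `split_inputs` is a hypothetical helper introduced here for illustration and is not part of evalml:

```python
# Hypothetical helper for illustration only; not part of evalml.
component_graph_as_dict = {
    "Imputer": ["Imputer", "X", "y"],
    "Encoder": ["One Hot Encoder", "Imputer.x", "y"],
    "Random Forest Clf": ["Random Forest Classifier", "Encoder.x", "y"],
    "Elastic Net Clf": ["Elastic Net Classifier", "Encoder.x", "y"],
    "Final Estimator": [
        "Logistic Regression Classifier",
        "Random Forest Clf.x",
        "Elastic Net Clf.x",
        "y",
    ],
}


def split_inputs(component_info):
    """Split one graph entry into (component, feature inputs, target inputs)."""
    component, *edges = component_info
    features = [edge for edge in edges if edge == "X" or edge.endswith(".x")]
    targets = [edge for edge in edges if edge == "y" or edge.endswith(".y")]
    return component, features, targets


for name, info in component_graph_as_dict.items():
    component, features, targets = split_inputs(info)
    print(f"{name}: component={component!r}, features={features}, targets={targets}")
```

Under this reading, `Final Estimator` takes the outputs of both classifiers as features and the original `y` as its target.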
28 changes: 23 additions & 5 deletions evalml/pipelines/component_graph.py
@@ -39,20 +39,37 @@ def __init__(self, component_dict=None, random_seed=0):
                 raise ValueError(
                     "component_dict must be a dictionary which specifies the components and edges between components"
                 )
+        self._validate_component_dict()
         self.component_instances = {}
         self._is_instantiated = False
         for component_name, component_info in self.component_dict.items():
-            if not isinstance(component_info, list):
-                raise ValueError(
-                    "All component information should be passed in as a list"
-                )
             component_class = handle_component_class(component_info[0])
             self.component_instances[component_name] = component_class
         self.input_feature_names = {}
         self._feature_provenance = {}
         self._i = 0
         self._compute_order = self.generate_order(self.component_dict)

+    def _validate_component_dict(self):
Review comment (Contributor): Like this function - super readable.
+        for _, component_inputs in self.component_dict.items():
+            if not isinstance(component_inputs, list):
+                raise ValueError(
+                    "All component information should be passed in as a list"
+                )
+            component_inputs = component_inputs[1:]
+            has_feature_input = any(
+                component_input.endswith(".x") or component_input == "X"
+                for component_input in component_inputs
+            )
+            has_target_input = any(
+                component_input.endswith(".y") or component_input == "y"
+                for component_input in component_inputs
+            )
+            if not (has_feature_input and has_target_input):
+                raise ValueError(
+                    "All edges must be specified as either an input feature (.x) or input target (.y)."
+                )
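
The validation above can be tried outside the class to see which dictionaries it accepts. The following is a minimal sketch that copies the same checks into a free function; `validate_component_dict` and `is_valid` are hypothetical names used only for this illustration. Note that, although the error message reads "either ... or", the condition rejects any component that lacks a feature input or lacks a target input, so both are required.

```python
# Standalone sketch mirroring the checks in the diff; names are hypothetical.
def validate_component_dict(component_dict):
    for component_inputs in component_dict.values():
        if not isinstance(component_inputs, list):
            raise ValueError("All component information should be passed in as a list")
        edges = component_inputs[1:]  # the first entry is the component itself
        has_feature_input = any(e.endswith(".x") or e == "X" for e in edges)
        has_target_input = any(e.endswith(".y") or e == "y" for e in edges)
        if not (has_feature_input and has_target_input):
            raise ValueError(
                "All edges must be specified as either an input feature (.x) or input target (.y)."
            )


def is_valid(component_dict):
    """Return True if the dictionary passes validation, False otherwise."""
    try:
        validate_component_dict(component_dict)
    except ValueError:
        return False
    return True


old_style = {"Imputer": ["Imputer"], "Encoder": ["One Hot Encoder", "Imputer"]}
new_style = {"Imputer": ["Imputer", "X", "y"], "Encoder": ["One Hot Encoder", "Imputer.x", "y"]}
missing_target = {"Imputer": ["Imputer", "X"]}

print(is_valid(old_style))       # False
print(is_valid(new_style))       # True
print(is_valid(missing_target))  # False
```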

     @property
     def compute_order(self):
         """The order that components will be computed or called in."""
@@ -280,7 +297,7 @@ def _compute_features(self, component_list, X, y=None, fit=False):
                         output = component_instance.predict(input_x)
                 else:
                     output = None
-                output_cache[component_name] = output
+                output_cache[f"{component_name}.x"] = output
Review comment (Contributor Author): This means that components will use estimator outputs as estimator_name.x, not estimator_name to align with how users need to specify :)
         return output_cache

     def _get_feature_provenance(self, input_feature_names):
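
The effect of the new cache key can be shown in isolation: once an estimator's output is stored under `<name>.x`, a downstream component's edge strings can be used as cache keys directly. This is only a sketch of that idea, not evalml's actual lookup code; `store_output` and `gather_feature_inputs` are hypothetical helpers, and plain strings stand in for the data evalml would pass between components.

```python
# Illustration only: hypothetical helpers, with strings standing in for real outputs.
def store_output(cache, component_name, output):
    """Store an estimator's output under the same '<name>.x' key users write as an edge."""
    cache[f"{component_name}.x"] = output


def gather_feature_inputs(cache, edges):
    """Look up every '.x' edge of a downstream component directly in the cache."""
    return [cache[edge] for edge in edges if edge.endswith(".x")]


output_cache = {}
store_output(output_cache, "Random Forest Clf", "rf predictions")
store_output(output_cache, "Elastic Net Clf", "en predictions")

final_estimator_edges = ["Random Forest Clf.x", "Elastic Net Clf.x", "y"]
inputs = gather_feature_inputs(output_cache, final_estimator_edges)
print(inputs)  # ['rf predictions', 'en predictions']
```

With the old key (the bare component name), the same lookup would need to strip the `.x` suffix from each edge first.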
@@ -522,6 +539,7 @@ def generate_order(cls, component_dict):
         if len(edges) == 0:
             return []
         digraph = nx.DiGraph()
+        digraph.add_nodes_from(list(component_dict.keys()))
         digraph.add_edges_from(edges)
         if not nx.is_weakly_connected(digraph):
             raise ValueError("The given graph is not completely connected")
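
Registering every component as a node before adding edges matters because a graph built from edges alone only contains nodes that appear in some edge, so a component with no component-to-component edge would be invisible to the connectivity check. The sketch below illustrates this with a plain-Python traversal swapped in for networkx so that it runs without dependencies; `is_weakly_connected` here is a hypothetical stand-in, and the component names are made up for the example.

```python
# Plain-Python stand-in for a weak-connectivity check; not the networkx implementation.
def is_weakly_connected(nodes, edges):
    """Return True if all nodes form one component when edge direction is ignored."""
    nodes = set(nodes)
    if not nodes:
        raise ValueError("Connectivity is undefined for an empty graph")
    neighbors = {node: set() for node in nodes}
    for source, target in edges:
        neighbors[source].add(target)
        neighbors[target].add(source)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbor in neighbors[current] - seen:
            seen.add(neighbor)
            stack.append(neighbor)
    return seen == nodes


edges = [("Imputer", "Encoder"), ("Encoder", "Estimator")]
all_components = ["Imputer", "Encoder", "Estimator", "Orphan Scaler"]

# Nodes inferred from edges only: the orphan component is never seen.
nodes_seen_in_edges = {node for edge in edges for node in edge}
connected_without_registration = is_weakly_connected(nodes_seen_in_edges, edges)

# Nodes registered up front: the orphan component breaks connectivity.
connected_with_registration = is_weakly_connected(all_components, edges)

print(connected_without_registration)  # True
print(connected_with_registration)     # False
```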