Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pipeline v2.0 (refactor) #108

Merged
merged 78 commits into from Nov 4, 2019
Merged

Pipeline v2.0 (refactor) #108

merged 78 commits into from Nov 4, 2019

Conversation

angela97lin
Copy link
Contributor

@angela97lin angela97lin commented Sep 27, 2019

WIP: laying down the basis for #95

@codecov
Copy link

codecov bot commented Sep 27, 2019

Codecov Report

Merging #108 into master will increase coverage by 0.88%.
The diff coverage is 97.64%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #108      +/-   ##
==========================================
+ Coverage   95.75%   96.64%   +0.88%     
==========================================
  Files          58       89      +31     
  Lines        1531     2233     +702     
==========================================
+ Hits         1466     2158     +692     
- Misses         65       75      +10
Impacted Files Coverage Δ
evalml/problem_types/utils.py 100% <ø> (ø) ⬆️
evalml/models/auto_base.py 93.47% <ø> (ø) ⬆️
...components/transformers/imputers/simple_imputer.py 100% <100%> (ø)
...s/components/estimators/regressors/rf_regressor.py 100% <100%> (ø)
...lines/components/estimators/regressors/__init__.py 100% <100%> (ø)
...ml/pipelines/classification/logistic_regression.py 100% <100%> (+6.45%) ⬆️
evalml/pipelines/__init__.py 100% <100%> (ø) ⬆️
...lines/components/transformers/imputers/__init__.py 100% <100%> (ø)
evalml/pipelines/classification/xgboost.py 100% <100%> (ø) ⬆️
...alml/pipelines/components/transformers/__init__.py 100% <100%> (ø)
... and 66 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e412fdb...07f1900. Read the comment docs.

@kmax12 kmax12 self-requested a review Oct 28, 2019
Copy link
Contributor

@kmax12 kmax12 left a comment

Looking better. A few more comments here

evalml/pipelines/components/utils.py Show resolved Hide resolved
evalml/pipelines/components/__init__.py Outdated Show resolved Hide resolved
evalml/pipelines/components/component_types.py Outdated Show resolved Hide resolved
evalml/pipelines/pipeline_base.py Outdated Show resolved Hide resolved
evalml/pipelines/pipeline_base.py Outdated Show resolved Hide resolved
from evalml.problem_types import ProblemTypes


class RFRegressionPipeline(PipelineBase):
"""Random Forest Pipeline for regression"""
name = "Random Forest w/ imputation"
name = "Random Forest Regressor w/ One Hot Encoder + Simple Imputer + Select From Model"
model_type = ModelTypes.RANDOM_FOREST
Copy link
Contributor

@kmax12 kmax12 Oct 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we automatically determine the model type since it's annotated on the component?

Copy link
Contributor

@jeremyliweishih jeremyliweishih Oct 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise to the auto generated names: we can auto determine but would require changes to Autobase so we think its best to hold off until then.

from evalml.problem_types import ProblemTypes


class RFRegressionPipeline(PipelineBase):
"""Random Forest Pipeline for regression"""
name = "Random Forest w/ imputation"
name = "Random Forest Regressor w/ One Hot Encoder + Simple Imputer + Select From Model"
model_type = ModelTypes.RANDOM_FOREST
problem_types = [ProblemTypes.REGRESSION]
Copy link
Contributor

@kmax12 kmax12 Oct 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we automatically determine the problem types since it's annotated on the component?

Copy link
Contributor

@jeremyliweishih jeremyliweishih Oct 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above!

evalml/pipelines/components/component_base.py Outdated Show resolved Hide resolved
evalml/pipelines/components/component_base.py Show resolved Hide resolved
@angela97lin
Copy link
Contributor Author

angela97lin commented Oct 29, 2019

Note: refactor feature_importances and self.input_feature_names code :)

Edit: Addressed.

* cleanup components utils

* Switch to category_encoder and cleanup RF Select

* add feature_importance to estimator

* More switching to CE

* Move parameters to class attributes and make test

* lint

* Separate changelog test

* Text cleanup
* wip: addressing comments

* feature names?

* adding fix for no estimator in generate_name

* fixing test

* addressing more comments on feature_importance

* feature_importances fixed and cleaned :)

* oops, missed merge conflict

* change of name

* minor cleanup

* adding basic test for two feature selectors and adding default components in DEFAULT_COMPONENTS

* adding tests for retaining feature names in input_feature_names
@angela97lin angela97lin requested a review from kmax12 Oct 31, 2019
@jeremyliweishih jeremyliweishih requested review from kmax12 and removed request for kmax12 Nov 1, 2019
Copy link
Contributor

@kmax12 kmax12 left a comment

Almost there! just a few last comments

evalml/pipelines/pipeline_base.py Outdated Show resolved Hide resolved
docs/source/changelog.rst Outdated Show resolved Hide resolved
@angela97lin angela97lin requested a review from kmax12 Nov 4, 2019
Copy link
Contributor

@kmax12 kmax12 left a comment

Minor comments. Otherwise looks good to me!

@angela97lin angela97lin requested a review from kmax12 Nov 4, 2019
kmax12
kmax12 approved these changes Nov 4, 2019
Copy link
Contributor

@kmax12 kmax12 left a comment

LGTM

@angela97lin angela97lin merged commit 4a58a11 into master Nov 4, 2019
@angela97lin angela97lin changed the title WIP: Pipeline v2 Pipeline v2.0 (refactor) Nov 4, 2019
This was referenced Nov 13, 2019
@angela97lin angela97lin deleted the pipeline_v2 branch Apr 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants