Upgrade to ww 0.3.1#2181

Merged
freddyaboulton merged 40 commits into main from 2035-use-ww-accessor
May 25, 2021

@freddyaboulton
Contributor

@freddyaboulton freddyaboulton commented Apr 22, 2021

Pull Request Description

Fixes #2035
Fixes #2019
Fixes #2285
Fixes #1955

Feature branch tracking the woodwork upgrade. Creating it now so that I can track if we ever fall behind main.

Roadmap:

  • Demo datasets, utils, and preprocessing
  • Data Checks
  • Objectives (this will require updating the confusion matrix impl)
  • Components (will skip stacked ensembler unit tests since they take in pipelines)
  • Pipelines (will add stacked ensembler unit tests back in)
  • automl
  • model_understanding
  • Docs

After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of docs/source/release_notes.rst to include this pull request by adding :pr:123.

* Update demos, utils, and preprocessing

* Updating make test commands

* Update ww requirement

* Add test that infer_feature_types preserves schema

* Add test that infer_feature_types raises errors with invalid schema

* load_data always returns woodwork info. Deleted return_pandas

* Updating docstrings
@codecov

codecov bot commented Apr 22, 2021

Codecov Report

Merging #2181 (cac9870) into main (bcfd02f) will decrease coverage by 0.1%.
The diff coverage is 100.0%.

@@           Coverage Diff            @@
##             main   #2181     +/-   ##
========================================
- Coverage   100.0%   99.9%   -0.0%     
========================================
  Files         280     280             
  Lines       24425   24360     -65     
========================================
- Hits        24402   24333     -69     
- Misses         23      27      +4     
Impacted Files Coverage Δ
evalml/__init__.py 100.0% <ø> (ø)
evalml/data_checks/data_check.py 100.0% <ø> (ø)
...alml/objectives/binary_classification_objective.py 100.0% <ø> (ø)
evalml/objectives/cost_benefit_matrix.py 100.0% <ø> (ø)
evalml/objectives/fraud_cost.py 100.0% <ø> (ø)
evalml/objectives/lead_scoring.py 100.0% <ø> (ø)
evalml/objectives/sensitivity_low_alert.py 100.0% <ø> (ø)
evalml/pipelines/components/component_base.py 100.0% <ø> (ø)
...tive_tests/test_binary_classification_objective.py 100.0% <ø> (ø)
evalml/utils/__init__.py 100.0% <ø> (ø)
... and 148 more
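
As a rough check on the figures in the coverage diff above, line coverage can be recomputed as hits divided by lines. This is only a sketch using the numbers shown in the report; the helper name `coverage_percent` is made up for illustration, and Codecov's own rounding and treatment of partial lines may differ from this simple ratio.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Return line coverage as a percentage (hits / lines * 100)."""
    return 100.0 * hits / lines


# Figures copied from the Codecov diff: main (before) and #2181 (after).
main_coverage = coverage_percent(hits=24402, lines=24425)
pr_coverage = coverage_percent(hits=24333, lines=24360)
delta = pr_coverage - main_coverage

print(f"main:  {main_coverage:.3f}%")
print(f"PR:    {pr_coverage:.3f}%")
print(f"delta: {delta:+.3f} percentage points")
```

By this ratio the change is well under a tenth of a percentage point, in line with the -0.0% shown in the delta column.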


@gsheni
Contributor

gsheni commented Apr 26, 2021

Just as a heads up (I think I've brought this up before), there is a copy call in infer_feature_types. If a user passes a 2GB dataset to AutoML, will this take up ~4GB of memory while running AutoML Search?

@freddyaboulton
Contributor Author

@gsheni I totally get what you mean!

Our components already copy the data (I believe this is because we have a convention that we should not modify the user's data) so I don't think the copy in infer_feature_types is "new behavior". This change just makes it so that the copy happens in one place, so we can get rid of the copy in our components:

[Screenshots: Imputer and LSA components]

I think you have a point that our convention may cause problems for large datasets. But I don't think that's specific to the ww upgrade. I'd be happy to continue that conversation in a different issue!
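
A minimal sketch of the copy-once convention discussed above, not EvalML's actual code. A plain dict of lists stands in for a DataFrame, and the names `infer_types_with_copy` and `DoublingComponent` are hypothetical. The entry point makes the single defensive copy, so the component can work in place without touching the user's data.

```python
import copy


def infer_types_with_copy(data: dict) -> dict:
    """Hypothetical entry point: copy the user's data exactly once."""
    return copy.deepcopy(data)


class DoublingComponent:
    """Hypothetical component that relies on the caller's copy."""

    def transform(self, data: dict) -> dict:
        # Mutates in place; safe only because the entry point already copied.
        for column, values in data.items():
            data[column] = [value * 2 for value in values]
        return data


user_data = {"a": [1, 2, 3]}
working = infer_types_with_copy(user_data)      # the one and only copy
result = DoublingComponent().transform(working)  # no second copy here

print(user_data)  # the user's original data is unchanged
print(result)
```

Under this convention peak memory is still roughly twice the input size, which is the concern raised in the question; the sketch only shows that the copy need not be repeated in each component.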

* Update components - first commit

* Update delayed_features_transformer to not use assign

* Fixing tests

* Skipping Boolean with Nan test in imputers

* Fixing base sampler _prepare_data

* Fixing target imputer null bool test

* Fix test skips

* Addressing comments

* Clean up sampler tests

* Editing docstrings
@freddyaboulton freddyaboulton changed the title from "Upgrade to ww 0.2.0" to "Upgrade to ww 0.3.0" May 10, 2021
* Update model understanding module

* Removing unused import
Comment thread .github/workflows/build_conda_pkg.yml Outdated
DOCKERHUB_PASSWORD: ${{ secrets.DOCKERHUB_PASSWORD }}
run: |
- git clone -b latest_release_changes --single-branch https://github.com/conda-forge/evalml-core-feedstock
+ git clone -b ww-update-branch --single-branch https://github.com/conda-forge/evalml-core-feedstock
Contributor Author

I can change this back, but the conda check will be red on this PR. Just FYI, that might mean that @dsherry needs to merge this in.

Contributor

@freddyaboulton got it. What do you want to do?

If we're gonna merge it red, please write up an explanation of why it's red so others can follow along.

Contributor

Sorry if I missed this, why can't we add the woodwork upgrade to latest_release_changes?

Contributor Author

@ParthivNaresh because the check for all of the other PRs would fail: they haven't merged these changes in, and the ww version that would be installed is 0.3.1. But since main has already been frozen for a while, I think the cleanest thing to do now is change the ww version in latest_release_changes and merge this in!

@dsherry
Contributor

dsherry commented May 21, 2021

Screen Shot 2021-05-21 at 5 27 45 PM 👀
Screen Shot 2021-05-21 at 5 27 52 PM 👀
🤯 😁

Contributor

@dsherry dsherry left a comment

@freddyaboulton incredible!!!

🎊🚢 🎊

This PR's vibe
freddys_woodwork_pr

Contributor

@bchen1116 bchen1116 left a comment

This is the wildest PR I've ever seen. 🙌

Comment thread core-requirements.txt
shap>=0.36.0
texttable>=1.6.2
- woodwork==0.0.11
+ woodwork==0.3.1
Contributor

nit: can we set to >=0.3.1?

Contributor

This was restricted to a specific Woodwork version because early on there were breaking changes.
We have since been more proactive about noting breaking changes in the Woodwork release notes and incrementing the version properly.
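
To make the trade-off between `==0.3.1` and `>=0.3.1` concrete, here is a small illustrative sketch. The helper names are hypothetical, and the parser handles only plain dotted numeric versions; real installers follow the fuller PEP 440 rules.

```python
def parse_version(version: str) -> tuple:
    """Turn a plain dotted version like '0.3.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def satisfies_exact_pin(installed: str, pinned: str) -> bool:
    """Mimic `woodwork==<pinned>`: only that exact version is accepted."""
    return parse_version(installed) == parse_version(pinned)


def satisfies_lower_bound(installed: str, minimum: str) -> bool:
    """Mimic `woodwork>=<minimum>`: that version or anything newer."""
    return parse_version(installed) >= parse_version(minimum)


# "0.4.0" is a made-up future version used only for illustration.
for candidate in ["0.0.11", "0.3.1", "0.4.0"]:
    print(
        candidate,
        "==0.3.1:", satisfies_exact_pin(candidate, "0.3.1"),
        ">=0.3.1:", satisfies_lower_bound(candidate, "0.3.1"),
    )
```

The exact pin protects against a newer release with breaking changes, at the cost of a manual bump for every upgrade; the lower bound accepts newer releases automatically and so relies on breaking changes being versioned and announced properly.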

@chukarsten
Contributor

Screen Shot 2021-05-21 at 5 27 45 PM 👀
Screen Shot 2021-05-21 at 5 27 52 PM 👀
🤯 😁

I think it's nuts that the overall net line count is just "-2"

Contributor

@chukarsten chukarsten left a comment

Your Requiem, @freddyaboulton . Bravo.

Contributor

@ParthivNaresh ParthivNaresh left a comment

Absolute legend mate

@freddyaboulton freddyaboulton merged commit a126c46 into main May 25, 2021
@chukarsten chukarsten mentioned this pull request Jun 2, 2021
@freddyaboulton freddyaboulton deleted the 2035-use-ww-accessor branch June 3, 2021 14:03
@chukarsten chukarsten mentioned this pull request Jun 23, 2021

6 participants