Pandas 1.3.0 Upgrade #2442

Merged
chukarsten merged 12 commits into main from pandas_1.3.0_check on Jul 9, 2021
Contributor

@chukarsten chukarsten commented Jun 24, 2021

Addresses issues introduced with pandas 1.3.0 rc1 (#2431). This is preliminary work required to prepare for Woodwork 0.5.0.

codecov Bot commented Jun 24, 2021

Codecov Report

Merging #2442 (eebcc38) into main (9868d00) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main   #2442     +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%     
=======================================
  Files        283     283             
  Lines      25568   25571      +3     
=======================================
+ Hits       25466   25469      +3     
  Misses       102     102             
| Impacted Files | Coverage Δ |
|---|---|
| evalml/data_checks/class_imbalance_data_check.py | 100.0% <ø> (ø) |
| evalml/data_checks/invalid_targets_data_check.py | 100.0% <ø> (ø) |
| ...ta_checks_tests/test_class_imbalance_data_check.py | 100.0% <ø> (ø) |
| evalml/tests/data_checks_tests/test_data_checks.py | 100.0% <ø> (ø) |
| ...valml/pipelines/components/estimators/estimator.py | 100.0% <100.0%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@chukarsten chukarsten linked an issue Jun 24, 2021 that may be closed by this pull request
@chukarsten chukarsten force-pushed the pandas_1.3.0_check branch from 105d752 to 3065481 on July 6, 2021 19:18
@chukarsten chukarsten force-pushed the pandas_1.3.0_check branch from 3065481 to 9864b19 on July 8, 2021 00:43
@chukarsten chukarsten changed the title from "Pandas 1.3.0 check" to "Pandas 1.3.0 Upgrade" on Jul 8, 2021
Comment thread evalml/data_checks/class_imbalance_data_check.py
Comment thread evalml/pipelines/components/estimators/estimator.py
Contributor Author

@chukarsten chukarsten left a comment

The pandas upgrade from 1.2.5 to 1.3.0 showed up in two places:

  1. The column names of the XGBoost DMatrix created from the new DataFrame changed: the integer column labels were left-padded with spaces so that each became a string of exactly two characters. I'm not sure why this happens with the new pandas, but I think it's a pandas/XGBoost interaction, so I might file an issue there.
  2. The order, but not the values, of some of the data check validation results changed. The affected tests were rewritten so that the order doesn't matter.
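
To illustrate the two points above, here is a minimal sketch. It is not the code from this PR: the helper names `strip_padded_labels` and `same_messages` and the sample data are hypothetical, and it assumes the validation results are lists of JSON-serializable dicts. The first helper normalizes space-padded column labels back to plain strings; the second compares two result lists without depending on their order.

```python
import json


def strip_padded_labels(labels):
    """Convert column labels to strings with surrounding whitespace removed."""
    return [str(label).strip() for label in labels]


def same_messages(actual, expected):
    """Return True if both lists contain the same dicts, ignoring order."""

    def canonical(messages):
        # Dicts are unhashable, so serialize each one with sorted keys,
        # then sort the resulting strings to get an order-free form.
        return sorted(json.dumps(m, sort_keys=True) for m in messages)

    return canonical(actual) == canonical(expected)


# Hypothetical sample data, not taken from the PR.
padded = [" 0", " 1", "10"]
print(strip_padded_labels(padded))  # ['0', '1', '10']

expected = [{"code": "A", "level": "warning"}, {"code": "B", "level": "error"}]
actual = list(reversed(expected))
print(same_messages(actual, expected))  # True
```

Serializing with `sort_keys=True` keeps duplicates significant, which a set-based comparison would not; whether that matters depends on the check being tested.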

Collaborator

@jeremyliweishih jeremyliweishih left a comment

nice!

Contributor

@freddyaboulton freddyaboulton left a comment

Looks good to me, @chukarsten! Thanks for doing this.

Comment thread evalml/data_checks/class_imbalance_data_check.py
Comment thread evalml/pipelines/components/estimators/estimator.py
Contributor

@bchen1116 bchen1116 left a comment

LGTM!

Contributor

@angela97lin angela97lin left a comment

Not that you needed another green seal of approval, but LGTM!

Comment thread evalml/data_checks/class_imbalance_data_check.py
@chukarsten chukarsten merged commit 9ed86bb into main Jul 9, 2021
@chukarsten chukarsten deleted the pandas_1.3.0_check branch July 9, 2021 17:59
@chukarsten chukarsten mentioned this pull request Jul 22, 2021
Development

Successfully merging this pull request may close these issues.

Test compatibility with upcoming pandas release 1.3.0

5 participants