Upgrade pandas supported range to >=2.0, <2.3 #4079

Merged
merged 16 commits into from
Apr 11, 2024

shchur
Collaborator

@shchur shchur commented Apr 10, 2024

Issue #, if available: #3989

Description of changes:

  • Raise pandas upper bound to 2.2 (inclusive)
  • Update time series pandas-related functionality & tests to work with the new pandas offset aliases (breaking change in pandas 2.2)
    • To ensure compatibility with both pandas 2.0/2.1 (old aliases) and 2.2 (new aliases), we adopt the following strategy:
      • All aliases are mapped to the new version (e.g., we always use YE instead of A in our code).
      • For libraries such as GluonTS that don't support pandas 2.2, we replace all aliases with a dummy value (W) that is valid under both the old and new alias styles, and handle all frequency-related logic (e.g., time features, lag selection) in AutoGluon itself.
      • We cap GluonTS dependency to >=0.14.0,<0.14.4 to ensure that pandas 2.2 can be installed
    • Most changes in this PR are related to the tests, ensuring that they run with pandas 2.0, 2.1, and 2.2.
  • Fix warning in CategoryFeatureGenerator
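
As a rough illustration of the "always map to the new alias" strategy described above, the sketch below normalizes old-style pandas offset aliases to their pandas 2.2 names. It is not the code from this PR: the names `OLD_TO_NEW_ALIAS` and `to_new_alias` are hypothetical, the table covers only a subset of the renamed aliases, and the parsing assumes frequency strings of the form `<multiple><alias>[-<anchor>]`.

```python
import re

# Hypothetical, partial table of offset aliases renamed in pandas 2.2
# (old alias -> new alias). Start-anchored aliases such as "MS" are left alone.
OLD_TO_NEW_ALIAS = {
    "A": "YE", "Y": "YE", "Q": "QE", "M": "ME",
    "BA": "BYE", "BY": "BYE", "BQ": "BQE", "BM": "BME",
    "SM": "SME", "CBM": "CBME",
    "H": "h", "T": "min", "S": "s", "L": "ms", "U": "us", "N": "ns",
}

# Assumed shape of a frequency string: optional multiple, alias, optional anchor,
# e.g. "15T", "A-DEC", "W-SUN".
_FREQ_PATTERN = re.compile(r"^(?P<mult>\d*)(?P<base>[A-Za-z]+)(?P<anchor>-[A-Za-z]+)?$")


def to_new_alias(freq: str) -> str:
    """Return `freq` with an old-style base alias replaced by its new-style name.

    Strings that do not match the assumed pattern, or whose alias is not in
    the table, are returned unchanged, so the function is idempotent.
    """
    match = _FREQ_PATTERN.match(freq)
    if match is None:
        return freq
    base = OLD_TO_NEW_ALIAS.get(match["base"], match["base"])
    return f"{match['mult']}{base}{match['anchor'] or ''}"
```

Because the lookup is case-sensitive, new-style aliases such as `h`, `min`, and `YE` pass through untouched, which lets the same call be applied to input written in either style.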

Testing:

To do:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@shchur
Collaborator Author

shchur commented Apr 10, 2024

/platform_tests ref=pandas-2.2
Platform Tests Output

@yinweisu
Collaborator

Previous CI Run Current CI Run

Job PR-4079-aff7f69 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-4079/aff7f69/index.html

@shchur
Collaborator Author

shchur commented Apr 10, 2024

/platform_tests ref=pandas-2.2
Platform Tests Output

@yinweisu
Collaborator

Previous CI Run Current CI Run

@tonyhoo
Collaborator

tonyhoo commented Apr 10, 2024

/platform_tests ref=pandas-2.2

Job PR-4079-fbbb99b is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-4079/fbbb99b/index.html

@shchur
Collaborator Author

shchur commented Apr 11, 2024

/platform_tests ref=pandas-2.2
Platform Tests Output

@yinweisu
Collaborator

Previous CI Run    Current CI Run
idna==3.6          idna==3.7

@shchur shchur changed the title from "[WIP] Upgrade pandas supported range to >=2.0, <2.3" to "Upgrade pandas supported range to >=2.0, <2.3" Apr 11, 2024

Job PR-4079-b91cab8 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-4079/b91cab8/index.html

@shchur shchur requested a review from tonyhoo April 11, 2024 09:48
@@ -132,6 +132,7 @@ def _drop_duplicate_features_categorical(cls, X: DataFrame, keep: Union[str, boo
# Converts [5, 'a', np.nan, 5] to [0, 1, 2, 0], these would be considered duplicates since they carry the same information.

# Have to convert to object dtype because category dtype for unknown reasons will refuse to replace NaNs.
# TODO: Fix FutureWarning
X_cur = X[features_to_check].astype("object").replace(mapping_features_val_dict_cur).astype(np.int64)
Collaborator Author

@shchur shchur Apr 11, 2024

@Innixma this line raises FutureWarning:

  /local/home/shchuro/workspace/autogluon/features/src/autogluon/features/generators/drop_duplicates.py:135: FutureWarning: Downcasting behavior in `replace` is deprecated and will be removed in a future version. To retain the old behavior, explicitly call `result.infer_objects(copy=False)`. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`
    X_cur = X[features_to_check].astype("object").replace(mapping_features_val_dict_cur).astype(np.int64)

Do you have any recommendations on how to fix it while keeping the correct behavior in pandas 2.0/2.1/2.2?

Contributor

will investigate and send PR

@yinweisu
Collaborator

Previous CI Run    Current CI Run
gluonts==0.14.3    gluonts==0.14.4
pandas==2.2.2      pandas==2.1.4

Contributor

@canerturkmen canerturkmen left a comment

LGTM! Thanks!

"min": "T",
"ms": "L",
"us": "U",
# sub-daily
Contributor

wow...

Job PR-4079-898abc4 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-4079/898abc4/index.html

Contributor

@zhiqiangdon zhiqiangdon left a comment

May not have enough time to test this before 1.1 release.

Contributor

@Innixma Innixma left a comment

LGTM! Will do follow-up PRs to address FutureWarnings

@Innixma Innixma merged commit 66e49bc into master Apr 11, 2024
43 of 56 checks passed
@shchur shchur deleted the pandas-2.2 branch April 11, 2024 18:05
LennartPurucker pushed a commit to LennartPurucker/autogluon that referenced this pull request Jun 1, 2024
6 participants