Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPS: Add warning if pyarrow is not installed #56896

Merged
merged 14 commits into from Jan 19, 2024

Conversation

lithomas1
Copy link
Member

@lithomas1 lithomas1 commented Jan 15, 2024

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

~~Not sure how to add a test for this - importing our warning asserters would cause the warning the be raised! 🤣 ~~

EDIT: Nvm, figured it out.

(I've checked by hand that the warning is both raised when you import pandas directly and import a submodule of pandas e.g. pandas.arrays.
I also double checked that the warning only occurs once and not on every import.)

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, added some minor comments/suggestions.

Comment on lines 205 to 206
# Don't allow users to use pandas.os
del os
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to not run this this at the end with the other modules? Seems like it's easier to maintain if there is all the clean up at the end or next to the import, but no big deal

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not really.

Personally I like to keep the del closed to where it's being used, but no strong preference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think os is not being used in the code before beibg closed. As said, not important, but to me personally it makes sense to open and close around the block where it is being used, or open and close at the beginning and the end if it's an import that is used in several places. Now os is being opened at the beginning, and closed in the middle, in a kind of random place if I'm not wrong. No huge difference, but I think it'll help maintain the code if closed at the end (or we move the import and del around where it is being used).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put the os delete with the deletion of warnings/pa_version_under10p1.

pandas/__init__.py Outdated Show resolved Hide resolved
pandas/__init__.py Show resolved Hide resolved
pandas/__init__.py Outdated Show resolved Hide resolved
pandas/__init__.py Outdated Show resolved Hide resolved
@mroeschke mroeschke added Dependencies Required and optional dependencies Warnings Warnings that appear or should be added to pandas labels Jan 16, 2024
from pandas.compat._optional import VERSIONS

try:
import pyarrow # noqa: F401
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add an else block to del pyarrow so pandas.pyarrow is not created?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx

@lithomas1
Copy link
Member Author

also cc @pandas-dev/pandas-core for feedback on the warning message itself.


warnings.warn(
f"""
Pyarrow will become a required dependency of pandas in the next major release of pandas,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Pyarrow will become a required dependency of pandas in the next major release of pandas,
Pyarrow will become a required dependency of pandas in the next major release (pandas 3.0),

I think we can be explicit about what this next major release will be

Copy link
Contributor

@bashtage bashtage Jan 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the next major release also the next minor release, i.e., 2.2 then 3.0? I noticed that the dev wheels recently switched to reporting 3.0.0.dev0 recently.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the next major release also the next minor release, i.e., 2.2 then 3.0? I noticed that the dev wheels recently switched to reporting 3.0.0.dev0 recently.

No, 2.2 is coming out sometime this week.
But yes, 3.0.0 will be the release after that.

please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466
""",
DeprecationWarning,
stacklevel=2,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to make this stacklevel=1?

Because this code is top-level and not inside a function or method, it one less than the typical stacklevel of 2 we use in a direct function/method.
Setting it as 1 would avoid that it triggers a warning when you call a function that has an inline import pandas (not sure how important this is, though)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The warning doesn't pop up in the REPL when I import pandas if I do stacklevel=1.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, OK forget this. I did test it, but should have done something wrong because now I can't reproduce getting the warning with plain import and stacklevel 1

warnings.warn(
f"""
Pyarrow will become a required dependency of pandas in the next major release of pandas,
(to allow more performant data types and better interoperability with other libraries)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would maybe explicitly mention the new default string dtype, since that's still the only one that is used by default that requires pyarrow

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

pandas/__init__.py Outdated Show resolved Hide resolved
Copy link
Contributor

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with it. Will let others make the final decision.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice @lithomas1, looks perfect

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like ci/deps/actions-311-numpydev.yaml looks like it needs pyarrow otherwise LGTM

@lithomas1 lithomas1 added this to the 2.2 milestone Jan 19, 2024
@lithomas1 lithomas1 merged commit 5c2a407 into pandas-dev:main Jan 19, 2024
49 of 50 checks passed
@lithomas1
Copy link
Member Author

Merging for now, happy to address any followups in another PR.

lithomas1 added a commit that referenced this pull request Jan 19, 2024
…ot installed) (#56963)

Backport PR #56896: DEPS: Add warning if pyarrow is not installed

Co-authored-by: Thomas Li <47963215+lithomas1@users.noreply.github.com>
@ml-evs
Copy link

ml-evs commented Jan 22, 2024

Sorry to comment here rather than an issue but this feels like quite a niche gotcha -- just a heads up that the warning added here starts with a newline which makes it annoying to filter out (as filterwarnings always matches the start of the message).

e.g., neither

"ignore:Pyarrow will become a required dependency of pandas.*:DeprecationWarning"

nor

"ignore:.*Pyarrow will become a required dependency of pandas.*:DeprecationWarning"

nor

"ignore:.*pandas.*:DeprecationWarning"

will match the message, and what is required is

"ignore:\\nPyarrow will become a required dependency of pandas.*:DeprecationWarning"

I guess changing it now might cause more pain, but I suspect you may get a few more questions on this before v3 (unless I am singularly confused by this -- a quick GH search shows a few PRs that also took a while to get this correct in other packages)!

phofl added a commit to phofl/pandas that referenced this pull request Feb 21, 2024
lithomas1 pushed a commit that referenced this pull request Feb 22, 2024
* Revert "DEPS: Add warning if pyarrow is not installed (#56896)"

This reverts commit 5c2a407

* Add whatsnew

* Update

* Update doc/source/whatsnew/v2.2.1.rst

Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com>

---------

Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com>
phofl added a commit to phofl/pandas that referenced this pull request Feb 22, 2024
* Revert "DEPS: Add warning if pyarrow is not installed (pandas-dev#56896)"

This reverts commit 5c2a407

* Add whatsnew

* Update

* Update doc/source/whatsnew/v2.2.1.rst

Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com>

---------

Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com>

(cherry picked from commit 4ed67ac)
pmhatre1 pushed a commit to pmhatre1/pandas-pmhatre1 that referenced this pull request May 7, 2024
* add warning

* DEPS: Add warning if pyarrow is not installed

* formatting

* Update __init__.py

* adjustments

* adjustments

* updates

* address code review

* Update __init__.py

* add pyarrow to npdev build

* ignore non numpy related deprecationwarnings/futurewarning

* ignore non numpy related deprecationwarnings/futurewarning

* Update actions-311-numpydev.yaml
pmhatre1 pushed a commit to pmhatre1/pandas-pmhatre1 that referenced this pull request May 7, 2024
* Revert "DEPS: Add warning if pyarrow is not installed (pandas-dev#56896)"

This reverts commit 5c2a407

* Add whatsnew

* Update

* Update doc/source/whatsnew/v2.2.1.rst

Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com>

---------

Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dependencies Required and optional dependencies Warnings Warnings that appear or should be added to pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants