VOTE: Core Team Vote on Keeping/Removing pyarrow warning #57424

Open
Dr-Irv opened this issue Feb 14, 2024 · 25 comments

@Dr-Irv
Contributor

Dr-Irv commented Feb 14, 2024

@pandas-dev/pandas-core

At the development meeting on February 14, we agreed to take a vote on whether to remove, in version 2.2.1, the DeprecationWarning about pyarrow becoming a required dependency. We agreed that the decision about whether pyarrow will still be required with version 3.0 is deferred.

Core team should vote below on one of these 3 options:
OPTION 1: Keep the DeprecationWarning in Version 2.2.1
OPTION 2: Remove the DeprecationWarning in Version 2.2.1
OPTION 3: Indifferent (equivalent to a +0 on up/down vote issues)

Voting will close at Noon Eastern Time on February 20, 2024. In the comments, choose OPTION 1 or OPTION 2 or OPTION 3. The decision will be based on which option receives the most votes. If OPTION 3 receives the most votes, then either OPTION 1 or OPTION 2 will be chosen based on which has the most votes. If both of those receive the same number of votes, I don't know what we will do!

For reference:
Current warning that users see when importing pandas in version 2.2.0:

Pyarrow will become a required dependency of pandas in the next major release of pandas (pandas 3.0),
(to allow more performant data types, such as the Arrow string type, and better interoperability with other libraries)
but was not found to be installed on your system.
If this would cause problems for you,
please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466

GitHub issue with feedback: #54466
GitHub issue with discussion about not requiring pyarrow: #57073

I'll list the reasons for keeping/removing the warning here, based on my recall of the discussion. Others can feel free to add additional reasons in the comments, or correct my memory.

Reasons for keeping the warning:

  • pandas 2.2.0 has only been out for 1 month, so we may obtain more feedback, as many people may not have upgraded yet
  • There may be additional reasons for not requiring pyarrow that we have not considered
  • If we remove the warning, then users might infer that we have decided to not require pyarrow in version 3.0

Reasons for removing the warning:

  • Too many people who are not affected by requiring pyarrow are confused by the warning
  • We have enough feedback already to make a decision
  • It's too noisy for a variety of use cases

@phofl
Member

phofl commented Feb 14, 2024

Option 2

@MarcoGorelli
Member

2, I think enough feedback has been collected

@WillAyd
Member

WillAyd commented Feb 14, 2024

Option 1

@Dr-Irv
Contributor Author

Dr-Irv commented Feb 14, 2024

Option 1

@simonjayhawkins
Member

Option 2

(Side note: I'm not sure about the size and capabilities of pyarrow-core, or whether Option 1 with an updated message about the dependency would change the feedback received)

@datapythonista
Member

Option 1, keep the warning

@mroeschke
Member

Option 2

@jorisvandenbossche
Member

Option 2

@bashtage
Contributor

Option 1, retain the warning.

@gfyoung
Member

gfyoung commented Feb 14, 2024

Option 2

@twoertwein
Member

twoertwein commented Feb 15, 2024

Option 3

(it's an annoying warning but also a major change and DeprecationWarnings are not printed by default)

@jreback
Contributor

jreback commented Feb 15, 2024

Option 1

people complain about everything - it's a good warning and useful

@attack68
Contributor

attack68 commented Feb 15, 2024

Option 1.
It's annoying and I think it should be removed for the final 2.2.x release, but for now it's only been out for 1 month, so keep it.

@ziyixi

ziyixi commented Feb 15, 2024

Option 2
Lots of users are affected by the warning, whether or not they rely on pandas directly. I don't think it's the end user's responsibility to suppress these warnings.

Edit (@phofl): This is a core team vote, so please refrain from commenting here

@fangchenli
Member

Option 1

@alimcmaster1
Member

Option 1

@rhshadrach
Member

Option 2

@rhshadrach
Member

@twoertwein

(it's an annoying warning but also a major change and DeprecationWarning are not printed by default)

Not trying to sway anyone's vote, but I do think it's important to know that this is not always true. It will print if you run python foo.py and foo.py imports pandas directly prior to importing any other module that imports pandas. Likewise, it will print when you import pandas directly in a jupyter notebook - again only if prior to importing any other module that imports pandas.

If you first import another module that imports pandas, you will not see the warning by default.
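
The behavior described above can be reproduced with the standard library alone. This is a minimal sketch, not pandas code: it assumes the warning is attributed to whichever module performs the first import, and it approximates CPython's default DeprecationWarning filters (shown when attributed to `__main__`, ignored otherwise). The message text, file name, and the `is_shown` helper are hypothetical.

```python
import warnings


def is_shown(importing_module: str) -> bool:
    """Return True if a DeprecationWarning attributed to `importing_module`
    would be displayed under filters resembling CPython's defaults."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.resetwarnings()
        # Approximation of the relevant default filters:
        #   default::DeprecationWarning:__main__
        #   ignore::DeprecationWarning
        # filterwarnings() inserts at the front of the filter list,
        # so the broader "ignore" rule is added first.
        warnings.filterwarnings("ignore", category=DeprecationWarning)
        warnings.filterwarnings(
            "default", category=DeprecationWarning, module="__main__"
        )
        warnings.warn_explicit(
            "hypothetical stand-in for the pyarrow message",
            category=DeprecationWarning,
            filename="example.py",  # hypothetical file name
            lineno=1,
            module=importing_module,
            registry={},
        )
    return len(caught) > 0


print(is_shown("__main__"))      # True: direct import from a script or notebook
print(is_shown("some_library"))  # False: first import happened inside a library
```

Because a module body runs only on its first import, a later direct `import pandas` in the same process would not trigger the warning again.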

@twoertwein
Member

You are right! (For some reason, I get a different DeprecationWarning: datetime.datetime.utcfromtimestamp() is deprecated on python 3.12 only when changing the warning level).

This is probably not the place for a new option, but changing it to an ImportWarning (which is not printed by default) might be a nice middle ground.

I changed my vote to "Option 3" (indifferent). Option 2 doesn't make sense to me as a large percentage of "users" (downloads) is not yet using 2.2: https://www.pepy.tech/projects/pandas

@jbrockmendel
Member

Option 2

@lithomas1
Member

Option 2

conditional on backing out making pyarrow required as per PDEP 10.

@Dr-Irv
Contributor Author

Dr-Irv commented Feb 20, 2024

Option 2

conditional on backing out making pyarrow required as per PDEP 10.

@lithomas1 are you saying that you only support option 2 if pyarrow is no longer required? But given that we have not yet decided whether to reverse PDEP-10, does that change your vote?

@Dr-Irv
Contributor Author

Dr-Irv commented Feb 20, 2024

The final tally is

  • Option 1: 8
  • Option 2: 9
  • Option 3: 1

Since I did say this would be a majority vote, we should remove the warning for 2.2.1.

Having said that, @jorisvandenbossche and I have discussed that we really don't have a process for revoking parts of a PDEP. In other words, PDEP-10 says a warning would be issued from 2.2 onwards. By removing the warning, we are changing the outcome of the PDEP via an ad-hoc voting process created to resolve this particular issue. So I'm not entirely comfortable with making this decision based on a difference of 1 vote. I'm not sure how others feel about the procedural aspect of this decision, where a simple majority determines the revocation of part of a PDEP.

lithomas1 self-assigned this Feb 21, 2024

@lithomas1
Member

I'll make the PR to remove the warning. I didn't have time yesterday, so will put off the release until Friday.

@lithomas1
Member

This'll also give us some time to think through this decision some more, in case people are getting worried about the simple majority thing.

Can someone else update the PDEP?
