CI: Test Python 3.12 #53743

Merged
merged 29 commits into from Jul 25, 2023

Conversation

lithomas1
Member

@lithomas1 lithomas1 commented Jun 20, 2023

@lithomas1 lithomas1 added Build Library building on various platforms CI Continuous Integration labels Jun 20, 2023
@lithomas1 lithomas1 marked this pull request as draft June 20, 2023 18:29
@lithomas1 lithomas1 removed the request for review from mroeschke June 20, 2023 19:22
@lithomas1
Member Author

For those interested, we have
208 failed, 172066 passed, 18996 skipped, 982 xfailed, 84 xpassed, 56 warnings in 2448.38s (0:40:48)

Not bad, given that most are parametrized, but definitely many more than last year 😭.

@mroeschke
Member

I pushed up a fix to use timezone.utc. I think datetime.UTC is only available on >= PY311.
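
For context, a minimal sketch of the difference (not the code from this PR; the variable name is illustrative): `datetime.timezone.utc` is available on every Python 3 version, while the `datetime.UTC` alias was added in Python 3.11, so it is only touched behind a version check here.

```python
import datetime as dt
import sys

# timezone.utc works on all Python 3 versions.
aware_now = dt.datetime.now(dt.timezone.utc)

# datetime.UTC is an alias for timezone.utc that was added in Python 3.11,
# so it is only referenced behind a version check.
if sys.version_info >= (3, 11):
    assert dt.UTC is dt.timezone.utc
```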

@lithomas1
Member Author

Thanks for updating this, and sorry for losing track of this PR.

I have the Docker image for Python 3.12 set up on my laptop, so I'll try to get some debugging in tomorrow for the Index failures, unless you beat me to it again :).

@lithomas1
Member Author

lithomas1 commented Jul 11, 2023

Looks like the major change is

slice objects are now hashable, allowing them to be used as dict keys and set items. (Contributed by Will Bradshaw, Furkan Onder, and Raymond Hettinger in gh-101264.)
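
A small sketch of what that change means in practice. The helper name `slice_is_hashable` is made up for illustration: on Python 3.11 and earlier `hash(slice(...))` raises `TypeError`, while on 3.12 and later it succeeds, so any code that relied on "hashable implies not a slice" changes behaviour.

```python
import sys


def slice_is_hashable() -> bool:
    """Return True if this interpreter can hash a slice object."""
    try:
        hash(slice(0, 3))
    except TypeError:
        # Python <= 3.11: slices are unhashable.
        return False
    # Python >= 3.12: slices can be hashed and used as dict keys.
    return True


print(sys.version_info[:2], slice_is_hashable())
```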

@lithomas1
Member Author

Down to 4 tests

FAILED pandas/tests/computation/test_eval.py::TestEval::test_true_false_logic - DeprecationWarning: Bitwise inversion '~' on bool is deprecated. This returns the bitwise inversion of the underlying int object and is usually not what you expect from negating a bool. Use the 'not' operator for boolean negation or ~int(x) if you really want the bitwise inversion of the underlying int.
FAILED pandas/tests/frame/indexing/test_where.py::TestDataFrameIndexingWhere::test_where_invalid - DeprecationWarning: Bitwise inversion '~' on bool is deprecated. This returns the bitwise inversion of the underlying int object and is usually not what you expect from negating a bool. Use the 'not' operator for boolean negation or ~int(x) if you really want the bitwise inversion of the underlying int.
FAILED pandas/tests/series/indexing/test_getitem.py::test_getitem_generator - AssertionError: Series are different

Series length are different
[left]:  0, Index([], dtype='object')
[right]: 14, Index(['1bZFplgxJt', 'dgf6MOzmcc', 'nXA7LxB4pD', '7xialahw4M', 'eHSMzF5vTv',
       'ohno0T7Zu6', 'sK2uiqyfCh', 'ccmUonUyaI', 'KNYdKnS7IO', 'jQ9UjUVpex',
       'NGazj6cYBr', 'XHqmZevyyF', 'SQXHWlGTY5', 'ZmyLvG2mLO'],
      dtype='object')
FAILED pandas/tests/tseries/offsets/test_year.py::test_add_out_of_pydatetime_range

I think I regressed one of the indexing tests in my fixes.
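
As an illustration of the `~` deprecation quoted in the first two failures (a plain-Python sketch, not the pandas fix itself): the two replacements the warning message suggests give different results, so which one applies depends on what the calling code meant.

```python
flag = True

# Boolean negation, which is usually what was intended.
logical_negation = not flag

# Explicit bitwise inversion of the underlying int; this form does not
# trigger the DeprecationWarning because the operand is an int, not a bool.
bitwise_inversion = ~int(flag)

print(logical_negation, bitwise_inversion)
```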

pandas/core/indexing.py (outdated review thread, resolved)
@lithomas1 lithomas1 marked this pull request as ready for review July 14, 2023 23:24
@lithomas1 lithomas1 added this to the 2.1 milestone Jul 14, 2023
@lithomas1
Member Author

This should be ready now.

cc @jbrockmendel @phofl for indexing changes

Not sure who to ping for eval/sql changes.

@@ -165,7 +168,7 @@ def __contains__(self, key: Any) -> bool:
     hash(key)
     try:
         self.get_loc(key)
-    except (KeyError, TypeError, ValueError):
+    except (KeyError, TypeError, ValueError, InvalidIndexError):
Member

what cases get here?

Member Author

if is_hashable(key):
# Convert generator to list before going through hashable part
# (We will iterate through the generator there to check for slices)
if is_iterator(key):
Member

why does changing the order matter? the current order is pretty fine-tuned for perf

Member Author

We need to check for slices in iterators by iterating through them, since they aren't allowed as hash keys.
(It doesn't make sense to have slices as keys in an Index and it breaks way too many things).
This exhausts the generator.

IIUC, this should only lower perf for generators?
(At least, the is_iterator docstring says it only returns True for generators, not for lists and similar containers.)
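
A minimal sketch of the exhaustion problem described above, using a hypothetical `contains_slice` helper rather than the actual pandas code path: scanning a generator for slices consumes it, so nothing is left for the lookup afterwards unless it is converted to a list first.

```python
def contains_slice(key) -> bool:
    """Hypothetical helper: scan an iterable for slice objects."""
    return any(isinstance(item, slice) for item in key)


# Scanning a generator consumes it, so nothing is left afterwards.
gen = (i for i in range(3))
contains_slice(gen)
leftover = list(gen)

# Materializing the generator first keeps the values available after the scan.
key = list(i for i in range(3))
contains_slice(key)

print(leftover, key)
```

This matches the `test_getitem_generator` failure above, where the left-hand Series came back with length 0.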

lithomas1 and others added 3 commits July 24, 2023 14:05
Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com>
@lithomas1 lithomas1 requested a review from mroeschke July 25, 2023 17:14
.circleci/config.yml (outdated review thread, resolved)
Member

@mroeschke mroeschke left a comment

One comment and a small test failure otherwise looks good

@lithomas1 lithomas1 merged commit c9a8f95 into pandas-dev:main Jul 25, 2023
69 of 70 checks passed
@lithomas1 lithomas1 deleted the test-py312 branch July 25, 2023 23:36
os: [ubuntu-22.04, macOS-latest, windows-latest]
# TODO: Disable macOS for now, Github Actions bug where python is not
# symlinked correctly to 3.12
# xref https://github.com/actions/setup-python/issues/701
Member

the linked issue is now closed; can this be reenabled?

Member Author

Matt enabled this again a while back, I think.

if is_iterator(key):
    key = list(key)

if is_hashable(key) and not isinstance(key, slice):
Member

very late review comment: would it make sense to make is_hashable_non_slice or something? it wouldn't surprise me if many places that use is_hashable currently assume non-slice but didn't get updated by this PR

Member Author

Yeah sure, I just updated the places that broke in the tests.

Is there a case where we would actually want is_hashable for a slice to equal True, though?
(I was thinking it might be cleaner to make is_hashable always return False for slices. This would be a breaking API change, though).

Member

I think having is_hashable be anything other than a try/except around hash would cause problems. I'm suggesting a new function to de-duplicate the hashable-but-not-slice checks this introduces
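
A rough sketch of what such a helper could look like. Both functions are simplified stand-ins written for illustration: `is_hashable` here is only the bare try/except around `hash` described above, not pandas' real implementation, and `is_hashable_non_slice` is the hypothetical name floated in this thread, not an existing pandas function.

```python
def is_hashable(obj) -> bool:
    """Simplified stand-in: a try/except around hash()."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


def is_hashable_non_slice(obj) -> bool:
    """Hypothetical helper: hashable, and not a slice.

    On Python 3.12+ slices are hashable, so the isinstance check is what
    keeps them out; on older versions is_hashable already rejects them.
    """
    return is_hashable(obj) and not isinstance(obj, slice)


print(is_hashable_non_slice("a"), is_hashable_non_slice(slice(1)))
```

Keeping `is_hashable` as a thin wrapper and putting the slice exclusion in a separate function leaves the existing semantics untouched while giving call sites a single place to express the combined check.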

Member Author

Made an issue #55152.

I don't have the same time that I had in summer to work on pandas, but I'll try to have a look (no promises, though).

Labels
Build Library building on various platforms CI Continuous Integration

4 participants