Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise NotImplementedError for datetime strings with UTC offset #14070

Merged
merged 10 commits into from
Sep 14, 2023

Conversation

mroeschke
Copy link
Contributor

Description

Avoids e.g. DatetimeIndex(["2022-07-22 00:00:00+02:00"]) from dropping the +02:00 since timezones are not supported

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@mroeschke mroeschke added bug Something isn't working Python Affects Python cuDF API. non-breaking Non-breaking change labels Sep 8, 2023
@mroeschke mroeschke requested a review from a team as a code owner September 8, 2023 18:47
try:
pd_arbitrary = pd.to_datetime(arbitrary)
except pd.errors.OutOfBoundsDatetime:
# Not an issue in pandas>=2.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this comment mean that this error won't be raised when we upgrade to pandas 2.0? If so, one suggestion I would have is that we implement a utility function that will raise an exception everywhere that we have a pandas<2.0 code path if the detected pandas version is 2.0. Then when we actually upgrade pandas we can validate that we changed everywhere that needed changing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah so actually this doesn't work yet in pandas (it was related to non-nanosecond support in pandas 2.0, but it's not implemented everywhere yet). I linked the issue to this exception path

@@ -2150,6 +2150,12 @@ def test_daterange_pandas_compatibility():
assert_eq(expected, actual)


@pytest.mark.filterwarnings("ignore:parsing timezone:DeprecationWarning")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing, added

Copy link
Contributor

@wence- wence- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor style suggestion

except pd.errors.OutOfBoundsDatetime:
# https://github.com/pandas-dev/pandas/issues/55096
pass
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick (non-blocking): ‏style

FWIW I tend to prefer not using else on try/except (I can never remember if it applies to the try or the except), so perhaps move this validation into the try block?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing. Moved to the try block

@@ -2150,6 +2150,12 @@ def test_daterange_pandas_compatibility():
assert_eq(expected, actual)


def test_strings_with_utc_offset_not_implemented():
with pytest.warns(DeprecationWarning, match="parsing timezone"):
with pytest.raises(NotImplementedError):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is cudf.

@@ -2150,6 +2150,12 @@ def test_daterange_pandas_compatibility():
assert_eq(expected, actual)


def test_strings_with_utc_offset_not_implemented():
with pytest.warns(DeprecationWarning, match="parsing timezone"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this one is pandas in the pd.to_datetime call?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is in cupy. I'll add a comment referencing this

@mroeschke
Copy link
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 1bfeee7 into rapidsai:branch-23.10 Sep 14, 2023
54 checks passed
@mroeschke mroeschke deleted the bug/to_datetime/utcoffsets branch September 15, 2023 20:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants