Backport PR #52566 on branch 2.0.x (BUG: DataFrame reductions casting ts resolution always to nanoseconds) (#52618)

Backport PR #52566: BUG: DataFrame reductions casting ts resolution always to nanoseconds

Co-authored-by: Patrick Hoefler <61934744+phofl@users.noreply.github.com>
meeseeksmachine and phofl committed Apr 12, 2023
1 parent 2e6c99a commit 5cdf5f3
Showing 3 changed files with 39 additions and 2 deletions.
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.0.1.rst
@@ -28,6 +28,7 @@ Bug fixes
- Bug in :meth:`Series.describe` not returning :class:`ArrowDtype` with ``pyarrow.float64`` type with numeric data (:issue:`52427`)
- Fixed segfault in :meth:`Series.to_numpy` with ``null[pyarrow]`` dtype (:issue:`52443`)
- Bug in :func:`pandas.testing.assert_series_equal` where ``check_dtype=False`` would still raise for datetime or timedelta types with different resolutions (:issue:`52449`)
+- Bug in :meth:`DataFrame.max` and related reductions casting different :class:`Timestamp` resolutions always to nanoseconds (:issue:`52524`)
- Bug in :meth:`ArrowDtype.__from_arrow__` not respecting if dtype is explicitly given (:issue:`52533`)
- Bug in :func:`read_csv` casting PyArrow datetimes to NumPy when ``dtype_backend="pyarrow"`` and ``parse_dates`` is set causing a performance bottleneck in the process (:issue:`52546`)
- Bug in :class:`arrays.DatetimeArray` constructor returning an incorrect unit when passed a non-nanosecond numpy datetime array (:issue:`52555`)
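For context, a minimal sketch of the behavior this new entry describes (not part of the diff; it mirrors the test added below): reducing over columns with different timestamp resolutions now keeps the smallest unit present instead of casting everything to nanoseconds.

    import pandas as pd

    df = pd.DataFrame(
        {
            "a": pd.Series([pd.Timestamp("2019-12-31")], dtype="datetime64[s]"),
            "b": pd.Series(
                [pd.Timestamp("2019-12-31 00:00:00.123")], dtype="datetime64[ms]"
            ),
        }
    )
    # Previously this printed datetime64[ns]; with the fix it prints
    # datetime64[ms], the smallest unit among the input columns.
    print(df.max().dtype)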
4 changes: 2 additions & 2 deletions pandas/core/dtypes/cast.py
@@ -1414,9 +1414,9 @@ def find_common_type(types):

    # take lowest unit
    if all(is_datetime64_dtype(t) for t in types):
-        return np.dtype("datetime64[ns]")
+        return np.dtype(max(types))
    if all(is_timedelta64_dtype(t) for t in types):
-        return np.dtype("timedelta64[ns]")
+        return np.dtype(max(types))

    # don't mix bool / int or float or complex
    # this is different from numpy, which casts bool with float/int as int
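The two-line fix leans on numpy's dtype ordering: a coarser datetime64/timedelta64 unit can be safely cast to a finer one, so it compares as the smaller dtype and max(types) selects the finest resolution among the inputs. A minimal sketch of that ordering (plain numpy, not part of the diff):

    import numpy as np

    types = [np.dtype("datetime64[s]"), np.dtype("datetime64[ms]")]
    # "s" casts safely to "ms", so datetime64[s] < datetime64[ms];
    # max() therefore returns the finest unit present.
    print(np.dtype(max(types)))  # datetime64[ms]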
36 changes: 36 additions & 0 deletions pandas/tests/frame/test_reductions.py
@@ -1508,6 +1508,42 @@ def test_reductions_skipna_none_raises(
        with pytest.raises(ValueError, match=msg):
            getattr(obj, all_reductions)(skipna=None)

+    @td.skip_array_manager_invalid_test
+    def test_reduction_timestamp_smallest_unit(self):
+        # GH#52524
+        df = DataFrame(
+            {
+                "a": Series([Timestamp("2019-12-31")], dtype="datetime64[s]"),
+                "b": Series(
+                    [Timestamp("2019-12-31 00:00:00.123")], dtype="datetime64[ms]"
+                ),
+            }
+        )
+        result = df.max()
+        expected = Series(
+            [Timestamp("2019-12-31"), Timestamp("2019-12-31 00:00:00.123")],
+            dtype="datetime64[ms]",
+            index=["a", "b"],
+        )
+        tm.assert_series_equal(result, expected)

+    @td.skip_array_manager_not_yet_implemented
+    def test_reduction_timedelta_smallest_unit(self):
+        # GH#52524
+        df = DataFrame(
+            {
+                "a": Series([pd.Timedelta("1 days")], dtype="timedelta64[s]"),
+                "b": Series([pd.Timedelta("1 days")], dtype="timedelta64[ms]"),
+            }
+        )
+        result = df.max()
+        expected = Series(
+            [pd.Timedelta("1 days"), pd.Timedelta("1 days")],
+            dtype="timedelta64[ms]",
+            index=["a", "b"],
+        )
+        tm.assert_series_equal(result, expected)


class TestNuisanceColumns:
    @pytest.mark.parametrize("method", ["any", "all"])