BUG: fix replace_list #27720

Merged
merged 3 commits into from
Aug 5, 2019
Changes from 2 commits
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.25.1.rst
@@ -152,7 +152,7 @@ ExtensionArray

Other
^^^^^

- Bug in :meth:`Series.replace` and :meth:`DataFrame.replace` when replacing timezone-aware timestamps using a dict-like replacer (:issue:`27720`)
Contributor

does this close any issues?

Member Author

Not AFAICT. Searching the tracker doesn't show anything obvious and the xfail comment points to a closed PR.

-
-
-
5 changes: 2 additions & 3 deletions pandas/core/generic.py
@@ -6658,9 +6658,8 @@ def replace(
         else:

             # need a non-zero len on all axes
-            for a in self._AXIS_ORDERS:
-                if not len(self._get_axis(a)):
-                    return self
+            if not self.size:
Member

Do we even need this check any more or would this happen implicitly?

Member Author

I'm not sure, I was mostly happy about avoiding the use of AXIS_ORDERS.
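
A minimal sketch of why the single `size` check can stand in for the loop over axes: the element count of an N-dimensional container is the product of its axis lengths, so it is zero exactly when at least one axis is empty. `ToyFrame` and both helper functions are hypothetical stand-ins for illustration, not pandas code.

```python
from math import prod


class ToyFrame:
    """Hypothetical stand-in for an NDFrame-like container (not pandas code)."""

    def __init__(self, *axes):
        # Each axis is a list of labels, e.g. row labels and column labels.
        self.axes = [list(axis) for axis in axes]

    @property
    def size(self):
        # Number of elements: the product of the axis lengths.
        return prod(len(axis) for axis in self.axes)


def is_empty_by_axes(obj):
    # Mirrors the removed loop: any zero-length axis means nothing to replace.
    return any(not len(axis) for axis in obj.axes)


def is_empty_by_size(obj):
    # Mirrors the new check: a zero element count means nothing to replace.
    return not obj.size


frames = [
    ToyFrame([0, 1], ["a", "b"]),  # 2 x 2
    ToyFrame([], ["a", "b"]),      # no rows
    ToyFrame([0, 1], []),          # no columns
    ToyFrame([0, 1, 2]),           # 1-D, like a Series
    ToyFrame([]),                  # empty 1-D
]
results = [(is_empty_by_axes(f), is_empty_by_size(f)) for f in frames]
```

Both helpers agree on every shape above. Whether the early return is needed at all, as asked in the thread, is a separate question that this sketch does not answer.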

Member

General question: is NDFrame supposed to represent an n-dimensional data structure, or is it a base class for Series and DataFrame?

Contributor

NDFrame is just a base class (of Series & DataFrame)

Contributor

It also used to back Panel / Panel4D, but no longer does.

Member

i'm thinking should the generic nature of NDFrame be maintained to facilitate a third party Panel implementation, or is that ruled out-of-scope for pandas?

Contributor

> i'm thinking should the generic nature of NDFrame be maintained to facilitate a third party Panel implementation, or is that ruled out-of-scope for pandas?

No, that's out of scope. We removed Panel because of all the complexity related to ndim > 2; xarray is a better platform for that.

generic is just the collection of common API between Series/DataFrame.

Member Author

mypy complained about FrameOrSeries but was OK with NDFrame.

Member

In pandas._typing, FrameOrSeries is a TypeVar of Series and DataFrame. That is not applicable to core.generic.

In #27646, FrameOrSeries is defined as a TypeVar bound by NDFrame, so that a Series returns a Series, a DataFrame returns a DataFrame, a subclassed DataFrame returns a subclassed DataFrame, etc.

NDFrame is a nominal type, so it allows any subclass of NDFrame to be returned.
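
The distinction can be sketched as follows. This is a minimal illustration with hypothetical toy classes that reuse the names from the discussion; the method names `copy_nominal` and `copy_bound` are invented for the example, and none of this is the pandas implementation.

```python
from typing import TypeVar

# A TypeVar bound by NDFrame: whatever subclass goes in is what comes out.
FrameOrSeries = TypeVar("FrameOrSeries", bound="NDFrame")


class NDFrame:
    def copy_nominal(self) -> "NDFrame":
        # Nominal annotation: a type checker only knows the result is some NDFrame.
        return self

    def copy_bound(self: FrameOrSeries) -> FrameOrSeries:
        # Bound TypeVar: a type checker infers the caller's own (sub)class.
        return self


class Series(NDFrame):
    pass


class DataFrame(NDFrame):
    pass


class SubclassedDataFrame(DataFrame):
    pass


sub = SubclassedDataFrame()
nominal_result = sub.copy_nominal()  # statically typed as NDFrame
bound_result = sub.copy_bound()      # statically typed as SubclassedDataFrame
```

At runtime both calls return the same object; the difference is visible only to a static checker such as mypy.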

+                return self

             new_data = self._data
             if is_dict_like(to_replace):
7 changes: 4 additions & 3 deletions pandas/core/internals/managers.py
@@ -7,7 +7,7 @@

 import numpy as np

-from pandas._libs import internals as libinternals, lib
+from pandas._libs import Timedelta, Timestamp, internals as libinternals, lib
 from pandas.util._validators import validate_bool_kwarg

 from pandas.core.dtypes.cast import (
@@ -602,9 +602,10 @@ def comp(s, regex=False):
             """
             if isna(s):
                 return isna(values)
-            if hasattr(s, "asm8"):
+            if isinstance(s, (Timedelta, Timestamp)) and getattr(s, "tz", None) is None:
+
                 return _compare_or_regex_search(
-                    maybe_convert_objects(values), getattr(s, "asm8"), regex
+                    maybe_convert_objects(values), s.asm8, regex
                 )
             return _compare_or_regex_search(values, s, regex)
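
Why the added tz check matters can be sketched with a stdlib-only analogy. Assumption: converting a tz-aware scalar through `asm8` yields a timezone-less value, modelled here by the hypothetical `strip_tz` helper. `match_mask` is likewise invented for illustration; it is not the pandas code path, and the actual failure in pandas may have differed in detail.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-5 offset, standing in for US/Eastern in this sketch.
EASTERN = timezone(timedelta(hours=-5))


def strip_tz(value):
    # Hypothetical analogue of a tz-less conversion:
    # keep the UTC instant, drop the timezone.
    return value.astimezone(timezone.utc).replace(tzinfo=None)


def match_mask(values, s, guard_tz):
    # Without the guard, a tz-aware scalar is converted to its tz-less form
    # before comparing. With the guard, tz-aware scalars are compared as-is.
    if isinstance(s, datetime) and s.tzinfo is not None and not guard_tz:
        s = strip_tz(s)
    return [v == s for v in values]


values = [
    datetime(2011, 1, 1, tzinfo=EASTERN),
    datetime(2011, 1, 2, tzinfo=EASTERN),
]
target = values[0]

unguarded = match_mask(values, target, guard_tz=False)
guarded = match_mask(values, target, guard_tz=True)
```

In this analogy the unguarded comparison finds no match, because a tz-less value never equals a tz-aware one, while the guarded comparison matches the first element.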

43 changes: 17 additions & 26 deletions pandas/tests/indexing/test_coercion.py
@@ -1029,17 +1029,15 @@ def test_replace_series(self, how, to_key, from_key):

         tm.assert_series_equal(result, exp)

-    # TODO(jbrockmendel) commented out to only have a single xfail printed
-    @pytest.mark.xfail(
-        reason="GH #18376, tzawareness-compat bug in BlockManager.replace_list"
+    @pytest.mark.parametrize("how", ["dict", "series"])
+    @pytest.mark.parametrize(
+        "to_key",
+        ["timedelta64[ns]", "bool", "object", "complex128", "float64", "int64"],
     )
-    # @pytest.mark.parametrize('how', ['dict', 'series'])
-    # @pytest.mark.parametrize('to_key', ['timedelta64[ns]', 'bool', 'object',
-    # 'complex128', 'float64', 'int64'])
-    # @pytest.mark.parametrize('from_key', ['datetime64[ns, UTC]',
-    # 'datetime64[ns, US/Eastern]'])
-    # def test_replace_series_datetime_tz(self, how, to_key, from_key):
-    def test_replace_series_datetime_tz(self):
+    @pytest.mark.parametrize(
+        "from_key", ["datetime64[ns, UTC]", "datetime64[ns, US/Eastern]"]
+    )
+    def test_replace_series_datetime_tz(self, how, to_key, from_key):
         how = "series"
         from_key = "datetime64[ns, US/Eastern]"
         to_key = "timedelta64[ns]"
Member

should these three assignments not be removed now?
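
The concern can be sketched without pytest: if the test body reassigns its parametrized arguments, every generated case collapses to the same combination. `run_with_overrides` and `run_as_parametrized` are hypothetical stand-ins for the test body; the parameter lists are copied from the decorators in this diff.

```python
from itertools import product

HOWS = ["dict", "series"]
TO_KEYS = ["timedelta64[ns]", "bool", "object", "complex128", "float64", "int64"]
FROM_KEYS = ["datetime64[ns, UTC]", "datetime64[ns, US/Eastern]"]


def run_with_overrides(how, to_key, from_key):
    # The three leftover assignments shadow whatever the parametrization passed in.
    how = "series"
    from_key = "datetime64[ns, US/Eastern]"
    to_key = "timedelta64[ns]"
    return (how, to_key, from_key)


def run_as_parametrized(how, to_key, from_key):
    # With the assignments removed, each case exercises its own combination.
    return (how, to_key, from_key)


# Stacked parametrize decorators generate the cross product of their values.
cases = list(product(HOWS, TO_KEYS, FROM_KEYS))
overridden = {run_with_overrides(*case) for case in cases}
distinct = {run_as_parametrized(*case) for case in cases}
```

Here `cases` holds 24 combinations, but `overridden` contains a single one, so the extra parametrized runs would add no coverage while the assignments remain.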

@@ -1061,23 +1059,16 @@ def test_replace_series_datetime_tz(self):

         tm.assert_series_equal(result, exp)

-    # TODO(jreback) commented out to only have a single xfail printed
-    @pytest.mark.xfail(
-        reason="different tz, currently mask_missing raises SystemError", strict=False
+    @pytest.mark.parametrize("how", ["dict", "series"])
+    @pytest.mark.parametrize(
+        "to_key",
+        ["datetime64[ns]", "datetime64[ns, UTC]", "datetime64[ns, US/Eastern]"],
     )
-    # @pytest.mark.parametrize('how', ['dict', 'series'])
-    # @pytest.mark.parametrize('to_key', [
-    # 'datetime64[ns]', 'datetime64[ns, UTC]',
-    # 'datetime64[ns, US/Eastern]'])
-    # @pytest.mark.parametrize('from_key', [
-    # 'datetime64[ns]', 'datetime64[ns, UTC]',
-    # 'datetime64[ns, US/Eastern]'])
-    # def test_replace_series_datetime_datetime(self, how, to_key, from_key):
-    def test_replace_series_datetime_datetime(self):
-        how = "dict"
-        to_key = "datetime64[ns]"
-        from_key = "datetime64[ns]"
-
+    @pytest.mark.parametrize(
+        "from_key",
+        ["datetime64[ns]", "datetime64[ns, UTC]", "datetime64[ns, US/Eastern]"],
+    )
+    def test_replace_series_datetime_datetime(self, how, to_key, from_key):
         index = pd.Index([3, 4], name="xxx")
gfyoung marked this conversation as resolved.
         obj = pd.Series(self.rep[from_key], index=index, name="yyy")
         assert obj.dtype == from_key