Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: DataFrame.min/max with axis=1 and uniform datetime64[ns, tz] types does not return NaNs #24759

Merged
merged 13 commits into from Jan 16, 2019

Conversation

mroeschke
Copy link
Member

Essentially, check that all blocks are are datetimetz type and the same timezone before performing max/min since datetimetz blocks are only 1D

@mroeschke mroeschke changed the title BUG: DataFrame.min/max with axis=1 does not return NaNs BUG: DataFrame.min/max with axis=1 and uniform datetime64[ns, tz] types does not return NaNs Jan 14, 2019
@@ -5164,6 +5164,11 @@ def _is_datelike_mixed_type(self):
f = lambda: self._data.is_datelike_mixed_type
return self._protect_consolidate(f)

@property
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you shouldn't need this as _is_datelike_mixed_type should cover this case, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_is_datelike_mixed_type is not specific to tz-aware blocks nor checks the actual timezone which we need here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you instead use self._is_homogenous_type and a check that any (or the first) dtype is datetimetz?

@@ -196,3 +196,15 @@ def test_tz_localize_convert_copy_inplace_mutate(self, copy, method, tz):
index=date_range('20131027', periods=5,
freq='1H', tz=tz))
tm.assert_frame_equal(result, expected)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these should go with the appropriate tests, I think in test_nanops or maybe tests/reductions)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Moved to tests/reductions

@gfyoung gfyoung added Bug Timezones Timezone data dtype labels Jan 14, 2019
@codecov
Copy link

codecov bot commented Jan 14, 2019

Codecov Report

Merging #24759 into master will decrease coverage by 49.46%.
The diff coverage is 27.27%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master   #24759       +/-   ##
===========================================
- Coverage   92.38%   42.92%   -49.47%     
===========================================
  Files         166      166               
  Lines       52363    52373       +10     
===========================================
- Hits        48376    22480    -25896     
- Misses       3987    29893    +25906
Flag Coverage Δ
#multiple ?
#single 42.92% <27.27%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/frame.py 35.74% <100%> (-61.19%) ⬇️
pandas/core/internals/managers.py 66.33% <14.28%> (-29.73%) ⬇️
pandas/core/generic.py 39.93% <33.33%> (-56.7%) ⬇️
pandas/io/formats/latex.py 0% <0%> (-100%) ⬇️
pandas/core/categorical.py 0% <0%> (-100%) ⬇️
pandas/io/sas/sas_constants.py 0% <0%> (-100%) ⬇️
pandas/tseries/plotting.py 0% <0%> (-100%) ⬇️
pandas/tseries/converter.py 0% <0%> (-100%) ⬇️
pandas/io/formats/html.py 0% <0%> (-99.35%) ⬇️
pandas/core/groupby/categorical.py 0% <0%> (-95.46%) ⬇️
... and 126 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 453fa85...a21a7ac. Read the comment docs.

@codecov
Copy link

codecov bot commented Jan 14, 2019

Codecov Report

Merging #24759 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #24759      +/-   ##
==========================================
+ Coverage   92.38%   92.38%   +<.01%     
==========================================
  Files         166      166              
  Lines       52363    52366       +3     
==========================================
+ Hits        48377    48380       +3     
  Misses       3986     3986
Flag Coverage Δ
#multiple 90.81% <100%> (ø) ⬆️
#single 42.92% <100%> (+0.01%) ⬆️
Impacted Files Coverage Δ
pandas/core/frame.py 96.92% <100%> (ø) ⬆️
pandas/core/ops.py 94.28% <0%> (ø) ⬆️
pandas/core/dtypes/common.py 96.78% <0%> (ø) ⬆️
pandas/core/indexes/interval.py 95.27% <0%> (ø) ⬆️
pandas/core/arrays/interval.py 93.12% <0%> (+0.03%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e2bf1ff...190db02. Read the comment docs.

if axis == 1 and self._is_mixed_type and self._is_datelike_mixed_type:
if (axis == 1 and self._is_datelike_mixed_type
and (not self._is_homogeneous_type
and not self._data.blocks[0].is_datetimetz)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really don't like the last condition as this is reaching into internals. are the prior ones not sufficient?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative is is_datetimetz64_dtype(self.dtypes[0]). I don't have a preference on which is cleaner.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opted for your suggestion @TomAugspurger. Looks a little cleaner from first inspection.

@jreback jreback added this to the 0.24.0 milestone Jan 16, 2019
@jreback jreback merged commit 8ce64b7 into pandas-dev:master Jan 16, 2019
@jreback
Copy link
Contributor

jreback commented Jan 16, 2019

thanks @mroeschke

@mroeschke mroeschke deleted the tz_aware_reduction_ops branch January 16, 2019 05:11
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Timezones Timezone data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: DataFrame with tz-aware data and max(axis=1) returns NaN
4 participants