standardize signature for Index reductions, implement nanmean for datetime64 dtypes #24293

jbrockmendel · 2018-12-15T04:09:54Z

Same basic goal as #23890, but focusing on nanops (a module I'm not all that familiar with) first, with the array methods later, instead of vice-versa.

I think the only user-facing change (i.e. need for whatsnew) is that Series[datetime64].mean() now works.
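
To make the user-facing change concrete, here is a minimal pure-Python sketch of what a mean over datetime values with missing entries involves. It is not the nanops implementation: `mean_datetime`, `EPOCH` and `ONE_US` are hypothetical names, `None` stands in for `NaT`, and the arithmetic is done in microseconds rather than nanoseconds.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
ONE_US = timedelta(microseconds=1)


def mean_datetime(values, skipna=True):
    """Mean of naive datetimes; None plays the role of NaT (illustrative only)."""
    present = [v for v in values if v is not None]
    has_missing = len(present) != len(values)
    if not present or (has_missing and not skipna):
        return None
    # Work on integer offsets from the epoch, then convert back.
    offsets = [(v - EPOCH) // ONE_US for v in present]
    return EPOCH + (sum(offsets) // len(offsets)) * ONE_US


stamps = [datetime(2018, 1, 1), None, datetime(2018, 1, 3)]
print(mean_datetime(stamps))                # 2018-01-02 00:00:00
print(mean_datetime(stamps, skipna=False))  # None
```

The point of the integer round trip is that a mean is not defined on timestamps directly, only on their offsets from a fixed origin.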

I need to see if there are any more tests from #23890 or #24024 that can be copied here.

closes #24265 (Series.max incorrect for skipna=False with missing datetime64[ns] data)
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

…etime64 dtypes

pep8speaks · 2018-12-15T04:10:02Z

Hello @jbrockmendel! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on December 28, 2018 at 23:37 Hours UTC

jbrockmendel · 2018-12-15T04:11:26Z

possibly closes #21583

jbrockmendel · 2018-12-15T17:58:29Z

test_nanops is a PITA to troubleshoot. Code de-duplication at the expense of clarity may need re-thinking at some point.

codecov · 2018-12-15T18:02:14Z

Codecov Report

Merging #24293 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #24293      +/-   ##
==========================================
+ Coverage   92.22%   92.22%   +<.01%     
==========================================
  Files         162      162              
  Lines       51824    51842      +18     
==========================================
+ Hits        47795    47813      +18     
  Misses       4029     4029

Flag	Coverage Δ
#multiple	`90.63% <100%> (ø)`	⬆️
#single	`43% <45.71%> (-0.01%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/nanops.py	`95.14% <100%> (+0.09%)`	⬆️
pandas/core/indexes/datetimelike.py	`97.29% <100%> (ø)`	⬆️
pandas/core/base.py	`97.66% <100%> (+0.02%)`	⬆️
pandas/core/dtypes/missing.py	`93.18% <100%> (+0.07%)`	⬆️
pandas/core/indexes/range.py	`97.34% <100%> (+0.01%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7b0fa8e...04cf1f7. Read the comment docs.

codecov · 2018-12-15T18:02:14Z

Codecov Report

Merging #24293 into master will decrease coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #24293      +/-   ##
==========================================
- Coverage   92.32%   92.31%   -0.01%     
==========================================
  Files         166      166              
  Lines       52298    52320      +22     
==========================================
+ Hits        48285    48301      +16     
- Misses       4013     4019       +6

Flag	Coverage Δ
#multiple	`90.73% <100%> (-0.01%)`	⬇️
#single	`43.04% <31.03%> (-0.03%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/nanops.py	`94.9% <100%> (-0.16%)`	⬇️
pandas/core/indexes/datetimelike.py	`97.59% <100%> (+0.05%)`	⬆️
pandas/core/base.py	`97.72% <100%> (+0.02%)`	⬆️
pandas/core/series.py	`93.73% <100%> (+0.01%)`	⬆️
pandas/core/dtypes/missing.py	`93.18% <100%> (+0.07%)`	⬆️
pandas/core/indexes/range.py	`97.34% <100%> (+0.01%)`	⬆️
pandas/core/dtypes/cast.py	`88.72% <0%> (-0.67%)`	⬇️
pandas/core/arrays/datetimelike.py	`95.45% <0%> (-0.2%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4f1c1dc...aa4028a. Read the comment docs.

jreback · 2018-12-15T18:17:47Z

Needs a parametrization pass.

pandas/core/base.py

pandas/core/dtypes/missing.py

jreback · 2018-12-15T19:11:59Z

pandas/core/indexes/datetimelike.py

@@ -286,7 +286,7 @@ def min(self, axis=None, *args, **kwargs):
        except ValueError:
            return self._na_value

-    def argmin(self, axis=None, *args, **kwargs):
+    def argmin(self, axis=None, skipna=True, *args, **kwargs):


can these inherit the doc-string?

I'll take a look. Might also get rid of *args while at it

Hmm the docstrings here and for the corresponding methods in base.IndexOpsMixin are kind of clunky. This may merit a separate look, @datapythonista ?

I don't know the class hierarchy well, or whether it makes sense to inherit. But we'll have to add Parameters, Returns and Examples sections here if this is public.
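
One way the docstring question could be handled is with a small decorator that copies the parent method's docstring onto the override. The sketch below is an assumption for illustration only: `inherit_doc`, `IndexOpsBase` and `DatetimeLikeOps` are hypothetical names, not pandas helpers or classes.

```python
def inherit_doc(parent_method):
    """Copy parent_method's docstring onto the decorated function."""
    def decorator(func):
        func.__doc__ = parent_method.__doc__
        return func
    return decorator


class IndexOpsBase:
    def argmin(self, axis=None, skipna=True):
        """Return the position of the smallest value."""
        raise NotImplementedError


class DatetimeLikeOps(IndexOpsBase):
    @inherit_doc(IndexOpsBase.argmin)
    def argmin(self, axis=None, skipna=True):
        # Body differs from the parent; the documentation does not.
        return 0


print(DatetimeLikeOps.argmin.__doc__)  # Return the position of the smallest value.
```

This keeps one copy of the text, at the cost of the override being unable to document behaviour specific to the subclass.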

jreback · 2018-12-15T19:12:24Z

pandas/core/indexes/range.py

@@ -297,12 +297,14 @@ def _minmax(self, meth):

        return self._start + self._step * no_steps

-    def min(self):
+    def min(self, axis=None, skipna=True):
        """The minimum value of the RangeIndex"""


Same here: why don't these inherit the doc-string?
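
For context on the signature change in this hunk, a range-backed index can compute its minimum arithmetically, so `axis` and `skipna` are accepted only to match the other reductions. The following is a sketch under that assumption; `MiniRange` is a hypothetical stand-in built on Python's `range`, not pandas' `RangeIndex`, and `None` stands in for the missing-value result on an empty range.

```python
class MiniRange:
    """Hypothetical stand-in for a range-backed index."""

    def __init__(self, start, stop, step=1):
        self._range = range(start, stop, step)

    def _minmax(self, meth):
        r = self._range
        n = len(r)
        if n == 0:
            return None  # nothing to reduce
        # For a positive step the first element is the minimum;
        # for a negative step it is the maximum.
        first_is_min = r.step > 0
        want_first = (meth == "min") == first_is_min
        return r.start if want_first else r.start + r.step * (n - 1)

    def min(self, axis=None, skipna=True):
        """The minimum value of the range (axis and skipna are ignored)."""
        return self._minmax("min")

    def max(self, axis=None, skipna=True):
        """The maximum value of the range (axis and skipna are ignored)."""
        return self._minmax("max")


print(MiniRange(0, 10, 3).max())   # 9
print(MiniRange(10, 0, -3).min())  # 1
```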

jreback · 2018-12-15T19:13:10Z

pandas/core/nanops.py

@@ -212,6 +213,10 @@ def _get_values(values, skipna, fill_value=None, fill_value_typ=None,
            mask = isna(values)

    dtype = values.dtype
+    if is_datetime64tz_dtype(orig_values):


this is a very very odd place to do this as no other dtype specific things are here

Semi-agree. Most of nanops is numpy dtype/ndarray specific, so making datetime64tz work is klunky. That's why I thought it made more sense to implement these directly in DTA in #23890. But conditional on implementing this here in nanops, this check is necessary.

my point is this is in the wrong place. nanops already handles dtypes just not here.

Can you be more specific as to where you want this? The place where dtype is currently defined is one line above this, so doing this anywhere else seems weird to me.

So this should go in _view_if_needed (or maybe just remove that function and inline it here; that is OK too).

did you do this?

Haven't yet. _view_if_needed is called ~25 lines below this. It isn't obvious to me that it is OK to move it up to here. If it is, then sure, it's a one-liner that can be inlined.

yes move it, as I said above you have dtype inference in 2 places for no reason.

pandas/core/nanops.py

jreback · 2018-12-15T21:17:52Z

pandas/core/nanops.py

@@ -212,6 +213,10 @@ def _get_values(values, skipna, fill_value=None, fill_value_typ=None,
            mask = isna(values)

    dtype = values.dtype
+    if is_datetime64tz_dtype(orig_values):


my point is this is in the wrong place. nanops already handles dtypes just not here.

pandas/core/nanops.py

pandas/core/arrays/datetimelike.py

jbrockmendel · 2018-12-19T17:10:45Z

If everyone is happy with this, I can get started on a follow-up that will trim reductions off of #24024.

TomAugspurger · 2018-12-19T17:15:19Z

pandas/core/base.py

@@ -951,22 +957,36 @@ def max(self):
        >>> idx.max()
        ('b', 2)
        """
+        nv.validate_minmax_axis(axis)
        return nanops.nanmax(self.values)


Why isn't skipna passed? Is that a followup item?

I had considered this a follow-up item, but am now leaning towards fixing these now. Good catch.

This turns out to be more involved than expected: it requires some changes in Series._reduce and IndexOpsMixin._reduce. Neither change is all that invasive, but making them entails changing/fixing/testing behavior for other reductions I hadn't planned on touching in this PR.

Currently going through to get that all done here. I'm also OK with doing this in two phases so as to take things out of #24024 sooner rather than later.
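
The issue raised in this thread, a reduction method that accepts `skipna` but does not forward it, can be illustrated with a small pure-Python model. `nan_max` and `MiniIndex` are hypothetical names for this sketch, and `None` stands in for a missing value; this is not the pandas code path.

```python
def nan_max(values, skipna=True):
    """Max that either ignores missing values or propagates them."""
    present = [v for v in values if v is not None]
    if len(present) != len(values) and not skipna:
        return None  # a missing value poisons the result
    return max(present) if present else None


class MiniIndex:
    def __init__(self, values):
        self.values = list(values)

    def max(self, axis=None, skipna=True):
        # Forward skipna instead of silently dropping it, which would
        # make max(skipna=False) behave like max(skipna=True).
        return nan_max(self.values, skipna=skipna)


idx = MiniIndex([1, None, 3])
print(idx.max())              # 3
print(idx.max(skipna=False))  # None
```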

pandas/core/base.py

jbrockmendel · 2018-12-20T00:02:41Z

Updated to pass skipna in the appropriate places in Index and DatetimeIndexOpsMixin methods. Updated a handful of tests, but I'm not confident we got thorough coverage.

jbrockmendel · 2018-12-20T02:04:45Z

The Travis failure is hypothesis-related.

jbrockmendel · 2018-12-23T17:45:12Z

rebased

pandas/core/base.py

jreback · 2018-12-23T19:16:03Z

pandas/core/base.py

        """
        Return the maximum value of the Index.

+        Parameters


also we have similar things in pandas/core/base, these should change as well.

jreback · 2018-12-23T19:17:39Z

pandas/core/indexes/datetimelike.py

            # quick check
            if len(i8) and self.is_monotonic:
                if i8[-1] != iNaT:
                    return self._box_func(i8[-1])

+            if not len(self):


jreback · 2018-12-23T19:17:45Z

pandas/core/indexes/datetimelike.py

            # quick check
            if len(i8) and self.is_monotonic:
                if i8[-1] != iNaT:
                    return self._box_func(i8[-1])

+            if not len(self):
+                return self._na_value
+
            if self.hasnans:


jreback · 2018-12-23T19:18:05Z

pandas/core/nanops.py

@@ -212,6 +213,10 @@ def _get_values(values, skipna, fill_value=None, fill_value_typ=None,
            mask = isna(values)

    dtype = values.dtype
+    if is_datetime64tz_dtype(orig_values):


did you do this?

pandas/tests/reductions/test_reductions.py

jreback · 2018-12-23T19:19:53Z

pandas/tests/reductions/test_reductions.py

@@ -558,6 +599,8 @@ def test_minmax_nat_series(self, nat_ser):
        # GH#23282
        assert nat_ser.min() is pd.NaT
        assert nat_ser.max() is pd.NaT


can parameterize over skipna in a lot of the code you added to tests

We just implemented test_reductions, still need to move over the reduction tests from tests.indexes and tests.frames. After they're all in one place is when to really focus on parametrization.

jreback · 2018-12-24T00:35:14Z

pandas/core/nanops.py

@@ -212,6 +213,10 @@ def _get_values(values, skipna, fill_value=None, fill_value_typ=None,
            mask = isna(values)

    dtype = values.dtype
+    if is_datetime64tz_dtype(orig_values):


yes move it, as I said above you have dtype inference in 2 places for no reason.

jreback · 2018-12-24T00:36:31Z

pandas/core/series.py

@@ -3493,6 +3494,9 @@ def _reduce(self, op, name, axis=0, skipna=True, numeric_only=None,
        # dispatch to ExtensionArray interface
        if isinstance(delegate, ExtensionArray):
            return delegate._reduce(name, skipna=skipna, **kwds)
+        elif is_datetime64_dtype(delegate):


should this be first?

it isn't clear to me that it would make a difference

pandas/tests/reductions/test_reductions.py

jbrockmendel · 2018-12-24T23:22:06Z

as I said above you have dtype inference in 2 places for no reason.

Just pushed, collected all of that at the top of _get_values.
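
The reviewer's request, doing dtype inspection and any view/conversion in exactly one place, can be modelled loosely in pure Python. Everything here is an assumption for illustration: `extract_values_and_kind`, `wrap_result` and `nan_min` are hypothetical helpers, `None` stands in for a missing value, and datetimes are converted to integer microseconds rather than viewed as int64.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
ONE_US = timedelta(microseconds=1)


def extract_values_and_kind(values):
    """Return (plain values, kind, mask) so later steps never re-inspect the input."""
    mask = [v is None for v in values]
    present = [v for v in values if v is not None]
    if present and all(isinstance(v, datetime) for v in present):
        ints = [None if v is None else (v - EPOCH) // ONE_US for v in values]
        return ints, "datetime", mask
    return list(values), "number", mask


def wrap_result(result, kind):
    """Convert an integer result back to a datetime when needed."""
    if result is None:
        return None
    return EPOCH + result * ONE_US if kind == "datetime" else result


def nan_min(values, skipna=True):
    # The only place that looks at the element type is the helper call.
    data, kind, mask = extract_values_and_kind(values)
    if any(mask) and not skipna:
        return None
    present = [v for v, missing in zip(data, mask) if not missing]
    return wrap_result(min(present) if present else None, kind)


print(nan_min([datetime(2018, 1, 3), None, datetime(2018, 1, 1)]))  # 2018-01-01 00:00:00
```

With the conversion done once up front, each reduction body only ever sees plain integers plus a tag saying how to wrap the answer.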

jreback · 2018-12-27T21:51:18Z

pandas/core/base.py

+        Parameters
+        ----------
+        axis : {None}
+            Dummy argument for consistency with Series


not sure we use this terminology here

axis : {0}
    Axis for the function to be applied on.

I think @jbrockmendel is listing the set of possible values, which we do use. In this case, that set is just the single value None, so it looks strange.

I think

axis : int, optional
    For compatibility with NumPy. Only 0 or None are allowed.

is clearer.

@TomAugspurger's verbiage would be good.
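
Putting the suggested wording together with the axis validation seen earlier in the diff, a reduction on a one-dimensional object might look like the sketch below. `validate_minmax_axis` here is a hypothetical stand-in written for this example (the real validator lives elsewhere and may raise a different message), `DocIndex` is not a pandas class, and `None` stands in for a missing value.

```python
def validate_minmax_axis(axis):
    """Stand-in validator: a one-dimensional object only has axis 0."""
    if axis is not None and axis != 0:
        raise ValueError("axis must be None or 0 for a one-dimensional object")


class DocIndex:
    def __init__(self, values):
        self._values = list(values)

    def max(self, axis=None, skipna=True):
        """
        Return the maximum value of the Index.

        Parameters
        ----------
        axis : int, optional
            For compatibility with NumPy. Only 0 or None are allowed.
        skipna : bool, default True
            Exclude missing values when computing the result.

        Returns
        -------
        scalar
            Maximum value, or None if there is nothing to reduce.
        """
        validate_minmax_axis(axis)
        present = [v for v in self._values if v is not None]
        if len(present) != len(self._values) and not skipna:
            return None
        return max(present) if present else None


print(DocIndex([1, None, 3]).max(axis=0))  # 3
```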

jreback · 2018-12-27T21:52:32Z

pandas/core/base.py

+        axis : {None}
+            Dummy argument for consistency with Series
+        skipna : bool, default True
+
        See Also


We should be consistent about the See Also sections, e.g. make sure they are in all of these, and add a reference to the Series.min function as well (as appropriate).

jreback · 2018-12-27T21:57:01Z

pandas/core/indexes/datetimelike.py


+        i8 = self.asi8
+        try:
            # quick check
            if len(i8) and self.is_monotonic:


These quick checks make sense for Index, as they are immutable, but may not make much sense here (but I guess we can evaluate that later).
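
The "quick check" in this hunk relies on sorted data: when the integer timestamps are monotonic, the last element is the maximum unless it is the missing-value sentinel. A minimal sketch of that idea follows. It is an illustration, not the code in the diff: `INAT`, `is_monotonic_increasing` and `datetimelike_max` are hypothetical names, `None` is returned where the real method would return its NA value, and the `skipna=False` handling inside the fast path is this sketch's own choice.

```python
INAT = -2**63  # sentinel for a missing timestamp, mirroring iNaT in the diff


def is_monotonic_increasing(i8):
    # Recomputed on every call here; an immutable object could cache it.
    return all(a <= b for a, b in zip(i8, i8[1:]))


def datetimelike_max(i8, skipna=True):
    """Max over integer timestamps, with a fast path for sorted input."""
    if not len(i8):
        return None
    if is_monotonic_increasing(i8):
        # The sentinel is the smallest integer, so in sorted data any
        # missing values sit at the front and the max sits at the end.
        if i8[0] == INAT and not skipna:
            return None
        return i8[-1] if i8[-1] != INAT else None
    present = [v for v in i8 if v != INAT]
    if len(present) != len(i8) and not skipna:
        return None
    return max(present) if present else None


print(datetimelike_max([INAT, 10, 20]))  # 20
print(datetimelike_max([30, INAT, 10]))  # 30
```

The fast path only pays off when the monotonicity answer is already known, which is why caching it matters more than the check itself.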

pandas/core/nanops.py

jreback · 2018-12-28T14:21:11Z

Code looks good. If you can update the doc-strings, that would be good.

jbrockmendel · 2018-12-29T01:13:14Z

I think docstring comments have been addressed; bears double-checking

jreback · 2018-12-29T14:26:43Z

thanks @jbrockmendel

I think we need a whatsnew entry for #24265; if you can, add it in the next PR.

jreback · 2018-12-29T14:27:51Z

@datapythonista @jorisvandenbossche @TomAugspurger it's worth investigating if we can somehow share doc-strings between EA, Index and Series for some operations by templating.

…etime64 dtypes (pandas-dev#24293)

standardize signature for Index reductions, implement nanmean for dat…

f38ffe1

…etime64 dtypes

suppress warnings

04cf1f7

jreback requested changes Dec 15, 2018

jreback added Datetime Datetime data dtype API Design labels Dec 15, 2018

requested edits

3efab79

jreback requested changes Dec 15, 2018

implement view

a652439

jreback requested changes Dec 18, 2018

TomAugspurger reviewed Dec 19, 2018

jbrockmendel added 2 commits December 19, 2018 09:48

Merge branch 'master' of https://github.com/pandas-dev/pandas into re…

ce760d3

…duction3

pass skipna, tests

2380af6

jbrockmendel mentioned this pull request Dec 20, 2018

REF/TST: collect reduction tests #24367

Merged

jbrockmendel added 2 commits December 19, 2018 18:07

fixup isort

4157f0b

Merge branch 'master' of https://github.com/pandas-dev/pandas into re…

50642ae

…duction3

jreback requested changes Dec 23, 2018

jbrockmendel added 2 commits December 23, 2018 14:03

fixup rebase screwups

0baedf3

move len-self checks

6e0e69f

fixup more rebase screwups

e2c301b

jreback requested changes Dec 24, 2018

jbrockmendel added 2 commits December 24, 2018 15:14

Merge branch 'master' of https://github.com/pandas-dev/pandas into re…

fce67ac

…duction3

do values viewing in one place

4b4979f

jbrockmendel added 2 commits December 27, 2018 13:37

unpack Series

82b8cdf

Merge branch 'master' of https://github.com/pandas-dev/pandas into re…

ffe6ada

…duction3

jreback requested changes Dec 27, 2018

jbrockmendel added 3 commits December 27, 2018 16:41

Merge branch 'master' of https://github.com/pandas-dev/pandas into re…

7f7693f

…duction3

implement _extract_datetimelike_values_and_dtype

07c3102

fix mask

6c93410

jbrockmendel added 3 commits December 28, 2018 15:31

Merge branch 'master' of https://github.com/pandas-dev/pandas into re…

4620c56

…duction3

requested docstring edits, remoe duplicated view

4777b75

Merge branch 'master' of https://github.com/pandas-dev/pandas into re…

aa4028a

…duction3

jreback added this to the 0.24.0 milestone Dec 29, 2018

jreback approved these changes Dec 29, 2018

jreback merged commit fe696e4 into pandas-dev:master Dec 29, 2018

jbrockmendel deleted the reduction3 branch December 29, 2018 15:31

jorisvandenbossche mentioned this pull request Jan 24, 2019

Disable M8 in nanops #24907

Merged

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

standardize signature for Index reductions, implement nanmean for dat…

dcf8129

…etime64 dtypes (pandas-dev#24293)

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

standardize signature for Index reductions, implement nanmean for dat…

a96af41

…etime64 dtypes (pandas-dev#24293)
