
ENH-19629: Adding numpy nansum/nanmean, etc. to _cython_table #19670

Conversation

@AaronCritchley (Contributor) commented Feb 13, 2018

As per the issue, here's proof of solution:

```
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 12:04:33)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> pd.Series([1, 2, 3, 4]).agg(np.sum)
10
>>> pd.Series([1, 2, 3, 4]).agg(np.nansum)
10
```

Where should I add tests for these changes? I wasn't sure on the best fit and how to most effectively test. Also, is a whatsnew entry needed here? If yes, any guidance on what it should be?
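For readers following along, the change itself amounts to extending the lookup table that maps numpy callables to pandas' internal op names. A minimal sketch of the idea, with illustrative entries and a hypothetical dispatch helper (the real table is `SelectionMixin._cython_table` in `pandas/core/base.py`, and this is not the PR's exact code):

```python
import numpy as np
import pandas as pd

# Illustrative sketch of the lookup this PR extends; entry names are
# examples, not the full mapping from pandas/core/base.py.
_cython_table = {
    np.sum: "sum",
    np.nansum: "sum",      # NaN-aware variant maps to the same internal op
    np.mean: "mean",
    np.nanmean: "mean",
}

def agg_via_table(obj, func):
    # Translate a known numpy callable into the pandas method of the
    # same name; otherwise fall back to applying the callable directly.
    name = _cython_table.get(func)
    if name is not None:
        return getattr(obj, name)()
    return func(obj)
```

With a mapping like this, `agg_via_table(pd.Series([1, 2, 3, 4]), np.nansum)` dispatches to `Series.sum` and returns the scalar `10`, matching the REPL session above.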

@TomAugspurger (Contributor)

Best is probably `pandas/tests/test_nanops.py`.

A whatsnew entry under bug fixes would be good, under the "Numeric" bug fixes section, saying that

```rst
:meth:`~DataFrame.agg` now correctly handles numpy NaN-aware methods like :meth:`numpy.nansum` (:issue:`19629`)
```

@TomAugspurger TomAugspurger added Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Numeric Operations Arithmetic, Comparison, and Logical operations labels Feb 13, 2018
@TomAugspurger TomAugspurger added this to the 0.23.0 milestone Feb 13, 2018
@pep8speaks commented Feb 13, 2018

Hello @AaronCritchley! Thanks for updating the PR.

Cheers! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on April 25, 2018 at 11:23 Hours UTC

@codecov bot commented Feb 13, 2018

Codecov Report

Merging #19670 into master will increase coverage by 0.06%.
The diff coverage is 100%.


```diff
@@            Coverage Diff             @@
##           master   #19670      +/-   ##
==========================================
+ Coverage   91.77%   91.84%   +0.06%
==========================================
  Files         153      153
  Lines       49257    49300      +43
==========================================
+ Hits        45207    45279      +72
+ Misses       4050     4021      -29
```

| Flag | Coverage Δ |
| --- | --- |
| #multiple | 90.23% <100%> (+0.06%) ⬆️ |
| #single | 41.9% <100%> (+0.02%) ⬆️ |

| Impacted Files | Coverage Δ |
| --- | --- |
| pandas/core/base.py | 96.83% <100%> (+0.03%) ⬆️ |
| pandas/util/testing.py | 84.59% <0%> (-0.21%) ⬇️ |
| pandas/core/internals.py | 95.53% <0%> (-0.05%) ⬇️ |
| pandas/core/indexes/datetimes.py | 95.73% <0%> (-0.04%) ⬇️ |
| pandas/core/accessor.py | 98.7% <0%> (-0.02%) ⬇️ |
| pandas/core/dtypes/concat.py | 99.16% <0%> (-0.02%) ⬇️ |
| pandas/core/strings.py | 98.32% <0%> (-0.02%) ⬇️ |
| pandas/core/indexes/multi.py | 95.06% <0%> (-0.02%) ⬇️ |
| pandas/core/window.py | 96.29% <0%> (-0.01%) ⬇️ |
| pandas/core/frame.py | 97.16% <0%> (-0.01%) ⬇️ |
| ... and 13 more | |

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7ec74e5...c9ddaff. Read the comment docs.

@AaronCritchley (Contributor, Author)

Hey @TomAugspurger, thanks so much for your help. Are the tests I've put in OK, or would you rather I be more exhaustive?

I considered adding some tests with np.nan values, but for some functions, like np.nancumsum, NaNs change the output, and I thought it would be messy to add np.nan tests for some functions but not all.
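For context, once NaNs are present the standard and NaN-aware cumulative functions genuinely disagree:

```python
import numpy as np

arr = np.array([1.0, np.nan, 3.0])

# np.cumsum propagates the NaN through the rest of the result...
print(np.cumsum(arr))      # [ 1. nan nan]
# ...while np.nancumsum treats the NaN as zero and keeps accumulating.
print(np.nancumsum(arr))   # [1. 1. 4.]
```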

```python
@@ -1004,6 +1004,95 @@ def prng(self):
        return np.random.RandomState(1234)


class TestNumpyNaNFunctions(object):
```
Contributor:

this can be a single function that is parameterized over all of the methods
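A sketch of that shape, assuming illustrative data and only a few of the PR's function pairs:

```python
import numpy as np
import pandas as pd
import pytest

# Sketch of the suggested single parametrized test; the pairs below are
# examples, not the full list from the PR.
@pytest.mark.parametrize("standard, nan_method", [
    (np.sum, np.nansum),
    (np.mean, np.nanmean),
    (np.max, np.nanmax),
])
def test_np_nan_functions(standard, nan_method):
    # On NaN-free data, the plain and NaN-aware variants must agree
    # once both go through .agg().
    data = pd.Series([1, 2, 3, 4])
    assert data.agg(nan_method) == data.agg(standard)
```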

Contributor Author:

Sure, will do this soon. Could you help me work out why the build is failing?
From looking at CircleCI, it seems it's not recognizing np.nanprod as valid; do I need to remove the nanprod case for compat or something? Sorry if I'm being dense here.

Contributor:

you might need to skip for older versions of numpy

not sure when certain ones were added

@jreback jreback removed this from the 0.23.0 milestone Feb 15, 2018
@topper-123 (Contributor) commented Feb 19, 2018

np.nanprod was added in numpy 1.10, while pandas supports numpy 1.9.

So in the parametrization you need to add a `skipif` mark for the `np.nanprod` case. See here for an example.
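In the parametrize list that could look something like the following sketch. The version check here is a local stand-in (pandas has its own private flag for this), and the pairs are illustrative:

```python
import numpy as np
import pytest

# Local stand-in for the numpy version check; parse the major/minor
# parts of np.__version__ rather than relying on pandas internals.
np_under_1p10 = tuple(int(p) for p in np.__version__.split(".")[:2]) < (1, 10)

# Only the nanprod pair gets skipped on numpy < 1.10; the rest always run.
nan_pairs = [
    (np.sum, np.nansum),
    (np.mean, np.nanmean),
    pytest.param(np.prod, np.nanprod,
                 marks=pytest.mark.skipif(
                     np_under_1p10,
                     reason="np.nanprod was added in numpy 1.10")),
]
```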

```python
}

# np.nanprod was added in np version 1.10.0, we currently support >= 1.9
try:
```
Contributor Author:

If you have a preferred implementation for this, let me know and I'll happily change it; explicitly checking the version seemed ugly 😄

Contributor:

you can just check `_np_version_under1p10` and add it conditionally
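i.e. build the pair list and append the nanprod case only when the running numpy has it. A sketch, stubbing pandas' private `_np_version_under1p10` flag with a local check:

```python
import numpy as np

# Stand-in for pandas' private compat flag (_np_version_under1p10);
# compare the major/minor parts of the numpy version against (1, 10).
_np_version_under1p10 = tuple(
    int(p) for p in np.__version__.split(".")[:2]) < (1, 10)

# Base parametrization list; pairs are illustrative.
nan_pairs = [
    (np.sum, np.nansum),
    (np.mean, np.nanmean),
]
if not _np_version_under1p10:
    # np.nanprod only exists on numpy >= 1.10
    nan_pairs.append((np.prod, np.nanprod))
```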

Contributor Author:

Awesome, didn't know this was a thing, thank you

@AaronCritchley (Contributor, Author) commented Mar 2, 2018

@topper-123 thank you so much. I ended up changing the implementation rather than just xfailing the test, as the failure in the build was being hit in a non-test scenario. If you have better suggestions, I'm happy to take them on!

```python
    (np.cumsum, np.nancumsum)
]

def test_np_nan_functions(self):
```
Contributor:

parametrize these

Contributor Author:

My bad, didn't realise you meant `pytest.mark.parametrize`; implemented your suggestion now 😄

```python
            data.agg(nan_method),
            check_exact=True)

@pytest.mark.parametrize("standard, nan_method", [
```
Contributor:

you can parametrize over the DataFrame and Series (or make them into fixtures),

then inline the compare function

Contributor Author:

I'm not sure of a good way to handle the np compat tests if we took this approach; I'd be open to suggestions, though.

Contributor:

```python
@pytest.fixture(params=[Series(....), DataFrame()])
def obj(request):
    return request.param
```
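Fleshed out with illustrative data (the values and the comparison fallback are assumptions, not the PR's exact code), that fixture-based shape might look like:

```python
import numpy as np
import pandas as pd
import pytest

@pytest.fixture(params=["series", "frame"])
def nan_test_object(request):
    # Parametrize the object under test over a Series and a DataFrame.
    if request.param == "series":
        return pd.Series([1.0, 2.0, 3.0, 4.0])
    return pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0]})

@pytest.mark.parametrize("standard, nan_method", [(np.sum, np.nansum)])
def test_np_nan_functions(standard, nan_method, nan_test_object):
    # Inline the comparison instead of a helper: .agg on a Series gives
    # a scalar, while on a DataFrame it gives a per-column Series.
    result = nan_test_object.agg(nan_method)
    expected = nan_test_object.agg(standard)
    if isinstance(expected, pd.Series):
        pd.testing.assert_series_equal(result, expected)
    else:
        assert result == expected
```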

@jreback (Contributor) commented Mar 7, 2018

can you rebase

@TomAugspurger (Contributor)

@AaronCritchley are these failures related to your changes? https://travis-ci.org/pandas-dev/pandas/jobs/351535340#L2351

@AaronCritchley (Contributor, Author)

Hey @TomAugspurger, yep, I believe they are. I need to figure out the issue, as everything passes locally. Open to suggestions if it's super obvious to you, but happy to dive further if not. It seems like np.nanmax behaves differently in that particular CI configuration.

Also trying to get in the suggestions made by Jeff around using fixtures for the series / df once I've figured out the above 😄

@AaronCritchley (Contributor, Author)

Refactored to make use of fixtures and rebased, still need to dig into the 3.5 build failure and figure out what's happening, help appreciated 😄

```python
])
def test_np_nan_functions(standard, nan_method, nan_test_object):
    _compare_nan_method_output(nan_test_object, standard, nan_method)
    _compare_nan_method_output(nan_test_object, standard, nan_method)
```
Contributor:

Are these two lines identical?

I think it'd be clearer to just write out `tm.assert_almost_equal(nan_test_object.agg(standard), nan_test_object.agg(nan_method))`.

Contributor Author:

Yep, this was me being silly, fixed!

@jreback (Contributor) commented Mar 22, 2018

looks fine. ping on green.

@jreback (Contributor) commented Apr 11, 2018

can you rebase

@jreback (Contributor) commented Apr 21, 2018

can you rebase

@AaronCritchley (Contributor, Author) commented Apr 25, 2018

Hey @jreback, I've rebased. The CI pipeline is failing a single test in a single env; I'm looking into it, but was having a hard time recreating the failed test. If it's anything obvious, help would be much appreciated. I'll continue to dig and try to figure it out.

EDIT: I can recreate the test now, which is good, but still haven't figured out the cause, will update if I find anything.

@AaronCritchley (Contributor, Author)

Going to close this in case any other contributors are able to pick it up; if it doesn't get picked up, I'll try to fix it again at a later date 😄


Successfully merging this pull request may close these issues.

series.agg(np.nansum) etc. returns a series, scalar expected
5 participants