New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Fix nansum overflow on Windows with bottleneck #15507

Closed
wants to merge 6 commits into
base: master
from

Conversation

Projects
None yet
5 participants
@JaviSorribes

JaviSorribes commented Feb 25, 2017

Bottleneck used in nansum produces overflow error on Windows. Temporary fix that could be undone once the error in bottleneck is fixed and released, to avoid platform specificity.

  • closes #15453
  • tests added / passed
  • passes git diff upstream/master | flake8 --diff
  • whatsnew entry
@codecov-io

This comment has been minimized.

Show comment
Hide comment
@codecov-io

codecov-io Feb 26, 2017

Codecov Report

Merging #15507 into master will decrease coverage by -0.07%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #15507      +/-   ##
==========================================
- Coverage    91.1%   91.03%   -0.07%     
==========================================
  Files         136      136              
  Lines       49102    49103       +1     
==========================================
- Hits        44736    44703      -33     
- Misses       4366     4400      +34
Impacted Files Coverage Δ
pandas/core/nanops.py 97.96% <100%> (-0.21%)
pandas/io/gbq.py 40% <0%> (-46.67%)
pandas/util/decorators.py 67.92% <0%> (-1.31%)
pandas/io/pytables.py 93.05% <0%> (-0.86%)
pandas/tools/merge.py 91.78% <0%> (-0.35%)
pandas/core/frame.py 97.74% <0%> (-0.1%)
pandas/core/series.py 94.93% <0%> (-0.04%)
pandas/core/algorithms.py 94.48% <0%> (ø)
pandas/compat/pickle_compat.py 68.29% <0%> (ø)
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 23889d3...7b95a88. Read the comment docs.

codecov-io commented Feb 26, 2017

Codecov Report

Merging #15507 into master will decrease coverage by -0.07%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #15507      +/-   ##
==========================================
- Coverage    91.1%   91.03%   -0.07%     
==========================================
  Files         136      136              
  Lines       49102    49103       +1     
==========================================
- Hits        44736    44703      -33     
- Misses       4366     4400      +34
Impacted Files Coverage Δ
pandas/core/nanops.py 97.96% <100%> (-0.21%)
pandas/io/gbq.py 40% <0%> (-46.67%)
pandas/util/decorators.py 67.92% <0%> (-1.31%)
pandas/io/pytables.py 93.05% <0%> (-0.86%)
pandas/tools/merge.py 91.78% <0%> (-0.35%)
pandas/core/frame.py 97.74% <0%> (-0.1%)
pandas/core/series.py 94.93% <0%> (-0.04%)
pandas/core/algorithms.py 94.48% <0%> (ø)
pandas/compat/pickle_compat.py 68.29% <0%> (ø)
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 23889d3...7b95a88. Read the comment docs.

@cpcloud cpcloud self-assigned this Feb 26, 2017

@cpcloud cpcloud requested a review from jreback Feb 26, 2017

@cpcloud cpcloud added this to the 0.20.0 milestone Feb 26, 2017

@jreback

The biggest issue here is this semantically changes things. IOW, now since we are always using pandas nansum (rather than bottleneck nansum if installed), we will always yield Series([np.nan]).sum() -> np.nan (rather than 0 if bottleneck is installed). So I think we actually have to change this for all platforms or then this becomes even more confusing.

Show outdated Hide outdated pandas/core/nanops.py Outdated
@@ -327,6 +327,12 @@ def test_nansum(self):
self.check_funs(nanops.nansum, np.sum, allow_str=False,
allow_date=False, allow_tdelta=True, check_dtype=False)
def test_nansum_overflow(self):
s = Series([2**31])

This comment has been minimized.

@jreback

jreback Feb 26, 2017

Contributor

needs testing both with and w/o bottleneck (to yield the same results), see other tests for how to do this.

@jreback

jreback Feb 26, 2017

Contributor

needs testing both with and w/o bottleneck (to yield the same results), see other tests for how to do this.

@jreback

This comment has been minimized.

Show comment
Hide comment
@jreback

jreback Feb 26, 2017

Contributor

@jorisvandenbossche @shoyer

w.r.t. #9422 I think we need to simply change this (in effect what this PR does), but for all platforms (and not just windows).

Contributor

jreback commented Feb 26, 2017

@jorisvandenbossche @shoyer

w.r.t. #9422 I think we need to simply change this (in effect what this PR does), but for all platforms (and not just windows).

@cpcloud cpcloud changed the title from Fix nansum overflow on Windows with bottleneck to BUG: Fix nansum overflow on Windows with bottleneck Feb 27, 2017

@cpcloud

This comment has been minimized.

Show comment
Hide comment
@cpcloud

cpcloud Feb 27, 2017

Member

@JaviSorribes Looking at @jreback's comments, it looks like we should just disable nansum bottleneck calls altogether.

Member

cpcloud commented Feb 27, 2017

@JaviSorribes Looking at @jreback's comments, it looks like we should just disable nansum bottleneck calls altogether.

@JaviSorribes

This comment has been minimized.

Show comment
Hide comment
@JaviSorribes

JaviSorribes Feb 28, 2017

So I understand @jreback and @cpcloud that we should not use bottleneck at all for nansum? Checkout the image attached for what I think this would look like.

Also, what about all the other nan operations? Should we consider possible overflow in any of them?

image

JaviSorribes commented Feb 28, 2017

So I understand @jreback and @cpcloud that we should not use bottleneck at all for nansum? Checkout the image attached for what I think this would look like.

Also, what about all the other nan operations? Should we consider possible overflow in any of them?

image

@jreback

This comment has been minimized.

Show comment
Hide comment
@jreback

jreback Mar 1, 2017

Contributor

xref #15542

Contributor

jreback commented Mar 1, 2017

xref #15542

@jorisvandenbossche

This comment has been minimized.

Show comment
Hide comment
@jorisvandenbossche

jorisvandenbossche Mar 1, 2017

Member

We indeed just need to take a decision on #9422, as we have left this lingering too long.
But, I am not necessarily convinced that we should change everything to the native pandas way (returning NaN), instead of changing pandas to return 0 to align with numpy/bottleneck (and also R, matlab, ..).
I don't think there was really agreement on this in #9422 (several have argued for both positions), but I also don't know how to best proceed on taking a decision.

Member

jorisvandenbossche commented Mar 1, 2017

We indeed just need to take a decision on #9422, as we have left this lingering too long.
But, I am not necessarily convinced that we should change everything to the native pandas way (returning NaN), instead of changing pandas to return 0 to align with numpy/bottleneck (and also R, matlab, ..).
I don't think there was really agreement on this in #9422 (several have argued for both positions), but I also don't know how to best proceed on taking a decision.

@jreback jreback modified the milestone: 0.20.0 Mar 23, 2017

@jreback

This comment has been minimized.

Show comment
Hide comment
@jreback

jreback May 7, 2017

Contributor

The fixes are related to fixing #9422 are a bit bigger in scope than this. That said If you want to rebase / add some additional tests could take this as a precurser.

Contributor

jreback commented May 7, 2017

The fixes are related to fixing #9422 are a bit bigger in scope than this. That said If you want to rebase / add some additional tests could take this as a precurser.

@jreback

This comment has been minimized.

Show comment
Hide comment
@jreback

jreback Aug 20, 2017

Contributor

closing this. as #9422 needs resolution.

Contributor

jreback commented Aug 20, 2017

closing this. as #9422 needs resolution.

@jreback jreback closed this Aug 20, 2017

jreback added a commit to jreback/pandas that referenced this pull request Sep 22, 2017

jreback added a commit to jreback/pandas that referenced this pull request Sep 22, 2017

jreback added a commit to jreback/pandas that referenced this pull request Sep 22, 2017

jreback added a commit to jreback/pandas that referenced this pull request Sep 23, 2017

jreback added a commit to jreback/pandas that referenced this pull request Sep 25, 2017

jreback added a commit to jreback/pandas that referenced this pull request Sep 28, 2017

jreback added a commit to jreback/pandas that referenced this pull request Oct 2, 2017

jreback added a commit to jreback/pandas that referenced this pull request Oct 6, 2017

jreback added a commit to jreback/pandas that referenced this pull request Oct 6, 2017

jreback added a commit to jreback/pandas that referenced this pull request Oct 9, 2017

jreback added a commit to jreback/pandas that referenced this pull request Oct 10, 2017

jreback added a commit that referenced this pull request Oct 10, 2017

kchomski-reef added a commit to reef-technologies/pandas that referenced this pull request Oct 16, 2017

alanbato added a commit to alanbato/pandas that referenced this pull request Nov 10, 2017

No-Stream added a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment