Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: SparseArray numeric ops misc fixes #12910

Closed
wants to merge 1 commit into from

Conversation

sinhrks
Copy link
Member

@sinhrks sinhrks commented Apr 17, 2016

  • no existing issue
  • tests added / passed
  • passes git diff upstream/master | flake8 --diff
  • whatsnew entry

Fixed following 3 issues occurred on the current master.

1. addition ignores rhs fill_value

pd.SparseArray([1., 1.]) + pd.SparseArray([1., 0.], fill_value=0.)
# [2.0, nan]
# Fill: nan
# IntIndex
# Indices: array([0], dtype=int32)

Expected:
# [2.0, 1.0]

2. mod raises AttributeError

pd.SparseArray([1, 1]) % pd.SparseArray([1, np.nan])
# AttributeError: 'module' object has no attribute 'sparse_nanmod'

3. pow outputs incorrect result wiht 1.0 ** np.nan

pd.SparseArray([1., 1.]) ** pd.SparseArray([1., np.nan])
# [1.0, nan]
# Fill: nan
# IntIndex
# Indices: array([0], dtype=int32)

Expected:
# [1.0, 1.0]

# NumPy result
np.array([1., 1.]) ** np.array([1, np.nan])
# array([ 1.,  1.])

@sinhrks sinhrks added Bug Numeric Operations Arithmetic, Comparison, and Logical operations Sparse Sparse Data Type labels Apr 17, 2016
@sinhrks sinhrks added this to the 0.18.1 milestone Apr 17, 2016
@sinhrks sinhrks changed the title BUG: SparseArray misc fixes BUG: SparseArray numeric ops misc fixes Apr 17, 2016
@@ -59,7 +59,12 @@ def wrapper(self, other):


def _sparse_array_op(left, right, op, name):
if np.isnan(left.fill_value):
if (np.isnan(left.fill_value) and np.isnan(right.fill_value) and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this begs the question of why we don't just always pass the fill values to the ufuncs (e.g. sparse_sub etc), which can then decide (based on np.isnan or (isnull better)) on left and/or rhs whether to use it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes removing nanop simplifies the logic. Because data is sparse, it shouldn't affect to performance in most cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you want to do that in this PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let me try.

@jreback
Copy link
Contributor

jreback commented Apr 17, 2016

whoosh you blew away lots of code. must have been work-arounds built in there for maybe an old cython or something. Just want to be sure (since tests were removed) everything still working (as may not be testing some of the numeric ops as much, though did see you added some tests)

@sinhrks
Copy link
Member Author

sinhrks commented Apr 17, 2016

The removed test calls _nan funcs which is no longer exists, and remaining logic is tested with _op_tests below. I'm willing to add more tests if you have anything.

@jreback
Copy link
Contributor

jreback commented Apr 18, 2016

thanks!

@jreback jreback closed this in 3cc4198 Apr 18, 2016
@sinhrks sinhrks deleted the sparse_ops branch April 18, 2016 18:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Numeric Operations Arithmetic, Comparison, and Logical operations Sparse Sparse Data Type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants