BUG: ufunc is not applied to sparse.fill_value #13853

Closed
wants to merge 1 commit into
from

Conversation

Projects
None yet
4 participants
Member

sinhrks commented Jul 30, 2016

  • tests added / passed
  • passes git diff upstream/master | flake8 --diff
  • whatsnew entry

When ufunc is applied to sparse, it is not applied to fill_value. Thus results are incorrect.

on current master:

np.abs(pd.SparseArray([1, -2, -1], fill_value=-2))
# [1.0, -2, 1.0]
# Fill: -2
# IntIndex
# Indices: array([0, 2], dtype=int32)

np.add(pd.SparseArray([1, -2, -1], fill_value=-2), 1)
# [2.0, -2, 0.0]
# Fill: -2
# IntIndex
# Indices: array([0, 2], dtype=int32)

cc @gfyoung

sinhrks added this to the 0.19.0 milestone Jul 30, 2016

codecov-io commented Jul 30, 2016 edited

Current coverage is 85.27% (diff: 93.33%)

Merging #13853 into master will decrease coverage by <.01%

@@             master     #13853   diff @@
==========================================
  Files           139        139          
  Lines         50020      50031    +11   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          42657      42666     +9   
- Misses         7363       7365     +2   
  Partials          0          0          

Powered by Codecov. Last update 97de42a...a14f573

@gfyoung gfyoung commented on the diff Jul 31, 2016

pandas/sparse/tests/test_array.py
@@ -829,6 +829,51 @@ def test_numpy_mean(self):
tm.assertRaisesRegexp(ValueError, msg, np.mean,
SparseArray(data), out=out)
+ def test_abs(self):
+ sparse = SparseArray([1, np.nan, 2, np.nan, -2])
+ result = SparseArray([1, np.nan, 2, np.nan, 2])
+ tm.assert_sp_array_equal(abs(sparse), result)
+ tm.assert_sp_array_equal(np.abs(sparse), result)
+
+ sparse = SparseArray([1, -1, 2, -2], fill_value=1)
+ result = SparseArray([1, 2, 2], sparse_index=sparse.sp_index,
@gfyoung

gfyoung Jul 31, 2016

Member

Can you explain why this result is correct?

@sinhrks

sinhrks Jul 31, 2016

Member

Because assert_sp_array_equal compares sparse internal representation, it is for prepare correct internal repr. You can see the result is correct from its dense repr.

# test case
sparse = pd.SparseArray([1, -1, 2, -2], fill_value=1)
abs(sparse).to_dense()
# array([ 1.,  1.,  2.,  2.])
# result
pd.SparseArray([1, 2, 2], sparse_index=sparse.sp_index, fill_value=1).to_dense()
# array([ 1.,  1.,  2.,  2.])
@gfyoung

gfyoung Aug 1, 2016

Member

Okay, good to know. I wasn't 100% clear on how the sparse comparison worked. Thanks!

@gfyoung gfyoung commented on an outdated diff Jul 31, 2016

pandas/sparse/tests/test_array.py
@@ -829,6 +829,51 @@ def test_numpy_mean(self):
tm.assertRaisesRegexp(ValueError, msg, np.mean,
SparseArray(data), out=out)
+ def test_abs(self):
@gfyoung

gfyoung Jul 31, 2016 edited

Member
  1. For sanity purposes, include a test where sparse has just positive numbers and a positive fill_value.

  2. Brief comment about why we have this test would be useful since there is no associated bug with this PR.

@gfyoung gfyoung commented on an outdated diff Jul 31, 2016

pandas/sparse/tests/test_array.py
+ tm.assert_sp_array_equal(abs(sparse), result)
+ tm.assert_sp_array_equal(np.abs(sparse), result)
+
+ sparse = SparseArray([1, -1, 2, -2], fill_value=1)
+ result = SparseArray([1, 2, 2], sparse_index=sparse.sp_index,
+ fill_value=1)
+ tm.assert_sp_array_equal(abs(sparse), result)
+ tm.assert_sp_array_equal(np.abs(sparse), result)
+
+ sparse = SparseArray([1, -1, 2, -2], fill_value=-1)
+ result = SparseArray([1, 2, 2], sparse_index=sparse.sp_index,
+ fill_value=1)
+ tm.assert_sp_array_equal(abs(sparse), result)
+ tm.assert_sp_array_equal(np.abs(sparse), result)
+
+ def test_ufunc1(self):
@gfyoung

gfyoung Jul 31, 2016 edited

Member
  1. For org purposes, I might prefer combining both tests just because we want to make it clear that we are testing ufunc in general and not specifically sin and add.

  2. Brief comment about why we have this test would be useful since there is no associated bug with this PR.

Member

gfyoung commented Aug 1, 2016

LGTM

cc @jreback

@jreback jreback commented on an outdated diff Aug 1, 2016

pandas/sparse/array.py
@@ -213,6 +213,17 @@ def kind(self):
elif isinstance(self.sp_index, IntIndex):
return 'integer'
+ def __array_wrap__(self, out_arr, context=None):
+ if isinstance(context, tuple) and len(context) == 3:
@jreback

jreback Aug 1, 2016

Contributor

can you put a comment here of what this is doing

Contributor

jreback commented Aug 1, 2016

I understand why this is needed, but it feels a tad unnatural. I am not sure a user will be expecting that the fill value will have the ufunc be applied here. Can we add a section to the docs showing this?

Member

sinhrks commented Aug 1, 2016

Sure, added small section.

@sinhrks sinhrks BUG: ufunc is not applied to sparse.fill_value
a14f573

jreback closed this in 5f47608 Aug 3, 2016

sinhrks deleted the sinhrks:sparse_ufunc branch Aug 3, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment