Unable to use Series created with reindex_like with numpy.logical_and() #2388

Closed
durden opened this Issue · 12 comments

@durden

I'm using numpy 1.6.2, pandas 0.9.1, and Python 2.7.2. I see strange behavior when using numpy.logical_and() depending on how I create a Series object. For example:

>>> import numpy
>>> import pandas
>>> series = pandas.Series([1, 2, 3])
>>> x = pandas.Series([True]).reindex_like(series).fillna(True)
>>> y = pandas.Series(True, index=series.index)
>>> series
0    1
1    2
2    3
>>> x
0    True
1    True
2    True
>>> y
0    True
1    True
2    True
>>> numpy.logical_and(series, y)
0    True
1    True
2    True
>>> numpy.logical_and(series, x)
Traceback (most recent call last):
  File "<ipython-input-10-e2050a2015bf>", line 1, in <module>
    numpy.logical_and(series, x)
AttributeError: logical_and

What is the difference between x and y here that is causing the AttributeError?

Also, I originally posted this as a question on Stack Overflow. Comments there say this works with pandas 0.9.0 and numpy 1.8, but I haven't verified that myself yet. My scenario uses the most recent stable releases of both projects.

@changhiskhan
Collaborator

x is object dtype and y is bool dtype. This works if you do numpy.logical_and(series, x.astype(bool)).

reindex_like is going to introduce a bunch of NaNs, so that's going to convert the Series to object dtype.

Maybe we should call maybe_convert_objects at the end of fillna?
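For anyone landing here later, the workaround above can be sketched as follows. This is a minimal illustration written against a recent pandas and NumPy, not the 0.9.1 setup from the report, and the variable names are only for the example. The explicit astype(bool) is kept because whether fillna changes the dtype on its own appears to depend on the pandas version.

```python
import numpy as np
import pandas as pd

series = pd.Series([1, 2, 3])

# Reindexing a one-element boolean Series adds NaN for the new labels,
# so the values can no longer be held in a plain bool array.
reindexed = pd.Series([True]).reindex_like(series)
print(reindexed.dtype)  # object

# Fill the gaps, then cast explicitly so the dtype is bool no matter
# what fillna does to the dtype in the installed pandas version.
x = reindexed.fillna(True).astype(bool)
print(x.dtype)  # bool

result = np.logical_and(series, x)
print(result.tolist())  # [True, True, True]
```

Checking .dtype before handing a Series to a NumPy ufunc is a cheap way to catch this kind of surprise early.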

@durden

Did something change in 0.9.1 to cause this? I've been told that the snippet above works in 0.9.0, but I don't know how to get that exact version to test for myself.

As a side note, does a Series automatically change dtype any time it contains NaN? And will calling fillna() on a Series implicitly convert the type?

@changhiskhan
Collaborator

Right now fillna does NOT convert the type, but reindex_like does. Because NaN is a float, after reindex_like the Series becomes mixed type, so it gets converted to object dtype.
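The promotion described above can be observed directly. The snippet below is a small sketch against a recent pandas with its default NumPy-backed dtypes. It uses reindex with an explicit label list rather than reindex_like, which should introduce NaN in the same way. An integer Series can absorb NaN by becoming float64, while a boolean Series has no such fallback and ends up as object.

```python
import numpy as np
import pandas as pd

# NaN is an ordinary Python float.
print(type(np.nan))  # <class 'float'>

# Label 2 is missing from both Series, so reindexing introduces NaN there.
ints = pd.Series([1, 2]).reindex([0, 1, 2])
bools = pd.Series([True, False]).reindex([0, 1, 2])

print(ints.dtype)   # float64
print(bools.dtype)  # object
```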

@wesm
Owner

To be clear: this is a wart due to pandas's "best efforts" implementation of missing data using NumPy. I would expect the same code to fail on 0.9.0.

@wesm
Owner

Is there a reason why series & y is not an option? That should work

@durden

Oh, I see what you mean. So is this technically a bug then?

@durden

The reason I'm not using series & y is because I'm actually taking a dictionary of operations to perform and combining them myself. I might be approaching the problem in the wrong way (I'm trying to replace legacy custom code with some Pandas functionality).

You can find more information about my exact situation on this stackoverflow post.

@durden

Maybe this subtle issue should be mentioned in the docs for reindex_like()?

Also, how can I get 0.9.0 to test this? The person responding on my Stack Overflow post claimed this worked with pandas 0.9.0 AND numpy 1.8. I'm not sure what differs between numpy 1.8 and 1.6.2, so the numpy version may be what matters, and this might fail in pandas 0.9.0 as well.

@wesm
Owner

Shouldn't you use operator.and_ instead of numpy.logical_and which goes through NumPy's ufunc machinery (and fails)?

http://pypi.python.org/pypi/pandas/0.9.0
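A rough sketch of the operator.and_ approach for combining a dictionary of conditions is below. The conditions dictionary and its keys are hypothetical stand-ins for the real operations, and the masks here are plain boolean Series. On a recent pandas, & between an integer Series and a boolean one is a bitwise operation rather than a truth test, so each mask is cast to bool before combining.

```python
import operator
from functools import reduce

import pandas as pd

series = pd.Series([1, 2, 3])

# Hypothetical named conditions, each producing a boolean mask over `series`.
conditions = {
    "positive": series > 0,
    "below_three": series < 3,
}

# Cast defensively so a mask that picked up object dtype (for example after
# reindexing and filling) is still combined as a real boolean array.
masks = [mask.astype(bool) for mask in conditions.values()]

# Fold the masks together pairwise with the & operator.
combined = reduce(operator.and_, masks)
print(series[combined].tolist())  # [1, 2]
```

Swapping operator.and_ for operator.or_ in the reduce call would give "any condition" semantics instead of "all conditions".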

@durden

Yes, I can use operator.and_. I was only using numpy.logical_and because I assumed the numpy version would be faster. Maybe that is not necessarily the case?

@durden

Should I close this? I guess it's not actually a bug, just a subtle side effect of reindex_like(). Maybe it should just be noted in the docs and closed?

@wesm
Owner

Yeah, let's close the issue. If you get energetic and want to add a caveat in the docs about using ufuncs on boolean arrays that have had missing data, go for it. Maybe on the gotchas page.

@wesm wesm closed this