Unable to use Series created with reindex_like with numpy.logical_and() #2388

Closed
durden opened this Issue Nov 29, 2012 · 12 comments

Contributor

durden commented Nov 29, 2012

I'm using numpy 1.6.2, pandas 0.9.1, and Python 2.7.2. I see strange behavior when using numpy.logical_and() depending on how I create a Series object. For example:

>>> import numpy
>>> import pandas
>>> series = pandas.Series([1, 2, 3])
>>> x = pandas.Series([True]).reindex_like(series).fillna(True)
>>> y = pandas.Series(True, index=series.index)
>>> series
0    1
1    2
2    3
>>> x
0    True
1    True
2    True
>>> y
0    True
1    True
2    True
>>> numpy.logical_and(series, y)
0    True
1    True
2    True
>>> numpy.logical_and(series, x)
Traceback (most recent call last):
  File "<ipython-input-10-e2050a2015bf>", line 1, in <module>
    numpy.logical_and(series, x)
AttributeError: logical_and

What is the difference between x and y here that is causing the AttributeError?

Also, I originally posted this as a question on Stack Overflow. Comments there say this works with pandas 0.9.0 and numpy 1.8, but I haven't verified that myself. My scenario, however, uses the most recent stable releases of both projects.

Contributor

changhiskhan commented Nov 29, 2012

x is object dtype and y is bool dtype. This works if you do numpy.logical_and(series, x.astype(bool)).

reindex_like is going to introduce a bunch of NaNs, so that's going to convert the Series to object dtype (it can no longer be bool).

Maybe we should call maybe_convert_objects at the end of fillna?
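A minimal sketch of the dtype difference and the astype(bool) workaround mentioned above. It assumes a recent pandas and NumPy, and it builds the object-dtype Series by hand (rather than through reindex_like and fillna) because the dtype those calls return has varied between pandas versions.

```python
import numpy as np
import pandas as pd

series = pd.Series([1, 2, 3])

# Build the object-dtype Series explicitly so the example does not depend
# on how a particular pandas version handles dtypes in reindex_like/fillna.
x = pd.Series([True, True, True], index=series.index, dtype=object)
y = pd.Series(True, index=series.index)

print(x.dtype)  # object
print(y.dtype)  # bool

# Cast to a real bool dtype before handing the data to the NumPy ufunc.
result = np.logical_and(series, x.astype(bool))
print(result.tolist())  # [True, True, True]
```

Checking x.dtype before calling a ufunc is a cheap way to spot this situation, since x and y print identically.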

Contributor

durden commented Nov 29, 2012

Did something change in 0.9.1 to cause this? I've been told that the snippet above works in 0.9.0. I don't know how to get that exact version to test for myself, though.

As a side note, does a Series automatically change dtype any time it contains NaN? And will calling fillna() on a Series implicitly convert the type?

Contributor

changhiskhan commented Nov 29, 2012

Right now fillna does NOT convert the type, but reindex_like does. Because NaN is a float, the Series becomes mixed-type after reindex_like, so it gets converted to object dtype.
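The promotion described above can be seen directly. This is a small sketch assuming a recent pandas; the variable names reindexed and ints are only illustrative.

```python
import numpy as np
import pandas as pd

series = pd.Series([1, 2, 3])

# NaN is a float, so it cannot be stored in a bool array.
print(type(np.nan))  # <class 'float'>

# Reindexing a one-element bool Series onto a three-element index leaves
# two labels without a value; they are filled with NaN and the dtype
# falls back to object.
reindexed = pd.Series([True]).reindex_like(series)
print(reindexed.dtype)         # object
print(reindexed.isna().sum())  # 2

# For comparison, an integer Series is promoted to float64 instead,
# because float64 can represent NaN natively.
ints = pd.Series([1]).reindex_like(series)
print(ints.dtype)  # float64
```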

Member

wesm commented Nov 29, 2012

To be clear: this is a wart due to pandas's "best efforts" implementation of missing data using NumPy. I would expect the same code to fail on 0.9.0

Member

wesm commented Nov 29, 2012

Is there a reason why series & y is not an option? That should work

Contributor

durden commented Nov 29, 2012

Oh, I see what you mean. So is this technically a bug then?

Contributor

durden commented Nov 29, 2012

The reason I'm not using series & y is that I'm actually taking a dictionary of operations to perform and combining them myself. I might be approaching the problem in the wrong way (I'm trying to replace legacy custom code with some pandas functionality).

You can find more information about my exact situation in this Stack Overflow post.

Contributor

durden commented Nov 29, 2012

Maybe this subtle issue should be mentioned in the docs for reindex_like()?

Also, how can I get 0.9.0 and test this? The person responding on my Stack Overflow post claimed this worked with pandas 0.9.0 AND numpy 1.8. I'm not sure what changed in numpy between 1.6.2 and 1.8, so it might not be 'broken' in pandas 0.9.0.

Member

wesm commented Nov 29, 2012

Shouldn't you use operator.and_ instead of numpy.logical_and, which goes through NumPy's ufunc machinery (and fails)?

You can get 0.9.0 from PyPI: http://pypi.python.org/pypi/pandas/0.9.0
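One way the operator.and_ suggestion might look when the conditions arrive in a dictionary, as described earlier in the thread. This is only a sketch: the conditions dictionary and its keys are hypothetical, and every value is assumed to be a bool-dtype Series on the same index.

```python
import operator
from functools import reduce

import pandas as pd

series = pd.Series([1, 2, 3])

# Hypothetical set of named conditions; each value is a bool-dtype Series.
conditions = {
    "positive": series > 0,
    "below_three": series < 3,
}

# operator.and_ is the function form of `&`, so reduce() chains
# cond1 & cond2 & ... using pandas' own operator instead of a NumPy ufunc.
combined = reduce(operator.and_, conditions.values())
print(combined.tolist())  # [True, True, False]
```

If any value might be object dtype (for example after a reindex that introduced missing data), casting it with astype(bool) after filling the gaps keeps the combination predictable.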

Contributor

durden commented Nov 29, 2012

Yes, I can use operator.and_. I was only using numpy.logical_and because I assumed that the numpy version would be faster and possibly more efficient. Maybe this is not necessarily the case?

Contributor

durden commented Nov 30, 2012

Should I close this? I guess it's not actually a bug, just a subtle side effect of reindex_like(). Maybe it should just be noted in the docs and closed?

Member

wesm commented Dec 2, 2012

Yeah, let's close the issue. If you get energetic and want to add a caveat in the docs about using ufuncs on boolean arrays that have had missing data, go for it. Maybe on the gotchas page.

wesm closed this Dec 2, 2012
