Unable to use Series created with reindex_like with numpy.logical_and() #2388

Closed
durden opened this issue Nov 29, 2012 · 12 comments

@durden
Contributor

durden commented Nov 29, 2012

I'm using numpy 1.6.2, pandas 0.9.1, and Python 2.7.2. I see strange behavior when using numpy.logical_and() depending on how I create a Series object. For example:

>>> import numpy
>>> import pandas
>>> series = pandas.Series([1, 2, 3])
>>> x = pandas.Series([True]).reindex_like(series).fillna(True)
>>> y = pandas.Series(True, index=series.index)
>>> series
0    1
1    2
2    3
>>> x
0    True
1    True
2    True
>>> y
0    True
1    True
2    True
>>> numpy.logical_and(series, y)
0    True
1    True
2    True
>>> numpy.logical_and(series, x)
Traceback (most recent call last):
  File "<ipython-input-10-e2050a2015bf>", line 1, in <module>
    numpy.logical_and(series, x)
AttributeError: logical_and

What is the difference between x and y here that is causing the AttributeError?

Also, I originally posted this as a question on stackoverflow. There are comments saying this works with pandas 0.9.0 and numpy 1.8. I haven't verified this for myself yet. However, my scenario is using the most recent stable releases of both projects.

@changhiskhan
Contributor

x is object dtype and y is bool dtype. This works if you do numpy.logical_and(series, x.astype(bool)).

reindex_like is going to introduce a bunch of NaNs, so that's going to convert the Series to object dtype.

Maybe we should call maybe_convert_objects at the end of fillna?
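
A minimal sketch of the dtype difference and the astype(bool) workaround described above. The object-dtype Series is built explicitly here as a stand-in for the result of reindex_like(...).fillna(True) in the report, because whether fillna leaves the result as object depends on the pandas version. That substitution is an assumption of this sketch, not something taken from the thread.

```python
import numpy as np
import pandas as pd

series = pd.Series([1, 2, 3])

# Stand-in for x: all values are True, but they are stored as Python objects.
x = pd.Series([True, True, True], dtype=object)
# y is built directly from a scalar, so it gets a real bool dtype.
y = pd.Series(True, index=series.index)

print(x.dtype, y.dtype)

# Casting x to bool before calling the ufunc sidesteps the object dtype.
result = np.logical_and(series, x.astype(bool))
print(result)
```

Printing the dtype of both operands is usually the quickest way to spot this kind of mismatch, since x and y display identically.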

@durden
Contributor Author

durden commented Nov 29, 2012

Did something change in 0.9.1 to cause this? I've been told that the snippet above works in 0.9.0. I don't know how to get that exact version to test for myself, though.

As a side note, any time a Series has NaN it is automatically a bool dtype? So, anytime I call fillna() on a Series it will implicitly convert the type?

@changhiskhan
Contributor

Right now fillna does NOT convert the type, but reindex_like does. Because NaN is a float, the Series becomes mixed type after reindex_like, so it gets converted to object dtype.
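
A short sketch of the conversion described above, assuming a reasonably recent pandas. It uses reindex with the target index rather than reindex_like, and the fill_value argument is an alternative that is not mentioned in the thread. The name flags is made up for illustration.

```python
import pandas as pd

series = pd.Series([1, 2, 3])
flags = pd.Series([True])

# Labels 1 and 2 have no value in flags, so reindexing fills them with NaN.
# A bool array cannot hold NaN, so the result falls back to object dtype.
reindexed = flags.reindex(series.index)
print(reindexed.dtype)

# Passing fill_value skips the NaN step, so the bool dtype is kept.
filled = flags.reindex(series.index, fill_value=True)
print(filled.dtype)
```

When the fill value is known up front, supplying it during the reindex avoids the separate fillna call and the dtype change that comes with it.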

@wesm
Member

wesm commented Nov 29, 2012

To be clear: this is a wart due to pandas's "best efforts" implementation of missing data using NumPy. I would expect the same code to fail on 0.9.0.

@wesm
Member

wesm commented Nov 29, 2012

Is there a reason why series & y is not an option? That should work.

@durden
Contributor Author

durden commented Nov 29, 2012

Oh, I see what you mean. So is this technically a bug then?

@durden
Contributor Author

durden commented Nov 29, 2012

The reason I'm not using series & y is because I'm actually taking a dictionary of operations to perform and combining them myself. I might be approaching the problem in the wrong way (I'm trying to replace legacy custom code with some Pandas functionality).

You can find more information about my exact situation on this stackoverflow post.

@durden
Contributor Author

durden commented Nov 29, 2012

Maybe this subtle issue should be mentioned in the docs for reindex_like()?

Also, how can I get 0.9.0 and test this? The person responding on my stackoverflow post claimed this worked with pandas 0.9.0 AND numpy 1.8. I'm not sure what differs between numpy 1.8 and 1.6.2, so the difference might be in numpy and this might not be 'broken' in pandas 0.9.0 at all.

@wesm
Member

wesm commented Nov 29, 2012

Shouldn't you use operator.and_ instead of numpy.logical_and, which goes through NumPy's ufunc machinery (and fails)?

http://pypi.python.org/pypi/pandas/0.9.0
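
One way the operator.and_ suggestion might look when the conditions come from a dictionary. The conditions dictionary and its keys are hypothetical, since the thread does not show the actual operations being combined.

```python
import operator
from functools import reduce

import pandas as pd

series = pd.Series([1, 2, 3])

# Hypothetical set of named boolean conditions to combine.
conditions = {
    "greater_than_one": series > 1,
    "less_than_three": series < 3,
}

# operator.and_ calls Series.__and__, so pandas handles the combination
# itself instead of handing the arrays to a NumPy ufunc.
mask = reduce(operator.and_, conditions.values())
print(series[mask])
```

Because each comparison already yields a bool Series, the reduction stays within pandas and keeps index alignment.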

@durden
Contributor Author

durden commented Nov 29, 2012

Yes, I can use operator.and_. I was only using numpy.logical_and because I assumed that the numpy version would be faster and possibly more efficient. Maybe this is not necessarily the case?

@durden
Contributor Author

durden commented Nov 30, 2012

Should I close this? I guess it's not actually a bug, just a subtle side effect of reindex_like(). Maybe it should just be noted in the docs and closed?

@wesm
Member

wesm commented Dec 2, 2012

Yeah, let's close the issue. If you get energetic and want to add a caveat in the docs about using ufuncs on boolean arrays that have had missing data, go for it. Maybe on the gotchas page.

wesm closed this as completed Dec 2, 2012