Timestamp cannot be compared to np.int64: impact on series binary operations #8058

Closed
l736x opened this issue Aug 18, 2014 · 6 comments · Fixed by #8070
Labels: Bug, Dtype Conversions (unexpected or buggy dtype conversions)

Comments

@l736x
Contributor

l736x commented Aug 18, 2014

Binary operations between two series fail if the .name attribute of the first is a Timestamp and the .name of the second is a numpy.int64.
It seems like a corner case, but it's pretty common when the series are slices of dataframes.
It happens regardless of whether an operator such as + or a method such as .add() is used.
It doesn't happen if the order of the operands is reversed.
I guess it boils down to Timestamp not having a sufficiently permissive equality method.

In [1]: import pandas as pd

In [2]: pd.__version
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-2d9155319574> in <module>()
----> 1 pd.__version

AttributeError: 'module' object has no attribute '__version'

In [3]: df = pd.DataFrame(randn(5,2))

In [4]: a = df[0]

In [5]: b = pd.Series(randn(5))

In [6]: b.name = pd.Timestamp('2000-01-01')

In [7]: a / b
Out[7]:
0    -1.019117
1    -1.455596
2     0.716716
3    -0.052173
4   -10.725603
dtype: float64

In [8]: b / a
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-fc1551a96997> in <module>()
----> 1 b / a

/home/ldeleo/venv/test/lib/python2.7/site-packages/pandas/core/ops.pyc in wrapper(left, right, name)
    488         if isinstance(rvalues, pd.Series):
    489             rindex = getattr(rvalues,'index',rvalues)
--> 490             name = _maybe_match_name(left, rvalues)
    491             lvalues = getattr(lvalues, 'values', lvalues)
    492             rvalues = getattr(rvalues, 'values', rvalues)

/home/ldeleo/venv/test/lib/python2.7/site-packages/pandas/core/common.pyc in _maybe_match_name(a, b)
   2927     a_name = getattr(a, 'name', None)
   2928     b_name = getattr(b, 'name', None)
-> 2929     if a_name == b_name:
   2930         return a_name
   2931     return None

/home/ldeleo/venv/test/lib/python2.7/site-packages/pandas/tslib.so in pandas.tslib._Timestamp.__richcmp__ (pandas/tslib.c:12147)()

TypeError: Cannot compare type 'Timestamp' with type 'int64'
@l736x
Contributor Author

l736x commented Aug 18, 2014

Related to #4982 and #4983

@jreback jreback added the Dtypes label Aug 18, 2014
@jreback jreback added this to the 0.15.1 milestone Aug 18, 2014
@jreback
Contributor

jreback commented Aug 18, 2014

I think this should raise a TypeError, as comparing a timestamp and an int is very ambiguous depending on the units.

That said, I think it's easy to put a try/except around this comparison, catch the TypeError, and ignore it.

We might also create a small function in core/common.py to make this more transparent (as it might be used elsewhere).
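
The try/except approach suggested above could look roughly like the following. This is only a sketch, not the actual pandas code: `names_equal`, `StrictName`, and `Named` are hypothetical names made up for illustration, and `StrictName` merely stands in for an object (like the Timestamp in the traceback) whose equality check raises TypeError on unfamiliar types.

```python
def names_equal(a_name, b_name):
    """Return True only if the two names compare equal without error."""
    try:
        return bool(a_name == b_name)
    except TypeError:
        # Incomparable types are treated as "not equal" instead of
        # letting the error escape into the binary operation.
        return False


def maybe_match_name(a, b):
    """Return the shared name of a and b, or None if the names differ."""
    a_name = getattr(a, "name", None)
    b_name = getattr(b, "name", None)
    if names_equal(a_name, b_name):
        return a_name
    return None


class StrictName:
    """Hypothetical stand-in for a name whose == raises on other types."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, StrictName):
            raise TypeError(
                "Cannot compare StrictName with %s" % type(other).__name__
            )
        return self.value == other.value


class Named:
    """Hypothetical minimal object that only carries a .name attribute."""

    def __init__(self, name):
        self.name = name


# The guarded comparison yields None instead of raising TypeError.
result = maybe_match_name(Named(StrictName("2000-01-01")), Named(0))
```

With the guard in place, the result name simply falls back to None when the two names cannot be compared, which is the same outcome as when they differ.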

@shoyer
Member

shoyer commented Aug 18, 2014

@jreback Actually, I think __eq__ should return False. Well-behaved Python objects (OK, np.ndarray excepted) should never raise a TypeError for equality checks.

For example, even on Python 3, where ordering comparisons between unrelated types raise, equality still returns False:

In [1]: None < 23
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-d368f011d7a8> in <module>()
----> 1 None < 23

TypeError: unorderable types: NoneType() < int()

In [2]: None == 23
Out[2]: False
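
The same convention can be followed by a user-defined class. The sketch below is an assumption-laden illustration, not the Timestamp implementation: `Stamp` is a hypothetical class, and returning `NotImplemented` from `__eq__` is one standard way to let Python fall back to "not equal" while ordering comparisons against unrelated types still raise TypeError.

```python
class Stamp:
    """Hypothetical timestamp-like value holding an integer count."""

    def __init__(self, nanos):
        self.nanos = nanos

    def __eq__(self, other):
        if isinstance(other, Stamp):
            return self.nanos == other.nanos
        # Unknown type: defer to Python, which ends up returning False.
        return NotImplemented

    def __hash__(self):
        # Keep instances hashable, since defining __eq__ removes the default.
        return hash(self.nanos)

    def __lt__(self, other):
        if isinstance(other, Stamp):
            return self.nanos < other.nanos
        # Unknown type: defer to Python, which raises TypeError for ordering.
        return NotImplemented


def ordering_raises(left, right):
    """Return True if `left < right` raises TypeError."""
    try:
        left < right
    except TypeError:
        return True
    return False


equal_to_int = Stamp(5) == 5                   # equality does not raise
ordering_fails = ordering_raises(Stamp(5), 5)  # ordering still raises
```

Equality against an integer is answered quietly, while ordering, which really would depend on the units, is still rejected.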

@l736x
Contributor Author

l736x commented Aug 18, 2014

I agree with @shoyer. I think in general it makes sense for Timestamp == anything_else to just return False.
Plus, this way you don't have to patch the _maybe_match_name() function or create an ad hoc function to mask the comparison. (My 2 cents.)

@jreback
Contributor

jreback commented Aug 18, 2014

@l736x Well, go ahead and put up a PR for that and see if anything breaks.

@l736x
Contributor Author

l736x commented Aug 18, 2014

I'm sorry, but I'm a humble reporter of bugs.
I totally lack the competence to touch that code, especially because, as I understand it, it's in the Cython part, which I plainly do not know (I stop where pdb stops).
I was just suggesting what seemed to me a simpler way to handle the problem.
