Timestamp cannot be compared to np.int64: impact on series binary operations #8058

Closed
l736x opened this Issue Aug 18, 2014 · 6 comments


l736x commented Aug 18, 2014

Binary operations between two Series fail if the .name attribute of the first is a Timestamp and the .name of the second is a numpy.int64.
It looks like a corner case, but it comes up often when the Series are slices of DataFrames.
It happens whether an operator such as + or a method such as .add() is used.
It doesn't happen if the order of the operands is reversed.
I guess it boils down to Timestamp not having a sufficiently permissive equality method.

In [1]: import pandas as pd

In [2]: pd.__version
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-2d9155319574> in <module>()
----> 1 pd.__version

AttributeError: 'module' object has no attribute '__version'

In [3]: df = pd.DataFrame(randn(5,2))

In [4]: a = df[0]

In [5]: b = pd.Series(randn(5))

In [6]: b.name = pd.Timestamp('2000-01-01')

In [7]: a / b
Out[7]:
0    -1.019117
1    -1.455596
2     0.716716
3    -0.052173
4   -10.725603
dtype: float64

In [8]: b / a
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-fc1551a96997> in <module>()
----> 1 b / a

/home/ldeleo/venv/test/lib/python2.7/site-packages/pandas/core/ops.pyc in wrapper(left, right, name)
    488         if isinstance(rvalues, pd.Series):
    489             rindex = getattr(rvalues,'index',rvalues)
--> 490             name = _maybe_match_name(left, rvalues)
    491             lvalues = getattr(lvalues, 'values', lvalues)
    492             rvalues = getattr(rvalues, 'values', rvalues)

/home/ldeleo/venv/test/lib/python2.7/site-packages/pandas/core/common.pyc in _maybe_match_name(a, b)
   2927     a_name = getattr(a, 'name', None)
   2928     b_name = getattr(b, 'name', None)
-> 2929     if a_name == b_name:
   2930         return a_name
   2931     return None

/home/ldeleo/venv/test/lib/python2.7/site-packages/pandas/tslib.so in pandas.tslib._Timestamp.__richcmp__ (pandas/tslib.c:12147)()

TypeError: Cannot compare type 'Timestamp' with type 'int64'
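
The session above relies on an interactive namespace where randn is already imported. A self-contained version of the same check might look like the sketch below. It assumes NumPy and pandas are installed, sets both names explicitly rather than slicing a DataFrame, and records the outcome instead of assuming one, since the behaviour depends on the pandas version in use.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Series whose name is a NumPy integer, as when taking a column from a
# DataFrame with default integer column labels.
a = pd.Series(rng.standard_normal(5), name=np.int64(0))

# Series whose name is a Timestamp.
b = pd.Series(rng.standard_normal(5), name=pd.Timestamp("2000-01-01"))

# The report says this direction (Timestamp-named Series first) fails.
try:
    result = b / a
    outcome = "ok"
except TypeError as exc:
    result = None
    outcome = "TypeError: {}".format(exc)

print(pd.__version__, outcome)
```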

l736x commented Aug 18, 2014

Related to #4982 and #4983

jreback added the Dtypes label Aug 18, 2014

jreback added this to the 0.15.1 milestone Aug 18, 2014

Contributor

jreback commented Aug 18, 2014

I think this should raise TypeError, as comparing a timestamp and an int is very ambiguous depending on the units.

That said, I think it's easy to put a try: except: around this comparison, catch the TypeError (and ignore it).

We might also create a small function in core/common.py to make this more transparent (as it might be used elsewhere).
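
A minimal sketch of that idea, not the actual pandas code: the comparison is wrapped in a small helper that treats a TypeError as "names do not match". The names names_equal, maybe_match_name and StrictStamp are hypothetical; StrictStamp is a stand-in for a type whose == raises on foreign types, so the example runs without pandas.

```python
from types import SimpleNamespace


def names_equal(a_name, b_name):
    """Return True only if the comparison succeeds and is truthy."""
    try:
        return bool(a_name == b_name)
    except TypeError:
        # Incomparable names are treated as different names.
        return False


def maybe_match_name(a, b):
    """Return the shared name of a and b, or None if the names differ."""
    a_name = getattr(a, "name", None)
    b_name = getattr(b, "name", None)
    if names_equal(a_name, b_name):
        return a_name
    return None


class StrictStamp:
    """Hypothetical stand-in: equality raises for unrelated types."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, StrictStamp):
            raise TypeError("Cannot compare StrictStamp with %s"
                            % type(other).__name__)
        return self.value == other.value

    def __hash__(self):
        return hash(self.value)


stamp = StrictStamp(946684800)
left = SimpleNamespace(name=stamp)
right = SimpleNamespace(name=0)

print(maybe_match_name(left, right))        # names differ, no exception
print(maybe_match_name(left, left) is stamp)
```

The helper keeps the strict comparison in the type itself and only relaxes it at the one call site that needs a yes/no answer.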

Member

shoyer commented Aug 18, 2014

@jreback Actually, I think __eq__ should return False. Well-behaved Python objects (OK, np.ndarray excepted) should never raise a TypeError for equality checks.

For example, even on Python 3, ordering comparisons raise but equality does not:

In [1]: None < 23
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-d368f011d7a8> in <module>()
----> 1 None < 23

TypeError: unorderable types: NoneType() < int()

In [2]: None == 23
Out[2]: False
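
The same convention can be followed by a user-defined class. The sketch below uses a hypothetical Stamp class (not the pandas Timestamp, which is implemented in Cython) whose comparison methods return NotImplemented for unrelated types. On Python 3 the interpreter then falls back to False for == and raises TypeError for <.

```python
class Stamp:
    """Hypothetical timestamp-like value used only for illustration."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Stamp):
            # Let Python try the other operand, then fall back to False.
            return NotImplemented
        return self.value == other.value

    def __lt__(self, other):
        if not isinstance(other, Stamp):
            # Ordering against an unrelated type ends in TypeError.
            return NotImplemented
        return self.value < other.value

    def __hash__(self):
        return hash(self.value)


equal_to_int = Stamp(1) == 23

try:
    Stamp(1) < 23
    ordering_raised = False
except TypeError:
    ordering_raised = True

print(equal_to_int, ordering_raised)
```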

l736x commented Aug 18, 2014

I agree with @shoyer; in general it makes sense for Timestamp == anything_else to simply return False.
Plus, that way you don't have to patch the _maybe_match_name() function or create an ad hoc function to mask the comparison (my 2 cents).

Contributor

jreback commented Aug 18, 2014

@l736x Well, go ahead and put up a PR for that
and see if anything breaks.

l736x commented Aug 18, 2014

I'm sorry, but I'm a humble reporter of bugs.
I totally lack the competence to touch that code, especially because, as I understand it, it's in the Cython part, which I plainly do not know (I stop where pdb stops).
I was just suggesting what seemed to me a simpler way to handle the problem.

jreback modified the milestone: 0.15.1, 0.15.0 Aug 19, 2014

jreback added the Bug label Aug 19, 2014

jreback closed this in #8070 Aug 19, 2014
