pd.merge fails on datetime columns with tzinfo #11405

Closed
miraculixx opened this Issue Oct 21, 2015 · 1 comment

miraculixx commented Oct 21, 2015

Since pandas 0.17, merging on a datetime column fails if the column is tz-aware; see the example below. Possibly related to #9663?

import pandas as pd
from datetime import datetime
from dateutil.tz import gettz
import sys, os
import traceback as tbm
# works
a = pd.DataFrame({'created' : [datetime(2015,10,10), 
                               datetime(2015,10,20)], 
                  'count' : [1,2]})
b = pd.DataFrame({'created' : [datetime(2015,10,10), 
                               datetime(2015,10,20)], 
                  'count' : [1,2]})
pd.merge(a, b, how='outer')
# doesn't work (used to work on pandas-0.16.2)
try:
    utc = gettz('UTC')
    a = pd.DataFrame({'created' : [datetime(2015,10,10, tzinfo=utc), 
                                   datetime(2015,10,20, tzinfo=utc)], 
                      'count' : [1,2]})
    b = pd.DataFrame({'created' : [datetime(2015,10,10, tzinfo=utc), 
                                   datetime(2015,10,20, tzinfo=utc)], 
                      'count' : [1,2]})
    pd.merge(a, b, how='outer')
except Exception as e:
    print "Yeah, doesn't work: %s" % e   
    _, _, tb = sys.exc_info()
    stack = lambda n : tbm.extract_tb(tb, 99)[n][0:]
    print "called from", stack(0)
    print "failing statement", stack(-1)

=>

Yeah, doesn't work: type object argument after * must be a sequence, not itertools.imap
called from ('<ipython-input-194-3c3669b26a55>', 23, '<module>', u"pd.merge(a, b, how='outer')")
failing statement ('/.../local/lib/python2.7/site-packages/pandas/tools/merge.py', 516, '_get_join_indexers', 'llab, rlab, shape = map(list, zip( * map(fkeys, left_keys, right_keys)))')

The culprit seems to be in the call to _factorize_keys, though I couldn't quite figure out what goes wrong.

Version info

$ python --version
Python 2.7.6
$ pip freeze | grep pandas
pandas==0.17.0
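
For anyone stuck on an affected version, one possible workaround is to drop the timezone from the key column before merging and restore it afterwards. The sketch below is an assumption rather than anything pandas provides: `merge_on_naive_utc` is a hypothetical helper, it is written against the current pandas API, and it has not been verified on 0.17.0.

```python
import pandas as pd


def merge_on_naive_utc(left, right, key, **merge_kwargs):
    """Merge two frames whose `key` column is tz-aware (hypothetical helper).

    The key is converted to naive UTC for the merge itself, then the
    original timezone of the left frame is restored on the result.
    """
    original_tz = left[key].dt.tz

    # Work on copies so the caller's frames are not modified.
    left = left.copy()
    right = right.copy()

    # tz_convert(None) converts to UTC and removes the timezone information.
    left[key] = left[key].dt.tz_convert(None)
    right[key] = right[key].dt.tz_convert(None)

    merged = pd.merge(left, right, **merge_kwargs)

    # Re-attach UTC, then convert back to the timezone the key started with.
    merged[key] = merged[key].dt.tz_localize("UTC").dt.tz_convert(original_tz)
    return merged


created = pd.to_datetime(["2015-10-10", "2015-10-20"], utc=True)
a = pd.DataFrame({"created": created, "count": [1, 2]})
b = pd.DataFrame({"created": created, "count": [1, 2]})

merged = merge_on_naive_utc(a, b, "created", how="outer")
print(merged)
```

Both frames are normalised to UTC before the merge, so keys that denote the same instant in different timezones still match.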
Contributor

jreback commented Oct 22, 2015

nah, this was not implemented (with the new tz dtypes) and not tested, fixed in #11410
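
To check whether an installed pandas includes the fix, a minimal Python 3 version of the failing case can be run. This is only a sketch and assumes a pandas release that contains #11410; the frame contents mirror the example in the report, and `make_frame` is a hypothetical helper.

```python
import pandas as pd


def make_frame():
    # Same data as in the report: two tz-aware (UTC) timestamps and a count.
    created = pd.to_datetime(["2015-10-10", "2015-10-20"], utc=True)
    return pd.DataFrame({"created": created, "count": [1, 2]})


a = make_frame()
b = make_frame()

# With no `on` argument, merge joins on the columns the frames share.
result = pd.merge(a, b, how="outer")

print(pd.__version__)
print(result.dtypes)
print(result)
```

If the merge raises instead of printing the result, the installed version most likely predates the fix.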

jreback added a commit to jreback/pandas that referenced this issue Oct 23, 2015

jreback added a commit that referenced this issue Oct 23, 2015

Merge pull request #11410 from jreback/tz_merge
Bug in merging datetime64[ns, tz] dtypes #11405