BUG: joining empty series with dtype: datetime64[ns, UTC] #18447

Closed
frigaardj opened this Issue Nov 23, 2017 · 1 comment

frigaardj commented Nov 23, 2017

Code Sample

>>> import numpy as np
>>> import pandas as pd
>>> s1 = pd.Series(pd.to_datetime([], utc=True))
>>> s2 = pd.Series([1,2,3])
>>> pd.concat([s1, s2], axis=1)

IndexError: cannot do a non-empty take from an empty axes.


>>> s1 = pd.Series(pd.to_datetime([], utc=False))
>>> s2 = pd.Series([1,2,3])
>>> pd.concat([s1, s2], axis=1)

Empty DataFrame
Columns: [0, 1]
Index: []


>>> s1 = pd.Series(pd.to_datetime([], utc=True))
>>> s2 = pd.Series([])
>>> pd.concat([s1, s2], axis=1)

Empty DataFrame
Columns: [0, 1]
Index: []


>>> df1 = pd.DataFrame(columns=['a', 'b'])
>>> df2 = pd.DataFrame(np.random.random((2, 2)), columns=['c', 'd'])
>>> df1['a'] = pd.to_datetime(df1['b'], utc=True) 
>>> pd.concat([df1, df2], axis=1)

IndexError: cannot do a non-empty take from an empty axes.


>>> df1.join(df2, how='outer')

IndexError: cannot do a non-empty take from an empty axes.


>>> df1['a'] = pd.to_datetime(df1['b'], utc=False) 
>>> pd.concat([df1, df2], axis=1)

    a    b         c         d
0 NaT  NaN  0.777252  0.657679
1 NaT  NaN  0.274332  0.981532

Problem description

Concatenating Series or DataFrames along axis 1 fails with IndexError when one of them is empty and has a tz-aware (UTC) datetime column while another is non-empty. Joins are affected as well. If the datetime column is tz-naive instead, the same operation works as expected. Concatenating two empty objects, one of which has a UTC datetime column, also works as expected.
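The Series cases above can be condensed into one script that prints which combinations raise. This is only a sketch for checking a given pandas install: the helper names `empty_datetime_series` and `try_concat` are made up for illustration, and the empty second Series is given an explicit `float64` dtype (an assumption, since the default dtype of `pd.Series([])` differs between versions).

```python
import pandas as pd


def empty_datetime_series(utc):
    """Return an empty Series of datetimes, tz-aware (UTC) when utc=True."""
    return pd.Series(pd.to_datetime([], utc=utc))


def try_concat(objs):
    """Concatenate along axis=1 and report 'ok' or 'IndexError'."""
    try:
        pd.concat(objs, axis=1)
    except IndexError:
        return "IndexError"
    return "ok"


# Hypothetical labels; the three cases mirror the Series examples in the report.
cases = {
    "empty UTC + non-empty": [empty_datetime_series(True), pd.Series([1, 2, 3])],
    "empty naive + non-empty": [empty_datetime_series(False), pd.Series([1, 2, 3])],
    # Explicit dtype is an assumption; the report used a bare pd.Series([]).
    "empty UTC + empty": [empty_datetime_series(True), pd.Series([], dtype="float64")],
}

outcomes = {label: try_concat(objs) for label, objs in cases.items()}
for label, outcome in outcomes.items():
    print(f"{label}: {outcome}")
```

On 0.21.0 the first case is the one reported to raise; on a version that includes the fix all three would be expected to print `ok`.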

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.12.9-300.fc26.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.utf8
LOCALE: en_GB.UTF-8

pandas: 0.21.0
pytest: 3.0.7
pip: 9.0.1
setuptools: 36.6.0
Cython: None
numpy: 1.13.3
scipy: 0.19.1
pyarrow: 0.7.1
xarray: 0.9.6
IPython: 6.2.1
sphinx: None
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.4
feather: 0.4.0
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0b10
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.4.0

Contributor

jreback commented Nov 25, 2017

this is related to #12396

should be straightforward to fix, if you can do a PR!
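For anyone picking this up, a regression test could look roughly like the sketch below. It is not the test from the eventual fix. The function names are hypothetical, and the expected result (all-NaT `a`, all-NaN `b`, `c` and `d` unchanged) is an assumption carried over from the tz-naive output shown in the report.

```python
import numpy as np
import pandas as pd


def build_frames():
    """Recreate the frames from the report: df1 is empty with a UTC column."""
    df1 = pd.DataFrame(columns=["a", "b"])
    df1["a"] = pd.to_datetime(df1["b"], utc=True)
    df2 = pd.DataFrame(np.random.random((2, 2)), columns=["c", "d"])
    return df1, df2


def test_concat_empty_tz_aware_frame():
    df1, df2 = build_frames()

    # On 0.21.0 this call raises IndexError, per the report.
    result = pd.concat([df1, df2], axis=1)

    # Assumed expectation: same shape as in the tz-naive case.
    assert list(result.columns) == ["a", "b", "c", "d"]
    assert len(result) == len(df2)
    assert result["a"].isna().all()
    assert result["b"].isna().all()
    assert np.allclose(result["c"].to_numpy(dtype=float), df2["c"].to_numpy())
    assert np.allclose(result["d"].to_numpy(dtype=float), df2["d"].to_numpy())
```

Run it with pytest; a similar test for `df1.join(df2, how='outer')` would cover the join path mentioned in the report.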

jreback added this to the Next Major Release milestone Nov 25, 2017

jreback modified the milestones: Next Major Release, 0.23.0 Jan 21, 2018

jreback added a commit to jreback/pandas that referenced this issue May 12, 2018

jreback closed this in 1dcddba May 13, 2018

topper-123 pushed a commit to topper-123/pandas that referenced this issue May 13, 2018
