BUG: index name lost with timedelta ops #9926

sinhrks · 2015-04-18T00:36:11Z

import pandas as pd 

dtidx = pd.DatetimeIndex(['2011-01-01'], freq='D', name='dtidx')
(dtidx + 1).name
# dtidx

# NG
(dtidx + pd.Timedelta('1 day')).name
# None

tdidx = pd.TimedeltaIndex(['1 day'], freq='D', name='tdidx')
(tdidx + 1).name
# tdidx

# NG
(tdidx + pd.Timedelta('1 day')).name
# None

ref: #9862

sinhrks · 2015-04-18T01:22:54Z

I would like to define the expected behavior. Based on how a normal index works, I understand that the name of the first (left-side) index should be prioritized.

idx1 = pd.Index([1], name='idx1')
idx2 = pd.Index([1], name='idx2')
(idx1 + 1).name
# idx1
(idx1 + idx1).name
# idx1
(idx1 + idx2).name
# idx1

But in the case of datetime-likes, I feel it is natural to prioritize the name of the DatetimeIndex. May I prepare a fix based on the following behavior?

| left side | right side | prioritized name |
| --- | --- | --- |
| `DatetimeIndex` | `DatetimeIndex` | name of left side |
| `TimedeltaIndex` | `TimedeltaIndex` | name of left side |
| `DatetimeIndex` | `TimedeltaIndex` | name of left side (`DatetimeIndex`) |
| `TimedeltaIndex` | `DatetimeIndex` | name of right side (`DatetimeIndex`) |

dtidx = pd.DatetimeIndex(['2011-01-01'], freq='D', name='dtidx')
tdidx = pd.TimedeltaIndex(['1 day'], freq='D', name='tdidx')

dtidx + tdidx
# <class 'pandas.tseries.index.DatetimeIndex'>
# [2011-01-02]
# Length: 1, Freq: None, Timezone: None

(dtidx + tdidx).name
# dtidx

tdidx + dtidx
# <class 'pandas.tseries.index.DatetimeIndex'>
# [2011-01-02]
# Length: 1, Freq: None, Timezone: None

(tdidx + dtidx).name
# dtidx

jreback · 2015-04-18T01:33:12Z

no priority
append / op on an index has to have the same name (or None)
otherwise will be set to None

see Index.append

it may be a bug if add ops don't follow this pattern

sinhrks · 2015-04-18T11:06:59Z

Thanks. Could you check whether the following understanding is correct for both set ops (intersection, etc.) and arithmetic ops (addition, etc.)?

In the case of Index + Index, the name is preserved if the left and right indexes have the same name. Otherwise, the name is reset to None.
In the case of Index + scalar or scalar + Index, the name of the index is preserved.
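
For illustration, the two rules above can be sketched without pandas. `NamedIndex` is a hypothetical stand-in class invented for this example, not pandas code; it models only a list of values and a name, and it assumes addition is the only op of interest.

```python
class NamedIndex:
    """Hypothetical stand-in for an index: just values plus a name."""

    def __init__(self, values, name=None):
        self.values = list(values)
        self.name = name

    def __add__(self, other):
        if isinstance(other, NamedIndex):
            # Index + Index: keep the name only when both sides agree,
            # otherwise reset it to None.
            name = self.name if self.name == other.name else None
            # zip() silently truncates on unequal lengths; good enough
            # for a sketch.
            values = [a + b for a, b in zip(self.values, other.values)]
        else:
            # Index + scalar: a scalar carries no name, so keep ours.
            name = self.name
            values = [a + other for a in self.values]
        return NamedIndex(values, name=name)

    # scalar + Index follows the same rule as Index + scalar.
    __radd__ = __add__


idx1 = NamedIndex([1, 2, 3], name='idx1')
idx2 = NamedIndex([1, 2, 3], name='idx2')

print((idx1 + 1).name)     # idx1
print((1 + idx1).name)     # idx1
print((idx1 + idx1).name)  # idx1
print((idx1 + idx2).name)  # None
```

Under this reading there is no left-side or right-side priority: a mismatch always yields None.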

Based on the above understanding, the normal index behaves incorrectly. All of the ops below should reset the name to None.

idx1 = pd.Index([1, 2, 3], name='idx1')
idx2 = pd.Index([1, 2, 3], name='idx2')

result = idx1 + idx2
result, result.name
# (Int64Index([2, 4, 6], dtype='int64'), 'idx1')

result = idx1.__add__(idx2)
result, result.name
# (Int64Index([2, 4, 6], dtype='int64'), 'idx1')

result = idx1.intersection(idx2)
result, result.name
# (Int64Index([1, 2, 3], dtype='int64'), 'idx1')

shoyer · 2015-04-19T00:14:23Z

@sinhrks Yes, I think you correctly understand this now.

Here are some notes on this from @cpcloud: blaze/blaze#458 (comment)

sinhrks added the Bug and Timedelta labels Apr 18, 2015

jreback mentioned this issue Apr 18, 2015

BUG: losing Index/Series names master issue #9862

Closed

12 tasks

sinhrks mentioned this issue Apr 22, 2015

FIX: interesction and union changed index names. fixes #9943 partly #9862 #9965

Closed

sinhrks mentioned this issue May 16, 2015

BUG: Index.name is lost during timedelta ops #10158

Merged

jreback added this to the 0.17.0 milestone May 18, 2015

jreback closed this as completed in #10158 May 18, 2015

jorisvandenbossche modified the milestones: 0.17.0, 0.16.2 Jun 2, 2015
