BUG: index name lost when indexing with another index #9943

sergeny · 2015-04-20T03:15:12Z

Very subtle. The index name is preserved when indexing with .ix[list], but it is lost when indexing with .ix[Int64Index].

import numpy as np
import pandas as pd
from pandas.util.testing import assert_frame_equal

assert pd.__version__ == '0.16.0'
df = pd.DataFrame([np.nan, np.nan], columns=['tags'],
                  index=pd.Int64Index([4815961, 4815962], dtype='int64', name='id'))

assert str(df) == '         tags\nid           \n4815961   NaN\n4815962   NaN'
# OK.

L = [4815962]

assert list(L) == list(df.index.intersection(L))
# Succeeds: the values are equal; only the container type differs.

# Indexing with a plain list keeps the index name.
print(df.ix[L].tags.index.name)
#>>> 'id'
# Indexing with the Int64Index returned by .intersection loses it.
print(df.ix[df.index.intersection(L)].tags.index.name)
#>>>

assert df.ix[L].tags.index.name == df.ix[df.index.intersection(L)].tags.index.name
# Assertion failure. Should really succeed.

assert_frame_equal(df.ix[L], df.ix[df.index.intersection(L)])
# Assertion failure. Should really succeed.
# AssertionError: attr is not equal [names]: FrozenList([u'id']) != FrozenList([None])

jreback · 2015-04-20T12:56:55Z

There are 2 things going on here:

1. .intersection (and prob .union) are zonking the name if its None. Note that this was just fixed for .append, so same fix prob applies; see #9862 (BUG: losing Index/Series names master issue).
2. You are effectively reindexing one index with another; we should preserve the name in this case. Discussion: #9885 (Should reindex keep the index name if new index has none).

@shoyer

shoyer · 2015-04-20T17:13:10Z

I agree, indexing should never change index names.
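
Until name propagation is settled, one defensive pattern is to re-attach the index name explicitly before indexing, so the result does not depend on how .intersection or the indexer treats names. The following is only a sketch: it uses .loc rather than .ix (which has since been removed from pandas), and the variable names key and subset are illustrative, not anything from the report.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"tags": [np.nan, np.nan]},
    index=pd.Index([4815961, 4815962], name="id"),
)
L = [4815962]

# Assumption: we always want the frame's own index name on the result,
# so we set it on the key explicitly instead of relying on propagation.
key = df.index.intersection(L).rename(df.index.name)
subset = df.loc[key]

print(subset.index.name)
```

Because the key carries the same name as the frame's index, the result name is the same whether the indexer keeps the axis name or takes the name from the key.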

Dr-Irv · 2018-02-22T18:43:09Z

@jreback When this was opened, you wrote:

.intersection (and prob .union) are zonking the name if its None.

I'd like to confirm that the behavior for .union and .intersection should be different. Namely, for .union, if the names are the same, or if only one index has a name, take that name; if the names are different, set the name to None. For .intersection, take the name only if both names are exactly the same.

If that is the case, then the behavior reported initially is expected, because the intersection is being taken between an Index with a name and a list (which has no name).
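
To make the two candidate rules concrete, here is a minimal pure-Python sketch that models only how a result name would be chosen from two input names. The helpers union_rule and intersection_rule are hypothetical names for illustration and are not pandas APIs; None stands for "no name", as with a plain list.

```python
def union_rule(left, right):
    """Candidate .union rule: keep a shared name, or the only name given."""
    if left == right:
        return left
    if left is None:
        return right
    if right is None:
        return left
    return None  # two different names


def intersection_rule(left, right):
    """Candidate .intersection rule: keep the name only on an exact match."""
    return left if left == right else None


# The original report: an index named 'id' combined with an unnamed list.
reported_case = intersection_rule("id", None)
print(reported_case)
```

Under this model the reported case yields no name for .intersection, while the same inputs would keep 'id' under the candidate .union rule.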

Dr-Irv · 2018-02-23T18:25:48Z

Following up on my question to @jreback above, I think .union and .intersection need to treat names the same way, using the pattern for intersection: if the names are the same, use that name; otherwise return None. The reason is that a chained union can otherwise give different results depending on the order in which the unions are applied. For example, let's say you have 3 indexes as follows:

i1 = pd.Index([1,2], name='i1')
i2 = pd.Index([3,4], name='i2')
i3 = pd.Index([5,6], name='i3')

If you then compute i1.union(i2.union(i3)) under the "intersection" behavior, the name of the resulting index is None. Under the "union" behavior, the name of the result is "i1". However, changing the order of the union, as in (i1.union(i2)).union(i3), gives the name "i3" under the "union" behavior.
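
The order dependence can be checked by brute force on the names alone. This is a sketch, not pandas code: lenient models the "union" behavior and strict models the "intersection" behavior described above, and both function names, along with order_dependent_cases, are made up for this illustration.

```python
from itertools import product


def lenient(left, right):
    """'union' behavior: keep a shared name, or the only name given."""
    if left == right:
        return left
    if left is None:
        return right
    if right is None:
        return left
    return None


def strict(left, right):
    """'intersection' behavior: keep the name only on an exact match."""
    return left if left == right else None


def order_dependent_cases(rule, names):
    """Return every (a, b, c) where regrouping the chain changes the name."""
    return [
        (a, b, c)
        for a, b, c in product(names, repeat=3)
        if rule(a, rule(b, c)) != rule(rule(a, b), c)
    ]


names = [None, "i1", "i2", "i3"]
lenient_cases = order_dependent_cases(lenient, names)
strict_cases = order_dependent_cases(strict, names)

print(("i1", "i2", "i3") in lenient_cases)
print(strict_cases)
```

In this model the strict rule never depends on grouping, because a name survives only when every input shares it.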

In fact, pandas 0.22.0 has a bug in the following case (which occurs only when one of the indexes in the union is empty, or when taking the union or intersection of two indexes that have the same values but different names):

j1 = pd.Index([1,2], name='j1')
j2 = pd.Index([], name='j2')
j3 = pd.Index([], name='j3')

In this case, j1.union(j2).union(j3) returns Int64Index([1, 2], dtype='int64', name='j3'), while just changing the order to j3.union(j1).union(j2) returns Int64Index([1, 2], dtype='int64', name='j2').

I hope to have this straightened out once I get things right in pull request #19849.

jreback · 2018-02-23T23:02:58Z

We have very specific matching behavior for Index operations: if the names match, you get the name back; if they are different, you get None (this includes the case where one name is None and the other has a value).

So I'm not sure this should be any different.

Dr-Irv · 2018-02-23T23:07:26Z

@jreback I agree, and my latest version of pull request #19849 addresses this and has the behavior you describe. Some boundary cases were not handled that way (see the j1, j2, j3 example above).

gfyoung · 2018-11-06T18:22:55Z

#19849 is now merged. Not sure how it affects the status of this issue? @jreback

mroeschke · 2021-04-18T06:31:58Z

It appears that #19849 was supposed to close this issue, but if that's not the case happy to reopen.

sergeny mentioned this issue Apr 20, 2015

BUG: losing Index/Series names master issue #9862

Closed

12 tasks

jreback added this to the Next Major Release milestone Apr 20, 2015

jreback added Bug Indexing Related to indexing on series/frames, not to indexes themselves Reshaping Concat, Merge/Join, Stack/Unstack, Explode Difficulty Novice labels Apr 20, 2015

hsperr added a commit to hsperr/pandas that referenced this issue Apr 22, 2015

interesction and union changed index names. fixes pandas-dev#9943 par…

f2565bd

…tly pandas-dev#9862

hsperr added a commit to hsperr/pandas that referenced this issue Apr 22, 2015

FIX: interesction and union don't index names anymore. fixes pandas-d…

4e4d6f5

…ev#9943 partly pandas-dev#9862

hsperr added a commit to hsperr/pandas that referenced this issue Apr 22, 2015

FIX: interesction and union don't change index names anymore. fixes p…

c1db455

…andas-dev#9943 partly pandas-dev#9862

hsperr mentioned this issue Apr 22, 2015

FIX: interesction and union changed index names. fixes #9943 partly #9862 #9965

Closed

jreback modified the milestones: 0.16.1, Next Major Release Apr 22, 2015

hsperr added a commit to hsperr/pandas that referenced this issue Apr 23, 2015

FIX: interesction and union correct name chaning behavior. fixes pand…

91ced29

…as-dev#9943 partly pandas-dev#9862

hsperr added a commit to hsperr/pandas that referenced this issue Apr 23, 2015

FIX: interesction and union correct name chaning behavior. fixes pand…

a04c9cf

…as-dev#9943 partly pandas-dev#9862

hsperr added a commit to hsperr/pandas that referenced this issue Apr 23, 2015

FIX: interesction and union correct name chaning behavior. fixes pand…

991a6b6

…as-dev#9943 partly pandas-dev#9862

hsperr added a commit to hsperr/pandas that referenced this issue Apr 23, 2015

FIX: interesction and union correct name chaning behavior. fixes pand…

725ffe5

…as-dev#9943 partly pandas-dev#9862

hsperr added a commit to hsperr/pandas that referenced this issue Apr 23, 2015

FIX: interesction and union correct name chaning behavior. fixes pand…

a641434

…as-dev#9943 partly pandas-dev#9862

jreback modified the milestones: 0.17.0, 0.16.1 Apr 28, 2015

hsperr added a commit to hsperr/pandas that referenced this issue May 17, 2015

FIX: interesction and union correct name chaning behavior. fixes pand…

d494287

…as-dev#9943 partly pandas-dev#9862

dkrasner mentioned this issue Jul 7, 2015

Lda sums columbia-applied-data-science/rosetta#47

Merged

max-sixty pushed a commit to max-sixty/pandas that referenced this issue Jul 27, 2015

FIX: interesction and union correct name chaning behavior. fixes pand…

efd8984

…as-dev#9943 partly pandas-dev#9862

jreback modified the milestones: Next Major Release, 0.17.0 Aug 15, 2015

TomAugspurger added the good first issue label Oct 11, 2017

jreback added good first issue and removed good first issue labels Dec 15, 2017

jreback removed the Difficulty Novice label Dec 15, 2017

Dr-Irv mentioned this issue Feb 22, 2018

BUG: names on union and intersection for Index were inconsistent (GH9943 GH9862) #19849

Merged

3 tasks

gfyoung pushed a commit that referenced this issue Nov 6, 2018

BUG: names on union and intersection for Index were inconsistent (#19849

810826d

) Closes gh-9862. xref gh-9943.

JustinZhengBC pushed a commit to JustinZhengBC/pandas that referenced this issue Nov 14, 2018

BUG: names on union and intersection for Index were inconsistent (pan…

2acb22c

…das-dev#19849) Closes pandas-devgh-9862. xref pandas-devgh-9943.

tm9k1 pushed a commit to tm9k1/pandas that referenced this issue Nov 19, 2018

BUG: names on union and intersection for Index were inconsistent (pan…

a3cb293

…das-dev#19849) Closes pandas-devgh-9862. xref pandas-devgh-9943.

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this issue Feb 28, 2019

BUG: names on union and intersection for Index were inconsistent (pan…

108f2ae

…das-dev#19849) Closes pandas-devgh-9862. xref pandas-devgh-9943.

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this issue Feb 28, 2019

BUG: names on union and intersection for Index were inconsistent (pan…

47239bd

…das-dev#19849) Closes pandas-devgh-9862. xref pandas-devgh-9943.

jbrockmendel removed the Effort Low label Oct 21, 2019

mroeschke closed this as completed Apr 18, 2021
