xs is filling nan in index with its last item, as if sorted ascending, in the resulting index #6574

leungwk · 2014-03-07T20:29:09Z

Illustration:

import pandas as pd

acc = [
    ('a','abcde',1),
    ('b','bbcde',2),
    ('y','yzcde',25),
    ('z','xbcde',24),
    ('z',None,26),
    ('z','zbcde',25),
    ('z','ybcde',26),
]
df1 = pd.DataFrame(acc, columns=['a1','a2','cnt']).set_index(['a1','a2'])

In [476]: df1
Out[476]: 
          cnt
a1 a2        
a  abcde    1
b  bbcde    2
y  yzcde   25
z  xbcde   24
   NaN     26
   zbcde   25
   ybcde   26

[7 rows x 1 columns]

In [477]: df1.xs('z',level='a1')
Out[477]: 
       cnt
a2        
xbcde   24
zbcde   26
zbcde   25
ybcde   26

[4 rows x 1 columns]

I was expecting:

       cnt
a2        
xbcde   24
NaN     26
zbcde   25
ybcde   26

because I thought it would preserve the index of df1.
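The expectation above can be written as a small self-contained check. This is only a sketch, assuming a pandas release that already contains the fix for this issue; on 0.13.1 the first assertion would fail as reported. The variable names `result` and `labels_are_nan` are made up for illustration.

```python
import pandas as pd

acc = [
    ('a', 'abcde', 1),
    ('b', 'bbcde', 2),
    ('y', 'yzcde', 25),
    ('z', 'xbcde', 24),
    ('z', None, 26),
    ('z', 'zbcde', 25),
    ('z', 'ybcde', 26),
]
df1 = pd.DataFrame(acc, columns=['a1', 'a2', 'cnt']).set_index(['a1', 'a2'])

# Select the 'z' group and drop the 'a1' level (the default for xs).
result = df1.xs('z', level='a1')

# The missing 'a2' label should stay missing, in its original position,
# rather than being replaced by a neighbouring label.
labels_are_nan = result.index.isna().tolist()
assert labels_are_nan == [False, True, False, False]

# The values themselves were never affected by the bug.
assert result['cnt'].tolist() == [24, 26, 25, 26]
```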

Sorting explicitly doesn't seem to affect the result:

In [478]: df1.sort('cnt',ascending=False)
Out[478]: 
          cnt
a1 a2        
z  ybcde   26
   NaN     26
   zbcde   25
y  yzcde   25
z  xbcde   24
b  bbcde    2
a  abcde    1

[7 rows x 1 columns]

In [479]: df1.sort('cnt',ascending=False).xs('z',level='a1')
Out[479]: 
       cnt
a2        
ybcde   26
zbcde   26
zbcde   25
xbcde   24

[4 rows x 1 columns]

It might be related to forward filling, but then I think it would be:

       cnt
a2        
ybcde   26
ybcde   26
zbcde   25
xbcde   24

which still isn't what I was expecting.
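For anyone re-running the sorted case on a newer pandas: `DataFrame.sort` was later removed, and `sort_values` is the replacement. The sketch below assumes such a newer release (one that also includes the fix for this issue); `kind='mergesort'` is my own addition so that ties in `cnt` come out in a stable order, and `sorted_result` is an illustrative name.

```python
import pandas as pd

acc = [
    ('a', 'abcde', 1),
    ('b', 'bbcde', 2),
    ('y', 'yzcde', 25),
    ('z', 'xbcde', 24),
    ('z', None, 26),
    ('z', 'zbcde', 25),
    ('z', 'ybcde', 26),
]
df1 = pd.DataFrame(acc, columns=['a1', 'a2', 'cnt']).set_index(['a1', 'a2'])

# Modern spelling of df1.sort('cnt', ascending=False).
sorted_df = df1.sort_values('cnt', ascending=False, kind='mergesort')
sorted_result = sorted_df.xs('z', level='a1')

# Exactly one label in the 'z' group is missing, and it belongs to a cnt of 26.
nan_mask = sorted_result.index.isna()
assert int(nan_mask.sum()) == 1
assert sorted_result[nan_mask]['cnt'].tolist() == [26]
```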

Versions and dependencies:

In [480]: pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.2.final.0
python-bits: 64
OS: Darwin
OS-release: 12.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8

pandas: 0.13.1
Cython: 0.17.2
numpy: 1.6.2
scipy: 0.13.3
statsmodels: 0.5.0
IPython: 1.1.0
sphinx: 1.2
patsy: 0.2.1
scikits.timeseries: None
dateutil: 1.5
pytz: 2012h
bottleneck: None
tables: 3.0.0
numexpr: 2.0.1
matplotlib: 1.3.1
openpyxl: None
xlrd: 0.8.0
xlwt: None
xlsxwriter: None
sqlalchemy: None
lxml: None
bs4: None
html5lib: None
bq: None
apiclient: None

jreback · 2014-03-07T20:45:00Z

Here are some other ways to get at what you want.
(This may be a bug: having NaN in an index is odd in general, so some code may be
needed to 'deal' with that.)

In [3]: df1.xs('z',level='a1',drop_level=False)
Out[3]: 
          cnt
a1 a2        
z  xbcde   24
   NaN     26
   zbcde   25
   ybcde   26

[4 rows x 1 columns]

In [4]: df1.loc[['z']]
Out[4]: 
          cnt
a1 a2        
z  xbcde   24
   NaN     26
   zbcde   25
   ybcde   26

[4 rows x 1 columns]

In [5]: df1.loc['z']
Out[5]: 
       cnt
a2        
xbcde   24
zbcde   26
zbcde   25
ybcde   26

[4 rows x 1 columns]
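As a script, the first workaround above looks roughly like the sketch below. The second variant, which filters on a plain column after `reset_index`, is not from this thread; it is an assumption about another way to sidestep the level drop, and it has not been checked against 0.13.1. The names `kept`, `flat` and `by_column` are illustrative.

```python
import pandas as pd

acc = [
    ('a', 'abcde', 1),
    ('b', 'bbcde', 2),
    ('y', 'yzcde', 25),
    ('z', 'xbcde', 24),
    ('z', None, 26),
    ('z', 'zbcde', 25),
    ('z', 'ybcde', 26),
]
df1 = pd.DataFrame(acc, columns=['a1', 'a2', 'cnt']).set_index(['a1', 'a2'])

# Workaround from the comment above: keep both index levels.
kept = df1.xs('z', level='a1', drop_level=False)
kept_nan = kept.index.get_level_values('a2').isna().tolist()

# Alternative (assumption): move the index into columns, filter, re-index on 'a2'.
flat = df1.reset_index()
by_column = flat[flat['a1'] == 'z'].set_index('a2')[['cnt']]
by_column_nan = by_column.index.isna().tolist()

# Both keep the missing 'a2' label in the second position.
assert kept_nan == [False, True, False, False]
assert by_column_nan == [False, True, False, False]
```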

jreback · 2014-03-09T15:15:14Z

thanks for the report; fixed in master

jreback added Bug labels Mar 9, 2014

jreback added this to the 0.14.0 milestone Mar 9, 2014

jreback mentioned this issue Mar 9, 2014

BUG: Bug in .xs with a nan in level when dropped (GH6574) #6579

Merged

jreback closed this as completed in #6579 Mar 9, 2014
