In [1]:
import numpy as np
import pandas as pd
from numpy.random import randn

In [2]:
outside = ['G1','G1','G1','G2','G2','G2']
inside = [1,2,3,1,2,3]
hier_index = list(zip(outside,inside))
hier_index = pd.MultiIndex.from_tuples(hier_index)

In [3]:
df = pd.DataFrame(randn(6,2), index=hier_index, columns=['A','B'])

In [4]:
df

,,A,B
G1,1,-0.156164,-1.381692
G1,2,0.553699,-1.274677
G1,3,0.575477,-2.078288
G2,1,0.052282,1.659373
G2,2,0.609869,-2.523152
G2,3,0.600435,1.116438


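A MultiIndex gives a DataFrame a hierarchical index with more than one level. Here there are two: an outer level (G1, G2) and an inner level (1, 2, 3).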
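
As a small sketch of the construction step, the same two-level index can also be built from the unique labels of each level with `pd.MultiIndex.from_product`. The variable names `from_tuples` and `from_product` below are only illustrative and do not appear in the notebook.

```python
import pandas as pd

outside = ['G1', 'G1', 'G1', 'G2', 'G2', 'G2']
inside = [1, 2, 3, 1, 2, 3]

# Same construction as in the notebook: one (outer, inner) tuple per row.
from_tuples = pd.MultiIndex.from_tuples(list(zip(outside, inside)))

# Alternative: the Cartesian product of the unique labels of each level.
from_product = pd.MultiIndex.from_product([['G1', 'G2'], [1, 2, 3]])

print(from_tuples.nlevels)               # number of index levels
print(from_tuples.equals(from_product))  # do both hold the same labels?
```

`from_product` is convenient when every outer label is paired with every inner label, as it is here; `from_tuples` is the more general choice when the combinations are irregular.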

In [5]:
df.loc['G1']

,A,B
1,-0.156164,-1.381692
2,0.553699,-1.274677
3,0.575477,-2.078288


First, loc['G1'] returns the rows that fall under the outer index label G1.

In [6]:
df.loc['G1'].loc[1]

A   -0.156164
B   -1.381692
Name: 1, dtype: float64

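Next, chaining a second loc after the first one selects the matching row within that subset.
<br>In other words, first take the data under G1, then take the row whose inner index label is 1.</br>
<br>So the outermost index level is selected first, then the inner one.</br>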
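
The chained lookup can also be written as a single `loc` call with a tuple of labels, one per index level. The sketch below uses a hypothetical frame named `demo` with fixed values (not the random data from the notebook) so that the result is easy to check.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product([['G1', 'G2'], [1, 2, 3]])
# Fixed values 0..11 instead of random numbers, for illustration only.
demo = pd.DataFrame(np.arange(12).reshape(6, 2), index=index, columns=['A', 'B'])

# Two steps: outer label first, then inner label.
chained = demo.loc['G1'].loc[1]

# One step: a tuple (outer, inner) selects the same row.
direct = demo.loc[('G1', 1), :]

print(chained.tolist() == direct.tolist())
```

The tuple form avoids creating the intermediate `demo.loc['G1']` frame, which is why it is usually preferred once the index structure is familiar.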

In [7]:
df.index.names

FrozenList([None, None])

index.names shows whether the index levels have names. FrozenList([None, None]) means that neither of the two index levels has a name yet.

In [8]:
df.index.names = ['Groups','Num']

In [9]:
df

,,A,B
Groups,Num,,
G1,1,-0.156164,-1.381692
G1,2,0.553699,-1.274677
G1,3,0.575477,-2.078288
G2,1,0.052282,1.659373
G2,2,0.609869,-2.523152
G2,3,0.600435,1.116438


Assigning a list to index.names names the index levels. In the list, the first name applies to the outer level (G1, G2) and the second name applies to the inner level (1, 2, 3).
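
Level names can also be supplied when the index is created, or changed later with `set_names`. This is a minimal sketch; the names `Outer` and `Inner` and the variables `named` and `renamed` are made up for the example.

```python
import pandas as pd

tuples = [('G1', 1), ('G1', 2), ('G2', 1), ('G2', 2)]

# Name the levels at construction time instead of assigning afterwards.
named = pd.MultiIndex.from_tuples(tuples, names=['Groups', 'Num'])
print(list(named.names))

# set_names returns a new index by default; `named` itself is unchanged.
renamed = named.set_names(['Outer', 'Inner'])
print(list(renamed.names))
print(list(named.names))
```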

In [10]:
df.loc['G2'].loc[2].loc['B']

-2.5231520598523405

In [11]:
df.loc['G2'].loc[2]['B']

-2.5231520598523405

Select the outermost index level first, then the inner level, and finally the column.
<br>For the column selection, the value is returned whether or not <code>.loc</code> is used.</br>
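
The three chained steps above can be collapsed into one `loc` call that takes the row key as a tuple and the column label as the second argument. The frame `demo` below is a hypothetical stand-in with fixed values, so the numbers differ from the notebook's random output.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product([['G1', 'G2'], [1, 2, 3]],
                                   names=['Groups', 'Num'])
# Fixed values 0..11 instead of random numbers, for illustration only.
demo = pd.DataFrame(np.arange(12).reshape(6, 2), index=index, columns=['A', 'B'])

# Chained: outer label, inner label, then column.
chained_value = demo.loc['G2'].loc[2]['B']

# Single call: (outer, inner) for the row, 'B' for the column.
direct_value = demo.loc[('G2', 2), 'B']

print(chained_value == direct_value)
```

The single-call form is also the one to use when assigning a value, since assignment through a chain of lookups may act on a temporary copy.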

In [12]:
df

,,A,B
Groups,Num,,
G1,1,-0.156164,-1.381692
G1,2,0.553699,-1.274677
G1,3,0.575477,-2.078288
G2,1,0.052282,1.659373
G2,2,0.609869,-2.523152
G2,3,0.600435,1.116438


In [13]:
df.loc['G1']

,A,B
Num,,
1,-0.156164,-1.381692
2,0.553699,-1.274677
3,0.575477,-2.078288


In [14]:
df.xs('G1')

,A,B
Num,,
1,-0.156164,-1.381692
2,0.553699,-1.274677
3,0.575477,-2.078288


In [15]:
df

,,A,B
Groups,Num,,
G1,1,-0.156164,-1.381692
G1,2,0.553699,-1.274677
G1,3,0.575477,-2.078288
G2,1,0.052282,1.659373
G2,2,0.609869,-2.523152
G2,3,0.600435,1.116438


In [16]:
df.xs(1,level = 'Num')

,A,B
Groups,,
G1,-0.156164,-1.381692
G2,0.052282,1.659373


xs (cross-section) can select across index levels.
<br> <code>df.xs(1,level = 'Num')</code> returns the rows whose label in the index level named Num is 1, that is, row 1 from every group.</br>
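
As a rough sketch of what the cross-section returns, the example below runs `xs` on a hypothetical frame `demo` with fixed values and also shows the `drop_level=False` option, which keeps the selected level in the result. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product([['G1', 'G2'], [1, 2, 3]],
                                   names=['Groups', 'Num'])
# Fixed values 0..11 instead of random numbers, for illustration only.
demo = pd.DataFrame(np.arange(12).reshape(6, 2), index=index, columns=['A', 'B'])

# Rows where the inner level 'Num' equals 1; the 'Num' level is dropped.
cross = demo.xs(1, level='Num')
print(cross.index.tolist())

# Same rows, but keeping both index levels in the result.
kept = demo.xs(1, level='Num', drop_level=False)
print(kept.index.tolist())
```

Selecting on an inner level this way is awkward with plain `loc['G1']`-style lookups, which only address the outer level directly.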