# 3.4 Hierarchical Indexing

## 3.4.1 Introduction to Hierarchical Indexing

### Put simply, a hierarchical index has more than one index level on a single axis

In [1]:
from pandas import Series,DataFrame
import pandas as pd
import numpy as np

In [2]:
obj = Series(np.random.randn(9),
            index=[['one','one','one','two','two','two','three','three','three'],
                  ['a','b','c','a','b','c','a','b','c']])
obj

one    a    1.047767
       b   -0.627898
       c   -0.921599
two    a   -1.004192
       b   -0.688532
       c   -0.923718
three  a   -2.135181
       b    1.498279
       c   -1.205424
dtype: float64

### The index of this object is a MultiIndex

In [3]:
obj.index

MultiIndex([(  'one', 'a'),
            (  'one', 'b'),
            (  'one', 'c'),
            (  'two', 'a'),
            (  'two', 'b'),
            (  'two', 'c'),
            ('three', 'a'),
            ('three', 'b'),
            ('three', 'c')],
           )
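A MultiIndex can also be built explicitly instead of passing nested lists to the constructor. The following is a minimal sketch, assuming the same 3 x 3 label layout as above; the level names `outer` and `inner` and the variable names are illustrative and do not come from the original example.

```python
import numpy as np
import pandas as pd

outer = ['one', 'two', 'three']
inner = ['a', 'b', 'c']

# Cartesian product of the two label lists (outer varies slowest).
idx_product = pd.MultiIndex.from_product([outer, inner], names=['outer', 'inner'])

# The same index spelled out as two parallel arrays, one per level.
idx_arrays = pd.MultiIndex.from_arrays(
    [[o for o in outer for _ in inner], inner * len(outer)],
    names=['outer', 'inner'])

s = pd.Series(np.arange(9), index=idx_product)

n_levels = s.index.nlevels                    # number of index levels
first_label = s.index[0]                      # each label is a tuple
same_index = idx_product.equals(idx_arrays)   # both constructions agree
```

`from_product` is convenient when every combination of labels is present; `from_arrays` fits data where the combinations are irregular.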

### Indexing and selecting from a hierarchically indexed object is straightforward

In [4]:
obj['two']

a   -1.004192
b   -0.688532
c   -0.923718
dtype: float64

In [5]:
obj[:, 'a']  # select on the inner level

one      1.047767
two     -1.004192
three   -2.135181
dtype: float64
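The same selections can be written with the explicit `.loc` and `xs` accessors, which some readers find easier to follow than bare brackets. This is a sketch only: it uses `np.arange(9)` in place of the random values above so that the results are predictable, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series(
    np.arange(9),
    index=[['one'] * 3 + ['two'] * 3 + ['three'] * 3, list('abc') * 3])

outer_block = s.loc['two']            # all rows under the outer label 'two'
single_value = s.loc[('two', 'b')]    # one element, addressed by a full tuple
inner_slice = s.xs('a', level=1)      # inner label 'a' under every outer label
```

Selecting by an outer label drops that level from the result, so `outer_block` is indexed by `a`, `b`, `c` only; `xs` likewise drops the level it selects on.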

### For a DataFrame, both the row index and the column index can be hierarchical, and selection is just as simple

In [6]:
df = DataFrame(np.arange(16).reshape(4,4),
              index=[['one','one','two','two'],['a','b','a','b']],
              columns=[['apple','apple','orange','orange'],['red','green','red','green']])
df

      apple       orange
        red green    red green
one a     0     1      2     3
    b     4     5      6     7
two a     8     9     10    11
    b    12    13     14    15


In [7]:
df['apple']

       red  green
one a    0      1
    b    4      5
two a    8      9
    b   12     13
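Beyond selecting a whole outer column group, rows, single columns, and individual cells can be addressed with tuples. The sketch below rebuilds the same frame so that it runs on its own; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=[['one', 'one', 'two', 'two'], ['a', 'b', 'a', 'b']],
    columns=[['apple', 'apple', 'orange', 'orange'],
             ['red', 'green', 'red', 'green']])

one_rows = df.loc['one']                          # rows under outer label 'one'
red_apple = df[('apple', 'red')]                  # one column, full tuple key
cell = df.loc[('two', 'a'), ('orange', 'green')]  # a single scalar
all_red = df.xs('red', axis=1, level=1)           # 'red' under every fruit
```

Here `all_red` keeps only the outer column level (`apple`, `orange`), because `xs` drops the level it selects on.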


## 3.4.2 Reordering Levels

### The swaplevel method swaps two levels of a hierarchical index; the data itself is unchanged

In [8]:
df.swaplevel(0,1)

      apple       orange
        red green    red green
a one     0     1      2     3
b one     4     5      6     7
a two     8     9     10    11
b two    12    13     14    15
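As the output shows, swapping levels leaves the rows in their original order, so the new outer level is no longer grouped. A common follow-up, sketched here under the assumption that grouped output is wanted, is to call `sort_index` afterwards; the frame is rebuilt so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=[['one', 'one', 'two', 'two'], ['a', 'b', 'a', 'b']],
    columns=[['apple', 'apple', 'orange', 'orange'],
             ['red', 'green', 'red', 'green']])

swapped = df.swaplevel(0, 1)                    # returns a new object; df is untouched
sorted_swapped = swapped.sort_index(level=0)    # group rows by the new outer level
swapped_cols = df.swaplevel(0, 1, axis=1)       # the same operation on the columns
```

After sorting, the rows appear as `(a, one)`, `(a, two)`, `(b, one)`, `(b, two)`.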


## 3.4.3 Summary Statistics

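### When summarizing hierarchically indexed pandas data, the level parameter selects the level to aggregate on

In [9]:
df.groupby(level=0).sum()  # df.sum(level=0) was removed in pandas 2.0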

    apple       orange
      red green    red green
one     4     6      8    10
two    20    22     24    26


In [10]:
df.T.groupby(level=1, sort=False).sum().T  # df.sum(level=1, axis=1) was removed in pandas 2.0

       red  green
one a    2      4
    b   10     12
two a   18     20
    b   26     28
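Level numbers can be hard to read once a frame has several levels. One option, sketched below, is to name the levels and aggregate by name. The level names `number`, `letter`, `fruit`, and `color` are assumptions introduced for illustration, and the column-wise aggregation goes through a transpose, which is one of several possible ways to group along the columns.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=[['one', 'one', 'two', 'two'], ['a', 'b', 'a', 'b']],
    columns=[['apple', 'apple', 'orange', 'orange'],
             ['red', 'green', 'red', 'green']])

# Hypothetical level names, chosen only to make the grouping readable.
df.index.names = ['number', 'letter']
df.columns.names = ['fruit', 'color']

# Sum rows that share the same outer row label.
by_number = df.groupby(level='number').sum()

# Sum columns that share the same color: transpose, group rows, transpose back.
# sort=False keeps the original red, green order.
by_color = df.T.groupby(level='color', sort=False).sum().T

# Other reductions follow the same pattern.
mean_by_number = df.groupby(level='number').mean()
```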
