# VII. MultiIndex DataFrame
- An index object with multiple levels

In [1]:
import pandas as pd

## 1. Creating a MultiIndex Object

### 1-1. Converting arrays to a MultiIndex :: pd.MultiIndex.from_arrays( )
```python
pd.MultiIndex.from_arrays(
    arrays,
    names = None
)
```
- Converts arrays to a MultiIndex
- Option
    - arrays : the arrays to use as the MultiIndex, one array per level
    - names : names of the index levels (one name per level)

In [2]:
arrays = [[1, 1, 2, 2], ['red', 'blue', 'red', 'blue']]
pd.MultiIndex.from_arrays(arrays, names=('number', 'color'))

MultiIndex([(1,  'red'),
            (1, 'blue'),
            (2,  'red'),
            (2, 'blue')],
           names=['number', 'color'])

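### 1-2. Converting tuples to a MultiIndex :: pd.MultiIndex.from_tuples( )
```python
pd.MultiIndex.from_tuples(
    tuples,
    names
)
```
- Converts a list of tuples to a MultiIndex
- Option
    - tuples : the tuples to use as the MultiIndex, one tuple per index entry
    - names : names of the index levels (one name per level)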

In [3]:
address = [
    ('8809 Flair Square', ' Toddside', 'IL', '37206'),
    ('9901 Austin Street', 'Toddside',' IL', '37206'),
    ('905 Hogan Quarter', 'Franklin', 'IL', '37206')
]
row_index = pd.MultiIndex.from_tuples(
    tuples = address,
    names = ['Street','City','State','Zip']
)
row_index

MultiIndex([( '8809 Flair Square', ' Toddside',  'IL', '37206'),
            ('9901 Austin Street',  'Toddside', ' IL', '37206'),
            ( '905 Hogan Quarter',  'Franklin',  'IL', '37206')],
           names=['Street', 'City', 'State', 'Zip'])
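
The two constructors differ only in how the input is laid out: `from_arrays` takes one list per level, while `from_tuples` takes one tuple per index entry. The following is a minimal sketch with illustrative variable names (not taken from the notebook) that builds the same index both ways and compares the results.

```python
import pandas as pd

# Column-wise input: one list per level.
arrays = [[1, 1, 2, 2], ['red', 'blue', 'red', 'blue']]
from_arrays = pd.MultiIndex.from_arrays(arrays, names=['number', 'color'])

# Row-wise input: one tuple per index entry, derived from the same data.
tuples = list(zip(*arrays))
from_tuples = pd.MultiIndex.from_tuples(tuples, names=['number', 'color'])

# equals() compares the labels of the two indexes.
same = from_arrays.equals(from_tuples)
print(same)
print(from_arrays.nlevels)
```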

## 2. Creating a MultiIndex DataFrame

### 2-1. Creating a MultiIndex DataFrame with pd.DataFrame
```python
pd.DataFrame(
    data = None,
    index = None,
    columns = None,
)
```
- Passing a MultiIndex object to the index or columns option of the regular DataFrame constructor produces a MultiIndex DataFrame
- Option
    - data
    - index : a MultiIndex object
    - columns : a MultiIndex object

In [4]:
data = [
    ['A','B+'],
    ['C+','C'],
    ['D-','A'],
]

columns = ['Schools','Cost of Living']

address = [
    ('8809 Flair Square', ' Toddside', 'IL', '37206'),
    ('9901 Austin Street', 'Toddside',' IL', '37206'),
    ('905 Hogan Quarter', 'Franklin', 'IL', '37206')
]

row_index = pd.MultiIndex.from_tuples(
    tuples = address,
    names = ['Street','City','State','Zip']
)

area_grades = pd.DataFrame(
    data = data, 
    index = row_index, 
    columns = columns
)
area_grades

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,Unnamed: 3_level_0,Schools,Cost of Living
Street,City,State,Zip,Unnamed: 4_level_1,Unnamed: 5_level_1
8809 Flair Square,Toddside,IL,37206,A,B+
9901 Austin Street,Toddside,IL,37206,C+,C
905 Hogan Quarter,Franklin,IL,37206,D-,A
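
The cell above puts a MultiIndex on the rows only. As a hedged sketch of the `columns` option, the example below uses made-up category labels and a shortened two-level row index (both are assumptions for illustration) to build a frame with a MultiIndex on both axes.

```python
import pandas as pd

# Hypothetical two-level column labels (not from the original notebook).
col_index = pd.MultiIndex.from_tuples(
    [('Economy', 'Cost of Living'), ('Services', 'Schools')],
    names=['Category', 'SubCategory'],
)

# Shortened two-level row index, for illustration only.
row_index = pd.MultiIndex.from_tuples(
    [('Franklin', 'IL'), ('Toddside', 'IL')],
    names=['City', 'State'],
)

grades = pd.DataFrame(
    data=[['A', 'D-'], ['B+', 'A']],
    index=row_index,
    columns=col_index,
)

# Both axes now carry two levels.
print(grades.index.nlevels, grades.columns.nlevels)
```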


### 2-2. Creating a MultiIndex DataFrame with pd.read_csv
```python
pd.read_csv(
    filepath_or_buffer,
    index_col = None,
    header = 'infer'
)
```
- Passing a list with multiple values to index_col or header makes pandas build a MultiIndex for the DataFrame automatically
- Option
    - index_col : positions of the columns to use as the row index
    - header : positions of the rows to use as the column index

In [5]:
neighborhoods = pd.read_csv('./Data/neighborhoods.csv',
                            index_col = [0,1,2], 
                            header = [0,1])
neighborhoods

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,Culture,Culture,Services,Services
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,Restaurants,Museums,Police,Schools
State,City,Street,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2
MO,Fisherborough,244 Tracy View,C+,F,D-,A+
SD,Port Curtisville,446 Cynthia Inlet,C-,B,B,D+
WV,Jimenezview,432 John Common,A,A+,F,B
AK,Stevenshire,238 Andrew Rue,D-,A,A-,A-
ND,New Joshuaport,877 Walter Neck,D+,C-,B,B
...,...,...,...,...,...,...
MI,North Matthew,055 Clayton Isle,B-,C,B,C+
MT,Chadton,601 Richards Road,A-,D,D+,D
SC,Diazmouth,385 Robin Harbors,F,D,B-,D+
VA,Laurentown,255 Gonzalez Land,C+,B-,F,D-
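
Because `neighborhoods.csv` is not included here, the following sketch reads a small in-memory CSV instead. The text is a hypothetical stand-in that assumes the same layout: two header rows, then a row holding the index level names, then the data.

```python
import io
import pandas as pd

# Hypothetical CSV text with two header rows and two index columns
# (an assumption for illustration, not the original neighborhoods.csv).
csv_text = (
    ",,Culture,Culture\n"
    ",,Restaurants,Museums\n"
    "State,City,,\n"
    "MO,Fisherborough,C+,F\n"
    "SD,Port Curtisville,C-,B\n"
)

# index_col lists the columns that form the row MultiIndex;
# header lists the rows that form the column MultiIndex.
df = pd.read_csv(io.StringIO(csv_text), index_col=[0, 1], header=[0, 1])
print(df.index.names)
print(df.columns.tolist())
```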


## 3. Attributes of a MultiIndex DataFrame

In [6]:
neighborhoods = pd.read_csv('./Data/neighborhoods.csv',
                            index_col = [0,1,2], 
                            header = [0,1])
neighborhoods.columns.names = ['Category','SubCategory']
neighborhoods

Unnamed: 0_level_0,Unnamed: 1_level_0,Category,Culture,Culture,Services,Services
Unnamed: 0_level_1,Unnamed: 1_level_1,SubCategory,Restaurants,Museums,Police,Schools
State,City,Street,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2
MO,Fisherborough,244 Tracy View,C+,F,D-,A+
SD,Port Curtisville,446 Cynthia Inlet,C-,B,B,D+
WV,Jimenezview,432 John Common,A,A+,F,B
AK,Stevenshire,238 Andrew Rue,D-,A,A-,A-
ND,New Joshuaport,877 Walter Neck,D+,C-,B,B
...,...,...,...,...,...,...
MI,North Matthew,055 Clayton Isle,B-,C,B,C+
MT,Chadton,601 Richards Road,A-,D,D+,D
SC,Diazmouth,385 Robin Harbors,F,D,B-,D+
VA,Laurentown,255 Gonzalez Land,C+,B-,F,D-


### 3-1. Looking up MultiIndex level names :: MultiIndex.names
```python
MultiIndex.names
```
- The names of the levels in the MultiIndex
    - i.e. the label that identifies each level of the index

In [7]:
neighborhoods.index.names      # names of the row index levels

FrozenList(['State', 'City', 'Street'])

In [8]:
neighborhoods.columns.names    # names of the column index levels

FrozenList(['Category', 'SubCategory'])
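
The `names` attribute can be assigned as well as read, which is how the column level names were set in `In [6]`. A small sketch on a made-up two-row frame follows; `rename_axis` is shown as an alternative that returns a new object, and the names `Region` and `Town` are hypothetical.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('MO', 'Fisherborough'), ('SD', 'Port Curtisville')]
)
df = pd.DataFrame({'Schools': ['A+', 'D+']}, index=index)

# Levels created without names report None for each level.
print(df.index.names)

# Assigning a list to .names labels every level at once, in place.
df.index.names = ['State', 'City']

# rename_axis returns a renamed copy and leaves df untouched.
renamed = df.rename_axis(index=['Region', 'Town'])
```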

### 3-2. Looking up the values of a level :: get_level_values( )
```python
MultiIndex.get_level_values(
    level                        
)
```
- Returns a vector of the labels for the requested level
    - i.e. the index values at that level
- Option
    - level : integer position or name of the level in the MultiIndex

In [9]:
neighborhoods.index.get_level_values(1)
neighborhoods.index.get_level_values('City') # values of the 'City' level

Index(['Fisherborough', 'Port Curtisville', 'Jimenezview', 'Stevenshire',
       'New Joshuaport', 'Wellsville', 'Jodiburgh', 'Lake Christopher',
       'Port Mike', 'Hardyburgh',
       ...
       'Scottstad', 'Port Willieport', 'Port Linda', 'Kaylamouth',
       'Port Shawnfort', 'North Matthew', 'Chadton', 'Diazmouth', 'Laurentown',
       'South Kennethmouth'],
      dtype='object', name='City', length=251)
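
To make the position-versus-name equivalence concrete, here is a minimal sketch on a hypothetical three-entry index. It also chains `unique()`, which is a common follow-up when a level repeats its labels.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('MO', 'Fisherborough'), ('MO', 'East Connie'), ('SD', 'West Scott')],
    names=['State', 'City'],
)

# The same level can be requested by integer position or by name.
by_position = index.get_level_values(0)
by_name = index.get_level_values('State')

# unique() collapses the repeated labels of the level.
states = by_name.unique().tolist()
print(states)
```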

## 4. Sorting a MultiIndex :: sort_index( )
```python
DataFrame.sort_index(
    level = None,
    ascending = True,          # True : ascending | False : descending
    axis = 0,                  # 0 or 'index' | 1 or 'columns'
)
```
- Sorts by all levels by default
- Option
    - level : the level(s) to sort by
    - ascending : ascending or descending order; accepts a list with one value per level
    - axis : the axis to sort

In [10]:
neighborhoods.sort_index(
    axis = 0,                       # row index
    level = [0,1],                  # sort by 'State' and 'City'
    ascending = [True, False])      # 'State' ascending, 'City' descending

Unnamed: 0_level_0,Unnamed: 1_level_0,Category,Culture,Culture,Services,Services
Unnamed: 0_level_1,Unnamed: 1_level_1,SubCategory,Restaurants,Museums,Police,Schools
State,City,Street,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2
AK,Stevenshire,238 Andrew Rue,D-,A,A-,A-
AK,Scottstad,082 Leblanc Freeway,D,C-,D,B+
AK,Scottstad,114 Jones Garden,D-,D-,D,D
AK,Rowlandchester,386 Rebecca Cove,C-,A-,A+,C
AL,Vegaside,191 Mindy Meadows,B+,A-,A+,D+
...,...,...,...,...,...,...
WY,Port Jason,624 Faulkner Orchard,A-,F,C+,C+
WY,Martintown,013 Bell Mills,C-,D,A-,B-
WY,Lake Nicole,933 Jennifer Burg,C,A+,A-,C
WY,Lake Nicole,754 Weaver Turnpike,B,D-,B,D


In [11]:
neighborhoods.sort_index(
    axis = 1,                       # column index
    level = [0,1],                  # sort by 'Category' and 'SubCategory'
    ascending = [True,False],       # 'Category' ascending | 'SubCategory' descending
)

Unnamed: 0_level_0,Unnamed: 1_level_0,Category,Culture,Culture,Services,Services
Unnamed: 0_level_1,Unnamed: 1_level_1,SubCategory,Restaurants,Museums,Schools,Police
State,City,Street,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2
MO,Fisherborough,244 Tracy View,C+,F,A+,D-
SD,Port Curtisville,446 Cynthia Inlet,C-,B,D+,B
WV,Jimenezview,432 John Common,A,A+,B,F
AK,Stevenshire,238 Andrew Rue,D-,A,A-,A-
ND,New Joshuaport,877 Walter Neck,D+,C-,B,B
...,...,...,...,...,...,...
MI,North Matthew,055 Clayton Isle,B-,C,C+,B
MT,Chadton,601 Richards Road,A-,D,D,D+
SC,Diazmouth,385 Robin Harbors,F,D,D+,B-
VA,Laurentown,255 Gonzalez Land,C+,B-,D-,F
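
The mixed ascending and descending sort is easier to verify on a frame small enough to read at a glance. The sketch below uses three made-up rows and refers to the levels by name rather than by position.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('SD', 'West Scott'), ('MO', 'East Connie'), ('MO', 'Josephfort')],
    names=['State', 'City'],
)
df = pd.DataFrame({'Schools': ['B-', 'D+', 'D']}, index=index)

# State ascending, then City descending within each state.
sorted_df = df.sort_index(level=['State', 'City'], ascending=[True, False])
order = sorted_df.index.tolist()
print(order)
```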


## 5. Selecting Rows and Columns with a MultiIndex

### 5-1. Extracting one or more columns :: DataFrame[ ]
- Looks up the given value in the outermost level of the column MultiIndex

In [12]:
neighborhoods['Services'] # select Services from Category

Unnamed: 0_level_0,Unnamed: 1_level_0,SubCategory,Police,Schools
State,City,Street,Unnamed: 3_level_1,Unnamed: 4_level_1
MO,Fisherborough,244 Tracy View,D-,A+
SD,Port Curtisville,446 Cynthia Inlet,B,D+
WV,Jimenezview,432 John Common,F,B
AK,Stevenshire,238 Andrew Rue,A-,A-
ND,New Joshuaport,877 Walter Neck,B,B
...,...,...,...,...
MI,North Matthew,055 Clayton Isle,B,C+
MT,Chadton,601 Richards Road,D+,D
SC,Diazmouth,385 Robin Harbors,B-,D+
VA,Laurentown,255 Gonzalez Land,F,D-


In [13]:
neighborhoods[('Services','Schools')] # select Services from Category and Schools from SubCategory

State  City                Street           
MO     Fisherborough       244 Tracy View       A+
SD     Port Curtisville    446 Cynthia Inlet    D+
WV     Jimenezview         432 John Common       B
AK     Stevenshire         238 Andrew Rue       A-
ND     New Joshuaport      877 Walter Neck       B
                                                ..
MI     North Matthew       055 Clayton Isle     C+
MT     Chadton             601 Richards Road     D
SC     Diazmouth           385 Robin Harbors    D+
VA     Laurentown          255 Gonzalez Land    D-
NE     South Kennethmouth  346 Wallace Pass     A-
Name: (Services, Schools), Length: 251, dtype: object

### 5-2. Selecting multiple columns :: DataFrame[ [ ] ]
- To extract multiple columns, pass a list of tuples inside the square brackets

In [14]:
neighborhoods[[('Services','Schools'),('Culture','Museums')]]

Unnamed: 0_level_0,Unnamed: 1_level_0,Category,Services,Culture
Unnamed: 0_level_1,Unnamed: 1_level_1,SubCategory,Schools,Museums
State,City,Street,Unnamed: 3_level_2,Unnamed: 4_level_2
MO,Fisherborough,244 Tracy View,A+,F
SD,Port Curtisville,446 Cynthia Inlet,D+,B
WV,Jimenezview,432 John Common,B,A+
AK,Stevenshire,238 Andrew Rue,A-,A
ND,New Joshuaport,877 Walter Neck,B,C-
...,...,...,...,...
MI,North Matthew,055 Clayton Isle,C+,C
MT,Chadton,601 Richards Road,D,D
SC,Diazmouth,385 Robin Harbors,D+,D
VA,Laurentown,255 Gonzalez Land,D-,B-
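
The three bracket forms return different shapes, which is easy to miss in the long outputs above. A compact sketch on a hypothetical two-row frame with the same column layout:

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [('Culture', 'Museums'), ('Culture', 'Restaurants'),
     ('Services', 'Police'), ('Services', 'Schools')],
    names=['Category', 'SubCategory'],
)
df = pd.DataFrame(
    [['F', 'C+', 'D-', 'A+'], ['B', 'C-', 'B', 'D+']],
    columns=columns,
)

# An outer-level label returns a DataFrame with that level dropped.
outer = df['Services']

# A full tuple identifies exactly one column and returns a Series.
single = df[('Services', 'Schools')]

# A list of tuples returns a DataFrame that keeps both column levels.
several = df[[('Services', 'Schools'), ('Culture', 'Museums')]]
```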


### 5-3. Extracting rows and columns by index label :: DataFrame.loc[ ]
```python
DataFrame.loc[
    row_index_labels,
    column_index_labels
]
```
- Extracts rows and columns by index label

In [15]:
neighborhoods.loc['MO']    # a single label from the outermost row level
neighborhoods.loc['MO',:] 

Unnamed: 0_level_0,Category,Culture,Culture,Services,Services
Unnamed: 0_level_1,SubCategory,Restaurants,Museums,Police,Schools
City,Street,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2
Fisherborough,244 Tracy View,C+,F,D-,A+
East Connie,798 Joseph Orchard,B,D,A+,D+
Port Elizabeth,072 Mariah Creek,C,C-,D-,A
Hendersonland,984 Williams Road,B+,A-,D-,A+
New Bailey,424 Marissa Underpass,C-,F,F,B+
Josephfort,259 Robles Turnpike,D,A-,B-,D


In [16]:
neighborhoods.loc[['MO','SD']]    # several labels from the outermost row level
neighborhoods.loc[['MO','SD'],:]

Unnamed: 0_level_0,Unnamed: 1_level_0,Category,Culture,Culture,Services,Services
Unnamed: 0_level_1,Unnamed: 1_level_1,SubCategory,Restaurants,Museums,Police,Schools
State,City,Street,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2
MO,Fisherborough,244 Tracy View,C+,F,D-,A+
MO,East Connie,798 Joseph Orchard,B,D,A+,D+
MO,Port Elizabeth,072 Mariah Creek,C,C-,D-,A
MO,Hendersonland,984 Williams Road,B+,A-,D-,A+
MO,New Bailey,424 Marissa Underpass,C-,F,F,B+
MO,Josephfort,259 Robles Turnpike,D,A-,B-,D
SD,Port Curtisville,446 Cynthia Inlet,C-,B,B,D+
SD,West Scott,139 Hardy Vista,C+,A-,D+,B-
SD,Port Susanborough,445 Moreno Port,A,A+,C,C


In [17]:
neighborhoods.loc[('MO','East Connie','798 Joseph Orchard')]  # a tuple with one label per row level
neighborhoods.loc[('MO','East Connie','798 Joseph Orchard'),:]

Category  SubCategory
Culture   Restaurants     B
          Museums         D
Services  Police         A+
          Schools        D+
Name: (MO, East Connie, 798 Joseph Orchard), dtype: object

In [18]:
neighborhoods.loc[:,'Culture']      # a single label from the outermost column level

Unnamed: 0_level_0,Unnamed: 1_level_0,SubCategory,Restaurants,Museums
State,City,Street,Unnamed: 3_level_1,Unnamed: 4_level_1
MO,Fisherborough,244 Tracy View,C+,F
SD,Port Curtisville,446 Cynthia Inlet,C-,B
WV,Jimenezview,432 John Common,A,A+
AK,Stevenshire,238 Andrew Rue,D-,A
ND,New Joshuaport,877 Walter Neck,D+,C-
...,...,...,...,...
MI,North Matthew,055 Clayton Isle,B-,C
MT,Chadton,601 Richards Road,A-,D
SC,Diazmouth,385 Robin Harbors,F,D
VA,Laurentown,255 Gonzalez Land,C+,B-


In [19]:
neighborhoods.loc[:,('Culture','Museums')]  # a tuple with one label per column level

State  City                Street           
MO     Fisherborough       244 Tracy View        F
SD     Port Curtisville    446 Cynthia Inlet     B
WV     Jimenezview         432 John Common      A+
AK     Stevenshire         238 Andrew Rue        A
ND     New Joshuaport      877 Walter Neck      C-
                                                ..
MI     North Matthew       055 Clayton Isle      C
MT     Chadton             601 Richards Road     D
SC     Diazmouth           385 Robin Harbors     D
VA     Laurentown          255 Gonzalez Land    B-
NE     South Kennethmouth  346 Wallace Pass     B-
Name: (Culture, Museums), Length: 251, dtype: object

In [20]:
neighborhoods.loc[:,['Culture','Services']]   # several labels from the outermost column level

Unnamed: 0_level_0,Unnamed: 1_level_0,Category,Culture,Culture,Services,Services
Unnamed: 0_level_1,Unnamed: 1_level_1,SubCategory,Restaurants,Museums,Police,Schools
State,City,Street,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2
MO,Fisherborough,244 Tracy View,C+,F,D-,A+
SD,Port Curtisville,446 Cynthia Inlet,C-,B,B,D+
WV,Jimenezview,432 John Common,A,A+,F,B
AK,Stevenshire,238 Andrew Rue,D-,A,A-,A-
ND,New Joshuaport,877 Walter Neck,D+,C-,B,B
...,...,...,...,...,...,...
MI,North Matthew,055 Clayton Isle,B-,C,B,C+
MT,Chadton,601 Richards Road,A-,D,D+,D
SC,Diazmouth,385 Robin Harbors,F,D,B-,D+
VA,Laurentown,255 Gonzalez Land,C+,B-,F,D-
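
The cells above select along one axis at a time. As a sketch of combining both, the example below passes a row tuple and a column tuple to `loc` together to address a single cell. The frame is a made-up three-row stand-in with two row levels instead of three.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('MO', 'East Connie'), ('MO', 'Josephfort'), ('SD', 'West Scott')],
    names=['State', 'City'],
)
columns = pd.MultiIndex.from_tuples(
    [('Culture', 'Museums'), ('Services', 'Schools')],
    names=['Category', 'SubCategory'],
)
df = pd.DataFrame(
    [['D', 'D+'], ['A-', 'D'], ['A-', 'B-']],
    index=index,
    columns=columns,
)

# A full row tuple plus a full column tuple addresses one cell.
cell = df.loc[('MO', 'Josephfort'), ('Services', 'Schools')]

# An outer-level label alone keeps the remaining row level as the index.
mo = df.loc['MO']
print(cell)
```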


### 5-4. Extracting rows and columns by position :: DataFrame.iloc[ ]

```python
DataFrame.iloc[
    row_positions,
    column_positions
]
```
- Extracts rows and columns by integer position

In [21]:
neighborhoods

Unnamed: 0_level_0,Unnamed: 1_level_0,Category,Culture,Culture,Services,Services
Unnamed: 0_level_1,Unnamed: 1_level_1,SubCategory,Restaurants,Museums,Police,Schools
State,City,Street,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2
MO,Fisherborough,244 Tracy View,C+,F,D-,A+
SD,Port Curtisville,446 Cynthia Inlet,C-,B,B,D+
WV,Jimenezview,432 John Common,A,A+,F,B
AK,Stevenshire,238 Andrew Rue,D-,A,A-,A-
ND,New Joshuaport,877 Walter Neck,D+,C-,B,B
...,...,...,...,...,...,...
MI,North Matthew,055 Clayton Isle,B-,C,B,C+
MT,Chadton,601 Richards Road,A-,D,D+,D
SC,Diazmouth,385 Robin Harbors,F,D,B-,D+
VA,Laurentown,255 Gonzalez Land,C+,B-,F,D-


In [22]:
neighborhoods.iloc[1,:]  # second row (positions start at 0)

Category  SubCategory
Culture   Restaurants    C-
          Museums         B
Services  Police          B
          Schools        D+
Name: (SD, Port Curtisville, 446 Cynthia Inlet), dtype: object

In [23]:
neighborhoods.iloc[:,1] # second column (positions start at 0)

State  City                Street           
MO     Fisherborough       244 Tracy View        F
SD     Port Curtisville    446 Cynthia Inlet     B
WV     Jimenezview         432 John Common      A+
AK     Stevenshire         238 Andrew Rue        A
ND     New Joshuaport      877 Walter Neck      C-
                                                ..
MI     North Matthew       055 Clayton Isle      C
MT     Chadton             601 Richards Road     D
SC     Diazmouth           385 Robin Harbors     D
VA     Laurentown          255 Gonzalez Land    B-
NE     South Kennethmouth  346 Wallace Pass     B-
Name: (Culture, Museums), Length: 251, dtype: object
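
Since `iloc` counts from 0, position 1 is the second row or column. The following sketch, on a hypothetical three-row frame, shows the zero-based positions next to a scalar lookup and a slice.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('MO', 'East Connie'), ('MO', 'Josephfort'), ('SD', 'West Scott')],
    names=['State', 'City'],
)
columns = pd.MultiIndex.from_tuples(
    [('Culture', 'Museums'), ('Services', 'Schools')],
    names=['Category', 'SubCategory'],
)
df = pd.DataFrame(
    [['D', 'D+'], ['A-', 'D'], ['A-', 'B-']],
    index=index,
    columns=columns,
)

first_row = df.iloc[0]        # position 0 is the first row
second_row = df.iloc[1]       # position 1 is the second row
cell = df.iloc[2, 0]          # third row, first column
block = df.iloc[0:2, [1]]     # first two rows, second column, as a DataFrame
```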

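## 6. Changing the Order of Index Levels :: DataFrame.reorder_levels( )
```python
DataFrame.reorder_levels(
    order
)
```
- Rearranges the index levels using the given order
- Option
    - order : the new order of the levels, given as level names or integer positions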

In [24]:
neighborhoods

Unnamed: 0_level_0,Unnamed: 1_level_0,Category,Culture,Culture,Services,Services
Unnamed: 0_level_1,Unnamed: 1_level_1,SubCategory,Restaurants,Museums,Police,Schools
State,City,Street,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2
MO,Fisherborough,244 Tracy View,C+,F,D-,A+
SD,Port Curtisville,446 Cynthia Inlet,C-,B,B,D+
WV,Jimenezview,432 John Common,A,A+,F,B
AK,Stevenshire,238 Andrew Rue,D-,A,A-,A-
ND,New Joshuaport,877 Walter Neck,D+,C-,B,B
...,...,...,...,...,...,...
MI,North Matthew,055 Clayton Isle,B-,C,B,C+
MT,Chadton,601 Richards Road,A-,D,D+,D
SC,Diazmouth,385 Robin Harbors,F,D,B-,D+
VA,Laurentown,255 Gonzalez Land,C+,B-,F,D-


In [25]:
new_order = ['City','State','Street']
neighborhoods.reorder_levels(order = new_order)
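
The keyword that `reorder_levels` accepts is `order`. A minimal sketch on a made-up two-row frame shows that level names and integer positions give the same result, and that the method returns a new DataFrame rather than changing the original.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('MO', 'East Connie', '798 Joseph Orchard'),
     ('SD', 'West Scott', '139 Hardy Vista')],
    names=['State', 'City', 'Street'],
)
df = pd.DataFrame({'Schools': ['D+', 'B-']}, index=index)

# The new order can be given as level names...
by_name = df.reorder_levels(order=['City', 'State', 'Street'])

# ...or as the current integer positions of the levels.
by_position = df.reorder_levels(order=[1, 0, 2])

print(by_name.index.names)
```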