# Join, Combine, Reshape

## Hierarchical Indices and Series objects

A hierarchical index (a `MultiIndex` in pandas) lets you define multiple index levels on a single axis. The following example sets up a two-level hierarchical index on a Series object.

In [4]:
import numpy as np
import pandas as pd

data = np.random.standard_normal(10)
index = [
    ['BT.L', 'BT.L', 'BT.L', 'VOD.L', 'VOD.L', 'VOD.L', '.FTSE', '.FTSE', '.FTSE', '.SX5E'],
    ['bid', 'ask', 'mid', 'bid', 'ask', 'mid', 'bid', 'ask', 'mid', 'bid'],
]

series = pd.Series(data, index=index)
series = series.sort_index()
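
To see what the nested lists produce, it can help to inspect the index directly. The sketch below rebuilds the Series with placeholder values (`np.arange` rather than random numbers, so the result is reproducible) and looks at the resulting `MultiIndex`. The names `values`, `labels`, and `example` are illustrative and not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Placeholder values (not the random data above) so the result is reproducible.
values = np.arange(10, dtype=float)
labels = [
    ['BT.L', 'BT.L', 'BT.L', 'VOD.L', 'VOD.L', 'VOD.L', '.FTSE', '.FTSE', '.FTSE', '.SX5E'],
    ['bid', 'ask', 'mid', 'bid', 'ask', 'mid', 'bid', 'ask', 'mid', 'bid'],
]

example = pd.Series(values, index=labels).sort_index()

# A list of two label arrays becomes a two-level MultiIndex.
n_levels = example.index.nlevels

# Distinct outer-level labels, in sorted order.
outer_labels = list(example.index.get_level_values(0).unique())

# Each row label is a tuple of (outer, inner).
first_label = example.index[0]

print(n_levels, outer_labels, first_label)
```

Note that labels starting with `.` sort before those starting with a letter, which is why `.FTSE` and `.SX5E` come first after `sort_index()`.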


### Index by outer level
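We can select a subset of rows by outer-level key, using a single label (I), a list of labels (II), or a label slice (III, IV). Label slices include both endpoints.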
#### (I)

In [38]:
series["BT.L"]

ask   -1.135774
bid   -0.451556
mid    0.194875
dtype: float64

#### (II)

In [40]:
series[['BT.L', '.SX5E']]

BT.L   ask   -1.135774
       bid   -0.451556
       mid    0.194875
.SX5E  bid   -0.123017
dtype: float64

#### (III)

In [42]:
series['.FTSE': 'BT.L']

.FTSE  ask    0.780578
       bid   -0.500866
       mid   -0.950928
.SX5E  bid   -0.123017
BT.L   ask   -1.135774
       bid   -0.451556
       mid    0.194875
dtype: float64

#### (IV)

In [46]:
series.loc['.FTSE': 'BT.L']

.FTSE  ask    0.780578
       bid   -0.500866
       mid   -0.950928
.SX5E  bid   -0.123017
BT.L   ask   -1.135774
       bid   -0.451556
       mid    0.194875
dtype: float64
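
The label slices in (III) and (IV) rely on the index being sorted, which is why `sort_index()` was called when the Series was built. The following sketch, using placeholder values and illustrative variable names, compares the index before and after sorting. To my knowledge pandas raises `UnsortedIndexError` for a label slice on an unsorted `MultiIndex`; treat that as an assumption about the installed version, which the code checks rather than takes for granted.

```python
import numpy as np
import pandas as pd

labels = [
    ['BT.L', 'BT.L', 'BT.L', 'VOD.L', 'VOD.L', 'VOD.L', '.FTSE', '.FTSE', '.FTSE', '.SX5E'],
    ['bid', 'ask', 'mid', 'bid', 'ask', 'mid', 'bid', 'ask', 'mid', 'bid'],
]

# Placeholder values; the labels are deliberately left in construction order.
unsorted = pd.Series(np.arange(10, dtype=float), index=labels)
print(unsorted.index.is_monotonic_increasing)

# Attempt a label slice on the unsorted index and record whether it fails.
try:
    unsorted.loc['.FTSE':'BT.L']
    slice_failed = False
except pd.errors.UnsortedIndexError:
    slice_failed = True
print(slice_failed)

# After sorting, the same slice selects every row from '.FTSE' through 'BT.L'.
ordered = unsorted.sort_index()
subset = ordered.loc['.FTSE':'BT.L']
print(subset)
```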

### Index by inner level
We can also select by inner level, passing `:` for the outer level followed by a label slice (I) or a list of labels (II) for the inner level.

#### (I)

In [47]:
series.loc[:,'ask':'bid']

.FTSE  ask    0.780578
       bid   -0.500866
.SX5E  bid   -0.123017
BT.L   ask   -1.135774
       bid   -0.451556
VOD.L  ask    0.007611
       bid    0.744661
dtype: float64

#### (II)

In [48]:
series.loc[:,['ask','bid']]

.FTSE  ask    0.780578
       bid   -0.500866
.SX5E  bid   -0.123017
BT.L   ask   -1.135774
       bid   -0.451556
VOD.L  ask    0.007611
       bid    0.744661
dtype: float64
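
Both forms above keep the inner level in the result. When only a single inner label is needed, a scalar key or `Series.xs` returns a Series indexed by the outer level alone. This is a minimal sketch with placeholder values and illustrative names, not a cell from the original notebook.

```python
import numpy as np
import pandas as pd

labels = [
    ['BT.L', 'BT.L', 'BT.L', 'VOD.L', 'VOD.L', 'VOD.L', '.FTSE', '.FTSE', '.FTSE', '.SX5E'],
    ['bid', 'ask', 'mid', 'bid', 'ask', 'mid', 'bid', 'ask', 'mid', 'bid'],
]
# Placeholder values so the selected numbers are easy to trace back.
example = pd.Series(np.arange(10, dtype=float), index=labels).sort_index()

# Cross-section on the inner level: the 'bid' level is dropped from the result.
bids = example.xs('bid', level=1)

# The same selection written with .loc and a scalar inner key.
bids_loc = example.loc[:, 'bid']

print(bids)
```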

### Stacking and Unstacking

#### Unstacking

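We can unstack a Series into a DataFrame. The outer level becomes the DataFrame index and the inner level becomes the column index. Combinations that are missing from the Series, such as the ask and mid for `.SX5E`, appear as missing values (NaN).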

In [52]:
series.unstack()

            ask       bid       mid
.FTSE  0.780578 -0.500866 -0.950928
.SX5E       NaN -0.123017       NaN
BT.L  -1.135774 -0.451556  0.194875
VOD.L  0.007611  0.744661  0.027464
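
Stacking is the reverse operation: `DataFrame.stack()` moves the column labels back into an inner index level. The sketch below uses a complete set of placeholder values (three symbols with all three sides, so no missing values are involved) and also shows `unstack(level=0)`, which pivots the outer level into the columns instead. Variable names are illustrative.

```python
import numpy as np
import pandas as pd

labels = [
    ['BT.L'] * 3 + ['VOD.L'] * 3 + ['.FTSE'] * 3,
    ['bid', 'ask', 'mid'] * 3,
]
# Placeholder values; every symbol has bid, ask and mid.
example = pd.Series(np.arange(9, dtype=float), index=labels).sort_index()

# Default: the inner level becomes the columns.
wide = example.unstack()

# Pivot the outer level into the columns instead.
wide_outer = example.unstack(level=0)

# Stack the columns back into an inner index level.
roundtrip = wide.stack()

print(wide)
print(wide_outer)
print(roundtrip.sort_index())
```

Handling of missing values during `stack()` has changed between pandas versions, so the round trip is only shown here for data without gaps.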


## Hierarchical Indices and DataFrame objects

Both the row index and the column index of a DataFrame can be hierarchical.

In [36]:
data = np.random.standard_normal(16).reshape(4,4)

index = [['GBP','GBP','USD','USD'],['VOD','BT','MSFT', 'AAA']]
colIndex = [['2021-01-01','2021-01-01','2021-01-02','2021-01-02'],['bid','ask','bid','ask']]

df1 = pd.DataFrame(data,index=index, columns=colIndex)
df1 = df1.sort_index()
df1

         2021-01-01           2021-01-02
                bid       ask        bid       ask
GBP BT     0.015253  1.284388   0.070362  0.347758
    VOD    0.024992 -1.129850  -1.517904  0.629708
USD AAA   -1.257783 -0.389574   0.402573  0.046667
    MSFT  -1.108315  1.410324   1.896382  1.209533


### Indexing
Indexing a DataFrame with multi-level indices on both rows and columns is more syntactically involved. The key point is that each axis takes a tuple of labels, one per level.

In [29]:
df1.loc[('GBP','VOD'),('2021-01-01','bid')]

1.0419965575377923
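
When a selection should span every label of one level, `pd.IndexSlice` offers a more readable way to build the per-axis tuples. The following is a sketch under stated assumptions: the values are placeholders from `np.arange`, the names `frame` and `idx` are illustrative, and both axes are sorted first to keep label-based selection on the `MultiIndex` well defined.

```python
import numpy as np
import pandas as pd

# Placeholder values in place of the random data used above.
values = np.arange(16, dtype=float).reshape(4, 4)
rows = [['GBP', 'GBP', 'USD', 'USD'], ['VOD', 'BT', 'MSFT', 'AAA']]
cols = [['2021-01-01', '2021-01-01', '2021-01-02', '2021-01-02'],
        ['bid', 'ask', 'bid', 'ask']]

frame = pd.DataFrame(values, index=rows, columns=cols)
frame = frame.sort_index(axis=0).sort_index(axis=1)

# One tuple per axis selects a single cell.
cell = frame.loc[('GBP', 'VOD'), ('2021-01-01', 'bid')]

# A full row: give the row tuple and ':' for the columns.
row = frame.loc[('GBP', 'VOD'), :]

# IndexSlice: all GBP rows, and the 'bid' column for every date.
idx = pd.IndexSlice
bids = frame.loc[idx['GBP', :], idx[:, 'bid']]

print(cell)
print(row)
print(bids)
```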

#### Slices

In [37]:
df1.loc[('GBP','BT'):('GBP','VOD')]

         2021-01-01           2021-01-02
                bid       ask        bid       ask
GBP BT     0.015253  1.284388   0.070362  0.347758
    VOD    0.024992 -1.129850  -1.517904  0.629708
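
Tuple slices include both endpoints and can cross outer-level boundaries. The sketch below, again with placeholder values and illustrative names, contrasts a tuple slice (which keeps both index levels) with selecting by the outer label alone (which drops the outer level).

```python
import numpy as np
import pandas as pd

# Placeholder values in place of the random data used above.
values = np.arange(16, dtype=float).reshape(4, 4)
rows = [['GBP', 'GBP', 'USD', 'USD'], ['VOD', 'BT', 'MSFT', 'AAA']]
cols = [['2021-01-01', '2021-01-01', '2021-01-02', '2021-01-02'],
        ['bid', 'ask', 'bid', 'ask']]

frame = pd.DataFrame(values, index=rows, columns=cols)
frame = frame.sort_index(axis=0).sort_index(axis=1)

# Slice within one outer label: both index levels are kept.
gbp_slice = frame.loc[('GBP', 'BT'):('GBP', 'VOD')]

# Select by outer label only: the outer level is dropped.
gbp = frame.loc['GBP']

# Slice across outer labels: from (GBP, VOD) through (USD, AAA) inclusive.
across = frame.loc[('GBP', 'VOD'):('USD', 'AAA')]

print(gbp_slice)
print(gbp)
print(across)
```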
