# MultiIndex DataFrames

We need only one label or one index position to locate a value in a Series. We need two reference points to locate a value in a DataFrame: a label/index for the rows and a label/index for the columns. Can we expand beyond two dimensions? Absolutely! Pandas supports data sets with any number of dimensions through the use of a MultiIndex.

A MultiIndex is an index object that holds multiple levels.

Each level stores one value for every row (or column) that the index labels.

A MultiIndex is also ideal for hierarchical data: data in which one column’s values are a subcategory of another column’s values.

In [13]:
import pandas as pd
import numpy as np

Let's create a MultiIndex from scratch.

In [14]:
addresses = [
    ('8809 Flair Square', 'Toddside', 'IL', '37206'),
    ('9901 Austin Street', 'Toddside', 'IL', '37206'),
    ('905 Hogan Quarter', 'Franklin', 'IL', '37206')
]
addresses

[('8809 Flair Square', 'Toddside', 'IL', '37206'),
 ('9901 Austin Street', 'Toddside', 'IL', '37206'),
 ('905 Hogan Quarter', 'Franklin', 'IL', '37206')]

In [15]:
pd.MultiIndex.from_tuples(addresses)

MultiIndex([( '8809 Flair Square', 'Toddside', 'IL', '37206'),
            ('9901 Austin Street', 'Toddside', 'IL', '37206'),
            ( '905 Hogan Quarter', 'Franklin', 'IL', '37206')],
           )

In pandas terminology, the collection of tuple values at the same position forms a level
of the MultiIndex.
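
To make the idea of a level concrete, the sketch below rebuilds the address index and inspects it. `nlevels` and `get_level_values` are standard `MultiIndex` members; the variable name `address_index` is only illustrative.

```python
import pandas as pd

addresses = [
    ('8809 Flair Square', 'Toddside', 'IL', '37206'),
    ('9901 Austin Street', 'Toddside', 'IL', '37206'),
    ('905 Hogan Quarter', 'Franklin', 'IL', '37206'),
]
address_index = pd.MultiIndex.from_tuples(addresses)

# Each tuple has four elements, so the index has four levels.
print(address_index.nlevels)

# Level 1 gathers the second element of every tuple (the cities).
cities = address_index.get_level_values(1)
print(list(cities))
```

Levels are numbered from 0, so level 0 holds the streets and level 3 holds the zip codes.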

We can assign a name to each MultiIndex level.

In [16]:
row_index = pd.MultiIndex.from_tuples(addresses, names=['Street', 'City', 'State', 'Zip'])
row_index

MultiIndex([( '8809 Flair Square', 'Toddside', 'IL', '37206'),
            ('9901 Austin Street', 'Toddside', 'IL', '37206'),
            ( '905 Hogan Quarter', 'Franklin', 'IL', '37206')],
           names=['Street', 'City', 'State', 'Zip'])

Attaching our MultiIndex to a DataFrame:

In [17]:
data = [
    ['A', 'B+'],
    ['C+', 'C'],
    ['D-', 'A']
]
columns = ['Schools', 'Cost Of Living']

area_grades = pd.DataFrame(data, index=row_index, columns=columns)
area_grades

,,,,Schools,Cost Of Living
Street,City,State,Zip,,
8809 Flair Square,Toddside,IL,37206,A,B+
9901 Austin Street,Toddside,IL,37206,C+,C
905 Hogan Quarter,Franklin,IL,37206,D-,A


In [18]:
area_grades.columns

Index(['Schools', 'Cost Of Living'], dtype='object')

In [19]:
column_index = pd.MultiIndex.from_tuples(
    [
        ('Culture', 'Restaurants'),
        ('Culture', 'Museums'),
        ('Services', 'Police'),
        ('Services', 'School')
    ]
)
column_index

MultiIndex([( 'Culture', 'Restaurants'),
            ( 'Culture',     'Museums'),
            ('Services',      'Police'),
            ('Services',      'School')],
           )

Now we need a DataFrame with four columns because column_index holds four tuples.

In [20]:
data = [
    ['C-', 'B+', 'B-', 'A'],
    ['D+', 'C', 'A', 'C+'],
    ['A-', 'A', 'D+', 'F']
]

pd.DataFrame(data, row_index, column_index)

,,,,Culture,Culture,Services,Services
,,,,Restaurants,Museums,Police,School
Street,City,State,Zip,,,,
8809 Flair Square,Toddside,IL,37206,C-,B+,B-,A
9901 Austin Street,Toddside,IL,37206,D+,C,A,C+
905 Hogan Quarter,Franklin,IL,37206,A-,A,D+,F


# Working with neighborhoods.csv

In [21]:
neighbors = pd.read_csv('/home/diego/Documents/Data/neighborhoods.csv', index_col=[0, 1, 2], header=[0, 1])
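
The `index_col` and `header` arguments tell `read_csv` which CSV columns and which CSV lines to turn into MultiIndex levels. As a rough sketch, the example below writes a hypothetical two-row sample to an in-memory CSV and reads it back with the same arguments; the names `sample`, `csv_text`, and `restored` are illustrative and not part of the original notebook.

```python
import io
import pandas as pd

# Hypothetical two-row sample, written out by pandas itself so the CSV layout
# matches what read_csv expects for multi-level rows and columns.
rows = pd.MultiIndex.from_tuples(
    [('MO', 'Fisherborough', '244 Tracy View'),
     ('SD', 'Port Curtisville', '446 Cynthia Inlet')],
    names=['State', 'City', 'Street'],
)
cols = pd.MultiIndex.from_tuples(
    [('Culture', 'Restaurants'), ('Culture', 'Museums'),
     ('Services', 'Police'), ('Services', 'Schools')]
)
sample = pd.DataFrame(
    [['C+', 'F', 'D-', 'A+'], ['C-', 'B', 'B', 'D+']],
    index=rows, columns=cols,
)
csv_text = sample.to_csv()

# index_col=[0, 1, 2]: the first three CSV columns become the row MultiIndex.
# header=[0, 1]: the first two CSV lines become the column MultiIndex.
restored = pd.read_csv(io.StringIO(csv_text), index_col=[0, 1, 2], header=[0, 1])

print(restored.index.nlevels, restored.columns.nlevels)
```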

In [22]:
neighbors.info()

<class 'pandas.core.frame.DataFrame'>
MultiIndex: 251 entries, ('MO', 'Fisherborough', '244 Tracy View') to ('NE', 'South Kennethmouth', '346 Wallace Pass')
Data columns (total 4 columns):
 #   Column                  Non-Null Count  Dtype 
---  ------                  --------------  ----- 
 0   (Culture, Restaurants)  251 non-null    object
 1   (Culture, Museums)      251 non-null    object
 2   (Services, Police)      251 non-null    object
 3   (Services, Schools)     251 non-null    object
dtypes: object(4)
memory usage: 27.1+ KB


In [23]:
neighbors

,,,Culture,Culture,Services,Services
,,,Restaurants,Museums,Police,Schools
State,City,Street,,,,
MO,Fisherborough,244 Tracy View,C+,F,D-,A+
SD,Port Curtisville,446 Cynthia Inlet,C-,B,B,D+
WV,Jimenezview,432 John Common,A,A+,F,B
AK,Stevenshire,238 Andrew Rue,D-,A,A-,A-
ND,New Joshuaport,877 Walter Neck,D+,C-,B,B
...,...,...,...,...,...,...
MI,North Matthew,055 Clayton Isle,B-,C,B,C+
MT,Chadton,601 Richards Road,A-,D,D+,D
SC,Diazmouth,385 Robin Harbors,F,D,B-,D+
VA,Laurentown,255 Gonzalez Land,C+,B-,F,D-


In [24]:
neighbors.index

MultiIndex([('MO',      'Fisherborough',        '244 Tracy View'),
            ('SD',   'Port Curtisville',     '446 Cynthia Inlet'),
            ('WV',        'Jimenezview',       '432 John Common'),
            ('AK',        'Stevenshire',        '238 Andrew Rue'),
            ('ND',     'New Joshuaport',       '877 Walter Neck'),
            ('ID',         'Wellsville',   '696 Weber Stravenue'),
            ('TN',          'Jodiburgh',    '285 Justin Corners'),
            ('DC',   'Lake Christopher',   '607 Montoya Harbors'),
            ('OH',          'Port Mike',      '041 Michael Neck'),
            ('ND',         'Hardyburgh', '550 Gilmore Mountains'),
            ...
            ('AK',          'Scottstad',      '114 Jones Garden'),
            ('IA',    'Port Willieport',  '320 Jennifer Mission'),
            ('ME',         'Port Linda',        '692 Hill Glens'),
            ('KS',         'Kaylamouth',       '483 Freeman Via'),
            ('WA',     'Port Shawnfort',    '6

In [25]:
neighbors.columns

MultiIndex([( 'Culture', 'Restaurants'),
            ( 'Culture',     'Museums'),
            ('Services',      'Police'),
            ('Services',     'Schools')],
           )

In [26]:
neighbors.index.get_level_values(1)

Index(['Fisherborough', 'Port Curtisville', 'Jimenezview', 'Stevenshire',
       'New Joshuaport', 'Wellsville', 'Jodiburgh', 'Lake Christopher',
       'Port Mike', 'Hardyburgh',
       ...
       'Scottstad', 'Port Willieport', 'Port Linda', 'Kaylamouth',
       'Port Shawnfort', 'North Matthew', 'Chadton', 'Diazmouth', 'Laurentown',
       'South Kennethmouth'],
      dtype='object', name='City', length=251)
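
`get_level_values` accepts either the level's position or its name. A small sketch on a hypothetical three-row index (the variable names are illustrative):

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('AK', 'Scottstad', '082 Leblanc Freeway'),
     ('AK', 'Scottstad', '114 Jones Garden'),
     ('WY', 'Lake Nicole', '754 Weaver Turnpike')],
    names=['State', 'City', 'Street'],
)

by_position = index.get_level_values(1)
by_name = index.get_level_values('City')

# Both calls return the same Index of city labels, duplicates included.
print(by_position.equals(by_name))

# unique() collapses the repeated labels.
print(list(by_name.unique()))
```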

In [27]:
neighbors.columns.names = ['Category', 'Subcategory']

In [28]:
neighbors.columns.names

FrozenList(['Category', 'Subcategory'])
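
Assigning to `columns.names` changes the DataFrame in place. If a modified copy is preferred, `rename_axis` is one alternative; the sketch below uses a hypothetical one-row frame called `grades`.

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [('Culture', 'Restaurants'), ('Culture', 'Museums'),
     ('Services', 'Police'), ('Services', 'Schools')]
)
grades = pd.DataFrame([['C+', 'F', 'D-', 'A+']], columns=columns)

# rename_axis returns a new DataFrame and leaves `grades` untouched.
named = grades.rename_axis(columns=['Category', 'Subcategory'])

print(list(named.columns.names))
print(list(grades.columns.names))
```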

In [29]:
neighbors

,,Category,Culture,Culture,Services,Services
,,Subcategory,Restaurants,Museums,Police,Schools
State,City,Street,,,,
MO,Fisherborough,244 Tracy View,C+,F,D-,A+
SD,Port Curtisville,446 Cynthia Inlet,C-,B,B,D+
WV,Jimenezview,432 John Common,A,A+,F,B
AK,Stevenshire,238 Andrew Rue,D-,A,A-,A-
ND,New Joshuaport,877 Walter Neck,D+,C-,B,B
...,...,...,...,...,...,...
MI,North Matthew,055 Clayton Isle,B-,C,B,C+
MT,Chadton,601 Richards Road,A-,D,D+,D
SC,Diazmouth,385 Robin Harbors,F,D,B-,D+
VA,Laurentown,255 Gonzalez Land,C+,B-,F,D-


In [30]:
neighbors.nunique()

Category  Subcategory
Culture   Restaurants    13
          Museums        13
Services  Police         13
          Schools        13
dtype: int64

In [31]:
neighbors.sort_index()

,,Category,Culture,Culture,Services,Services
,,Subcategory,Restaurants,Museums,Police,Schools
State,City,Street,,,,
AK,Rowlandchester,386 Rebecca Cove,C-,A-,A+,C
AK,Scottstad,082 Leblanc Freeway,D,C-,D,B+
AK,Scottstad,114 Jones Garden,D-,D-,D,D
AK,Stevenshire,238 Andrew Rue,D-,A,A-,A-
AL,Clarkland,430 Douglas Mission,A,F,C+,B+
...,...,...,...,...,...,...
WY,Lake Nicole,754 Weaver Turnpike,B,D-,B,D
WY,Lake Nicole,933 Jennifer Burg,C,A+,A-,C
WY,Martintown,013 Bell Mills,C-,D,A-,B-
WY,Port Jason,624 Faulkner Orchard,A-,F,C+,C+


Sorting by a specific index level

In [32]:
neighbors.sort_index(level=['City'])

,,Category,Culture,Culture,Services,Services
,,Subcategory,Restaurants,Museums,Police,Schools
State,City,Street,,,,
AR,Allisonland,124 Diaz Brooks,C-,A+,F,C+
GA,Amyburgh,941 Brian Expressway,B,B,D-,C+
IA,Amyburgh,163 Heather Neck,F,D,A+,A-
ID,Andrewshire,952 Ellis Drive,C+,A-,C+,A
UT,Baileyfort,919 Stewart Hills,D+,C+,A,C
...,...,...,...,...,...,...
NC,West Scott,348 Jack Branch,A-,D-,A-,A
SD,West Scott,139 Hardy Vista,C+,A-,D+,B-
IN,Wilsonborough,066 Carr Road,A+,C-,B,F
NC,Wilsonshire,871 Christopher Vista,B+,B,D+,F


In [34]:
neighbors = neighbors.sort_index()

In [35]:
neighbors

,,Category,Culture,Culture,Services,Services
,,Subcategory,Restaurants,Museums,Police,Schools
State,City,Street,,,,
AK,Rowlandchester,386 Rebecca Cove,C-,A-,A+,C
AK,Scottstad,082 Leblanc Freeway,D,C-,D,B+
AK,Scottstad,114 Jones Garden,D-,D-,D,D
AK,Stevenshire,238 Andrew Rue,D-,A,A-,A-
AL,Clarkland,430 Douglas Mission,A,F,C+,B+
...,...,...,...,...,...,...
WY,Lake Nicole,754 Weaver Turnpike,B,D-,B,D
WY,Lake Nicole,933 Jennifer Burg,C,A+,A-,C
WY,Martintown,013 Bell Mills,C-,D,A-,B-
WY,Port Jason,624 Faulkner Orchard,A-,F,C+,C+


Extracting columns from a MultiIndex DataFrame

In [40]:
neighbors[('Services', 'Schools')]

State  City            Street              
AK     Rowlandchester  386 Rebecca Cove         C
       Scottstad       082 Leblanc Freeway     B+
                       114 Jones Garden         D
       Stevenshire     238 Andrew Rue          A-
AL     Clarkland       430 Douglas Mission     B+
                                               ..
WY     Lake Nicole     754 Weaver Turnpike      D
                       933 Jennifer Burg        C
       Martintown      013 Bell Mills          B-
       Port Jason      624 Faulkner Orchard    C+
       Reneeshire      717 Patel Square         A
Name: (Services, Schools), Length: 251, dtype: object
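
A full tuple picks out a single column, while a top-level label on its own selects every column under it. The sketch below shows both on a hypothetical two-row frame named `grades`.

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [('Culture', 'Restaurants'), ('Culture', 'Museums'),
     ('Services', 'Police'), ('Services', 'Schools')]
)
grades = pd.DataFrame(
    [['C-', 'A-', 'A+', 'C'], ['D', 'C-', 'D', 'B+']],
    columns=columns,
)

# A top-level label alone selects every column beneath it (a DataFrame).
services = grades['Services']
print(list(services.columns))

# A full tuple identifies exactly one column (a Series).
schools = grades[('Services', 'Schools')]
print(list(schools))
```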

Extracting rows

In [45]:
neighbors.loc[('TX', 'Kingchester', '534 Gordon Falls')]

Category  Subcategory
Culture   Restaurants     C
          Museums        D+
Services  Police          B
          Schools         B
Name: (TX, Kingchester, 534 Gordon Falls), dtype: object

In [48]:
neighbors.loc['TX', 'Leeberg']

Category,Culture,Culture,Services,Services
Subcategory,Restaurants,Museums,Police,Schools
Street,,,,
693 Avila Pines,C+,F,A+,D+


In [51]:
neighbors['NE':'NH']

Unnamed: 0_level_0,Unnamed: 1_level_0,Subcategory
State,City,Street
AK,Rowlandchester,386 Rebecca Cove
AK,Scottstad,082 Leblanc Freeway
AK,Scottstad,114 Jones Garden
AK,Stevenshire,238 Andrew Rue
AL,Clarkland,430 Douglas Mission
...,...,...
WY,Lake Nicole,754 Weaver Turnpike
WY,Lake Nicole,933 Jennifer Burg
WY,Martintown,013 Bell Mills
WY,Port Jason,624 Faulkner Orchard
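
The slice `'NE':'NH'` selects rows by their State label. The sketch below shows the same idea on a hypothetical four-row sample, written with `.loc` to make the row selection explicit. The city names, street names, and the `Schools` column are invented for illustration.

```python
import pandas as pd

# Hypothetical sample rows; only the State labels matter for this example.
index = pd.MultiIndex.from_tuples(
    [('NV', 'Town A', '1 Main Street'),
     ('NE', 'Town B', '2 Main Street'),
     ('NC', 'Town C', '3 Main Street'),
     ('NH', 'Town D', '4 Main Street')],
    names=['State', 'City', 'Street'],
)
frame = pd.DataFrame({'Schools': ['B', 'A', 'C', 'D']}, index=index)

# Label slices on a MultiIndex generally need a sorted index.
frame = frame.sort_index()

# Both endpoints are included: rows for NE and NH are returned.
subset = frame.loc['NE':'NH']
print(list(subset.index.get_level_values('State')))
```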


Cross sections

The xs method extracts rows by matching a value in a single MultiIndex level.

In [52]:
neighbors.xs(key='Lake Nicole', level='City')

,Category,Culture,Culture,Services,Services
,Subcategory,Restaurants,Museums,Police,Schools
State,Street,,,,
OR,650 Angela Track,D,C-,D,F
WY,754 Weaver Turnpike,B,D-,B,D
WY,933 Jennifer Burg,C,A+,A-,C
