# 🛠 IFQ718 Module 06 Exercises 04

## 🔍  Context: Indexing

The *index* of a DataFrame gives each record in the frame an identity. It allows records to be looked up very quickly.

The index is considered metadata rather than part of the frame's data.
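As a minimal sketch of that distinction, the cell below builds a small made-up frame (`toy` and its values are hypothetical, not the penguins data) and shows that the index is stored separately from the columns.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({
    'species': ['Adelie', 'Gentoo', 'Chinstrap'],
    'body_mass_g': [3750, 5000, 3700],
})

print(toy.columns.tolist())  # ['species', 'body_mass_g']
print(toy.index)             # RangeIndex(start=0, stop=3, step=1)
print(toy.shape)             # (3, 2): the index is not counted as a column
```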

In [None]:
import pandas as pd

In [None]:
df = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/penguins.csv')

### The default index

In [None]:
df

When `df` is displayed, notice the unlabeled column to the left of `species`, with numbers running from zero to `df.shape[0] - 1`.

In [None]:
df.shape[0] - 1

This column is the *index* of the DataFrame. It is unnamed because it was created by default.

Let's view the index of `df`:

In [None]:
df.index

Now, name it:

In [None]:
df.index.name = 'idx'

In [None]:
df

And view its type:

In [None]:
df.index.dtype

### Identifying rows by index, after filtering

In [None]:
df[df['island'] == 'Dream']

There are 124 rows where `island` is `Dream`, but the index values go as high as 219.

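The index has remained the same, even after filtering. This allows us to identify particular records at any time. Note that `.iloc` selects by position, not by label; on the unfiltered `df`, whose default index starts at zero, the two coincide.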

In [None]:
df.iloc[219]

In [None]:
df.iloc[215:219]

In [None]:
df.iloc[215:219, 0:3]
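On a filtered frame, position and label no longer match, so `.loc` is the safer way to look a record up by its index value. The following is a small sketch on made-up data (the `toy` frame is hypothetical) rather than the penguins dataset.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({
    'island': ['Torgersen', 'Dream', 'Biscoe', 'Dream'],
    'body_mass_g': [3750, 3500, 4500, 3900],
})

# Filtering keeps the original index labels
dream = toy[toy['island'] == 'Dream']
print(dream.index.tolist())   # [1, 3]

by_label = dream.loc[3]       # the row whose index label is 3
by_position = dream.iloc[1]   # the second row of the filtered frame

# Both refer to the same record
print(by_label.equals(by_position))  # True
```

Here `dream.iloc[3]` would raise an `IndexError`, because the filtered frame only has two rows.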

### Selecting an index

The index can be chosen from the existing columns.

In [None]:
df.set_index('species', inplace=True)

In [None]:
df

Notice that the index took its name from the column that we selected.

In [None]:
df.index

Now we can select rows using `.loc` and the new index.

In [None]:
df.loc['Adelie']

We can also add another column to the index.

In [None]:
df.set_index('island', append=True, inplace=True)

In [None]:
# Observe the new multi-index
df

In [None]:
# Find Adelie penguins on the island Torgersen
df.loc[('Adelie', 'Torgersen'), :]

In [None]:
# Find Adelie penguins on any island
df.loc[('Adelie', ), :]

In [None]:
# Find Adelie penguins on the islands Torgersen and Dream
df.loc[('Adelie', ['Torgersen', 'Dream']), :]
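The same selection pattern can be tried on a small made-up frame. The sketch below (with a hypothetical `toy` frame) also uses `DataFrame.xs` to select on the second level alone, which is one possible way to find the records for an island regardless of species.

```python
import pandas as pd

# Hypothetical toy data, indexed by species and island
toy = pd.DataFrame({
    'species': ['Adelie', 'Adelie', 'Adelie', 'Gentoo'],
    'island': ['Torgersen', 'Dream', 'Biscoe', 'Biscoe'],
    'body_mass_g': [3750, 3500, 3800, 5000],
}).set_index(['species', 'island'])

# Adelie penguins on Torgersen or Dream
both = toy.loc[('Adelie', ['Torgersen', 'Dream']), :]
print(len(both))  # 2

# Any species on Biscoe: select on the 'island' level only
biscoe = toy.xs('Biscoe', level='island')
print(biscoe.index.tolist())  # ['Adelie', 'Gentoo']
```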

You may have realised that having a unique index value for each record is useful.
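One way to check whether an index identifies each record uniquely is the `is_unique` attribute. A minimal sketch on made-up data (the `toy` frame is hypothetical):

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({
    'species': ['Adelie', 'Adelie', 'Gentoo'],
    'body_mass_g': [3750, 3500, 5000],
})

print(toy.index.is_unique)         # True: the default index has one label per row

by_species = toy.set_index('species')
print(by_species.index.is_unique)  # False: 'Adelie' labels two rows

# A repeated label selects every matching row
print(by_species.loc['Adelie'].shape)  # (2, 1)
```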

We can also reset the index:

In [None]:
df.reset_index(inplace=True)

In [None]:
df

In [None]:
df.index
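By default `reset_index` moves the index levels back into columns. Passing `drop=True` discards them instead. The sketch below shows the difference on a hypothetical `toy` frame.

```python
import pandas as pd

# Hypothetical toy data, indexed by species
toy = pd.DataFrame({
    'species': ['Adelie', 'Gentoo'],
    'body_mass_g': [3750, 5000],
}).set_index('species')

restored = toy.reset_index()            # species becomes a column again
discarded = toy.reset_index(drop=True)  # species is thrown away

print(restored.columns.tolist())   # ['species', 'body_mass_g']
print(discarded.columns.tolist())  # ['body_mass_g']
print(restored.index)              # RangeIndex(start=0, stop=2, step=1)
```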

### ✍ Activity 1: set an appropriate index using the existing columns

The activities will be using the miles per gallon (MPG) dataset.

In [None]:
df_mpg = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/mpg.csv')

In [None]:
df_mpg

In [None]:
# Write your code here

### ✍ Activity 2: select records using the values of your index

In [None]:
# Write your code here

### ✍ Activity 3: select records using `.iloc`

In [None]:
# Write your code here

### ✍ Activity 4: make your index multi-leveled

In [None]:
# Write your code here

### ✍ Activity 5: select records using the labels of your index

In [None]:
# Write your code here