# Student Do: Indexing Fever

You've caught the multi-indexing fever! Add power to your financial analysis pipelines by grouping your data on the year and month of a `DatetimeIndex` to build a multi-index.

For this activity, you will use historical stock data for [Bombardier (BBD.B)](https://web.tmxmoney.com/quote.php?qm_symbol=BBD.B): the `BBD.B` closing prices from March to May 2019.

## Instructions

### Import Libraries and Dependencies

In [1]:
import pandas as pd
from pathlib import Path

### Read CSV as Pandas DataFrame

In [82]:
# Path to the CSV file
csv_path = Path(r'C:\Users\TribThapa\Desktop\Thapa\ResearchFellow\Courses\FinTech_Bootcamp_MonashUni2021\monu-mel-virt-fin-pt-05-2021-u-c\Activities\Week 3\3\06-Stu_Multi_Indexing\Resources\bombardier_stock_data.csv')

# Read the CSV and parse the Date column into a DatetimeIndex
stock_data = pd.read_csv(csv_path, parse_dates=True, index_col="Date")
stock_data.head()

| Date | Close |
| --- | --- |
| 2019-03-01 | 2.90 |
| 2019-03-04 | 2.92 |
| 2019-03-05 | 2.95 |
| 2019-03-06 | 2.83 |
| 2019-03-07 | 2.88 |
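To see what `parse_dates=True` and `index_col="Date"` do without needing the CSV on disk, here is a minimal sketch that reads a small in-memory CSV instead. The three rows are copied from the output above; `csv_text` and `sample` are illustrative names and not part of the activity.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for the CSV file (rows copied from the output above)
csv_text = "Date,Close\n2019-03-01,2.9\n2019-03-04,2.92\n2019-03-05,2.95\n"

# index_col picks the Date column as the index; parse_dates=True parses that index as dates
sample = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

# The index is now a DatetimeIndex, so .year, .month and .day are available
index_type = type(sample.index).__name__
years = sample.index.year.tolist()
months = sample.index.month.tolist()
print(index_type, years, months)
```

Having a `DatetimeIndex` rather than plain strings is what makes the grouping and date-based lookups later in the activity possible.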


### Assess & Clean Data

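In [65]:
# Percentage of null values per column
stock_data.isnull().mean() * 100

# Drop rows with nulls
stock_data = stock_data.dropna()

# Drop duplicates (note: this compares column values, not the index, so
# different dates that share a Close price also count as duplicates)
stock_data = stock_data.drop_duplicates()

# Validate no more missing values
stock_data.isnull().sum()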

Close    0
dtype: int64
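One thing to keep in mind is that `drop_duplicates()` compares column values and ignores the index. With a single `Close` column, two different trading days that closed at the same price are treated as duplicates. The sketch below uses made-up prices (not the Bombardier data) to contrast it with dropping repeated dates via `index.duplicated()`.

```python
import pandas as pd

# Made-up prices: two different dates share a Close, and one date appears twice
dates = pd.to_datetime(["2019-03-01", "2019-03-04", "2019-03-05", "2019-03-05"])
demo = pd.DataFrame({"Close": [2.90, 2.90, 2.95, 2.95]}, index=dates)

# Column-based: keeps the first row for each distinct Close value
by_value = demo.drop_duplicates()

# Index-based: keeps the first row for each distinct date
by_date = demo[~demo.index.duplicated(keep="first")]

print(len(by_value), len(by_date))
```

Here the column-based version also discards 2019-03-04, a legitimate trading day, while the index-based version removes only the repeated date.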

### Group by `year`, `month`, and `day`

In [78]:
stock_data_idx = stock_data.groupby([
    stock_data.index.year,
    stock_data.index.month,
    stock_data.index.day
]).first()

stock_data_idx

| Date (year) | Date (month) | Date (day) | Close |
| --- | --- | --- | --- |
| 2019 | 3 | 1 | 2.90 |
| 2019 | 3 | 4 | 2.92 |
| 2019 | 3 | 5 | 2.95 |
| 2019 | 3 | 6 | 2.83 |
| 2019 | 3 | 7 | 2.88 |
| 2019 | 3 | 8 | 2.84 |
| 2019 | 3 | 11 | 2.85 |
| 2019 | 3 | 12 | 2.82 |
| 2019 | 3 | 13 | 2.89 |
| 2019 | 3 | 15 | 2.86 |
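Every level of the grouped index above inherits the name `Date`, which makes the levels hard to tell apart. As a sketch under assumptions (made-up prices, and the level names `year` and `month` are my own choice), the same pattern can group on year and month only and label the levels with `rename_axis`:

```python
import pandas as pd

# Made-up closes for four trading days across two months
dates = pd.to_datetime(["2019-03-01", "2019-03-04", "2019-04-01", "2019-04-02"])
prices = pd.DataFrame({"Close": [2.90, 2.92, 2.70, 2.74]}, index=dates)
prices.index.name = "Date"

# Group on the year and month of the DatetimeIndex, keep the first row of each
# group, then give the two index levels distinct names
monthly_first = (
    prices.groupby([prices.index.year, prices.index.month])
    .first()
    .rename_axis(["year", "month"])
)
print(monthly_first)
```

`first()` returns the first row in each group, so this yields the first recorded close of each month; swapping in `mean()` would give a monthly average instead.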


### Access `Close` for May 2019 Using Multi-indexing Lookup

In [79]:
# Select BBD.B Close for 30 May 2019 using the (year, month, day) multi-index
stock_May1 = stock_data_idx.loc[(2019, 5, 30)]
stock_May1

Close    2.02
Name: (2019, 5, 30), dtype: float64

In [77]:
# Same lookup on the original DatetimeIndex, for comparison
stock_May2 = stock_data.loc['2019-05-30']
stock_May2

Close    2.02
Name: 2019-05-30 00:00:00, dtype: float64
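Both lookups above return a single day. To pull every row for a month, a partial key works on the multi-index, and partial string indexing works on the `DatetimeIndex`. This is a minimal sketch with made-up prices; the names `by_day`, `may_multi`, and `may_dates` are illustrative.

```python
import pandas as pd

# Made-up closes spanning the end of April and the start of May
dates = pd.to_datetime(["2019-04-30", "2019-05-01", "2019-05-02"])
prices = pd.DataFrame({"Close": [2.30, 2.25, 2.20]}, index=dates)

# Build a (year, month, day) multi-index, as in the activity
by_day = prices.groupby(
    [prices.index.year, prices.index.month, prices.index.day]
).first()

# Passing only (year, month) returns every day in that month, indexed by day
may_multi = by_day.loc[(2019, 5)]

# The same rows via partial string indexing on the original DatetimeIndex
may_dates = prices.loc["2019-05"]

print(may_multi)
print(may_dates)
```

The two results hold the same prices; they differ only in the index that remains (day numbers versus full dates).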

### Challenge

Take this activity to the next level by calculating the mean close price for `BBD.B` for all of `2019`.

In [84]:
# Mean Close for 2019: partial string indexing selects only rows dated in 2019
stock_2019 = stock_data.loc['2019'].mean()
stock_2019

Close    2.535556
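When selecting a calendar year from a `DatetimeIndex`, it helps to know that date-range slices with `.loc` include both endpoints, whereas partial string indexing with the year alone keeps only rows dated in that year. The sketch below uses made-up prices that deliberately include dates on either side of 2019 to show the difference.

```python
import pandas as pd

# Made-up closes, with one row just before and one just after 2019
dates = pd.to_datetime(["2018-12-31", "2019-03-01", "2019-05-30", "2020-01-01"])
closes = pd.Series([3.0, 2.0, 4.0, 5.0], index=dates, name="Close")

# A label slice is inclusive at both ends, so the neighbouring years leak in
mean_slice = closes.loc["2018-12-31":"2020-01-01"].mean()

# Partial string indexing with the year alone keeps only the 2019 rows
mean_2019 = closes.loc["2019"].mean()

print(mean_slice, mean_2019)
```

On a dataset that only spans March to May 2019, both approaches give the same answer, but the year-only form stays correct if more data is added later.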
dtype: float64