# DateTime Groupby

In this exercise you will practice grouping data by a `DatetimeIndex` and analyzing the results.

#### Import Libraries and Dependencies

In [1]:
import pandas as pd
from pathlib import Path

## 1. Import the data, taking care to declare the `datetime` index.


In [2]:
# Import data
spy_path = Path('../Resources/spy_stock_volume.csv')

# Read in data and index by date
spy_data = pd.read_csv(
    spy_path,
    index_col='Date',
    parse_dates=True
)
spy_data

Date,close,volume
2020-03-12 08:00:00,258.60,229683
2020-03-12 09:00:00,257.76,457488
2020-03-12 10:00:00,252.81,291881
2020-03-12 11:00:00,259.99,353484
2020-03-12 12:00:00,257.12,520699
...,...,...
2021-02-08 10:00:00,388.89,39322
2021-02-08 11:00:00,389.03,22696
2021-02-08 12:00:00,388.80,29164
2021-02-08 13:00:00,389.33,21826
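
To confirm that the index really was parsed as dates, a quick check along these lines may help. This is a minimal sketch: it reads a three-row in-memory sample (copied from the first rows shown above) instead of the real CSV file, and the names `sample_csv`, `sample`, and `is_datetime_index` are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for spy_stock_volume.csv (first three rows shown above)
sample_csv = io.StringIO(
    "Date,close,volume\n"
    "2020-03-12 08:00:00,258.60,229683\n"
    "2020-03-12 09:00:00,257.76,457488\n"
    "2020-03-12 10:00:00,252.81,291881\n"
)

# Same arguments as the exercise: use the Date column as the index and parse it
sample = pd.read_csv(sample_csv, index_col="Date", parse_dates=True)

# A DatetimeIndex is what enables date-string slicing and the .year/.month attributes
is_datetime_index = isinstance(sample.index, pd.DatetimeIndex)
print(is_datetime_index)
print(sample.index.dtype)
```

If the check prints `False`, the dates were most likely read as plain strings, and the slicing and grouping steps that follow would not behave as expected.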


## 2. Slice the data down to a single month using the `DatetimeIndex`

In [3]:
# Slice the Data to One Specific Month
volume_jan_2021 = spy_data.loc['2021-01-01':'2021-01-31']
volume_jan_2021

Date,close,volume
2021-01-04 08:00:00,373.51,57311
2021-01-04 09:00:00,369.98,100077
2021-01-04 10:00:00,366.96,114323
2021-01-04 11:00:00,366.27,109650
2021-01-04 12:00:00,368.60,68479
...,...,...
2021-01-29 10:00:00,373.02,102301
2021-01-29 11:00:00,369.26,145045
2021-01-29 12:00:00,369.79,114704
2021-01-29 13:00:00,370.79,102156
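
With a `DatetimeIndex`, pandas also accepts a partial date string such as `'2021-01'` in `.loc`, which selects every row in that month. The sketch below compares the two spellings on a four-row sample built from values shown in the outputs above; `sample`, `by_range`, and `by_month` are illustrative names, not part of the exercise.

```python
import pandas as pd

# Four timestamps and volumes taken from the outputs shown above
idx = pd.to_datetime([
    "2020-03-12 08:00:00",
    "2021-01-04 08:00:00",
    "2021-01-29 13:00:00",
    "2021-02-08 10:00:00",
])
sample = pd.DataFrame({"volume": [229683, 57311, 102156, 39322]}, index=idx)

# Explicit start/end strings, as in the exercise
by_range = sample.loc["2021-01-01":"2021-01-31"]

# Partial string indexing: the month alone selects the same rows
by_month = sample.loc["2021-01"]

same_rows = by_range.equals(by_month)
print(same_rows)
print(by_month)
```

The month-only form avoids having to remember how many days the month has.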


## 3. Save the total volume of shares traded for that month into a new variable.

In [4]:
# Calculate the total number of shares traded for the month of January, 2021
jan_2021_volume = volume_jan_2021['volume'].sum()
jan_2021_volume

9004793

## 4. Group the full share-volume dataset by `year` and `month` using the `datetime` index. Use this grouping to create a Series of total SPY shares traded each month.

In [5]:
spy_volume = spy_data['volume']
spy_volume.head()

Date
2020-03-12 08:00:00    229683
2020-03-12 09:00:00    457488
2020-03-12 10:00:00    291881
2020-03-12 11:00:00    353484
2020-03-12 12:00:00    520699
Name: volume, dtype: int64

In [6]:
# Specify the grouping keys: here, the year and month of the DatetimeIndex
groupby_levels = [spy_volume.index.year, spy_volume.index.month]

# Group by those keys and choose an aggregation function
total_monthly_volume = spy_volume.groupby(by=groupby_levels).sum()
total_monthly_volume

Date  Date
2020  3       25896997
      4       19886634
      5       13629627
      6       16840144
      7       10681540
      8        7050740
      9       12578450
      10      11122869
      11       9930549
      12       7874859
2021  1        9004793
      2        2075300
Name: volume, dtype: int64
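
As a possible alternative to grouping on `[year, month]`, `resample` can produce the same monthly totals with a single `DatetimeIndex` in place of the two-level index. The sketch below uses a four-row sample with values from the outputs above, and the `"MS"` (month start) frequency is one choice among several. Note that `resample` also emits a row (with sum 0) for any month that has no data, which the `groupby` approach does not.

```python
import pandas as pd

# Two hourly rows from January 2021 and two from February 2021 (values from the outputs above)
idx = pd.to_datetime([
    "2021-01-04 08:00:00",
    "2021-01-04 09:00:00",
    "2021-02-08 10:00:00",
    "2021-02-08 11:00:00",
])
volume = pd.Series([57311, 100077, 39322, 22696], index=idx, name="volume")

# Approach used in this exercise: group on the year and month of the index
by_year_month = volume.groupby([volume.index.year, volume.index.month]).sum()

# Alternative sketch: resample to month-start bins and sum each bin
by_resample = volume.resample("MS").sum()

print(by_year_month)
print(by_resample)
```

Both results hold the same totals; they differ only in how the index is labeled.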

## 5. Using the Series constructed in step (4), calculate the `median` of the monthly total volume of SPY shares traded.

In [7]:
# We can do summary statistics on the aggregated data we just created
# For Example: View the median amount of monthly shares traded
median_monthly_volume = total_monthly_volume.median()
median_monthly_volume

10902204.5
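
Because there are 12 monthly totals (an even count), the median is the average of the two middle values once the totals are sorted. The sketch below reproduces the result by hand from the totals printed in step (4), using only the standard library; `monthly_totals` and `manual_median` are illustrative names.

```python
from statistics import median

# Monthly totals copied from the step (4) output, March 2020 through February 2021
monthly_totals = [
    25896997, 19886634, 13629627, 16840144, 10681540, 7050740,
    12578450, 11122869, 9930549, 7874859, 9004793, 2075300,
]

ordered = sorted(monthly_totals)

# With 12 values, the two middle positions are index 5 and index 6
lower_middle, upper_middle = ordered[5], ordered[6]
manual_median = (lower_middle + upper_middle) / 2

print(lower_middle, upper_middle)  # 10681540 11122869
print(manual_median)               # 10902204.5
print(median(monthly_totals))      # 10902204.5
```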

## 6. Compare this `median` number to the number you calculated in step (3). How does that month compare in terms of trading activity?

In [8]:
# Compare the shares traded in January 2021 to the median monthly total
jan_2021_volume / median_monthly_volume

0.8259607494979571
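
A ratio can be easier to read as a percentage difference. The sketch below converts it using the two numbers printed in steps (3) and (5); the variable names are illustrative.

```python
# Values printed in steps (3) and (5)
jan_2021_volume = 9004793
median_monthly_volume = 10902204.5

ratio = jan_2021_volume / median_monthly_volume

# Percentage difference relative to the median: negative means below the median
pct_diff = (ratio - 1) * 100

print(round(ratio, 3))     # 0.826
print(round(pct_diff, 1))  # -17.4
```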

**Question**: How does January 2021 compare to the median month in terms of trading activity?

> **Sample Answer**: January 2021 volume was about 17% below the median monthly volume for SPY (a ratio of roughly 0.83).
