# Exercise 01 - Volume-Weighted Average Price

The end goal of this exercise is to calculate the volume-weighted average closing price of SPY over the month of December 2018.

This exercise uses the column-wise operations and aggregation calculations that we learned in Tutorial 03.

We will use the data in the following location: *../data/spy_dec_2018.csv*

#### 1) Import the `numpy` and `pandas` packages.

In [1]:
import numpy as np
import pandas as pd

#### 2) Use the `read_csv` function in pandas to read in the csv file containing the data.  Assign the data to a variable called `df_spy`.

In [2]:
df_spy = pd.read_csv("data/spy_dec_2018.csv")
df_spy

Unnamed: 0,date,open,high,low,close,volume,adjusted
0,2018-12-03,280.279999,280.399994,277.51001,279.299988,103176300,277.678436
1,2018-12-04,278.369995,278.850006,269.899994,270.25,177986000,268.681
2,2018-12-06,265.920013,269.970001,262.440002,269.839996,204185400,268.273376
3,2018-12-07,269.459991,271.220001,262.630005,263.570007,161018900,262.039795
4,2018-12-10,263.369995,265.160004,258.619995,264.070007,151445900,262.536896
5,2018-12-11,267.660004,267.869995,262.480011,264.130005,121504400,262.596527
6,2018-12-12,267.470001,269.0,265.369995,265.459991,97976700,263.918793
7,2018-12-13,266.519989,267.48999,264.119995,265.369995,96662700,263.829315
8,2018-12-14,262.959991,264.029999,259.850006,260.470001,116961100,258.957794
9,2018-12-17,259.399994,260.649994,253.529999,255.360001,165492300,253.877457
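The `Unnamed: 0` column in the output above usually appears when a CSV file was written with its row index included. The following is a minimal sketch, assuming that is what happened here: it reads a small in-memory stand-in for the file (two rows copied from the output above, with an assumed empty first header field) and uses the `index_col` and `parse_dates` arguments of `read_csv`. The name `df_demo` is hypothetical, and the rest of the exercise does not depend on this step.

```python
import io
import pandas as pd

# Stand-in for the real file: two rows copied from the output above.
# The empty first header field is an assumption about how the file was saved;
# pandas displays such a column as "Unnamed: 0".
csv_text = """,date,close,volume
0,2018-12-03,279.299988,103176300
1,2018-12-04,270.25,177986000
"""

# index_col=0 uses the first column as the row index instead of as data.
# parse_dates converts the date strings into datetime values.
df_demo = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=["date"])

print(df_demo.columns.tolist())
print(df_demo.dtypes)
```

With these arguments the extra column no longer appears among the data columns, and `date` can be used for date-based filtering.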


Recall that the volume-weighted average price (VWAP) is defined as: $\sum_{i}\frac{v_{i}}{v_{m}} \cdot c_{i}$

Where:
- $i$ ranges over the days of the month
- $v_{i}$ is the volume on day $i$
- $v_{m}$ is the total volume over the month
- $c_{i}$ is the close price for day $i$
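To make the formula concrete, here is a small sketch on made-up numbers (three hypothetical days, not SPY data). The array names `volume` and `close` are chosen only for illustration.

```python
import numpy as np

# Hypothetical three-day example (made-up numbers, not SPY data).
volume = np.array([100.0, 300.0, 600.0])  # v_i
close = np.array([10.0, 20.0, 30.0])      # c_i

v_m = volume.sum()              # total volume over the period: 1000
weights = volume / v_m          # v_i / v_m for each day: 0.1, 0.3, 0.6
vwap = (weights * close).sum()  # 0.1*10 + 0.3*20 + 0.6*30

print(vwap)
print(close.mean())
```

The result is approximately 25, which is higher than the simple average close of 20 because the highest-volume day also has the highest close.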

The next few steps will walk you through how to calculate this from the data set.

#### 3) Calculate the total volume for the month using an aggregation calculation on the `volume` column of `df_spy`.  Assign this value to a variable called `total_volume`.

In [3]:
total_volume = df_spy['volume'].sum()

#### 4) Add a new column to `df_spy` called `vwap_weight`.  This column will be calculated by taking each day's volume and dividing it by the monthly volume.  This can be done using a component-wise calculation involving the `total_volume` quantity calculated above and the `volume` column of `df_spy`.


In [4]:
df_spy['vwap_weight'] = df_spy['volume'] / total_volume
df_spy

Unnamed: 0,date,open,high,low,close,volume,adjusted,vwap_weight
0,2018-12-03,280.279999,280.399994,277.51001,279.299988,103176300,277.678436,0.034875
1,2018-12-04,278.369995,278.850006,269.899994,270.25,177986000,268.681,0.060161
2,2018-12-06,265.920013,269.970001,262.440002,269.839996,204185400,268.273376,0.069017
3,2018-12-07,269.459991,271.220001,262.630005,263.570007,161018900,262.039795,0.054426
4,2018-12-10,263.369995,265.160004,258.619995,264.070007,151445900,262.536896,0.05119
5,2018-12-11,267.660004,267.869995,262.480011,264.130005,121504400,262.596527,0.04107
6,2018-12-12,267.470001,269.0,265.369995,265.459991,97976700,263.918793,0.033117
7,2018-12-13,266.519989,267.48999,264.119995,265.369995,96662700,263.829315,0.032673
8,2018-12-14,262.959991,264.029999,259.850006,260.470001,116961100,258.957794,0.039534
9,2018-12-17,259.399994,260.649994,253.529999,255.360001,165492300,253.877457,0.055938


#### 5) Verify that the values in the `vwap_weight` column sum to one by using an aggregation calculation.

In [5]:
df_spy['vwap_weight'].sum()

1.0

#### 6) Add a new column to `df_spy` called `weighted_close` that is the component-wise product of the `vwap_weight` and `close` columns.

In [6]:
df_spy['weighted_close'] = df_spy['vwap_weight'] * df_spy['close']
df_spy

Unnamed: 0,date,open,high,low,close,volume,adjusted,vwap_weight,weighted_close
0,2018-12-03,280.279999,280.399994,277.51001,279.299988,103176300,277.678436,0.034875,9.740518
1,2018-12-04,278.369995,278.850006,269.899994,270.25,177986000,268.681,0.060161,16.258585
2,2018-12-06,265.920013,269.970001,262.440002,269.839996,204185400,268.273376,0.069017,18.623539
3,2018-12-07,269.459991,271.220001,262.630005,263.570007,161018900,262.039795,0.054426,14.345115
4,2018-12-10,263.369995,265.160004,258.619995,264.070007,151445900,262.536896,0.05119,13.517855
5,2018-12-11,267.660004,267.869995,262.480011,264.130005,121504400,262.596527,0.04107,10.847782
6,2018-12-12,267.470001,269.0,265.369995,265.459991,97976700,263.918793,0.033117,8.7913
7,2018-12-13,266.519989,267.48999,264.119995,265.369995,96662700,263.829315,0.032673,8.670456
8,2018-12-14,262.959991,264.029999,259.850006,260.470001,116961100,258.957794,0.039534,10.297466
9,2018-12-17,259.399994,260.649994,253.529999,255.360001,165492300,253.877457,0.055938,14.284395


#### 7) The volume-weighted close over the month is now just the sum of the `weighted_close` column.  Calculate it using an aggregation calculation.

In [7]:
df_spy['weighted_close'].sum()

255.35538617685188
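As a cross-check on the step-by-step calculation, the same weighted average can be computed in one call with `np.average` and its `weights` argument. The sketch below uses a hypothetical three-row stand-in for `df_spy` (values copied from the first rows of the output above), so its result will not match the full-month figure.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_spy: three rows copied from the output above.
df_demo = pd.DataFrame({
    "close": [279.299988, 270.25, 269.839996],
    "volume": [103176300, 177986000, 204185400],
})

# Step-by-step version, following the exercise.
weights = df_demo["volume"] / df_demo["volume"].sum()
vwap_steps = (weights * df_demo["close"]).sum()

# One-call version: np.average normalizes the weights internally,
# so the raw volumes can be passed directly.
vwap_direct = np.average(df_demo["close"], weights=df_demo["volume"])

# The two results should agree up to floating-point rounding.
print(np.isclose(vwap_steps, vwap_direct))
```

Applying the same call to the full data, with `df_spy['close']` as the values and `df_spy['volume']` as the weights, should reproduce the value calculated above.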