# Exercise - Volume Weighted Average Price

The goal of this exercise is to calculate the volume-weighted average close price (VWAP) of SPY for July 2021.

The exercise uses column-wise operations and aggregation calculations.

#### 1) Import the `numpy`, `pandas`, and `pandas_datareader` packages.

In [1]:
import numpy as np
import pandas as pd
import pandas_datareader as pdr

#### 2) Use `pandas_datareader` to query the SPY data from Yahoo Finance. Assign the data to a variable called `df_spy`.

In [2]:
df_spy = pdr.get_data_yahoo('SPY', start='2021-07-01', end='2021-07-31')
df_spy

Date,High,Low,Open,Close,Volume,Adj Close
2021-07-01,430.600006,428.799988,428.869995,430.429993,53441000,430.429993
2021-07-02,434.100006,430.519989,431.670013,433.720001,57697700,433.720001
2021-07-06,434.01001,430.01001,433.779999,432.929993,68710400,432.929993
2021-07-07,434.76001,431.51001,433.660004,434.459991,63549500,434.459991
2021-07-08,431.730011,427.519989,428.779999,430.920013,97595200,430.920013
2021-07-09,435.839996,430.709991,432.529999,435.519989,76238600,435.519989
2021-07-12,437.350006,434.970001,435.429993,437.079987,52889600,437.079987
2021-07-13,437.839996,435.309998,436.23999,435.589996,52911300,435.589996
2021-07-14,437.920013,434.910004,437.399994,436.23999,64130400,436.23999
2021-07-15,435.529999,432.720001,434.809998,434.75,55126400,434.75


Recall that the volume-weighted average price is defined as: $\sum_{i}\frac{v_{i}}{v_{m}} \cdot c_{i}$

Where:
- $i$ ranges over the days of the month
- $v_{i}$ is the volume on day $i$
- $v_{m}$ is the total volume over the month
- $c_{i}$ is the close price for day $i$
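
To make the formula concrete, here is a minimal sketch with three made-up trading days (the volumes and closes are illustrative, not SPY data). The day with the largest volume pulls the result above the plain average of the closes.

```python
# Hypothetical three-day example; values are illustrative, not SPY data.
volumes = [100, 100, 200]      # v_i: volume on day i
closes = [10.0, 20.0, 30.0]    # c_i: close price on day i

total_volume = sum(volumes)                      # v_m: total volume
weights = [v / total_volume for v in volumes]    # v_i / v_m
vwap = sum(w * c for w, c in zip(weights, closes))
simple_average = sum(closes) / len(closes)

print(weights)          # [0.25, 0.25, 0.5]
print(vwap)             # 22.5
print(simple_average)   # 20.0
```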

The next few steps will walk you through how to calculate this from the data set.

#### 3) Calculate the total volume for the month using an aggregation calculation on the `Volume` column of `df_spy`. Assign this value to a variable called `total_volume`.

In [3]:
total_volume = df_spy['Volume'].sum()

#### 4) Add a new column to `df_spy` called `vwap_weight`. This column is calculated by dividing each day's volume by the total monthly volume. This can be done using a component-wise calculation involving the `total_volume` quantity calculated above and the `Volume` column of `df_spy`.


In [4]:
df_spy['vwap_weight'] = df_spy['Volume'] / total_volume
df_spy

Date,High,Low,Open,Close,Volume,Adj Close,vwap_weight
2021-07-01,430.600006,428.799988,428.869995,430.429993,53441000,430.429993,0.03758
2021-07-02,434.100006,430.519989,431.670013,433.720001,57697700,433.720001,0.040574
2021-07-06,434.01001,430.01001,433.779999,432.929993,68710400,432.929993,0.048318
2021-07-07,434.76001,431.51001,433.660004,434.459991,63549500,434.459991,0.044689
2021-07-08,431.730011,427.519989,428.779999,430.920013,97595200,430.920013,0.06863
2021-07-09,435.839996,430.709991,432.529999,435.519989,76238600,435.519989,0.053612
2021-07-12,437.350006,434.970001,435.429993,437.079987,52889600,437.079987,0.037193
2021-07-13,437.839996,435.309998,436.23999,435.589996,52911300,435.589996,0.037208
2021-07-14,437.920013,434.910004,437.399994,436.23999,64130400,436.23999,0.045097
2021-07-15,435.529999,432.720001,434.809998,434.75,55126400,434.75,0.038766


#### 5) Verify that the `vwap_weight` values sum to one by using an aggregation calculation.

In [5]:
df_spy['vwap_weight'].sum()

1.0
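
The sum displays as exactly `1.0` here, but floating-point division can leave a tiny rounding error, so a tolerance-based comparison is a safer check than `== 1.0`. The sketch below is plain Python and uses only the ten daily volumes displayed in the table above, so its weights are relative to those ten rows and differ from the `vwap_weight` column, which is based on the full month's total.

```python
import math

# Daily volumes copied from the ten rows displayed above (July 1-15, 2021).
volumes = [53441000, 57697700, 68710400, 63549500, 97595200,
           76238600, 52889600, 52911300, 64130400, 55126400]

total = sum(volumes)
weights = [v / total for v in volumes]

# Compare with a tolerance instead of testing for exact equality.
weights_sum_to_one = math.isclose(sum(weights), 1.0, rel_tol=1e-9)
print(weights_sum_to_one)  # True
```

With the DataFrame, the same idea would be `math.isclose(df_spy['vwap_weight'].sum(), 1.0)`.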

#### 6) Add a new column to `df_spy` called `weighted_close` that is the component-wise product of the `vwap_weight` and `Close` columns.

In [6]:
df_spy['weighted_close'] = df_spy['vwap_weight'] * df_spy['Close']
df_spy

Date,High,Low,Open,Close,Volume,Adj Close,vwap_weight,weighted_close
2021-07-01,430.600006,428.799988,428.869995,430.429993,53441000,430.429993,0.03758,16.175736
2021-07-02,434.100006,430.519989,431.670013,433.720001,57697700,433.720001,0.040574,17.597659
2021-07-06,434.01001,430.01001,433.779999,432.929993,68710400,432.929993,0.048318,20.918334
2021-07-07,434.76001,431.51001,433.660004,434.459991,63549500,434.459991,0.044689,19.415513
2021-07-08,431.730011,427.519989,428.779999,430.920013,97595200,430.920013,0.06863,29.574135
2021-07-09,435.839996,430.709991,432.529999,435.519989,76238600,435.519989,0.053612,23.349089
2021-07-12,437.350006,434.970001,435.429993,437.079987,52889600,437.079987,0.037193,16.256166
2021-07-13,437.839996,435.309998,436.23999,435.589996,52911300,435.589996,0.037208,16.207397
2021-07-14,437.920013,434.910004,437.399994,436.23999,64130400,436.23999,0.045097,19.673261
2021-07-15,435.529999,432.720001,434.809998,434.75,55126400,434.75,0.038766,16.853347
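
The component-wise product works row by row because pandas matches values by index label (here, the date) before multiplying. A minimal sketch with two hypothetical Series whose rows are stored in different order illustrates this; the names and values are made up for the example.

```python
import pandas as pd

# Hypothetical data: same dates, but stored in a different order.
weights = pd.Series([0.25, 0.75], index=["2021-07-01", "2021-07-02"])
closes = pd.Series([200.0, 100.0], index=["2021-07-02", "2021-07-01"])

# Multiplication aligns on the index labels, not on position.
weighted = weights * closes
print(weighted["2021-07-01"])  # 0.25 * 100.0
print(weighted["2021-07-02"])  # 0.75 * 200.0
```

In `df_spy` both columns share the same index, so the alignment is trivial, but the same rule applies.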


#### 7) The volume-weighted close over the month is now just the sum of the `weighted_close` column. Calculate it using an aggregation calculation.

In [7]:
df_spy['weighted_close'].sum()

434.12201644309096
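
As a cross-check, the total volume can be factored out of the sum: $\sum_{i}\frac{v_{i}}{v_{m}} \cdot c_{i} = \frac{\sum_{i} v_{i} c_{i}}{\sum_{i} v_{i}}$, so the same number can be computed in one step. The sketch below uses a small hypothetical DataFrame (not the SPY data) and compares the step-by-step result with a direct ratio and with `np.average`, which accepts a `weights` argument.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_spy; values are illustrative only.
df = pd.DataFrame({
    "Close": [100.0, 110.0],
    "Volume": [300, 100],
})

# Step by step, as in the exercise.
weights = df["Volume"] / df["Volume"].sum()
vwap_steps = (weights * df["Close"]).sum()

# One-step equivalents.
vwap_ratio = (df["Volume"] * df["Close"]).sum() / df["Volume"].sum()
vwap_numpy = np.average(df["Close"], weights=df["Volume"])

print(vwap_steps, vwap_ratio, vwap_numpy)
```

On the exercise data, `np.average(df_spy['Close'], weights=df_spy['Volume'])` should agree with the result above up to floating-point rounding.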