## Introduction to Volume, Slippage, and Liquidity

### What is Volume?
Volume is the amount of trading that occurs in a given instrument, or set of instruments, over a given time period. It can be measured in shares or in dollars.

#### Example Volume Computation
For instance, consider a hypothetical equity asset $A$. If $100,000$ shares of $A$ change hands over the course of a minute, then the share volume of $A$ for that minute is $100,000$. The dollar volume, which is the more commonly used statistic, is the sum of the dollar values of all the individual transactions that occurred, where each transaction contributes its price times the number of shares traded at that price. For instance, let's say in this case there were three separate transactions: one for $30,000$ shares, one for $60,000$ shares, and one for $10,000$ shares. The prices were $30$ USD, $31$ USD, and $33$ USD, respectively. Let's model this out.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import yfinance as yf
from pandas_datareader import data as pdr

In [None]:
num_shares = np.asarray([30000, 60000, 10000])
prices = np.asarray([30, 31, 33])

np.dot(num_shares, prices)

So total dollar volume is $3.09$ million USD. Notice that this is equivalent to taking the volume-weighted average price and multiplying it by the number of shares traded over that bar.
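In symbols, writing $p_i$ for the price and $q_i$ for the number of shares of the $i$-th trade in the bar (notation introduced here only for illustration), the relationship can be sketched as:

$$
\text{dollar volume} = \sum_i p_i q_i, \qquad
\text{VWAP} = \frac{\sum_i p_i q_i}{\sum_i q_i}, \qquad
\text{dollar volume} = \text{VWAP} \times \sum_i q_i
$$

For the three trades above, $\sum_i p_i q_i = 30 \times 30000 + 31 \times 60000 + 33 \times 10000 = 3090000$ USD, and dividing by the $100000$ shares traded gives a volume-weighted average price of $30.9$ USD per share.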

In [None]:
# Simple (unweighted) average of the trade prices
print("Average trade price: %s" % (np.mean(prices)))

# Volume-weighted average price (VWAP)
vw = np.dot(num_shares, prices) / np.sum(num_shares)
print("Volume weighted average trade price: %s" % (vw))

# Multiply by total shares to recover the dollar volume
v = vw * np.sum(num_shares)
print("Dollar volume: %s" % (v))

Real datasets often provide averaged or 'last traded' data rather than individual trades. With averaged data, some average of the trade prices is taken over a bar (time period). With last traded data, only the final observation from that bar is reported. It is important to know whether the data is a simple average, a volume-weighted average, or simply the last trade, because each needs to be treated differently.
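To make the distinction concrete, here is a minimal sketch that summarizes the three hypothetical trades from the earlier example as a single bar in three different ways. The `trades` DataFrame and its column names are illustrative assumptions, not the layout of any real data feed.

```python
import numpy as np
import pandas as pd

# Hypothetical trades within one bar, in the order they occurred
# (same numbers as the earlier example; the layout is an assumption).
trades = pd.DataFrame({
    "price": [30.0, 31.0, 33.0],
    "shares": [30000, 60000, 10000],
})

# Three ways a vendor might report a single price for the bar.
mean_price = trades["price"].mean()                            # simple average
vwap = np.average(trades["price"], weights=trades["shares"])   # volume-weighted
last_price = trades["price"].iloc[-1]                          # last traded

# Total share volume for the bar.
bar_volume = trades["shares"].sum()

# Multiply each reported price by the bar volume to see which one
# reproduces the true dollar volume of the underlying trades.
true_dollar_volume = (trades["price"] * trades["shares"]).sum()
for name, p in [("simple average", mean_price),
                ("volume-weighted", vwap),
                ("last traded", last_price)]:
    print("%s price: %.4f, price x volume: %.0f" % (name, p, p * bar_volume))
print("True dollar volume: %.0f" % true_dollar_volume)
```

Only the volume-weighted price multiplied by the bar's share volume reproduces the true dollar volume; the simple average and the last traded price both give a different figure for this bar.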

For pricing data, Quantopian currently (as of April 2017) provides last traded prices at minute resolution. The volume for each bar is the sum of all volume traded in that bar. While we do not offer minute-level volume-weighted prices, a daily volume-weighted price can be approximated from the minute bars.
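One way to approximate a daily volume-weighted price from minute bars is to weight each bar's last traded price by that bar's volume. The sketch below uses a small synthetic DataFrame; the `minute_bars` name, its `price` and `volume` columns, and all of the values are made up for illustration and do not come from any real data feed.

```python
import pandas as pd

# Synthetic minute bars: last traded price and total share volume per minute.
idx = pd.to_datetime([
    "2017-06-01 09:31", "2017-06-01 09:32", "2017-06-01 09:33",
    "2017-06-02 09:31", "2017-06-02 09:32",
])
minute_bars = pd.DataFrame(
    {
        "price": [10.0, 10.5, 11.0, 12.0, 11.0],
        "volume": [100, 300, 100, 200, 200],
    },
    index=idx,
)

# Approximate each bar's dollar volume as last traded price times bar volume.
minute_bars["dollar_volume"] = minute_bars["price"] * minute_bars["volume"]

# Sum within each calendar day, then divide to get the approximate daily VWAP.
day = minute_bars.index.normalize()
daily = minute_bars.groupby(day)[["dollar_volume", "volume"]].sum()
daily["approx_vwap"] = daily["dollar_volume"] / daily["volume"]

print(daily)
```

This is only an approximation: the last traded price in a bar need not equal the volume-weighted price of the trades inside that bar, so the error depends on how much the price moves within each minute.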

Let's look at some volume data.

In [None]:
start_date = '2015-06-01'
end_date = '2015-06-08'
data = pdr.get_data_yahoo('SPY', start=start_date, end=end_date, interval="1m")

In [None]:
data = yf.download(tickers="NVDA", start=start_date, end=end_date, interval="1m")

In [2]:
# Request daily bars; the output shown below has one row per trading day
data = pdr.get_data_yahoo("SPY", start="2017-06-01", end="2017-06-06", interval="d")

In [3]:
# data.head()
print(data)

                  High         Low        Open       Close    Volume   Adj Close
Date
2017-06-01  243.380005  241.639999  241.970001  243.360001  68962000  229.565781
2017-06-02  244.350006  243.080002  243.419998  244.169998  88666100  230.329880
2017-06-05  244.300003  243.759995  243.970001  243.990005  44698800  230.160095
2017-06-06  243.979996  243.119995  243.339996  243.210007  50375400  229.424271
