# VWAP

## 1 Notations
* $VWAP_{t} = \frac{\sum_{k \in S,\, k \leq t} TP_{k} \cdot V_{k}}{\sum_{k \in S,\, k \leq t} V_{k}}$, where $S$ is the set of minute bars in the current session
* $Z_{t} = \frac{CP_{t} - VWAP_{t}}{\sigma_{t}}$
* $V_{t}$ = Volume of the t-th minute bar
* $TP_{t}$ = Typical price = $\frac{HP_{t} + LP_{t} + CP_{t}}{3}$
* $HP_{t}$ = High price of the t-th minute bar
* $LP_{t}$ = Low price of the t-th minute bar
* $CP_{t}$ = Closing price of the t-th minute bar
* $\sigma_{t}$ = Rolling standard deviation of ($CP_{t} - VWAP_{t}$)
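
As a minimal sketch of the definitions above, the cumulative sums can be written directly in pandas. The three bars are made-up illustrative values, and `window=2` is chosen only so that this tiny example produces a non-NaN $Z_{t}$.

```python
import pandas as pd

# Three made-up minute bars (illustrative values, not real market data)
bars = pd.DataFrame({
    "high":   [10.0, 11.0, 12.0],
    "low":    [8.0, 9.0, 10.0],
    "close":  [9.0, 10.0, 11.0],
    "volume": [100, 200, 100],
})

# TP_t = (HP_t + LP_t + CP_t) / 3
bars["typical"] = (bars["high"] + bars["low"] + bars["close"]) / 3

# VWAP_t = cumulative sum of TP_k * V_k divided by cumulative sum of V_k
bars["vwap"] = (bars["typical"] * bars["volume"]).cumsum() / bars["volume"].cumsum()

# sigma_t = rolling std of (CP_t - VWAP_t); Z_t = (CP_t - VWAP_t) / sigma_t
deviation = bars["close"] - bars["vwap"]
bars["z"] = deviation / deviation.rolling(window=2).std()
```

The first $Z_{t}$ is NaN because the rolling window is not yet full, which is also why the strategy cannot trade on the earliest bars.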

## 2 Mean Reversion Strategy
* Position: long only (for now)
* Entry: $Z_{t} \leq$ `z_entry` $\in [-2.0, -1.5]$
* Exit: $Z_{t} \geq$ `z_exit` $\in [-0.5, -0.2]$
* Sigma stop: $Z_{t} \leq$ `z_stop` $\in [-3.5, -3.0]$; sell and enter cooldown
* Time stop: exit once the holding time reaches $T \in [60, 120]$ minutes
* Reset: $Z_{t} \geq$ `z_reset` $\in [-0.2, 0.0]$; ends the cooldown after a sigma stop
* Normalization: no trading during the first $t \in [30, 60]$ minutes of the session
* Flat by the end of the day
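
The rules above can be sketched as a single decision function. This is only an illustration: the name `decide`, the action labels it returns, and the order in which the rules are checked are assumptions, as is the extra condition that an entry requires $Z_{t}$ to sit above `z_stop`. The default thresholds are taken from the ranges listed above.

```python
def decide(z, holding, status, minutes_held=0,
           z_entry=-2.0, z_exit=-0.5, z_stop=-3.0, z_reset=-0.2, max_hold=120):
    """Return one hypothetical action label for the current bar.

    z            -- current Z_t (NaN before the rolling window fills)
    holding      -- True if a long position is open
    status       -- "OK", "Cooldown" or "Normalization"
    minutes_held -- minutes since the open position was entered
    """
    if z != z:                        # NaN never equals itself: no signal yet
        return "hold"
    if holding:
        if z <= z_stop:               # sigma stop: the move kept extending
            return "stop"
        if minutes_held >= max_hold:  # time stop
            return "time_stop"
        if z >= z_exit:               # price reverted toward VWAP
            return "exit"
        return "hold"
    if status == "Cooldown" and z >= z_reset:
        return "reset"                # allow entries again
    # Assumption: skip entries that would hit the sigma stop immediately
    if status == "OK" and z_stop < z <= z_entry:
        return "enter"
    return "hold"
```

For example, `decide(-2.1, holding=False, status="OK")` yields `"enter"`, while the same $Z_{t}$ during normalization yields `"hold"`.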

## 3 Code

In [1]:
import numpy as np
import pandas as pd
import pandas_ta as ta
import matplotlib.pyplot as plt

### 3.1 Data Management
* Each CSV file is named in the `"YYYY-MM-DD.csv"` format
* It has columns: `["ticker", "volume", "open", "close", "high", "low", "window_start", "transactions"]`
* Each file holds the minute-bar aggregates of every US stock ticker for the given day; the `"ticker"` column identifies the symbol
* `"window_start"` is the bar's start time as Unix epoch time in nanoseconds

In [2]:
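def datafeed(file_name, ticker_symbol):
    """Load one day's minute bars for a single ticker, indexed by bar start time."""
    df = pd.read_csv(file_name)
    # .copy() so the column assignment below does not write to a slice of the full frame
    df = df[df["ticker"] == ticker_symbol].copy()
    # Epoch nanoseconds parse to naive timestamps in UTC
    df["window_start"] = pd.to_datetime(df["window_start"], unit="ns")
    df = df.set_index("window_start").sort_index()
    # NOTE: this window is applied to UTC clock times, not US Eastern time
    df = df.between_time("09:30", "16:00")
    return df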
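
Because `pd.to_datetime(..., unit="ns")` yields naive UTC timestamps, the `"09:30"` to `"16:00"` filter in `datafeed` is applied to UTC clock times. If the intent is the regular US session, one option is to convert to Eastern time before filtering. The sketch below assumes the epoch values are UTC; the name `datafeed_eastern` and the choice to drop the time zone after converting are illustrative, not part of the original notebook.

```python
import io
import pandas as pd

def datafeed_eastern(file_name, ticker_symbol):
    """Variant of datafeed that filters on US Eastern clock time (an assumption)."""
    df = pd.read_csv(file_name)
    df = df[df["ticker"] == ticker_symbol].copy()
    ts = pd.to_datetime(df["window_start"], unit="ns", utc=True)
    # Convert to New York time, then drop the zone so the index stays naive
    df["window_start"] = ts.dt.tz_convert("America/New_York").dt.tz_localize(None)
    df = df.set_index("window_start").sort_index()
    return df.between_time("09:30", "16:00")

# Tiny in-memory file with two made-up bars:
# 13:29 UTC and 13:30 UTC on 2025-09-03, i.e. 09:29 and 09:30 Eastern (EDT)
csv_text = (
    "ticker,volume,open,close,high,low,window_start,transactions\n"
    "META,10,1.0,1.0,1.0,1.0,1756906140000000000,1\n"
    "META,20,1.0,1.0,1.0,1.0,1756906200000000000,1\n"
)
bars = datafeed_eastern(io.StringIO(csv_text), "META")
```

Only the 09:30 Eastern bar survives the filter here. Switching the real backtest to this variant would change which bars are loaded, so any recorded results would need to be regenerated.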

### 3.2 VWAP Calculation
* The rolling standard deviation defaults to `window=60`, i.e. the last 60 minute bars (one hour when no bars are missing)

In [3]:
def compute_typical_price(df):
    df["typical"] = (df["high"] + df["low"] + df["close"]) / 3
    return df

In [4]:
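def compute_vwap(df):
    # ta.vwap derives the typical price from high, low and close itself,
    # so the "typical" column is not needed here; it requires a DatetimeIndex
    df["vwap"] = ta.vwap(high=df["high"], low=df["low"], close=df["close"], volume=df["volume"])
    return df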

In [5]:
def compute_rolling_std(df, window=60):
    df["rolling_std"] = (df["close"] - df["vwap"]).rolling(window=window).std()
    return df

In [6]:
def compute_z_score(df):
    df["z"] = (df["close"] - df["vwap"]) / df["rolling_std"]
    return df

### 3.3 Transaction Log
* We initialize the log with a chosen starting cash amount and a position of 0
* A helper function computes the current equity from position, price, and cash

In [7]:
def create_log(df_data, cash):
    df_log = pd.DataFrame(np.nan, index=df_data.index, columns=["cash", "position", "price", "equity", "status"])
    df_log.at[df_log.index[0], "cash"] = cash
    df_log.at[df_log.index[0], "position"] = 0
    df_log.at[df_log.index[0], "price"] = df_data.at[df_data.index[0], "close"]
    df_log["status"] = "OK"
    return df_log

In [8]:
def update_equity(df):
    df["equity"] = df["position"] * df["price"] + df["cash"]
    return df

### 3.4 Paper Broker Orders
* Because of the small size of the trade, each order is "all in" for now
* We simultaneously write to the log whenever an order is executed

In [9]:
def broker_buy(df_data, df_log, quantity, index):
    buy_price = df_data.at[index, "close"]
    df_log.at[index, "price"] = buy_price
    df_log.at[index, "position"] += quantity
    df_log.at[index, "cash"] -= buy_price * quantity
    update_equity(df_log)
    return df_log

In [10]:
def broker_sell(df_data, df_log, quantity, index):
    sell_price = df_data.at[index, "close"]
    df_log.at[index, "price"] = sell_price
    df_log.at[index, "position"] -= quantity
    df_log.at[index, "cash"] += sell_price * quantity
    update_equity(df_log)
    return df_log
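
To make the "all in" bookkeeping concrete, here is a small worked example in plain Python, using the starting cash from the backtest in Section 4 and the first close shown in its output (741.00). The helper name `all_in_quantity` is hypothetical.

```python
def all_in_quantity(cash, price):
    """Largest whole number of shares that the available cash can buy."""
    return int(cash // price)

cash, position, price = 10000.00, 0, 741.00

# Buy: cash goes down by price * quantity, position goes up by quantity
quantity = all_in_quantity(cash, price)
cash -= price * quantity
position += quantity

# Equity right after the fill is unchanged, because the shares are
# marked at the same price they were bought at
equity = position * price + cash
```

With these numbers the order is for 13 shares, leaving 367.00 in cash and equity at 10000.00. Commissions and slippage are not modeled, matching the paper broker functions above.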

### 3.5 Strategy Execution
* For now, execution only updates the log, because orders go through the paper broker functions rather than a real broker
* There are better ways to decide when to start trading than a strict one-hour wait, but we use the simple rule for now

In [11]:
def vwap_entry(df_data, df_log, index, z_entry=-2.0):
    if df_log.at[index, "status"] == "OK" and df_data.at[index, "z"] <= z_entry and df_log.at[index, "cash"] > 0:
        quantity = df_log.at[index, "cash"] // df_data.at[index, "close"]
        df_log = broker_buy(df_data, df_log, quantity, index)
    return df_log

In [12]:
def vwap_exit(df_data, df_log, index, z_exit=-0.5):
    if df_data.at[index, "z"] >= z_exit:
        quantity = df_log.at[index, "position"]
        df_log = broker_sell(df_data, df_log, quantity, index)
    return df_log

In [13]:
def vwap_sigma_stop(df_data, df_log, index, z_stop=-3.0):
    if df_data.at[index, "z"] <= z_stop:
        quantity = df_log.at[index, "position"]
        df_log = broker_sell(df_data, df_log, quantity, index)
        df_log.loc[index:, "status"] = "Cooldown"
    return df_log

In [14]:
#def vwap_time_stop(df_data, df_log, index, stop_interval=120):
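
One possible shape for the missing time stop is sketched below. It infers the entry bar from the log itself, taking the first bar after the most recent flat bar. The helper names `holding_minutes` and `time_stop_due` are hypothetical, and wiring the result into `broker_sell` is left out so that the block stays self-contained.

```python
import pandas as pd

def holding_minutes(df_log, index):
    """Minutes the current long position has been held as of `index` (0.0 if flat)."""
    position = df_log.loc[:index, "position"]
    if not position.iloc[-1] > 0:
        return 0.0
    flat = position.index[position == 0]
    if len(flat) == 0:
        entry = position.index[0]
    else:
        # Entry bar is the first bar after the most recent flat bar
        entry = position.loc[flat[-1]:].index[1]
    return (index - entry) / pd.Timedelta(minutes=1)

def time_stop_due(df_log, index, stop_interval=120):
    """True once the position has been held for at least stop_interval minutes."""
    return holding_minutes(df_log, index) >= stop_interval

# Made-up log: flat for two bars, then long 13 shares, one bar per hour
idx = pd.date_range("2025-09-03 10:00", periods=5, freq="60min")
log = pd.DataFrame({"position": [0, 0, 13, 13, 13]}, index=idx)

held_after_one_bar = holding_minutes(log, idx[3])
stop_now = time_stop_due(log, idx[4])
```

In the made-up log the position is entered at 12:00, so it has been held for 60 minutes at 13:00 and reaches the 120-minute stop at 14:00. Measuring in wall-clock minutes rather than bar counts is a design choice that matters when bars are missing.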

In [15]:
def vwap_reset(df_data, df_log, index, z_reset=-0.2):
    if df_log.at[index, "status"] == "Cooldown" and df_data.at[index, "z"] >= z_reset:
        df_log.loc[index:, "status"] = "OK"
    return df_log

In [16]:
def vwap_normalization(df_log, normalization_interval=60):
    start = df_log.index[0]
    end = start + pd.Timedelta(minutes=normalization_interval)
    df_log.loc[start:end, "status"] = "Normalization"
    return df_log

In [17]:
def vwap_flatten(df_data, df_log):
    last_index = df_log.index[-1]
    if df_log.at[last_index, "position"] != 0:
        quantity = df_log.at[last_index, "position"]
        df_log = broker_sell(df_data, df_log, quantity, last_index)
    return df_log

## 4 Backtest

In [18]:
backtest_data = datafeed("data/2025-09-03.csv", "META")

backtest_data = compute_typical_price(backtest_data)
backtest_data = compute_vwap(backtest_data)
backtest_data = compute_rolling_std(backtest_data)
backtest_data = compute_z_score(backtest_data)

In [19]:
backtest_log = create_log(backtest_data, 10000.00)

In [20]:
backtest_log = vwap_normalization(backtest_log)

for idx, i in enumerate(backtest_log.index):
    if idx == 0:
        continue  # the first row was initialized by create_log

    # Carry the previous bar's cash and position forward and mark at the current close
    prev_i = backtest_log.index[idx-1]
    backtest_log.at[i, "cash"] = backtest_log.at[prev_i, "cash"]
    backtest_log.at[i, "position"] = backtest_log.at[prev_i, "position"]
    backtest_log.at[i, "price"] = backtest_data.at[i, "close"]

    # Apply the rules in order: entry, exit, sigma stop, reset
    backtest_log = vwap_entry(backtest_data, backtest_log, i, -1.5)
    backtest_log = vwap_exit(backtest_data, backtest_log, i, -0.2)
    backtest_log = vwap_sigma_stop(backtest_data, backtest_log, i, -3.5)
    backtest_log = vwap_reset(backtest_data, backtest_log, i, -0.2)

backtest_log = vwap_flatten(backtest_data, backtest_log)
backtest_log = update_equity(backtest_log)

## 5 Result Visualization

In [21]:
backtest_data

| window_start | ticker | volume | open | close | high | low | transactions | typical | vwap | rolling_std | z |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2025-09-03 09:40:00 | META | 263 | 741.0000 | 741.0000 | 741.00 | 741.0000 | 15 | 741.000000 | 741.000000 | NaN | NaN |
| 2025-09-03 11:00:00 | META | 1802 | 741.4400 | 740.8000 | 741.44 | 740.8000 | 71 | 741.013333 | 741.011635 | NaN | NaN |
| 2025-09-03 11:01:00 | META | 301 | 741.3600 | 741.3600 | 741.36 | 741.3600 | 9 | 741.360000 | 741.055954 | NaN | NaN |
| 2025-09-03 11:07:00 | META | 2154 | 741.4800 | 741.4900 | 741.49 | 741.4800 | 105 | 741.486667 | 741.261209 | NaN | NaN |
| 2025-09-03 11:10:00 | META | 528 | 741.0800 | 741.2000 | 741.20 | 741.0800 | 9 | 741.160000 | 741.250623 | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2025-09-03 15:56:00 | META | 10107 | 736.0520 | 736.1985 | 736.35 | 735.9500 | 411 | 736.166167 | 736.942697 | 0.909336 | -0.818396 |
| 2025-09-03 15:57:00 | META | 13179 | 736.2999 | 735.6400 | 736.30 | 735.6400 | 479 | 735.860000 | 736.938404 | 0.907808 | -1.430263 |
| 2025-09-03 15:58:00 | META | 8129 | 735.7700 | 735.9750 | 736.07 | 735.7105 | 337 | 735.918500 | 736.935916 | 0.907020 | -1.059422 |
| 2025-09-03 15:59:00 | META | 7597 | 735.9000 | 736.0400 | 736.13 | 735.8400 | 327 | 736.003333 | 736.933795 | 0.906920 | -0.985528 |


In [22]:
backtest_log

| window_start | cash | position | price | equity | status |
|---|---|---|---|---|---|
| 2025-09-03 09:40:00 | 10000.0000 | 0.0 | 741.0000 | 10000.0000 | Normalization |
| 2025-09-03 11:00:00 | 10000.0000 | 0.0 | 740.8000 | 10000.0000 | OK |
| 2025-09-03 11:01:00 | 10000.0000 | 0.0 | 741.3600 | 10000.0000 | OK |
| 2025-09-03 11:07:00 | 10000.0000 | 0.0 | 741.4900 | 10000.0000 | OK |
| 2025-09-03 11:10:00 | 10000.0000 | 0.0 | 741.2000 | 10000.0000 | OK |
| ... | ... | ... | ... | ... | ... |
| 2025-09-03 15:56:00 | 527.7502 | 13.0 | 736.1985 | 10098.3307 | OK |
| 2025-09-03 15:57:00 | 527.7502 | 13.0 | 735.6400 | 10091.0702 | OK |
| 2025-09-03 15:58:00 | 527.7502 | 13.0 | 735.9750 | 10095.4252 | OK |
| 2025-09-03 15:59:00 | 527.7502 | 13.0 | 736.0400 | 10096.2702 | OK |
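
Alongside inspecting the log, a compact numeric summary of the equity column can be useful. The sketch below is an assumption about what such a summary might contain (total return and maximum drawdown); the function name `summarize_equity` is hypothetical and the example values are made up.

```python
import pandas as pd

def summarize_equity(equity):
    """Total return and maximum drawdown of an equity curve, both as fractions."""
    total_return = equity.iloc[-1] / equity.iloc[0] - 1
    # Drawdown at each bar: distance below the running peak
    drawdown = equity / equity.cummax() - 1
    return {"total_return": total_return, "max_drawdown": drawdown.min()}

# Made-up equity values for illustration
example = pd.Series([10000.0, 10100.0, 9999.0, 10201.0])
summary = summarize_equity(example)
```

Applied to the log above, the call would be `summarize_equity(backtest_log["equity"])`.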


## 6 Notes

### 6.1 To-Do List
* Adjust `datafeed` to accept real-time data from `polygon.io`
* Update the broker order functions to connect to a broker API
* Optimize the `enumerate` loop in the backtest
* Implement the `vwap_time_stop` function
* Work out how live minute bars should be handled compared with the closing prices used in the backtest

### 6.2 Ideal Characteristics for VWAP Mean Reversion
* High liquidity / large market cap
* Range-bound or non-trending behavior intraday
* Regular volume profiles
* Moderate volatility: enough to stretch price away from VWAP, but not so much that it breaks away instead of reverting
* VWAP “respected” historically as support/resistance
* Example candidates: SPY, QQQ, AAPL, MSFT, GOOG, AMZN, META