<div style="background-color:#000;"><img src="pqn.png"></img></div>

## Imports and setup

We’re pulling in specialized tools for getting financial data, crunching numbers, and running backtests. This group of libraries gives us everything we need to handle price history and simulate trading strategies.

In [None]:
import pandas as pd
import numpy as np
import yfinance as yf

The next imports come from Zipline: the backtesting API (ordering, scheduling, and cost settings), the trading calendar, the commission and slippage models, and the tools for registering and ingesting a data bundle.

In [None]:
import zipline
from zipline.api import set_slippage, set_commission, order_target_percent, symbol, schedule_function, date_rules, time_rules
from zipline.utils.calendars import get_calendar
from zipline.finance import commission, slippage

In [None]:
from zipline.data.bundles import register, unregister
import shutil
import os

In [None]:
from zipline.data.bundles.core import ingest
from zipline.data.bundles.yahoofinance import yahoo_equities

In [None]:
from zipline import run_algorithm
from zipline.api import set_benchmark

This set of imports will let us collect and prepare price data, then set up our environment for running and evaluating an investment strategy. By using these, we can backtest well-defined, rules-based strategies with real price history.

## Load and process stock price data

Here, we gather several years of daily price data for well-known tech companies and clean it up to remove any gaps.

In [None]:
tickers = ["AAPL", "MSFT", "GOOGL", "META", "AMZN"]
start_date = "2016-01-01"
end_date = "2023-12-31"

In [None]:
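# auto_adjust=False keeps the "Adj Close" column; newer yfinance releases adjust prices by default and drop it
price_data = yf.download(tickers, start=start_date, end=end_date, auto_adjust=False, progress=False)["Adj Close"]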
price_data = price_data.dropna(axis=0, how="any")

We select a handful of familiar technology stocks and specify the time period for our backtest. We keep only the adjusted closing prices, which account for splits and dividends so that returns are comparable across the whole period. We then drop any day on which at least one ticker is missing a price, so every remaining row has data for all five stocks and later calculations aren’t distorted by gaps.
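To make the effect of the row-wise `dropna` concrete, here is a minimal sketch on a made-up price table. The tickers `AAA` and `BBB` and their values are hypothetical, not downloaded data. Any date on which at least one ticker is missing gets dropped, so every remaining row has a price in each column.

```python
import numpy as np
import pandas as pd

# Hypothetical prices; NaN marks a day where one ticker has no quote.
toy_prices = pd.DataFrame(
    {"AAA": [10.0, 10.5, np.nan, 11.0], "BBB": [20.0, np.nan, 21.0, 21.5]},
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06", "2020-01-07"]),
)

# Same call as in the notebook: drop a row if any column is missing.
clean = toy_prices.dropna(axis=0, how="any")
print(clean)
```

Only the first and last dates survive here, because each of the two middle rows is missing one ticker.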

Next, we calculate how much each stock's price changes every day and build a factor based on those moves, which will drive our trading signals.

In [None]:
mean_rev_window = 5
returns = price_data.pct_change()
factor_scores = -(returns - returns.rolling(mean_rev_window).mean())
factor_scores = factor_scores.shift(1)

We’re focusing on mean reversion: a quick way to spot when a stock's daily move is out of line with its recent history. For each stock, we compare the day's return to its rolling 5-day average return (the window includes that day). The minus sign flips the result, so a return well above its average gets a negative score (a potential sell signal) and a return well below it gets a positive score (a potential buy). Finally, `shift(1)` lags the scores by one trading day, so the score we trade on at any date was computed from data through the previous close. This keeps the backtest from using information that wouldn't have been known at the time.
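The following is a small sketch of the same calculation on made-up numbers, to show where the missing values come from and what the one-day lag does. The ticker `AAA`, its prices, and the 3-day window are assumptions chosen to keep the toy series short; the notebook itself uses a 5-day window.

```python
import pandas as pd

# Hypothetical closing prices for one made-up ticker.
prices = pd.DataFrame(
    {"AAA": [100.0, 101.0, 99.0, 102.0, 103.0, 100.0, 104.0]},
    index=pd.bdate_range("2020-01-06", periods=7),
)

window = 3  # shorter than the notebook's 5 so the toy series is long enough
rets = prices.pct_change()
# Positive score: the day's return is below its rolling average (candidate buy).
raw_scores = -(rets - rets.rolling(window).mean())
# Lag by one row so the score used on day t was computed from data through t-1.
lagged_scores = raw_scores.shift(1)

print(pd.concat({"raw": raw_scores["AAA"], "lagged": lagged_scores["AAA"]}, axis=1))
```

The first usable raw score appears on the fourth row (one row is lost to `pct_change`, two more to the rolling window), and the lagged column shows that same value one row later.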

## Define and configure trading strategy

We map stock tickers into the trading engine’s format so we can access each security by name, and set up the core functions that will apply our trading rules and manage costs.

In [None]:
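# symbol() only works while a simulation is running, so start with an empty
# ticker -> asset map and fill it in initialize()
assets = {}

In [None]:
def initialize(context):
    # No slippage from the bid-ask spread; commission of $0.001 per share
    set_slippage(slippage.FixedSlippage(spread=0.0))
    set_commission(commission.PerShare(cost=0.001))
    # Resolve each ticker to its asset now that the simulation has started
    assets.update({t: symbol(t) for t in tickers})
    context.assets = list(assets.values())
    # Rebalance every trading day, one minute after the open
    schedule_function(rebalance, date_rule=date_rules.every_day(), time_rule=time_rules.market_open(minutes=1))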

In [None]:
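def rebalance(context, data):
    # Look up today's (already lagged) scores by calendar date
    date = data.current_dt.strftime('%Y-%m-%d')
    if date not in factor_scores.index:
        return
    scores = factor_scores.loc[date]
    scores = scores.dropna()
    if len(scores) == 0:
        return
    # Highest scores first: most "oversold" at the top, most "overbought" at the bottom
    scores = scores.sort_values(ascending=False)
    top = scores.head(2)
    bottom = scores.tail(2)
    # Float weights, so the fractional assignments below don't hit an integer Series
    weights = pd.Series(0.0, index=scores.index)
    if len(top) > 0:
        weights[top.index] = 0.5 / len(top)
    if len(bottom) > 0:
        weights[bottom.index] = -0.5 / len(bottom)
    # Move each position to its target share of portfolio value
    for t in scores.index:
        order_target_percent(assets[t], weights[t])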

Here, we resolve each ticker to a tradeable asset and define the main strategy logic. Every trading day, one minute after the open, `rebalance` sorts the stocks by their mean reversion score. We go long the two with the highest scores (the most “oversold”) and short the two with the lowest (the most “overbought”); any remaining stock gets a zero target weight. Each side is split evenly and sums to half the portfolio value, so the book targets 50% long and 50% short. We charge a commission of $0.001 per share and set the slippage spread to zero, so orders fill with no extra cost from the bid-ask spread. Zipline calls `rebalance` automatically at the scheduled time.
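The weighting rule can be checked in isolation, without running a backtest. Below is a minimal sketch: `long_short_weights` is a hypothetical helper (not part of the notebook or of Zipline), and the example scores are invented. It also adds one guard the notebook's `rebalance` does not have: if fewer than four scores are available, the long and short legs would overlap, so the sketch stays flat instead.

```python
import pandas as pd

def long_short_weights(scores: pd.Series, n: int = 2) -> pd.Series:
    """Hypothetical helper: +0.5 split across the n highest scores, -0.5 across the n lowest."""
    ranked = scores.dropna().sort_values(ascending=False)
    weights = pd.Series(0.0, index=ranked.index)
    if len(ranked) < 2 * n:
        # Too few names to form non-overlapping legs; stay flat (an added assumption).
        return weights
    weights.loc[ranked.index[:n]] = 0.5 / n
    weights.loc[ranked.index[-n:]] = -0.5 / n
    return weights

# Invented scores for illustration only.
example_scores = pd.Series(
    {"AAPL": 0.012, "MSFT": -0.004, "GOOGL": 0.001, "META": -0.015, "AMZN": 0.007}
)
example_weights = long_short_weights(example_scores)
print(example_weights)
```

With these scores, the two highest (`AAPL`, `AMZN`) each get a weight of 0.25, the two lowest (`MSFT`, `META`) each get -0.25, and `GOOGL` gets 0, so the weights net to zero.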

## Build and register the data bundle

We make sure the data we use for backtesting is stored in a format the trading engine expects, cleaning up any old data and loading a fresh copy specific to our stocks and timeframe.

In [None]:
bundle_name = "meanrev_bundle"

In [None]:
if bundle_name in zipline.data.bundles.bundles:
    unregister(bundle_name)
if os.path.exists(os.path.expanduser(f"~/.zipline/data/{bundle_name}")):
    shutil.rmtree(os.path.expanduser(f"~/.zipline/data/{bundle_name}"))

In [None]:
register(
    bundle_name,
    yahoo_equities(tickers, start=start_date, end=end_date, show_progress=True),
)

In [None]:
ingest(bundle_name, show_progress=True)

Before running our simulation, we need to process our data into a “bundle.” We unregister any earlier bundle with the same name and delete its stored data to avoid confusion, then register and ingest a fresh feed from Yahoo Finance for our tickers and date range. This ensures that when the backtest runs, it reads consistent data built specifically for this model.

## Run the backtest and view results

Now, we set our simulation parameters, run the backtest, and visualize how our trading rules performed over time.

In [None]:
calendar = get_calendar("XNYS")
capital_base = 100000

In [None]:
# set_benchmark is only allowed inside initialize, so wrap the strategy's
# initialize instead of calling it from before_trading_start
def initialize_with_benchmark(context):
    initialize(context)
    set_benchmark(symbol("AAPL"))

In [None]:
results = run_algorithm(
    start=pd.Timestamp(start_date, tz="utc"),
    end=pd.Timestamp(end_date, tz="utc"),
    initialize=initialize_with_benchmark,
    capital_base=capital_base,
    bundle=bundle_name,
    trading_calendar=calendar,
)

In [None]:
results[["portfolio_value"]].plot(title="Mean Reversion Strategy Portfolio Value", figsize=(12, 6))

We tell the backtester to use the New York Stock Exchange calendar (`XNYS`) and start with a cash balance of $100,000. Apple serves as the benchmark against which results are compared. Then we run the simulation with our rules and the ingested bundle. Finally, we plot the total account value over the backtest period, which gives a quick read on how the mean reversion approach would have performed historically under the zero-slippage, low-commission assumptions set earlier.
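A chart is easier to judge alongside a few numbers. The sketch below computes total return, an annualized Sharpe ratio, and maximum drawdown from a daily portfolio value series. `summarize` is a hypothetical helper, the demo values are invented, and the Sharpe ratio assumes a zero risk-free rate and 252 trading days per year.

```python
import numpy as np
import pandas as pd

def summarize(portfolio_value: pd.Series, periods_per_year: int = 252) -> dict:
    """Hypothetical helper: basic statistics from a daily portfolio value series."""
    daily_returns = portfolio_value.pct_change().dropna()
    total_return = portfolio_value.iloc[-1] / portfolio_value.iloc[0] - 1
    volatility = daily_returns.std()
    # Annualized Sharpe ratio with a zero risk-free rate (an assumption).
    if volatility > 0:
        sharpe = np.sqrt(periods_per_year) * daily_returns.mean() / volatility
    else:
        sharpe = float("nan")
    # Max drawdown: the worst decline from a running peak.
    drawdown = portfolio_value / portfolio_value.cummax() - 1
    return {"total_return": total_return, "sharpe": sharpe, "max_drawdown": drawdown.min()}

# Invented account values for illustration only.
demo_value = pd.Series(
    [100000.0, 101000.0, 99000.0, 102000.0, 100500.0],
    index=pd.bdate_range("2023-01-02", periods=5),
)
stats = summarize(demo_value)
print(stats)
```

Applied to the backtest output, the same helper would be called as `summarize(results["portfolio_value"])`.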

<a href="https://pyquantnews.com/">PyQuant News</a> is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. Looking to get started? Check out the fastest growing, top-selling course to <a href="https://gettingstartedwithpythonforquantfinance.com/">get started with Python for quant finance</a>. For educational purposes. Not investment advice. Use at your own risk.