<div style="background-color:#000;"><img src="pqn.png"></img></div>

We use numpy for math operations, pandas for organizing data, yfinance for pulling price data, seaborn and matplotlib for charting, scipy.stats for statistics, and warnings to keep output clean.

## Imports and setup

This block imports all the libraries needed for data handling, analysis, and visualization, and suppresses warnings to keep output readable.

In [None]:
import numpy as np
import pandas as pd
import yfinance as yf
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats import spearmanr
import warnings
warnings.filterwarnings("ignore")

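With the imports in place, numpy and pandas hold and transform the data, yfinance downloads prices, seaborn and matplotlib draw the charts, and `spearmanr` from scipy.stats supplies the rank correlation used at the end. `warnings.filterwarnings("ignore")` silences every warning for the session, which keeps the output short but also hides deprecation notices from the libraries.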

## Download and prepare market data

This block pulls daily data for a list of large technology companies from Yahoo Finance, then restructures the data so it's easy to analyze by date and ticker symbol.

In [None]:
symbols = ["AAPL", "MSFT", "GOOGL", "META", "NVDA", "AMZN", "TSLA"]
prices = (
    yf.download(
        symbols,
        start="2020-01-01",
        end="2024-12-31",
        auto_adjust=False,
        progress=False,
    )
    .stack(level=1)
    .reset_index()
    .set_index("Date")
)
prices.columns = [col.lower() for col in prices.columns]

In [None]:
grouped_prices = (
    prices
    .set_index("ticker", append=True)
    .reorder_levels(["ticker", "Date"])
    .sort_index(level=[0, 1])
)

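We collect daily price data for seven symbols (AAPL, MSFT, GOOGL, META, NVDA, AMZN, and TSLA) from January 2020 through December 2024, about five years. Because `auto_adjust=False`, the download keeps both `close` and a separate `adj close` column, and the analysis that follows uses `close`. Stacking the ticker level moves the data from one column per field and ticker to one row per date and ticker, and all column names are lowercased for consistency. The final frame is indexed by ticker and then by date and sorted in that order, so each company's history is contiguous and per-ticker rolling and shifting operations stay within a single company.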
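
To make the reshaping concrete without a network call, here is a minimal sketch on a synthetic frame. The tickers `AAA` and `BBB`, the values, and the column level names `Price` and `Ticker` are assumptions chosen to mimic the layout the download cell relies on. They are not real market data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the wide download: columns are a two-level
# MultiIndex of (price field, ticker). All values are made up.
dates = pd.date_range("2024-01-02", periods=3, freq="B", name="Date")
columns = pd.MultiIndex.from_product(
    [["Close", "Volume"], ["AAA", "BBB"]], names=["Price", "Ticker"]
)
wide = pd.DataFrame(
    np.arange(12, dtype=float).reshape(3, 4), index=dates, columns=columns
)

# Move the ticker level from the columns into the rows (wide -> long).
long_df = wide.stack(level=1).reset_index().set_index("Date")
long_df.columns = [col.lower() for col in long_df.columns]

# Index by (ticker, Date) so each ticker's rows are contiguous and sorted.
panel = (
    long_df
    .set_index("ticker", append=True)
    .reorder_levels(["ticker", "Date"])
    .sort_index(level=[0, 1])
)
print(panel)
```

Three dates and two tickers give six rows, one per (ticker, date) pair, with `close` and `volume` as ordinary columns.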

## Calculate mean reversion and returns

This block computes a rolling z-score for each stock (how far the close is from its 22-day average, measured in units of its 22-day standard deviation) and measures returns over several holding periods for later analysis.

In [None]:
window = 22  # about one trading month


def mean_reversion(x):
    # Rolling z-score: distance from the rolling mean, in rolling standard deviations
    rolling = x.rolling(window, min_periods=window)
    return (x - rolling.mean()) / rolling.std()


grouped_prices["factor_score"] = (
    grouped_prices.groupby("ticker")["close"].transform(mean_reversion)
)
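
A small worked example may help check the factor by hand. The sketch below applies the same rolling z-score to two made-up tickers with a 3-day window instead of 22. The helper name `zscore`, the tickers, and the prices are all hypothetical.

```python
import pandas as pd

window = 3  # short window for readability; the notebook uses 22


def zscore(x):
    # Same formula as the factor: (price - rolling mean) / rolling std
    rolling = x.rolling(window, min_periods=window)
    return (x - rolling.mean()) / rolling.std()


index = pd.MultiIndex.from_product(
    [["AAA", "BBB"], range(5)], names=["ticker", "day"]
)
close = pd.Series(
    [10.0, 11.0, 12.0, 11.0, 16.0, 50.0, 50.0, 53.0, 47.0, 50.0],
    index=index,
    name="close",
)

# transform applies zscore to each ticker separately and keeps the index
z = close.groupby(level="ticker").transform(zscore)
print(z.round(4))
```

For AAA on day 2 the window is 10, 11, 12, with mean 11 and sample standard deviation 1, so the score is (12 - 11) / 1 = 1.0. The first two days of BBB are missing, which shows that the rolling window restarts for each ticker rather than borrowing prices from AAA.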

In [None]:
lags = [1, 5, 10, 22, 42, 63, 126]
for lag in lags:
    grouped_prices[f"return_{lag}d"] = (
        grouped_prices
        .groupby(level="ticker")
        .close
        .pct_change(lag)
    )

In [None]:
for t in lags:
    grouped_prices[f"target_{t}d"] = (
        grouped_prices
        .groupby(level="ticker")[f"return_{t}d"]
        .shift(-t)
    )

In [None]:
grouped_prices.dropna(inplace=True)

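The factor is a rolling z-score: each close minus its 22-day rolling mean, divided by the 22-day rolling standard deviation. Twenty-two trading days is roughly one month, so the score says how many standard deviations the price sits above or below its own recent average. We then compute trailing returns over 1, 5, 10, 22, 42, 63, and 126 trading days. Shifting each trailing return back by its own horizon, within each ticker, turns it into a forward-looking target: `target_22d` on a given date is the return over the following 22 trading days. Finally, `dropna` removes every row with a missing value, which trims roughly the first and last 126 trading days of each ticker, where the longest trailing return or forward target is undefined.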
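
The shift that turns a trailing return into a forward target is easy to get backwards, so here is a minimal check on made-up prices with a 2-day horizon. The series and the `forward` comparison column are illustrative and not part of the notebook.

```python
import pandas as pd

lag = 2  # short horizon for illustration; the notebook uses 1 to 126 days

close = pd.Series([100.0, 110.0, 121.0, 133.1, 146.41], name="close")

trailing = close.pct_change(lag)   # return over the previous `lag` rows
target = trailing.shift(-lag)      # same number, moved back to the start date

# Direct definition of the forward return, for comparison
forward = close.shift(-lag) / close - 1

frame = pd.DataFrame({"close": close, "trailing": trailing, "target": target})
complete = frame.dropna()  # keeps only rows where both columns are defined

print(frame)
print(len(complete))
```

On the first row, `target` is 121 / 100 - 1, the return from the first to the third price. The trailing return is missing on the first two rows and the target on the last two, so `dropna` leaves a single row out of five. The same trimming happens at both ends of each ticker in the full data.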

## Visualize and evaluate relationships

This block plots the factor score against the 22-day forward return for Tesla, then uses the Spearman rank correlation to measure how strongly the score relates to forward returns for each symbol.

In [None]:
target = "target_22d"
metric = "factor_score"
j = sns.jointplot(x=metric, y=target, data=grouped_prices.loc["TSLA"])
plt.tight_layout()

In [None]:
results = (
    grouped_prices
    .groupby("ticker")
    .apply(lambda x: spearmanr(x[metric], x[target]))
    .apply(pd.Series)
    .round(4)
)
results.columns = ["statistic", "p_value"]
results.sort_values("statistic", ascending=False)

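The joint plot for Tesla puts the factor score on the x-axis and the return over the next 22 trading days on the y-axis, with the distribution of each along the margins. We then compute the Spearman rank correlation between the score and the forward return for each ticker, along with its p-value, and sort the table from the most positive coefficient to the most negative. For a mean-reversion signal the negative coefficients are the interesting ones: they mean that a price stretched above its recent average tends to be followed by lower returns, and the reverse. A positive coefficient points to continuation instead. Because the 22-day forward windows overlap from one day to the next, the observations are not independent, so treat the p-values as a rough guide rather than a strict test.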
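
To see how the sign of the coefficient reads, the sketch below runs `spearmanr` on synthetic data built to be perfectly reverting or perfectly trending. The group labels `REV` and `TRD` and both target series are hypothetical constructions, not results from the stocks above.

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
score = rng.normal(size=200)

# Hypothetical targets, constructed only to show how the sign reads
reverting = -(score ** 3)    # strictly decreasing in the score
trending = np.exp(score)     # strictly increasing in the score

frame = pd.DataFrame({
    "ticker": ["REV"] * 200 + ["TRD"] * 200,
    "factor_score": np.tile(score, 2),
    "target": np.concatenate([reverting, trending]),
})

# One Spearman correlation per group, collected into a small table
rows = {}
for ticker, group in frame.groupby("ticker"):
    result = spearmanr(group["factor_score"], group["target"])
    rows[ticker] = {"statistic": result[0], "p_value": result[1]}

# Ascending sort puts the most negative (most reverting) group first
table = pd.DataFrame.from_dict(rows, orient="index").sort_values("statistic")
print(table)
```

Spearman correlation depends only on ranks, so any strictly decreasing relationship scores -1 and any strictly increasing one scores +1, however nonlinear. Real stocks will land somewhere in between, and sorting ascending is one way to bring the strongest mean-reversion candidates to the top.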

<a href="https://pyquantnews.com/">PyQuant News</a> is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. Looking to get started? Check out the fastest growing, top-selling course to <a href="https://gettingstartedwithpythonforquantfinance.com/">get started with Python for quant finance</a>. For educational purposes. Not investment advice. Use at your own risk.