# AI for Trading: A Simple Pattern-Detecting "Trading Bot"

In this notebook we will **not** build a production-ready trading bot.  
Instead, we will use a **very simplified example** to show how AI / machine learning can:

1. Load historical price data from a well-known market (S&P 500 ETF: `SPY`)
2. Visualize the data and some simple indicators
3. Turn the problem into a **prediction task** (e.g. "Will the price go up tomorrow?")
4. Train different models, including a simple **neural network**, to detect patterns
5. Evaluate and reflect on the results and limitations

> ⚠️ **Important disclaimer**  
> This notebook is for **educational purposes only**.  
> It is **not** financial advice and must **not** be used for real-money trading decisions.


In [None]:
# If you're running this in a fresh environment, uncomment the line below to install dependencies:
# !pip install yfinance pandas numpy scikit-learn matplotlib

import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

plt.rcParams["figure.figsize"] = (10, 5)
plt.rcParams["axes.grid"] = True


## 1. A small "data module" to grab market data

In a real trading bot, you usually have a **separate module** responsible for:

- Connecting to a data provider (broker, exchange, or API like Yahoo Finance)
- Downloading historical data
- Cleaning and formatting it in a consistent way

Below we define a **small helper function** that plays the role of such a module. It:
- Downloads data for a given ticker (we'll use `SPY`, an ETF that tracks the S&P 500 index)
- Lets us specify how much history we want (e.g. 2 years) and the bar size


In [None]:
def download_price_data(ticker: str = "SPY", period: str = "2y", interval: str = "1d") -> pd.DataFrame:
    """
    Download historical OHLCV (Open, High, Low, Close, Volume) data using yfinance.
    
    Args:
        ticker: Ticker symbol, e.g. "SPY".
        period: How far back to go (e.g. "1y", "2y", "5y", "max").
        interval: Bar size, e.g. "1d", "1h", "15m".
    
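    Returns:
        DataFrame with datetime index and columns: Open, High, Low, Close, Volume.
        With auto_adjust=True the prices are already adjusted for splits and
        dividends, so there is no separate "Adj Close" column.
    """
    data = yf.download(ticker, period=period, interval=interval, auto_adjust=True)
    # Some yfinance versions return (field, ticker) MultiIndex columns even for a
    # single ticker; flatten them so that data["Close"] is a plain Series.
    if isinstance(data.columns, pd.MultiIndex):
        data.columns = data.columns.get_level_values(0)
    return data.dropna()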

# Let's grab 2 years of daily data for SPY
data = download_price_data("SPY", period="2y", interval="1d")
data.head()
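
The "cleaning and formatting" step mentioned above can be sketched as a separate function. This is only an illustration under stated assumptions: `clean_ohlcv` and `REQUIRED_COLUMNS` are hypothetical names that do not appear elsewhere in this notebook, and the input is a small synthetic frame rather than a real download.

```python
import numpy as np
import pandas as pd

REQUIRED_COLUMNS = ["Open", "High", "Low", "Close", "Volume"]  # hypothetical constant

def clean_ohlcv(raw: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: flatten columns, sort by date, de-duplicate, drop gaps."""
    df = raw.copy()
    # Some data sources return (field, ticker) MultiIndex columns; keep the field level.
    if isinstance(df.columns, pd.MultiIndex):
        df.columns = df.columns.get_level_values(0)
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    df = df[REQUIRED_COLUMNS].sort_index(kind="mergesort")  # stable chronological sort
    df = df[~df.index.duplicated(keep="last")]              # one row per timestamp
    return df.dropna()

# Synthetic stand-in for a download: unsorted, one duplicate date, one missing value.
dates = pd.to_datetime(["2024-01-03", "2024-01-02", "2024-01-03", "2024-01-04"])
raw = pd.DataFrame(
    {
        "Open": [101.0, 100.0, 101.5, 102.0],
        "High": [102.0, 101.0, 102.5, 103.0],
        "Low": [100.0, 99.0, 100.5, 101.0],
        "Close": [101.5, 100.5, 102.0, np.nan],
        "Volume": [1000, 1100, 1200, 1300],
    },
    index=dates,
)
clean = clean_ohlcv(raw)
print(clean)
```

Keeping this logic in one place means every later cell can assume a sorted, gap-free frame with the same column names, whichever provider the data came from.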


## 2. Visualizing the price history

To get some intuition, we'll:
- Plot the **closing price** over time.
- Add a couple of **moving averages**, which are common technical indicators:
  - 20-day moving average (short-term)
  - 50-day moving average (medium-term)

These are simple examples of hand-crafted **features** that traders use to detect trends.

In [None]:
# Compute moving averages
data["MA20"] = data["Close"].rolling(window=20).mean()
data["MA50"] = data["Close"].rolling(window=50).mean()

# Plot
plt.figure()
plt.plot(data.index, data["Close"], label="Close")
plt.plot(data.index, data["MA20"], label="MA 20")
plt.plot(data.index, data["MA50"], label="MA 50")
plt.title("SPY Price with 20- and 50-day Moving Averages")
plt.xlabel("Date")
plt.ylabel("Price")
plt.legend()
plt.show()
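
A rolling mean needs a full window before it can produce a value, which is why the MA lines in the plot start later than the price line. The toy series below (made-up numbers, window of 3 instead of 20 or 50) shows that warm-up gap.

```python
import pandas as pd

prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])  # made-up prices

# Default behaviour: the first (window - 1) values are NaN.
ma3 = prices.rolling(window=3).mean()
print(ma3)

# min_periods=1 averages whatever is available instead of returning NaN.
ma3_partial = prices.rolling(window=3, min_periods=1).mean()
print(ma3_partial)
```

With `window=50`, the first 49 rows of `MA50` are NaN in the same way, so those rows have to be dropped before training.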


## 3. Turning price data into a prediction problem

Machine learning algorithms need:
- **Features (X):** Numbers that describe the current situation.
- **Labels/Targets (y):** The thing we want the model to predict.

Here we will define:
- **Target (`y`)**:  
  `1` if tomorrow's close is higher than today's close (price goes **up**),  
  `0` if tomorrow's close is lower or equal (price goes **down or stays flat**).

- **Features (`X`)** (very simple):
  - Today's daily return
  - 5-day rolling mean of returns
  - 20-day rolling mean of returns
  - 20-day moving average (price)
  - 50-day moving average (price)
  - Distance between price and moving averages

This is intentionally simple so that the notebook remains easy to understand for a workshop.

In [None]:
# Daily returns
data["Return_1d"] = data["Close"].pct_change()

# Rolling mean of returns
data["Return_5d_mean"] = data["Return_1d"].rolling(window=5).mean()
data["Return_20d_mean"] = data["Return_1d"].rolling(window=20).mean()

# Distance from moving averages (as percentage)
data["Dist_MA20"] = (data["Close"] - data["MA20"]) / data["MA20"]
data["Dist_MA50"] = (data["Close"] - data["MA50"]) / data["MA50"]

# Target: will price go up tomorrow?
data["Tomorrow_Close"] = data["Close"].shift(-1)
data["Target_Up"] = (data["Tomorrow_Close"] > data["Close"]).astype(int)

# Drop rows with any NaN (due to rolling calculations and shift)
data_ml = data.dropna().copy()

feature_cols = [
    "Return_1d",
    "Return_5d_mean",
    "Return_20d_mean",
    "MA20",
    "MA50",
    "Dist_MA20",
    "Dist_MA50",
]

X = data_ml[feature_cols]
y = data_ml["Target_Up"]

X.head(), y.head()
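
To see how the return and the target line up row by row, here is the same construction on five made-up closing prices. The frame `toy` is only for illustration.

```python
import pandas as pd

toy = pd.DataFrame({"Close": [100.0, 102.0, 101.0, 101.0, 105.0]})  # made-up prices
toy["Return_1d"] = toy["Close"].pct_change()      # NaN on the first row
toy["Tomorrow_Close"] = toy["Close"].shift(-1)    # NaN on the last row
toy["Target_Up"] = (toy["Tomorrow_Close"] > toy["Close"]).astype(int)
print(toy)

# The last row gets Target_Up == 0 only because "NaN > x" evaluates to False,
# not because the price fell. dropna() removes that row (and the first one).
toy_ml = toy.dropna()
print(toy_ml["Target_Up"].tolist())
```

This is why the `dropna()` call matters: without it, the final row would carry a label for a day that has not happened yet.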


## 4. Train/test split

We now split our data into:
- **Training set**: Used to fit the models (learn patterns).
- **Test set**: Data the model has **never seen** during training, used to estimate how well it generalizes.

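This is crucial for **detecting overfitting** (the model memorizing the past instead of learning patterns that generalize).
Because the rows are ordered in time, we use a simple **time-based split** rather than a random one, so the model is never trained on days that come after the days it is tested on: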

- First 70% of the timeline → training
- Last 30% → test


In [None]:
# Use a time-based split instead of random shuffling
split_index = int(len(X) * 0.7)

X_train, X_test = X.iloc[:split_index], X.iloc[split_index:]
y_train, y_test = y.iloc[:split_index], y.iloc[split_index:]

len(X_train), len(X_test)
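
The same split can be wrapped in a small function and checked on dummy data. `time_split` is a hypothetical helper name, not something defined elsewhere in this notebook. The point of the check is that every training date comes before every test date.

```python
import numpy as np
import pandas as pd

def time_split(frame: pd.DataFrame, train_frac: float = 0.7):
    """Hypothetical helper: first train_frac of the rows for training, the rest for testing."""
    cut = int(len(frame) * train_frac)
    return frame.iloc[:cut], frame.iloc[cut:]

idx = pd.bdate_range("2024-01-01", periods=10)       # 10 made-up business days
demo = pd.DataFrame({"x": np.arange(10)}, index=idx)

train, test = time_split(demo)
print("last training day:", train.index.max().date())
print("first test day:   ", test.index.min().date())
```

A random shuffle would mix future days into the training set, which tends to make test scores look better than they would be in live use.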

## 5. Baseline models: "dumb" strategies

Before using AI, it's good practice to compare against simple baselines:

1. **Naive predictor**: Always predicts "up".
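   - Equity indices have historically drifted upward over long horizons, so "up" days tend to be somewhat more common than "down" days and this baseline is harder to beat than it sounds.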

2. **Logistic Regression**: A simple **linear model** (not yet a neural network).
   - It tries to find a linear combination of features that best separates "up" vs "down" days.

These baselines help us check if our more complex models are actually learning something useful.


In [None]:
# Baseline 1: always predict "up"
y_pred_always_up = np.ones_like(y_test)
acc_always_up = accuracy_score(y_test, y_pred_always_up)
print(f"Baseline (always predict 'up') accuracy: {acc_always_up:.3f}")

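# Baseline 2: Logistic Regression
# Standardize the features first: MA20/MA50 are price levels (hundreds), while
# the return features are tiny fractions, so unscaled inputs are badly mismatched.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

log_reg = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
log_reg.fit(X_train, y_train)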

y_pred_log_reg = log_reg.predict(X_test)
acc_log_reg = accuracy_score(y_test, y_pred_log_reg)
print(f"Logistic Regression accuracy: {acc_log_reg:.3f}")

print("\nClassification report (Logistic Regression):")
print(classification_report(y_test, y_pred_log_reg, digits=3))
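
One useful fact about the "always up" baseline: its accuracy is simply the share of up days in the test set. The sketch below checks this on made-up labels, so any model that scores close to that share has learned little beyond the class balance.

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])   # made-up labels: 5 up days out of 8
y_always_up = np.ones_like(y_true)

accuracy = (y_always_up == y_true).mean()
up_share = y_true.mean()
print(f"accuracy of 'always up': {accuracy:.3f}")
print(f"share of up days:        {up_share:.3f}")
```

Printing `y_train.mean()` and `y_test.mean()` next to the model accuracies is a cheap way to keep the comparison honest.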


## 6. Random Forest: a non-linear ensemble model

Next, we try a **Random Forest**, which is:
- A collection (ensemble) of decision trees.
- Each tree captures non-linear rules like:  
  “If the price is far above the moving average AND recent returns are negative, then…”
- The forest averages across many trees to reduce overfitting.

This is not yet a neural network, but it **can detect more complex patterns** than a linear model.


In [None]:
rf = RandomForestClassifier(
    n_estimators=200,
    max_depth=6,
    random_state=42,
    n_jobs=-1,
)

rf.fit(X_train, y_train)
y_pred_rf = rf.predict(X_test)

acc_rf = accuracy_score(y_test, y_pred_rf)
print(f"Random Forest accuracy: {acc_rf:.3f}")

print("\nClassification report (Random Forest):")
print(classification_report(y_test, y_pred_rf, digits=3))

# Feature importance plot
importances = pd.Series(rf.feature_importances_, index=feature_cols).sort_values(ascending=False)
plt.figure()
importances.plot(kind="bar")
plt.title("Random Forest Feature Importances")
plt.ylabel("Importance")
plt.show()
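
The bar chart above uses the forest's built-in importances, which are computed from the training data. One complementary check is permutation importance on held-out data: shuffle one feature at a time and measure how much the score drops. The sketch below uses synthetic data in which only the first column carries signal. The column names are invented for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(400, 3))            # synthetic features
y_demo = (X_demo[:, 0] > 0).astype(int)       # label depends on column 0 only

X_fit, X_hold = X_demo[:300], X_demo[300:]
y_fit, y_hold = y_demo[:300], y_demo[300:]

forest = RandomForestClassifier(n_estimators=50, max_depth=4, random_state=0)
forest.fit(X_fit, y_fit)

# Mean drop in held-out accuracy when each column is shuffled.
result = permutation_importance(forest, X_hold, y_hold, n_repeats=10, random_state=0)
for name, score in zip(["signal", "noise_a", "noise_b"], result.importances_mean):
    print(f"{name}: {score:.3f}")
```

On the real SPY features, scores near zero for every column would suggest that the forest found nothing that carries over to the test period.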


## 7. A simple Neural Network (MLPClassifier)

Now we build a **small neural network** using scikit-learn's `MLPClassifier`:

- It is a **feedforward neural network** (also called a multilayer perceptron).
- Architecture:
  - Input layer: one neuron per feature
  - Hidden layer(s): neurons with non-linear activation functions
  - Output layer: predicts probability of "up" (1) vs "down" (0)

Even though this is much simpler than the huge networks used in modern AI (like GPT or large vision models),
it illustrates the same **core idea**: learning complex mappings from inputs to outputs by adjusting weights.
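
To make "adjusting weights" concrete, here is a forward pass of a network with the same layer sizes, written in plain NumPy. It is a sketch only: the weights are random and untrained, the inputs are made up, and the sigmoid output is an assumption about how a binary classifier of this kind usually produces a probability.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, h1, h2 = 7, 32, 16   # 7 inputs, hidden layers of 32 and 16 neurons

# Random, untrained weights; training would adjust these to reduce the loss.
W1, b1 = rng.normal(scale=0.1, size=(n_features, h1)), np.zeros(h1)
W2, b2 = rng.normal(scale=0.1, size=(h1, h2)), np.zeros(h2)
W3, b3 = rng.normal(scale=0.1, size=(h2, 1)), np.zeros(1)

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    """Map a batch of feature rows to P(up) through two ReLU hidden layers."""
    a1 = relu(x @ W1 + b1)
    a2 = relu(a1 @ W2 + b2)
    return sigmoid(a2 @ W3 + b3).ravel()

x_batch = rng.normal(size=(4, n_features))   # 4 made-up, already-scaled feature rows
p_up = forward(x_batch)
print(p_up)
```

Each layer is a matrix multiplication followed by a non-linearity. Fitting the model means searching for values of `W1`, `W2`, `W3` and the biases that make these probabilities match the training labels.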


In [None]:
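# Neural networks are sensitive to feature scale, and MA20/MA50 (price levels)
# are far larger than the return features, so we standardize inside a pipeline.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(
        hidden_layer_sizes=(32, 16),  # two hidden layers
        activation="relu",
        solver="adam",
        max_iter=500,
        random_state=42,
    ),
)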

mlp.fit(X_train, y_train)

y_pred_mlp = mlp.predict(X_test)
acc_mlp = accuracy_score(y_test, y_pred_mlp)

print(f"Neural Network (MLP) accuracy: {acc_mlp:.3f}")

print("\nClassification report (Neural Network):")
print(classification_report(y_test, y_pred_mlp, digits=3))

# Confusion matrix to see types of errors
cm = confusion_matrix(y_test, y_pred_mlp)
cm
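
Reading the confusion matrix takes a moment of practice. In scikit-learn, rows are the actual classes and columns are the predicted classes, both in the order `[0, 1]`. The example below uses six made-up labels to unpack the four cells and derive precision and recall for the "up" class.

```python
from sklearn.metrics import confusion_matrix

y_true_demo = [1, 0, 1, 1, 0, 0]   # made-up actual labels
y_pred_demo = [1, 1, 1, 0, 0, 0]   # made-up predictions

# ravel() flattens [[tn, fp], [fn, tp]] row by row.
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo).ravel()

precision_up = tp / (tp + fp)   # of the days predicted "up", how many were up
recall_up = tp / (tp + fn)      # of the actual "up" days, how many were caught
print(f"tn={tn} fp={fp} fn={fn} tp={tp}")
print(f"precision={precision_up:.3f} recall={recall_up:.3f}")
```

For a long-or-flat strategy, false positives (predicted up, actually down) are the errors that lose money, while false negatives are missed gains.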


## 8. Visualizing predictions over time

For intuition, we can plot:
- The actual direction (up/down) of the market
- The neural network's predicted direction

This helps illustrate when the model is in sync with the market and when it is not.


In [None]:
results = pd.DataFrame(index=X_test.index)
results["Actual_Up"] = y_test
results["Predicted_Up_MLP"] = y_pred_mlp

# Convert 0/1 to -1/1 for visualization (down = -1, up = 1)
results["Actual_Signal"] = results["Actual_Up"].replace({0: -1, 1: 1})
results["Pred_Signal"] = results["Predicted_Up_MLP"].replace({0: -1, 1: 1})

plt.figure()
plt.plot(results.index, results["Actual_Signal"], label="Actual up/down")
plt.plot(results.index, results["Pred_Signal"], label="Predicted up/down (MLP)", alpha=0.7)
plt.title("Actual vs Predicted Direction (MLP)")
plt.yticks([-1, 1], ["Down or flat", "Up"])
plt.xlabel("Date")
plt.legend()
plt.show()
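
Two overlapping step lines can be hard to read. One alternative view, sketched here on made-up labels, is a rolling hit rate: the fraction of correct predictions over the last few days. The window of 4 is arbitrary and chosen only to fit the tiny example.

```python
import pandas as pd

idx = pd.bdate_range("2024-01-01", periods=8)
actual = pd.Series([1, 0, 1, 1, 0, 1, 0, 0], index=idx)      # made-up labels
predicted = pd.Series([1, 1, 1, 0, 0, 1, 1, 0], index=idx)   # made-up predictions

hit = (actual == predicted).astype(float)        # 1.0 when the prediction was right
rolling_hit_rate = hit.rolling(window=4).mean()  # NaN until the window is full
print(rolling_hit_rate)
```

On the real test set, a longer window (for example 20 days) plotted against a horizontal line at 0.5 would show the stretches where the model did better or worse than a coin flip.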


## 9. From prediction to a (very naive) trading strategy

If we wanted to turn predictions into a simple **strategy**, we could say:

- If the model predicts **up** → go **long** (buy).
- If the model predicts **down** → go **flat** (hold cash).

We can simulate the cumulative return of this naive strategy and compare it to
just "buy and hold" SPY (this is still just a toy example).


In [None]:
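# Return earned from each day's close to the next day's close. A prediction made
# on day t targets this move, so it must be paired with this return rather than
# with Return_1d on day t, which is already known and is itself a feature.
test_returns = (
    data_ml.loc[X_test.index, "Tomorrow_Close"] / data_ml.loc[X_test.index, "Close"] - 1
)

# Strategy: if Predicted_Up_MLP == 1, we earn the next day's return; else, return 0
strategy_returns = test_returns * results["Predicted_Up_MLP"]

# Buy & hold benchmark: always invested
buy_hold_returns = test_returns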

# Compute cumulative returns
cum_strategy = (1 + strategy_returns).cumprod()
cum_buy_hold = (1 + buy_hold_returns).cumprod()

plt.figure()
plt.plot(cum_strategy.index, cum_strategy, label="Strategy (MLP signals)")
plt.plot(cum_buy_hold.index, cum_buy_hold, label="Buy & Hold SPY")
plt.title("Toy Backtest: Strategy vs Buy & Hold (Test Period)")
plt.xlabel("Date")
plt.ylabel("Cumulative Return (normalized)")
plt.legend()
plt.show()
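
Two details decide whether a toy backtest like this is even roughly fair: each signal has to be paired with the return that follows it, and every change of position costs something. The sketch below shows both on five made-up prices. The 0.1% cost per position change is a hypothetical figure chosen for illustration, not a claim about real trading costs.

```python
import pandas as pd

close = pd.Series([100.0, 101.0, 99.0, 102.0, 103.0])   # made-up closing prices
signal = pd.Series([1, 0, 1, 1, 0])                     # made-up predictions at each close
cost_per_trade = 0.001                                  # hypothetical 0.1% per position change

next_day_return = close.shift(-1) / close - 1           # return earned after each signal
trades = signal.diff().abs().fillna(signal.iloc[0])     # 1 whenever the position changes

gross_return = (signal * next_day_return).dropna()
net_return = (signal * next_day_return - trades * cost_per_trade).dropna()

print("gross:", (1 + gross_return).prod())
print("net:  ", (1 + net_return).prod())
```

A model that flips between long and flat almost every day pays this cost almost every day, which can erase a small edge in accuracy.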


## 10. Interpretation & limitations

### What did the AI models do?

- They **did not** magically "understand" the market.
- They saw **historical patterns** in numerical features:
  - Recent returns
  - Moving averages
  - Distance from moving averages
- Based on these patterns, they learned to estimate the probability that
  **tomorrow's price** would be higher than **today's**.

### Why this is *not* a real trading bot

- We used a **tiny dataset** (roughly 500 daily bars, fewer after the rolling-window warm-up rows are dropped) and very few features.
- We ignored:
  - Transaction costs and slippage
  - Risk management (position sizing, stop losses)
  - Market regimes (volatile vs calm periods)
  - Robust backtesting techniques (walk-forward, cross-validation in time, etc.)
- Real trading systems require **much more careful design, validation, and risk control**.
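
As an illustration of "cross-validation in time", scikit-learn's `TimeSeriesSplit` produces several train/test folds in which the training window always ends before the test window begins. The sketch uses 12 dummy samples so that the folds are easy to read. It is one possible approach, not a full walk-forward framework.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X_demo = np.arange(12).reshape(-1, 1)   # 12 made-up, time-ordered samples

splitter = TimeSeriesSplit(n_splits=3)
folds = list(splitter.split(X_demo))
for train_idx, test_idx in folds:
    print("train:", train_idx.tolist(), "test:", test_idx.tolist())
```

Averaging a model's accuracy over such folds gives a steadier estimate than the single 70/30 split used in this notebook, because the result no longer depends on one particular test period.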

### Educational takeaway

This notebook shows how AI / ML:
- Takes **raw data** (prices) → transforms it into **features**.
- Defines a **prediction target** based on future outcomes.
- Trains different **models** (linear, tree-based, neural network) to detect patterns.
- Evaluates performance on **unseen data** to estimate how useful those patterns might be.

In your AI presentation/workshop, you can now:
- Walk through each block and explain the concept at a high level.
- Emphasize that the goal is **learning patterns from data**, not hand-coding rules.
- Highlight both the **power** and the **limitations** of AI in real-world domains like trading.
