# EMA Crossover Strategy

This notebook implements and evaluates a **classic EMA crossover strategy** as part of the QuantTrade Foundation ETF roadmap.  
It demonstrates the full workflow of **data ingestion → feature engineering → strategy → backtesting → performance evaluation**.

**Purpose**
- Explore how exponential moving average crossovers generate buy/sell signals.  
- Build a reusable strategy component (`EMACrossStrategy`).  
- Visualize signals and evaluate performance using standardized metrics.  

**Notebook Sections**
1. **Setup** – initialize environment and configs  
2. **Helpers** – indicator and chart utilities  
3. **Data Ingestion** – load ETF data (DuckDB → Pandas)  
4. **Data Quality Checks** – ensure clean input data  
5. **Feature Engineering** – add SMA(200) and MACD features  
6. **Strategy** – define and run the MACD signal-cross strategy in Backtrader  
7. **Performance Evaluation** – analyze returns, drawdowns, Sharpe, and visualize executions  

**Outcomes**
- A validated EMA crossover strategy with optional SMA(200) trend filter.  
- Trade executions visualized with buy/sell markers.  
- Reusable strategy and performance evaluation utilities for future notebooks.


## 1. Setup

In [18]:
import os, sys
import duckdb
from pathlib import Path
import pandas as pd
import json
import backtrader as bt


In [19]:
PROJECT_ROOT = Path.cwd().parents[0]

# sys.path holds strings, so compare and append the string form of the Path
if str(PROJECT_ROOT) not in sys.path:
    sys.path.append(str(PROJECT_ROOT))

print(f"Project Root: {PROJECT_ROOT}")

Project Root: C:\Users\luyanda\workspace\QuantTrade


In [20]:
from utils.charts import render_lightweight_chart

In [21]:
from utils.duck import to_bt_daily_duckdb, to_bt_minute_duckdb

In [22]:
from utils.features import (
    add_mas_duckdb, add_macd_duckdb
)

In [23]:
from utils.performance import (
    run_backtest_daily, summarize_performance,
    transactions_to_df, extract_exec_df_from_strategy, 
    execs_to_lw_markers, plot_equity_curves_plotly,
    plot_returns_histogram, plot_drawdown_curve
)

In [24]:
from strategies import (
    MACDSignalCrossStrategy, MACDZeroLineStrategy, 
    MACDHistogramMomentumStrategy
)

In [25]:
DB_MINUTE = PROJECT_ROOT / "data" / "processed" / "alpaca" / "price_minute_alpaca.duckdb"
print(f"DB_MINUTE: {DB_MINUTE}")
con_minute = duckdb.connect(str(DB_MINUTE), read_only=True)
tables = [t[0] for t in con_minute.execute("SHOW TABLES").fetchall()]
print("📋 Tables:", tables)


DB_MINUTE: C:\Users\luyanda\workspace\QuantTrade\data\processed\alpaca\price_minute_alpaca.duckdb
📋 Tables: ['alpaca_minute']


In [26]:
DB_DAILY = PROJECT_ROOT / "data" / "processed" / "dolt" / "stocks.duckdb"
print(f"DB_DAILY: {DB_DAILY}")
con_daily = duckdb.connect(str(DB_DAILY))
tables = [t[0] for t in con_daily.execute("SHOW TABLES").fetchall()]
print("📋 Tables:", tables)

DB_DAILY: C:\Users\luyanda\workspace\QuantTrade\data\processed\dolt\stocks.duckdb
📋 Tables: ['dividend', 'ohlcv', 'split', 'symbol']


In [27]:
ETFS = ["SPY", "QQQ"]
CHARTS_DIR = PROJECT_ROOT / "charts" / "002_EMA_Crossovers"

In [28]:
START_CASH = 10_000
COMMISSION_BPS = 0.5 / 10_000  # 0.5 bps as a decimal fraction (5e-5); redefined as raw bps in Section 6
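
The commission is written as a decimal fraction here and as raw basis points in Section 6, and the two differ by a factor of 10,000. The sketch below shows the conversion. The helper names `bps_to_fraction` and `commission_cost` are hypothetical and not part of `utils`. Which unit `run_backtest_daily` expects is not shown in this notebook, so check its signature before relying on either value.

```python
def bps_to_fraction(bps: float) -> float:
    """Convert basis points to a decimal fraction (1 bp = 0.01% = 1e-4)."""
    return bps / 10_000


def commission_cost(notional: float, bps: float) -> float:
    """Commission charged on a trade with the given notional value."""
    return notional * bps_to_fraction(bps)


# 0.5 bps on a $10,000 order is roughly half a dollar
rate = bps_to_fraction(0.5)
cost = commission_cost(10_000, 0.5)
print(f"rate={rate:.6f}, cost=${cost:.2f}")
```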

## 2. Helpers

## 3. Data Ingestion

In [29]:
# --- Ingest latest minute-level data ---
minute_data = {sym: to_bt_minute_duckdb(con_minute, "alpaca_minute", sym) for sym in ETFS}

for symbol in ETFS:
    print(minute_data[symbol].tail(1))

                       open    high     low   close  volume  trade_count  \
datetime                                                                   
2025-08-15 22:00:00  643.48  643.48  643.16  643.16  4995.0          6.0   

                          vwap  
datetime                        
2025-08-15 22:00:00  643.30901  
                       open    high     low   close  volume  trade_count  \
datetime                                                                   
2025-08-15 22:56:00  577.03  577.03  577.03  577.03   314.0          2.0   

                       vwap  
datetime                     
2025-08-15 22:56:00  577.03  


In [30]:
# --- Ingest latest daily data ---
daily_data = {
    sym: to_bt_daily_duckdb(con_daily, sym, table="ohlcv", date_col="date", symbol_col="act_symbol")
    for sym in ETFS
}

for symbol in ETFS:
    print(daily_data[symbol].tail(1))


              open    high     low   close      volume
datetime                                              
2025-08-14  642.79  645.62  642.34  644.95  59327466.0
              open    high     low   close      volume
datetime                                              
2025-08-14  578.28  581.88  577.91  579.89  45425043.0


## 4. Data Quality Checks

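U.S. market (SPY, QQQ), assuming regular NYSE/Nasdaq trading hours:

| **Session**     | **Hours (ET)** | **Duration**            |
| --------------- | -------------- | ----------------------- |
| Regular session | 09:30 – 16:00  | 6.5 hours (390 minutes) |

Expect about 390 one-minute rows per ETF per full trading day. The business-day range used in the check below does not exclude exchange holidays, so holidays are reported as missing dates, and early-close sessions are reported as partial days.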
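
The 390-row figure can be reproduced by enumerating one regular session at one-minute resolution. This sketch assumes bars are labelled by their start time, so the last regular-session bar is 15:59. The date is arbitrary.

```python
import pandas as pd

# One bar per minute, labelled by bar start: 09:30 through 15:59 inclusive.
session = pd.date_range("2025-08-14 09:30", "2025-08-14 15:59", freq="min")
expected_rows = len(session)
print(expected_rows)  # 390
```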

In [31]:
for symbol in ETFS:
    df = minute_data[symbol]
    print(f"\n🔍 {symbol}")
    print(f"  • Rows: {len(df)}")
    print(f"  • Date Range: {df.index.min().date()} → {df.index.max().date()}")
    print(f"  • Timezone-aware: {df.index.tz is not None}")
    # print(f"  • Missing 'close': {df['close'].isna().sum()}")

    # --- Drop timezone if needed ---
    df = df.copy()
    if df.index.tz is not None:
        df.index = df.index.tz_localize(None)

    # --- Identify all available intraday dates ---
    df["date"] = df.index.normalize()
    available_dates = df["date"].unique()

    # --- Construct full expected range (business days) ---
    expected_dates = pd.date_range(
        start=df.index.min().normalize(),
        end=df.index.max().normalize(),
        freq='B'
    )

    # --- Missing trading days entirely ---
    missing_dates = sorted(set(expected_dates) - set(available_dates))
    print(f"  • Missing Intraday Dates: {len(missing_dates)}")
    # if missing_dates:
    #     print("    Example:", missing_dates[:5])

    # --- Check for partial trading days (fewer than 390 rows) ---
    counts = df.groupby("date").size()
    partial_days = counts[counts < 390]
    print(f"  • Partial Intraday Days (<390 rows): {len(partial_days)}")
    # if not partial_days.empty:
    #     print("    Example:", partial_days.head())



🔍 SPY
  • Rows: 193498
  • Date Range: 2023-08-09 → 2025-08-15
  • Timezone-aware: False
  • Missing Intraday Dates: 22
  • Partial Intraday Days (<390 rows): 278

🔍 QQQ
  • Rows: 188312
  • Date Range: 2023-08-09 → 2025-08-15
  • Timezone-aware: False
  • Missing Intraday Dates: 22
  • Partial Intraday Days (<390 rows): 305
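
The missing dates above are measured against `freq='B'`, which counts every weekday, including days when the exchange is closed. A rough way to separate holiday gaps from genuine data gaps is sketched below. The helper `split_missing_days` is hypothetical. The sketch uses pandas' `USFederalHolidayCalendar`, which is only an approximation of the NYSE calendar: the exchange closes on Good Friday, which is not a federal holiday, and stays open on Columbus Day and Veterans Day.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar


def split_missing_days(missing_days, start, end):
    """Split missing business days into (holiday gaps, unexplained gaps)."""
    holidays = USFederalHolidayCalendar().holidays(start=start, end=end)
    missing = pd.DatetimeIndex(missing_days)
    is_holiday = missing.isin(holidays)
    return missing[is_holiday], missing[~is_holiday]


# Synthetic example: one federal holiday (4 July 2024) and one ordinary weekday.
missing = [pd.Timestamp("2024-07-02"), pd.Timestamp("2024-07-04")]
holiday_gaps, unexplained_gaps = split_missing_days(
    missing, start="2024-07-01", end="2024-07-05"
)
print("holiday gaps:", list(holiday_gaps.date))
print("unexplained gaps:", list(unexplained_gaps.date))
```

Only the unexplained gaps need a closer look. An exchange-specific calendar would give a cleaner split.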


## 5. Feature Engineering

In [32]:
IND_MA_WINDOWS = [200]
IND_MACD_FAST = 12
IND_MACD_SLOW = 26
IND_MACD_SIGNAL = 9

In [33]:
daily_data_ma = add_mas_duckdb(
    daily_data,
    con_daily,
    windows=IND_MA_WINDOWS,
    price_col="close",
    prefix="ma"
)

In [34]:
daily_data_ma_macd = add_macd_duckdb(
    daily_data_ma,
    con_daily,
    price_col="close",
    fast=IND_MACD_FAST,
    slow=IND_MACD_SLOW,
    signal=IND_MACD_SIGNAL,
    prefix="macd",
)

In [35]:
print(daily_data_ma_macd["SPY"].tail(1))

              open    high     low   close      volume      ma200      macd  \
datetime                                                                      
2025-08-14  642.79  645.62  642.34  644.95  59327466.0  590.91615  6.145473   

            macd_signal  macd_hist  
datetime                            
2025-08-14     5.732925   0.412548  
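
In the row above, `macd_hist` equals `macd - macd_signal` (6.145473 − 5.732925 = 0.412548). The implementation of `add_macd_duckdb` is not shown here, so the following is only a pandas sketch of the conventional MACD definition. The function name `macd_frame` and the choice of `adjust=False` are assumptions, and values from the DuckDB helper may differ slightly depending on how it seeds the EMAs.

```python
import pandas as pd


def macd_frame(close: pd.Series, fast: int = 12, slow: int = 26, signal: int = 9) -> pd.DataFrame:
    """Conventional MACD: EMA(fast) - EMA(slow), its signal EMA, and the histogram."""
    ema_fast = close.ewm(span=fast, adjust=False).mean()
    ema_slow = close.ewm(span=slow, adjust=False).mean()
    macd = ema_fast - ema_slow
    macd_signal = macd.ewm(span=signal, adjust=False).mean()
    return pd.DataFrame(
        {"macd": macd, "macd_signal": macd_signal, "macd_hist": macd - macd_signal}
    )


# Toy closing prices, just to exercise the function
close = pd.Series([100.0, 101.0, 102.5, 101.5, 103.0, 104.0, 103.5, 105.0])
out = macd_frame(close)
print(out.tail(3))
```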


In [36]:
# daily_data_ema = add_emas_duckdb(daily_data, con_daily, windows=IND_EMA_WINDOWS, price_col="close", prefix="ema")
# daily_data_ema_ma = add_mas_duckdb(daily_data_ema, con_daily, windows=IND_MA_WINDOWS, price_col="close", prefix="ma")
# daily_data_ema_ma_rsi = add_rsi_duckdb(daily_data_ema_ma, con_daily, period=IND_RSI_PERIOD, price_col="close")

## 6. Strategy

In [37]:
# Bridge pandas DataFrame → Backtrader DataFeed

class PandasDataMACDExt(bt.feeds.PandasData):
    """
    Extended PandasData feed that *optionally* maps MACD-related columns if present.
    Only OHLCV are required. If your df contains 'macd', 'macd_signal', 'macd_hist',
    and/or 'ma200', they will be exposed as data lines for plotting/inspection.

    Strategy note:
      The MACD strategies defined in macd_strategies.py compute MACD internally from
      close; they do NOT require these extra lines. These are here for convenience.
    """
    # Extra lines we *may* provide to the feed
    lines = ('macd', 'macd_signal', 'macd_hist', 'ma200',)

    # Column mapping: None = not mapped, -1 = autodetect by column name
    params = (
        ('datetime', None),      # index is datetime
        ('open', 'open'),
        ('high', 'high'),
        ('low', 'low'),
        ('close', 'close'),
        ('volume', 'volume'),
        ('openinterest', None),

        # Optional columns: -1 lets PandasData match a same-named column
        # (case-insensitive by default) and leave the line unmapped if absent.
        # Assigning self.p.<name> after super().__init__() comes too late,
        # because PandasData builds its column mapping inside __init__.
        ('macd', -1),
        ('macd_signal', -1),
        ('macd_hist', -1),
        ('ma200', -1),
    )

In [38]:
SYMBOL = 'SPY'
START_CASH = 100_000
COMMISSION_BPS = 0.5  # 0.5 bps = 0.005% (adjust to your IBKR model)

In [39]:
df = daily_data_ma_macd[SYMBOL].copy()

In [40]:
# --- Variant 1: MACD crosses Signal ---
btres_sig = run_backtest_daily(
    df=df,
    strategy_cls=MACDSignalCrossStrategy,
    strategy_params=dict(
        fast=IND_MACD_FAST, slow=IND_MACD_SLOW, signal=IND_MACD_SIGNAL,
        use_trend_filter=True,   # gates longs with SMA(200)
        stake_pct=0.99,
        printlog=False,
    ),
    start_cash=START_CASH,
    commission_bps=COMMISSION_BPS,
    datafeed_cls=PandasDataMACDExt,
    symbol=SYMBOL,
)
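
The source of `MACDSignalCrossStrategy` is not included in this notebook. As a rough illustration of what "MACD crosses Signal, gated by SMA(200)" usually means, here is a vectorised pandas sketch of the long-entry condition. The function name `long_entry_signal` is hypothetical. Exits, position sizing (`stake_pct`) and order execution are left out, and the real strategy may differ.

```python
import pandas as pd


def long_entry_signal(df: pd.DataFrame, use_trend_filter: bool = True) -> pd.Series:
    """True on bars where MACD crosses above its signal line.

    With use_trend_filter=True, entries are kept only while close > ma200.
    """
    above_now = df["macd"] > df["macd_signal"]
    # Comparisons against NaN are False, so the first bar can never be a cross.
    at_or_below_before = df["macd"].shift(1) <= df["macd_signal"].shift(1)
    cross_up = above_now & at_or_below_before
    if use_trend_filter:
        cross_up = cross_up & (df["close"] > df["ma200"])
    return cross_up


# Toy data: two upward crosses, the second one below the 200-day average
toy = pd.DataFrame(
    {
        "macd": [-1.0, -0.5, 0.5, 1.0, -0.2, 0.3],
        "macd_signal": [0.0] * 6,
        "close": [90.0, 95.0, 105.0, 110.0, 98.0, 99.0],
        "ma200": [100.0] * 6,
    }
)
filtered = long_entry_signal(toy, use_trend_filter=True)
unfiltered = long_entry_signal(toy, use_trend_filter=False)
print(filtered.tolist())
print(unfiltered.tolist())
```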

In [41]:
exec_df_sig = transactions_to_df(btres_sig.strategy.analyzers.tx)
markers_sig = execs_to_lw_markers(exec_df_sig)
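
Lightweight Charts series markers are plain dictionaries with `time`, `position`, `color`, `shape` and `text` fields. The sketch below shows one way executions could be turned into such markers. It assumes an executions frame indexed by datetime with a signed `size` column and a `price` column. The actual layout returned by `transactions_to_df`, and the output of `execs_to_lw_markers`, may differ. The function name and colours are illustrative only.

```python
import pandas as pd


def execs_to_markers_sketch(exec_df: pd.DataFrame) -> list:
    """Build marker dicts from executions (assumed columns: size, price)."""
    markers = []
    for ts, row in exec_df.sort_index().iterrows():
        is_buy = row["size"] > 0
        markers.append(
            {
                "time": ts.strftime("%Y-%m-%d"),
                "position": "belowBar" if is_buy else "aboveBar",
                "color": "#26a69a" if is_buy else "#ef5350",
                "shape": "arrowUp" if is_buy else "arrowDown",
                "text": f"{'BUY' if is_buy else 'SELL'} @ {row['price']:.2f}",
            }
        )
    return markers


# Hypothetical executions: one buy followed by one sell
toy_execs = pd.DataFrame(
    {"size": [10, -10], "price": [500.0, 525.5]},
    index=pd.to_datetime(["2024-01-02", "2024-03-01"]),
)
toy_markers = execs_to_markers_sketch(toy_execs)
print(toy_markers)
```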

In [44]:
render_lightweight_chart(
    df,
    symbol="SPY",
    assets_rel="../../utils/static",
    out_html=CHARTS_DIR/"etf_macd_sig_SPY.html",
    theme="dark",
    macd_params=None,                      # disable compute
    macd_add_markers=True,
    macd_from_cols=("macd", "macd_signal", "macd_hist"),
    markers=markers_sig,
)

WindowsPath('C:/Users/luyanda/workspace/QuantTrade/charts/002_EMA_Crossovers/etf_macd_sig_SPY.html')

In [None]:
render_lightweight_chart(
    df,
    symbol="SPY",
    out_html=CHARTS_DIR/"etf_macd_sig_SPY.html",
    theme="dark",
    ma_windows=IND_MA_WINDOWS,  
    # ema_windows=IND_EMA_WINDOWS,          
    # rsi_period=IND_RSI_PERIOD,             
    # rsi_bounds=IND_RSI_BOUNDS,
    timeframes=["1d", "1h", "15m"],
    default_tf="1d",
    watermark_text="SPY — {tf}",
    watermark_opacity=0.0001,
    assets_rel="../../utils/static",
    markers=markers_sig
)

WindowsPath('C:/Users/luyanda/workspace/QuantTrade/charts/002_EMA_Crossovers/etf_macd_sig_SPY.html')

In [23]:
exec_df = transactions_to_df(btres_sig.strategy.analyzers.tx)
markers = execs_to_lw_markers(exec_df)

In [24]:
render_lightweight_chart(
    df,
    symbol="SPY",
    out_html=CHARTS_DIR/"etf_cross_daily_markers_SPY.html",
    theme="dark",
    ma_windows=IND_MA_WINDOWS,  
    # ema_windows=IND_EMA_WINDOWS,          
    # rsi_period=IND_RSI_PERIOD,             
    # rsi_bounds=IND_RSI_BOUNDS,
    timeframes=["1d", "1h", "15m"],
    default_tf="1d",
    watermark_text="SPY — {tf}",
    watermark_opacity=0.0001,
    assets_rel="../../utils/static",
    markers=markers
)

WindowsPath('C:/Users/luyanda/workspace/QuantTrade/charts/002_EMA_Crossovers/etf_cross_daily_markers_SPY.html')

## 7. Performance Evaluation

In [25]:
perf = summarize_performance(btres_sig)

# Interactive plots
fig1 = plot_equity_curves_plotly(perf, symbol="SPY")
fig1.show()

fig2 = plot_returns_histogram(perf)
fig2.show()

fig3 = plot_drawdown_curve(perf)
fig3.show()
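
For reference, the two headline metrics behind these plots can be computed directly from an equity curve. This is a minimal sketch using common definitions (annualised Sharpe with 252 trading days, drawdown relative to the running peak). The function names are hypothetical, and `summarize_performance` may use different conventions.

```python
import numpy as np
import pandas as pd


def sharpe_ratio(returns: pd.Series, periods_per_year: int = 252, risk_free: float = 0.0) -> float:
    """Annualised Sharpe ratio from per-period simple returns."""
    excess = returns - risk_free / periods_per_year
    std = excess.std(ddof=1)
    if np.isnan(std) or std == 0:
        return float("nan")
    return float(np.sqrt(periods_per_year) * excess.mean() / std)


def drawdown_curve(equity: pd.Series) -> pd.Series:
    """Fractional decline from the running peak (0 at new highs, negative otherwise)."""
    return equity / equity.cummax() - 1.0


# Toy equity curve: peak at 110, trough at 99, then a new high
equity = pd.Series([100.0, 110.0, 99.0, 104.5, 121.0])
returns = equity.pct_change().dropna()
dd = drawdown_curve(equity)
print(f"Sharpe: {sharpe_ratio(returns):.2f}")
print(f"Max drawdown: {dd.min():.1%}")
```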

In [81]:
con_minute.close()

In [82]:
con_daily.close()