# Real Market Data Backtesting with KDB-X Python

This notebook demonstrates how to build a **financial backtesting system** using KDB-X Python. We'll simulate a simple trading strategy and calculate how much profit or loss it would have generated.

## What we explore in this Notebook?


- **Real market data** from Yahoo Finance (last 8 days, minute-level)
- **Moving average crossover strategy** - a classic momentum approach
- **15-minute bars** - aggregated from minute data for cleaner signals
- **Complete trade analysis** - PnL, win rate, drawdowns, and more

## The Strategy: Moving Average Crossover

We'll implement a **dual moving average crossover**:
- **Fast MA**: 10-period moving average (responsive to recent price action)
- **Slow MA**: 30-period moving average (tracks longer-term trend)
- **Buy signal**: When fast MA crosses *above* slow MA ‚Üí bullish momentum
- **Sell signal**: When fast MA crosses *below* slow MA ‚Üí bearish momentum

This is a trend-following strategy used by traders to ride momentum waves.

## Prerequisites

- Requires KDB-X to be installed, you can sign up and download on [Developer Center](https://developer.kx.com/products/kdb-x/install). For full install instructions see: [KDB-X Install](https://code.kx.com/kdb-x/).

- To Install [KDB-X Python](https://code.kx.com/pykx/4.0/examples/jupyter-integration.html): `pip install --upgrade --pre pykx`


In [4]:
!pip install -qq --upgrade --pre pykx

[2K     [90m‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ[0m [32m44.8/44.8 kB[0m [31m2.2 MB/s[0m eta [36m0:00:00[0m
[2K   [90m‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ[0m [32m13.5/13.5 MB[0m [31m7.1 MB/s[0m eta [36m0:00:00[0m
[?25h

In [None]:
import pykx as kx
import yfinance as yf
from datetime import datetime, timedelta

## Step 1: Fetch Real Market Data

We'll download the last 8 days of minute-level data for **AAPL** and **MSFT**. Yahoo Finance provides free intraday data, though it's limited to the most recent period.

### Why minute data?
Minute bars give us fine-grained price action, allowing us to:
- Aggregate into any timeframe we want (5min, 15min, 1hr)
- Simulate realistic entry/exit timing
- See intraday volatility patterns

We'll use Python/yfinance to fetch the data, then convert it to a q table for all subsequent analysis.

In [6]:
# Calculate dynamic date range (last 8 days)
end_date = datetime.now()
start_date = end_date - timedelta(days=8)

print(f"Fetching data from {start_date.date()} to {end_date.date()}...")

# Download minute-level data
data = yf.download(
    tickers=["AAPL", "MSFT"],
    start=start_date.strftime("%Y-%m-%d"),
    end=end_date.strftime("%Y-%m-%d"),
    interval="1m",
    auto_adjust=True
)

# Reshape from wide (columns per ticker) to long format (rows per ticker)
data_long = data.stack(level=1).reset_index()

# Rename columns
data_long = data_long.rename(columns={
    'Datetime': 'dt',
    'Ticker': 'sym'
})

# Convert ticker symbols to lowercase (q convention)
data_long['sym'] = data_long['sym'].str.lower()

# Convert to KDB-X Python table
quotes = kx.toq(data_long)

print(f"‚úì Loaded {len(data_long):,} minute bars")
print(f"‚úì Tickers: {', '.join(data_long['sym'].unique())}")

Fetching data from 2026-01-19 to 2026-01-27...


[*********************100%***********************]  2 of 2 completed

‚úì Loaded 3,898 minute bars
‚úì Tickers: aapl, msft



  data_long = data.stack(level=1).reset_index()


## Step 2: Inspect the Data Structure

Let's examine what we're working with.

The data should have columns: `dt` (datetime), `sym` (symbol), `Open`, `High`, `Low`, `Close`, `Volume`.

In [7]:
# Show the first 10 quotes
quotes.head(10)

Unnamed: 0,dt,sym,Close,High,Low,Open,Volume
,,,,,,,
0.0,2026.01.20D14:30:00.000000000,aapl,254.48,254.79,252.36,252.51,2600373.0
1.0,2026.01.20D14:30:00.000000000,msft,450.45,452.13,450.1,451.43,1289307.0
2.0,2026.01.20D14:31:00.000000000,aapl,253.9,254.64,253.809,254.39,250428.0
3.0,2026.01.20D14:31:00.000000000,msft,450.12,450.86,449.33,450.51,253023.0
4.0,2026.01.20D14:32:00.000000000,aapl,254.3,254.57,253.83,253.91,232262.0
5.0,2026.01.20D14:32:00.000000000,msft,449.74,450.175,449.3,450.06,99677.0
6.0,2026.01.20D14:33:00.000000000,aapl,254.38,254.79,254.19,254.27,192391.0
7.0,2026.01.20D14:33:00.000000000,msft,450.32,450.5,449.28,449.69,89392.0
8.0,2026.01.20D14:34:00.000000000,aapl,254.07,254.39,253.85,254.37,181225.0


In [10]:
quotes.dtypes

Unnamed: 0,columns,datatypes
,,
0.0,dt,"""kx.TimestampAtom"""
1.0,sym,"""kx.SymbolAtom"""
2.0,Close,"""kx.FloatAtom"""
3.0,High,"""kx.FloatAtom"""
4.0,Low,"""kx.FloatAtom"""
5.0,Open,"""kx.FloatAtom"""
6.0,Volume,"""kx.LongAtom"""


In [8]:
quotes.select(
    columns={
        'bars': 'count i',
        'start': 'min dt',
        'end': 'max dt',
        'low_price': 'min Close',
        'high_price': 'max Close'
        },
        by=kx.Column('sym')
  )

Unnamed: 0_level_0,bars,start,end,low_price,high_price
sym,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
aapl,1949,2026.01.20D14:30:00.000000000,2026.01.26D20:59:00.000000000,243.44,256.527
msft,1949,2026.01.20D14:30:00.000000000,2026.01.26D20:59:00.000000000,438.8001,474.1025


## Step 3: Aggregate to 15-Minute Bars

Minute data can be noisy. We'll aggregate it into **15-minute OHLCV bars** for cleaner signals.

### The Process:
1. Extract the time component from each datetime
2. Use [`xbar`](https://code.kx.com/pykx/4.0/api/columns.html#pykx.wrappers.Column.xbar) to bucket times into 15-minute intervals
3. Aggregate: first Open, max High, min Low, last Close, sum Volume
4. Group by symbol and time bucket

This is standard OHLC bar construction used throughout finance.

In [12]:
# Extract date and time components
quotes = quotes.update(columns={
    'date': '`date$dt',
    'time': '`time$dt'
})
quotes.dtypes


Unnamed: 0,columns,datatypes
,,
0.0,dt,"""kx.TimestampAtom"""
1.0,sym,"""kx.SymbolAtom"""
2.0,Close,"""kx.FloatAtom"""
3.0,High,"""kx.FloatAtom"""
4.0,Low,"""kx.FloatAtom"""
5.0,Open,"""kx.FloatAtom"""
6.0,Volume,"""kx.LongAtom"""
7.0,date,"""kx.DateAtom"""
8.0,time,"""kx.TimeAtom"""


In [13]:
# Create 15-minute time buckets using xbar
# xbar rounds down to nearest interval (e.g., 14:37 -> 14:30)
quotes = quotes.update(columns={'bucket': '15 xbar time'})

# Aggregate to 15-minute bars
bars = quotes.select(
    columns={
        'open': 'first Open',
        'high': 'max High',
        'low': 'min Low',
        'close': 'last Close',
        'volume': 'sum Volume'
    },
    by=[kx.Column('sym'), kx.Column('date'), kx.Column('bucket')]
)

# Recreate datetime column
bars = bars.update(columns={'dt': kx.Column('date') + kx.Column('bucket')})

# Clean up temporary columns
bars = bars.delete(columns=['date', 'bucket'])

# Sort by symbol and datetime for efficiency
bars = kx.q.xasc(['sym', 'dt'], bars)

print("15-minute bars created:")
bars.head(10)

15-minute bars created:


Unnamed: 0,sym,open,high,low,close,volume,dt
,,,,,,,
0.0,aapl,252.51,254.79,252.36,254.48,2600373.0,2026.01.20D14:30:00.000000000
1.0,aapl,254.39,254.64,253.809,253.9,250428.0,2026.01.20D14:31:00.000000000
2.0,aapl,253.91,254.57,253.83,254.3,232262.0,2026.01.20D14:32:00.000000000
3.0,aapl,254.27,254.79,254.19,254.38,192391.0,2026.01.20D14:33:00.000000000
4.0,aapl,254.37,254.39,253.85,254.07,181225.0,2026.01.20D14:34:00.000000000
5.0,aapl,254.02,254.16,253.3,253.57,186460.0,2026.01.20D14:35:00.000000000
6.0,aapl,253.55,253.58,252.5,252.76,152844.0,2026.01.20D14:36:00.000000000
7.0,aapl,252.76,252.99,252.57,252.67,121456.0,2026.01.20D14:37:00.000000000
8.0,aapl,252.64,253.66,252.47,253.465,142247.0,2026.01.20D14:38:00.000000000


## Step 4: Calculate Moving Averages

Now we calculate our strategy indicators:
- **Fast MA**: 10-period moving average using [`mavg`](https://code.kx.com/pykx/4.0/api/pykx-execution/q.html#mavg)
- **Slow MA**: 30-period moving average

The `by sym` clause ensures we calculate MAs separately for each ticker - no data leakage between symbols!

A 10-period moving average is the average closing price of the last 10 fifteen-minute bars (fast-moving, reacts quickly to recent price changes), while a 30-period moving average is the average closing price of the last 30 fifteen-minute bars (slow-moving, smooths out short-term fluctuations to show the longer-term trend).

### Identifying Crossovers
A crossover occurs when:
- **Bullish**: Fast MA was ‚â§ Slow MA, now Fast MA > Slow MA
- **Bearish**: Fast MA was ‚â• Slow MA, now Fast MA < Slow MA

We use [`prev`](https://code.kx.com/pykx/4.0/api/columns.html#pykx.wrappers.Column.prev) to access the previous row's values for comparison.

In [14]:
# 1. Calculate moving averages per symbol
bars = bars.update(
    columns={
        'ma_fast': kx.Column('close').mavg(10),
        'ma_slow': kx.Column('close').mavg(30)
    },
    by=kx.Column('sym')
)

# 2. Get previous values for crossover detection
bars = bars.update(
    columns={
        'prev_fast': kx.Column('ma_fast').prev(),
        'prev_slow': kx.Column('ma_slow').prev()
    },
    by=kx.Column('sym')
)

# 3. Identify crossover signals
# signal = 1: Bullish Cross
# signal = -1: Bearish Cross
bars = bars.update(
    columns={
        'signal': """
        ?[(ma_fast > ma_slow) & prev_fast <= prev_slow; 1i;
          ?[(ma_fast < ma_slow) & prev_fast >= prev_slow; -1i; 0i]]
        """
    }
)

print("Trading signals generated:")

# Filter and display rows where signal is not 0
signals = bars.select(where='signal <> 0')
signals.head(10)


Trading signals generated:


Unnamed: 0,sym,open,high,low,close,volume,dt,ma_fast,ma_slow,prev_fast,prev_slow,signal
,,,,,,,,,,,,
0.0,aapl,253.09,253.14,252.6,252.74,123556.0,2026.01.20D14:40:00.000000000,253.4935,253.5832,253.6675,253.6675,-1i
1.0,aapl,251.4701,251.74,251.445,251.6157,99923.0,2026.01.20D15:29:00.000000000,251.5011,251.4717,251.4335,251.5038,1i
2.0,aapl,251.29,251.33,251.19,251.215,54891.0,2026.01.20D15:39:00.000000000,251.3346,251.3358,251.3747,251.3416,-1i
3.0,aapl,251.78,252.06,251.76,252.06,79390.0,2026.01.20D16:01:00.000000000,251.338,251.3233,251.244,251.3,1i
4.0,aapl,251.45,251.55,251.36,251.36,42067.0,2026.01.20D16:21:00.000000000,251.545,251.5538,251.592,251.5458,-1i
5.0,aapl,251.5084,251.55,251.43,251.445,48272.0,2026.01.20D16:40:00.000000000,251.3716,251.3599,251.3331,251.3725,1i
6.0,aapl,250.87,250.87,250.82,250.8503,43375.0,2026.01.20D16:44:00.000000000,251.2422,251.2451,251.2892,251.2728,-1i
7.0,aapl,251.24,251.28,251.14,251.15,41432.0,2026.01.20D17:03:00.000000000,251.0464,251.0329,250.9764,251.0376,1i
8.0,aapl,250.73,250.77,250.66,250.76,42190.0,2026.01.20D17:14:00.000000000,250.912,250.9138,250.941,250.9168,-1i


## Step 5: Generate Trade Pairs

We have signals, now we need to pair them up into actual trades:
- Each **BUY signal** (signal=1) is an entry
- Each **SELL signal** (signal=-1) is an exit
- We match each entry with the next exit using an asof join

### Why Asof Join?
The [`aj`](https://code.kx.com/q/ref/aj/) function finds the "most recent match as of a time" - perfect for matching "next exit after this entry".

We'll trade 100 shares per position for simplicity.

In [15]:
# 1. Extract and label trades
trades = bars.select(
    columns=['sym', 'dt', 'signal', 'close'],
    where='signal <> 0'
)

# 2. Add 'tradetype' as a symbol column
trades = trades.update(
    columns={'tradetype': '?[signal = 1; `buy; `sell]'}
)

# 3. Separate buys and sells into their own tables
buys = trades.select(
    columns={'sym': 'sym', 'entry_dt': 'dt', 'entry_price': 'close'},
    where='tradetype = `buy'
)

sells = trades.select(
    columns={'sym': 'sym', 'exit_dt': 'dt', 'exit_price': 'close'},
    where='tradetype = `sell'
)

# 4. Sort BOTH in ascending order and apply the sorted attribute
buys = kx.q.xasc(['sym', 'entry_dt'], buys).sorted()
sells = kx.q.xasc(['sym', 'exit_dt'], sells).sorted()

# 5. Match each SELL with its preceding BUY using a lambda function
positions = kx.q(
    '{[s;b] aj[`sym`exit_dt; s; update exit_dt:entry_dt from b]}',
    sells,  # First argument: s (sells)
    buys    # Second argument: b (buys)
)

# Reorder columns for readability (entry, then exit)
positions = positions.select(
    columns=['sym', 'entry_dt', 'entry_price', 'exit_dt', 'exit_price']
)

# 6. Clean up - remove sells without a matching buy
positions = positions.select(where='not null entry_price')
positions = positions.update(columns={'size': '100i'})

print("Generated positions:")
positions.head(10)

Generated positions:


Unnamed: 0,sym,entry_dt,entry_price,exit_dt,exit_price,size
,,,,,,
0.0,aapl,2026.01.20D15:29:00.000000000,251.6157,2026.01.20D15:39:00.000000000,251.215,100i
1.0,aapl,2026.01.20D16:01:00.000000000,252.06,2026.01.20D16:21:00.000000000,251.36,100i
2.0,aapl,2026.01.20D16:40:00.000000000,251.445,2026.01.20D16:44:00.000000000,250.8503,100i
3.0,aapl,2026.01.20D17:03:00.000000000,251.15,2026.01.20D17:14:00.000000000,250.76,100i
4.0,aapl,2026.01.20D19:44:00.000000000,247.703,2026.01.20D19:51:00.000000000,247.105,100i
5.0,aapl,2026.01.20D20:59:00.000000000,246.69,2026.01.21D15:07:00.000000000,246.4138,100i
6.0,aapl,2026.01.21D15:20:00.000000000,247.165,2026.01.21D16:02:00.000000000,247.92,100i
7.0,aapl,2026.01.21D17:25:00.000000000,246.0199,2026.01.21D18:05:00.000000000,245.83,100i
8.0,aapl,2026.01.21D18:14:00.000000000,246.2,2026.01.21D18:26:00.000000000,245.89,100i


## Step 6: Calculate PnL

For each position, we calculate:
- **PnL in dollars**: `size √ó (exit_price - entry_price)`
- **PnL percentage**: `100 √ó (exit_price - entry_price) / entry_price`
- **Holding period**: How long we held the position

This tells us which trades were winners and losers.

In [32]:
# Calculate PnL metrics
positions = positions.update(
    columns={
        'pnl': 'size * (exit_price - entry_price)',                    # Dollar P&L
        'pnl_pct': '100 * (exit_price - entry_price) % entry_price',   # Percentage return
        'holding_period': 'exit_dt - entry_dt'                          # Time held
    }
)

print("Position PnL:")
positions.head(15)

Position PnL:


Unnamed: 0,sym,entry_dt,entry_price,exit_dt,exit_price,size,pnl,pnl_pct,holding_period
,,,,,,,,,
0.0,aapl,2026.01.20D15:29:00.000000000,251.6157,2026.01.20D15:39:00.000000000,251.215,100i,-40.07111,-0.1592552,0D00:10:00.000000000
1.0,aapl,2026.01.20D16:01:00.000000000,252.06,2026.01.20D16:21:00.000000000,251.36,100i,-69.99969,-0.2777104,0D00:20:00.000000000
2.0,aapl,2026.01.20D16:40:00.000000000,251.445,2026.01.20D16:44:00.000000000,250.8503,100i,-59.47113,-0.2365174,0D00:04:00.000000000
3.0,aapl,2026.01.20D17:03:00.000000000,251.15,2026.01.20D17:14:00.000000000,250.76,100i,-38.99994,-0.1552854,0D00:11:00.000000000
4.0,aapl,2026.01.20D19:44:00.000000000,247.703,2026.01.20D19:51:00.000000000,247.105,100i,-59.80072,-0.2414211,0D00:07:00.000000000
5.0,aapl,2026.01.20D20:59:00.000000000,246.69,2026.01.21D15:07:00.000000000,246.4138,100i,-27.61993,-0.1119621,0D18:08:00.000000000
6.0,aapl,2026.01.21D15:20:00.000000000,247.165,2026.01.21D16:02:00.000000000,247.92,100i,75.50049,0.3054659,0D00:42:00.000000000
7.0,aapl,2026.01.21D17:25:00.000000000,246.0199,2026.01.21D18:05:00.000000000,245.83,100i,-18.98956,-0.0771871,0D00:40:00.000000000
8.0,aapl,2026.01.21D18:14:00.000000000,246.2,2026.01.21D18:26:00.000000000,245.89,100i,-30.99976,-0.1259129,0D00:12:00.000000000


## Step 7: Performance by Symbol

Different stocks behave differently. Let's break down performance by ticker to see which one the strategy works better on.

We might find:
- One stock trends more consistently (better for MA crossover)
- One stock is choppier (more whipsaws, lower win rate)
- Differences in volatility and trade frequency

In [39]:
print(f"\n{'='*60}")
print("OVERALL PERFORMANCE")
print(f"{'='*60}")

total_pnl = kx.q('sum', positions['pnl'])
num_winners = len(positions.select(where='pnl > 0'))
num_losers = len(positions.select(where='pnl < 0'))
total_trades = len(positions)

print(f"Total P&L: ${total_pnl}")
print(f"Total Trades: {total_trades}")
print(f"Winners: {num_winners} ({100*num_winners/total_trades:.1f}%) | Losers: {num_losers} ({100*num_losers/total_trades:.1f}%)")
print(f"Win Rate: {100*num_winners/total_trades:.2f}%")

# Per-symbol breakdown
print(f"\n{'='*60}")
print("PERFORMANCE BY SYMBOL")
print(f"{'='*60}")

# Calculate per-symbol statistics
symbol_stats = positions.select(
    columns={
        'trades': 'count i',
        'winners': 'sum pnl > 0',
        'total_pnl': 'sum pnl',
        'avg_pnl': 'avg pnl',
        'win_rate': '100 * (sum pnl > 0) % count i',
        'best_trade': 'max pnl',
        'worst_trade': 'min pnl'
    },
    by='sym'
)
print(symbol_stats)

print(f"{'='*60}")



OVERALL PERFORMANCE
Total P&L: $2051.543
Total Trades: 72
Winners: 24 (33.3%) | Losers: 48 (66.7%)
Win Rate: 33.33%

PERFORMANCE BY SYMBOL
sym | trades winners total_pnl avg_pnl  win_rate best_trade worst_trade
----| -----------------------------------------------------------------
aapl| 36     12      332.9483  9.248564 33.33333 405.0003   -94.00024  
msft| 36     12      1718.594  47.73873 33.33333 1609       -216.9983  


## Step 8: Cumulative PnL Over Time

To understand the strategy's progression, we calculate cumulative PnL - how our account balance would have grown (or shrunk) over the backtest period.

The [`sums`](https://code.kx.com/pykx/4.0/api/columns.html#pykx.wrappers.Column.sums) function creates a running total, perfect for equity curves.

In [22]:
# Sort trades chronologically and calculate running PnL
pnl_timeline = positions.select(
    columns=['entry_dt', 'sym', 'pnl']
)

# Sort by entry_dt
pnl_timeline = kx.q.xasc(['entry_dt'], pnl_timeline)

# Calculate cumulative PnL
pnl_timeline = pnl_timeline.update(
    columns={'cumulative_pnl': kx.Column('pnl').sums()}
)

print("CUMULATIVE PnL OVER TIME:")
pnl_timeline.tail(15)  # Last 15 rows (equivalent to -15#)

CUMULATIVE PnL OVER TIME:


Unnamed: 0,entry_dt,sym,pnl,cumulative_pnl
,,,,
0.0,2026.01.26D14:31:00.000000000,aapl,405.0003,2002.165
1.0,2026.01.26D14:46:00.000000000,msft,253.5004,2255.666
2.0,2026.01.26D15:19:00.000000000,msft,-6.970215,2248.695
3.0,2026.01.26D15:59:00.000000000,msft,328.3997,2577.095
4.0,2026.01.26D16:31:00.000000000,aapl,-38.00049,2539.095
5.0,2026.01.26D17:04:00.000000000,aapl,-94.00024,2445.094
6.0,2026.01.26D17:23:00.000000000,msft,-91.00037,2354.094
7.0,2026.01.26D17:39:00.000000000,aapl,-26.49994,2327.594
8.0,2026.01.26D17:44:00.000000000,msft,-89.50195,2238.092


## Step 9: Maximum Drawdown Analysis

**Maximum drawdown** is the largest peak-to-trough decline in cumulative PnL. It measures the strategy's worst-case risk.

### Why it matters:
- Shows how much capital you could have lost at the worst point
- Helps determine position sizing and risk tolerance
- More realistic than just looking at final PnL

We calculate:
1. Running maximum PnL at each point (the peak)
2. Current drawdown = Current PnL - Running maximum
3. Maximum drawdown = Largest negative value

In [23]:
# Calculate running maximum (the peak)
pnl_timeline = pnl_timeline.update(
    columns={'running_max': 'maxs cumulative_pnl'}
)

# Calculate drawdown from peak
pnl_timeline = pnl_timeline.update(
    columns={'drawdown': 'cumulative_pnl - running_max'}
)

# Find maximum drawdown
max_dd = pnl_timeline['drawdown'].min()
print(f"Maximum Drawdown: ${max_dd}")

# Show when it occurred
print("\nWorst drawdown period:")
worst_drawdown = pnl_timeline.select(
    where='drawdown = min drawdown'
)
worst_drawdown

Maximum Drawdown: $-782.6584

Worst drawdown period:


Unnamed: 0,entry_dt,sym,pnl,cumulative_pnl,running_max,drawdown
,,,,,,
0.0,2026.01.21D18:53:00.000000000,msft,-216.9983,-645.4788,137.1796,-782.6584


## Step 10: Testing Different Parameters

What if we used **different MA periods**? Let's try a faster 5/20 combination and compare signal frequency.

### Trade-offs:
- **Shorter MAs** (5/20): More signals, more responsive, but more whipsaws
- **Longer MAs** (10/30): Fewer signals, smoother, but slower to react

This is called **parameter sensitivity analysis** - testing how results change with different settings.

In [37]:
# Calculate alternative moving averages (5/20)
bars2 = bars.update(
    columns={
        'ma_fast5': kx.Column('close').mavg(5),
        'ma_slow20': kx.Column('close').mavg(20)
    },
    by=kx.Column('sym')
)

# Detect crossovers with new parameters
# To detect a "crossover," we need to know the values from the PREVIOUS row.
# 'prev' shifts the data down by one, allowing us to compare 'now' vs 'then'.
bars2 = bars2.update(
    columns={
        'prev_fast5': kx.Column('ma_fast5').prev(),
        'prev_slow20': kx.Column('ma_slow20').prev()
    },
    by=kx.Column('sym')
)

# Initialize all signals to 0 (No Trade/Hold)
bars2 = bars2.update(columns={'signal2': '0i'})

# BULLISH CROSSOVER:
# Fast MA is currently ABOVE slow MA, but was BELOW or EQUAL in the previous bar.
bars2 = bars2.update(
    columns={'signal2': '1i'},
    where='(ma_fast5 > ma_slow20) and prev_fast5 <= prev_slow20'
)

# BEARISH CROSSOVER:
# Fast MA is currently BELOW slow MA, but was ABOVE or EQUAL in the previous bar
bars2 = bars2.update(
    columns={'signal2': '-1i'},
    where='(ma_fast5 < ma_slow20) and prev_fast5 >= prev_slow20'
)

# Compare signal counts
signal_count_10_30 = bars.select(kx.Column('signal').abs().sum())['signal'][0]
signal_count_5_20 = bars2.select(kx.Column('signal2').abs().sum())['signal2'][0]

signal_count = kx.Table(data={
    'strategy': ['MA_10_30', 'MA_5_20'],
    'signals': [signal_count_10_30, signal_count_5_20]
})

print("SIGNAL FREQUENCY COMPARISON:")
signal_count

SIGNAL FREQUENCY COMPARISON:


Unnamed: 0,strategy,signals
,,
0.0,MA_10_30,145i
1.0,MA_5_20,235i


#### Challenge: Can you find the PnL stats for this new 5/20 Moving Average strategy?

In [None]:
/ Challenge Code:

## Conclusion and Next Steps

### What We Built

This notebook demonstrated:
1. **Real data integration**: Yahoo Finance ‚Üí KDB-X Python pipeline
2. **Bar aggregation**: Minute data ‚Üí 15-minute OHLC bars
3. **Technical indicators**: Moving averages with `mavg`
4. **Signal generation**: Crossover detection logic
5. **Trade matching**: Entry/exit pairing with `aj`
6. **Performance analysis**: PnL, win rate, drawdown, cumulative returns
7. **Parameter testing**: Comparing different MA periods

### Key KDB-X Python Features Used

- **`xbar`**: Time bucketing for bar aggregation
- **`mavg`**: Efficient moving averages
- **`by` clause**: Per-symbol calculations without loops
- **`aj` (asof join)**: Time-series matching for trade pairing
- **`sums` / `maxs`**: Running calculations for equity curves
- **`prev`**: Access previous row values for crossover detection
- **Attributes (`` .sorted() ``)**: Performance optimization for sorted data

### Improvements for Production

To make this a real trading strategy, add:

1. **Transaction costs**: Commissions ($0.005/share?), slippage (1-2 bps?)
2. **Position sizing**: Risk-based sizing, not fixed 100 shares
3. **Risk management**: Stop losses, maximum position limits
4. **Multiple timeframes**: Confirm signals on higher timeframes
5. **Volume filters**: Ignore signals on low-volume bars
6. **Market regime detection**: Different parameters for trending vs ranging markets
7. **Walk-forward testing**: Test on rolling windows to avoid overfitting
8. **Realistic execution**: Model market impact, time-in-force

### Why KDB-X Python?

This example used just 8 days of minute data. In production:
- Backtests run on **years of tick data** (billions of records)
- Real-time strategies process **millions of quotes per second**
- Portfolio analytics aggregate **thousands of instruments simultaneously**

KDB-X Python excels at these scales with the same clean syntax we used here. The code patterns are identical whether you're processing 10,000 rows or 10 billion.

### Resources

- [KDB-X Python Docs](https://code.kx.com/pykx/4.0/index.html)
- [Time-series joins (aj, asof)](https://code.kx.com/pykx/4.0/user-guide/advanced/Pandas_API.html#tablemerge_asof)
- [Moving averages (mavg)](https://code.kx.com/pykx/4.0/api/columns.html#pykx.wrappers.Column.mavg)
- [KDB-X Python Documentation](https://code.kx.com/pykx/)

Happy backtesting! üöÄ