# QuantDL Tutorial

**QuantDL** is a financial data library for alpha research. It fetches data from S3 and provides operators for signal construction.

## Data Format: Wide Tables

All data is returned as **wide tables**:
- Rows = timestamps (trading days)
- Columns = symbols (tickers)

```
┌────────────┬─────────┬─────────┬─────────┐
│ timestamp  │ IBM     │ AAPL    │ GOOGL   │
├────────────┼─────────┼─────────┼─────────┤
│ 2024-01-02 │ 162.66  │ 185.64  │ 140.93  │
│ 2024-01-03 │ 161.22  │ 184.25  │ 139.63  │
└────────────┴─────────┴─────────┴─────────┘
```

## Operator Categories

| Category | Direction | Example |
|----------|-----------|--------|
| Time Series | Column-wise (over time) | `ts_mean`, `ts_delta` |
| Cross-Sectional | Row-wise (across stocks) | `rank`, `zscore` |
| Arithmetic | Element-wise | `add`, `multiply` |
| Logical | Element-wise boolean | `gt`, `and_` |
| Group | Within-group operations | `group_rank`, `group_neutralize` |
| Vector | Multi-column aggregation | `vec_avg`, `vec_sum` |

---
# 1. Installation & Setup

Install QuantDL from TestPyPI:

In [1]:
!pip install --extra-index-url https://test.pypi.org/simple/ quantdl==0.1.4

Looking in indexes: https://pypi.org/simple, https://test.pypi.org/simple/
Collecting quantdl==0.1.4
  Using cached https://test-files.pythonhosted.org/packages/24/b7/dda43ca759384f239c90908d1bf92c7533d623b561be9065041ff0b03f96/quantdl-0.1.4-py3-none-any.whl.metadata (7.1 kB)
Using cached https://test-files.pythonhosted.org/packages/24/b7/dda43ca759384f239c90908d1bf92c7533d623b561be9065041ff0b03f96/quantdl-0.1.4-py3-none-any.whl (42 kB)
Installing collected packages: quantdl
  Attempting uninstall: quantdl
    Found existing installation: quantdl 0.1.3
    Uninstalling quantdl-0.1.3:
      Successfully uninstalled quantdl-0.1.3
Successfully installed quantdl-0.1.4

In [2]:
import quantdl
print(quantdl.__file__)

/Users/zdf/Documents/GitHub/example/.venv/lib/python3.13/site-packages/quantdl/__init__.py


In [3]:
from dotenv import load_dotenv
load_dotenv()

import polars as pl
from quantdl import QuantDLClient
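# Note: the star import shadows Python built-ins such as abs, min, and max
# with the QuantDL operators of the same name (used in section 5).
from quantdl.operators import *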

# Initialize client
client = QuantDLClient()

# Define symbols
symbols = ["IBM", "TXN", "NOW", "BMY", "LMT"]

Jupyter runs its own event loop to handle interactive execution. When the client calls `asyncio.run()`, it tries to start a new event loop, but asyncio does not allow nested event loops by default and raises a `RuntimeError`. `nest_asyncio.apply()` monkey-patches asyncio so that nested `run()` calls work.

In [4]:
import nest_asyncio
nest_asyncio.apply()
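
The restriction described above can be reproduced with the standard library alone. This is a minimal sketch: `fetch_prices` and `notebook_cell` are hypothetical stand-ins, not QuantDL functions, and the outer `asyncio.run()` plays the role of Jupyter's already-running loop.

```python
import asyncio

async def fetch_prices():
    # Hypothetical stand-in for an async data request.
    return [161.5, 160.1]

async def notebook_cell():
    # We are inside a running event loop here, like code in a Jupyter cell.
    coro = fetch_prices()
    try:
        asyncio.run(coro)      # a nested run() is rejected by asyncio
    except RuntimeError as exc:
        coro.close()           # avoid a "coroutine was never awaited" warning
        return str(exc)
    return "no error"

message = asyncio.run(notebook_cell())
print(message)
```

With `nest_asyncio.apply()` in place, the nested call is allowed instead of raising.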

---
# 2. Data Fetching

## 2.1 `client.ticks()` - Daily OHLCV Data

Fetch daily price and volume data.

**Parameters:**
- `symbols`: List of ticker symbols
- `field`: One of `"open"`, `"high"`, `"low"`, `"close"`, `"volume"`
- `start`, `end`: Date range

In [5]:
# Fetch close prices
prices = client.ticks(symbols, field="close", start="2024-01-01", end="2024-06-30")
print(f"Shape: {prices.shape}")
prices.head(7)

Shape: (124, 6)


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,161.5,169.25999,687.52002,52.76,456.12
2024-01-03,160.10001,166.74001,675.29999,52.3,459.12
2024-01-04,160.86,164.465,671.87,52.04,457.87
2024-01-05,159.16,165.10001,676.15997,52.23,456.5
2024-01-08,161.14,168.53999,696.26001,51.79,458.60001
2024-01-09,160.08,168.63,698.66998,51.28,456.29001
2024-01-10,161.23,167.25,714.29999,50.66,455.39999


In [6]:
# Fetch volume
volume = client.ticks(symbols, field="volume", start="2024-01-01", end="2024-06-30")
volume.head(7)

timestamp,IBM,TXN,NOW,BMY,LMT
date,i64,i64,i64,i64,i64
2024-01-02,3825044,5568968,1130930,17757453,1206678
2024-01-03,4086065,5809552,883137,15662627,1174302
2024-01-04,3212004,6364914,914469,16930548,1087809
2024-01-05,4199504,3062300,723641,12272235,705324
2024-01-08,3321698,5667772,1198165,18745875,716216
2024-01-09,2617186,4958909,984114,13385140,733045
2024-01-10,2967852,4013568,1011932,18095314,666085


---
# 3. Time Series Operators

Time series operators work **column-wise** (over time for each stock independently).

## 3.1 `ts_delay(x, d)` - Lag Values

Return the value from `d` periods earlier. The first `d` rows are null.

$$\text{ts\_delay}(x, d)_t = x_{t-d}$$

In [7]:
prices_5d_ago = ts_delay(prices, 5)

print("Current prices vs 5 days ago:")
pl.concat([
    prices.head(7),
    prices_5d_ago.head(7).rename(lambda c: f"{c}_5d_ago")
], how="horizontal")

Current prices vs 5 days ago:


timestamp,IBM,TXN,NOW,BMY,LMT,timestamp_5d_ago,IBM_5d_ago,TXN_5d_ago,NOW_5d_ago,BMY_5d_ago,LMT_5d_ago
date,f64,f64,f64,f64,f64,date,f64,f64,f64,f64,f64
2024-01-02,161.5,169.25999,687.52002,52.76,456.12,2024-01-02,,,,,
2024-01-03,160.10001,166.74001,675.29999,52.3,459.12,2024-01-03,,,,,
2024-01-04,160.86,164.465,671.87,52.04,457.87,2024-01-04,,,,,
2024-01-05,159.16,165.10001,676.15997,52.23,456.5,2024-01-05,,,,,
2024-01-08,161.14,168.53999,696.26001,51.79,458.60001,2024-01-08,,,,,
2024-01-09,160.08,168.63,698.66998,51.28,456.29001,2024-01-09,161.5,169.25999,687.52002,52.76,456.12
2024-01-10,161.23,167.25,714.29999,50.66,455.39999,2024-01-10,160.10001,166.74001,675.29999,52.3,459.12


## 3.2 `ts_delta(x, d)` - Price Change / Momentum

Difference from `d` days ago.

$$\text{ts\_delta}(x, d)_t = x_t - x_{t-d}$$

In [8]:
momentum_5d = ts_delta(prices, 5)

print("5-day price change (momentum):")
momentum_5d.head(7)

5-day price change (momentum):


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,,,,,
2024-01-04,,,,,
2024-01-05,,,,,
2024-01-08,,,,,
2024-01-09,-1.42,-0.62999,11.14996,-1.48,0.17001
2024-01-10,1.12999,0.50999,39.0,-1.64,-3.72001
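
To make the lag and difference definitions concrete, here is a plain-Python sketch applied to the IBM closes shown above. The `ts_delay_ref` and `ts_delta_ref` helpers are hypothetical reference functions written for this tutorial, not the QuantDL implementation.

```python
def ts_delay_ref(values, d):
    """Value from d rows earlier; None where no such row exists."""
    return [values[t - d] if t >= d else None for t in range(len(values))]

def ts_delta_ref(values, d):
    """x_t - x_{t-d}; None where the lagged value is missing."""
    lagged = ts_delay_ref(values, d)
    return [None if old is None else round(new - old, 5)
            for new, old in zip(values, lagged)]

ibm = [161.5, 160.10001, 160.86, 159.16, 161.14, 160.08, 161.23]
delayed = ts_delay_ref(ibm, 5)
delta = ts_delta_ref(ibm, 5)
print(delayed)
print(delta)   # last two entries: -1.42 and 1.12999, as in the IBM column
```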


## 3.3 `ts_mean(x, d)` - Rolling Mean (Moving Average)

Simple moving average over `d` periods.

$$\text{ts\_mean}(x, d)_t = \frac{1}{d} \sum_{i=0}^{d-1} x_{t-i}$$

In [9]:
ma_5 = ts_mean(prices, 5)

print("5-day moving average:")
ma_5.head(7)

5-day moving average:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,161.5,169.25999,687.52002,52.76,456.12
2024-01-03,160.800005,168.0,681.410005,52.53,457.62
2024-01-04,160.820003,166.821667,678.230003,52.366667,457.703333
2024-01-05,160.405002,166.391253,677.712495,52.3325,457.4025
2024-01-08,160.552002,166.821,681.421998,52.224,457.642002
2024-01-09,160.268002,166.695002,683.65199,51.928,457.676004
2024-01-10,160.494,166.797,691.45199,51.6,456.932002
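
Note that the first four rows of the output are not null: they average over however many rows are available so far. A minimal sketch of that behavior, assuming partial windows at the start (the helper `ts_mean_ref` is hypothetical and only mirrors what the output shows):

```python
def ts_mean_ref(values, d):
    """Mean of the last d values, using a shorter window at the start."""
    out = []
    for t in range(len(values)):
        window = values[max(0, t - d + 1): t + 1]
        out.append(sum(window) / len(window))
    return out

ibm = [161.5, 160.10001, 160.86, 159.16, 161.14, 160.08, 161.23]
ma_ref = ts_mean_ref(ibm, 5)
print([round(v, 6) for v in ma_ref])
```

The second entry, for example, is the mean of only two closes: (161.5 + 160.10001) / 2 = 160.800005.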


## 3.4 `ts_sum(x, d)` - Rolling Sum

Rolling sum over the last `d` periods.

$$\text{ts\_sum}(x, d)_t = \sum_{i=0}^{d-1} x_{t-i}$$

In [10]:
vol_5d = ts_sum(volume, 5)

print("5-day cumulative volume:")
vol_5d.head(7)

5-day cumulative volume:


timestamp,IBM,TXN,NOW,BMY,LMT
date,i64,i64,i64,i64,i64
2024-01-02,3825044,5568968,1130930,17757453,1206678
2024-01-03,7911109,11378520,2014067,33420080,2380980
2024-01-04,11123113,17743434,2928536,50350628,3468789
2024-01-05,15322617,20805734,3652177,62622863,4174113
2024-01-08,18644315,26473506,4850342,81368738,4890329
2024-01-09,17436457,25863447,4703526,76996425,4416696
2024-01-10,16318244,24067463,4832321,79429112,3908479


## 3.5 `ts_std(x, d)` - Rolling Standard Deviation

Volatility measure over `d` periods.

$$\text{ts\_std}(x, d)_t = \sqrt{\frac{1}{d-1} \sum_{i=0}^{d-1} (x_{t-i} - \bar{x})^2}$$

In [11]:
# Calculate daily returns first
daily_return = divide(ts_delta(prices, 1), ts_delay(prices, 1))
volatility = ts_std(daily_return, 5)

print("5-day rolling volatility:")
volatility.head(7)

5-day rolling volatility:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,,,,,
2024-01-04,0.009486,0.00088,0.008977,0.00265,0.006576
2024-01-05,0.008348,0.010484,0.012085,0.006343,0.005449
2024-01-08,0.011001,0.016864,0.020184,0.005768,0.004944
2024-01-09,0.009906,0.01462,0.01748,0.005516,0.005151
2024-01-10,0.009662,0.01323,0.014283,0.006152,0.00366
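
The formula uses the sample standard deviation (divisor `d-1`), which needs at least two observations. That explains why the first two rows are null: the first return is missing and the second row has only one valid return. A sketch using `statistics.stdev`, with the hypothetical helper `ts_std_ref`:

```python
import statistics

def ts_std_ref(values, d):
    """Sample std over the last d values, skipping None; None if fewer than 2."""
    out = []
    for t in range(len(values)):
        window = [v for v in values[max(0, t - d + 1): t + 1] if v is not None]
        out.append(statistics.stdev(window) if len(window) >= 2 else None)
    return out

ibm = [161.5, 160.10001, 160.86, 159.16, 161.14, 160.08, 161.23]
# Daily return = (p_t - p_{t-1}) / p_{t-1}; undefined on the first row.
returns = [None] + [(b - a) / a for a, b in zip(ibm, ibm[1:])]
vol_ref = ts_std_ref(returns, 5)
print(vol_ref)
```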


## 3.6 `ts_min(x, d)` / `ts_max(x, d)` - Rolling Min/Max

Lowest/highest value in rolling window.

$$\text{ts\_min}(x, d)_t = \min_{i=0}^{d-1} x_{t-i}$$
$$\text{ts\_max}(x, d)_t = \max_{i=0}^{d-1} x_{t-i}$$

In [12]:
rolling_low = ts_min(prices, 5)
rolling_high = ts_max(prices, 5)

print("5-day rolling low:")
rolling_low.head(7)

5-day rolling low:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,161.5,169.25999,687.52002,52.76,456.12
2024-01-03,160.10001,166.74001,675.29999,52.3,456.12
2024-01-04,160.10001,164.465,671.87,52.04,456.12
2024-01-05,159.16,164.465,671.87,52.04,456.12
2024-01-08,159.16,164.465,671.87,51.79,456.12
2024-01-09,159.16,164.465,671.87,51.28,456.29001
2024-01-10,159.16,164.465,671.87,50.66,455.39999


## 3.7 `ts_arg_min(x, d)` / `ts_arg_max(x, d)` - Days Since Min/Max

How many days ago was the min/max?

$$\text{ts\_arg\_min}(x, d)_t = \arg\min_{i=0}^{d-1} x_{t-i}$$

In [13]:
days_since_low = ts_arg_min(prices, 4)
days_since_high = ts_arg_max(prices, 4)

print("Days since 4-day low (0 = today is low):")
days_since_low.head(7)

Days since 4-day low (0 = today is low):


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,,,,,
2024-01-04,,,,,
2024-01-05,0.0,1.0,1.0,1.0,3.0
2024-01-08,1.0,2.0,2.0,0.0,1.0
2024-01-09,2.0,3.0,3.0,0.0,0.0
2024-01-10,3.0,3.0,3.0,0.0,0.0
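
A short sketch of the "days since the window minimum" idea, checked against the IBM column above. The helper `ts_arg_min_ref` is hypothetical, and its tie-breaking (the most recent minimum wins) is an assumption that the data shown here does not exercise.

```python
def ts_arg_min_ref(values, d):
    """Offset (0 = today) of the minimum within the last d values."""
    out = []
    for t in range(len(values)):
        if t < d - 1:
            out.append(None)          # full window required
            continue
        newest_first = values[t - d + 1: t + 1][::-1]
        out.append(newest_first.index(min(newest_first)))
    return out

ibm = [161.5, 160.10001, 160.86, 159.16, 161.14, 160.08, 161.23]
days_since_low_ref = ts_arg_min_ref(ibm, 4)
print(days_since_low_ref)  # [None, None, None, 0, 1, 2, 3]
```

The low of 159.16 on 2024-01-05 stays the window minimum for the next three days, so the counter climbs from 0 to 3.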


## 3.8 `ts_rank(x, d)` - Percentile Rank in Window

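Where does the current value rank within its recent history? Returns a value in [0, 1]. The output below is consistent with a zero-based rank scaled by the number of observations $n$ available in the window ($n \leq d$), with 0.5 returned when only one observation is available:

$$\text{ts\_rank}(x, d)_t = \frac{(\text{rank of } x_t \text{ in } \{x_{t-d+1}, ..., x_t\}) - 1}{n - 1}$$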

In [14]:
percentile = ts_rank(prices, 5)

print("Percentile rank in 5-day window:")
percentile.head(7)

Percentile rank in 5-day window:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,0.5,0.5,0.5,0.5,0.5
2024-01-03,0.0,0.0,0.0,0.0,1.0
2024-01-04,0.5,0.0,0.0,0.0,0.5
2024-01-05,0.0,0.333333,0.666667,0.333333,0.333333
2024-01-08,0.75,0.75,1.0,0.0,0.75
2024-01-09,0.25,1.0,1.0,0.0,0.0
2024-01-10,1.0,0.5,1.0,0.0,0.0
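
The values above can be reproduced by counting how many window entries lie below the current value and dividing by the window size minus one. This is a sketch inferred from the printed output, not taken from the QuantDL source; `ts_rank_ref` is a hypothetical helper and it ignores ties.

```python
def ts_rank_ref(values, d):
    """Zero-based rank of x_t within its window, scaled to [0, 1]."""
    out = []
    for t in range(len(values)):
        window = values[max(0, t - d + 1): t + 1]
        n = len(window)
        if n == 1:
            out.append(0.5)           # single observation: midpoint
            continue
        below = sum(v < values[t] for v in window)
        out.append(below / (n - 1))
    return out

ibm = [161.5, 160.10001, 160.86, 159.16, 161.14, 160.08, 161.23]
ts_rank_out = ts_rank_ref(ibm, 5)
print(ts_rank_out)  # [0.5, 0.0, 0.5, 0.0, 0.75, 0.25, 1.0]
```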


## 3.9 `ts_zscore(x, d)` - Rolling Z-Score

Standardize relative to recent history.

$$\text{ts\_zscore}(x, d)_t = \frac{x_t - \text{ts\_mean}(x, d)_t}{\text{ts\_std}(x, d)_t}$$

In [15]:
price_zscore = ts_zscore(prices, 5)

print("5-day rolling z-score:")
price_zscore.head(7)

5-day rolling z-score:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,-0.707107,-0.707107,-0.707107,-0.707107,0.707107
2024-01-04,0.057069,-0.982543,-0.773145,-0.895958,0.1106
2024-01-05,-1.234939,-0.603599,-0.22845,-0.335585,-0.658936
2024-01-08,0.630285,0.823656,1.458918,-1.209153,0.736118
2024-01-09,-0.242605,1.010504,1.17883,-1.570384,-1.106099
2024-01-10,0.843706,0.234546,1.311206,-1.480453,-1.191954


## 3.10 `ts_scale(x, d)` - Rolling Min-Max Scale [0, 1]

Where is current price in recent range?

$$\text{ts\_scale}(x, d)_t = \frac{x_t - \text{ts\_min}(x, d)_t}{\text{ts\_max}(x, d)_t - \text{ts\_min}(x, d)_t}$$

In [16]:
scaled_price = ts_scale(prices, 5)

print("5-day scaled price [0=low, 1=high]:")
scaled_price.head(7)

5-day scaled price [0=low, 1=high]:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,0.0,0.0,0.0,0.0,1.0
2024-01-04,0.542854,0.0,0.0,0.0,0.583333
2024-01-05,0.0,0.132432,0.274119,0.263889,0.126667
2024-01-08,0.846154,0.849843,1.0,0.0,0.82667
2024-01-09,0.464646,1.0,1.0,0.0,0.0
2024-01-10,1.0,0.668667,1.0,0.0,0.0


## 3.11 `ts_corr(x, y, d)` / `ts_covariance(x, y, d)` - Rolling Correlation/Covariance

Measure co-movement over time.

$$\text{ts\_corr}(x, y, d)_t = \frac{\text{Cov}(x, y)}{\sigma_x \sigma_y}$$

In [17]:
price_vol_corr = ts_corr(prices, volume, 5)

print("5-day rolling price-volume correlation:")
price_vol_corr.head(7)

5-day rolling price-volume correlation:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,,,,,
2024-01-04,,,,,
2024-01-05,,,,,
2024-01-08,-0.677233,0.241287,0.855746,-0.130313,0.021488
2024-01-09,-0.507565,0.125877,0.717503,0.038955,0.631587
2024-01-10,-0.574182,0.073813,0.586369,-0.347147,0.47169
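
As a worked check of the correlation formula, the sketch below computes a Pearson correlation over each full 5-row window of IBM closes and volumes taken from the tables above. `pearson` and `ts_corr_ref` are hypothetical helpers, not QuantDL functions.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ts_corr_ref(xs, ys, d):
    """Rolling correlation; None until a full window of d rows exists."""
    return [pearson(xs[t - d + 1: t + 1], ys[t - d + 1: t + 1]) if t >= d - 1 else None
            for t in range(len(xs))]

ibm_close = [161.5, 160.10001, 160.86, 159.16, 161.14, 160.08, 161.23]
ibm_volume = [3825044, 4086065, 3212004, 4199504, 3321698, 2617186, 2967852]
corr_ref = ts_corr_ref(ibm_close, ibm_volume, 5)
print(corr_ref)
```

The first non-null value (2024-01-08) comes out at about -0.6772, in line with the IBM column.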


## 3.12 `ts_decay_linear(x, d)` - Linear Decay Weighted Average

Recent values weighted more heavily.

$$\text{ts\_decay\_linear}(x, d)_t = \frac{\sum_{i=0}^{d-1} (d-i) \cdot x_{t-i}}{\sum_{i=0}^{d-1} (d-i)}$$

In [18]:
decay_avg = ts_decay_linear(prices, 5)

print("5-day linear decay weighted average:")
decay_avg.head(7)

5-day linear decay weighted average:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,,,,,
2024-01-04,,,,,
2024-01-05,,,,,
2024-01-08,160.441335,166.615667,682.644662,52.09,457.798003
2024-01-09,160.284001,167.218667,688.393989,51.775333,457.347339
2024-01-10,160.604667,167.403666,698.609989,51.352667,456.588668
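
The weighting scheme in the formula gives weight `d` to today's value and weight 1 to the oldest value in the window. A sketch with the hypothetical helper `ts_decay_linear_ref`, assuming a full window is required (as the null rows above suggest):

```python
def ts_decay_linear_ref(values, d):
    """Linearly decaying weighted average: weight d for x_t down to 1 for x_{t-d+1}."""
    weights = range(d, 0, -1)
    total = sum(weights)
    out = []
    for t in range(len(values)):
        if t < d - 1:
            out.append(None)
            continue
        newest_first = values[t - d + 1: t + 1][::-1]
        out.append(sum(w * v for w, v in zip(weights, newest_first)) / total)
    return out

ibm = [161.5, 160.10001, 160.86, 159.16, 161.14, 160.08, 161.23]
decay_ref = ts_decay_linear_ref(ibm, 5)
print(decay_ref)
```

For 2024-01-08 this is (5·161.14 + 4·159.16 + 3·160.86 + 2·160.10001 + 1·161.5) / 15 ≈ 160.441335.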


## 3.13 `ts_av_diff(x, d)` - Deviation from Rolling Mean

How far is current value from average?

$$\text{ts\_av\_diff}(x, d)_t = x_t - \text{ts\_mean}(x, d)_t$$

In [19]:
price_dev = ts_av_diff(prices, 5)

print("Deviation from 5-day mean:")
price_dev.head(7)

Deviation from 5-day mean:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,,,,,
2024-01-04,,,,,
2024-01-05,,,,,
2024-01-08,0.587998,1.71899,14.838012,-0.434,0.958008
2024-01-09,-0.188002,1.934998,15.01799,-0.648,-1.385994
2024-01-10,0.736,0.453,22.848,-0.94,-1.532012


## 3.14 `ts_product(x, d)` - Rolling Product

Rolling product over the last `d` periods. Useful for compounding returns.

$$\text{ts\_product}(x, d)_t = \prod_{i=0}^{d-1} x_{t-i}$$

In [20]:
from quantdl.alpha.parser import alpha_eval
import quantdl.operators as ops

# Calculate return factors (1 + return)
return_factor = alpha_eval(
    "ts_delta(prices, 1) / ts_delay(prices, 1) + 1",
    {"prices": prices},
    ops=ops
)
cum_return = alpha_eval(
    "ts_product(rf, 5)",
    {"rf": return_factor.data},
    ops=ops
)

print("5-day cumulative return factor:")
cum_return

5-day cumulative return factor:


Alpha(124 rows x 6 cols)

In [21]:
cum_return.data.head(7)

timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,0.991331,0.985112,0.982226,0.991281,1.006577
2024-01-04,0.996037,0.971671,0.977237,0.986353,1.003837
2024-01-05,0.985511,0.975423,0.983477,0.989955,1.000833
2024-01-08,0.997771,0.995746,1.012712,0.981615,1.005437
2024-01-09,0.991207,0.996278,1.016218,0.971948,1.000373
2024-01-10,1.007058,1.003059,1.057752,0.968642,0.991898
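
Because each return factor is `1 + r_t = p_t / p_{t-1}`, a 5-day product telescopes to the simple price ratio `p_t / p_{t-5}`. The sketch below checks this on the IBM closes; `ts_product_ref` is a hypothetical helper that, like the output above, multiplies whatever non-null factors are available at the start.

```python
import math

def ts_product_ref(values, d):
    """Product of the non-null values in the last d rows; None if there are none."""
    out = []
    for t in range(len(values)):
        window = [v for v in values[max(0, t - d + 1): t + 1] if v is not None]
        out.append(math.prod(window) if window else None)
    return out

ibm = [161.5, 160.10001, 160.86, 159.16, 161.14, 160.08, 161.23]
factors = [None] + [b / a for a, b in zip(ibm, ibm[1:])]   # 1 + daily return
cum_ref = ts_product_ref(factors, 5)

# Five factors ending 2024-01-10 collapse to p(01-10) / p(01-03).
print(cum_ref[6], ibm[6] / ibm[1])
```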


## 3.15 `ts_quantile(x, d)` - Gaussian Quantile Transform

Transform time-series rank to Gaussian distribution via inverse CDF.

$$\text{ts\_quantile}(x, d)_t = \Phi^{-1}(\text{ts\_rank}(x, d)_t)$$

In [22]:
gaussian_rank = ts_quantile(prices, 5)

print("Gaussian quantile transform:")
gaussian_rank.head(7)

Gaussian quantile transform:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,0.0,0.0,0.0,0.0,0.0
2024-01-03,-0.67449,-0.67449,-0.67449,-0.67449,0.67449
2024-01-04,0.0,-0.967422,-0.967422,-0.967422,0.0
2024-01-05,-1.150349,-0.318639,0.318639,-0.318639,-0.318639
2024-01-08,0.524401,0.524401,1.281552,-1.281552,0.524401
2024-01-09,-0.524401,1.281552,1.281552,-1.281552,-1.281552
2024-01-10,1.281552,0.0,1.281552,-1.281552,-1.281552


## 3.16 `ts_regression(y, x, d, rettype)` - Rolling Regression

Linear regression of `y` on `x` over a rolling window. `rettype` selects the intercept (`"alpha"`), the slope (`"beta"`), or the residual (`"resid"`).

$$y_t = \alpha + \beta \cdot x_t + \epsilon_t$$

In [23]:
beta = ts_regression(prices, volume, 5, rettype="beta")
alpha_reg = ts_regression(prices, volume, 5, rettype="alpha")
resid = ts_regression(prices, volume, 5, rettype="resid")

print("Rolling beta (price vs volume):")
beta.head(7)

Rolling beta (price vs volume):


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,-5e-06,-1e-05,4.9e-05,2.1959e-07,-9.3e-05
2024-01-04,-5.2764e-07,-6e-06,5.8e-05,1.8404e-07,-9e-06
2024-01-05,-1e-06,2.7779e-07,3e-05,4.9866e-08,2e-06
2024-01-08,-1e-06,3.9178e-07,4.5e-05,-1.8661e-08,1.1282e-07
2024-01-09,-5.9877e-07,1.8805e-07,5.3e-05,6.1362e-09,3e-06
2024-01-10,-8.5052e-07,1.0883e-07,6e-05,-7.6205e-08,4e-06


## 3.17 `ts_count_nans(x, d)` - Count Nulls in Window

Data quality check.

$$\text{ts\_count\_nans}(x, d)_t = \sum_{i=0}^{d-1} \mathbb{1}[x_{t-i} = \text{NaN}]$$

In [24]:
daily_change = ts_delta(prices, 1)
nan_count = ts_count_nans(daily_change, 5)

print("NaN count in 5-day window:")
nan_count.head(7)

NaN count in 5-day window:


timestamp,IBM,TXN,NOW,BMY,LMT
date,i64,i64,i64,i64,i64
2024-01-02,1,1,1,1,1
2024-01-03,1,1,1,1,1
2024-01-04,1,1,1,1,1
2024-01-05,1,1,1,1,1
2024-01-08,1,1,1,1,1
2024-01-09,0,0,0,0,0
2024-01-10,0,0,0,0,0


## 3.18 `ts_backfill(x, d)` - Forward Fill Nulls

Fill each null with the most recent non-null value, looking back at most `d` periods.

$$\text{ts\_backfill}(x, d)_t = \begin{cases} x_t & \text{if } x_t \neq \text{NaN} \\ x_{t-k} & \text{where } k = \min\{j : x_{t-j} \neq \text{NaN}, j \leq d\} \end{cases}$$

In [25]:
# Create sparse data with NaN gaps
sparse = prices.head(7).with_columns(
    pl.when(pl.col("IBM").is_not_null() & (pl.int_range(pl.len()) >= 2) & (pl.int_range(pl.len()) <= 4))
    .then(pl.lit(None))
    .otherwise(pl.col("IBM"))
    .alias("IBM")
)
filled = ts_backfill(sparse, 5)

print("Before backfill:")
print(sparse.select(["timestamp", "IBM"]).head(7))
print("\nAfter backfill:")
print(filled.select(["timestamp", "IBM"]).head(7))

Before backfill:
shape: (7, 2)
┌────────────┬───────────┐
│ timestamp  ┆ IBM       │
│ ---        ┆ ---       │
│ date       ┆ f64       │
╞════════════╪═══════════╡
│ 2024-01-02 ┆ 161.5     │
│ 2024-01-03 ┆ 160.10001 │
│ 2024-01-04 ┆ null      │
│ 2024-01-05 ┆ null      │
│ 2024-01-08 ┆ null      │
│ 2024-01-09 ┆ 160.08    │
│ 2024-01-10 ┆ 161.23    │
└────────────┴───────────┘

After backfill:
shape: (7, 2)
┌────────────┬───────────┐
│ timestamp  ┆ IBM       │
│ ---        ┆ ---       │
│ date       ┆ f64       │
╞════════════╪═══════════╡
│ 2024-01-02 ┆ 161.5     │
│ 2024-01-03 ┆ 160.10001 │
│ 2024-01-04 ┆ 160.10001 │
│ 2024-01-05 ┆ 160.10001 │
│ 2024-01-08 ┆ 160.10001 │
│ 2024-01-09 ┆ 160.08    │
│ 2024-01-10 ┆ 161.23    │
└────────────┴───────────┘
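
The lookback limit in the formula matters once a gap is longer than `d`. A minimal sketch of the rule as written in the formula above (the helper `ts_backfill_ref` is hypothetical, not the QuantDL implementation):

```python
def ts_backfill_ref(values, d):
    """Replace None with the latest non-None value found within the previous d rows."""
    out = []
    for t, v in enumerate(values):
        if v is None:
            for j in range(1, d + 1):
                if t - j >= 0 and values[t - j] is not None:
                    v = values[t - j]
                    break
        out.append(v)
    return out

sparse_ibm = [161.5, 160.10001, None, None, None, 160.08, 161.23]
filled_ref = ts_backfill_ref(sparse_ibm, 5)
filled_short = ts_backfill_ref(sparse_ibm, 2)
print(filled_ref)
print(filled_short)   # the third null is more than 2 rows from a valid value
```

With `d = 5` the whole gap is filled with 160.10001, as in the output above. With `d = 2` the last null in the gap stays null under this rule.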


## 3.19 `ts_step(x)` - Row Counter

Create a time index (1, 2, 3, ...).

In [26]:
time_idx = ts_step(prices)

print("Row counter:")
time_idx.head(7)

Row counter:


timestamp,IBM,TXN,NOW,BMY,LMT
date,i64,i64,i64,i64,i64
2024-01-02,1,1,1,1,1
2024-01-03,2,2,2,2,2
2024-01-04,3,3,3,3,3
2024-01-05,4,4,4,4,4
2024-01-08,5,5,5,5,5
2024-01-09,6,6,6,6,6
2024-01-10,7,7,7,7,7


## 3.20 `hump(x, hump)` - Limit Change Magnitude

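Smooth signals by limiting step changes, which reduces turnover. The output below is consistent with the following rule, where $y$ is the smoothed output, $y$ equals $x$ on the first non-null row, and the limit $L_t = h \cdot \sum_j |x_{t,j}|$ scales with the row's total absolute value:

$$\text{hump}(x, h)_{t,i} = \begin{cases} y_{t-1,i} & \text{if } |x_{t,i} - y_{t-1,i}| \leq L_t \\ y_{t-1,i} + L_t \cdot \text{sign}(x_{t,i} - y_{t-1,i}) & \text{otherwise} \end{cases}$$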

In [27]:
price_zscore = ts_zscore(prices, 5)
smooth_signal = hump(price_zscore, 0.05)

print("Smoothed z-score (hump=0.05):")
smooth_signal.head(7)

Smoothed z-score (hump=0.05):


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,-0.707107,-0.707107,-0.707107,-0.707107,0.707107
2024-01-04,-0.566141,-0.848073,-0.707107,-0.848073,0.566141
2024-01-05,-0.719216,-0.694997,-0.554031,-0.694997,0.413066
2024-01-08,-0.47631,-0.452091,-0.311125,-0.937904,0.655972
2024-01-09,-0.47631,-0.19667,-0.055704,-1.193325,0.400551
2024-01-10,-0.223217,0.056424,0.197389,-1.446418,0.147458
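
The step sizes in this output (for example 0.140966 for IBM on 2024-01-04) are larger than 0.05, which suggests the threshold is scaled by the row's total absolute value. The sketch below encodes that reading: small moves keep the previous output, large moves advance by the limit. It is inferred from the printed numbers only, and `hump_ref` is a hypothetical helper, not the QuantDL source.

```python
def hump_ref(rows, h):
    """Per-row limit = h * sum(|x|); hold the previous output unless the move exceeds it."""
    out, prev = [], None
    for row in rows:
        if prev is None:
            prev = list(row)                      # first row passes through
        else:
            limit = h * sum(abs(v) for v in row)
            new = []
            for target, last in zip(row, prev):
                step = target - last
                if abs(step) <= limit:
                    new.append(last)              # small move: keep previous value
                else:
                    new.append(last + limit * (1 if step > 0 else -1))
            prev = new
        out.append(prev)
    return out

# Rolling z-scores for IBM, TXN, NOW, BMY, LMT (from section 3.9).
z = [
    [-0.707107, -0.707107, -0.707107, -0.707107, 0.707107],   # 2024-01-03
    [0.057069, -0.982543, -0.773145, -0.895958, 0.1106],      # 2024-01-04
    [-1.234939, -0.603599, -0.22845, -0.335585, -0.658936],   # 2024-01-05
]
smooth_ref = hump_ref(z, 0.05)
for row in smooth_ref:
    print([round(v, 6) for v in row])
```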


## 3.21 `kth_element(x, d, k)` - K-th Element from End

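Get an element of the rolling window, selected by `k`. In the output below (`d = 5`, `k = 3`), each row holds the value from 3 periods earlier, which corresponds to:

$$\text{kth\_element}(x, d, k)_t = x_{t-k}$$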

In [28]:
third_from_last = kth_element(prices, 5, 3)

print("3rd element from end of 5-day window:")
third_from_last.head(7)

3rd element from end of 5-day window:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,,,,,
2024-01-04,,,,,
2024-01-05,161.5,169.25999,687.52002,52.76,456.12
2024-01-08,160.10001,166.74001,675.29999,52.3,459.12
2024-01-09,160.86,164.465,671.87,52.04,457.87
2024-01-10,159.16,165.10001,676.15997,52.23,456.5


## 3.22 `days_from_last_change(x)` / `last_diff_value(x, d)` - Regime Detection

Detect when values change (useful for discrete signals).

In [29]:
# Create discrete signal (buckets)
discrete_signal = bucket(rank(prices), range_spec="0,1,0.25")
days_unchanged = days_from_last_change(discrete_signal)

print("Days since signal changed:")
days_unchanged.head(7)

Days since signal changed:


timestamp,IBM,TXN,NOW,BMY,LMT
date,i64,i64,i64,i64,i64
2024-01-02,0,0,0,0,0
2024-01-03,1,1,1,1,1
2024-01-04,2,2,2,2,2
2024-01-05,3,3,3,3,3
2024-01-08,4,4,4,4,4
2024-01-09,5,5,5,5,5
2024-01-10,6,6,6,6,6


---
# 4. Cross-Sectional Operators

Cross-sectional operators work **row-wise** (across all stocks at each point in time).

## 4.1 `rank(x)` - Cross-Sectional Rank

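Rank stocks relative to each other at each date. Returns a value in [0, 1]. The output below is consistent with a zero-based rank scaled by $n - 1$, where $n$ is the number of stocks, so the lowest value maps to 0 and the highest to 1:

$$\text{rank}(x)_{t,i} = \frac{(\text{rank of } x_{t,i} \text{ among all stocks at } t) - 1}{n - 1}$$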

In [30]:
price_rank = rank(prices)

print("Cross-sectional rank (0-1):")
price_rank.head(7)

Cross-sectional rank (0-1):


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,0.25,0.5,1.0,0.0,0.75
2024-01-03,0.25,0.5,1.0,0.0,0.75
2024-01-04,0.25,0.5,1.0,0.0,0.75
2024-01-05,0.25,0.5,1.0,0.0,0.75
2024-01-08,0.25,0.5,1.0,0.0,0.75
2024-01-09,0.25,0.5,1.0,0.0,0.75
2024-01-10,0.25,0.5,1.0,0.0,0.75
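
For a single date, the ranking can be sketched in a few lines. The helper `rank_ref` is hypothetical, ignores ties, and uses the zero-based scaling that the output above implies.

```python
def rank_ref(row):
    """Cross-sectional rank of each value in one row, scaled to [0, 1]."""
    n = len(row)
    return [sum(other < v for other in row) / (n - 1) for v in row]

# Closes on 2024-01-02 for IBM, TXN, NOW, BMY, LMT.
closes = [161.5, 169.25999, 687.52002, 52.76, 456.12]
rank_out = rank_ref(closes)
print(rank_out)  # [0.25, 0.5, 1.0, 0.0, 0.75]
```

The ranks do not change over the first seven rows because the ordering of price levels (BMY < IBM < TXN < LMT < NOW) stays the same.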


## 4.2 `zscore(x)` - Cross-Sectional Z-Score

Standardize across stocks at each date.

$$\text{zscore}(x)_{t,i} = \frac{x_{t,i} - \bar{x}_t}{\sigma_t}$$

In [31]:
momentum = ts_delta(prices, 5)
cs_zscore = zscore(momentum)

print("Cross-sectional z-score:")
cs_zscore.head(7)

Cross-sectional z-score:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,,,,,
2024-01-04,,,,,
2024-01-05,,,,,
2024-01-08,,,,,
2024-01-09,-0.551062,-0.404875,1.77494,-0.562164,-0.256839
2024-01-10,-0.329971,-0.364493,1.7787,-0.484209,-0.600027
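
The 2024-01-09 row can be checked by hand from the 5-day momentum values in section 3.2. The sketch assumes the sample standard deviation (divisor `n-1`), which is what reproduces the printed numbers; `zscore_ref` is a hypothetical helper.

```python
import statistics

def zscore_ref(row):
    """Cross-sectional z-score of one row using the sample standard deviation."""
    mean = statistics.mean(row)
    std = statistics.stdev(row)
    return [(v - mean) / std for v in row]

# 5-day momentum on 2024-01-09 for IBM, TXN, NOW, BMY, LMT.
momentum_row = [-1.42, -0.62999, 11.14996, -1.48, 0.17001]
z_row = zscore_ref(momentum_row)
print([round(v, 6) for v in z_row])
```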


## 4.3 `normalize(x)` - Demean (Subtract Row Mean)

Center values around zero at each date. Row sums become ~0.

$$\text{normalize}(x)_{t,i} = x_{t,i} - \bar{x}_t$$

In [32]:
demeaned = normalize(momentum)

print("Demeaned momentum (row sums ~ 0):")
demeaned.head(7)

Demeaned momentum (row sums ~ 0):


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,,,,,
2024-01-04,,,,,
2024-01-05,,,,,
2024-01-08,,,,,
2024-01-09,-2.977996,-2.187986,9.591964,-3.037996,-1.387986
2024-01-10,-5.926004,-6.546004,31.944006,-8.695994,-10.776004


## 4.4 `scale(x, scale, longscale, shortscale)` - Scale to Target Sum

Convert signals to portfolio weights with target absolute sum.

$$\text{scale}(x, s)_{t,i} = \frac{x_{t,i} \cdot s}{\sum_j |x_{t,j}|}$$

In [33]:
weights = scale(demeaned, scale=1.0)

print("Portfolio weights (sum of |w| = 1):")
weights.head(7)

Portfolio weights (sum of |w| = 1):


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,,,,,
2024-01-04,,,,,
2024-01-05,,,,,
2024-01-08,,,,,
2024-01-09,-0.155234,-0.114053,0.5,-0.158362,-0.072352
2024-01-10,-0.092756,-0.102461,0.5,-0.136113,-0.16867
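
Demeaning followed by scaling turns a raw signal into dollar-neutral weights. The sketch below reproduces the 2024-01-09 row; `normalize_ref` and `scale_ref` are hypothetical helpers that follow the two formulas above.

```python
def normalize_ref(row):
    """Subtract the row mean."""
    mean = sum(row) / len(row)
    return [v - mean for v in row]

def scale_ref(row, s=1.0):
    """Rescale so the absolute values sum to s."""
    total = sum(abs(v) for v in row)
    return [v * s / total for v in row]

# 5-day momentum on 2024-01-09 for IBM, TXN, NOW, BMY, LMT.
momentum_row = [-1.42, -0.62999, 11.14996, -1.48, 0.17001]
weights_row = scale_ref(normalize_ref(momentum_row), s=1.0)
print([round(w, 6) for w in weights_row])
```

Because the demeaned row sums to zero, the long and short sides each carry half of the total. NOW is the only positive entry, which is why its weight is exactly 0.5.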


## 4.5 `quantile(x, driver)` - Cross-Sectional Quantile Transform

Transform ranks to a specific distribution.

**Drivers:** `"gaussian"`, `"uniform"`, `"cauchy"`

$$\text{quantile}(x, \text{gaussian})_{t,i} = \Phi^{-1}(\text{rank}(x)_{t,i})$$

In [34]:
gaussian_quantile = quantile(momentum, driver="gaussian")

print("Gaussian quantile transform:")
gaussian_quantile.head(7)

Gaussian quantile transform:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,,,,,
2024-01-04,,,,,
2024-01-05,,,,,
2024-01-08,,,,,
2024-01-09,-0.38532,0.0,0.841621,-0.841621,0.38532
2024-01-10,0.38532,0.0,0.841621,-0.38532,-0.841621


## 4.6 `winsorize(x, std)` - Clip Outliers

Clip values outside mean ± n×std.

$$\text{winsorize}(x, n)_{t,i} = \max(\min(x_{t,i}, \bar{x}_t + n\sigma_t), \bar{x}_t - n\sigma_t)$$

In [35]:
winsorized = winsorize(momentum, std=1.0)

print("Winsorized momentum (+/-1 std):")
winsorized.head(7)

Winsorized momentum (+/-1 std):


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,,,,,
2024-01-04,,,,,
2024-01-05,,,,,
2024-01-08,,,,,
2024-01-09,-1.42,-0.62999,6.962102,-1.48,0.17001
2024-01-10,1.12999,0.50999,25.015182,-1.64,-3.72001
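
In the 2024-01-09 row only NOW is clipped: its momentum of 11.14996 exceeds the row mean plus one standard deviation. A sketch of the clipping formula, assuming the sample standard deviation; `winsorize_ref` is a hypothetical helper.

```python
import statistics

def winsorize_ref(row, n_std=1.0):
    """Clip each value to [mean - n_std * std, mean + n_std * std]."""
    mean = statistics.mean(row)
    std = statistics.stdev(row)
    lo, hi = mean - n_std * std, mean + n_std * std
    return [max(min(v, hi), lo) for v in row]

# 5-day momentum on 2024-01-09 for IBM, TXN, NOW, BMY, LMT.
momentum_row = [-1.42, -0.62999, 11.14996, -1.48, 0.17001]
clipped = winsorize_ref(momentum_row, n_std=1.0)
print([round(v, 6) for v in clipped])
```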


---
# 5. Arithmetic Operators

Element-wise operations on DataFrames.

## 5.1 `add(*args)` / `subtract(x, y)` / `multiply(*args)` / `divide(x, y)`

Basic arithmetic operations.

In [36]:
# Daily return = (price - price_yesterday) / price_yesterday
daily_change = ts_delta(prices, 1)
lagged_prices = ts_delay(prices, 1)
daily_return = divide(daily_change, lagged_prices)

print("Daily return:")
daily_return.head(7)

Daily return:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,-0.008669,-0.014888,-0.017774,-0.008719,0.006577
2024-01-04,0.004747,-0.013644,-0.005079,-0.004971,-0.002723
2024-01-05,-0.010568,0.003861,0.006385,0.003651,-0.002992
2024-01-08,0.01244,0.020836,0.029727,-0.008424,0.0046
2024-01-09,-0.006578,0.000534,0.003461,-0.009847,-0.005037
2024-01-10,0.007184,-0.008184,0.022371,-0.01209,-0.001951


In [37]:
# Volume-weighted price change
weighted = multiply(daily_change, volume)

print("Volume-weighted price change:")
weighted.head(7)

Volume-weighted price change:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,-5720500.0,-14640000.0,-10792000.0,-7204800.0,3522906.0
2024-01-04,2441100.0,-14480000.0,-3136600.0,-4401900.0,-1359800.0
2024-01-05,-7139200.0,1944600.0,3104400.0,2331700.0,-966293.88
2024-01-08,6577000.0,19497000.0,24083000.0,-8248200.0,1504100.0
2024-01-09,-2774200.0,446351.39909,2371700.0,-6826400.0,-1693300.0
2024-01-10,3413000.0,-5538700.0,15817000.0,-11219000.0,-592828.9717


## 5.2 `power(x, p)` / `sqrt(x)` / `log(x)`

Power and logarithmic operations.

$$\text{power}(x, p) = x^p$$
$$\text{sqrt}(x) = \sqrt{x}$$
$$\text{log}(x) = \ln(x)$$

In [38]:
log_prices = log(prices)
sqrt_prices = sqrt(prices)

print("Log prices:")
log_prices.head(7)

Log prices:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,5.084505,5.131436,6.533091,3.965753,6.122756
2024-01-03,5.075799,5.116436,6.515157,3.956996,6.129312
2024-01-04,5.080534,5.102698,6.510065,3.952013,6.126585
2024-01-05,5.06991,5.106551,6.51643,3.955657,6.123589
2024-01-08,5.082274,5.127173,6.545723,3.947197,6.128178
2024-01-09,5.075674,5.127707,6.549178,3.937301,6.123129
2024-01-10,5.082832,5.11949,6.571303,3.925137,6.121176


## 5.3 `signed_power(x, e)` - Sign-Preserving Power

Apply power while preserving sign.

$$\text{signed\_power}(x, e) = \text{sign}(x) \cdot |x|^e$$

In [39]:
returns = divide(ts_delta(prices, 1), ts_delay(prices, 1))
sqrt_returns = signed_power(returns, 0.5)

print("Square root of returns (sign preserved):")
sqrt_returns.head(7)

Square root of returns (sign preserved):


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,-0.093106,-0.122017,-0.133319,-0.093374,0.0811
2024-01-04,0.068898,-0.116808,-0.071269,-0.070508,-0.052179
2024-01-05,-0.102802,0.062137,0.079907,0.060424,-0.0547
2024-01-08,0.111536,0.144346,0.172414,-0.091784,0.067825
2024-01-09,-0.081106,0.02311,0.058833,-0.099234,-0.070972
2024-01-10,0.084758,-0.090463,0.14957,-0.109957,-0.044165


## 5.4 `abs(x)` / `sign(x)` / `inverse(x)` / `reverse(x)`

Transformations.

- `abs(x)` = $|x|$
- `sign(x)` = $\text{sign}(x)$ (-1, 0, or 1)
- `inverse(x)` = $1/x$
- `reverse(x)` = $-x$

In [40]:
abs_change = abs(daily_change)
sign_change = sign(daily_change)

print("Absolute daily change:")
abs_change.head(7)

Absolute daily change:


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,1.39999,2.51998,12.22003,0.46,3.0
2024-01-04,0.75999,2.27501,3.42999,0.26,1.25
2024-01-05,1.7,0.63501,4.28997,0.19,1.37
2024-01-08,1.98,3.43998,20.10004,0.44,2.10001
2024-01-09,1.06,0.09001,2.40997,0.51,2.31
2024-01-10,1.15,1.38,15.63001,0.62,0.89002


## 5.5 `min(*args)` / `max(*args)` - Element-wise Min/Max

Element-wise minimum/maximum across DataFrames.

In [41]:
ma_3 = ts_mean(prices, 3)
ma_5 = ts_mean(prices, 5)

# Lower envelope
ma_lower = min(ma_3, ma_5)

print("Lower envelope (min of MA3, MA5):")
ma_lower.head(7)

Lower envelope (min of MA3, MA5):


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,161.5,169.25999,687.52002,52.76,456.12
2024-01-03,160.800005,168.0,681.410005,52.53,457.62
2024-01-04,160.820003,166.821667,678.230003,52.366667,457.703333
2024-01-05,160.040003,165.435007,674.44332,52.19,457.4025
2024-01-08,160.386667,166.035,681.421998,52.02,457.642002
2024-01-09,160.126667,166.695002,683.65199,51.766667,457.130007
2024-01-10,160.494,166.797,691.45199,51.243333,456.763337


## 5.6 `densify(x, step)` - Bucket Values

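Round to the nearest multiple of `step` (discretize). Note that the example below calls `densify` without a `step` on bucket labels, and its output holds compact integer IDs (`u32`), not rounded floats.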

$$\text{densify}(x, s) = \text{round}(x / s) \cdot s$$

In [44]:
bucketed = bucket(rank(prices), range_spec="0,1,0.25")
dense = densify(bucketed)

print("Densified buckets:")
dense.head(7)

Densified buckets:


timestamp,IBM,TXN,NOW,BMY,LMT
date,u32,u32,u32,u32,u32
2024-01-02,1,2,3,0,3
2024-01-03,1,2,3,0,3
2024-01-04,1,2,3,0,3
2024-01-05,1,2,3,0,3
2024-01-08,1,2,3,0,3
2024-01-09,1,2,3,0,3
2024-01-10,1,2,3,0,3


---
# 6. Logical Operators

Boolean operations on DataFrames.

## 6.1 Comparisons: `gt`, `lt`, `ge`, `le`, `eq`, `ne`

- `gt(x, y)` = $x > y$
- `lt(x, y)` = $x < y$
- `ge(x, y)` = $x \geq y$
- `le(x, y)` = $x \leq y$
- `eq(x, y)` = $x = y$
- `ne(x, y)` = $x \neq y$

In [45]:
ma_5 = ts_mean(prices, 5)
above_ma = gt(prices, ma_5)

print("Price > 5-day MA:")
above_ma.head(7)

Price > 5-day MA:


timestamp,IBM,TXN,NOW,BMY,LMT
date,bool,bool,bool,bool,bool
2024-01-02,False,False,False,False,False
2024-01-03,False,False,False,False,True
2024-01-04,True,False,False,False,True
2024-01-05,False,False,False,False,False
2024-01-08,True,True,True,False,True
2024-01-09,False,True,True,False,False
2024-01-10,True,True,True,False,False


## 6.2 `and_(x, y)` / `or_(x, y)` / `not_(x)`

Boolean logic. Because `and`, `or`, and `not` are reserved Python keywords, these operators use a trailing underscore.

In [46]:
momentum = ts_delta(prices, 3)
pos_momentum = gt(momentum, 0)

# Buy signal: above MA AND positive momentum
buy_signal = and_(above_ma, pos_momentum)

print("Buy signal (above MA AND positive momentum):")
buy_signal.head(7)

Buy signal (above MA AND positive momentum):


timestamp,IBM,TXN,NOW,BMY,LMT
date,bool,bool,bool,bool,bool
2024-01-02,False,False,False,False,False
2024-01-03,False,False,False,False,
2024-01-04,,False,False,False,
2024-01-05,False,False,False,False,False
2024-01-08,True,True,True,False,False
2024-01-09,False,True,True,False,False
2024-01-10,True,True,True,False,False
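
The blank cells in this output follow three-valued (Kleene) logic: the 3-day momentum is null for the first three rows, so `False AND null` is `False` while `True AND null` stays null. A minimal sketch with the hypothetical helper `kleene_and`, using `None` for null:

```python
def kleene_and(a, b):
    """Three-valued AND: False dominates, otherwise None (unknown) propagates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

# 2024-01-03: IBM is below its MA (False), LMT is above (True); momentum is null.
ibm_signal = kleene_and(False, None)
lmt_signal = kleene_and(True, None)
print(ibm_signal, lmt_signal)  # False None
```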


## 6.3 `if_else(condition, then_value, else_value)` - Conditional Selection

Select values based on condition.

$$\text{if\_else}(c, a, b) = \begin{cases} a & \text{if } c = \text{True} \\ b & \text{otherwise} \end{cases}$$

In [47]:
daily_return = divide(ts_delta(prices, 1), ts_delay(prices, 1))

# Cap returns at +/- 5%
capped_return = if_else(
    gt(daily_return, 0.05),
    0.05,
    if_else(lt(daily_return, -0.05), -0.05, daily_return)
)

print("Capped returns (+/-5%):")
capped_return.head(7)

Capped returns (+/-5%):


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-01-02,,,,,
2024-01-03,-0.008669,-0.014888,-0.017774,-0.008719,0.006577
2024-01-04,0.004747,-0.013644,-0.005079,-0.004971,-0.002723
2024-01-05,-0.010568,0.003861,0.006385,0.003651,-0.002992
2024-01-08,0.01244,0.020836,0.029727,-0.008424,0.0046
2024-01-09,-0.006578,0.000534,0.003461,-0.009847,-0.005037
2024-01-10,0.007184,-0.008184,0.022371,-0.01209,-0.001951


## 6.4 `is_nan(x)` - Check for Nulls

$$\text{is\_nan}(x) = \mathbb{1}[x = \text{NaN}]$$

In [48]:
daily_change = ts_delta(prices, 1)
has_nan = is_nan(daily_change)

print("Has NaN:")
has_nan.head(7)

Has NaN:


timestamp,IBM,TXN,NOW,BMY,LMT
date,bool,bool,bool,bool,bool
2024-01-02,True,True,True,True,True
2024-01-03,False,False,False,False,False
2024-01-04,False,False,False,False,False
2024-01-05,False,False,False,False,False
2024-01-08,False,False,False,False,False
2024-01-09,False,False,False,False,False
2024-01-10,False,False,False,False,False
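A minimal plain-Python sketch of why only the first row is flagged. The `ts_delta` and `is_nan` functions below are simplified list-based stand-ins written for this illustration (missing values are modeled as `None`), and the prices are hypothetical.

```python
def ts_delta(series, d):
    """x[t] - x[t-d]; None where there is no value d steps back (illustrative)."""
    return [None if t < d else series[t] - series[t - d] for t in range(len(series))]

def is_nan(series):
    """True where the value is missing (illustrative)."""
    return [v is None for v in series]

prices_one_symbol = [100.0, 101.5, 101.0, 102.25]  # hypothetical closes
flags = is_nan(ts_delta(prices_one_symbol, 1))
print(flags)  # [True, False, False, False]
```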


---
# 7. Group Operators

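Group operators apply a cross-sectional operation separately within each group (for example, each sector) on every row. The `groups` argument is a wide table of integer labels with the same columns as the data.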

## Setup: Define Groups

In [50]:
# Extended symbols for group examples
group_symbols = ["IBM", "TXN", "NOW", "META", "BMY", "JNJ", "LMT", "GD", "SO", "NEE"]
group_prices = client.ticks(group_symbols, field="close", start="2024-01-01", end="2024-06-30")

# Sector map: Tech=1, Healthcare=2, Defense=3, Utilities=4
sector_map = {
    "IBM": 1, "TXN": 1, "NOW": 1, "META": 1,  # Tech
    "BMY": 2, "JNJ": 2,                         # Healthcare
    "LMT": 3, "GD": 3,                          # Defense
    "SO": 4, "NEE": 4,                          # Utilities
}

# Create groups DataFrame
date_col = group_prices.columns[0]
value_cols = group_prices.columns[1:]
groups = group_prices.select(
    pl.col(date_col),
    *[pl.lit(sector_map.get(c, 0)).alias(c) for c in value_cols]
)

print("Groups (1=Tech, 2=Health, 3=Defense, 4=Utilities):")
groups.head(1)

Groups (1=Tech, 2=Health, 3=Defense, 4=Utilities):


timestamp,IBM,TXN,NOW,META,BMY,JNJ,LMT,GD,SO,NEE
date,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32
2024-01-02,1,1,1,1,2,2,3,3,4,4


## 7.1 `group_rank(x, groups)` - Rank Within Groups

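Rank each stock against the other stocks in its sector. In the output below the ranks are scaled to [0, 1], and the first five rows are null because `ts_delta(group_prices, 5)` needs a value five rows back.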

$$\text{group\_rank}(x, g)_{t,i} = \text{rank of } x_{t,i} \text{ among stocks where } g_{t,j} = g_{t,i}$$

In [51]:
momentum = ts_delta(group_prices, 5)
sector_rank = group_rank(momentum, groups)

print("Momentum rank within sector:")
sector_rank.head(7)

Momentum rank within sector:


timestamp,IBM,TXN,NOW,META,BMY,JNJ,LMT,GD,SO,NEE
date,f64,f64,f64,f64,f64,f64,f64,f64,f64,f64
2024-01-02,,,,,,,,,,
2024-01-03,,,,,,,,,,
2024-01-04,,,,,,,,,,
2024-01-05,,,,,,,,,,
2024-01-08,,,,,,,,,,
2024-01-09,0.0,0.333333,1.0,0.666667,0.0,1.0,1.0,0.0,1.0,0.0
2024-01-10,0.333333,0.0,1.0,0.666667,0.0,1.0,1.0,0.0,0.0,1.0
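The 2024-01-09 row can be reproduced by hand. The sketch below is a plain-Python illustration, not QuantDL's implementation: it assumes the normalization is (ordinal rank - 1) / (group size - 1), which is inferred from the output, and it ignores ties. The input values are the sector-demeaned momentum figures from the section 7.2 output; demeaning does not change the ordering within a sector.

```python
def group_rank(values, groups):
    """Rank each value among peers with the same group label, scaled to [0, 1]."""
    out = {}
    for sym, v in values.items():
        peers = sorted(values[s] for s in values if groups[s] == groups[sym])
        # Assumed normalization; a single-member group is arbitrarily given 0.0.
        out[sym] = peers.index(v) / (len(peers) - 1) if len(peers) > 1 else 0.0
    return out

values = {"IBM": -6.479987, "TXN": -5.689977, "NOW": 6.089972,
          "META": 6.079992, "BMY": -1.57, "JNJ": 1.57}
groups = {"IBM": 1, "TXN": 1, "NOW": 1, "META": 1, "BMY": 2, "JNJ": 2}

ranks = {s: round(r, 6) for s, r in group_rank(values, groups).items()}
print(ranks)
# {'IBM': 0.0, 'TXN': 0.333333, 'NOW': 1.0, 'META': 0.666667, 'BMY': 0.0, 'JNJ': 1.0}
```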


## 7.2 `group_neutralize(x, groups)` - Sector-Neutral Signal

Subtract the sector mean from each stock, so the values within every sector sum to zero on each row.

$$\text{group\_neutralize}(x, g)_{t,i} = x_{t,i} - \bar{x}_{t,g_{t,i}}$$

In [52]:
sector_neutral = group_neutralize(momentum, groups)

print("Sector-neutral momentum:")
sector_neutral.head(7)

Sector-neutral momentum:


timestamp,IBM,TXN,NOW,META,BMY,JNJ,LMT,GD,SO,NEE
date,f64,f64,f64,f64,f64,f64,f64,f64,f64,f64
2024-01-02,,,,,,,,,,
2024-01-03,,,,,,,,,,
2024-01-04,,,,,,,,,,
2024-01-05,,,,,,,,,,
2024-01-08,,,,,,,,,,
2024-01-09,-6.479987,-5.689977,6.089972,6.079992,-1.57,1.57,3.765005,-3.765005,0.29,-0.29
2024-01-10,-15.530005,-16.150005,22.340005,9.340005,-1.27,1.27,0.785,-0.785,-0.35,0.35
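A minimal sketch of the demeaning step for one row, in plain Python with hypothetical symbols and values. It is an illustration of the formula above, not the library's code.

```python
from collections import defaultdict

def group_neutralize(values, groups):
    """Subtract each group's equal-weighted mean from its members."""
    members = defaultdict(list)
    for sym, v in values.items():
        members[groups[sym]].append(v)
    means = {g: sum(vs) / len(vs) for g, vs in members.items()}
    return {sym: v - means[groups[sym]] for sym, v in values.items()}

values = {"A": 4.0, "B": 2.0, "C": 10.0, "D": 6.0}   # hypothetical momentum
groups = {"A": 1, "B": 1, "C": 2, "D": 2}            # hypothetical sector labels

neutral = group_neutralize(values, groups)
print(neutral)  # {'A': 1.0, 'B': -1.0, 'C': 2.0, 'D': -2.0}
```

With two stocks per group the results are equal and opposite, which is the pattern visible in the two-stock sectors above (for example BMY at -1.57 and JNJ at 1.57).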


## 7.3 `group_zscore(x, groups)` - Z-Score Within Groups

Standardize within each sector: subtract the sector mean, then divide by the sector standard deviation.

$$\text{group\_zscore}(x, g)_{t,i} = \frac{x_{t,i} - \bar{x}_{t,g_{t,i}}}{\sigma_{t,g_{t,i}}}$$

In [53]:
sector_zscore = group_zscore(momentum, groups)

print("Momentum z-score within sector:")
sector_zscore.head(7)

Momentum z-score within sector:


timestamp,IBM,TXN,NOW,META,BMY,JNJ,LMT,GD,SO,NEE
date,f64,f64,f64,f64,f64,f64,f64,f64,f64,f64
2024-01-02,,,,,,,,,,
2024-01-03,,,,,,,,,,
2024-01-04,,,,,,,,,,
2024-01-05,,,,,,,,,,
2024-01-08,,,,,,,,,,
2024-01-09,-0.921273,-0.808956,0.865824,0.864405,-0.707107,0.707107,0.707107,-0.707107,0.707107,-0.707107
2024-01-10,-0.81537,-0.847922,1.172915,0.490377,-0.707107,0.707107,0.707107,-0.707107,-0.707107,0.707107
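The Tech values in the 2024-01-09 row can be checked from the sector-demeaned momentum shown in section 7.2. The sketch below assumes the sample standard deviation (n - 1 denominator); that assumption is inferred from the output, where every two-stock sector gives ±0.707107 = ±1/√2.

```python
import statistics

# Sector-demeaned 5-day momentum for 2024-01-09, copied from the 7.2 output.
tech = {"IBM": -6.479987, "TXN": -5.689977, "NOW": 6.089972, "META": 6.079992}

vals = list(tech.values())
mean = statistics.fmean(vals)
sd = statistics.stdev(vals)  # sample standard deviation (assumed)
z = {s: (v - mean) / sd for s, v in tech.items()}
print({s: round(v, 4) for s, v in z.items()})

# A two-stock sector always yields +/- 1/sqrt(2) under this convention.
pair = [-1.57, 1.57]
pair_z = [v / statistics.stdev(pair) for v in pair]
print([round(v, 6) for v in pair_z])  # [-0.707107, 0.707107]
```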


## 7.4 `group_scale(x, groups)` - Min-Max Scale Within Groups [0, 1]

Rescale each value to [0, 1] within its sector, so the sector minimum maps to 0 and the sector maximum maps to 1.

In [54]:
sector_scaled = group_scale(momentum, groups)

print("Momentum scaled within sector [0, 1]:")
sector_scaled.head(7)

Momentum scaled within sector [0, 1]:


timestamp,IBM,TXN,NOW,META,BMY,JNJ,LMT,GD,SO,NEE
date,f64,f64,f64,f64,f64,f64,f64,f64,f64,f64
2024-01-02,,,,,,,,,,
2024-01-03,,,,,,,,,,
2024-01-04,,,,,,,,,,
2024-01-05,,,,,,,,,,
2024-01-08,,,,,,,,,,
2024-01-09,0.0,0.062849,1.0,0.999206,0.0,1.0,1.0,0.0,1.0,0.0
2024-01-10,0.016108,0.0,1.0,0.66225,0.0,1.0,1.0,0.0,0.0,1.0
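The output is consistent with a min-max transform, $(x - \min) / (\max - \min)$, taken within each sector. The sketch below checks that reading against the Tech values for 2024-01-09, again using the demeaned figures from section 7.2 (a constant shift does not affect min-max scaling). The `min_max` helper is illustrative, and its handling of a zero range is an arbitrary choice.

```python
def min_max(values):
    """Scale values so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo
    # A zero span (all values equal) is mapped to 0.0 here by assumption.
    return {s: (v - lo) / span if span else 0.0 for s, v in values.items()}

tech = {"IBM": -6.479987, "TXN": -5.689977, "NOW": 6.089972, "META": 6.079992}
scaled = min_max(tech)
print({s: round(v, 6) for s, v in scaled.items()})
```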


## 7.5 `group_mean(x, weights, groups)` - Weighted Mean Within Groups

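Compute the weighted average of `x` within each sector and broadcast it to every stock in that sector. Note that the warm-up rows in the output below show 0.0 rather than null.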

In [55]:
group_volume = client.ticks(group_symbols, field="volume", start="2024-01-01", end="2024-06-30")
# Price x volume is daily dollar volume, used here as a rough stand-in for market cap
market_cap_proxy = multiply(group_prices, group_volume)
sector_avg = group_mean(momentum, market_cap_proxy, groups)

print("Market-cap weighted sector average momentum:")
sector_avg.head(7)

Market-cap weighted sector average momentum:


timestamp,IBM,TXN,NOW,META,BMY,JNJ,LMT,GD,SO,NEE
date,f64,f64,f64,f64,f64,f64,f64,f64,f64,f64
2024-01-02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2024-01-03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2024-01-04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2024-01-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2024-01-08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2024-01-09,8.88994,8.88994,8.88994,8.88994,0.40246,0.40246,-3.485361,-3.485361,0.637081,0.637081
2024-01-10,24.035537,24.035537,24.035537,24.035537,-0.455975,-0.455975,-4.38101,-4.38101,0.102835,0.102835
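A minimal sketch of a weighted group mean for one row, assuming the usual definition $\sum_j w_j x_j / \sum_j w_j$ over the stocks in each group. The symbols, values, and weights are hypothetical, and this is not the library's implementation.

```python
def group_mean(values, weights, groups):
    """Weighted mean per group, broadcast back to every member."""
    num, den = {}, {}
    for sym, v in values.items():
        g = groups[sym]
        num[g] = num.get(g, 0.0) + weights[sym] * v
        den[g] = den.get(g, 0.0) + weights[sym]
    return {sym: num[groups[sym]] / den[groups[sym]] for sym in values}

values = {"A": 10.0, "B": 20.0, "C": -4.0, "D": 2.0}   # hypothetical momentum
weights = {"A": 3.0, "B": 1.0, "C": 1.0, "D": 1.0}     # hypothetical weights
groups = {"A": 1, "B": 1, "C": 2, "D": 2}

weighted = group_mean(values, weights, groups)
print(weighted)  # {'A': 12.5, 'B': 12.5, 'C': -1.0, 'D': -1.0}
```

Every member of a group receives the same number, which matches the repeated values per sector in the output above.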


---
# 8. Vector Operators

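Vector operators reduce list-valued cells to a single number per cell.

## 8.1 `vec_avg(*columns)` / `vec_sum(*columns)`

Each cell holds a list of values. `vec_avg` returns the mean of that list and `vec_sum` returns its total, cell by cell; the lists may differ in length.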

In [62]:
list_data = pl.DataFrame({
    "timestamp": ["2024-01-01", "2024-01-02"],
    "IBM": [[180.0, 185.0, 190.0], [182.0, 187.0]],
    "TXN": [[200.0, 205.0], [210.0, 215.0, 220.0]],
})
avg_targets = vec_avg(list_data)
print("\nAverage of list elements:")
print(avg_targets)


Average of list elements:
shape: (2, 3)
┌────────────┬───────┬───────┐
│ timestamp  ┆ IBM   ┆ TXN   │
│ ---        ┆ ---   ┆ ---   │
│ str        ┆ f64   ┆ f64   │
╞════════════╪═══════╪═══════╡
│ 2024-01-01 ┆ 185.0 ┆ 202.5 │
│ 2024-01-02 ┆ 184.5 ┆ 215.0 │
└────────────┴───────┴───────┘


In [63]:
# Sum the list elements in each cell
sum_signal = vec_sum(list_data)

print("Sum of all prices:")
sum_signal.head(7)

Sum of all prices:


timestamp,IBM,TXN
str,f64,f64
"2024-01-01",555.0,405.0
"2024-01-02",369.0,645.0

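Both outputs above can be verified with plain Python lists. This sketch only mirrors the arithmetic on the same inputs; it does not call QuantDL, and the variable names are illustrative.

```python
cells = {
    "IBM": [[180.0, 185.0, 190.0], [182.0, 187.0]],
    "TXN": [[200.0, 205.0], [210.0, 215.0, 220.0]],
}

# Mean and total of each cell's list, one result per (row, symbol).
cell_avg = {s: [sum(c) / len(c) for c in col] for s, col in cells.items()}
cell_sum = {s: [sum(c) for c in col] for s, col in cells.items()}

print(cell_avg)  # {'IBM': [185.0, 184.5], 'TXN': [202.5, 215.0]}
print(cell_sum)  # {'IBM': [555.0, 369.0], 'TXN': [405.0, 645.0]}
```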

---
# 9. Putting It All Together: Alpha Example

Combine operators to build a simple alpha that blends cross-sectional momentum with time-series mean reversion.

In [71]:
window = 5

# 1. Calculate momentum
momentum = ts_delta(prices, window)

# 2. Cross-sectional rank (handles NaN gracefully)
momentum_rank = rank(momentum)

# 3. Mean reversion component
price_zscore = ts_zscore(prices, window)
mean_rev_signal = reverse(price_zscore)

# 4. Combine signals - filter=True treats NaN as 0
combined = add(momentum_rank, mean_rev_signal, filter=True)

# 5. Standardize and scale
alpha = zscore(combined)
portfolio_weights = scale(alpha, scale=1.0)

# Show the last 10 rows, well past the warm-up period
print("Final portfolio weights (after warmup):")
portfolio_weights.tail(10)

Final portfolio weights (after warmup):


timestamp,IBM,TXN,NOW,BMY,LMT
date,f64,f64,f64,f64,f64
2024-06-14,-0.166667,-0.166667,-0.166667,0.457406,0.042594
2024-06-17,-0.14981,-0.14981,-0.14981,0.5,-0.050569
2024-06-18,-0.018517,-0.018517,-0.018517,0.5,-0.444449
2024-06-20,0.079803,0.079803,0.079803,0.260591,-0.5
2024-06-21,0.146357,0.146357,0.146357,-0.5,0.060928
2024-06-24,0.156694,0.156694,0.156694,-0.5,0.029917
2024-06-25,0.01867,0.01867,0.01867,-0.5,0.443989
2024-06-26,-0.04367,-0.04367,-0.04367,-0.368991,0.5
2024-06-27,-0.166667,-0.166667,-0.166667,0.23646,0.26354
2024-06-28,-0.166667,-0.166667,-0.166667,0.488897,0.011103
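A quick sanity check on the result: the rows above suggest that `scale(alpha, scale=1.0)` normalizes each row so the absolute weights sum to 1, and that the preceding `zscore` leaves the row net-neutral. The sketch below tests that reading on three rows copied from the output; it is an observation about these numbers, not a statement of the library's documented behavior.

```python
# Rows copied from the portfolio_weights output (IBM, TXN, NOW, BMY, LMT).
rows = {
    "2024-06-14": [-0.166667, -0.166667, -0.166667, 0.457406, 0.042594],
    "2024-06-21": [0.146357, 0.146357, 0.146357, -0.5, 0.060928],
    "2024-06-28": [-0.166667, -0.166667, -0.166667, 0.488897, 0.011103],
}

checks = {}
for date, w in rows.items():
    gross = sum(abs(x) for x in w)  # total absolute exposure
    net = sum(w)                    # long minus short exposure
    checks[date] = (gross, net)
    print(f"{date}  gross={gross:.4f}  net={abs(net):.4f}")
```

Gross exposure near 1 and net exposure near 0 correspond to a dollar-neutral book with half the capital long and half short.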


---
# 10. Cleanup

Check the request counters, then close the client.

In [72]:
client.request_stats()

{'session_count': 0,
 'today_count': 7,
 'daily_counts': {'2026-01-19': 29, '2026-01-20': 7}}

In [73]:
client.close()
print("Done!")

Done!
