# Systematic Trading Framework: Code Reference (Detailed)

This notebook is a **markdown-only reference** that documents, file by file, what each module, class, and function does in the current codebase.

> Scope: `src/`, `config/`, and `tests/` modules that define functions/classes.


## src/backtesting/engine.py

### `BacktestResult`
- Dataclass that packages backtest outputs: `equity_curve`, `returns`, `positions`, `turnover`, and `summary` metrics.

### `_compute_summary(returns, periods_per_year=252)`
- Cleans the strategy returns and computes key performance metrics from them:
  - Cumulative return via cumulative product.
  - Annualized return by scaling over `periods_per_year`.
  - Annualized volatility as `std * sqrt(periods_per_year)`.
  - Sharpe ratio (`ann_ret / ann_vol`) with handling for a zero denominator.
  - Max drawdown from equity curve.
- Returns a dictionary of these metrics.
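
The metrics listed above can be sketched as follows. This is a minimal illustration, not the module's actual code: the name `compute_summary_sketch`, the dictionary keys, the geometric annualization, and the choice to report drawdown as a non-positive number are all assumptions.

```python
import numpy as np
import pandas as pd


def compute_summary_sketch(returns: pd.Series, periods_per_year: int = 252) -> dict:
    """Illustrative summary metrics (hypothetical helper, not the real one)."""
    # Clean: drop infinities and missing values before computing statistics.
    r = returns.replace([np.inf, -np.inf], np.nan).dropna()
    if r.empty:
        return {}
    equity = (1.0 + r).cumprod()  # cumulative product of gross returns
    cumulative_return = float(equity.iloc[-1] - 1.0)
    # Assumption: geometric annualization over the number of observed bars.
    ann_ret = float(equity.iloc[-1] ** (periods_per_year / len(r)) - 1.0)
    ann_vol = float(r.std(ddof=1) * np.sqrt(periods_per_year))
    # Guard the ratio against zero or undefined volatility.
    sharpe = ann_ret / ann_vol if ann_vol > 0 else 0.0
    # Drawdown relative to the running peak; values are <= 0.
    drawdown = equity / equity.cummax() - 1.0
    return {
        "cumulative_return": cumulative_return,
        "annualized_return": ann_ret,
        "annualized_vol": ann_vol,
        "sharpe": sharpe,
        "max_drawdown": float(drawdown.min()),
    }
```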

### `run_backtest(df, signal_col, returns_col, cost_per_unit_turnover=0.0, slippage_per_unit_turnover=0.0, target_vol=None, vol_col=None, max_leverage=3.0, dd_guard=True, max_drawdown=0.2, cooloff_bars=20, periods_per_year=252)`
- Vectorized backtest engine that:
  - Validates the presence of `signal_col` and `returns_col`.
  - Builds positions from the signal column (optionally vol-targeted via `vol_col`).
  - Computes turnover and transaction costs (including slippage) as a function of turnover.
  - Applies a drawdown guard that scales down exposure after drawdown breaches.
  - Calculates and returns equity curve, returns, turnover, and summary stats.
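
A stripped-down sketch of the core loop described above (validation, positions, turnover, costs, equity) may help. It omits vol targeting and the drawdown guard, and it assumes that the position held on a bar is the previous bar's signal; the source does not state whether the real engine lags the signal. `backtest_sketch` and its output column names are hypothetical.

```python
import pandas as pd


def backtest_sketch(df, signal_col, returns_col,
                    cost_per_unit_turnover=0.0, slippage_per_unit_turnover=0.0):
    """Hypothetical, simplified vectorized backtest."""
    missing = [c for c in (signal_col, returns_col) if c not in df.columns]
    if missing:
        raise KeyError(f"Missing columns: {missing}")
    # Assumption: trade on the prior bar's signal to avoid look-ahead.
    positions = df[signal_col].shift(1).fillna(0.0)
    # Turnover is the absolute change in position from one bar to the next.
    turnover = positions.diff().abs().fillna(positions.abs())
    # Costs and slippage are both charged in proportion to turnover.
    costs = turnover * (cost_per_unit_turnover + slippage_per_unit_turnover)
    net = positions * df[returns_col].fillna(0.0) - costs
    equity = (1.0 + net).cumprod()
    return pd.DataFrame({"position": positions, "turnover": turnover,
                         "net_return": net, "equity": equity})
```

Charging costs on turnover rather than on position size means a strategy that holds a constant position pays only once, on entry.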


## src/backtesting/strategies.py

### `buy_and_hold_signal(df, signal_name="signal_bh")`
- Produces a constant long-only signal (1.0) for all time steps.

### `trend_state_long_only_signal(df, state_col, signal_name="signal_trend_state_long_only")`
- Converts a trend regime/state column into a long-only signal (1.0 when state > 0).

### `trend_state_signal(df, state_col, signal_name="signal_trend_state", mode="long_short_hold")`
- Wrapper around `compute_trend_state_signal` to generate regime-based signals in a specified mode.

### `rsi_strategy(df, rsi_col, buy_level=30.0, sell_level=70.0, signal_name="signal_rsi", mode="long_short_hold")`
- Wrapper around `compute_rsi_signal` to map RSI values into signals based on thresholds.

### `momentum_strategy(df, momentum_col, long_threshold=0.0, short_threshold=None, signal_name="signal_momentum", mode="long_short_hold")`
- Wrapper around `compute_momentum_signal`, with optional short threshold, to generate momentum-driven signals.

### `stochastic_strategy(df, k_col, buy_level=20.0, sell_level=80.0, signal_name="signal_stochastic", mode="long_short_hold")`
- Wrapper around `compute_stochastic_signal` for stochastic %K signals.

### `volatility_regime_strategy(df, vol_col, quantile=0.5, signal_name="signal_volatility_regime", mode="long_short_hold")`
- Wrapper around `compute_volatility_regime_signal` that goes long in low-vol regimes and short in high-vol regimes (mode permitting).

### `probabilistic_signal(df, prob_col, signal_name="signal_prob", upper=0.55, lower=0.45)`
- Converts a probability forecast column into {-1, 0, 1} using upper/lower thresholds.
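
As a rough illustration of the thresholding, the sketch below maps probabilities above `upper` to 1, below `lower` to -1, and everything else (including NaN) to 0. Whether the real function uses strict or inclusive comparisons is not stated in the source, so the strict inequalities here are an assumption, as is the name `probabilistic_signal_sketch`.

```python
import numpy as np
import pandas as pd


def probabilistic_signal_sketch(df, prob_col, signal_name="signal_prob",
                                upper=0.55, lower=0.45):
    """Hypothetical mapping of a probability column to {-1, 0, 1}."""
    out = df.copy()
    p = out[prob_col]
    # NaN probabilities fail both comparisons and fall through to 0.
    out[signal_name] = np.select([p > upper, p < lower], [1.0, -1.0], default=0.0)
    return out
```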

### `conviction_sizing_signal(df, prob_col, signal_name="signal_prob_size", clip=1.0)`
- Maps probabilities into a continuous position size in [-clip, clip].
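
The source does not give the exact mapping, so the following is only one plausible sketch: a linear map that sends a probability of 0.5 to a flat position and clips the result to `[-clip, clip]`. The formula `2p - 1` and the name `conviction_sizing_signal_sketch` are assumptions.

```python
import pandas as pd


def conviction_sizing_signal_sketch(df, prob_col, signal_name="signal_prob_size",
                                    clip=1.0):
    """Hypothetical linear conviction sizing."""
    out = df.copy()
    # Assumed mapping: p = 0.5 -> 0, p = 1.0 -> +1, p = 0.0 -> -1.
    raw = 2.0 * out[prob_col] - 1.0
    out[signal_name] = raw.clip(lower=-clip, upper=clip)
    return out
```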

### `regime_filtered_signal(df, base_signal_col, regime_col, signal_name="signal_regime_filtered", active_value=1.0)`
- Keeps a base signal only when a regime column equals a specific value, otherwise zeroes it out.
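
A minimal sketch of this filter, assuming an exact equality test against `active_value` (the helper name `regime_filtered_signal_sketch` is hypothetical):

```python
import pandas as pd


def regime_filtered_signal_sketch(df, base_signal_col, regime_col,
                                  signal_name="signal_regime_filtered",
                                  active_value=1.0):
    """Hypothetical regime gate: pass the base signal only in the active regime."""
    out = df.copy()
    active = out[regime_col] == active_value
    # Where the regime is inactive (or NaN), the signal is set to zero.
    out[signal_name] = out[base_signal_col].where(active, 0.0)
    return out
```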


## src/models/lightgbm_baseline.py

### `default_feature_columns(df)`
- Returns a default list of feature names (if present) expected to be created by the feature pipeline.

### `LGBMBaselineConfig`
- Dataclass holding default LightGBM hyperparameters for the baseline regressor.

### `train_regressor(train_df, feature_cols, target_col, cfg=None)`
- Fits an `LGBMRegressor` to the provided training data and returns the trained model.

### `predict_returns(model, df, feature_cols, pred_col="pred_next_ret")`
- Uses a fitted model to generate return predictions and appends them as a new column.

### `prediction_to_signal(df, pred_col="pred_next_ret", signal_col="signal_lgb", long_threshold=0.0, short_threshold=None)`
- Maps predicted returns into discrete trade signals with thresholds.

### `train_test_split_time(df, train_frac=0.7)`
- Splits a DataFrame into time-ordered train/test sets without shuffling.
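
A time-ordered split can be sketched as below. Sorting by index before cutting is an assumption made here for safety; the real function may rely on the input already being ordered. `train_test_split_time_sketch` is a hypothetical name.

```python
import pandas as pd


def train_test_split_time_sketch(df, train_frac=0.7):
    """Hypothetical chronological split: earliest rows train, latest rows test."""
    if not 0.0 < train_frac < 1.0:
        raise ValueError("train_frac must be strictly between 0 and 1")
    ordered = df.sort_index()  # assumption: index carries the time ordering
    cut = int(len(ordered) * train_frac)
    return ordered.iloc[:cut], ordered.iloc[cut:]
```

Keeping the test set strictly later than the training set avoids leaking future information into the fit, which a shuffled split would do.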


## src/data/loaders.py

### `load_ohlcv(symbol, start=None, end=None, interval="1d", source="yahoo", api_key=None)`
- High-level loader that selects the proper provider (`YahooFinanceProvider` or `AlphaVantageFXProvider`) and returns OHLCV data in a standardized schema.


## src/data/validation.py

### `validate_ohlcv(df, required_columns=("open", "high", "low", "close", "volume"), allow_missing_volume=True)`
- Validates OHLCV data integrity:
  - Required columns present.
  - DatetimeIndex, monotonic ordering, and no duplicates.
  - High/low/open/close consistency checks.
  - Volume NaN handling depending on `allow_missing_volume`.
- Raises `ValueError` with detailed errors if invalid.
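
The checks above might be collected along these lines. The error messages, the helper `collect_ohlcv_errors`, and the exact consistency rules (high at least as large as open, close, and low; low no larger than open and close) are assumptions chosen for illustration.

```python
import pandas as pd

PRICE_COLS = ("open", "high", "low", "close")


def collect_ohlcv_errors(df, required_columns=PRICE_COLS + ("volume",),
                         allow_missing_volume=True):
    """Hypothetical helper: return a list of human-readable problems."""
    errors = []
    missing = [c for c in required_columns if c not in df.columns]
    if missing:
        errors.append(f"missing columns: {missing}")
    if not isinstance(df.index, pd.DatetimeIndex):
        errors.append("index must be a DatetimeIndex")
    else:
        if not df.index.is_monotonic_increasing:
            errors.append("index must be sorted ascending")
        if df.index.has_duplicates:
            errors.append("index contains duplicate timestamps")
    if all(c in df.columns for c in PRICE_COLS):
        if (df["high"] < df[["open", "close", "low"]].max(axis=1)).any():
            errors.append("high is below open, close, or low on some bars")
        if (df["low"] > df[["open", "close"]].min(axis=1)).any():
            errors.append("low is above open or close on some bars")
    if ("volume" in df.columns and not allow_missing_volume
            and df["volume"].isna().any()):
        errors.append("volume contains NaN")
    return errors


def validate_ohlcv_sketch(df, **kwargs):
    """Raise a single ValueError that lists every problem found."""
    errors = collect_ohlcv_errors(df, **kwargs)
    if errors:
        raise ValueError("Invalid OHLCV data: " + "; ".join(errors))
```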


## src/data/providers/base.py

### `MarketDataProvider.get_ohlcv(...)`
- Abstract method that provider implementations must define to return OHLCV data.


## src/data/providers/yahoo.py

### `YahooFinanceProvider.get_ohlcv(symbol, start=None, end=None, interval="1d")`
- Uses `yfinance` to download OHLCV data, normalizes column names, validates expected fields, and cleans the index.


## src/data/providers/alphavantage.py

### `AlphaVantageFXProvider.get_ohlcv(symbol, start=None, end=None, interval="1d")`
- Calls the Alpha Vantage FX_DAILY API and converts JSON to OHLCV DataFrame.
- Validates API key and symbol format (e.g., `EURUSD`).
- Adds a `volume` column (0.0) since FX data does not provide volume.


## src/utils/config.py

### `ConfigError`
- Custom exception raised for invalid or inconsistent experiment configs.

### `_resolve_config_path(config_path)`
- Resolves a config path relative to `config/` or project root and ensures it exists.

### `_load_yaml(path)`
- Loads YAML and ensures the top-level is a dictionary.

### `_deep_update(base, updates)`
- Recursive merge of nested dicts; scalars and lists are overwritten.
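
A recursive merge with these semantics can be sketched as follows. Returning a new dictionary instead of mutating `base` is an assumption; the source does not say which the real helper does. `deep_update_sketch` is a hypothetical name.

```python
def deep_update_sketch(base: dict, updates: dict) -> dict:
    """Hypothetical recursive merge: nested dicts merge, everything else overwrites."""
    merged = dict(base)
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge key by key.
            merged[key] = deep_update_sketch(merged[key], value)
        else:
            # Scalars and lists replace the existing value wholesale.
            merged[key] = value
    return merged
```

Note that lists are replaced rather than concatenated, so a child config that overrides a list must restate every element it wants to keep.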

### `_load_with_extends(path, seen=None)`
- Loads a config file and resolves inheritance via `extends` while preventing cycles.
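
The inheritance and cycle check can be illustrated without touching the file system. In the sketch below, `sources` is an in-memory stand-in for YAML files on disk, the merge is shallow for brevity, and the `ValueError` is an assumed exception type (the real loader may raise `ConfigError`). All names are hypothetical.

```python
def load_with_extends_sketch(name, sources, seen=None):
    """Hypothetical resolver for an 'extends' chain with cycle detection."""
    seen = set() if seen is None else seen
    if name in seen:
        raise ValueError(f"Cyclic 'extends' chain at {name!r}")
    seen.add(name)
    raw = dict(sources[name])  # stand-in for reading and parsing one file
    parent = raw.pop("extends", None)
    if parent is None:
        return raw
    base = load_with_extends_sketch(parent, sources, seen)
    # Shallow merge for brevity: keys in the child override the parent.
    return {**base, **raw}
```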

### `_default_risk_block(risk)`
- Applies defaults for cost, slippage, target volatility, leverage, and drawdown guard settings.

### `_default_backtest_block(backtest)`
- Applies defaults for backtest settings (e.g., `periods_per_year`, `returns_type`).

### `_resolve_logging_block(logging_cfg, config_path)`
- Resolves logging output directory and default run name.

### `_validate_data_block(data)`
- Ensures required data settings (symbol, source, interval) are valid.

### `_inject_api_key_from_env(data)`
- Pulls API keys from environment into the config when requested.

### `_validate_features_block(features)`
- Ensures features are a list of `step` dictionaries.

### `_validate_model_block(model)`
- Ensures a model `kind` string is defined.

### `_validate_signals_block(signals)`
- Ensures a signals `kind` string is defined.

### `_validate_risk_block(risk)`
- Validates numeric bounds for costs, vol targeting, leverage, and drawdown guard values.

### `_validate_backtest_block(backtest)`
- Ensures `returns_col` and `signal_col` are defined, and validates `returns_type`.

### `load_experiment_config(config_path)`
- Full config loader that applies inheritance, defaults, validation, and logging resolution.


## src/utils/paths.py

### `in_project(*parts)`
- Joins a path relative to the project root.

### `ensure_directories_exist()`
- Creates expected project directories (config, data, logs, etc.).

### `describe_paths()`
- Prints core project paths for debugging.


## src/experiments/registry.py

### Registries (`FEATURE_REGISTRY`, `SIGNAL_REGISTRY`, `MODEL_REGISTRY`)
- Mapping tables linking string names (in YAML config) to Python functions.

### `get_feature_fn(name)`
- Returns the feature function for a named step, or raises if unknown.

### `get_signal_fn(name)`
- Returns the signal function for a named step, or raises if unknown.

### `get_model_fn(name)`
- Returns the model function for a named step, or raises if unknown.
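
The lookup pattern shared by the three getters can be sketched with a plain dictionary. The registry contents, the toy `_double` function, and the use of `KeyError` are assumptions; the source only says that an unknown name raises.

```python
from typing import Callable, Dict


def _double(values):
    """Toy feature function used only for this illustration."""
    return [2 * v for v in values]


# Hypothetical registry: config strings map to callables.
FEATURE_REGISTRY_SKETCH: Dict[str, Callable] = {"double": _double}


def get_feature_fn_sketch(name: str) -> Callable:
    """Look up a callable by its config name, with a helpful error if unknown."""
    try:
        return FEATURE_REGISTRY_SKETCH[name]
    except KeyError:
        known = ", ".join(sorted(FEATURE_REGISTRY_SKETCH))
        raise KeyError(f"Unknown feature step {name!r}; known steps: {known}") from None
```

Listing the known names in the error message makes typos in YAML configs quick to diagnose.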


## src/experiments/models.py

### `infer_feature_columns(df, explicit_cols=None, exclude=None)`
- Determines feature columns for model training, using an explicit list, defaults, or heuristics over numeric columns.

### `_build_forward_return_target(df, target_cfg=None)`
- Builds a forward return target column and a binary label column based on threshold/quantiles.

### `train_lightgbm_classifier(df, model_cfg, returns_col=None)`
- Trains a LightGBM classifier using a time-based split and writes prediction probabilities to the dataset.
- Returns the updated DataFrame, trained model, and metadata including split info and target details.


## src/experiments/runner.py

### `ExperimentResult`
- Dataclass storing the resolved config, data, backtest result, model, model metadata, and artifact paths.

### `_apply_feature_steps(df, steps)`
- Applies each feature step in config order via the registry.

### `_apply_model_step(df, model_cfg, returns_col)`
- Trains the model if `kind != "none"`; otherwise returns the data unchanged.

### `_apply_signal_step(df, signals_cfg)`
- Generates signal columns via registry; supports DataFrame or Series outputs.

### `_resolve_vol_col(df, backtest_cfg, risk_cfg)`
- Finds which volatility column to use for targeting (config or heuristics).

### `_validate_returns_series(returns, returns_type)`
- Performs validation on return series; currently checks simple returns for values < -1.

### `_save_artifacts(run_dir, cfg, data, bt, model_meta)`
- Writes config and backtest artifacts (summary, equity, returns, positions, turnover) to disk.

### `run_experiment(config_path)`
- End-to-end pipeline: load config, load data, apply features, train model, generate signals, run backtest, and optionally log artifacts.


## src/features/returns.py

### `compute_returns(prices, log=False, dropna=True)`
- Computes simple returns (`P_t / P_{t-1} - 1`) or log returns (`log(P_t / P_{t-1})`).
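
Both return definitions follow directly from the price ratio, as the sketch below shows. The name `compute_returns_sketch` is hypothetical, and the real function may differ in details such as how the output series is named.

```python
import numpy as np
import pandas as pd


def compute_returns_sketch(prices: pd.Series, log: bool = False,
                           dropna: bool = True) -> pd.Series:
    """Hypothetical simple or log returns from a price series."""
    ratio = prices / prices.shift(1)  # P_t / P_{t-1}
    out = np.log(ratio) if log else ratio - 1.0
    # The first bar has no predecessor, so its return is NaN.
    return out.dropna() if dropna else out
```

Log returns add across time, so their sum over a window equals the log of the total price ratio over that window.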

### `add_close_returns(df, log=False, col_name=None)`
- Adds a close returns column to the DataFrame using `compute_returns`.


## src/features/lags.py

### `add_lagged_features(df, cols, lags=(1,2,5), prefix="lag")`
- Adds lagged versions of the specified columns using the given lags.


## src/features/volatility.py

### `compute_rolling_vol(returns, window, ddof=1, annualization_factor=None)`
- Computes rolling volatility from returns, optionally annualized.

### `compute_ewma_vol(returns, span, annualization_factor=None)`
- Computes EWMA volatility from returns, optionally annualized.

### `add_volatility_features(df, returns_col="close_logret", rolling_windows=(10,20,60), ewma_spans=(10,20), annualization_factor=252.0, inplace=False)`
- Adds rolling and EWMA volatility features to the DataFrame.


## src/features/technical/indicators.py

### `compute_true_range(high, low, close)`
- Computes true range as the maximum of `high - low`, `|high - prev_close|`, and `|low - prev_close|`.
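
The three-way maximum can be written in a few lines of pandas. In this sketch (hypothetical name `compute_true_range_sketch`), the first bar has no previous close, so its true range falls back to `high - low`; how the real function treats that bar is not stated in the source.

```python
import pandas as pd


def compute_true_range_sketch(high: pd.Series, low: pd.Series,
                              close: pd.Series) -> pd.Series:
    """Hypothetical true range: the largest of three candidate ranges per bar."""
    prev_close = close.shift(1)
    candidates = pd.concat(
        [high - low, (high - prev_close).abs(), (low - prev_close).abs()],
        axis=1,
    )
    # max(axis=1) skips NaN, so the first bar uses high - low only.
    return candidates.max(axis=1)
```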

### `compute_atr(high, low, close, window=14, method="wilder")`
- Computes Average True Range (ATR) using Wilder smoothing or SMA.

### `add_bollinger_bands(close, window=20, n_std=2.0)`
- Builds Bollinger band features including band width and %B.

### `compute_macd(close, fast=12, slow=26, signal=9)`
- Computes MACD line, signal line, and histogram.

### `compute_ppo(close, fast=12, slow=26, signal=9)`
- Computes Percentage Price Oscillator (PPO) and related features.

### `compute_roc(close, window=10)`
- Computes Rate of Change (ROC) indicator.

### `compute_volume_zscore(volume, window=20)`
- Computes rolling z-score of volume.

### `compute_adx(high, low, close, window=14)`
- Computes ADX along with DI+ and DI-.

### `compute_mfi(high, low, close, volume, window=14)`
- Computes Money Flow Index (MFI).

### `add_indicator_features(df, price_col="close", high_col="high", low_col="low", volume_col="volume", ...)`
- Adds a large bundle of indicator features (Bollinger, MACD, PPO, ROC, ATR, volume z-score, ADX, MFI) to the DataFrame.


## src/features/technical/trend.py

### `compute_sma(prices, window, min_periods=None)`
- Computes Simple Moving Average and names it based on input series.

### `compute_ema(prices, span, adjust=False)`
- Computes Exponential Moving Average and names it based on input series.

### `add_trend_features(df, price_col="close", sma_windows=(20,50,200), ema_spans=(20,50), inplace=False)`
- Adds SMA/EMA and relative price vs MA features.

### `add_trend_regime_features(df, price_col="close", base_sma_for_sign=50, short_sma=20, long_sma=50, inplace=False)`
- Adds trend regime/state features using SMA cross and sign of `price_over_sma`.


## src/features/technical/momentum.py

### `compute_price_momentum(prices, window)`
- Computes price momentum as `P_t / P_{t-window} - 1`.

### `compute_return_momentum(returns, window)`
- Computes cumulative return momentum over a window.

### `compute_vol_normalized_momentum(returns, volatility, window, eps=1e-8)`
- Computes momentum normalized by volatility.

### `add_momentum_features(df, price_col="close", returns_col="close_logret", vol_col="vol_rolling_20", windows=(5,20,60), inplace=False)`
- Adds price momentum, return momentum, and vol-normalized momentum features.


## src/features/technical/oscillators.py

### `compute_rsi(prices, window=14, method="wilder")`
- Computes RSI using Wilder (EWMA) or simple average method.
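
One common way to implement the two methods is sketched below. Wilder smoothing is expressed as an exponentially weighted mean with `alpha = 1 / window`; the identity `RSI = 100 * avg_gain / (avg_gain + avg_loss)` is algebraically the same as the usual `100 - 100 / (1 + RS)` form and avoids dividing by a zero average loss. The function name and the `"sma"` fallback label are assumptions.

```python
import pandas as pd


def compute_rsi_sketch(prices: pd.Series, window: int = 14,
                       method: str = "wilder") -> pd.Series:
    """Hypothetical RSI with Wilder (EWMA) or simple-average smoothing."""
    delta = prices.diff()
    gain = delta.clip(lower=0.0)     # positive moves only
    loss = (-delta).clip(lower=0.0)  # magnitudes of negative moves
    if method == "wilder":
        avg_gain = gain.ewm(alpha=1.0 / window, adjust=False,
                            min_periods=window).mean()
        avg_loss = loss.ewm(alpha=1.0 / window, adjust=False,
                            min_periods=window).mean()
    else:  # any other value is treated as a simple rolling average here
        avg_gain = gain.rolling(window).mean()
        avg_loss = loss.rolling(window).mean()
    # NaN results when both averages are zero (a perfectly flat window).
    return 100.0 * avg_gain / (avg_gain + avg_loss)
```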

### `compute_stoch_k(close, high, low, window=14)`
- Computes stochastic %K.

### `compute_stoch_d(k, smooth=3)`
- Computes stochastic %D (moving average of %K).
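
A compact sketch of %K and %D under the standard definitions: %K places the close within the rolling high-low range, and %D is a simple moving average of %K. Replacing a zero range with NaN is an assumption about edge-case handling, and both function names are hypothetical.

```python
import numpy as np
import pandas as pd


def compute_stoch_k_sketch(close, high, low, window=14):
    """Hypothetical %K: position of the close within the rolling range, 0 to 100."""
    lowest = low.rolling(window).min()
    highest = high.rolling(window).max()
    # Assumption: a zero-width range yields NaN instead of dividing by zero.
    span = (highest - lowest).replace(0.0, np.nan)
    return 100.0 * (close - lowest) / span


def compute_stoch_d_sketch(k, smooth=3):
    """Hypothetical %D: simple moving average of %K."""
    return k.rolling(smooth).mean()
```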

### `add_oscillator_features(df, price_col="close", high_col="high", low_col="low", rsi_windows=(14,), stoch_windows=(14,), stoch_smooth=3, inplace=False)`
- Adds RSI and stochastic oscillator features to the DataFrame.


## src/features/technical/__init__.py & src/features/__init__.py

- Expose feature functions at package level for easier imports and registry usage.


## src/signals/trend_signal.py

### `compute_trend_state_signal(df, state_col, signal_col="trend_state_signal", long_value=1.0, flat_value=0.0, short_value=-1.0, mode="long_short_hold")`
- Converts a trend state column into a trading signal based on `mode`.


## src/signals/momentum_signal.py

### `compute_momentum_signal(df, momentum_col, long_threshold=0.0, short_threshold=None, signal_col="momentum_signal", mode="long_short_hold")`
- Converts a momentum feature into discrete trade signals.


## src/signals/rsi_signal.py

### `compute_rsi_signal(df, rsi_col, buy_level, sell_level, signal_col="rsi_signal", mode="long_short_hold")`
- Generates signals based on RSI thresholds for buy/sell levels.


## src/signals/volatility_signal.py

### `compute_volatility_regime_signal(df, vol_col, quantile=0.5, signal_col="volatility_regime_signal", mode="long_short_hold")`
- Generates regime signals: long when `vol_col` is at or below its `quantile` threshold, short when it is above.
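
The source does not say how the threshold is estimated. The sketch below assumes a single threshold taken over the whole sample, which is simple but uses future data; a rolling or expanding quantile would avoid that. It also ignores `mode` and maps NaN volatility to 0. The name `volatility_regime_signal_sketch` is hypothetical.

```python
import numpy as np
import pandas as pd


def volatility_regime_signal_sketch(df, vol_col, quantile=0.5,
                                    signal_col="volatility_regime_signal"):
    """Hypothetical long/short regime signal from a volatility column."""
    out = df.copy()
    vol = out[vol_col]
    # Assumption: one full-sample threshold (this looks ahead in time).
    threshold = vol.quantile(quantile)
    out[signal_col] = np.select(
        [vol <= threshold, vol > threshold], [1.0, -1.0], default=0.0
    )
    return out
```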


## src/signals/stochastic_signal.py

### `compute_stochastic_signal(df, k_col, buy_level=20.0, sell_level=80.0, signal_col="stochastic_signal", mode="long_short_hold")`
- Generates signals based on stochastic %K thresholds.


## src/risk/position_sizing.py

### `compute_vol_target_leverage(vol, target_vol, max_leverage=3.0, min_leverage=0.0, eps=1e-8)`
- Computes target leverage as `target_vol / vol`, clipped to allowed bounds.
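
The leverage rule and its use in scaling a signal can be sketched as below. Flooring volatility at `eps` before dividing is an assumption about how `eps` is applied, and the function names are hypothetical.

```python
import pandas as pd


def compute_vol_target_leverage_sketch(vol, target_vol, max_leverage=3.0,
                                       min_leverage=0.0, eps=1e-8):
    """Hypothetical leverage = target_vol / vol, clipped to the allowed bounds."""
    safe_vol = vol.clip(lower=eps)  # assumption: eps floors the denominator
    return (target_vol / safe_vol).clip(lower=min_leverage, upper=max_leverage)


def scale_signal_by_vol_sketch(signal, vol, target_vol, **kwargs):
    """Multiply the signal by the clipped vol-target leverage."""
    return signal * compute_vol_target_leverage_sketch(vol, target_vol, **kwargs)
```

For example, with `target_vol=0.1`, a bar with realized volatility of 0.2 gets leverage 0.5, while a bar with near-zero volatility is capped at `max_leverage`.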

### `scale_signal_by_vol(signal, vol, target_vol, max_leverage=3.0, min_leverage=0.0, eps=1e-8)`
- Scales a trading signal by volatility-target leverage.


## src/risk/controls.py

### `compute_drawdown(equity)`
- Computes drawdown series from equity curve.

### `drawdown_cooloff_multiplier(equity, max_drawdown=0.2, cooloff_bars=20, min_exposure=0.0)`
- When drawdown breaches threshold, reduces exposure for a cooling-off period.
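
The two helpers can be illustrated as follows. The sketch assumes that exposure drops to `min_exposure` on the breach bar itself and stays there for `cooloff_bars` bars in total, restarting the count on each new breach; the real timing convention is not stated in the source. Function names are hypothetical.

```python
import pandas as pd


def compute_drawdown_sketch(equity: pd.Series) -> pd.Series:
    """Hypothetical drawdown: fractional decline from the running peak (<= 0)."""
    return equity / equity.cummax() - 1.0


def drawdown_cooloff_multiplier_sketch(equity, max_drawdown=0.2,
                                       cooloff_bars=20, min_exposure=0.0):
    """Hypothetical exposure multiplier: 1.0 normally, min_exposure while cooling off."""
    breached = (compute_drawdown_sketch(equity) <= -max_drawdown).to_numpy()
    mult = pd.Series(1.0, index=equity.index)
    remaining = 0
    for i, hit in enumerate(breached):
        if hit:
            remaining = cooloff_bars  # a new breach restarts the cool-off
        if remaining > 0:
            mult.iloc[i] = min_exposure
            remaining -= 1
    return mult
```

In a live backtest the equity curve itself depends on the exposure applied, so a guard like this would normally be evaluated bar by bar inside the engine loop.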


## src/data/__init__.py, src/models/__init__.py, src/signals/__init__.py, src/risk/__init__.py, src/evaluation/__init__.py

- Package initializers that expose submodules or mark packages for import; no functions defined.


## tests/conftest.py

### Path setup
- Ensures the project root is on `sys.path` so tests can import `src.*` modules.


## tests/test_core.py

### `test_compute_returns_simple_and_log()`
- Validates numeric correctness of simple and log return computations.

### `test_add_trend_features_columns()`
- Ensures trend features create expected columns.

### `test_validate_ohlcv_flags_invalid_high_low()`
- Ensures invalid OHLCV inputs are rejected.

### `test_run_backtest_costs_and_slippage_reduce_returns()`
- Ensures that applying costs and slippage reduces overall strategy returns.
