# Create Your Own Signal Detector

SignalFlow v0.5.0 provides a flexible detector framework for building custom trading signal generators. Detectors encapsulate the logic of transforming raw market data into actionable signals.

**What you'll learn:**
- The `SignalDetector` lifecycle: `preprocess()` extracts features, `detect()` generates signals
- How to register detectors with the `@sf_component` decorator for discovery and reuse
- How to combine the signals of several detectors using an aggregation mode

**Prerequisites:** [01 - Quick Start](01_quickstart.ipynb)

## 1. Setup & Data

In [1]:
from datetime import datetime
from pathlib import Path

import polars as pl
import signalflow as sf
from signalflow.data.source import VirtualDataProvider
from signalflow.data.raw_store import DuckDbSpotStore
from signalflow.data import RawDataFactory

# Generate synthetic OHLCV data (no API keys required)
db_path = Path("/tmp/custom_detector_tutorial.duckdb")
store = DuckDbSpotStore(db_path=db_path)
VirtualDataProvider(store=store, seed=42).download(
    pairs=["BTCUSDT", "ETHUSDT", "SOLUSDT"],
    n_bars=15_000,
)

# Load into RawData container
raw_data = RawDataFactory.from_duckdb_spot_store(
    spot_store_path=db_path,
    pairs=["BTCUSDT", "ETHUSDT", "SOLUSDT"],
    start=datetime(2020, 1, 1),
    end=datetime(2030, 1, 1),
)

print(f"Loaded {len(raw_data.pairs)} pairs")
print(f"Spot data shape: {raw_data.get('spot').shape}")

2026-02-15 00:49:41.958 | INFO     | signalflow.data.raw_store.duckdb_stores:_ensure_tables:153 - Database initialized: /tmp/custom_detector_tutorial.duckdb (data_type=spot, timeframe=1m)
2026-02-15 00:49:42.075 | DEBUG    | signalflow.data.raw_store.duckdb_stores:insert_klines:220 - Inserted 15,000 rows for BTCUSDT
2026-02-15 00:49:42.077 | INFO     | signalflow.data.source.virtual:download:255 - VirtualDataProvider: generated 15000 bars for BTCUSDT
2026-02-15 00:49:42.174 | DEBUG    | signalflow.data.raw_store.duckdb_stores:insert_klines:220 - Inserted 15,000 rows for ETHUSDT
2026-02-15 00:49:42.176 | INFO     | signalflow.data.source.virtual:download:255 - VirtualDataProvider: generated 15000 bars for ETHUSDT

Loaded 3 pairs
Spot data shape: (45000, 8)


## 2. Built-in Detector Example

SignalFlow ships with `ExampleSmaCrossDetector`, a simple SMA crossover detector registered as `"example/sma_cross"`. It demonstrates the core pattern:

1. **Features** are declared in `__post_init__()` -- the base class runs them automatically in `preprocess()`
2. **`detect()`** receives the feature-enriched DataFrame and returns a `Signals` object
3. **`Signals`** wraps a Polars DataFrame with columns: `pair`, `timestamp`, `signal_type`, `signal`

Let's run it and inspect the output.

In [2]:
from signalflow.core import RawDataView
from signalflow.detector import ExampleSmaCrossDetector

# Create detector and run
sma_detector = ExampleSmaCrossDetector(fast_period=20, slow_period=50)
view = RawDataView(raw=raw_data)
sma_signals = sma_detector.run(view)

# Inspect signal output
print(f"Signal DataFrame columns: {sma_signals.value.columns}")
print(f"Total signals: {sma_signals.value.height}")
print()

# Signal count per pair
print("Signals per pair:")
print(
    sma_signals.value
    .group_by("pair")
    .agg(pl.len().alias("count"))
    .sort("pair")
)
print()

# Signal type distribution
print("Signal type distribution:")
print(
    sma_signals.value
    .group_by("signal_type")
    .agg(pl.len().alias("count"))
    .sort("signal_type")
)

Signal DataFrame columns: ['pair', 'timestamp', 'signal_type', 'signal']
Total signals: 1022

Signals per pair:
shape: (3, 2)
┌─────────┬───────┐
│ pair    ┆ count │
│ ---     ┆ ---   │
│ str     ┆ u32   │
╞═════════╪═══════╡
│ BTCUSDT ┆ 333   │
│ ETHUSDT ┆ 346   │
│ SOLUSDT ┆ 343   │
└─────────┴───────┘

Signal type distribution:
shape: (2, 2)
┌─────────────┬───────┐
│ signal_type ┆ count │
│ ---         ┆ ---   │
│ str         ┆ u32   │
╞═════════════╪═══════╡
│ fall        ┆ 511   │
│ rise        ┆ 511   │
└─────────────┴───────┘
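
For intuition about what a crossover detector computes, the sketch below implements a generic SMA crossover in plain Python. It is not the source of `ExampleSmaCrossDetector`; the helper names `sma` and `sma_cross_signals` are hypothetical, and the mapping of an upward cross to `"rise"` and a downward cross to `"fall"` is an assumption based on the signal types shown in the output above.

```python
def sma(values, period):
    """Simple moving average; None until `period` values are available."""
    out = []
    window_sum = 0.0
    for i, v in enumerate(values):
        window_sum += v
        if i >= period:
            window_sum -= values[i - period]  # drop the value leaving the window
        out.append(window_sum / period if i >= period - 1 else None)
    return out


def sma_cross_signals(closes, fast_period, slow_period):
    """Return (bar_index, signal_type) at bars where the fast SMA crosses the slow SMA."""
    fast = sma(closes, fast_period)
    slow = sma(closes, slow_period)
    signals = []
    for i in range(1, len(closes)):
        if None in (fast[i - 1], slow[i - 1], fast[i], slow[i]):
            continue  # both averages must be defined on this bar and the previous one
        prev_diff = fast[i - 1] - slow[i - 1]
        curr_diff = fast[i] - slow[i]
        if prev_diff <= 0 < curr_diff:
            signals.append((i, "rise"))   # fast average moved above the slow one
        elif prev_diff >= 0 > curr_diff:
            signals.append((i, "fall"))   # fast average moved below the slow one
    return signals


closes = [10, 10, 10, 9, 8, 7, 8, 10, 12, 14, 13, 11, 9, 7]
print(sma_cross_signals(closes, fast_period=2, slow_period=4))
# [(7, 'rise'), (11, 'fall')]
```

Because upward and downward crosses must alternate, a crossover rule of this kind produces nearly equal numbers of both signal types, which is consistent with the 511/511 split above.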


## 3. Create an RSI Threshold Detector

Now let's build a custom detector from scratch. The pattern is:

1. **Subclass `SignalDetector`** with `@dataclass` and `@sf_component(name="...")`
2. **Set `allowed_signal_types`** to declare which signal types this detector produces
3. **Implement `detect()`** -- receive a Polars DataFrame, return `Signals`

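Our RSI detector emits a `"rise"` signal on every bar where RSI is below an oversold threshold, indicating a potential buying opportunity. Because it fires on each such bar rather than only on the first bar of an oversold stretch, consecutive bars can all produce signals.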

In [3]:
from dataclasses import dataclass
from typing import Any, ClassVar

import polars as pl
from signalflow.core import RawDataView, Signals, SfComponentType, sf_component
from signalflow.detector.base import SignalDetector


@dataclass
@sf_component(name="tutorial/rsi_oversold")
class RsiOversoldDetector(SignalDetector):
    """Emits a "rise" signal on every bar where RSI is below the oversold threshold."""

    component_type: ClassVar[SfComponentType] = SfComponentType.DETECTOR
    rsi_period: int = 14
    oversold_threshold: float = 30.0
    allowed_signal_types: set[str] | None = None

    def __post_init__(self):
        self.allowed_signal_types = {"rise"}

    def detect(self, features: pl.DataFrame, context: dict[str, Any] | None = None) -> Signals:
        pair_col = self.pair_col
        ts_col = self.ts_col

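        # Compute RSI per pair from simple rolling means of gains and losses
        # (a simple-moving-average RSI, not Wilder's smoothed variant)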
        delta = pl.col("close").diff().over(pair_col)
        gain = delta.clip(lower_bound=0).rolling_mean(self.rsi_period).over(pair_col)
        loss = (-delta.clip(upper_bound=0)).rolling_mean(self.rsi_period).over(pair_col)
        rs = gain / loss
        rsi = 100 - (100 / (1 + rs))

        df = features.with_columns(rsi.alias("_rsi"))

        # Generate signals where RSI < threshold
        signals_df = (
            df.filter(pl.col("_rsi") < self.oversold_threshold)
            .select([
                pair_col,
                ts_col,
                pl.lit("rise").alias("signal_type"),
                pl.lit(1).alias("signal"),
            ])
        )
        return Signals(signals_df)


print(f"Detector registered: {RsiOversoldDetector.__name__}")

Detector registered: RsiOversoldDetector
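
The RSI expression inside `detect()` can be checked by hand on a short series. The sketch below repeats the same arithmetic in plain Python for a single pair: average gain and average loss over the last `period` price changes, then `100 - 100 / (1 + gain / loss)`. The function name `simple_rsi` is hypothetical, and the handling of a window with no losses is a choice made for this sketch rather than a statement about how Polars evaluates the division.

```python
def simple_rsi(closes, period):
    """RSI from simple rolling means of gains and losses over `period` price changes."""
    deltas = [b - a for a, b in zip(closes, closes[1:])]
    out = [None] * len(closes)
    for i in range(period, len(closes)):
        window = deltas[i - period:i]  # the `period` most recent price changes up to bar i
        avg_gain = sum(max(d, 0) for d in window) / period
        avg_loss = sum(max(-d, 0) for d in window) / period
        if avg_loss == 0:
            # No losses in the window: treat RSI as 100 if there were gains,
            # and as undefined if the price was completely flat.
            out[i] = 100.0 if avg_gain > 0 else None
        else:
            rs = avg_gain / avg_loss
            out[i] = 100 - 100 / (1 + rs)
    return out


closes = [100, 102, 101, 99, 98, 97]
values = simple_rsi(closes, period=3)
print([None if v is None else round(v, 1) for v in values])
# [None, None, None, 40.0, 0.0, 0.0]
```

At bar 3 the last three changes are +2, -1, -2, so the average gain is 2/3, the average loss is 1, and RSI is 40. From bar 4 onward the window contains only losses, so RSI drops to 0.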


## 4. Test the Custom Detector

Run the detector the same way as any built-in detector: create a `RawDataView` and call `.run()`.

In [4]:
# Create detector instance and run
rsi_detector = RsiOversoldDetector(rsi_period=14, oversold_threshold=30)
view = RawDataView(raw=raw_data)
rsi_signals = rsi_detector.run(view)

print(f"Total RSI oversold signals: {rsi_signals.value.height}")
print()

# Signal stats per pair
print("Signals per pair:")
print(
    rsi_signals.value
    .group_by("pair")
    .agg(pl.len().alias("count"))
    .sort("pair")
)
print()

# Show first few signals
print("Sample signals:")
print(rsi_signals.value.head(5))

Total RSI oversold signals: 5762

Signals per pair:
shape: (3, 2)
┌─────────┬───────┐
│ pair    ┆ count │
│ ---     ┆ ---   │
│ str     ┆ u32   │
╞═════════╪═══════╡
│ BTCUSDT ┆ 1852  │
│ ETHUSDT ┆ 1854  │
│ SOLUSDT ┆ 2056  │
└─────────┴───────┘

Sample signals:
shape: (5, 4)
┌─────────┬─────────────────────┬─────────────┬────────┐
│ pair    ┆ timestamp           ┆ signal_type ┆ signal │
│ ---     ┆ ---                 ┆ ---         ┆ ---    │
│ str     ┆ datetime[μs]        ┆ str         ┆ i32    │
╞═════════╪═════════════════════╪═════════════╪════════╡
│ BTCUSDT ┆ 2024-01-01 01:32:00 ┆ rise        ┆ 1      │
│ BTCUSDT ┆ 2024-01-01 01:33:00 ┆ rise        ┆ 1      │
│ BTCUSDT ┆ 2024-01-01 01:34:00 ┆ rise        ┆ 1      │
│ BTCUSDT ┆ 2024-01-01 01:35:00 ┆ rise        ┆ 1      │
│ BTCUSDT ┆ 2024-01-01 01:53:00 ┆ rise        ┆ 1      │
└─────────┴─────────────────────┴─────────────┴────────┘
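
The sample shows signals on consecutive minutes (01:32 through 01:35) because the detector fires on every bar below the threshold. If one signal per oversold episode is preferred, a common alternative is to keep only the bar where RSI first crosses below the threshold. The plain-Python sketch below illustrates that rule; `cross_below` is a hypothetical helper, not part of SignalFlow. Inside the detector, the equivalent would be to compare `_rsi` with its previous value within each pair.

```python
def cross_below(values, threshold):
    """Indices of the first bar in each run of values below `threshold`."""
    hits = []
    prev_below = False
    for i, v in enumerate(values):
        below = v is not None and v < threshold  # undefined values never count as below
        if below and not prev_below:
            hits.append(i)
        prev_below = below
    return hits


rsi_values = [None, 45.0, 28.0, 25.0, 27.0, 33.0, 29.0, 31.0]
print(cross_below(rsi_values, threshold=30.0))
# [2, 6]
```

The run at indices 2 to 4 yields a single signal at index 2, and the separate dip at index 6 yields another.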


## 5. Use in a Backtest

Custom detectors work seamlessly with the `sf.Backtest()` fluent API. You can pass either:
- A **registry name** (string): `"tutorial/rsi_oversold"`
- A **detector instance**: `RsiOversoldDetector(...)`

In [5]:
# Option A: pass detector instance directly
result = (
    sf.Backtest("rsi_oversold")
    .data(raw=raw_data)
    .detector(RsiOversoldDetector(rsi_period=14, oversold_threshold=25))
    .exit(tp=0.03, sl=0.015)
    .capital(50_000)
    .run()
)

print(result.summary())
print()

# Option B: use registry name (works because @sf_component registered it)
result_via_registry = (
    sf.Backtest("rsi_oversold_registry")
    .data(raw=raw_data)
    .detector("tutorial/rsi_oversold", rsi_period=14, oversold_threshold=25)
    .exit(tp=0.03, sl=0.015)
    .capital(50_000)
    .run()
)

print(f"Registry-based result: {result_via_registry.n_trades} trades, "
      f"win rate {result_via_registry.win_rate:.1%}")

2026-02-15 00:49:42.467 | DEBUG    | signalflow.core.registry:_discover_internal_packages:152 - autodiscover: failed to import signalflow.detector.adapter


Backtesting: 100%|██████████| 15000/15000 [00:01<00:00, 7790.34it/s]



           BACKTEST SUMMARY
  Trades:                3270
  Win Rate:              0.0%
  Profit Factor:         0.00
--------------------------------------------------
  Initial Capital: $   50,000.00
  Final Capital:   $        0.00
  Total Return:       -100.0%
--------------------------------------------------




Backtesting: 100%|██████████| 15000/15000 [00:02<00:00, 5467.10it/s]

Registry-based result: 3270 trades, win rate 0.0%
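
To put the exit parameters in context, the sketch below works through the arithmetic of `tp=0.03` and `sl=0.015`. It assumes, without confirmation from the source, that both values are fractional distances from the entry price of a long position (3% above and 1.5% below) and it ignores fees and slippage. The function names are hypothetical.

```python
def exit_levels(entry_price: float, tp: float, sl: float) -> tuple[float, float]:
    """Take-profit and stop-loss prices for a long position (assumed fractional offsets)."""
    return entry_price * (1 + tp), entry_price * (1 - sl)


def breakeven_win_rate(tp: float, sl: float) -> float:
    """Win rate p at which p * tp equals (1 - p) * sl, ignoring fees."""
    return sl / (tp + sl)


tp_price, sl_price = exit_levels(100.0, tp=0.03, sl=0.015)
print(f"take-profit at {tp_price:.2f}, stop-loss at {sl_price:.2f}")
print(f"break-even win rate: {breakeven_win_rate(0.03, 0.015):.1%}")
```

Under these assumptions, a 2:1 reward-to-risk ratio breaks even when roughly one trade in three hits its take-profit level.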





## 6. Multi-Detector Strategy

SignalFlow supports combining multiple detectors with configurable aggregation. Each detector runs independently, and their signals are merged according to the chosen mode:

| Mode | Description |
|------|-------------|
| `"merge"` | Sequential merge via `Signals.__add__` (last detector has priority) |
| `"any"` | Signal fires if **any** detector agrees |
| `"majority"` | Signal fires if **majority** of detectors agree |
| `"unanimous"` | Signal fires only if **all** detectors agree |
| `"weighted"` | Weighted vote with per-detector weights |
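
The vote-based modes in the table can be pictured as simple rules over per-detector votes. The sketch below is an illustration in plain Python, not SignalFlow's implementation: it assumes each detector contributes one boolean vote for a given pair, timestamp, and signal type, and the `threshold` parameter for the weighted mode is hypothetical. The `"merge"` mode is left out because it is a sequential merge rather than a vote.

```python
def aggregate(votes, mode, weights=None, threshold=0.5):
    """Combine per-detector boolean votes for one (pair, timestamp, signal_type)."""
    n = len(votes)
    agree = sum(votes)  # True counts as 1
    if mode == "any":
        return agree >= 1
    if mode == "majority":
        return agree > n / 2
    if mode == "unanimous":
        return agree == n
    if mode == "weighted":
        if weights is None or len(weights) != n:
            raise ValueError("weighted mode needs one weight per detector")
        # Share of total weight held by the detectors that voted for the signal
        score = sum(w for w, v in zip(weights, votes) if v) / sum(weights)
        return score >= threshold
    raise ValueError(f"unknown mode: {mode}")


votes = [True, False, True]
for mode in ("any", "majority", "unanimous"):
    print(mode, aggregate(votes, mode))
print("weighted", aggregate(votes, "weighted", weights=[1, 3, 1]))
```

With two of three detectors agreeing, `"any"` and `"majority"` fire while `"unanimous"` does not. The weighted vote fails here because the dissenting detector holds three fifths of the total weight.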

In [6]:
result = (
    sf.Backtest("ensemble")
    .data(raw=raw_data)
    .detector("example/sma_cross", fast_period=20, slow_period=50, name="sma")
    .detector(RsiOversoldDetector(rsi_period=14, oversold_threshold=25), name="rsi")
    .aggregation(mode="any")
    .exit(tp=0.03, sl=0.015)
    .capital(50_000)
    .run()
)

print(result.summary())

Backtesting: 100%|██████████| 15000/15000 [00:05<00:00, 2537.45it/s]


           BACKTEST SUMMARY
  Trades:                4260
  Win Rate:              0.0%
  Profit Factor:         0.00
--------------------------------------------------
  Initial Capital: $   50,000.00
  Final Capital:   $        0.00
  Total Return:       -100.0%
--------------------------------------------------






## 7. Discover Available Detectors

The `default_registry` keeps track of all detectors registered via `@sf_component`. Use it to discover what's available.

In [7]:
from signalflow.core import default_registry, SfComponentType

detectors = default_registry.list(SfComponentType.DETECTOR)
print(f"Available detectors ({len(detectors)}):")
for name in sorted(detectors):
    print(f"  - {name}")

Available detectors (12):
  - anomaly_detector
  - example/sma_cross
  - funding/rate_transition
  - local_extrema_detector
  - market_wide/agreement
  - market_wide/cusum
  - market_wide/zscore
  - percentile_regime_detector
  - structure_detector
  - tutorial/rsi_oversold
  - volatility_detector
  - zscore_anomaly_detector


## Cleanup

In [8]:
store.close()
db_path.unlink(missing_ok=True)
print("Done!")

Done!


## Key Takeaways

1. **Subclass `SignalDetector`** and implement `detect()` to return a `Signals` object.
2. **Use `@sf_component(name="category/name")`** to register your detector for discovery and use via registry name.
3. **`preprocess()`** handles feature extraction automatically if you set `self.features` in `__post_init__()`.
4. **`Signals`** wraps a Polars DataFrame with required columns: `pair`, `timestamp`, `signal_type`, and `signal`.
5. **Multi-detector strategies** combine signals with aggregation modes like `"any"`, `"majority"`, or `"weighted"`.

## Next Steps

- [03 - Data Loading & Resampling](03_data_loading.ipynb): Work with multiple timeframes and auto-resampling
- [04 - Pipeline Visualization](04_visualization.ipynb): Visualize your strategy pipeline as an interactive graph