# Signal Evolution by Equity

Use this notebook to see how each symbol's signal moves over time and which signal component drives it.

Questions this answers:
- Are signals stable or noisy by symbol?
- Are short/medium/long components aligned?
- Which symbols have persistent high-magnitude signals?


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import seaborn as sns
from io import StringIO
from IPython.display import display

pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)

sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (14, 6)

from QuantConnect import *
from QuantConnect.Research import QuantBook
from config import TEAM_ID, ALPHA_SIGNAL_WEIGHTS, ALPHA_SIGNAL_TEMPERATURE, ALPHA_MIN_MAGNITUDE

qb = QuantBook()
print('QuantBook initialized')


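def read_csv_from_store(key):
    """Read a CSV from the ObjectStore into a DataFrame; return None if the key is missing, empty, or unreadable."""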
    try:
        if not qb.ObjectStore.ContainsKey(key):
            print(f'ObjectStore key not found: {key}')
            return None
        content = qb.ObjectStore.Read(key)
        if not content:
            print(f'Empty ObjectStore key: {key}')
            return None
        return pd.read_csv(StringIO(content))
    except Exception as e:
        print(f'Error reading {key}: {e}')
        return None


In [None]:
df_signals = read_csv_from_store(f'{TEAM_ID}/signals.csv')

if df_signals is None:
    raise ValueError('signals.csv is required. Run a backtest first.')

required_cols = ['date', 'symbol', 'direction', 'magnitude', 'price', 'sma_short', 'sma_medium', 'sma_long', 'atr']
missing = [c for c in required_cols if c not in df_signals.columns]
if missing:
    raise ValueError(f'signals.csv missing required columns: {missing}')

df = df_signals.copy()
df['date'] = pd.to_datetime(df['date'])
for col in ['magnitude', 'price', 'sma_short', 'sma_medium', 'sma_long', 'atr']:
    df[col] = pd.to_numeric(df[col], errors='coerce')

df['direction'] = df['direction'].astype(str).str.title()
# Anything other than 'Up' (including missing or unexpected values) is treated as Down.
df['direction_sign'] = np.where(df['direction'].eq('Up'), 1.0, -1.0)
df['signed_magnitude'] = df['direction_sign'] * df['magnitude'].abs()
df['abs_magnitude'] = df['signed_magnitude'].abs()

# Mirror alpha model constants from shared config.py
WEIGHT_SHORT, WEIGHT_MEDIUM, WEIGHT_LONG = ALPHA_SIGNAL_WEIGHTS
SIGNAL_TEMPERATURE = ALPHA_SIGNAL_TEMPERATURE

safe_atr = df['atr'].replace(0, np.nan)
df['dist_short'] = (df['price'] - df['sma_short']) / safe_atr
df['dist_medium'] = (df['price'] - df['sma_medium']) / safe_atr
df['dist_long'] = (df['price'] - df['sma_long']) / safe_atr

df['contrib_short'] = WEIGHT_SHORT * df['dist_short']
df['contrib_medium'] = WEIGHT_MEDIUM * df['dist_medium']
df['contrib_long'] = WEIGHT_LONG * df['dist_long']
df['composite_score'] = df['contrib_short'] + df['contrib_medium'] + df['contrib_long']
df['implied_magnitude'] = np.tanh(df['composite_score'] / SIGNAL_TEMPERATURE)

df['all_trends_aligned'] = (
    ((df['dist_short'] > 0) & (df['dist_medium'] > 0) & (df['dist_long'] > 0)) |
    ((df['dist_short'] < 0) & (df['dist_medium'] < 0) & (df['dist_long'] < 0))
)
df['parity_abs_error'] = (df['signed_magnitude'] - df['implied_magnitude']).abs()

df = df.sort_values(['symbol', 'date']).reset_index(drop=True)
print(f'Alpha params from config: weights={ALPHA_SIGNAL_WEIGHTS}, temperature={ALPHA_SIGNAL_TEMPERATURE}, min_magnitude={ALPHA_MIN_MAGNITUDE}')
print(f'Signal rows: {len(df):,}')
print(f'Symbols: {df["symbol"].nunique()}')
print(f'Date range: {df["date"].min().date()} to {df["date"].max().date()}')
print(f'All-horizon agreement (rounded inputs): {100 * df["all_trends_aligned"].mean():.1f}%')
print(f'Median |logged - implied|: {df["parity_abs_error"].median():.4f}')
print(f'95th pct |logged - implied|: {df["parity_abs_error"].quantile(0.95):.4f}')


## Symbol Signal Summary Table

This table summarizes each symbol's signal history: number of observations, mean and median absolute magnitude, standard deviation of signed magnitude (signal volatility), last signal date, and number of direction flips during the backtest. Symbols with high mean magnitude and few direction flips carry persistent trend signals; symbols with frequent flips are noise-prone and may warrant closer monitoring. The table also determines which symbols appear in the charts below: either a custom focus list or the 12 symbols with the most observations, with ties broken by mean absolute magnitude.
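
As a rough illustration of how the direction-flip count behaves, the sketch below applies the same `diff()`-based rule to a toy frame. The symbols `AAA` and `BBB` and their sign sequences are made up for the example, and `count_flips` is a hypothetical helper name, not part of the notebook.

```python
import pandas as pd

# Toy signal history (hypothetical values): +1 = Up, -1 = Down.
toy = pd.DataFrame({
    'symbol': ['AAA'] * 5 + ['BBB'] * 5,
    'direction_sign': [1, 1, 1, 1, 1, 1, -1, 1, -1, -1],
})

def count_flips(signs: pd.Series) -> int:
    # diff() is NaN on the first row; fillna(0) keeps that row from counting as a flip.
    return int((signs.diff().fillna(0) != 0).sum())

flips = toy.groupby('symbol')['direction_sign'].apply(count_flips)
print(flips.to_dict())  # {'AAA': 0, 'BBB': 3}
```

`AAA` never changes sign, so it reads as a persistent signal; `BBB` reverses three times in five observations, which is the pattern the table is meant to surface.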

## Signal Magnitude Heatmap by Symbol

This heatmap displays signed signal magnitude for each selected symbol across every rebalance date, with green indicating a long signal and red a short signal. Horizontal stripes of consistent color identify symbols with stable directional trends, while rapidly alternating red/green cells flag noisy or frequently reversing signals. Color intensity encodes signal strength, so pale yellow cells near the center of the color scale represent weak or borderline signals close to the 0.05 minimum threshold.

## Per-Symbol Signal Path Grid

This grid of small-multiple charts plots logged signed signal magnitude and a re-derived magnitude from the production alpha formula:

`score = w_short*dist_short + w_medium*dist_medium + w_long*dist_long` and `magnitude = tanh(score / temperature)`,
with `w_short/w_medium/w_long` and `temperature` loaded from `config.py`.

Points are rebalance snapshots (not daily recomputes). Comparing the two lines shows whether the notebook reconstruction matches logged output; persistent gaps suggest rounding effects or a logging/parity issue.
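
A small worked example may make the reconstruction concrete. The weights, temperature, price, ATR, and SMA values below are assumptions chosen for easy arithmetic; the real parameters come from `config.py` and are not shown in this notebook.

```python
import numpy as np

# Hypothetical parameters; the notebook loads the real ones from config.py.
weights = (0.5, 0.3, 0.2)   # w_short, w_medium, w_long (assumed)
temperature = 3.0           # assumed

# Hypothetical snapshot for one symbol on one rebalance date.
price, atr = 105.0, 2.0
sma_short, sma_medium, sma_long = 103.0, 101.0, 99.0

# ATR-normalized distance of price from each moving average: [1.0, 2.0, 3.0].
dists = [(price - sma) / atr for sma in (sma_short, sma_medium, sma_long)]

# Weighted composite score: 0.5*1.0 + 0.3*2.0 + 0.2*3.0 = 1.7.
score = sum(w * d for w, d in zip(weights, dists))

# tanh squashes the score into (-1, 1); a higher temperature flattens the response.
magnitude = np.tanh(score / temperature)
print(f'score={score:.2f}, magnitude={magnitude:.3f}')
```

With these inputs the price sits above all three averages, so every contribution is positive and the implied magnitude comes out at roughly 0.51.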


## Price And Moving-Average Diagnostic

This chart overlays each symbol's logged rebalance-date close with SMA(20/63/252), the exact ingredients used by the alpha model.
Vertical guide lines mark signal direction flips (`Up` <-> `Down`) so you can inspect whether flips line up with price crossing one or more moving averages.

Note: this uses `signals.csv` snapshots (rebalance observations), not full daily history.
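
One way to check numerically whether flips coincide with a moving-average crossing is sketched below for the short SMA only. The snapshot values are invented, and the column names `crossed_sma_short`, `flip`, and `flip_with_cross` are hypothetical additions, not columns in `signals.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical rebalance snapshots for one symbol.
snap = pd.DataFrame({
    'date': pd.to_datetime(['2024-01-05', '2024-01-12', '2024-01-19', '2024-01-26']),
    'price': [101.0, 99.0, 98.0, 102.0],
    'sma_short': [100.0, 100.0, 100.0, 100.0],
    'direction_sign': [1.0, -1.0, -1.0, 1.0],
})

# Side of the SMA the price sits on: +1 above, -1 below.
side = np.sign(snap['price'] - snap['sma_short'])

# A crossing is a change of side between consecutive snapshots.
snap['crossed_sma_short'] = side.diff().fillna(0) != 0

# A flip is a change of signal direction between consecutive snapshots.
snap['flip'] = snap['direction_sign'].diff().fillna(0) != 0

# Flips that happen on the same snapshot as a crossing.
snap['flip_with_cross'] = snap['flip'] & snap['crossed_sma_short']
print(snap[['date', 'flip', 'crossed_sma_short', 'flip_with_cross']])
```

Because the inputs are rebalance snapshots, a crossing detected this way only says the price changed sides at some point between two observations, not on which day it happened.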


## Portfolio Aggregate Diagnostics

This section adds two portfolio-level time series:
- **Portfolio turnover**: `0.5 * sum(|w_t - w_{t-1}|)` using `positions.csv` daily weights.
- **Weighted average signal**: cross-sectional average of `signed_magnitude`, weighted by absolute portfolio weights on each signal date.
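
The two definitions above can be sketched on toy data as follows. The weights and signal values are hypothetical and only illustrate the arithmetic; they are not taken from `positions.csv` or `signals.csv`.

```python
import pandas as pd

# Hypothetical daily weights (rows = dates, columns = symbols).
weights = pd.DataFrame(
    {'AAA': [0.5, 0.3, 0.3], 'BBB': [0.5, 0.7, -0.2]},
    index=pd.to_datetime(['2024-01-02', '2024-01-03', '2024-01-04']),
)

# One-way turnover: half the sum of absolute weight changes per day.
# The first day has no prior weights, so its diff is NaN and the sum is 0.
turnover = 0.5 * weights.diff().abs().sum(axis=1)
# 2024-01-03: 0.5 * (0.2 + 0.2) = 0.20
# 2024-01-04: 0.5 * (0.0 + 0.9) = 0.45

# Weighted average signal on the last date, weighted by absolute weights.
signals = pd.Series({'AAA': 0.8, 'BBB': -0.4})   # hypothetical signed magnitudes
abs_w = weights.iloc[-1].abs()                   # AAA 0.3, BBB 0.2
weighted_avg_signal = (signals * abs_w).sum() / abs_w.sum()
# (0.8 * 0.3 + -0.4 * 0.2) / 0.5 = 0.32

print(turnover.round(2).tolist(), round(weighted_avg_signal, 2))
```

Weighting by absolute weight means a short position with a negative signal pulls the average down rather than cancelling out, so the series reflects the net direction the portfolio is leaning.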


## Latest Rebalance Signal Snapshot

This table shows the signal snapshot from the most recent rebalance date, ranked by absolute magnitude, along with the individual short, medium, and long horizon contributions to each signal. It answers which symbols have the strongest current signals and which horizon is driving each signal the most. Use this table as an at-a-glance view of the portfolio's directional thesis going into the next scaling week.

In [None]:
symbol_summary = (
    df.groupby('symbol', as_index=False)
      .agg(
          signal_obs=('date', 'count'),
          mean_abs_mag=('abs_magnitude', 'mean'),
          median_abs_mag=('abs_magnitude', 'median'),
          std_mag=('signed_magnitude', 'std'),
          last_signal=('date', 'max'),
          direction_flips=('direction_sign', lambda s: int((s.diff().fillna(0) != 0).sum()))
      )
      .sort_values(['signal_obs', 'mean_abs_mag'], ascending=[False, False])
)

display(symbol_summary.head(20))

# Optional: set your own list, e.g. ['AAPL', 'MSFT', 'JPM']
focus_symbols = []
max_symbols = 12

if focus_symbols:
    symbols_to_plot = [s for s in focus_symbols if s in set(df['symbol'])]
else:
    symbols_to_plot = symbol_summary.head(max_symbols)['symbol'].tolist()

print('Plotting symbols:', symbols_to_plot)


In [None]:
if not symbols_to_plot:
    raise ValueError('No symbols selected for plotting.')

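# pivot() raises on duplicate (symbol, date) pairs, so keep the last row per pair first.
heatmap_df = (
    df[df['symbol'].isin(symbols_to_plot)]
      .drop_duplicates(['symbol', 'date'], keep='last')
      .pivot(index='symbol', columns='date', values='signed_magnitude')
      .sort_index()
)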

# Format dates and thin labels so the x-axis stays readable.
date_labels = pd.to_datetime(heatmap_df.columns).strftime('%Y-%m-%d').tolist()
max_xticks = 18
step = max(1, len(date_labels) // max_xticks)
xticklabels = [label if (i % step == 0) else '' for i, label in enumerate(date_labels)]

plt.figure(figsize=(18, max(5, 0.55 * len(symbols_to_plot) + 2)))
sns.heatmap(
    heatmap_df,
    cmap='RdYlGn',
    center=0,
    vmin=-1,
    vmax=1,
    linewidths=0.0,
    xticklabels=xticklabels,
    cbar_kws={'label': 'Signed signal magnitude'}
)
plt.title('Signal Magnitude Heatmap by Symbol')
plt.xlabel('Rebalance date')
plt.ylabel('Symbol')
plt.xticks(rotation=60, ha='right')
plt.tight_layout()
plt.show()


In [None]:
n = len(symbols_to_plot)
cols = 3
rows = int(np.ceil(n / cols))
fig, axes = plt.subplots(rows, cols, figsize=(18, max(4 * rows, 6)), sharex=True)
axes = np.array(axes).reshape(-1)

for i, symbol in enumerate(symbols_to_plot):
    ax = axes[i]
    ds = df[df['symbol'] == symbol].sort_values('date')

    ax.plot(ds['date'], ds['signed_magnitude'], label='signed magnitude', color='#1f77b4', linewidth=1.6)
    ax.plot(ds['date'], ds['implied_magnitude'], label=f'implied tanh(score / {SIGNAL_TEMPERATURE:g})', color='#ff7f0e', linewidth=1.1, alpha=0.85)
    ax.axhline(0, color='black', linewidth=0.8, alpha=0.6)
    ax.set_title(symbol)
    ax.set_ylim(-1.05, 1.05)
    ax.grid(alpha=0.25)

for j in range(i + 1, len(axes)):
    axes[j].axis('off')

handles, labels = axes[0].get_legend_handles_labels()
fig.legend(handles, labels, loc='upper center', ncol=2, frameon=False)
fig.suptitle('Per-Symbol Signal Path', y=0.995)
plt.tight_layout(rect=[0, 0, 1, 0.97])
plt.show()


In [None]:
# Optional: inspect specific names, e.g. ['UNH']
ma_focus_symbols = []
max_ma_symbols = 6

# Background shading controls:
# green = long, red = short, darker = larger |signal|
shade_signal_regime = True
min_shade_alpha = 0.04
max_shade_alpha = 0.24

if ma_focus_symbols:
    ma_symbols = [s for s in ma_focus_symbols if s in set(df['symbol'])]
else:
    ma_symbols = symbols_to_plot[:max_ma_symbols]

if not ma_symbols:
    raise ValueError('No symbols selected for price/SMA diagnostic plot.')

print('Price/SMA diagnostic symbols:', ma_symbols)

n = len(ma_symbols)
cols = 2
rows = int(np.ceil(n / cols))
fig, axes = plt.subplots(rows, cols, figsize=(18, max(4.5 * rows, 6)), sharex=True)
axes = np.array(axes).reshape(-1)

for i, symbol in enumerate(ma_symbols):
    ax = axes[i]
    ds = df[df['symbol'] == symbol].sort_values('date').reset_index(drop=True)

    if shade_signal_regime and len(ds) > 0:
        if len(ds) > 1:
            typical_gap = ds['date'].diff().dropna().median()
            if pd.isna(typical_gap) or typical_gap <= pd.Timedelta(0):
                typical_gap = pd.Timedelta(days=5)
        else:
            typical_gap = pd.Timedelta(days=5)

        for k in range(len(ds)):
            start = ds.loc[k, 'date']
            end = ds.loc[k + 1, 'date'] if k < len(ds) - 1 else start + typical_gap
            m = float(ds.loc[k, 'signed_magnitude'])
            m = max(-1.0, min(1.0, m))

            color = plt.cm.RdYlGn((m + 1.0) / 2.0)
            alpha = min_shade_alpha + (max_shade_alpha - min_shade_alpha) * min(1.0, abs(m))
            ax.axvspan(start, end, color=color, alpha=alpha, linewidth=0, zorder=0)

    ax.plot(ds['date'], ds['price'], label='Price', color='#222222', linewidth=1.9, zorder=3)
    ax.plot(ds['date'], ds['sma_short'], label='SMA 20', color='#1f77b4', linewidth=1.2, zorder=3)
    ax.plot(ds['date'], ds['sma_medium'], label='SMA 63', color='#ff7f0e', linewidth=1.2, zorder=3)
    ax.plot(ds['date'], ds['sma_long'], label='SMA 252', color='#2ca02c', linewidth=1.2, zorder=3)

    flip_mask = ds['direction_sign'].diff().fillna(0) != 0
    flip_dates = ds.loc[flip_mask, 'date']
    for d in flip_dates:
        ax.axvline(d, color='#7f7f7f', alpha=0.16, linewidth=0.8, zorder=2)

    flip_count = int(flip_mask.sum())
    ax.set_title(f'{symbol} | flips={flip_count}')
    ax.grid(alpha=0.25)

for j in range(i + 1, len(axes)):
    axes[j].axis('off')

handles, labels = axes[0].get_legend_handles_labels()
fig.legend(handles, labels, loc='upper center', ncol=4, frameon=False)
fig.suptitle('Price + SMA Diagnostic (Rebalance Snapshots)', y=0.995)
fig.text(0.5, 0.972, 'Background shading: green=long, red=short, darker=stronger |signal|', ha='center', va='center', fontsize=9, color='#444444')
plt.tight_layout(rect=[0, 0, 1, 0.95])
plt.show()


In [None]:
df_positions = read_csv_from_store(f'{TEAM_ID}/positions.csv')

if df_positions is None or len(df_positions) == 0:
    print('positions.csv not available; skipping portfolio aggregate diagnostics.')
else:
    pos = df_positions.copy()
    required_pos_cols = ['date', 'symbol', 'weight']
    missing_pos = [c for c in required_pos_cols if c not in pos.columns]
    if missing_pos:
        raise ValueError(f'positions.csv missing required columns: {missing_pos}')

    pos['date'] = pd.to_datetime(pos['date'])
    pos['weight'] = pd.to_numeric(pos['weight'], errors='coerce').fillna(0.0)
    pos['symbol'] = pos['symbol'].astype(str)

    # Keep one row per date/symbol in case of duplicates.
    pos = (
        pos.sort_values(['date', 'symbol'])
           .drop_duplicates(['date', 'symbol'], keep='last')
    )

    weights_wide = (
        pos.pivot(index='date', columns='symbol', values='weight')
           .sort_index()
           .fillna(0.0)
    )

    # Daily portfolio turnover in weight space.
    daily_turnover = 0.5 * weights_wide.diff().abs().sum(axis=1)
    daily_turnover = daily_turnover.fillna(0.0)
    turnover_20d = daily_turnover.rolling(20, min_periods=5).mean()

    sig = df[['date', 'symbol', 'signed_magnitude']].copy()
    sig['date'] = pd.to_datetime(sig['date'])
    sig['symbol'] = sig['symbol'].astype(str)

    sig_w = sig.merge(
        pos[['date', 'symbol', 'weight']],
        on=['date', 'symbol'],
        how='left'
    )
    sig_w['weight'] = pd.to_numeric(sig_w['weight'], errors='coerce').fillna(0.0)
    sig_w['abs_weight'] = sig_w['weight'].abs()

    denom = sig_w.groupby('date')['abs_weight'].sum()
    numer_signed = (sig_w['signed_magnitude'] * sig_w['abs_weight']).groupby(sig_w['date']).sum()
    numer_abs = (sig_w['signed_magnitude'].abs() * sig_w['abs_weight']).groupby(sig_w['date']).sum()

    # Dates with zero gross weight on the signal set yield NaN instead of a divide-by-zero warning.
    safe_denom = denom.where(denom > 0)
    weighted_signal = pd.DataFrame({
        'date': denom.index,
        'gross_weight_on_signal_set': denom.values,
        'weighted_avg_signal': (numer_signed / safe_denom).values,
        'weighted_avg_abs_signal': (numer_abs / safe_denom).values,
    })

    fig, axes = plt.subplots(2, 1, figsize=(16, 10), sharex=False)

    ax = axes[0]
    ax.plot(daily_turnover.index, daily_turnover.values, color='#1f77b4', linewidth=1.1, alpha=0.8, label='Daily turnover')
    ax.plot(turnover_20d.index, turnover_20d.values, color='#d62728', linewidth=2.0, label='20-day mean')
    ax.set_title('Portfolio Turnover (Weight Space)')
    ax.set_ylabel('Turnover')
    ax.grid(alpha=0.25)
    ax.legend(frameon=False, loc='upper left')

    ax = axes[1]
    ws = weighted_signal.dropna(subset=['weighted_avg_signal']).sort_values('date')
    ax.plot(ws['date'], ws['weighted_avg_signal'], color='#2c3e50', linewidth=1.8, marker='o', markersize=3, label='Weighted avg signal (signed)')
    ax.plot(ws['date'], ws['weighted_avg_abs_signal'], color='#9467bd', linewidth=1.4, linestyle='--', alpha=0.9, label='Weighted avg |signal|')
    ax.axhline(0, color='black', linewidth=0.8, alpha=0.6)
    ax.fill_between(ws['date'], 0, ws['weighted_avg_signal'], where=(ws['weighted_avg_signal'] >= 0), color='#2ca02c', alpha=0.18, interpolate=True)
    ax.fill_between(ws['date'], 0, ws['weighted_avg_signal'], where=(ws['weighted_avg_signal'] < 0), color='#d62728', alpha=0.18, interpolate=True)
    ax.set_title('Portfolio Weighted Average Signal (Signal Dates)')
    ax.set_ylabel('Signal')
    ax.set_xlabel('Date')
    ax.set_ylim(-1.05, 1.05)
    ax.grid(alpha=0.25)
    ax.legend(frameon=False, loc='upper left')

    for a in axes:
        locator = mdates.AutoDateLocator(minticks=6, maxticks=10)
        a.xaxis.set_major_locator(locator)
        a.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))

    plt.tight_layout()
    plt.show()


In [None]:
latest_date = df['date'].max()
latest = (
    df[df['date'] == latest_date]
      .sort_values('abs_magnitude', ascending=False)
      [['date', 'symbol', 'signed_magnitude', 'contrib_short', 'contrib_medium', 'contrib_long']]
)

print(f'Latest rebalance date: {latest_date.date()}')
display(latest.head(20))
