# Signal Evolution by Equity

Use this notebook to see how each symbol's signal moves over time and which signal component drives it.

Questions this answers:
- Are signals stable or noisy by symbol?
- Are short/medium/long components aligned?
- Which symbols have persistent high-magnitude signals?


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from io import StringIO
from IPython.display import display

pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)

sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (14, 6)

from QuantConnect import *
from QuantConnect.Research import QuantBook
from config import TEAM_ID

qb = QuantBook()
print('QuantBook initialized')


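def read_csv_from_store(key):
    """Read a CSV from the ObjectStore into a DataFrame; return None if missing, empty, or unreadable."""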
    try:
        if not qb.ObjectStore.ContainsKey(key):
            print(f'ObjectStore key not found: {key}')
            return None
        content = qb.ObjectStore.Read(key)
        if not content:
            print(f'Empty ObjectStore key: {key}')
            return None
        return pd.read_csv(StringIO(content))
    except Exception as e:
        print(f'Error reading {key}: {e}')
        return None


In [None]:
df_signals = read_csv_from_store(f'{TEAM_ID}/signals.csv')

if df_signals is None:
    raise ValueError('signals.csv is required. Run a backtest first.')

required_cols = ['date', 'symbol', 'direction', 'magnitude', 'price', 'sma_short', 'sma_medium', 'sma_long', 'atr']
missing = [c for c in required_cols if c not in df_signals.columns]
if missing:
    raise ValueError(f'signals.csv missing required columns: {missing}')

df = df_signals.copy()
df['date'] = pd.to_datetime(df['date'])
for col in ['magnitude', 'price', 'sma_short', 'sma_medium', 'sma_long', 'atr']:
    df[col] = pd.to_numeric(df[col], errors='coerce')

df['direction'] = df['direction'].astype(str).str.title()
# Map direction to a sign; anything other than Up/Down (e.g. Flat or missing) is neutral, not short
df['direction_sign'] = df['direction'].map({'Up': 1.0, 'Down': -1.0}).fillna(0.0)
df['signed_magnitude'] = df['direction_sign'] * df['magnitude'].abs()
df['abs_magnitude'] = df['signed_magnitude'].abs()

safe_atr = df['atr'].replace(0, np.nan)
df['dist_short'] = (df['price'] - df['sma_short']) / safe_atr
df['dist_medium'] = (df['price'] - df['sma_medium']) / safe_atr
df['dist_long'] = (df['price'] - df['sma_long']) / safe_atr

df['contrib_short'] = 0.5 * df['dist_short']
df['contrib_medium'] = 0.3 * df['dist_medium']
df['contrib_long'] = 0.2 * df['dist_long']
df['composite_score'] = df['contrib_short'] + df['contrib_medium'] + df['contrib_long']
df['implied_magnitude'] = np.tanh(df['composite_score'])

df = df.sort_values(['symbol', 'date']).reset_index(drop=True)
print(f'Signal rows: {len(df):,}')
print(f'Symbols: {df["symbol"].nunique()}')
print(f'Date range: {df["date"].min().date()} to {df["date"].max().date()}')


## Symbol Signal Summary Table

This table summarizes each symbol's signal history: total observations, mean and median absolute magnitude, signal volatility (standard deviation of signed magnitude), last signal date, and number of direction flips during the backtest. Symbols with high mean magnitude and few direction flips carry persistent trend signals; symbols with frequent flips are noise-prone and may warrant closer monitoring. The table also determines which symbols appear in the charts below: either a custom focus list or the top 12 by observation count, with ties broken by mean absolute magnitude.
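As a minimal sketch of how the flip count behaves, assume direction is encoded as +1/-1 per rebalance date and rows are already in date order within each symbol. The `count_flips` helper and the toy data are illustrative only and are not part of this notebook's pipeline.

```python
import pandas as pd


def count_flips(signs: pd.Series) -> int:
    """Count changes of sign between consecutive observations (hypothetical helper)."""
    # diff() is NaN on the first row; fillna(0) treats that row as "no change".
    return int((signs.diff().fillna(0) != 0).sum())


# Toy data: AAA never reverses, BBB reverses three times.
toy = pd.DataFrame({
    'symbol': ['AAA'] * 5 + ['BBB'] * 5,
    'direction_sign': [1, 1, 1, 1, 1, 1, -1, 1, -1, -1],
})

flips = toy.groupby('symbol')['direction_sign'].apply(count_flips)
print(flips)
```

A symbol like `AAA` would read as a persistent trend, while `BBB` would be the kind of noise-prone symbol worth a closer look.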

## Signal Magnitude Heatmap by Symbol

This heatmap displays signed signal magnitude for each selected symbol across every rebalance date, with green indicating a long signal and red a short signal. Horizontal stripes of consistent color identify symbols with stable directional trends, while rapidly alternating red/green cells flag noisy or frequently reversing signals. Color intensity encodes signal strength, so pale yellow cells at the middle of the color scale represent weak or borderline signals close to the 0.05 minimum threshold.
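The heatmap needs one row per symbol and one column per date. A small sketch of that reshaping, using made-up rows and assuming at most one signal per symbol per date:

```python
import pandas as pd

# Toy long-format signals: BBB has no row for the second date.
toy = pd.DataFrame({
    'date': pd.to_datetime(['2024-01-05', '2024-01-12', '2024-01-05']),
    'symbol': ['AAA', 'AAA', 'BBB'],
    'signed_magnitude': [0.40, -0.10, 0.75],
})

# Long -> wide: symbols on rows, dates on columns.
# (symbol, date) pairs with no signal become NaN.
wide = (
    toy.pivot(index='symbol', columns='date', values='signed_magnitude')
       .sort_index()
)
print(wide)
```

Missing pairs show up as empty cells, so a gap in a row means no signal was logged for that date rather than a zero-strength signal. Note that `pivot` raises an error if the same symbol and date appear twice, which is a useful early warning about duplicated log rows.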

## Per-Symbol Signal Path Grid

This grid of small-multiple charts plots, for each selected symbol over time, the logged signed signal magnitude alongside the magnitude this notebook re-derives as tanh(composite_score). The composite score is rebuilt here from the ATR-normalized distances of price to the short, medium, and long SMAs, weighted 0.5, 0.3, and 0.2. The two lines should be nearly identical if the alpha model is computing correctly. Persistent gaps between them indicate a discrepancy between how the signal was logged and how it is reconstructed in this notebook.
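For a single row, the reconstruction can be sketched as follows. The 0.5/0.3/0.2 weights are the ones this notebook applies; whether the alpha model uses the same weights is an assumption. The function name and input numbers are illustrative.

```python
import numpy as np


def implied_magnitude(price, sma_short, sma_medium, sma_long, atr):
    """Re-derive tanh(composite_score) for one row (illustrative helper)."""
    if atr == 0:
        # A zero ATR leaves the normalized distances undefined.
        return float('nan')
    # Weighted sum of price-to-SMA distances, expressed in ATR units.
    score = (
        0.5 * (price - sma_short)
        + 0.3 * (price - sma_medium)
        + 0.2 * (price - sma_long)
    ) / atr
    # tanh squashes the score into (-1, 1).
    return float(np.tanh(score))


# Made-up inputs: price sits above all three averages, so the score is positive.
val = implied_magnitude(price=105.0, sma_short=103.0, sma_medium=101.0,
                        sma_long=100.0, atr=4.0)
print(round(val, 4))
```

Here the composite score is (1.0 + 1.2 + 1.0) / 4 = 0.8, so the implied magnitude is tanh(0.8). Comparing that value against the logged magnitude for the same row is the check the two chart lines perform visually.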

## Latest Rebalance Signal Snapshot

This table shows the signal snapshot from the most recent rebalance date, ranked by absolute magnitude, along with the individual short, medium, and long horizon contributions to each signal. It answers which symbols have the strongest current signals and which horizon is driving each signal the most. Use this table as an at-a-glance view of the portfolio's directional thesis going into the next scaling week.
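One way to read off the driving horizon is to take the contribution with the largest absolute value in each row. This is a sketch on made-up numbers; the `dominant_horizon` column is a hypothetical addition and not something the notebook computes.

```python
import pandas as pd

# Toy snapshot: AAA is driven by the short horizon, BBB by the long one.
toy = pd.DataFrame({
    'symbol': ['AAA', 'BBB'],
    'contrib_short': [0.60, -0.05],
    'contrib_medium': [0.20, -0.10],
    'contrib_long': [0.05, -0.45],
})

contrib_cols = ['contrib_short', 'contrib_medium', 'contrib_long']

# Compare absolute values so a strong negative pull counts as much as a positive one.
toy['dominant_horizon'] = (
    toy[contrib_cols].abs().idxmax(axis=1).str.replace('contrib_', '', regex=False)
)
print(toy[['symbol', 'dominant_horizon']])
```

Rows with missing contributions (for example, where ATR was zero) would need to be dropped or filled before applying this.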

In [None]:
symbol_summary = (
    df.groupby('symbol', as_index=False)
      .agg(
          signal_obs=('date', 'count'),
          mean_abs_mag=('abs_magnitude', 'mean'),
          median_abs_mag=('abs_magnitude', 'median'),
          std_mag=('signed_magnitude', 'std'),
          last_signal=('date', 'max'),
          direction_flips=('direction_sign', lambda s: int((s.diff().fillna(0) != 0).sum()))
      )
      .sort_values(['signal_obs', 'mean_abs_mag'], ascending=[False, False])
)

display(symbol_summary.head(20))

# Optional: set your own list, e.g. ['AAPL', 'MSFT', 'JPM']
focus_symbols = []
max_symbols = 12

if focus_symbols:
    # Keep only focus symbols that actually appear in the signal log
    known_symbols = set(df['symbol'])
    symbols_to_plot = [s for s in focus_symbols if s in known_symbols]
else:
    symbols_to_plot = symbol_summary.head(max_symbols)['symbol'].tolist()

print('Plotting symbols:', symbols_to_plot)


In [None]:
if not symbols_to_plot:
    raise ValueError('No symbols selected for plotting.')

heatmap_df = (
    df[df['symbol'].isin(symbols_to_plot)]
      .pivot(index='symbol', columns='date', values='signed_magnitude')
      .sort_index()
)

plt.figure(figsize=(18, max(5, 0.55 * len(symbols_to_plot) + 2)))
sns.heatmap(
    heatmap_df,
    cmap='RdYlGn',
    center=0,
    vmin=-1,
    vmax=1,
    linewidths=0.0,
    cbar_kws={'label': 'Signed signal magnitude'}
)
plt.title('Signal Magnitude Heatmap by Symbol')
plt.xlabel('Rebalance date')
plt.ylabel('Symbol')
plt.tight_layout()
plt.show()


In [None]:
n = len(symbols_to_plot)
cols = 3
rows = int(np.ceil(n / cols))
fig, axes = plt.subplots(rows, cols, figsize=(18, max(4 * rows, 6)), sharex=True)
axes = np.array(axes).reshape(-1)

for i, symbol in enumerate(symbols_to_plot):
    ax = axes[i]
    ds = df[df['symbol'] == symbol].sort_values('date')

    ax.plot(ds['date'], ds['signed_magnitude'], label='signed magnitude', color='#1f77b4', linewidth=1.6)
    ax.plot(ds['date'], ds['implied_magnitude'], label='implied tanh(score)', color='#ff7f0e', linewidth=1.1, alpha=0.85)
    ax.axhline(0, color='black', linewidth=0.8, alpha=0.6)
    ax.set_title(symbol)
    ax.set_ylim(-1.05, 1.05)
    ax.grid(alpha=0.25)

# Hide any unused panels in the last row of the grid
for j in range(n, len(axes)):
    axes[j].axis('off')

handles, labels = axes[0].get_legend_handles_labels()
fig.legend(handles, labels, loc='upper center', ncol=2, frameon=False)
fig.suptitle('Per-Symbol Signal Path', y=0.995)
plt.tight_layout(rect=[0, 0, 1, 0.97])
plt.show()


In [None]:
latest_date = df['date'].max()
latest = (
    df[df['date'] == latest_date]
      .sort_values('abs_magnitude', ascending=False)
      [['date', 'symbol', 'signed_magnitude', 'contrib_short', 'contrib_medium', 'contrib_long']]
)

print(f'Latest rebalance date: {latest_date.date()}')
display(latest.head(20))
