# Filter Suggester

This notebook reads per-file QC summaries written by `eeg_qc_metrics.py`,
then proposes **notch** and **band-pass** settings with transparent, editable rules.

**Expected layout** (run the QC script first):
- `quality_control/<stem>/<stem>_qc_summary.json`
- `quality_control/<stem>/<stem>_qc_metrics.csv`
- `quality_control/combined_qc_summary.csv`
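
To try the notebook without real recordings, a sketch like the one below can create a single summary in the layout above. The stem `demo_rec`, the helper `write_demo_summary`, and every field value are hypothetical. The keys are simply the ones the following cells read, not a confirmed schema of `eeg_qc_metrics.py`.

```python
import json
import tempfile
from pathlib import Path

def write_demo_summary(root, stem='demo_rec'):
    """Write one hypothetical QC summary under root/quality_control/<stem>/."""
    folder = Path(root) / 'quality_control' / stem
    folder.mkdir(parents=True, exist_ok=True)
    summary = {
        'stem': stem,
        'suggested_notch_freqs': [60.0],   # made-up values for illustration
        'flat_channels': ['Fp1'],
        'high_variance_channels': [],
        'n_channels': 32,
        'bandpass_hint_hz': [1.0, 40.0],
    }
    path = folder / f'{stem}_qc_summary.json'
    path.write_text(json.dumps(summary, indent=2), encoding='utf-8')
    return path

# Use a throwaway directory so real QC output is never touched.
demo_root = Path(tempfile.mkdtemp())
demo_path = write_demo_summary(demo_root)

# Same discovery pattern the notebook uses on OUTDIR.
found = sorted((demo_root / 'quality_control').rglob('*_qc_summary.json'))
print('Found summaries:', len(found))
```

Pointing `OUTDIR` at `demo_root / 'quality_control'` should then let the remaining cells run end to end on the demo file.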


In [1]:
import json
from pathlib import Path
import pandas as pd

PROJECT = Path.cwd().resolve()          # assumes the notebook was launched from the repo root; otherwise set PROJECT explicitly
OUTDIR  = PROJECT / 'quality_control'   # where the script wrote results

summaries = sorted(OUTDIR.rglob('*_qc_summary.json'))
print('Found summaries:', len(summaries))
OUTDIR

Found summaries: 0


WindowsPath('C:/Users/rjrie/Documents/neural_signal_db/scripts/qc/quality_control')

In [4]:
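COLUMNS = ['stem', 'summary_json', 'suggested_notch_freqs', 'flat_channels',
           'high_variance_channels', 'n_channels', 'bandpass_hint_hz']

rows = []
for js in summaries:
    with open(js, 'r', encoding='utf-8') as f:
        s = json.load(f)
    rows.append({
        # fall back to the file name if the summary has no 'stem' key
        'stem': s.get('stem', js.name[:-len('_qc_summary.json')]),
        'summary_json': str(js),
        'suggested_notch_freqs': s.get('suggested_notch_freqs', []),
        'flat_channels': s.get('flat_channels', []),
        'high_variance_channels': s.get('high_variance_channels', []),
        'n_channels': s.get('n_channels', None),
        'bandpass_hint_hz': s.get('bandpass_hint_hz', [1.0, 40.0]),
    })
# columns= keeps 'stem' present even when no summaries were found,
# so sort_values does not raise KeyError on an empty frame
df = pd.DataFrame(rows, columns=COLUMNS).sort_values('stem').reset_index(drop=True)
df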

## Rule-based proposals

Rules (edit as needed):
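- **Notch**: If a file's suggested notch frequencies include the 50 Hz family but not the 60 Hz family (or the reverse), use that family's harmonics up to `MAX_FREQ`. With the default of 70 Hz, that is just the fundamental. If both families appear or neither does, fall back to your site default (`FALLBACK_MAINS`).
- **Band-pass**: Start with `[1.0, 40.0]` Hz. If at least 10% of channels are flat, lower the high-pass to `0.5` Hz. If at least 10% of channels are high-variance, raise the high-pass to `1.5` Hz. When both conditions hold, the high-variance rule is applied last and wins.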
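
The rules above can be spot-checked on a few made-up recordings. This is only a compact sketch: `notch_family` and `high_pass` are hypothetical helper names, the channel counts are invented, and the 10% threshold is the one used in the code cell that follows.

```python
def notch_family(suggested, fallback=60):
    """Return 50 or 60 depending on which mains family the peaks belong to."""
    peaks = {int(round(x)) for x in suggested if x > 0}
    has50 = any(p % 50 == 0 for p in peaks)
    has60 = any(p % 60 == 0 for p in peaks)
    if has50 != has60:
        return 50 if has50 else 60
    return fallback  # both families or neither: site default

def high_pass(n_flat, n_highvar, n_total, start=1.0, threshold=0.10):
    """Return the high-pass corner (Hz) after applying both channel rules."""
    low = start
    if n_flat / max(n_total, 1) >= threshold:
        low = min(low, 0.5)
    if n_highvar / max(n_total, 1) >= threshold:
        low = max(low, 1.5)  # applied last, so it wins when both rules fire
    return low

cases = [
    # (suggested peaks in Hz, n_flat, n_highvar, n_channels)
    ([50.0], 0, 0, 32),        # clean 50 Hz site
    ([60.1], 4, 0, 32),        # 60 Hz site, 12.5% flat channels
    ([], 0, 5, 32),            # no mains peak, about 15.6% high-variance channels
    ([50.0, 60.0], 4, 4, 32),  # ambiguous mains, both channel rules fire
]
results = [(notch_family(p), high_pass(f, h, n)) for p, f, h, n in cases]
for case, res in zip(cases, results):
    print(case, '->', res)
```

The last case shows the two tie-breaks at once: ambiguous mains falls back to the site default, and the high-variance rule overrides the flat-channel rule.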


In [5]:
FALLBACK_MAINS = 60  # change to 50 if your site uses 50 Hz
MAX_FREQ = 70

def expand_harmonics(base):
    return [int(base*k) for k in range(1, int(MAX_FREQ//base)+1)]

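def propose_notch(suggested):
    # Round rather than truncate so a peak reported as 59.9 Hz counts as 60;
    # drop non-positive values, since 0 is a multiple of both 50 and 60.
    s = set(int(round(x)) for x in suggested if x > 0)
    has50 = any(x % 50 == 0 for x in s)
    has60 = any(x % 60 == 0 for x in s)
    if has50 and not has60:
        return expand_harmonics(50)
    if has60 and not has50:
        return expand_harmonics(60)
    # both families or neither: use the site default
    return expand_harmonics(FALLBACK_MAINS) if FALLBACK_MAINS in (50, 60) else []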

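def propose_bandpass(n_flat, n_highvar, n_total, default=(1.0, 40.0)):
    low, high = default
    # n_channels can be missing from a summary (None/NaN); keep the default then
    if pd.isna(n_total):
        return [low, high]
    frac_flat = n_flat / max(n_total, 1)
    frac_high = n_highvar / max(n_total, 1)
    if frac_flat >= 0.10:
        low = min(low, 0.5)
    # applied last, so the high-variance rule wins when both thresholds are met
    if frac_high >= 0.10:
        low = max(low, 1.5)
    return [round(low, 2), high]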

PROP_COLUMNS = ['stem', 'proposed_notch_freqs', 'proposed_bandpass',
                'n_flat', 'n_highvar', 'n_channels', 'source_summary']

props = []
for _, r in df.iterrows():
    notch = propose_notch(r['suggested_notch_freqs'])
    bp    = propose_bandpass(len(r['flat_channels']), len(r['high_variance_channels']), r['n_channels'])
    props.append({
        'stem': r['stem'],
        'proposed_notch_freqs': notch,
        'proposed_bandpass': bp,
        'n_flat': len(r['flat_channels']),
        'n_highvar': len(r['high_variance_channels']),
        'n_channels': r['n_channels'],
        'source_summary': r['summary_json'],
    })
# columns= keeps the frame sortable even when df is empty
props_df = pd.DataFrame(props, columns=PROP_COLUMNS).sort_values('stem').reset_index(drop=True)
props_df

In [None]:
out_csv = OUTDIR / 'proposed_filters.csv'
props_df.to_csv(out_csv, index=False)
print('Saved ->', out_csv)
out_csv
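
One thing to watch when loading `proposed_filters.csv` later: list-valued cells such as `proposed_notch_freqs` are written to CSV as their string form (for example `"[60]"`), so they come back as text. A minimal sketch of parsing them again is shown below, using only the standard library and an in-memory stand-in for the file. The row contents are invented for illustration.

```python
import ast
import csv
import io

# Stand-in for proposed_filters.csv: list cells are stored as their string form.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(['stem', 'proposed_notch_freqs', 'proposed_bandpass'])
writer.writerow(['demo_rec', str([60]), str([1.0, 40.0])])  # hypothetical row
buf.seek(0)

loaded = []
for row in csv.DictReader(buf):
    # literal_eval parses "[60]" back into a Python list without running code
    row['proposed_notch_freqs'] = ast.literal_eval(row['proposed_notch_freqs'])
    row['proposed_bandpass'] = ast.literal_eval(row['proposed_bandpass'])
    loaded.append(row)

print(loaded[0])
```

With pandas, the same idea can be applied by passing `ast.literal_eval` through the `converters` argument of `pd.read_csv` for those two columns.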