# Deep Learning for ECoG/EEG: Interactive Learning Notebook

This notebook guides you through:
1. Understanding input signal characteristics.
2. Building & customizing model architectures.
3. Learning model training dynamics.
4. Evaluating and debugging model behavior.
5. Iterating with realistic testing.


## 🧠 1. Understand Input Signal Characteristics

We start by generating and inspecting synthetic data to see how signals are structured.

**Key concepts:**
- Shape: `(batch_size, channels, time)`
- Preprocessing: windowing, filtering, segmentation
- Why it matters: informs kernel sizes, strides, and pooling operations in your model.
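
To make the last point concrete, here is a small sketch of how signal length changes as it passes through convolution and pooling layers. The helper `conv1d_out_len` is a hypothetical name introduced here; it applies the output-length formula given in the PyTorch documentation for `Conv1d` and `MaxPool1d`. The kernel sizes are illustrative and are not taken from `cnn_baseline.py`.

```python
def conv1d_out_len(length, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1D convolution or pooling layer."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

signal_length = 256

# kernel 7 with padding 3 keeps the length unchanged ("same" padding for odd kernels)
after_conv = conv1d_out_len(signal_length, kernel_size=7, padding=3)

# pooling with kernel 2, stride 2 halves the length
after_pool = conv1d_out_len(after_conv, kernel_size=2, stride=2)

# an unpadded kernel of 5 trims 4 samples
after_conv2 = conv1d_out_len(after_pool, kernel_size=5)

print(after_conv, after_pool, after_conv2)  # 256 128 124
```

Running this kind of arithmetic before building a model helps catch layers that would shrink a short window to nothing.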


In [ ]:
from ecog_eeg_dl_classifier.data.simulators.synthetic_signal import generate_synthetic_data
import matplotlib.pyplot as plt

# Generate 1 sample with 4 channels, 256 timesteps
X, y = generate_synthetic_data(n_samples=1, n_channels=4, signal_length=256, freq=12)
print(f"X shape: {X.shape}, y: {y}")

# Plot each channel
t = range(X.shape[2])
plt.figure(figsize=(10, 3))
for ch in range(X.shape[1]):
    plt.plot(t, X[0, ch], label=f"Channel {ch}")
plt.title("Synthetic EEG/ECoG Sample")
plt.xlabel("Time (samples)")
plt.ylabel("Amplitude")
plt.legend()
plt.show()

> **Try it:** Change `freq`, `n_channels`, or introduce more noise in `generate_synthetic_data()`.
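
If you want to experiment with noise without touching the package code, a stand-alone generator is enough. The sketch below is an assumption-laden stand-in, not the implementation of `generate_synthetic_data()`: the function name `noisy_sine`, the sampling rate `fs`, and the `noise_std` parameter are all hypothetical.

```python
import numpy as np

def noisy_sine(n_channels=4, signal_length=256, freq=12.0, fs=256.0,
               noise_std=0.5, seed=0):
    """Return one sample shaped (1, channels, time): a sine plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(signal_length) / fs                 # time axis in seconds
    clean = np.sin(2 * np.pi * freq * t)              # shape (time,)
    noise = rng.normal(0.0, noise_std, size=(n_channels, signal_length))
    return (clean + noise)[np.newaxis, ...]           # broadcast, then add batch axis

quiet = noisy_sine(noise_std=0.1)
loud = noisy_sine(noise_std=1.0)
print(quiet.shape, float(quiet.std()), float(loud.std()))
```

Plotting `quiet[0]` and `loud[0]` with the cell above shows how quickly the oscillation becomes hard to see by eye.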

## 🏗️ 2. Build & Customize Model Architectures

Explore the baseline CNN and learn what each layer does.

**Key modules in `models/cnn_baseline.py`:**
- `nn.Conv1d(in_channels, out_channels, kernel_size, padding)`
- `nn.MaxPool1d(kernel_size, stride)`
- `nn.AdaptiveAvgPool1d(1)`
- `nn.Flatten()` + `nn.Linear()`
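
To see how these modules fit together, the sketch below stacks them into a small network and prints the tensor shape after each layer. `TinyEEGCNN` is a hypothetical class written for this illustration; the real `EEGCNN` in `cnn_baseline.py` may use different layer counts, channel widths, and kernel sizes.

```python
import torch
from torch import nn

class TinyEEGCNN(nn.Module):
    """Illustrative 1D CNN; not the repository's EEGCNN."""
    def __init__(self, in_channels=4, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 16, kernel_size=7, padding=3),  # keeps length
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=2, stride=2),                 # halves length
            nn.Conv1d(16, 32, kernel_size=5, padding=2),           # keeps length
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),                               # length -> 1
        )
        self.classifier = nn.Sequential(nn.Flatten(), nn.Linear(32, num_classes))

    def forward(self, x):
        return self.classifier(self.features(x))

tiny = TinyEEGCNN()
batch = torch.randn(2, 4, 256)
h = batch
for layer in tiny.features:
    h = layer(h)
    print(f"{layer.__class__.__name__:<18} -> {tuple(h.shape)}")
logits = tiny.classifier(h)
print(f"{'classifier':<18} -> {tuple(logits.shape)}")
```

Because `nn.AdaptiveAvgPool1d(1)` collapses the time axis, this kind of network accepts windows of different lengths without changing the `nn.Linear` layer.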


In [ ]:
from ecog_eeg_dl_classifier.models.cnn_baseline import EEGCNN
import torch

# Initialize model for 4-channel input
model = EEGCNN(in_channels=4, input_length=256, num_classes=2)
print(model)

# Forward pass dummy data
dummy = torch.randn(2, 4, 256)
out = model(dummy)
print(f"Output shape: {out.shape}")

> **Challenge:** Open `cnn_baseline.py` and try:
> - Adding `nn.Dropout(0.5)` after the first pooling layer
> - Inserting `nn.BatchNorm1d(num_features)` between `Conv1d` and `ReLU`, where `num_features` equals the convolution's `out_channels`
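
Before editing `cnn_baseline.py`, it can help to try both changes on a stand-alone block. The sketch below is one possible layout, with channel counts and kernel size chosen only for illustration. It also shows why `model.train()` and `model.eval()` matter: dropout zeroes activations only in training mode, and batch norm switches to its running statistics in evaluation mode.

```python
import torch
from torch import nn

torch.manual_seed(0)

block = nn.Sequential(
    nn.Conv1d(4, 16, kernel_size=7, padding=3),
    nn.BatchNorm1d(16),   # num_features matches the conv's out_channels
    nn.ReLU(),
    nn.MaxPool1d(2),
    nn.Dropout(0.5),      # active only in training mode
)

batch = torch.randn(8, 4, 256)

block.train()
out_train = block(batch)
zero_frac = (out_train == 0).float().mean().item()

block.eval()
with torch.no_grad():
    out_eval_a = block(batch)
    out_eval_b = block(batch)

print(tuple(out_train.shape))
print(f"fraction zeroed in train mode: {zero_frac:.2f}")
print("eval mode deterministic:", torch.equal(out_eval_a, out_eval_b))
```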

## 🌀 3. Learn Model Training Dynamics

Inspect the training loop in `training/trainer.py`:
- `model.train()`, `loss.backward()`, `optimizer.step()`
- Hyperparameters: learning rate, optimizer type, batch size
- Enhancements: learning rate scheduler, early stopping, gradient clipping
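
The pieces listed above can be sketched as a single loop. This is a minimal illustration, not the contents of `trainer.py`: the function name `train_sketch`, the choice of Adam and cross-entropy loss, and the toy data are all assumptions made for the example.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def train_sketch(net, loader, num_epochs=3, lr=0.01, max_grad_norm=1.0, device="cpu"):
    net.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    # Halve the learning rate after every epoch
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)
    history = []
    for epoch in range(num_epochs):
        net.train()
        total = 0.0
        for xb, yb in loader:
            xb, yb = xb.to(device), yb.to(device)
            optimizer.zero_grad()
            loss = criterion(net(xb), yb)
            loss.backward()
            # Rescale gradients whose global norm exceeds max_grad_norm
            nn.utils.clip_grad_norm_(net.parameters(), max_grad_norm)
            optimizer.step()
            total += loss.item() * xb.size(0)
        scheduler.step()
        history.append(total / len(loader.dataset))
        print(f"epoch {epoch + 1}: loss={history[-1]:.4f}, "
              f"next lr={scheduler.get_last_lr()[0]:.5f}")
    return history, scheduler.get_last_lr()[0]

# Toy data and model, only to show the loop running end to end
torch.manual_seed(0)
toy_x = torch.randn(64, 4, 32)
toy_y = torch.randint(0, 2, (64,))
toy_loader = DataLoader(TensorDataset(toy_x, toy_y), batch_size=16, shuffle=True)
toy_net = nn.Sequential(nn.Flatten(), nn.Linear(4 * 32, 2))
history, final_lr = train_sketch(toy_net, toy_loader)
```

Note the order inside the batch loop: gradients are clipped after `loss.backward()` and before `optimizer.step()`, and the scheduler steps once per epoch.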


In [ ]:
from ecog_eeg_dl_classifier.training.trainer import train_model
from ecog_eeg_dl_classifier.data.simulators.synthetic_signal import generate_synthetic_data
from torch.utils.data import DataLoader, TensorDataset
import torch

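# Prepare data loader. Cast inputs to float32 and labels to int64 (assuming the
# generator returns NumPy arrays), which PyTorch conv layers and class-index losses expect.
data, labels = generate_synthetic_data(n_samples=200, n_channels=4, signal_length=256)
dataset = TensorDataset(
    torch.as_tensor(data, dtype=torch.float32),
    torch.as_tensor(labels, dtype=torch.long),
)
loader = DataLoader(dataset, batch_size=16, shuffle=True)  # reshuffle every epoch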

# Train for 3 epochs
train_model(model, loader, num_epochs=3, lr=0.005, device="cpu")

> **Try it:**
> - Switch to `torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)`
> - Add a scheduler: `torch.optim.lr_scheduler.StepLR`
> - Implement early stopping by tracking validation loss
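
For the early-stopping exercise, one common pattern is a small helper that remembers the best validation loss and counts epochs without improvement. The `EarlyStopping` class below is a hypothetical sketch, and the validation losses are made-up numbers used only to exercise the logic.

```python
class EarlyStopping:
    """Stop when validation loss has not improved for `patience` epochs."""
    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Illustrative validation losses, one per epoch
val_losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.64, 0.70]

stopper = EarlyStopping(patience=2)
stopped_at = None
for epoch, val_loss in enumerate(val_losses, start=1):
    if stopper.step(val_loss):
        stopped_at = epoch
        break

print(f"stopped at epoch {stopped_at}, best loss {stopper.best}")  # epoch 5, best 0.65
```

In a real loop you would compute `val_loss` on a held-out loader after each epoch and save the model weights whenever `stopper.best` improves.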

## 📈 4. Evaluate and Debug Model Behavior

Use `dashboards/metrics_dashboard.py` to visualize performance.


In [ ]:
from ecog_eeg_dl_classifier.dashboards.metrics_dashboard import plot_metrics
from sklearn.metrics import roc_curve, auc, classification_report
import matplotlib.pyplot as plt
import torch

# Collect predictions. This reuses the training loader, so the metrics below are
# optimistic; use a held-out set for a real evaluation.
y_true, y_pred, y_score = [], [], []
model.eval()
with torch.no_grad():
    for xb, yb in loader:
        logits = model(xb)
        preds = torch.argmax(logits, dim=1).numpy()
        scores = torch.softmax(logits, dim=1)[:, 1].numpy()  # probability of class 1
        y_true.extend(yb.numpy())
        y_pred.extend(preds)
        y_score.extend(scores)

# Confusion matrix & accuracy
plot_metrics(y_true, y_pred)

# ROC Curve
fpr, tpr, _ = roc_curve(y_true, y_score)
roc_auc = auc(fpr, tpr)
print(f"AUC: {roc_auc:.2f}")
plt.figure()
plt.plot(fpr, tpr, label=f"ROC (area = {roc_auc:.2f})")
plt.plot([0,1],[0,1],"--")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
plt.legend()
plt.show()

> **Extend:** Compute F1-score or per-class precision/recall using `classification_report()`.
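
As a starting point for this extension, the sketch below calls `classification_report()` and `f1_score()` on a handful of hand-written labels. The `*_demo` lists are invented for illustration; in the notebook you would pass the `y_true` and `y_pred` collected above.

```python
from sklearn.metrics import classification_report, f1_score

# Hand-written labels, for illustration only
y_true_demo = [0, 0, 1, 1, 1, 0]
y_pred_demo = [0, 1, 1, 1, 0, 0]

# Per-class precision, recall, F1, and support in one table
print(classification_report(y_true_demo, y_pred_demo, digits=3))

# F1 for the positive class (label 1) and the unweighted mean over classes
f1_pos = f1_score(y_true_demo, y_pred_demo)
f1_macro = f1_score(y_true_demo, y_pred_demo, average="macro")
print(f"F1 (class 1): {f1_pos:.3f}, macro F1: {f1_macro:.3f}")
```

Here each class has two correct predictions, one false positive, and one false negative, so precision, recall, and F1 all come out to 2/3.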

## 🔄 5. Iterate with Realistic Testing

Test real-time inference to see how the model handles streaming data.


In [ ]:
from ecog_eeg_dl_classifier.realtime_integration.data_stream_handler import stream_simulated_signal

# Stream and predict every 2 seconds
# Uncomment to run:
# stream_simulated_signal(model, freq=2)

> **Explore:** Change signal `freq` or add Gaussian noise in the stream handler to stress-test the model.
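
To reason about streaming without the package's handler, it helps to see the buffering logic on its own. The sketch below is not how `stream_simulated_signal` works internally (that is not shown here); `stream_windows` and `dummy_predict` are hypothetical names, and the predictor is a placeholder for a call to the model.

```python
import numpy as np

def stream_windows(chunks, window_size, hop):
    """Yield (channels, window_size) windows from a stream of (channels, n) chunks."""
    buffer = None
    for chunk in chunks:
        buffer = chunk if buffer is None else np.concatenate([buffer, chunk], axis=1)
        while buffer.shape[1] >= window_size:
            yield buffer[:, :window_size].copy()
            buffer = buffer[:, hop:]          # slide forward by `hop` samples

def dummy_predict(window):
    """Placeholder for model inference; returns 0 or 1."""
    return int(window.mean() > 0)

rng = np.random.default_rng(0)
signal = rng.standard_normal((4, 1024))                       # 4 channels
chunks = (signal[:, i:i + 64] for i in range(0, 1024, 64))    # 64-sample packets

windows = []
for i, window in enumerate(stream_windows(chunks, window_size=256, hop=128)):
    windows.append(window)
    print(f"window {i}: shape={window.shape}, prediction={dummy_predict(window)}")
```

With 1024 samples, a 256-sample window, and a hop of 128, this yields 7 overlapping windows. A smaller hop gives more frequent predictions at a higher compute cost.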

## 🧪 Next Steps & Experimentation
- Modify data generator: add bandpass filtering or different waveforms
- Build deeper or alternative architectures: RNN/LSTM, 1D-Transformer
- Integrate preprocessing pipelines: use SciPy for filtering
- Deploy to SageMaker endpoint and measure latency
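
For the filtering ideas above, a zero-phase Butterworth bandpass from SciPy is a reasonable first experiment. In the sketch below, the helper name `bandpass`, the 256 Hz sampling rate, and the 8 to 13 Hz band are assumptions chosen for illustration; the test signal mixes a 10 Hz component with 60 Hz interference.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass(x, low_hz, high_hz, fs, order=4):
    """Zero-phase Butterworth bandpass applied along the last (time) axis."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x, axis=-1)

fs = 256.0
t = np.arange(512) / fs                                   # 2 seconds of samples
raw = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 60 * t)
X_demo = np.tile(raw, (1, 4, 1))                          # (batch, channels, time)

filtered = bandpass(X_demo, low_hz=8.0, high_hz=13.0, fs=fs)

# With 512 samples at 256 Hz, FFT bins are 0.5 Hz apart: 10 Hz -> bin 20, 60 Hz -> bin 120
spectrum = np.abs(np.fft.rfft(filtered[0, 0]))
print(filtered.shape)
print(f"10 Hz magnitude: {spectrum[20]:.1f}, 60 Hz magnitude: {spectrum[120]:.4f}")
```

Second-order sections (`output="sos"`) are generally more numerically stable than transfer-function coefficients for narrow bands, and `sosfiltfilt` filters forward and backward so the output is not delayed relative to the input.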

Happy learning!