# SparcNet Smoke Test (PyHealth 2.0)

This notebook uses the built-in `SleepEDFDataset` and `SleepStagingSleepEDF` task to load a small local `SleepEDF_demo` folder, run a forward-pass smoke test on the `SparcNet` model, and then train it for one epoch to confirm the pipeline runs end to end.

## Setup

The `SleepEDF_demo/` folder must have the following structure:
```
SleepEDF_demo/
  SC-subjects.xls              # from https://physionet.org/files/sleep-edfx/1.0.0/SC-subjects.xls
  sleepedf-cassette-pyhealth.csv  # auto-generated, then filtered to demo subjects only
  sleep-cassette/
    SC4001E0-PSG.edf
    SC4001EC-Hypnogram.edf
    SC4002E0-PSG.edf
    SC4002EC-Hypnogram.edf
    SC4011E0-PSG.edf
    SC4011EH-Hypnogram.edf
```
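Before loading the dataset, it can help to confirm that the folder matches the listing above. The following is a minimal sketch using only the standard library; `EXPECTED_FILES` and `missing_demo_files` are hypothetical names introduced here for illustration and are not part of PyHealth.

```python
from pathlib import Path

# Relative paths copied from the folder listing above.
EXPECTED_FILES = [
    "SC-subjects.xls",
    "sleepedf-cassette-pyhealth.csv",
    "sleep-cassette/SC4001E0-PSG.edf",
    "sleep-cassette/SC4001EC-Hypnogram.edf",
    "sleep-cassette/SC4002E0-PSG.edf",
    "sleep-cassette/SC4002EC-Hypnogram.edf",
    "sleep-cassette/SC4011E0-PSG.edf",
    "sleep-cassette/SC4011EH-Hypnogram.edf",
]


def missing_demo_files(root):
    """Return the expected relative paths that are not regular files under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

Calling `missing_demo_files("../SleepEDF_demo")` returns an empty list when every listed file is present, and otherwise names the files that still need to be downloaded or generated.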


In [7]:
import torch
from pathlib import Path
from pyhealth.datasets import SleepEDFDataset, get_dataloader
from pyhealth.models.sparcnet import SparcNet

## Load SleepEDF Dataset and Set Task

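Load the SleepEDF cassette subset and apply the default `SleepStagingSleepEDF` task, which extracts 30-second signal epochs labeled with sleep stages (W, N1, N2, N3, N4, R). In this demo each sample has shape `(7, 3000)`, that is, 7 channels by 3000 time steps, and the six stages are encoded as integer labels 0 to 5.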

In [8]:
sleepedf_root = str(Path("../SleepEDF_demo").resolve())

dataset = SleepEDFDataset(root=sleepedf_root, subset="cassette")
task_dataset = dataset.set_task()

print(f"Total samples: {len(task_dataset)}")
sample = task_dataset[0]
print(f"Sample keys: {list(sample.keys())}")
print(f"Signal shape: {tuple(sample['signal'].shape)}")
print(f"Label classes: {sorted(set(int(task_dataset[i]['label']) for i in range(len(task_dataset))))}")

No config path provided, using default config
Initializing sleepedf dataset from /Users/joshuachen/Projects/PyHealth/SleepEDF_demo (dev mode: False)
Setting task SleepStaging for sleepedf base dataset...
No cache_dir provided. Using default cache dir: /Users/joshuachen/Library/Caches/pyhealth/721549bc-455a-5885-9f13-ed4c6f3f4da0
Found cached processed samples at /Users/joshuachen/Library/Caches/pyhealth/721549bc-455a-5885-9f13-ed4c6f3f4da0/tasks/SleepStaging_b6b1d496-535e-5188-b3a5-3e9c6731405a/samples_47c27255-9fc0-5271-bd99-638ffecdb1cc.ld, skipping processing.
Total samples: 8281
Sample keys: ['patient_id', 'night', 'patient_age', 'patient_sex', 'signal', 'label']
Signal shape: (7, 3000)
Label classes: [0, 1, 2, 3, 4, 5]
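The `(7, 3000)` signal shape is what 30-second epochs look like at 100 samples per second. The sketch below illustrates that arithmetic on a toy recording. The 100 Hz rate is an assumption inferred from the shape, and `split_into_epochs` is a hypothetical helper, not the windowing code that the PyHealth task actually uses.

```python
SAMPLING_RATE_HZ = 100  # assumption: inferred from 3000 samples per 30 s epoch
EPOCH_SECONDS = 30
SAMPLES_PER_EPOCH = SAMPLING_RATE_HZ * EPOCH_SECONDS  # 3000


def split_into_epochs(recording, samples_per_epoch=SAMPLES_PER_EPOCH):
    """Cut a recording into fixed-length epochs.

    recording: list of channels, each a list of samples of equal length.
    Returns a list of epochs, each shaped (n_channels, samples_per_epoch).
    A trailing partial epoch is dropped.
    """
    n_samples = len(recording[0])
    n_epochs = n_samples // samples_per_epoch
    return [
        [ch[i * samples_per_epoch:(i + 1) * samples_per_epoch] for ch in recording]
        for i in range(n_epochs)
    ]


# Toy recording: 7 channels, 65 seconds of zeros.
toy_recording = [[0.0] * (65 * SAMPLING_RATE_HZ) for _ in range(7)]
toy_epochs = split_into_epochs(toy_recording)
```

Here 65 seconds of signal yields two full epochs, and the remaining 5 seconds are discarded.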


## Initialize Model and Forward Pass

Create a dataloader, initialize SparcNet from the dataset, and run a forward pass to verify the model produces loss and class probabilities.

In [9]:
loader = get_dataloader(task_dataset, batch_size=8, shuffle=False)
batch = next(iter(loader))

print("batch keys:", list(batch.keys()))
print("batch['signal'] shape:", tuple(batch["signal"].shape))
print("batch['label'] shape:", tuple(batch["label"].shape))

model = SparcNet(dataset=task_dataset)

with torch.no_grad():
    ret = model(**batch)

print("loss:", float(ret["loss"]))
print("y_prob shape:", tuple(ret["y_prob"].shape))

batch keys: ['patient_id', 'night', 'patient_age', 'patient_sex', 'signal', 'label']
batch['signal'] shape: (8, 7, 3000)
batch['label'] shape: (8,)

=== Input data statistics ===
n_channels: 7
length: 3000
loss: 1.8694617748260498
y_prob shape: (8, 6)
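A quick way to judge whether the initial loss is plausible is to compare it with the cross-entropy of a uniform prediction over six classes, which is ln(6), about 1.79. The sketch below computes that baseline in plain Python; it assumes the model's loss is a standard softmax cross-entropy, which the notebook output does not state explicitly.

```python
import math

NUM_CLASSES = 6


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability assigned to the true label."""
    return -math.log(softmax(logits)[label])


# Equal logits give a uniform distribution, so the loss is ln(NUM_CLASSES).
uniform_loss = cross_entropy([0.0] * NUM_CLASSES, label=2)
```

An untrained model's loss of roughly 1.87 sits close to this baseline, which is what one would expect when the initial predictions carry little information.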


## Create Train/Val/Test Data Loaders

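Create data loaders for training, validation, and evaluation. All three wrap the full `task_dataset` (no split is performed), which is sufficient for a smoke test but means the validation and test batches contain training samples.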
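For anything beyond a smoke test, a held-out split is usually made at the patient level so that epochs from one subject never appear in both training and evaluation. The sketch below shows one way to derive such index lists from the `patient_id` field of each sample. `split_indices_by_patient` is a hypothetical helper written for illustration; it is not a PyHealth function, and how the resulting indices are turned into loaders depends on the PyHealth version in use.

```python
import random


def split_indices_by_patient(patient_ids, ratios=(0.6, 0.2, 0.2), seed=0):
    """Group sample indices into train/val/test so no patient spans two splits.

    patient_ids: one patient id per sample, in dataset order.
    ratios: fractions of *patients* (not samples) for train, val, test.
    """
    patients = sorted(set(patient_ids))
    random.Random(seed).shuffle(patients)

    n = len(patients)
    train_end = int(n * ratios[0])
    val_end = int(n * (ratios[0] + ratios[1]))
    groups = {
        "train": set(patients[:train_end]),
        "val": set(patients[train_end:val_end]),
        "test": set(patients[val_end:]),
    }
    return {
        name: [i for i, pid in enumerate(patient_ids) if pid in members]
        for name, members in groups.items()
    }


# Toy example with three patients, mirroring the three demo subjects.
toy_ids = ["a"] * 5 + ["b"] * 3 + ["c"] * 4
toy_split = split_indices_by_patient(toy_ids)
```

With only three subjects, the default ratios place one patient in each split, so the split sizes vary with how many epochs each patient contributes.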

In [11]:
train_loader = get_dataloader(task_dataset, batch_size=32, shuffle=True)
val_loader = get_dataloader(task_dataset, batch_size=8, shuffle=False)
test_loader = get_dataloader(task_dataset, batch_size=8, shuffle=False)

print("train batches:", len(train_loader))
print("val batches:", len(val_loader))
print("test batches:", len(test_loader))

train batches: 259
val batches: 1036
test batches: 1036
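The batch counts follow directly from the dataset size: with 8281 samples, the last batch is partial and still counted. A small sketch of that arithmetic, assuming the loader keeps the final partial batch (the `drop_last` flag here mirrors the PyTorch `DataLoader` option of the same name):

```python
def num_batches(n_samples, batch_size, drop_last=False):
    """Number of batches a loader yields for n_samples items."""
    if drop_last:
        return n_samples // batch_size
    # Ceiling division without floating point.
    return -(-n_samples // batch_size)


train_batches = num_batches(8281, 32)  # 258 full batches + 1 batch of 25
eval_batches = num_batches(8281, 8)    # 1035 full batches + 1 batch of 1
```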


## Train and Evaluate

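Train SparcNet for one epoch and evaluate with `test_loader`. Because `test_loader` wraps the same samples used for training, the reported metrics confirm that the pipeline runs end to end rather than measure generalization.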

In [12]:
from pyhealth.trainer import Trainer

trainer = Trainer(model=model, enable_logging=True)
trainer.train(
    train_dataloader=train_loader,
    val_dataloader=val_loader,
    epochs=1,
)

metrics = trainer.evaluate(test_loader)
print("Trainer eval metrics:", metrics)

SparcNet(
  (encoder): Sequential(
    (conv0): Conv1d(7, 8, kernel_size=(7,), stride=(2,), padding=(3,))
    (norm0): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (elu0): ELU(alpha=1.0)
    (pool0): MaxPool1d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (denseblock1): DenseBlock(
      (denselayer1): DenseLayer(
        (norm1): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (elu1): ELU(alpha=1.0)
        (conv1): Conv1d(8, 256, kernel_size=(1,), stride=(1,))
        (norm2): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (elu2): ELU(alpha=1.0)
        (conv2): Conv1d(256, 16, kernel_size=(3,), stride=(1,), padding=(1,))
      )
      (denselayer2): DenseLayer(
        (norm1): BatchNorm1d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (elu1): ELU(alpha=1.0)
        (conv1): Conv1d(24, 256, kernel_size=(1,), stride

Epoch 0 / 1:   0%|          | 0/259 [00:00<?, ?it/s]

--- Train epoch-0, step-259 ---
loss: 1.3366


Evaluation: 100%|██████████| 1036/1036 [01:02<00:00, 16.62it/s]

--- Eval epoch-0, step-259 ---
accuracy: 0.6929
f1_macro: 0.1364
f1_micro: 0.6929
loss: 0.9790



Evaluation: 100%|██████████| 1036/1036 [01:03<00:00, 16.41it/s]

Trainer eval metrics: {'accuracy': 0.6929114841202754, 'f1_macro': 0.13643388734336734, 'f1_micro': 0.6929114841202754, 'loss': 0.9789685920082234}
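The gap between accuracy (0.69) and macro F1 (0.14) is worth a closer look. The reported values are consistent with a model that assigns every sample to a single class covering 5738 of the 8281 samples. That count is inferred here from the accuracy and is not printed by the notebook, and which class it would be is unknown. The sketch below reproduces both metrics under that assumption.

```python
N_SAMPLES = 8281
NUM_CLASSES = 6
MAJORITY_COUNT = 5738  # assumption: inferred so that 5738 / 8281 matches the accuracy


def constant_predictor_metrics(n_samples, majority_count, num_classes):
    """Accuracy and macro F1 when every sample is assigned to one class."""
    tp = majority_count              # predicted class, truly that class
    fp = n_samples - majority_count  # predicted class, truly another class
    fn = 0                           # the predicted class is never missed
    f1_predicted_class = 2 * tp / (2 * tp + fp + fn)
    accuracy = tp / n_samples
    # Every other class is never predicted, so its F1 is 0.
    macro_f1 = f1_predicted_class / num_classes
    return accuracy, macro_f1


accuracy, macro_f1 = constant_predictor_metrics(
    N_SAMPLES, MAJORITY_COUNT, NUM_CLASSES
)
```

If the assumption holds, one epoch on this demo has mostly taught the model the dominant class, and inspecting the predicted label distribution would confirm it.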



