Modular self-supervised learning for sleep signals
Models and losses live in YAML; training hyperparameters stay on the CLI.
Quick links · configs/ recipes · sleep2vec/ source · data/ datasets & loaders · preprocess/ caching scripts · utils/ helpers
- Refactored training flow: YAML defines architecture & loss, CLI sets training hyperparameters (epochs, lr, devices, etc.).
- Supports contrastive pretraining, staged modality adaptation for new sensors, plus downstream classification or regression finetuning.
- Extensible registries for backbones, tokenizers, projection heads, losses, model averaging, LoRA-backed heads, and downstream heads.
- Dataset channel names and per-token input widths now come from YAML model.channels, so custom modalities such as wearable ppg or actigraphy_vm can be added without editing the dataset registry.
- WandB logging is enabled by default; inference-only runner is included for evaluating checkpoints and writing metrics, predictions, overview rows, and run manifests.
- Python 3.10+ with CUDA GPUs recommended; PyTorch/Lightning versions are pinned in requirements.txt (torch==2.7.0, pytorch-lightning==2.6.1).
- Install: pip install -r requirements.txt (choose the correct PyTorch wheel for your CUDA version).
- Pair-accuracy heatmap logging uses matplotlib + seaborn (already included in requirements.txt).
- Authenticate to Weights & Biases before running (WANDB_API_KEY=... or WANDB_MODE=offline) because entrypoints call wandb.login().
- Default precision is bf16/bf16-mixed; pass --precision 32 if your GPUs do not support bf16.
- Main entrypoints: python -m sleep2vec.pretrain ..., python -m sleep2vec.adapt --phase stage1|stage2 ..., python -m sleep2vec.finetune ..., python -m sleep2vec.infer ....
- Minimal checked examples live under configs/examples/: one pretrain example plus built-in finetune examples for stage3, stage4, stage5, ahi, sex, and age. Validate configs with python utils/check_configs.py [paths...].
- Index CSV (used by pretrain/finetune): required columns path, split (train|val|test), and duration (seconds). age and sex are optional for stage/AHI-only workflows, but built-in age/sex tasks require valid labels after split/source filtering; generate those presets from indexes carrying the corresponding real column. Optional extra label columns (e.g., disease flags) are consumed when meta_data_names is set. For the built-in sex task, the normalized contract is sex=female|male, encoded as 0=female, 1=male. If your source metadata uses sexM, convert it to sex with 1 -> male, 0 -> female during preprocessing.
- NPZ contents per row: every non-label key used at runtime must be declared in YAML model.channels with a matching name and input_dim (frames per token). Built-in examples include heartbeat, breath, eeg_original, ecg_original, eog_original, emg_original, spo2, resp_original, and resp_nasal_original; this branch also ships wearable examples for ppg and actigraphy. In this repo, actigraphy stores vector magnitude (VM). stage5 remains a special per-token label channel and always uses width 1. Built-in ahi additionally requires a flat 1 Hz ah_event array plus scalar NPZ keys ahi and tst.
- Preset pickles: both CLIs expect a precomputed pickle of SampleIndex objects (see preprocess/save_dataset_presets.py). Point --pretrain-preset-path / YAML data.finetune_preset_path to an existing pickle; these scripts do not fall back to CSV when a path is provided. Preset generation now requires a YAML config so the script can resolve channel names and input_dim values from model.channels.
- preprocess/save_dataset_presets.py also honors an optional top-level preset_build block. Use it when preset validation must differ from runtime input modalities, for example token-level PPG staging should validate both ppg and stage5.
- preset_build.required_channels is the YAML source of truth for preset validation channels; do not combine it with CLI --channels.
- preset_build must define both required_channels and min_channels; preset_build.min_channels overrides CLI --min-channels for preset generation only.
- To build presets, run:

python preprocess/save_dataset_presets.py \
--config configs/sleep2vec_dense_pretrain.yaml \
--index /path/to/index.csv \
--dataset-name shhs \
--n-tokens 1535 \
--split train val test

Optional flags: --channels eeg_original ecg_original, --meta-data-names hypertension diabetes, --include-no-metadata, --output-template 'data/{dataset}_{split}_preset_{tokens}{meta_suffix}.pickle', --dry-run, --overwrite. Explicit --meta-data-names values remain strict: the CSV must contain each requested metadata column except built-in AHI summaries, which come from NPZ.
- --channels is now an optional ordered subset of YAML model.channels; any requested channel that is missing from the YAML or lacks input_dim fails fast.
- Missing-channel pretrain: if you enable --allow-missing-channels, presets must carry payload["available_channels"] (auto-populated during preset creation) so the bucketed sampler can group by montage.
- WatchPAT .zzp conversion: preprocess/watchpat_zzp_to_edf.py converts a WatchPAT archive (Sleep.dat, Patient.dat, log.dat) into EDF for downstream inspection or external preprocessing. Example:

python preprocess/watchpat_zzp_to_edf.py \
/path/to/study.zzp \
/path/to/study.edf \
--writer auto \
--json-summary /path/to/study_summary.json

Batch conversion example:

python preprocess/watchpat_zzp_to_edf.py \
/path/to/zzp_dir \
/path/to/edf_dir \
--recursive \
--skip-existing

Batch mode shows a file-level tqdm progress bar. Optional flags: --include-internal-1hz, --no-pulse-rate, --verbose, --json-summary /path/to/summary_dir. pyedflib is used when available; otherwise the script falls back to its built-in manual EDF writer.
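The index CSV contract described above (required columns, allowed splits, and the sexM to sex conversion) can be checked before building presets. The following is a minimal sketch, not the repository's own validator: the function name validate_index_rows, the sample paths, and the rule that duration must be positive are assumptions made for illustration.

```python
import csv
import io

REQUIRED_COLUMNS = ("path", "split", "duration")
VALID_SPLITS = {"train", "val", "test"}
# Conversion stated in the text for source metadata that uses a sexM column.
SEXM_TO_SEX = {"1": "male", "0": "female"}


def validate_index_rows(rows):
    """Check required columns and normalize an optional sexM column to sex."""
    cleaned = []
    for line_no, row in enumerate(rows, start=2):  # line 1 is the CSV header
        missing = [c for c in REQUIRED_COLUMNS if not row.get(c)]
        if missing:
            raise ValueError(f"line {line_no}: missing {missing}")
        if row["split"] not in VALID_SPLITS:
            raise ValueError(f"line {line_no}: bad split {row['split']!r}")
        # Assumption: a usable recording has a positive duration in seconds.
        if float(row["duration"]) <= 0:
            raise ValueError(f"line {line_no}: duration must be positive")
        row = dict(row)
        if "sexM" in row and "sex" not in row:
            row["sex"] = SEXM_TO_SEX[row.pop("sexM").strip()]
        cleaned.append(row)
    return cleaned


# Hypothetical two-row index used only to exercise the checks.
sample_csv = io.StringIO(
    "path,split,duration,sexM\n"
    "/data/a.npz,train,28800,1\n"
    "/data/b.npz,val,27000,0\n"
)
index_rows = validate_index_rows(csv.DictReader(sample_csv))
print([row["sex"] for row in index_rows])  # ['male', 'female']
```

Running a check like this before preprocess/save_dataset_presets.py surfaces malformed rows early, since the training scripts do not fall back to the CSV once a preset path is given.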
Kaldi storage uses manifest.json as the preset-equivalent artifact. Do not pass legacy NPZ preset pickles with backend: kaldi; pre-windowed per-sample/channel matrices are read by sample_key, and token_start is preserved in the split manifests for downstream aggregation.
Pretrain conversion should use model input channels only unless you intentionally want label-like channels in contrastive training:
python -m preprocess.convert_npz_to_kaldi \
--index /path/to/index.csv \
--config configs/sleep2vec_dense_pretrain.yaml \
--output-dir /data/sleep2vec_kaldi/pretrain_120 \
--max-tokens 120 \
--stride-tokens 120 \
--channels-from-config

For heterogeneous datasets, add --allow-missing-channels --min-channels 2 during conversion and training so pair-first sampling uses available_channels from the manifest:
python -m sleep2vec.pretrain \
--config configs/sleep2vec_dense_pretrain.yaml \
--version-name kaldi-pretrain-120 \
--data-backend kaldi \
--kaldi-data-root /data/sleep2vec_kaldi/pretrain_120 \
--kaldi-manifest /data/sleep2vec_kaldi/pretrain_120/manifest.json \
--pretrain-preset-path null \
--allow-missing-channels --min-channels 2

Finetune and inference select Kaldi from the YAML data block:
data:
  backend: kaldi
  kaldi_data_root: /data/sleep2vec_kaldi/ppg_stage5_1535
  kaldi_manifest: /data/sleep2vec_kaldi/ppg_stage5_1535/manifest.json
  finetune_preset_path: null

Convert finetune roots with model channels plus required label channels. For stage3, stage4, or stage5, include stage5; for ahi, include both ahi and stage5, and ensure manifest rows contain scalar ahi and tst metadata. Match the converter windowing to the finetune config; current whole-night PPG configs use max_tokens: 1535, so convert with --max-tokens 1535 --stride-tokens 0:
python -m preprocess.convert_npz_to_kaldi \
--index /path/to/index.csv \
--config configs/ppg_stage5_finetune.yaml \
--output-dir /data/sleep2vec_kaldi/ppg_stage5_1535 \
--max-tokens 1535 \
--stride-tokens 0 \
--channels-from-config \
--extra-channels stage5
python -m preprocess.convert_npz_to_kaldi \
--index /path/to/index.csv \
--config configs/ppg_ahi_finetune.yaml \
--output-dir /data/sleep2vec_kaldi/ppg_ahi_1535 \
--max-tokens 1535 \
--stride-tokens 0 \
--channels-from-config \
--extra-channels ahi stage5

Inference reuses the same Kaldi root and manifest windowing as the checkpoint's finetune configuration. Keep --avg-ckpts 1 for built-in ahi, because its evaluation threshold is checkpoint-specific.
Update the dataset paths in the commands below (--pretrain-data-index, --pretrain-preset-path, and YAML data.finetune_*) to point to your own CSV/pickle files.
python -m sleep2vec.pretrain \
--config configs/sleep2vec_dense_pretrain.yaml \
--pretrain-data-index /path/to/index.csv \
--pretrain-preset-path /path/to/pretrain_cache.pkl \
--version-name exp001 \
--epochs 120 --lr 5e-5 --batch-size 320 \
--devices 0 1 --num-workers 8

Optional:
- --warmup-steps N to override the default LR warmup (3% of total steps).
- --pretrained-backbone-path /path/to/base.ckpt to initialize the pretrain model from an existing checkpoint. Loader prefers ema_model. weights and falls back to model.; if --ckpt-path is also set, Lightning resume takes precedence.
- --allow-missing-channels to accept samples with missing channels; pair with --min-channels and (recommended) --bucket-by-available-channels.
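The prefix preference described for --pretrained-backbone-path (ema_model. first, then model.) can be pictured as a filter over checkpoint keys. This is a hedged sketch using plain dictionaries, not the repository's loader: the function name extract_backbone_state and the example keys are hypothetical.

```python
def extract_backbone_state(state_dict, prefixes=("ema_model.", "model.")):
    """Return (prefix, sub-dict) for the first prefix that matches any key.

    Keys in the returned sub-dict have the matched prefix stripped.
    """
    for prefix in prefixes:
        subset = {
            key[len(prefix):]: value
            for key, value in state_dict.items()
            if key.startswith(prefix)
        }
        if subset:
            return prefix, subset
    raise KeyError(f"no keys start with any of {prefixes}")


# Hypothetical checkpoints: one with averaged weights, one without.
ckpt_with_ema = {"model.encoder.w": 1.0, "ema_model.encoder.w": 0.9}
ckpt_plain = {"model.encoder.w": 1.0, "optimizer.step": 10}

print(extract_backbone_state(ckpt_with_ema))  # ('ema_model.', {'encoder.w': 0.9})
print(extract_backbone_state(ckpt_plain))     # ('model.', {'encoder.w': 1.0})
```

Note that a key such as ema_model.encoder.w does not start with model., so the fallback never picks up averaged weights by accident.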
Example (missing-channel pretrain):
python -m sleep2vec.pretrain \
--config configs/sleep2vec_dense_pretrain.yaml \
--pretrain-data-index /path/to/index.csv \
--pretrain-preset-path /path/to/pretrain_cache.pkl \
--version-name exp001-missing-chn \
--epochs 120 --lr 5e-5 --batch-size 320 \
--devices 0 1 --num-workers 8 \
--allow-missing-channels --min-channels 6 --bucket-by-available-channels

sleep2vec.adapt runs a two-phase pretrain-style adaptation loop for newly introduced modalities while reusing an existing backbone checkpoint. The YAML must include a top-level adapt: block and the adapt.new_channels names must also appear in model.channels.
Stage 1: train only the new-modality tokenizers (and optionally the shared projection head).
python -m sleep2vec.adapt \
--config configs/sleep2vec_dense_adapt_ppg_actigraphy.yaml \
--phase stage1 \
--pretrained-backbone-path /path/to/base_pretrain.ckpt \
--pretrain-data-index /path/to/index.csv \
--pretrain-preset-path /path/to/wearable_cache.pkl \
--version-name wearable-v1 \
--epochs 40 --lr 5e-5 --batch-size 256 \
--devices 0 1

Stage 2: initialize from the stage-1 checkpoint, unfreeze the encoder/legacy tokenizers, and anneal the training pair distribution back toward legacy pairs.
python -m sleep2vec.adapt \
--config configs/sleep2vec_dense_adapt_ppg_actigraphy.yaml \
--phase stage2 \
--pretrained-backbone-path /path/to/stage1.ckpt \
--version-name wearable-v1 \
--epochs 60 --lr 2e-5 --batch-size 256 \
--devices 0 1

Notes:
- Provided example configs: configs/sleep2vec_dense_adapt_ppg_actigraphy.yaml and configs/sleep2vec_dense_adapt_ppg_actigraphy_cls.yaml.
- Adaptation defaults to missing-channel-aware pair-first sampling (--allow-missing-channels on, --train-pair-monitor-enable on) because wearable datasets often have heterogeneous sensor availability.
- adapt.stage2.pair_schedule is defined as training-progress fractions (until in (0, 1]) and must end at 1.0.
- Starting a fresh adapt stage1 run uses --pretrained-backbone-path with a base pretrain checkpoint.
- Starting stage2 uses --pretrained-backbone-path with a prior adapt stage1 checkpoint; this reuses the existing experiment directory and W&B id but writes new checkpoints under log-adapt/<run_name>/checkpoints.stage2, restarting optimizer/scheduler/epoch state from zero.
- Fresh stage2 transition refuses to reuse a non-empty checkpoints.stage2 directory; resume the old stage2 with --ckpt-path, or clear/move the old stage2 checkpoints first.
- Resuming an exact in-progress adapt run uses --ckpt-path; this is the only path that forwards a checkpoint into Lightning resume.
python -m sleep2vec.finetune \
--config configs/sleep2vec_dense_finetune_cls.yaml \
--label-name stage5 --results-csv-path outputs.csv \
--version-name exp001-stage5 \
--epochs 50 --lr 1e-5 --devices 0 1

Notes:
- Built-in sleep-staging labels are stage3, stage4, and stage5. They are all per-token sequence labeling tasks (is_seq=True) and use token-level downstream (model.cls.downstream: tokens).
- stage3 merges raw stage5 labels into W / NREM / REM; stage4 merges raw stage5 labels into W / N1N2 / N3 / REM.
- Do not add stage5 or ahi to data.data_channel_names; built-in sequence labels are loaded automatically into batch["tokens"][...] whenever --label-name is stage3, stage4, stage5, or ahi.
- Built-in sex classification is a metadata task with class order ["female", "male"], so targets are encoded as 0=female, 1=male; presets missing valid sex labels are rejected.
- --pretrained-backbone-path /path/to/pretrain_or_adapt.ckpt can be used to bootstrap downstream training from a pretrain/adaptation checkpoint; loader prefers ema_model. weights and falls back to model. weights.
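The stage3 and stage4 merges listed in the notes above amount to a lookup from five-class labels to coarser groups. The sketch below illustrates that mapping only; the integer order of the raw stage5 labels (0=W, 1=N1, 2=N2, 3=N3, 4=REM) is an assumption and should be checked against the actual label encoding in your NPZ files.

```python
# Assumed index order of raw stage5 labels; not confirmed by the repository.
STAGE5_NAMES = ("W", "N1", "N2", "N3", "REM")

# Groupings as described in the text.
STAGE3_GROUPS = {"W": "W", "N1": "NREM", "N2": "NREM", "N3": "NREM", "REM": "REM"}
STAGE4_GROUPS = {"W": "W", "N1": "N1N2", "N2": "N1N2", "N3": "N3", "REM": "REM"}


def merge_stages(stage5_labels, groups):
    """Map per-token stage5 integer labels to merged stage names."""
    return [groups[STAGE5_NAMES[label]] for label in stage5_labels]


night = [0, 1, 2, 3, 2, 4, 0]  # toy per-token hypnogram
print(merge_stages(night, STAGE3_GROUPS))
# ['W', 'NREM', 'NREM', 'NREM', 'NREM', 'REM', 'W']
print(merge_stages(night, STAGE4_GROUPS))
# ['W', 'N1N2', 'N1N2', 'N3', 'N1N2', 'REM', 'W']
```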
Imbalance controls in finetune YAML default to null / false in the provided recipes:
finetune:
  loss:
    class_weights: [1.0, 2.4]
    pos_weight: null
  sampler:
    weighted_random: true

- finetune.loss.class_weights is for single-label classification only; the list length must match finetune.task.output_dim.
- finetune.loss.pos_weight is for multilabel classification only; for built-in ahi, a scalar expands across the 30 BCE outputs.
- finetune.sampler.weighted_random affects only the train loader for binary non-sequence classification labels, such as sex or a custom binary metadata target.
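The two loss-weight rules above (class_weights length must equal output_dim; a scalar pos_weight expands across the 30 ahi outputs) can be expressed as small checks. This is an illustrative sketch, not the repository's parser: the function names are hypothetical, and accepting a full-length list for pos_weight is an assumption.

```python
AHI_OUTPUTS = 30  # built-in ahi: 30 BCE outputs per 30-second token


def check_class_weights(class_weights, output_dim):
    """Validate single-label class weights against the task output size."""
    if class_weights is None:
        return None
    if len(class_weights) != output_dim:
        raise ValueError(
            f"class_weights has {len(class_weights)} entries, "
            f"expected output_dim={output_dim}"
        )
    return [float(w) for w in class_weights]


def expand_pos_weight(pos_weight, n_outputs=AHI_OUTPUTS):
    """Broadcast a scalar pos_weight to one value per BCE output."""
    if pos_weight is None:
        return None
    if isinstance(pos_weight, (int, float)):
        return [float(pos_weight)] * n_outputs
    # Assumption: an explicit list is also allowed if its length matches.
    if len(pos_weight) != n_outputs:
        raise ValueError(f"pos_weight needs {n_outputs} entries")
    return [float(w) for w in pos_weight]


print(check_class_weights([1.0, 2.4], output_dim=2))  # [1.0, 2.4]
print(len(expand_pos_weight(3)))                      # 30
```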
python -m sleep2vec.finetune \
--config configs/sleep2vec_dense_finetune_reg.yaml \
--label-name age --results-csv-path outputs.csv \
--version-name exp001-age \
--epochs 50 --lr 1e-5 --devices 0 1

Custom metadata labels:
- Built-in age regression requires valid age metadata; stage/AHI-only presets that omit or carry dummy age/sex labels are not valid for --label-name age|sex.
- Set --label-name to the CSV column name (e.g., bmi) and add a finetune.task block in the YAML to define task semantics (type/output_dim/is_seq/monitor/monitor_mod).
- Use the same --label-name for sleep2vec.infer (required) when evaluating custom tasks.
- Token-level labels (is_seq: true) are only supported for built-in sequence labels (stage3, stage4, stage5, ahi) unless you extend the dataloader.
- Built-in ahi expects a flat 1 Hz NPZ array named ah_event; each 30-second token is reshaped into 30 binary labels and trained with sigmoid/BCE.
- Built-in ahi also requires scalar NPZ keys ahi and tst; final validation/test/infer metrics are not pointwise second-level metrics.
- The built-in ahi evaluator fits the event threshold on validation only, saves it in the checkpoint, reuses it for test/infer, and then uses split post-processing: event detection metrics (ahi_event_precision/recall/f1) still operate on merged + duration-filtered events, while scalar summary AHI (ahi_mae, ahi_pearson, ICC, and severity summaries) counts stage-filtered raw predicted positive runs without merge or min-duration filtering so it aligns with scalar ground-truth ahi. The scalar summary denominator remains NPZ tst, samples with TST < 2h are skipped for final summary metrics, and the monitored key remains val_ahi_pearson.
- Classification metrics include recall and specificity. For binary classification, specificity is class-1-vs-class-0 true-negative rate; for multiclass tasks it is macro one-vs-rest specificity. Stage aliases sens and spec remain macro metrics, including two-class stage collapses.
- Example YAMLs: configs/sleep2vec_dense_finetune_custom_reg.yaml, configs/sleep2vec_dense_finetune_custom_cls.yaml.
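The ah_event layout described above (a flat 1 Hz array, 30 binary labels per 30-second token) and the idea of counting raw positive runs can be illustrated in pure Python. This is a simplified sketch under stated assumptions: the function names are hypothetical, a trailing partial token is dropped, and stage filtering, thresholding, and the tst denominator are left out. Real code would more likely use a NumPy reshape.

```python
TOKEN_SECONDS = 30


def ah_event_to_token_labels(ah_event, token_seconds=TOKEN_SECONDS):
    """Split a flat 1 Hz binary sequence into rows of per-token labels."""
    # Assumption: seconds beyond the last full token are discarded.
    n_tokens = len(ah_event) // token_seconds
    return [
        list(ah_event[i * token_seconds:(i + 1) * token_seconds])
        for i in range(n_tokens)
    ]


def count_positive_runs(labels):
    """Count maximal runs of consecutive 1s in a flat binary sequence."""
    runs, previous = 0, 0
    for value in labels:
        if value == 1 and previous == 0:
            runs += 1
        previous = value
    return runs


# 75 seconds of toy data: one 15 s event and one 5 s event.
flat = [0] * 20 + [1] * 15 + [0] * 25 + [1] * 5 + [0] * 10
token_labels = ah_event_to_token_labels(flat)

print(len(token_labels), [sum(row) for row in token_labels])  # 2 [10, 5]
print(count_positive_runs(flat))                              # 2
```

The first event straddles the token boundary at second 30, which is why it contributes 10 positive seconds to the first token and 5 to the second.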
Note
--version-name is required for pretraining/adaptation run naming; downstream runs auto-generate a version when omitted. Ensure your YAML data.* paths point to real preset pickles.
Evaluate a fine-tuned checkpoint without training:
python -m sleep2vec.infer \
--config configs/sleep2vec_dense_finetune_cls.yaml \
--ckpt-path log-finetune/exp001-stage5/checkpoints/epoch=49.ckpt \
--label-name stage5 --batch-size 12 --devices 0 \
--inference-preset-path /path/to/test_preset_1535.pickle \
--eval-split test --results-csv-path outputs.csv

Use --override-dataset-names to test on a different dataset list than the YAML specifies.
Use --inference-preset-path to evaluate the same config/checkpoint against a different preset pickle without editing YAML; result CSV rows record the effective preset in preset_path.
Use the same --label-name that was used for fine-tuning; it is required.
To average checkpoints before inference, pass --avg-ckpts N (and --avg-ckpt-dir if --ckpt-path is best/last).
Use --pretrained-backbone-path if you want to preload a pretrain/adaptation initialization checkpoint before applying downstream weights.
Use --wandb to enable W&B logging during inference (needed for confusion matrix logging).
Inference writes a run-local output bundle under results/inference/<namespace>/<label>/<prediction_run_id>/:
- metrics__<label>__<split>__<ckpt_tag>.csv
- predictions__<label>__<split>__<ckpt_tag>.csv
- run_manifest.json
It also appends a cross-run summary to results/inference/overview.csv. When --wandb is set, inference logs the metrics plus prediction_row_count and uploads the metrics, predictions, manifest, and overview files as one inference-<prediction_run_id> artifact.
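For scripts that post-process inference results, the output layout above can be assembled from its parts. This is a small sketch of the naming pattern only, not code from the repository: the helper name inference_bundle_paths and the example namespace, run id, and checkpoint tag are hypothetical.

```python
from pathlib import PurePosixPath


def inference_bundle_paths(namespace, label, prediction_run_id, split,
                           ckpt_tag, root="results/inference"):
    """Build the documented output paths for one inference run."""
    root_dir = PurePosixPath(root)
    run_dir = root_dir / namespace / label / prediction_run_id
    stem = f"{label}__{split}__{ckpt_tag}"
    return {
        "metrics": run_dir / f"metrics__{stem}.csv",
        "predictions": run_dir / f"predictions__{stem}.csv",
        "manifest": run_dir / "run_manifest.json",
        "overview": root_dir / "overview.csv",  # shared across runs
    }


# Hypothetical identifiers, for illustration only.
paths = inference_bundle_paths("shhs", "stage5", "run-001", "test", "epoch49")
print(paths["metrics"])
# results/inference/shhs/stage5/run-001/metrics__stage5__test__epoch49.csv
```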
Backbone
- Register builders in sleep2vec/backbones/encoder_factory.py with @register_backbone.
- Select via YAML:

model:
  backbone:
    name: roformer
    hidden_size: 768
    num_hidden_layers: 12
    num_attention_heads: 16
    vocab_size: 1
    config_overrides: {}  # add custom kwargs (e.g., MoE routing)
Tokenizers
- Implement and register in sleep2vec/modules/tokenizers.py using @register_tokenizer("my_tokenizer").
- Set per-channel (tokenizer block must supply name and out_dim):

model:
  channels:
    - name: eeg_original
      input_dim: 3840
      tokenizer:
        name: my_tokenizer
        out_dim: 768  # must match across channels
        kwargs: {}
Projection Head
- Register in sleep2vec/modules/projection.py via @register_projection.
- Toggle or adjust:

model:
  projection:
    name: my_proj
    enabled: true
    hidden_dim: 768
    out_dim: 256
    kwargs: {}
Pretrain Loss
- Add implementations under sleep2vec/losses/ and register with @register_loss.
- Choose in YAML:

loss:
  name: my_loss
  temperature: 0.2
  params: {}
Downstream Head
- Heads live in sleep2vec/downstreams/heads/ and register via sleep2vec/downstreams/head_registry.py.
- YAML separates channel and temporal aggregation:

model:
  head:
    name: classification  # regression | temporal_conv | temporal_transformer
    dropout: 0.1
    hidden_dim: null
    channel_agg:
      name: gated_scalar  # mean | concat | gated_scalar
    temporal_agg:
      name: mean  # mean | attn

- Temporal aggregation modules are in sleep2vec/downstreams/temporal_aggregation/.
- Sequence heads like temporal_transformer accept a padding mask from the backbone to ignore padded tokens.
CLS vs Tokens (downstream representation)
model.cls controls (1) whether a learnable CLS token is added, and (2) what representation downstream heads consume:
model:
  cls:
    embedding_type: null  # null/none -> no CLS token; "bert" -> prepend learnable CLS token
    downstream: tokens    # "tokens" (token-level) or "cls" (sequence-level, non-seq tasks only)

- embedding_type: bert adds a BERT-style CLS token and exposes both cls_hidden and token_hidden.
- downstream: tokens uses token-level features (sequence tasks) or token pooling (non-seq tasks via model.head.temporal_agg).
- downstream: cls uses the CLS embedding for non-seq tasks and requires embedding_type: bert.
- For --label-name stage3, stage4, stage5, or ahi (is_seq=True), downstream is always token-level; if you set downstream: cls it will be ignored (a warning is logged).
- model.cls is currently required by the config parser. To disable CLS token usage, set embedding_type: null with downstream: tokens.
Layer Mix (downstream)
Learned scalar mix across transformer blocks (1..L). For sequence tasks, mixing is applied to token-level states; for non-seq tasks, each layer is pooled first and then mixed. Omit the block to disable, or set enabled: false.
finetune:
  layer_mix:
    enabled: true
    shared_across_modalities: false  # false -> per-modality weights
    layer_indices: [1, 6, 12]  # 1-based block indices; null -> all

Model Averaging
- Strategies live in sleep2vec/averagings/ (ema.py and running_mean.py included).
- Configure (omit the block entirely to disable):

model_averaging:
  name: ema
  params:
    enabled: true
    base_momentum: 0.996
    final_momentum: 1.0
    use_for_eval: true

- When use_for_eval: true, finetune/infer will evaluate with the averaged weights if present.
- Downstream loading can request averaged weights with use_ema="ema" when calling load_pretrained_backbone.
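To make the base_momentum and final_momentum parameters concrete, here is a generic exponential-moving-average update with a momentum ramp. This is a sketch of the general technique, not the code in sleep2vec/averagings/ema.py: the cosine shape of the ramp is an assumption (the source does not say how momentum moves from 0.996 to 1.0), and the function names are hypothetical.

```python
import math


def ema_momentum(step, total_steps, base_momentum=0.996, final_momentum=1.0):
    """Momentum at a given step, ramping from base to final.

    Assumption: a cosine ramp; the actual schedule may differ.
    """
    progress = min(max(step / max(total_steps, 1), 0.0), 1.0)
    cosine = (math.cos(math.pi * progress) + 1.0) / 2.0  # 1 -> 0 over training
    return final_momentum - (final_momentum - base_momentum) * cosine


def ema_update(ema_weights, student_weights, momentum):
    """Standard EMA step: ema <- m * ema + (1 - m) * student."""
    return {
        name: momentum * ema_weights[name] + (1.0 - momentum) * student_weights[name]
        for name in ema_weights
    }


print(ema_momentum(0, 100), ema_momentum(100, 100))
ema = ema_update({"w": 1.0}, {"w": 0.0}, momentum=0.9)
print(ema["w"])
```

With final_momentum at 1.0, the averaged weights stop moving by the end of training, which is why evaluating with them (use_for_eval: true) gives a stable snapshot.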
Adaptation Config
- sleep2vec.adapt reuses the pretrain model/loss schema and adds a top-level adapt block:

adapt:
  new_channels: [ppg, actigraphy_vm]
  stage1:
    train_shared_projection: false
  stage2:
    lr_scales:
      encoder: 0.1
      shared_legacy: 0.5
      new_modalities: 1.0
    pair_schedule:
      - until: 0.25
        new_pair_ratio: 1.0
      - until: 0.50
        new_pair_ratio: 0.7
      - until: 1.0
        new_pair_ratio: 0.0

- adapt.new_channels must be a non-empty subset of model.channels.
- Stage 1 freezes the encoder, CLS embedding, and legacy tokenizers; only the new modality tokenizers train, plus proj_head when train_shared_projection: true.
- Stage 2 restores training for encoder/CLS, shared projection, legacy tokenizers, and new tokenizers, with per-group LR scales from adapt.stage2.lr_scales.
- pair_schedule reallocates pair-first sampling mass toward pairs that include a new modality early in training, then anneals back toward the full pair set.
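The pair_schedule constraints stated in this README (each until lies in (0, 1] and the last entry ends at 1.0) can be checked with a few lines, and the schedule can be queried by training progress. This is a hedged sketch, not the repository's implementation: the function names are hypothetical, and both the strictly increasing order and the piecewise-constant lookup are assumptions about how the schedule is interpreted.

```python
def validate_pair_schedule(schedule):
    """Check the documented constraints on adapt.stage2.pair_schedule."""
    if not schedule:
        raise ValueError("pair_schedule must not be empty")
    previous = 0.0
    for entry in schedule:
        until = entry["until"]
        if not 0.0 < until <= 1.0:
            raise ValueError(f"until={until} is outside (0, 1]")
        # Assumption: segment boundaries are strictly increasing.
        if until <= previous:
            raise ValueError("until values must increase")
        if not 0.0 <= entry["new_pair_ratio"] <= 1.0:
            raise ValueError("new_pair_ratio must lie in [0, 1]")
        previous = until
    if schedule[-1]["until"] != 1.0:
        raise ValueError("pair_schedule must end at until=1.0")


def new_pair_ratio_at(schedule, progress):
    """Assumed piecewise-constant lookup by training-progress fraction."""
    for entry in schedule:
        if progress <= entry["until"]:
            return entry["new_pair_ratio"]
    return schedule[-1]["new_pair_ratio"]


# Values taken from the example adapt block.
pair_schedule = [
    {"until": 0.25, "new_pair_ratio": 1.0},
    {"until": 0.50, "new_pair_ratio": 0.7},
    {"until": 1.0, "new_pair_ratio": 0.0},
]
validate_pair_schedule(pair_schedule)
print([new_pair_ratio_at(pair_schedule, p) for p in (0.1, 0.4, 0.9)])
# [1.0, 0.7, 0.0]
```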
Optimization & Checkpointing
- Pretrain/finetune use linear warmup + cosine LR decay; override warmup with --warmup-steps.
- Finetune saves best.ckpt and last.ckpt plus periodic checkpoints; set --ckpt-every-n-epochs to control frequency.
- Pretrain and adaptation can also warm-start from --pretrained-backbone-path; the loader extracts the pretrain-model subtree (ema_model. first, then model.) and syncs model averaging state from the loaded student weights.
- --precision and --gradient-clip-val are supported by pretrain, adaptation, and finetune CLIs.
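The linear warmup plus cosine decay schedule mentioned above, with the default warmup of 3% of total steps, can be written as a single function of the step index. This is a generic sketch of the technique rather than the repository's scheduler: the function name lr_at_step is hypothetical, and decaying all the way to zero and starting warmup at base_lr / warmup_steps are assumptions.

```python
import math


def lr_at_step(step, total_steps, base_lr, warmup_steps=None):
    """Learning rate under linear warmup followed by cosine decay."""
    if warmup_steps is None:
        warmup_steps = (3 * total_steps) // 100  # default: 3% of total steps
    if step < warmup_steps:
        # Linear ramp that reaches base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup_steps
    decay_steps = max(total_steps - warmup_steps, 1)
    progress = min((step - warmup_steps) / decay_steps, 1.0)
    # Assumption: cosine decay ends at zero rather than a floor value.
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total = 1000
for step in (0, 29, 30, 515, 1000):
    print(step, lr_at_step(step, total, base_lr=5e-5))
```

Passing warmup_steps explicitly mirrors what --warmup-steps N does on the CLI: it replaces the 3% default.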
Pair-accuracy heatmap (pretrain)
- Validation uses per-pair dataloaders to log contrastive accuracy per modality pair.
- W&B logs a heatmap image (val_pair_acc_matrix) plus scalar metrics under val_pair_acc/<pair>.
LoRA fine-tuning
- Controlled by YAML finetune block (parsed by apply_finetune_config):

finetune:
  freeze_tokenizer: true
  lora:
    freeze_backbone_and_insert_lora: true
    insert_lora: true
    separate_adapters: false
    r: 8
    alpha: 16
    dropout: 0.05
    target_modules: [query, key, value]
    use_dora: false

- freeze_tokenizer: true freezes tokenizer parameters during downstream finetuning (default).
- When enabled, finetune.py injects PEFT LoRA/DoRA adapters into the transformer backbone and freezes base weights.
- separate_adapters: true creates channel-specific adapters named ch_<channel>; the default LoRA adapter is frozen and only the channel adapters are trainable.
- Enable hooks with --print-diagnostics; control duration with --diagnostics-steps (default 5).
- Behavior: disables the progress bar, skips validation/checkpointing, and stops after the requested steps. Stats print to stdout.
- Example (pretrain):

python -m sleep2vec.pretrain \
--config configs/sleep2vec_dense_pretrain.yaml \
--pretrain-data-index /path/to/index.csv \
--pretrain-preset-path /path/to/pretrain_cache.pkl \
--version-name debug-diag \
--print-diagnostics --diagnostics-steps 5 --precision 32 --devices 0

(Use the same flags with sleep2vec.adapt or sleep2vec.finetune for adaptation/downstream diagnostics.)
Important
Prefer --precision 32 when using diagnostics; mixed precision can distort the printed tensor statistics.
- Maintain separate YAML per stage (*_pretrain.yaml, *_finetune_*.yaml); only pretrain YAML defines loss.
- When adding a new modality, first declare it in model.channels with the correct input_dim, regenerate presets with the same --config, then pretrain/adapt from a checkpoint as needed.
- All channels must share the same out_dim; the builder enforces this.
- data.data_channel_names in finetune YAML must match model.channels (input modalities only); built-in sequence labels (stage3, stage4, stage5, ahi) load their runtime label tokens automatically when used as --label-name (stage3/stage4/stage5 from raw stage5, ahi from raw ah_event plus scalar NPZ summaries ahi and tst). For built-in ahi, final scalar summary AHI aligns to NPZ ground-truth ahi, while event detection metrics continue to use the stricter merged + duration-filtered event path.
- pretrain.py, adapt.py, and finetune.py copy the resolved config.yaml plus cli_args.yaml into the run directory for reproducibility.
- When experimenting, adjust CLI flags for training schedules and keep structural changes in YAML for reproducibility.
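Several of the tips above are mechanical consistency rules (every channel declares input_dim, all tokenizers share one out_dim, label names stay out of data.data_channel_names). The sketch below checks them on an already-parsed config dictionary. It is not utils/check_configs.py: the function name is hypothetical, the ppg input_dim is a made-up value, and reading "must match" as "same set of names" is an assumption.

```python
BUILTIN_SEQUENCE_LABELS = {"stage3", "stage4", "stage5", "ahi"}


def check_channel_config(config):
    """Validate channel declarations; return the shared tokenizer out_dim."""
    channels = config["model"]["channels"]
    names = [channel["name"] for channel in channels]
    for channel in channels:
        if "input_dim" not in channel:
            raise ValueError(f"channel {channel['name']!r} lacks input_dim")
    out_dims = {channel["tokenizer"]["out_dim"] for channel in channels}
    if len(out_dims) != 1:
        raise ValueError(f"tokenizer out_dim differs across channels: {out_dims}")
    data_names = config.get("data", {}).get("data_channel_names")
    if data_names is not None:
        labels = BUILTIN_SEQUENCE_LABELS & set(data_names)
        if labels:
            raise ValueError(f"label names listed as input channels: {labels}")
        # Assumption: "match" means the same set of channel names.
        if set(data_names) != set(names):
            raise ValueError("data_channel_names does not match model.channels")
    return next(iter(out_dims))


example_config = {
    "model": {
        "channels": [
            {"name": "eeg_original", "input_dim": 3840,
             "tokenizer": {"name": "my_tokenizer", "out_dim": 768}},
            {"name": "ppg", "input_dim": 1024,  # hypothetical width
             "tokenizer": {"name": "my_tokenizer", "out_dim": 768}},
        ]
    },
    "data": {"data_channel_names": ["eeg_original", "ppg"]},
}
shared_out_dim = check_channel_config(example_config)
print(shared_out_dim)  # 768
```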
- configs/ - training recipes for pretrain, adaptation, and finetune.
- sleep2vec/ - core library: registries, encoders, tokenizers, projection, losses, averaging, downstream heads, adaptation modules, and Lightning entrypoints.
- data/ - dataset/index definitions, metadata helpers, NPZ loaders, channel-selection & samplers.
- preprocess/ - scripts to build index CSVs/presets, split/merge dataset indices, inspect missing-channel stats, and run raw format converters such as WatchPAT .zzp to EDF.
- utils/ - misc helpers.
We're Five Seasons Medical, building the full stack of AI for human health — contactless sensors, foundation models for physiology, and LLM agents that ship to real users every day. Sleep2vec is one piece of it.
The team comes from Tsinghua, Peking University, and top industry labs. Small, focused, and shipping.
Hiring across ML research, signal processing, LLM agents, and clinical science. Reach real users, not just benchmarks — chenxuesong@wuji-inc.com