# Local training — emg2qwerty

Run training on your local machine (Windows) with the project venv and GPU.

**Prerequisites:**
- Python 3.11+ with venv at `venv/`
- Dataset in `data/` (same layout as in the repo)
- For **RTX 5070 Ti** (sm_120): PyTorch nightly with CUDA 12.8 (see Step 1)

**Kernel:** Select the interpreter from `venv` (e.g. `./venv/Scripts/python.exe`) so `!python` uses it.
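
To confirm that the selected kernel runs inside a virtual environment before using any `!python` cell, a quick check along these lines may help. It is a minimal sketch: `in_virtualenv` is a hypothetical helper name, and it only detects that *some* virtual environment is active, not that it is the one at `venv/`.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # In a venv, sys.prefix points at the venv while sys.base_prefix
    # still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print(f"Interpreter: {sys.executable}")
print(f"Inside a virtual environment: {in_virtualenv()}")
```

If `!python` still resolves to a different interpreter, writing `!{sys.executable} -m ...` instead pins the command to the kernel's own interpreter.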

## Step 1: Check GPU and PyTorch

In [6]:
import torch

print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA version: {torch.version.cuda}")
    print(f"Device: {torch.cuda.get_device_name(0)}")
    print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
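    # Quick GPU op test: launch real kernels so an unsupported arch fails here
    x = torch.randn(2, 2, device="cuda")
    _ = (x @ x).sum().item()  # .item() waits for the GPU result
    print("GPU tensor test: OK")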
else:
    print("No CUDA. Training will use CPU (slower).")

PyTorch: 2.12.0.dev20260225+cu128
CUDA available: True
CUDA version: 12.8
Device: NVIDIA GeForce RTX 5070 Ti Laptop GPU
GPU memory: 12.82 GB
GPU tensor test: OK


**RTX 5070 Ti (sm_120):** If you see "no kernel image is available" or "sm_120 is not compatible", install PyTorch nightly with CUDA 12.8:

```bash
pip install --pre --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
```
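
To see whether the installed build targets your GPU before hitting a runtime error, you can compare the device's compute capability with the architectures compiled into PyTorch. The sketch below is illustrative: `arch_supported` is a hypothetical helper, and an exact `sm_XY` match is a conservative test, since a build without one may in some cases still run through PTX compatibility.

```python
def arch_supported(capability: tuple[int, int], arch_list: list[str]) -> bool:
    """Check whether a compute capability has a matching sm_XY entry."""
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list


try:
    import torch
except ImportError:
    torch = None  # keeps the helper usable where PyTorch is not installed

if torch is not None and torch.cuda.is_available():
    cap = torch.cuda.get_device_capability(0)  # e.g. (12, 0) for sm_120
    archs = torch.cuda.get_arch_list()         # architectures in this build
    print(f"Device capability: sm_{cap[0]}{cap[1]}")
    print(f"Build supports: {archs}")
    print(f"Exact match: {arch_supported(cap, archs)}")
```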

## Step 2: Verify project and data

In [7]:
from pathlib import Path

project_root = Path.cwd()
print(f"Project root: {project_root}")

assert (project_root / "emg2qwerty").is_dir(), "Run this notebook from the project root (where emg2qwerty/ lives)."
print("✓ emg2qwerty package found")

data_dir = project_root / "data"
if data_dir.is_dir():
    n_files = len(list(data_dir.rglob("*.h5")))
    print(f"✓ data/ found ({n_files} .h5 files)")
else:
    print("⚠ data/ not found. Put the dataset in data/ (see config user/single_user for layout).")

Project root: c:\Users\junji\Documents\C147_final\emg2qwerty
✓ emg2qwerty package found
✓ data/ found (0 .h5 files)
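
A count of 0 does not necessarily mean the dataset is missing, because the check above only globs `*.h5`. The sketch below tallies every file under `data/` by suffix so you can see what is actually there. `count_by_suffix` is a hypothetical helper, and the idea that session files might carry a different HDF5 suffix (such as `.hdf5`) is an assumption, not something this notebook confirms.

```python
from collections import Counter
from pathlib import Path


def count_by_suffix(root: Path) -> Counter:
    """Count files under root, grouped by lowercase file suffix."""
    return Counter(p.suffix.lower() for p in root.rglob("*") if p.is_file())


data_dir = Path.cwd() / "data"
if data_dir.is_dir():
    for suffix, n in count_by_suffix(data_dir).most_common():
        print(f"{suffix or '(no suffix)'}: {n}")
else:
    print("data/ not found")
```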


## Step 3: Run training

Single run with defaults from `config/base.yaml` and `config/model/cnn_transformer_ctc.yaml`:
- Model: CNN + Transformer encoder (CTC)
- Data: `dataset.root` = current working directory + `/data`
- Logs and checkpoints: `logs/YYYY-MM-DD/HH-MM-SS/`
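
Because run directories follow the `logs/YYYY-MM-DD/HH-MM-SS/` pattern, the most recent run can be located by sorting the directory names. This is a minimal sketch; `latest_run_dir` is a hypothetical helper, and it assumes every subdirectory two levels below `logs/` is a run directory.

```python
from pathlib import Path
from typing import Optional


def latest_run_dir(logs_root: Path) -> Optional[Path]:
    """Return the newest logs/YYYY-MM-DD/HH-MM-SS directory, or None."""
    if not logs_root.is_dir():
        return None
    # Zero-padded date and time names sort chronologically as plain strings.
    runs = sorted(
        run
        for day in logs_root.iterdir() if day.is_dir()
        for run in day.iterdir() if run.is_dir()
    )
    return runs[-1] if runs else None


print(f"Latest run: {latest_run_dir(Path.cwd() / 'logs')}")
```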

**Windows / local:** Use **`cluster=basic`** so training runs in-process (no submitit). The default `cluster=local` uses submitit and can fail on Windows (e.g. FileNotFoundError, SIGKILL). All commands below include `cluster=basic`.

**Progress bar:** When you run training from this notebook, the progress bar may not update live because the output is buffered. For the usual live progress bar, run the same command in a **terminal** (PowerShell, venv activated).
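
One way to get output in the notebook as it is produced is to launch training as a subprocess and echo its output line by line. The following is a sketch rather than the project's own tooling: `stream_command` is a hypothetical helper, and a progress bar that redraws with carriage returns may show up as a sequence of lines instead of a single updating bar.

```python
import subprocess
import sys


def stream_command(cmd: list[str]) -> int:
    """Run cmd, echo its output line by line, and return the exit code."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so warnings appear in order
        text=True,
        bufsize=1,                 # line-buffered on our side of the pipe
    )
    assert proc.stdout is not None
    for line in proc.stdout:
        print(line, end="", flush=True)
    return proc.wait()


# Example (commented out so that running this cell does not start training):
# stream_command([sys.executable, "-u", "-m", "emg2qwerty.train",
#                 "user=single_user", "cluster=basic",
#                 "trainer.accelerator=gpu", "trainer.devices=1"])
```

Passing `-u` to the child interpreter matters here: without it, Python buffers the child's output when writing to a pipe, and lines arrive in bursts.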

**Overrides you can add:**
- `trainer.max_epochs=50` — fewer epochs
- `batch_size=16` — smaller batch
- `trainer.accelerator=cpu` — force CPU
- `num_workers=4` — faster data loading (only if you have enough RAM/paging file; default 0 avoids Windows "paging file too small" with worker processes)
- `model=tds_conv_ctc` — TDS baseline (plain CTC)
- `model=tds_conv_crctc` — TDS baseline + AdamW + CR-CTC
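
When combining several of these overrides, it can be less error-prone to assemble the command from a dictionary than to edit one long string. The sketch below only builds and prints the command; `build_train_command` is a hypothetical helper, and the override values shown are examples taken from the list above.

```python
import sys


def build_train_command(overrides: dict[str, object]) -> list[str]:
    """Build an argv list for emg2qwerty.train from key=value overrides."""
    cmd = [sys.executable, "-u", "-m", "emg2qwerty.train"]
    cmd += [f"{key}={value}" for key, value in overrides.items()]
    return cmd


cmd = build_train_command({
    "user": "single_user",
    "cluster": "basic",           # run in-process, no submitit
    "trainer.accelerator": "gpu",
    "trainer.devices": 1,
    "trainer.max_epochs": 50,
    "batch_size": 16,
})
print(" ".join(cmd))
```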

**TDS + AdamW + CR-CTC (baseline with CR-CTC loss):**  
Single run with TDS model, AdamW optimizer, and CR-CTC (entropy regularization):

```bash
python -u -m emg2qwerty.train model=tds_conv_crctc cluster=basic trainer.accelerator=gpu trainer.devices=1
```

Logs and checkpoints go to `logs/YYYY-MM-DD/HH-MM-SS/`. Tune CR-CTC in `config/model/tds_conv_crctc.yaml` (e.g. `module.cr_ctc_entropy_weight`).

In [None]:
# Single-user training (GPU, default config)
# -u gives unbuffered output so logs flush; the progress bar may still not update live in the notebook.
# Append --multirun to the command to launch a Hydra sweep.
!python -u -m emg2qwerty.train user=single_user cluster=basic trainer.accelerator=gpu trainer.devices=1

[2026-02-25 13:01:52,451][HYDRA] Launching 1 jobs locally
[2026-02-25 13:01:52,452][HYDRA] 	#0 : user=single_user cluster=basic trainer.accelerator=gpu trainer.devices=1
[2026-02-25 13:01:52,533][__main__][INFO] - 
Config:
user: single_user
dataset:
  train:
  - user: 89335547
    session: 2021-06-03-1622765527-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-02-1622681518-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-04-1622863166-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-07-22-1627003020-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-07-21-1626916256-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-07-22-1627004019-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-05-1622885888-key

Global seed set to 1501
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Missing logger folder: c:\Users\junji\Documents\C147_final\emg2qwerty\logs\2026-02-25\13-01-52\job0_trainer.devices=1,user=single_user/logs\tensorboard
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name      | Type                  | Params
----------------------------------------------------
0 | front_end | Sequential            | 406 K 
1 | encoder   | CNNTransformerEncoder | 83.2 M
2 | head      | Sequential            | 76.1 K
3 | ctc_loss  | CTCLoss               | 0     
4 | metrics   | ModuleDict            | 0     
----------------------------------------------------
83.7 M    Trainable params
0         Non-trainable params
83.7 M    Total params
334.778   Total estimated model params size (MB)
  rank_zero_warn(
  rank_zero_warn(
  rank_zero_warn(
Epoch 0, global step 60: 'val/CER' reached 93.81923 (best

### Optional: short sanity run (1 epoch)

In [None]:
# Uncomment and run to test the pipeline with 1 epoch only:
# !python -u -m emg2qwerty.train user=single_user cluster=basic trainer.accelerator=gpu trainer.devices=1 trainer.max_epochs=1

## Step 4: TensorBoard (optional)

Launch **TensorBoard** from the project root:

```bash
tensorboard --logdir logs/
```

Then open http://localhost:6006 to monitor loss and metrics.

**CSV metrics (CER, loss, etc.):** Each run writes `logs/YYYY-MM-DD/HH-MM-SS/.../logs/csv/version_0/metrics.csv` with columns for `train_loss_epoch`, `val/loss`, `val/CER`, and other logged metrics per epoch. In multirun, each job has its own directory and thus its own `metrics.csv` for easy comparison.
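
To compare runs without opening TensorBoard, the `metrics.csv` files can be scanned for the best validation CER. This is a minimal sketch under two assumptions: rows may leave a cell blank when that metric was not logged at that step, and lower is better for the chosen column. `best_metric` is a hypothetical helper name.

```python
import csv
from pathlib import Path
from typing import Optional


def best_metric(csv_path: Path, column: str = "val/CER") -> Optional[tuple[int, float]]:
    """Return (row_index, value) of the lowest value in column, or None."""
    best = None
    with csv_path.open(newline="") as f:
        for i, row in enumerate(csv.DictReader(f)):
            cell = (row.get(column) or "").strip()
            if not cell:
                continue  # this row did not log the metric
            value = float(cell)
            if best is None or value < best[1]:
                best = (i, value)
    return best


logs_root = Path("logs")
if logs_root.is_dir():
    for path in sorted(logs_root.rglob("metrics.csv")):
        print(path, best_metric(path))
```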