## Imports 

In [4]:
import sys
# Add the project root to sys.path so the emg2qwerty package can be imported
sys.path.append('/home/yigit/codebase/emg2qwerty')

## Train & Evaluate TDS ConvNet (Baseline)

### Step 1: Start your experiments!

- Remember to download and copy the dataset to this directory: `Your_Dir/emg2qwerty/data`.
- You may now start your experiments with any scripts! Below are examples of single-user training and testing (greedy decoding).
- **There are two ways to track the logs:**
  1. Keep `--multirun`: the logs are not printed here, but they are saved in the folder `logs`, e.g., `logs/2025-02-09/18-24-15/submitit_logs/`.
  2. Comment out `--multirun`: the logs are printed in this notebook, but they are not saved.

#### Training

- The checkpoints are saved in the folder `logs`, e.g., `logs/2025-02-09/18-24-15/checkpoints/`.
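
To avoid copying the run timestamp by hand, a small helper can scan the `logs` folder for the most recent run. This is a minimal sketch that assumes the `logs/<date>/<time>/checkpoints/*.ckpt` layout shown above; `find_latest_checkpoints` is a hypothetical name and is not part of `emg2qwerty`.

```python
from pathlib import Path


def find_latest_checkpoints(log_root="logs"):
    """Return the .ckpt files of the most recently modified run under log_root.

    Assumes the layout <log_root>/<date>/<time>/checkpoints/*.ckpt.
    """
    root = Path(log_root)
    if not root.is_dir():
        return []
    # Collect every checkpoints folder that sits exactly two levels below the root.
    ckpt_dirs = [p for p in root.glob("*/*/checkpoints") if p.is_dir()]
    if not ckpt_dirs:
        return []
    # Use the modification time rather than the folder name, so runs that were
    # renamed by hand are still ordered correctly.
    latest = max(ckpt_dirs, key=lambda p: p.stat().st_mtime)
    return sorted(latest.glob("*.ckpt"))
```

Calling `find_latest_checkpoints()` from the project root returns a sorted list of `Path` objects, or an empty list when no run has written a checkpoint yet.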

In [None]:
# Single-user training
!HYDRA_FULL_ERROR=1 python -m emg2qwerty.train \
  user="glob(single_user)" \
  trainer.accelerator=gpu trainer.devices=4 \
  --multirun

In [1]:
# Load the TensorBoard extension so training plots can be shown in the notebook
%load_ext tensorboard

In [4]:
%tensorboard --logdir logs/2025-03-08/BASELINE/

Reusing TensorBoard on port 6006 (pid 1728857), started 0:12:20 ago. (Use '!kill 1728857' to kill it.)

#### Testing

- Set `checkpoint` to the path of your own checkpoint; the path in the cell below comes from one example run.
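
The `decoder=ctc_greedy` option used in the testing command refers to greedy CTC decoding: take the most likely class at each time step, merge consecutive repeats, then drop the blank symbol. The sketch below illustrates that rule on plain Python lists. The function name, the blank index of 0, and the toy alphabet are assumptions for illustration and are not taken from the `emg2qwerty` decoder.

```python
def ctc_greedy_decode(frame_scores, blank=0):
    """Collapse per-frame class scores into a label sequence (greedy CTC rule)."""
    decoded = []
    previous = None
    for scores in frame_scores:
        # Pick the highest-scoring class for this frame.
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit a label only when it differs from the previous frame and is not blank.
        if best != previous and best != blank:
            decoded.append(best)
        previous = best
    return decoded


# Toy example: classes are 0 = blank, 1 = "a", 2 = "b".
frames = [
    [0.1, 0.8, 0.1],    # a
    [0.2, 0.7, 0.1],    # a again (repeat, merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.8, 0.1],    # a (new emission, because a blank came in between)
    [0.1, 0.1, 0.8],    # b
]
alphabet = {1: "a", 2: "b"}
print("".join(alphabet[i] for i in ctc_greedy_decode(frames)))  # prints "aab"
```

The blank frame is what lets the decoder output a doubled letter: without it, the two runs of "a" would be merged into one.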

In [15]:
# Single-user testing
!python -m emg2qwerty.train \
  user="single_user" \
  checkpoint="/home/yigit/codebase/emg2qwerty/logs/2025-03-08/14-10-13/Baseline/checkpoints/best_checkpoint_baseline.ckpt" \
  train=False trainer.accelerator=gpu \
  decoder=ctc_greedy \
  hydra.launcher.mem_gb=64 \
  # --multirun

[2025-03-08 15:02:17,292][__main__][INFO] - 
Config:
user: single_user
dataset:
  train:
  - user: 89335547
    session: 2021-06-03-1622765527-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-02-1622681518-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-04-1622863166-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-07-22-1627003020-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-07-21-1626916256-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-07-22-1627004019-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-05-1622885888-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-02-1622679967-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f

### 2. CNN + LSTM Module
- Use CNN layers to process spatial features across channels
- Feed CNN features into LSTM layers for temporal processing
- Project to character space

**CNN-LSTM Architecture**

- **Input processing**
  - Takes the same input as the baseline model: EMG signals with shape `(T, N, bands=2, channels=16, freq)`.
  - Uses the same `SpectrogramNorm` as the baseline for consistent preprocessing.
- **CNN feature extractor**
  - Processes spatial features across EMG channels.
  - Uses a sequence of Conv1D layers with batch normalization and ReLU activation.
  - Extracts increasingly complex features through multiple CNN layers.
- **LSTM temporal processor**
  - Uses a bidirectional LSTM to capture temporal dependencies in both directions.
  - Stacks multiple LSTM layers with dropout between them for regularization.
  - Processes the temporal sequence of CNN features.
- **Output projection**
  - Projects LSTM outputs to character space using fully connected layers.
  - Includes dropout for regularization.
  - Ends with `LogSoftmax` for compatibility with `CTCLoss`.
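
The description above can be sketched in PyTorch as follows. This is one possible reading of it, not the module that was actually trained: the class name `CNNLSTMSketch`, the layer sizes, and the kernel size are assumptions, the `SpectrogramNorm` step is left out to keep the block self-contained, and the bands, channels, and frequency axes are simply flattened into one feature axis before the convolutions.

```python
import torch
from torch import nn


class CNNLSTMSketch(nn.Module):
    """Illustrative CNN + BiLSTM encoder for inputs of shape (T, N, bands, channels, freq)."""

    def __init__(self, in_features, num_classes, cnn_channels=(64, 128),
                 lstm_hidden=128, lstm_layers=2, dropout=0.2):
        super().__init__()
        layers = []
        prev = in_features
        for out in cnn_channels:
            # padding=1 with kernel_size=3 keeps the number of time steps unchanged.
            layers += [nn.Conv1d(prev, out, kernel_size=3, padding=1),
                       nn.BatchNorm1d(out),
                       nn.ReLU()]
            prev = out
        self.cnn = nn.Sequential(*layers)
        self.lstm = nn.LSTM(prev, lstm_hidden, num_layers=lstm_layers,
                            bidirectional=True, dropout=dropout)
        self.head = nn.Sequential(
            nn.Linear(2 * lstm_hidden, lstm_hidden),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(lstm_hidden, num_classes),
            nn.LogSoftmax(dim=-1),  # CTCLoss expects log-probabilities
        )

    def forward(self, x):
        x = x.flatten(start_dim=2)  # (T, N, bands * channels * freq)
        x = x.permute(1, 2, 0)      # (N, features, T), the layout Conv1d expects
        x = self.cnn(x)             # (N, cnn_channels[-1], T)
        x = x.permute(2, 0, 1)      # (T, N, features), the default nn.LSTM layout
        x, _ = self.lstm(x)         # (T, N, 2 * lstm_hidden)
        return self.head(x)         # (T, N, num_classes)


# Small arbitrary sizes: 20 time steps, batch of 2, 5 frequency bins, 10 classes.
model = CNNLSTMSketch(in_features=2 * 16 * 5, num_classes=10)
log_probs = model(torch.randn(20, 2, 2, 16, 5))
print(log_probs.shape)  # torch.Size([20, 2, 10])
```

The output keeps the `(T, N, classes)` layout, which is the shape `torch.nn.CTCLoss` takes for its log-probability input.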

In [None]:
# Single-user training
!HYDRA_FULL_ERROR=1 python -m emg2qwerty.train \
  user="glob(single_user)" \
  trainer.accelerator=gpu trainer.devices=4 \
  --multirun

[2025-03-08 16:55:01,963][HYDRA] Submitit 'local' sweep output dir : logs/2025-03-08/16-55-01
[2025-03-08 16:55:01,964][HYDRA] 	#0 : user=single_user trainer.accelerator=gpu trainer.devices=4


In [2]:
%load_ext tensorboard

In [3]:
%tensorboard --logdir logs/2025-03-08/CNNLSTM/cnnlstm/ --port 6012

Reusing TensorBoard on port 6012 (pid 1731493), started 0:00:24 ago. (Use '!kill 1731493' to kill it.)

In [8]:
# Single-user testing
!python -m emg2qwerty.train \
  user="single_user" \
  checkpoint="/home/yigit/codebase/emg2qwerty/logs/2025-03-08/CNNLSTM/cnnlstm/checkpoints/best_checkpoint_cnn_lstm.ckpt" \
  train=False trainer.accelerator=gpu \
  decoder=ctc_greedy \
  hydra.launcher.mem_gb=64 \
  # --multirun

[2025-03-11 01:02:51,710][__main__][INFO] - 
Config:
user: single_user
dataset:
  train:
  - user: 89335547
    session: 2021-06-03-1622765527-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-02-1622681518-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-04-1622863166-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-07-22-1627003020-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-07-21-1626916256-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-07-22-1627004019-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-05-1622885888-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-02-1622679967-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f

### 3. CNN + GRU Module

In [1]:
# Single-user training
!HYDRA_FULL_ERROR=1 python -m emg2qwerty.train \
  user="glob(single_user)" \
  trainer.accelerator=gpu trainer.devices=4 \
  --multirun 

[2025-03-11 00:38:00,468][HYDRA] Submitit 'local' sweep output dir : logs/2025-03-11/00-38-00
[2025-03-11 00:38:00,469][HYDRA] 	#0 : user=single_user trainer.accelerator=gpu trainer.devices=4


In [None]:
# Single-user testing
!python -m emg2qwerty.train \
  user="single_user" \
  checkpoint="/home/yigit/codebase/emg2qwerty/logs/2025-03-11/CNN_GRU/cnn_gru/checkpoints/best_checkpoint_rnn_gru.ckpt" \
  train=False trainer.accelerator=gpu \
  decoder=ctc_greedy \
  hydra.launcher.mem_gb=64 \
  # --multirun

[2025-03-11 01:04:14,936][__main__][INFO] - 
Config:
user: single_user
dataset:
  train:
  - user: 89335547
    session: 2021-06-03-1622765527-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-02-1622681518-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-04-1622863166-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-07-22-1627003020-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-07-21-1626916256-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-07-22-1627004019-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-05-1622885888-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-02-1622679967-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f

### 4. CNN + Transformer Module

In [7]:
# Single-user training with a cosine-annealing warm-restart learning-rate schedule
!HYDRA_FULL_ERROR=1 \
  python -m emg2qwerty.train \
  user="glob(single_user)" \
  trainer.accelerator=gpu trainer.devices=4 \
  lr_scheduler=cosine_annealing_warm_restarts \
  lr_scheduler.scheduler.T_0=10 \
  lr_scheduler.scheduler.T_mult=2 \
  lr_scheduler.scheduler.eta_min=1e-6 \
  trainer.max_epochs=400 \
  --multirun 

[2025-03-11 04:03:11,099][HYDRA] Submitit 'local' sweep output dir : logs/2025-03-11/04-03-10
[2025-03-11 04:03:11,100][HYDRA] 	#0 : user=single_user trainer.accelerator=gpu trainer.devices=4 lr_scheduler=cosine_annealing_warm_restarts lr_scheduler.scheduler.T_0=10 lr_scheduler.scheduler.T_mult=2 lr_scheduler.scheduler.eta_min=1e-06 trainer.max_epochs=400
Error executing job with overrides: ['user=single_user', 'trainer.accelerator=gpu', 'trainer.devices=4', 'lr_scheduler=cosine_annealing_warm_restarts', 'lr_scheduler.scheduler.T_0=10', 'lr_scheduler.scheduler.T_mult=2', 'lr_scheduler.scheduler.eta_min=1e-06', 'trainer.max_epochs=400']
Traceback (most recent call last):
  File "/home/yigit/anaconda3/envs/emg2qwerty/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/yigit/anaconda3/envs/emg2qwerty/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/yigit/codebase/emg2qwerty/emg2qwe
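
The scheduler overrides in the training command (`T_0=10`, `T_mult=2`, `eta_min=1e-6`) describe cosine annealing with warm restarts: the first cycle lasts `T_0` epochs and each later cycle is `T_mult` times longer than the one before. The sketch below works out where the restarts fall within `trainer.max_epochs=400`. It assumes the scheduler is stepped once per epoch, and the base learning rate of `1e-3` used in the comment is a hypothetical value; neither is confirmed by this notebook.

```python
import math


def restart_epochs(t_0, t_mult, max_epochs):
    """Epochs at which a new cosine cycle starts and the learning rate jumps back up."""
    epochs, start, length = [], 0, t_0
    while start + length < max_epochs:
        start += length
        epochs.append(start)
        length *= t_mult
    return epochs


def lr_at_epoch(epoch, base_lr, eta_min, t_0, t_mult):
    """Learning rate at an integer epoch under cosine annealing with warm restarts."""
    start, length = 0, t_0
    # Walk forward cycle by cycle until the cycle containing `epoch` is found.
    while epoch >= start + length:
        start += length
        length *= t_mult
    t_cur = epoch - start
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * t_cur / length))


print(restart_epochs(t_0=10, t_mult=2, max_epochs=400))  # [10, 30, 70, 150, 310]

# With a hypothetical base_lr of 1e-3, the rate is back at 1e-3 on every restart epoch:
# lr_at_epoch(10, base_lr=1e-3, eta_min=1e-6, t_0=10, t_mult=2)
```

Under these assumptions the final cycle starts at epoch 310 and would run for 320 epochs, so a 400-epoch run ends partway through it, before the learning rate has decayed to `eta_min`.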

In [16]:
# Single-user testing
!python -m emg2qwerty.train \
  user="single_user" \
  checkpoint="/home/yigit/codebase/emg2qwerty/epoch_149-step_4500.ckpt" \
  train=False trainer.accelerator=gpu \
  decoder=ctc_greedy \
  hydra.launcher.mem_gb=64 \
  # --multirun

[2025-03-11 01:43:25,795][__main__][INFO] - 
Config:
user: single_user
dataset:
  train:
  - user: 89335547
    session: 2021-06-03-1622765527-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-02-1622681518-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-04-1622863166-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-07-22-1627003020-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-07-21-1626916256-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-07-22-1627004019-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-05-1622885888-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f
  - user: 89335547
    session: 2021-06-02-1622679967-keystrokes-dca-study@1-0efbe614-9ae6-4131-9192-4398359b4f5f

### Baseline Preprocessing and Data Augmentation Experiments

1. Data Preprocessing

In [1]:
# Single-user training
!HYDRA_FULL_ERROR=1 \
  python -m emg2qwerty.train \
  user="glob(single_user)" \
  trainer.accelerator=gpu trainer.devices=4 \
  --multirun

[2025-03-10 23:28:16,284][HYDRA] Submitit 'local' sweep output dir : logs/2025-03-10/23-28-16
[2025-03-10 23:28:16,285][HYDRA] 	#0 : user=single_user trainer.accelerator=gpu trainer.devices=4
Error executing job with overrides: ['user=single_user', 'trainer.accelerator=gpu', 'trainer.devices=4']
Traceback (most recent call last):
  File "/home/yigit/anaconda3/envs/emg2qwerty/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/yigit/anaconda3/envs/emg2qwerty/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/yigit/codebase/emg2qwerty/emg2qwerty/train.py", line 129, in <module>
    main()
  File "/home/yigit/anaconda3/envs/emg2qwerty/lib/python3.10/site-packages/hydra/main.py", line 94, in decorated_main
    _run_hydra(
  File "/home/yigit/anaconda3/envs/emg2qwerty/lib/python3.10/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
    _run_app(
  File "/home/yigit/anaco