# Train Neural Speech Decoding on Google Colab

**Requirements:** Colab Pro (for 24hr sessions + better GPU)

**Total Time:** ~16 hours (6hrs Stage 1 + 10hrs Stage 2 on A100)

# GitHub Push Code

In [34]:
!git add colab_training.ipynb
!git commit -m "update notebook"
!git push origin main

On branch main
Your branch is up to date with 'origin/main'.

nothing to commit, working tree clean
Everything up-to-date


In [14]:
!git add .
!git commit -m "update"
!git push origin main

On branch main
Your branch is up to date with 'origin/main'.

nothing to commit, working tree clean
Everything up-to-date


In [13]:
!git pull origin main --rebase
!git push origin main

From https://github.com/atomiiw/neural_speech_decoding
 * branch            main       -> FETCH_HEAD
Already up to date.
Everything up-to-date


In [36]:
!git config --global user.email "maidouatomwang@gmail.com"
!git config --global user.name "atomiiw"
!git remote set-url origin https://atomiiw:<TOKEN>@github.com/atomiiw/neural_speech_decoding.git

/bin/bash: line 1: TOKEN: No such file or directory


## Step 1: Check GPU

In [1]:
!nvidia-smi

import torch
print(f"\nPyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'None'}")

Fri Nov  7 02:41:21 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA A100-SXM4-40GB          Off |   00000000:00:04.0 Off |                    0 |
| N/A   33C    P0             45W /  400W |       0MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                

## Step 2: Clone Repository

In [2]:
# Clone the repo into /content
%cd /content
!git clone https://github.com/flinkerlab/neural_speech_decoding.git

# Enter the repo - this is our workspace
%cd neural_speech_decoding

!pwd

/content
fatal: destination path 'neural_speech_decoding' already exists and is not an empty directory.
/content/neural_speech_decoding
/content/neural_speech_decoding


In [2]:
%cd /content
%cd neural_speech_decoding

/content
/content/neural_speech_decoding


## Step 3: Install Dependencies

In [3]:
!pip install yacs h5py librosa scipy soundfile pesq pystoi tqdm matplotlib seaborn -q

In [None]:
!pip install -r requirements.txt

In [10]:
# Install PyTorch 2.2+ which supports Python 3.12
!pip install torch==2.2.0+cu118 torchvision==0.17.0+cu118 torchaudio==2.2.0+cu118 --index-url https://download.pytorch.org/whl/cu118

# Install other dependencies
!pip install yacs h5py librosa scipy soundfile pesq pystoi tqdm matplotlib seaborn -q

# Verify installation
import torch
import sys
print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"Python: {sys.version}")

Looking in indexes: https://download.pytorch.org/whl/cu118
PyTorch: 2.2.0+cu118
CUDA available: True
Python: 3.12.12 (main, Oct 10 2025, 08:52:57) [GCC 11.4.0]


## Step 4: Mount Google Drive & Setup Folders

In [5]:
from google.colab import drive
drive.mount('/content/drive')

# Create persistent storage in Google Drive
!mkdir -p /content/drive/MyDrive/nsd_data
!mkdir -p /content/drive/MyDrive/nsd_outputs

# Link them to the repo workspace
!mkdir -p example_data
!ln -s /content/drive/MyDrive/nsd_data example_data/data
!ln -s /content/drive/MyDrive/nsd_outputs output

print("✓ Google Drive mounted and linked")
print(f"  Data: example_data/data -> Google Drive")
print(f"  Output: output -> Google Drive")

Mounted at /content/drive
✓ Google Drive mounted and linked
  Data: example_data/data -> Google Drive
  Output: output -> Google Drive


## Step 5: Upload Data to Google Drive

**Before running training, you need to:**

1. Download HB02 dataset from: https://data.mendeley.com/datasets/fp4bv9gtwk/2
2. Upload the files to: **MyDrive/nsd_data/** in your Google Drive
3. Verify they're there by running the cell below

In [42]:
# Check if data is present
!ls -lh /content/drive/MyDrive/nsd_data/

# Should see HB02 data files (*.hdf5 or *.h5)
print("\nIf empty, please upload HB02 data to: MyDrive/nsd_data/ in Google Drive")

total 1.1G
-rw------- 1 root root 1.1G Nov  7 03:11 HB02.h5

If empty, please upload HB02 data to: MyDrive/nsd_data/ in Google Drive


## Step 6: Update Config

In [43]:
import json

# Update data path in config
with open('configs/AllSubjectInfo.json', 'r') as f:
    config = json.load(f)

config['Shared']['RootPath'] = './example_data/data/'

with open('configs/AllSubjectInfo.json', 'w') as f:
    json.dump(config, f, indent=4)

print(f"✓ Config updated: RootPath = {config['Shared']['RootPath']}")

✓ Config updated: RootPath = ./example_data/data/


## Step 7: Stage 1 - Audio-to-Audio Training (a2a)

**Time:** ~6 hours on A100

In [11]:
!python train_a2a.py \
  --OUTPUT_DIR output/a2a/HB02 \
  --trainsubject HB02 \
  --testsubject HB02 \
  --param_file configs/a2a_production.yaml \
  --batch_size 16 \
  --reshape 1 \
  --DENSITY "HB" \
  --wavebased 1 \
  --n_filter_samples 80 \
  --n_fft 256 \
  --formant_supervision 1 \
  --intensity_thres -1 \
  --epoch_num 60


A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.2 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):  File "/content/neural_speech_decoding/train_a2a.py", line 17, in <module>
    import torch
  File "/usr/local/lib/python3.12/dist-packages/torch/__init__.py", line 1471, in <module>
    from .functional import *  # noqa: F403
  File "/usr/local/lib/python3.12/dist-packages/torch/functional.py", line 9, in <module>
    import torch.nn.functional as F
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/__init__.py", line 1, in <module>
    from .modules import *  # noqa: F403
  File "/usr/local/lib/python3.12/dist-pac

In [None]:
# Check Stage 1 completed
!ls output/a2a/HB02/*.pth | wc -l
print("Expected: 60 checkpoint files (model_epoch0.pth to model_epoch59.pth)")

## Step 8: Stage 2 - ECoG-to-Audio Training (e2a)

**Time:** ~10 hours on A100

**This produces the weights you need for phoneme classification!**

In [None]:
!python train_e2a.py \
  --OUTPUT_DIR output/e2a/resnet_HB02 \
  --trainsubject HB02 \
  --testsubject HB02 \
  --param_file configs/e2a_production.yaml \
  --batch_size 16 \
  --MAPPING_FROM_ECOG ECoGMapping_ResNet \
  --reshape 1 \
  --DENSITY "HB" \
  --wavebased 1 \
  --dynamicfiltershape 0 \
  --n_filter_samples 80 \
  --n_fft 256 \
  --formant_supervision 1 \
  --intensity_thres -1 \
  --epoch_num 60 \
  --pretrained_model_dir output/a2a/HB02 \
  --causal 0

In [None]:
# Check Stage 2 completed
!ls output/e2a/resnet_HB02/*.pth | wc -l
!ls -lh output/e2a/resnet_HB02/model_epoch59.pth

print("\n✓✓✓ TRAINING COMPLETE ✓✓✓")
print("\nYour pretrained weights:")
print("  output/e2a/resnet_HB02/model_epoch59.pth")
print("\nAlso saved to Google Drive:")
print("  /content/drive/MyDrive/nsd_outputs/e2a/resnet_HB02/model_epoch59.pth")

## Step 9: Download Weights (Optional)

In [None]:
from google.colab import files

# Uncomment to download the final checkpoint to your computer:
# files.download('output/e2a/resnet_HB02/model_epoch59.pth')

print("Weights are in Google Drive at: MyDrive/nsd_outputs/e2a/resnet_HB02/")

## Next Steps: Use for Phoneme Classification

Update your `ecog_decoder_finetune.ipynb` with:

```python
checkpoint_path = "output/e2a/resnet_HB02/model_epoch59.pth"
# Or from Google Drive:
# checkpoint_path = "/content/drive/MyDrive/nsd_outputs/e2a/resnet_HB02/model_epoch59.pth"
```