# 🍍 LibriBrain Competition: Phoneme Classification Submission
Have you trained a model for the Phoneme Classification track and are ready to submit your results? Congratulations! Let's walk through the process.

Broadly, you will need to do the following:
1. Run model predictions on our holdout data
2. Submit the .csv file containing your results (find the detailed instructions [here](https://neural-processing-lab.github.io/2025-libribrain-competition/participate/#4-submit-on-evalai)).

This tutorial will walk you through step (1), generating the .csv file for you to submit.

In case of any questions or problems, please get in touch through [our Discord server](https://neural-processing-lab.github.io/2025-libribrain-competition/links/discord).

⚠️ **Note**: We have only comprehensively validated the notebook to work on Colab and Unix. Your experience in other environments (e.g., Windows) may vary.

## Setting up dependencies
Run the code below *as is*. It will download all required dependencies, including our own [PNPL](https://pypi.org/project/pnpl/) package. On Windows, you might have to restart your kernel after the installation has finished.

In [None]:
# Install additional dependencies
%pip install -q lightning torchmetrics scikit-learn plotly ipywidgets tqdm pnpl

# Set up base path for dataset and related files (base_path is assumed to be set in the cells below!)
base_path = "./libribrain"
try:
    import google.colab  # This module is only available in Colab.
    in_colab = True
    base_path = "/content"  # This is the folder displayed in the Colab sidebar
except ImportError:
    in_colab = False

## Things to note about the holdout data
The input data consists of MEG segments with shape `(306, 125)`: 306 MEG channels over 125 time points (0.5 seconds at a 250 Hz sampling rate). Your model should process these segments and output a probability distribution over 39 phoneme classes.

**⚠️ The 125 timepoints represent the time around the phoneme using tmin=0 and tmax=0.5, meaning you get the signal immediately upon start of the phoneme and the 0.5 seconds following it.**
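
As a quick sanity check on these numbers, the sketch below reconstructs the time axis of a single segment. The constant names are illustrative, and treating the window as half-open (the sample at exactly 0.5 s excluded) is an assumption, chosen because it yields exactly 125 time points.

```python
SAMPLING_RATE_HZ = 250  # Sampling rate stated above
TMIN_S = 0.0            # Window starts at phoneme onset
TMAX_S = 0.5            # Window ends 0.5 seconds later

# Number of samples in the window (assumes the end point is excluded)
n_timepoints = round((TMAX_S - TMIN_S) * SAMPLING_RATE_HZ)

# Time in seconds, relative to phoneme onset, of each column of a (306, 125) segment
times_s = [TMIN_S + i / SAMPLING_RATE_HZ for i in range(n_timepoints)]

print(f"{n_timepoints} time points, spaced {1000 / SAMPLING_RATE_HZ:.0f} ms apart")
print(f"first: {times_s[0]:.3f} s, last: {times_s[-1]:.3f} s")
```

Column `i` of a segment therefore corresponds to roughly `i * 4` ms after the phoneme starts.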

Most holdout samples are created by averaging the signal over 100 individual samples. For some phonemes, fewer than 100 samples were available, so those averages were computed over fewer data points.
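
To illustrate why averaging matters, here is a minimal sketch on synthetic data. It assumes a plain arithmetic mean over samples and independent noise per sample; the exact procedure used to build the holdout set is not spelled out here, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_channels, n_timepoints = 306, 125
n_to_average = 100  # Would be smaller for phonemes with fewer available samples

# Synthetic stand-in data: one shared underlying pattern plus independent noise per sample
evoked = rng.standard_normal((n_channels, n_timepoints))
noise = 5.0 * rng.standard_normal((n_to_average, n_channels, n_timepoints))
single_samples = evoked + noise  # Shape: (100, 306, 125)

# Average across the sample axis; the result has the usual segment shape
averaged = single_samples.mean(axis=0)  # Shape: (306, 125)

# Compare how far a single sample and the average are from the underlying pattern
noise_single = np.std(single_samples[0] - evoked)
noise_averaged = np.std(averaged - evoked)

print(averaged.shape)
print(f"noise std: single={noise_single:.2f}, averaged={noise_averaged:.2f}")
```

With independent noise, averaging N samples reduces the noise standard deviation by about a factor of √N. Holdout segments may therefore look considerably cleaner than single-trial training segments, which is worth keeping in mind when choosing how to preprocess your training data.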

## Generating submission CSV
For the phoneme classification task, you will be asked to classify **each segment** of the "competition holdout" split into one of 39 phoneme categories. Each prediction should be a probability distribution over the 39 phonemes, so your model should output 39 normalized probabilities that sum to 1.0. These predictions should then be packaged into a .csv file you can upload on EvalAI.
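
Before writing predictions to disk, it can help to confirm that each one really is a valid probability vector. The helper below is a hypothetical sketch, not part of PNPL; it uses NumPy and accepts plain lists or arrays (tensors can be converted with `.numpy()` first).

```python
import numpy as np

N_PHONEME_CLASSES = 39


def find_prediction_problems(predictions, n_expected, atol=1e-4):
    """Return a list of human-readable problems (an empty list means all good)."""
    problems = []
    if len(predictions) != n_expected:
        problems.append(f"expected {n_expected} predictions, got {len(predictions)}")
    for idx, pred in enumerate(predictions):
        p = np.asarray(pred, dtype=np.float64)
        if p.shape != (N_PHONEME_CLASSES,):
            problems.append(f"prediction {idx}: shape {p.shape}, expected ({N_PHONEME_CLASSES},)")
        elif not np.isfinite(p).all() or (p < 0).any():
            problems.append(f"prediction {idx}: contains negative or non-finite values")
        elif abs(p.sum() - 1.0) > atol:
            problems.append(f"prediction {idx}: sums to {p.sum():.4f}, expected 1.0")
    return problems


# Example: softmax outputs pass, raw logits do not
rng = np.random.default_rng(0)
logits = rng.standard_normal((5, N_PHONEME_CLASSES))
exp = np.exp(logits - logits.max(axis=1, keepdims=True))  # Numerically stable softmax
probs = exp / exp.sum(axis=1, keepdims=True)

print(find_prediction_problems(list(probs), n_expected=5))        # No problems expected
print(len(find_prediction_problems(list(logits), n_expected=5)))  # Every raw-logit row is flagged
```

Returning a list of problems instead of raising on the first one makes it easier to see whether an issue affects a single segment or the whole run.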



Here is how to generate the submission:

In [None]:
from torch.utils.data import DataLoader
from pnpl.datasets import LibriBrainCompetitionHoldout
from tqdm import tqdm
import torch

# First, instantiate the Competition Holdout dataset for phoneme classification
phoneme_holdout_dataset = LibriBrainCompetitionHoldout(
    data_path=base_path,  # Same as in the other LibriBrain dataset - this is where we'll store the data
    task="phoneme"        # Use "phoneme" for phoneme classification task
)

print(f"Dataset loaded: {len(phoneme_holdout_dataset)} segments")
print(f"Each segment shape: {phoneme_holdout_dataset[0].shape}")  # Should be (306, 125)

# Next, create a DataLoader for the dataset
dataloader = DataLoader(
    phoneme_holdout_dataset,
    batch_size=1,
    shuffle=False,
    num_workers=4   # Increase workers to speed up sample loading
)

# The final array of predictions must contain len(phoneme_holdout_dataset) 39-dimensional probability vectors
segments_to_predict = len(phoneme_holdout_dataset)
print(f"Total segments to predict: {segments_to_predict}")

# For now, we will fill the submission with random probability distributions
# Each prediction should be a 39-dimensional probability vector (one for each phoneme)
random_predictions = []

for i, sample in enumerate(tqdm(dataloader)):
    # For your submission, this is where you would generate your model prediction.
    # Note that the DataLoader adds a batch dimension, even with batch_size=1:
    # segment = sample  # The actual segment data - shape (1, 306, 125)
    # logits = model(segment).squeeze(0)  # Model output (1, 39) -> logits of shape (39,)
    # probabilities = torch.softmax(logits, dim=0)  # Convert to probabilities
    #
    # Here, we generate random probabilities instead
    random_logits = torch.randn(39)  # Random logits for 39 phoneme classes
    random_probs = torch.softmax(random_logits, dim=0)  # Convert to probabilities
    random_predictions.append(random_probs)

# Generate the submission CSV
phoneme_holdout_dataset.generate_submission_in_csv(
    random_predictions,
    "holdout_phoneme_predictions.csv"
)
print("Submission file created: holdout_phoneme_predictions.csv")

## Using a real PyTorch model

Now let's use the simple model introduced in the tutorial notebook to generate a submission!

In order to achieve that, we'll
1. Define the model architecture (identical to tutorial notebook)
2. Download and instantiate a pre-trained checkpoint
3. Use the instantiated model to generate a valid submission CSV

In [None]:
import torch
import lightning as L
from torch import nn
from torch.utils.data import DataLoader
from tqdm import tqdm
from torchmetrics import F1Score
import os
import urllib.request


class PhonemeClassificationModel(L.LightningModule):
    """
    Lightning model for phoneme classification from MEG data (identical to tutorial notebook!)

    Architecture:
    - Conv1d: 306 channels -> 128 channels (1x1 convolution)
    - ReLU activation
    - Flatten: (128, 125) -> (16000,)
    - Linear: 16000 -> 39 phoneme classes

    Input: (batch_size, 306, 125) - 306 MEG channels, 125 time points
    Output: (batch_size, 39) - logits for 39 phoneme classes
    """
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Conv1d(306, 128, 1),  # 1x1 convolution
            nn.ReLU(),
            nn.Flatten(),            # (128, 125) -> (16000,)
            nn.Linear(16000, 39)     # 39 phoneme classes
        )
        self.criterion = nn.CrossEntropyLoss()
        self.f1_macro = F1Score(num_classes=39, average='macro', task="multiclass")

    def forward(self, x):
        return self.model(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = self.criterion(y_hat, y)
        f1_macro = self.f1_macro(y_hat, y)
        self.log('train_loss', loss, prog_bar=True)
        self.log('train_f1_macro', f1_macro)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = self.criterion(y_hat, y)
        f1_macro = self.f1_macro(y_hat, y)
        self.log('val_loss', loss)
        self.log('val_f1_macro', f1_macro, prog_bar=True)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.model.parameters(), lr=0.0005)

# Load the trained model from checkpoint
checkpoint_path = f"{base_path}/phoneme_model.ckpt"


# Download model if it doesn't exist
if not os.path.exists(checkpoint_path):
    os.makedirs(base_path, exist_ok=True)  # Make sure the target folder exists
    print(f"Downloading model to {checkpoint_path}")
    url = "https://neural-processing-lab.github.io/2025-libribrain-competition/phoneme_model.ckpt"
    urllib.request.urlretrieve(url, checkpoint_path)
    print("Model downloaded successfully!")


print(f"Loading trained model from: {checkpoint_path}")
try:
    # Load the trained Lightning model
    model = PhonemeClassificationModel.load_from_checkpoint(checkpoint_path)
    model.eval()  # Set to evaluation mode

    # Count parameters
    total_params = sum(p.numel() for p in model.parameters())
    print(f"✅ Model loaded successfully!")
    print(f"📊 Model has {total_params:,} parameters")

    # Test with a single sample
    test_input = torch.randn(1, 306, 125)
    with torch.no_grad():
        test_output = model(test_input)
        test_probs = torch.softmax(test_output, dim=1)
        print(f"✓ Test output shape: {test_output.shape}")
        print(f"✓ Test probabilities sum to: {test_probs.sum().item():.6f}")

except FileNotFoundError:
    print(f"❌ Checkpoint not found at: {checkpoint_path}")
    print("Creating a randomly initialized model for demonstration")
    model = PhonemeClassificationModel()
    model.eval()
    total_params = sum(p.numel() for p in model.parameters())
    print(f"Random model has {total_params:,} parameters")

except Exception as e:
    print(f"❌ Error loading checkpoint: {e}")
    print("Creating a randomly initialized model for demonstration")
    model = PhonemeClassificationModel()
    model.eval()


In [None]:
def generate_model_predictions(model, dataset, batch_size=32, device='cpu'):
    model = model.to(device)
    model.eval()

    # Create data loader
    dataloader = DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=False,
        num_workers=4
    )

    predictions = []

    print(f"Generating predictions for {len(dataset)} segments...")
    print(f"Using batch size: {batch_size}, device: {device}")

    with torch.no_grad():
        for batch_idx, batch_data in enumerate(tqdm(dataloader, desc="Processing batches")):
            # batch_data shape: (batch_size, 306, 125)
            batch_data = batch_data.to(device)

            # Forward pass
            logits = model(batch_data)  # Shape: (batch_size, 39)
            probs = torch.softmax(logits, dim=1)  # Convert to probabilities

            # Move back to CPU and store individual predictions
            probs_cpu = probs.cpu()
            for i in range(probs_cpu.shape[0]):
                predictions.append(probs_cpu[i])  # Shape: (39,)

    print(f"Generated {len(predictions)} predictions")
    return predictions

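# Load the dataset (imported again so this cell also runs if the random-baseline cell was skipped)
from pnpl.datasets import LibriBrainCompetitionHoldout

print("Loading phoneme holdout dataset...")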
phoneme_dataset = LibriBrainCompetitionHoldout(
    data_path=base_path,
    task="phoneme"
)

# Check if CUDA is available
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Using device: {device}")

# Generate predictions with the trained model
print("Generating predictions with trained model...")
model_predictions = generate_model_predictions(
    model=model,
    dataset=phoneme_dataset,
    batch_size=32,  # Adjust based on your GPU memory
    device=device
)

# Create submission file
submission_filename = "trained_model_phoneme_submission.csv"
phoneme_dataset.generate_submission_in_csv(model_predictions, submission_filename)

print(f"\n✅ Trained model submission created: {submission_filename}")
print(f"📊 Contains {len(model_predictions)} predictions")
print("🎯 Ready for upload to EvalAI!")


Some things to note when using your own model:
- Ensure your model takes input shape `(batch_size, 306, 125)`
- Ensure your model outputs shape `(batch_size, 39)`
- Remember to use `torch.softmax()` to convert logits to normalized probabilities during inference


## Ready to submit?
After generating the predictions file, the next step is to submit it for evaluation. Don't worry, you are allowed to submit multiple times. Please take a look at the [Submit on EvalAI](https://neural-processing-lab.github.io/2025-libribrain-competition/participate/#4-submit-on-evalai) section on the website to learn more.

### Expected Submission Format
Your CSV file should have the following structure:
- **Header**: `segment_idx,phoneme_1,phoneme_2,...,phoneme_39`
- **Each row**: One segment with 39 probability values (one for each phoneme class)
- **Probabilities**: Should sum to 1.0 for each row (the model uses softmax to ensure this)
- **Total rows**: Number of segments in the holdout dataset (14,053) + 1 header row
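
If you would like to check a file against this structure before uploading, a sketch along the following lines may help. The function `validate_submission` is hypothetical (it is not part of PNPL), uses only the standard library, and takes the header and row layout from the description above rather than from the actual output of `generate_submission_in_csv`.

```python
import csv
import io
import math

N_CLASSES = 39
EXPECTED_HEADER = ["segment_idx"] + [f"phoneme_{i}" for i in range(1, N_CLASSES + 1)]


def validate_submission(csv_file, n_segments, tol=1e-4):
    """Return a list of problems found in an open submission CSV (empty list = OK)."""
    problems = []
    reader = csv.reader(csv_file)

    # The first row must be the header described above
    header = next(reader, None)
    if header != EXPECTED_HEADER:
        problems.append("unexpected header")

    # Every following row: one index column plus 39 probabilities summing to 1
    n_rows = 0
    for n_rows, row in enumerate(reader, start=1):
        if len(row) != N_CLASSES + 1:
            problems.append(f"row {n_rows}: expected {N_CLASSES + 1} columns, got {len(row)}")
            continue
        probs = [float(value) for value in row[1:]]
        if not math.isclose(sum(probs), 1.0, abs_tol=tol):
            problems.append(f"row {n_rows}: probabilities sum to {sum(probs):.4f}")

    if n_rows != n_segments:
        problems.append(f"expected {n_segments} data rows, got {n_rows}")
    return problems


# Tiny in-memory example with two uniform rows (a real file has 14,053 data rows)
uniform = ",".join([str(1 / N_CLASSES)] * N_CLASSES)
demo = io.StringIO(",".join(EXPECTED_HEADER) + "\n" + f"0,{uniform}\n1,{uniform}\n")
print(validate_submission(demo, n_segments=2))
```

For a real submission, open the generated file with `open(submission_filename, newline="")` and pass `n_segments=14053`.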

## That's it! 🥳
Thanks for taking the time to look at and/or participate in our competition!

If you have any open questions, please get in touch through [our Discord server](https://neural-processing-lab.github.io/2025-libribrain-competition/links/discord). Good luck with the competition!