⚠️ **Note**: **Submission is currently limited to the Speech Detection tasks. We will release the obfuscated holdout data and an updated submission tutorial for the Phoneme Classification tasks in time for the second half of the competition.**

# 🍍 LibriBrain Competition: Submission
You've trained a model for one of our tracks and are now ready to submit your results? Congratulations! Let's walk through the process.

Broadly, you will need to do the following:
1. Run model predictions on our holdout data
2. Submit the .CSV file containing your results (find the detailed instructions [here](https://neural-processing-lab.github.io/2025-libribrain-competition/participate/#4-submit-on-evalai)).

This tutorial will walk you through step (1), generating the .CSV file for you to submit.

In case of any questions or problems, please get in touch through [our Discord server](https://neural-processing-lab.github.io/2025-libribrain-competition/links/discord).

⚠️ **Note**: We have only validated this notebook comprehensively on Colab and Unix. Your experience in other environments (e.g., Windows) may vary.

## Setting up dependencies
Run the code below *as is*. It will install all required dependencies, including our own [PNPL](https://pypi.org/project/pnpl/) package. On Windows, you might have to restart your kernel after the installation has finished.

In [None]:
# Install additional dependencies
%pip install -q lightning torchmetrics scikit-learn plotly ipywidgets tqdm pnpl

# Set up base path for dataset and related files (base_path is assumed to be set in the cells below!)
base_path = "./libribrain"
try:
    import google.colab  # This module is only available in Colab.
    in_colab = True
    base_path = "/content"  # This is the folder displayed in the Colab sidebar
except ImportError:
    in_colab = False

## Generating submission CSV
For the speech detection task, you need to predict **for each sample** of the "competition holdout" split whether it is speech or not. This means we expect a total of 560,638 predictions, one for every sample in that split. These predictions should then be packaged into a .csv file that you can upload to EvalAI. Because the holdout data comes without labels, the way you access it differs slightly from the regular `LibriBrainSpeech` dataset.

Here is how to generate the submission:

In [None]:
from torch.utils.data import DataLoader
from pnpl.datasets import LibriBrainCompetitionHoldout
from tqdm import tqdm
import torch

# First, instantiate the Competition Holdout dataset
speech_holdout_dataset = LibriBrainCompetitionHoldout(
    data_path=base_path,  # Same as in the other LibriBrain dataset - this is where we'll store the data
    tmax=0.8,             # Also identical to the other datasets - how many samples to return/group together
    task="speech"         # "speech" or "phoneme" ("phoneme" is not supported until Phoneme track launch)
)

# Next, create a DataLoader for the dataset
dataloader = DataLoader(
    speech_holdout_dataset,
    batch_size=1,
    shuffle=False,
    num_workers=4   # Increase workers to speed up sample loading
)

# The final array of predictions must contain len(speech_holdout_dataset) values between 0 and 1
segments_to_predict = len(speech_holdout_dataset)
print(segments_to_predict)

# Finally, we loop over every sample to generate a prediction.
# For now, we will fill the submission with random values
all_random = torch.rand((segments_to_predict, 1))
random_predictions = [None] * segments_to_predict

for i, sample in enumerate(tqdm(dataloader)):
    # For your submission, this is where you would generate your model prediction:
    # segment = sample[0]                  # The actual segment data is at sample[0]
    # prediction = model.predict(segment)  # Assuming model has a predict method
    #
    # Here, we simply pull the precomputed random tensor instead
    random_predictions[i] = all_random[i]

speech_holdout_dataset.generate_submission_in_csv(
    random_predictions,
    "holdout_speech_predictions.csv"
)
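
Before writing the CSV, it can be worth confirming that the prediction list has the expected length and that every value lies between 0 and 1. The helper below is a minimal sketch of such a check. `check_predictions` is a hypothetical name and not part of the PNPL package, and it only assumes that each prediction can be converted with `float()`.

```python
def check_predictions(predictions, expected_count):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    if len(predictions) != expected_count:
        problems.append(
            f"expected {expected_count} predictions, got {len(predictions)}"
        )
    for i, value in enumerate(predictions):
        value = float(value)  # assumes each prediction converts to a single float
        # NaN fails this comparison too, so it is reported as out of range
        if not 0.0 <= value <= 1.0:
            problems.append(f"prediction {i} is outside [0, 1]: {value}")
            break  # report only the first offending value
    return problems


# Toy usage with made-up values
print(check_predictions([0.1, 0.7, 1.0], expected_count=3))  # passes both checks
print(check_predictions([0.1, 1.3], expected_count=3))       # wrong count and a value above 1
```

In the loop above, the equivalent call would pass `random_predictions` and `segments_to_predict`.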


If you'd rather not wait the ~20 minutes it takes to generate the file above, you can create a mock submission file (valid, but filled with random values) without iterating over all samples:

In [None]:
from pnpl.datasets import LibriBrainCompetitionHoldout
import torch


speech_holdout_dataset = LibriBrainCompetitionHoldout(
    data_path=base_path,
    tmax=0.8,
    task="speech"
)

segments_to_predict = len(speech_holdout_dataset)
all_random = torch.rand((segments_to_predict, 1))  # build all (1,) tensors at once
random_predictions = list(all_random)              # convert to list of shape-(1,) tensors
speech_holdout_dataset.generate_submission_in_csv(
    random_predictions,
    "holdout_speech_predictions.csv"
)
print("Submission file created!")

### Generating the correct number of predictions
The code above is all you need to generate your submission!

We understand that while training your model, you may have experimented with averaging samples or combining multiple samples into a single output. In fact, the baseline model used in the [Speech Detection Notebook](https://neural-processing-lab.github.io/2025-libribrain-competition/colabs/LibriBrain_Competition_Speech_Detection.ipynb) did just that.

However, for the submission to be valid, it must contain 560,638 predictions, one per sample. There are multiple ways to resolve this, such as predicting a baseline value where no prediction can be made or interpolating between results.
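
As one illustration of the baseline-value approach, the sketch below spreads one prediction per window over every sample that window covers, averages where windows overlap, and falls back to a fill value for samples that no window reaches. The function name `expand_to_per_sample`, its parameters, and the toy numbers are all assumptions made for this example. They are not part of the PNPL package, and your own windowing may call for a different mapping.

```python
def expand_to_per_sample(window_preds, window_starts, window_len, n_samples, fill_value=0.5):
    """Turn one prediction per window into one prediction per sample.

    Overlapping windows are averaged; samples covered by no window get fill_value.
    """
    sums = [0.0] * n_samples
    counts = [0] * n_samples
    for pred, start in zip(window_preds, window_starts):
        # Clip the window to the valid sample range
        for idx in range(max(start, 0), min(start + window_len, n_samples)):
            sums[idx] += pred
            counts[idx] += 1
    return [s / c if c else fill_value for s, c in zip(sums, counts)]


# Toy usage: two windows of length 3 over 6 samples, overlapping at index 2
per_sample = expand_to_per_sample(
    window_preds=[0.75, 0.25],
    window_starts=[0, 2],
    window_len=3,
    n_samples=6,
)
print(per_sample)  # [0.75, 0.75, 0.5, 0.25, 0.25, 0.5]
```

For the real submission, `n_samples` would be `len(speech_holdout_dataset)`, so the result always has the required length regardless of how many windows your model processed.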

## Ready to submit?
After generating the predictions file, the next step is to submit it for evaluation. Don't worry, you are allowed to submit multiple times. Please take a look at the [Submit on EvalAI](https://neural-processing-lab.github.io/2025-libribrain-competition/participate/#4-submit-on-evalai) section on the website to learn more.

## That's it! 🥳
Thanks for taking the time to look at and/or participate in our competition. If this caught your interest, you might also want to take a look at the more advanced version of the task, focussed on Phoneme Classification - you can find the corresponding Colab [here](https://neural-processing-lab.github.io/2025-libribrain-competition/links/phoneme-colab). If you have any open questions, please get in touch through [our Discord server](https://neural-processing-lab.github.io/2025-libribrain-competition/links/discord).