# BirdCLEF+ 2025: Simple Submission

![](https://www.kaggle.com/competitions/91844/images/header)

This notebook shows a simple way to set up an inference pipeline for the [BirdCLEF+ 2025 competition](https://www.kaggle.com/competitions/birdclef-2025).

Credit goes to [Stefan Kahl](https://www.kaggle.com/stefankahl), who set up [one of the first sample submission notebooks](https://www.kaggle.com/code/stefankahl/birdclef-2025-sample-submission).

I simplified the pipeline and optimized audio loading and chunking with [numpy](https://numpy.org/) and [soundfile](https://github.com/bastibe/python-soundfile). It can be helpful to first think about the problem yourself and then compare both approaches.

## Dependencies

All these dependencies are included in the standard Kaggle Notebooks environment.

In [17]:
import numpy as np
import pandas as pd
import soundfile as sf
# Extension to pathlib.Path to simplify parsing directories
from fastcore.xtras import Path

## Get Paths and Labels

From the [competition description](https://www.kaggle.com/competitions/birdclef-2025/data) we know that the audio is resampled to 32 kHz, so we use a sample rate of `32_000`. Predictions are submitted per 5-second chunk, so each chunk spans `32_000 * 5 = 160_000` samples.

In [18]:
BASE_PATH = "../kaggle/input/birdclef-2025/"
TAXONOMY_PATH = f"{BASE_PATH}taxonomy.csv"
TEST_SCAPES_PATH = f"{BASE_PATH}test_soundscapes/"
SR, CHUNK_SEC = 32_000, 5
FIVE_SEC_SR = SR * CHUNK_SEC
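
As a quick sanity check on the arithmetic, the sketch below computes the chunk length in samples and the end time of each chunk. The 60-second duration is an assumption for illustration only; check the competition data page for the actual soundscape length.

```python
SR, CHUNK_SEC = 32_000, 5
FIVE_SEC_SR = SR * CHUNK_SEC  # samples per 5-second chunk

# Assumed recording duration, for illustration only
duration_sec = 60
n_samples = duration_sec * SR

# Number of full 5-second chunks and the end time (in seconds) of each one
n_chunks = n_samples // FIVE_SEC_SR
end_times = [(i + 1) * CHUNK_SEC for i in range(n_chunks)]

print(FIVE_SEC_SR, n_chunks, end_times[:3], end_times[-1])
# → 160000 12 [5, 10, 15] 60
```

The end times are the suffixes used in the `row_id` values further down.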

We retrieve the labels from the taxonomy file.

In [19]:
t = pd.read_csv(TAXONOMY_PATH)
class_labels = list(t['primary_label'])
class_labels[:5], class_labels[-5:]

(['1139490', '1192948', '1194042', '126247', '1346504'],
 ['yehcar1', 'yelori1', 'yeofly1', 'yercac1', 'ywcpar'])

`TEST_SCAPES_PATH` is only populated when the notebook is submitted, so the list below is empty in an interactive session.

In [20]:
# Populated during submission of notebook
scape_paths = list(Path(TEST_SCAPES_PATH).glob("*.ogg"))
scape_paths[:10]

[]

## Helper functions

Audio is loaded with `soundfile`. `np.array_split` then divides the signal into equally sized sections; these are exactly 5 seconds long when the recording length is a multiple of 5 seconds.

In [21]:
def load_audio(path) -> np.ndarray:
    with sf.SoundFile(path) as f: audio = f.read()
    return audio

def get_chunks(path) -> list[np.ndarray]:
    """ Create 5 second chunks (1D arrays) from audio file. """
    audio = load_audio(path)
    # Cast to int: np.ceil returns a float, but the section count should be an integer
    return np.array_split(audio, int(np.ceil(audio.shape[0] / FIVE_SEC_SR)))
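
If a recording is not an exact multiple of 5 seconds, `np.array_split` returns sections of slightly different lengths rather than fixed 5-second windows. The sketch below contrasts that behavior with a zero-padding alternative on a toy signal. `split_fixed` is a hypothetical helper and not part of the pipeline above, and zero-padding is just one possible choice for the last chunk.

```python
import numpy as np

def split_fixed(audio: np.ndarray, chunk_len: int) -> list[np.ndarray]:
    """Split a 1D array into chunks of exactly chunk_len samples, zero-padding the last one."""
    n_chunks = max(1, int(np.ceil(len(audio) / chunk_len)))
    padded = np.zeros(n_chunks * chunk_len, dtype=audio.dtype)
    padded[:len(audio)] = audio
    return list(padded.reshape(n_chunks, chunk_len))

audio = np.arange(10, dtype=np.float32)  # toy signal with 10 samples
chunk_len = 4                            # toy chunk length

# np.array_split balances the section sizes instead of fixing them
even = np.array_split(audio, int(np.ceil(len(audio) / chunk_len)))
fixed = split_fixed(audio, chunk_len)

print([len(c) for c in even])   # → [4, 3, 3]
print([len(c) for c in fixed])  # → [4, 4, 4]
```

For recordings whose length is an exact multiple of the chunk length, both approaches yield the same chunks.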

## Inference

We make a prediction for each chunk within each soundscape.

In [22]:
cols = ["row_id"] + class_labels
rows = []
for path in scape_paths:
    for i, chunk in enumerate(get_chunks(path), start=1):
        row_id = f"{Path(path).stem}_{i * CHUNK_SEC}"
        # Place your inference function here (Random predictions as placeholder)
        #########################################
        pred = np.random.uniform(low=0.0, high=0.72, size=len(class_labels))
        #########################################
        rows.append([row_id] + pred.tolist())
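
To make the placeholder easier to swap out, the prediction step can be wrapped in a function that takes a chunk and returns one score per class. The sketch below is a minimal illustration: `predict_chunk`, the three stand-in labels, and the file name are all hypothetical, and the random scores only stand in for a real model.

```python
from pathlib import Path
import numpy as np

CHUNK_SEC = 5
class_labels = ["label_a", "label_b", "label_c"]  # stand-in for the taxonomy labels

def predict_chunk(chunk: np.ndarray, n_classes: int) -> np.ndarray:
    """Hypothetical stand-in for a model call: returns one score in [0, 1) per class."""
    rng = np.random.default_rng(0)
    return rng.uniform(0.0, 1.0, size=n_classes)

path = Path("soundscape_example.ogg")             # made-up file name
chunks = [np.zeros(160_000), np.zeros(160_000)]   # two silent 5-second chunks

rows = []
for i, chunk in enumerate(chunks, start=1):
    # The row id combines the file stem with the chunk's end time in seconds
    row_id = f"{path.stem}_{i * CHUNK_SEC}"
    pred = predict_chunk(chunk, len(class_labels))
    rows.append([row_id] + pred.tolist())

print([r[0] for r in rows])
# → ['soundscape_example_5', 'soundscape_example_10']
```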

We make sure that the final file contains `row_id` and the `206` class labels. Be careful if you shuffled the order of your class labels during training: each prediction column must line up with the corresponding entry in `class_labels`.

In [23]:
preds = pd.DataFrame(rows, columns=cols)
preds.head(2)

row_id,1139490,1192948,1194042,126247,1346504,134933,135045,1462711,1462737,...,yebfly1,yebsee1,yecspi2,yectyr1,yehbla2,yehcar1,yelori1,yeofly1,yercac1,ywcpar
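
If the model was trained with a different label order, the prediction columns can be reordered by name before writing the file. The sketch below assumes a hypothetical `train_labels` list holding the training-time order and uses three stand-in labels instead of the real ones.

```python
import pandas as pd

class_labels = ["a", "b", "c"]   # stand-in for the submission order
train_labels = ["c", "a", "b"]   # hypothetical order used during training

# One row of model outputs, with columns in training order
raw = pd.DataFrame([[0.3, 0.1, 0.2]], columns=train_labels)
raw.insert(0, "row_id", ["soundscape_example_5"])

# Select columns by name so they follow the submission order
preds = raw[["row_id"] + class_labels]

# Fail early if a label is missing or misplaced
assert list(preds.columns) == ["row_id"] + class_labels
print(preds)
```

Selecting by column name raises a `KeyError` when a label is missing, which surfaces a mismatch before the submission file is written.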


## Submission

In [24]:
preds.to_csv("submission.csv", index=False)

**That's it! Hope this helps you to get started with the competition!**

**If you like this Kaggle kernel, consider giving an upvote and leaving a comment. Your feedback is very welcome! I will try to implement your suggestions in this kernel.**