# BirdCLEF 2025 Competition: Bird Song Classification

## Introduction

This notebook explores the BirdCLEF 2025 competition, a machine learning challenge focused on bird song classification. The competition is hosted on Kaggle and aims to develop algorithms that can identify bird species from audio recordings.

Bird song classification is a challenging task with important applications in biodiversity monitoring, conservation efforts, and ecological research. Automated identification systems can help researchers process large volumes of audio data collected in the field, enabling more efficient and comprehensive studies of bird populations and behavior.

### Competition Overview
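- **Goal**: Predict which of 206 species (birds, amphibians, mammals, and insects) are present in each 5-second segment of a soundscape recording
- **Dataset**: Audio recordings of animal vocalizations with species labels, plus unlabeled soundscapes
- **Evaluation**: Models are assessed on the per-species presence probabilities they predict for each segment of the hidden test soundscapes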
- **Competition Link**: [BirdCLEF 2025 on Kaggle](https://www.kaggle.com/competitions/birdclef-2025/overview)

Let's begin by exploring the dataset structure and understanding the nature of the bird song recordings we'll be working with.

### Files
**train_audio/** The training data consists of short recordings of individual bird, amphibian, mammal and insect sounds generously uploaded by users of xeno-canto.org, iNaturalist and the Colombian Sound Archive (CSA) of the Humboldt Institute for Biological Resources Research in Colombia. These files have been resampled to 32 kHz where applicable to match the test set audio and converted to the `ogg` format. Filenames consist of `[collection][file_id_in_collection].ogg`. The training data should have nearly all relevant files; we expect there is no benefit to looking for more on xeno-canto.org or iNaturalist and appreciate your cooperation in limiting the burden on their servers. If you do, please make sure to adhere to the scraping rules of these data portals.

**test_soundscapes/** When you submit a notebook, the **test_soundscapes** directory will be populated with approximately 700 recordings to be used for scoring. They are 1 minute long and in `ogg` audio format, resampled to 32 kHz. The file names are randomized, but have the general form of `soundscape_xxxxxx.ogg`. It should take your submission notebook approximately five minutes to load all the test soundscapes. Not all species from the train data actually occur in the test data.

**train_soundscapes/** Unlabeled audio data from the same recording locations as the test soundscapes. Filenames consist of `[site]_[date]_[local_time].ogg`; although recorded at the same location, precise recording sites of unlabeled soundscapes do NOT overlap with recording sites of the hidden test data.
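The filename pattern above can be split into its three parts with a small helper. This is a minimal sketch: the function name and the example filename are hypothetical, and the code deliberately does not interpret the date or time strings, since their exact format is not stated here.

```python
from pathlib import Path
from typing import NamedTuple


class SoundscapeName(NamedTuple):
    site: str
    date: str
    local_time: str


def parse_soundscape_name(filename):
    """Split `[site]_[date]_[local_time].ogg` into its three fields."""
    # rsplit from the right so a site name containing "_" stays intact
    parts = Path(filename).stem.rsplit("_", 2)
    if len(parts) != 3:
        raise ValueError(f"Unexpected soundscape filename: {filename}")
    return SoundscapeName(*parts)


# Hypothetical filename, used only to illustrate the pattern
print(parse_soundscape_name("train_soundscapes/H02_20230420_074000.ogg"))
```

Grouping files by `site` this way could help keep recordings from one site together when splitting the unlabeled data.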

**train.csv** A wide range of metadata is provided for the training data. The most directly relevant fields are:

- `primary_label`: A code for the species (eBird code for birds, iNaturalist taxon ID for non-birds). You can review detailed information about the species by appending codes to the eBird and iNaturalist taxon URLs, such as `https://ebird.org/species/gretin1` for the Great Tinamou or `https://www.inaturalist.org/taxa/24322` for the Red Snouted Tree Frog. Not all species have their own pages; some links might fail.
- `secondary_labels`: List of species labels that have been marked by recordists to also occur in the recording. Can be incomplete.
- `latitude` & `longitude`: Coordinates for where the recording was taken. Some bird species may have local call 'dialects,' so you may want to seek geographic diversity in your training data.
- `author`: The user who provided the recording. Unknown if no name was provided.
- `filename`: The name of the associated audio file.
- `rating`: Values in 1..5 (1 - low quality, 5 - high quality) provided by users of Xeno-canto; 0 implies no rating is available; iNaturalist and the CSA do not provide quality ratings.
- `collection`: Either `XC`, `iNat` or `CSA`, indicating which collection the recording was taken from. Filenames also reference the collection and the ID within that collection.
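To make the `primary_label` convention concrete, here is a small sketch that builds the species URL for a label. The helper name is ours, and the rule "purely numeric labels are iNaturalist taxon IDs, everything else is an eBird code" is an assumption based on the two examples above; the class column in `taxonomy.csv` would be the more reliable way to tell birds from non-birds.

```python
def species_url(primary_label):
    """Return the eBird or iNaturalist page URL for a primary_label."""
    label = str(primary_label)
    # Assumption: iNaturalist taxon IDs are all digits, eBird codes are not
    if label.isdigit():
        return f"https://www.inaturalist.org/taxa/{label}"
    return f"https://ebird.org/species/{label}"


print(species_url("gretin1"))  # Great Tinamou (bird, eBird code)
print(species_url("24322"))    # non-bird, iNaturalist taxon ID
```

As noted above, some of these links may fail because not every species has its own page.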

**sample_submission.csv** A valid sample submission.

- `row_id`: A slug of `soundscape_[soundscape_id]_[end_time]` for the prediction; e.g., Segment 00:15-00:20 of 1-minute test soundscape `soundscape_12345.ogg` has row ID `soundscape_12345_20`.
- `[species_id]`: There are 206 species ID columns. You will need to predict the probability of the presence of each species for each row.
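The `row_id` scheme can be sketched as follows. The helper name is hypothetical, and the 5-second segment length is inferred from the example above (segment 00:15-00:20 maps to end time 20), so treat it as an assumption rather than a documented constant.

```python
from pathlib import Path


def row_ids_for_soundscape(filename, clip_seconds=60, segment_seconds=5):
    """List the row IDs (one per segment end time) for one test soundscape."""
    stem = Path(filename).stem  # e.g. "soundscape_12345"
    end_times = range(segment_seconds, clip_seconds + 1, segment_seconds)
    return [f"{stem}_{end}" for end in end_times]


ids = row_ids_for_soundscape("soundscape_12345.ogg")
print(len(ids))  # 12 segments per 1-minute recording
print(ids[3])    # soundscape_12345_20
```

Under this assumption, each 1-minute soundscape contributes 12 rows, each needing 206 probabilities.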

**taxonomy.csv** - Data on the different species, including iNaturalist taxon ID and class name (Aves, Amphibia, Mammalia, Insecta).

**recording_location.txt** - Some high-level information on the recording location (El Silencio Natural Reserve).

In a nutshell, here is what we will do with the training data:

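1. **train_audio/**
   - Contains individual, labeled recordings of birds and other animals
   - Mostly focused on a single species, although `secondary_labels` shows that other species can also be audible
   - Primary use: This will be our main training data for learning species-specific features
2. **train_soundscapes/**
   - Contains full 1-minute recordings from actual environments
   - Contains background noise and multiple species, but no labels
   - Similar to the test data format
   - Primary use: Fine-tuning toward the test domain; because these files are unlabeled, using them for validation would first require labels (for example, pseudo-labels)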

In [10]:
import glob
import numpy as np
import os
import pandas as pd
from pathlib import Path
import torch
import torchaudio
import torchaudio.transforms as AT
from tqdm.notebook import tqdm

From the [BirdNET paper](https://www.sciencedirect.com/science/article/pii/S1574954121000273), we can extract the following key insights:
1. Spectrogram Parameters:
   - Using mel-spectrograms with 64 bands
   - Frequency range: 150 Hz to 15 kHz
   - FFT window size adjusted for 32kHz sampling rate
   - 25% overlap between frames
2. Signal Processing:
   - 3-second chunks for processing
   - Signal strength-based detection for extracting relevant segments
   - Log scaling for magnitude (better for noisy environments)
3. Data Augmentation:
   - Pitch shifting within the frequency range
   - Temporal shifting within the 3-second window
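As a quick check of what such settings mean at 32 kHz, the sketch below computes the window duration, the overlap between consecutive frames, and the number of frames in a 3-second chunk. The helper name is ours. The values `n_fft=1024` and `hop_length=256` are the ones used by the preprocessor in this notebook; `hop_length=768` is included only as the hop that would give 25% overlap for the same window. The frame count assumes centered framing (the torchaudio default), which yields `n_samples // hop_length + 1` frames.

```python
def frame_geometry(sample_rate, n_fft, hop_length, chunk_seconds):
    """Return (window length in ms, frame overlap fraction, frames per chunk)."""
    window_ms = 1000.0 * n_fft / sample_rate
    overlap = 1.0 - hop_length / n_fft
    n_samples = int(chunk_seconds * sample_rate)
    n_frames = n_samples // hop_length + 1  # assumes centered (padded) framing
    return window_ms, overlap, n_frames


print(frame_geometry(32000, 1024, 256, 3.0))  # (32.0, 0.75, 376)
print(frame_geometry(32000, 1024, 768, 3.0))  # (32.0, 0.25, 126)
```

So a hop of 256 samples with a 1024-sample window corresponds to 75% overlap, and a 3-second chunk becomes a 64 x 376 mel spectrogram with these settings.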


In [23]:
class BirdSongPreprocessor:
    def __init__(self):
        # Key parameters from the paper:
        self.sample_rate = 32000  # Competition data is 32kHz
        self.n_fft = 1024  # FFT window size (32 ms at 32 kHz)
        self.hop_length = 256  # Hop of 25% of the window, i.e. 75% overlap (the 25% overlap from the BirdNET paper would need hop_length=768)
        self.f_min = 150  # Min frequency 150 Hz
        self.f_max = 15000  # Max frequency 15 kHz
        self.n_mels = 64  # 64 mel bands
        
        # Initialize mel spectrogram transformer
        self.mel_spectrogram = AT.MelSpectrogram(
            sample_rate=self.sample_rate,
            n_fft=self.n_fft,
            win_length=self.n_fft,
            hop_length=self.hop_length,
            f_min=self.f_min,
            f_max=self.f_max,
            n_mels=self.n_mels,
            mel_scale="htk",  # Using HTK-style mel scaling
            power=2.0,  # Power spectrogram
            normalized=True,
            norm='slaney'  # Slaney-style mel normalization
        )

    def extract_signal_segments(self, waveform, threshold=0.5):
        """
        Extract segments containing bird vocalizations based on signal strength
        Similar to Sprengel et al., 2016 approach mentioned in the paper
        """
        # Convert to spectrogram
        spec = self.mel_spectrogram(waveform)
        
        # Calculate signal strength (simplified version)
        signal_strength = torch.mean(spec, dim=1)  # Average across frequency bins
        
        # Find segments above threshold
        mask = signal_strength > (threshold * torch.max(signal_strength))
        
        return mask

    def process_audio(self, audio_path, segment_duration=3.0):
        """
        Process audio file into mel spectrograms
        Uses 3-second chunks as recommended in the paper
        """
        # Load audio
        waveform, sr = torchaudio.load(audio_path)
        
        # Resample if necessary
        if sr != self.sample_rate:
            resampler = AT.Resample(sr, self.sample_rate)
            waveform = resampler(waveform)
        
        # Calculate number of samples for segment_duration
        segment_samples = int(segment_duration * self.sample_rate)
        
        # Extract segments with bird sounds (the mask is per spectrogram frame, not per sample)
        signal_mask = self.extract_signal_segments(waveform)
        hop = self.hop_length  # samples per spectrogram frame
        
        # Find continuous runs of frames where signal is present
        segments = []
        start_idx = None
        
        for i, is_signal in enumerate(signal_mask[0].tolist()):  # [0]: use the first channel's mask
            if is_signal and start_idx is None:
                start_idx = i
            elif not is_signal and start_idx is not None:
                # Convert the run from frame indices to sample indices
                start_sample = start_idx * hop
                end_sample = i * hop
                if end_sample - start_sample >= segment_samples:
                    segment = waveform[:, start_sample:start_sample + segment_samples]
                    segments.append(segment)
                start_idx = None
        
        # If no segments found or they're too short, use the whole audio
        if not segments:
            # Pad or truncate to segment_duration
            if waveform.shape[1] < segment_samples:
                # Pad
                padding = segment_samples - waveform.shape[1]
                waveform = torch.nn.functional.pad(waveform, (0, padding))
            else:
                # Take center segment
                start = (waveform.shape[1] - segment_samples) // 2
                waveform = waveform[:, start:start + segment_samples]
            segments = [waveform]
        
        # Process all segments to mel spectrograms
        mel_specs = []
        for segment in segments:
            # Convert to mel spectrogram
            mel_spec = self.mel_spectrogram(segment)
            # Apply log scaling (nonlinear magnitude scale as mentioned in paper)
            mel_spec = torch.log(mel_spec + 1e-9)
            mel_specs.append(mel_spec)
        
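        # Stack all spectrograms if multiple segments.
        # Note: the result is [channels, n_mels, frames] for a single segment but
        # [n_segments, channels, n_mels, frames] for several; the augmentation
        # methods below expect the 3-D form.
        return torch.stack(mel_specs) if len(mel_specs) > 1 else mel_specs[0]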

    def augment_audio(self, waveform):
        """
        Implement augmentations mentioned in the paper
        """
        # Pitch shift within the frequency range
        pitch_shift = AT.PitchShift(self.sample_rate, n_steps=2)

        # Apply pitch shift
        augmented = waveform
        if np.random.random() > 0.5:
            augmented = pitch_shift(augmented)

        # Time stretch by resampling (note: this also shifts pitch by the same factor)
        if np.random.random() > 0.5:
            stretch_factor = np.random.uniform(0.9, 1.1)
            length = augmented.shape[1]
            augmented = AT.Resample(
                self.sample_rate, int(self.sample_rate * stretch_factor)
            )(augmented)
            new_length = augmented.shape[1]  # actual length after resampling

            # Pad or trim to maintain original length
            if new_length > length:
                augmented = augmented[:, :length]
            else:
                augmented = torch.nn.functional.pad(augmented, (0, length - new_length))

        return augmented
    
    def augment_spectrogram(self, spec, ambient_specs=None):
        """
        Apply domain-specific augmentations to spectrograms as described in BirdNET paper

        Args:
            spec (torch.Tensor): Input spectrogram
            ambient_specs (list): List of ambient noise spectrograms from non-salient chunks
        """
        # Maximum of three augmentations per sample as mentioned in the paper
        num_augmentations = np.random.randint(1, 4)
        augmented = spec.clone()

        # List of possible augmentations
        augmentations = [
            self._frequency_shift,
            self._time_shift,
            self._spec_warp,
            lambda x: self._add_ambient_noise(x, ambient_specs) if ambient_specs else x,
        ]

        # Randomly select and apply augmentations
        selected_augs = np.random.choice(
            augmentations, size=num_augmentations, replace=False
        )

        for aug in selected_augs:
            if np.random.random() > 0.5:  # 0.5 probability as mentioned in paper
                augmented = aug(augmented)

        return augmented

    def _frequency_shift(self, spec, max_shift=10):
        """Vertical roll - Shift in frequency domain (bins wrap around)"""
        # randint's upper bound is exclusive, so add 1 to make the range symmetric
        shift = np.random.randint(-max_shift, max_shift + 1)
        return torch.roll(spec, shifts=shift, dims=1)

    def _time_shift(self, spec, max_shift=50):
        """Horizontal roll - Shift in time domain (frames wrap around)"""
        shift = np.random.randint(-max_shift, max_shift + 1)
        return torch.roll(spec, shifts=shift, dims=2)

    def _spec_warp(self, spec):
        """
        Simplified stand-in for SpecAugment-style warping:
        scales the values in a random time-frequency patch by a random
        factor (no actual stretching in time or frequency is performed)
        """
        freq_dim, time_dim = spec.shape[1:]

        # Create warping parameters
        w = np.random.randint(5, 20)  # window size
        center_freq = np.random.randint(w, freq_dim - w)
        center_time = np.random.randint(w, time_dim - w)

        # Create warping matrix
        factor = np.random.uniform(0.8, 1.2)
        warped = spec.clone()

        # Apply warping around center point
        warped[
            :, center_freq - w : center_freq + w, center_time - w : center_time + w
        ] *= factor

        return warped

    def _add_ambient_noise(self, spec, ambient_specs, max_weight=0.5):
        """Add random ambient noise from non-salient chunks"""
        if not ambient_specs:
            return spec
            
        # Get target shape
        _, freq_dim, time_dim = spec.shape
        
        # Randomly select an ambient noise spectrogram
        noise_spec = ambient_specs[np.random.randint(len(ambient_specs))]
        
        # Resize noise spectrogram to match target shape
        if noise_spec.shape[1:] != spec.shape[1:]:
            # Center crop or pad the time dimension
            if noise_spec.shape[2] > time_dim:
                # Center crop
                start = (noise_spec.shape[2] - time_dim) // 2
                noise_spec = noise_spec[:, :, start:start + time_dim]
            else:
                # Pad
                pad_size = time_dim - noise_spec.shape[2]
                pad_left = pad_size // 2
                pad_right = pad_size - pad_left
                noise_spec = torch.nn.functional.pad(noise_spec, (pad_left, pad_right))
    
        # Random weighting for noise
        weight = np.random.uniform(0, max_weight)
        
        # Add weighted noise
        augmented = (1 - weight) * spec + weight * noise_spec
        
        return augmented

    def collect_ambient_noise(self, audio_path, threshold=0.3):
        """
        Collect non-salient chunks for ambient noise augmentation
        """
        waveform, sr = torchaudio.load(audio_path)
        if sr != self.sample_rate:
            resampler = AT.Resample(sr, self.sample_rate)
            waveform = resampler(waveform)

        # Get signal mask
        signal_mask = self.extract_signal_segments(waveform, threshold)

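        # Collect non-salient frames (where signal is below threshold)
        non_salient_mask = ~signal_mask  # shape [channels, frames]

        if non_salient_mask.any():
            # Convert to log-mel spectrogram
            spec = self.mel_spectrogram(waveform)
            spec = torch.log(spec + 1e-9)

            # Keep non-salient frames; fill salient frames with the log floor
            # (log(1e-9)) so they read as silence rather than as 0.
            # unsqueeze(1) broadcasts the frame mask across the mel bins.
            floor = torch.log(torch.tensor(1e-9))
            return torch.where(non_salient_mask.unsqueeze(1), spec, floor)

        return None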
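
The signal mask returned by `extract_signal_segments` is indexed by spectrogram frame, while the waveform is indexed by sample, so frame runs have to be scaled by the hop length before slicing audio. The sketch below shows that conversion in isolation. The function name and the toy mask are hypothetical, and it assumes frame `i` starts at sample `i * hop_length`, which is an approximation for centered frames.

```python
import torch


def mask_to_sample_ranges(mask, hop_length, min_samples=0):
    """Convert a 1-D boolean frame mask into (start, end) sample ranges."""
    ranges = []
    start = None
    # The trailing False is a sentinel that closes a run reaching the last frame
    for i, active in enumerate(mask.tolist() + [False]):
        if active and start is None:
            start = i
        elif not active and start is not None:
            begin, end = start * hop_length, i * hop_length
            if end - begin >= min_samples:
                ranges.append((begin, end))
            start = None
    return ranges


toy_mask = torch.tensor([False, True, True, False, True, True, True])
print(mask_to_sample_ranges(toy_mask, hop_length=256))
# [(256, 768), (1024, 1792)]
```

Passing `min_samples=int(3.0 * 32000)` would keep only runs long enough to cut a full 3-second chunk from.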

In [24]:
# Initialize preprocessor
preprocessor = BirdSongPreprocessor()

def prepare_batch(audio_files, save_dir="train_audio_processed", training=True, show_progress=True):
    """
    Prepare a batch of audio files for model training or inference

    Args:
        audio_files (list): List of audio file paths
        save_dir (str): Directory to save the processed audio files
        training (bool): Whether to apply augmentation
        show_progress (bool): Whether to show progress bars
    """

    # create save_dir if it doesn't exist
    save_dir = Path(save_dir)
    save_dir.mkdir(parents=True, exist_ok=True)

    # create metadata file to store mapping
    metadata = []
    specs = []
    ambient_specs = []

    # Group files by folder for better progress tracking
    files_by_folder = {}
    for file in audio_files:
        folder = os.path.basename(os.path.dirname(file))
        if folder not in files_by_folder:
            files_by_folder[folder] = []
        files_by_folder[folder].append(file)

    # First collect ambient noise
    ambient_folder_iter = tqdm(
        files_by_folder.items(),
        desc="Collecting ambient noise samples",
        disable=not show_progress,
    )
    if training:
        # Parameters for sampling
        max_files_per_folder = 3  # Maximum files to check per folder
        target_noise_samples = 1000  # Target number of noise samples to collect
        
        ambient_specs = []
        for folder, folder_files in ambient_folder_iter:
            # Randomly sample files from this folder
            sampled_files = np.random.choice(
                folder_files,
                size=min(max_files_per_folder, len(folder_files)),
                replace=False
            )
            
            for audio_file in tqdm(
                sampled_files, 
                desc=f"Sampling noise from {folder}", 
                leave=False, 
                disable=not show_progress
            ):
                ambient_spec = preprocessor.collect_ambient_noise(audio_file)
                if ambient_spec is not None:
                    ambient_specs.append(ambient_spec)
                    
                # Break if we have collected enough samples
                if len(ambient_specs) >= target_noise_samples:
                    break
                    
            # Break outer loop if we have enough samples
            if len(ambient_specs) >= target_noise_samples:
                print(f"\nReached target of {target_noise_samples} noise samples")
                break

        print(f"Collected {len(ambient_specs)} ambient noise samples")

    # Process audio files
    folder_iter = tqdm(
        files_by_folder.items(),
        desc="Processing folders",
        disable=not show_progress,
    )
    for folder, folder_files in folder_iter:
        for audio_file in tqdm(
            folder_files, desc="Processing audio files", leave=False, disable=not show_progress
        ):
            # Create unique filename for the processed spec
            spec_filename = Path(audio_file).stem + ".pt"
            spec_path = save_dir / folder / spec_filename

            if spec_path.exists():
                # Load existing spectrogram
                spec = torch.load(spec_path)
                specs.append(spec)
                metadata.append(
                    {
                        "original_file": audio_file,
                        "processed_file": str(spec_path),
                        "folder": folder,
                    }
                )
                continue

            # Create folder if it doesn't exist
            (save_dir / folder).mkdir(exist_ok=True)

            try:
                # Process audio file
                spec = preprocessor.process_audio(audio_file)

                # Data augmentation (training only)
                if training:
                    spec = preprocessor.augment_spectrogram(spec, ambient_specs)

                # Save spectrogram
                torch.save(spec, spec_path)

                specs.append(spec)
                metadata.append(
                    {
                        "original_file": audio_file,
                        "processed_file": str(spec_path),
                        "folder": folder,
                    }
                )

            except Exception as e:
                print(f"\nError processing {audio_file}: {str(e)}")
                continue

    # Save metadata
    metadata_df = pd.DataFrame(metadata)
    metadata_df.to_csv(save_dir / "metadata.csv", index=False)

    # Print summary
    print("\nProcessing Summary:")
    print(f"Total files processed: {len(specs)}")
    print("Files per folder:")
    summary = metadata_df["folder"].value_counts()
    print(summary.head().to_string())

    return specs, metadata_df

In [25]:
# Get all .ogg files recursively
train_files = glob.glob("data/train_audio/**/*.ogg", recursive=True)

print(f"Found {len(train_files)} audio files")

# prepare_batch returns both the spectrograms and the metadata DataFrame
train_specs, train_metadata = prepare_batch(train_files, training=True)

Found 28564 audio files



Sampling noise from wbwwre1:   0%|          | 0/3 [00:00<?, ?it/s]

Sampling noise from crebob1:   0%|          | 0/3 [00:00<?, ?it/s]

Sampling noise from yectyr1:   0%|          | 0/3 [00:00<?, ?it/s]

Sampling noise from 22976:   0%|          | 0/3 [00:00<?, ?it/s]

Sampling noise from 66893:   0%|          | 0/3 [00:00<?, ?it/s]

Sampling noise from ruther1:   0%|          | 0/3 [00:00<?, ?it/s]

Sampling noise from rutpuf1:   0%|          | 0/3 [00:00<?, ?it/s]

Collected 600 ambient noise samples


Processing folders:   0%|          | 0/206 [00:00<?, ?it/s]

KeyboardInterrupt: 