# Create Base CSV Files

This Jupyter notebook converts tab-separated TXT annotation files into CSV format for a bird song classification project.

Each annotation records the audio file name, the start and end of a bird song within that file, and the species. The goal is to store this information as structured CSV files.
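
As a rough illustration of that layout, the sketch below parses a small tab-separated sample held in memory. The column names `file`, `start`, `end`, and `specie` are the ones selected later in this notebook. The rows themselves, the species labels, the extra `notes` column, and the assumption that times are expressed in seconds are all hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample: only the four column names come from this notebook.
# The values and the extra "notes" column are made up for illustration.
sample_txt = (
    "file\tstart\tend\tspecie\tnotes\n"
    "AM1_20230510_060000.WAV\t1.5\t3.25\tspecies_a\tclear\n"
    "AM1_20230510_060000.WAV\t10.0\t12.0\tspecies_b\tfaint\n"
)

# Parse the tab-separated text exactly as a TXT file on disk would be parsed
annotations = pd.read_csv(io.StringIO(sample_txt), sep="\t")

# Keep only the columns of interest, dropping anything else
annotations = annotations[["file", "start", "end", "specie"]]

# Assuming start and end share a unit, their difference is the song duration
annotations["duration"] = annotations["end"] - annotations["start"]

print(annotations)
```

Each row describes one annotated song, so a single audio file can appear on several rows.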

## Process Overview

1. The notebook recursively searches the dataset directory, including all subdirectories, for TXT annotation files.

2. Each TXT file is read as tab-separated text into a pandas DataFrame.

3. Only the relevant columns are kept: the audio file name (`file`), start time (`start`), end time (`end`), and bird species (`specie`).

4. The result is saved to the "Data/Annotations" directory as a CSV file with the same base name as the original TXT file.
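
The steps above could also be wrapped in a reusable function. The following is a minimal sketch using `pathlib`, not the code this notebook runs: the function name `convert_annotations`, the `REQUIRED_COLUMNS` constant, and the choice to skip TXT files that lack the expected columns are assumptions made for illustration.

```python
from pathlib import Path

import pandas as pd

# Columns kept in the output (hypothetical constant name)
REQUIRED_COLUMNS = ["file", "start", "end", "specie"]


def convert_annotations(dataset_dir, output_dir):
    """Convert every tab-separated TXT file under dataset_dir to a CSV in output_dir."""
    dataset_dir, output_dir = Path(dataset_dir), Path(output_dir)

    # Create the output folder (and any missing parents) if needed
    output_dir.mkdir(parents=True, exist_ok=True)

    written = []
    # rglob searches the directory and all of its subdirectories
    for txt_path in sorted(dataset_dir.rglob("*.txt")):
        df = pd.read_csv(txt_path, sep="\t")

        # Assumption: TXT files without the expected columns are not annotations
        missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
        if missing:
            print(f"Skipping {txt_path}: missing columns {missing}")
            continue

        # Same base name as the TXT file, with a .csv extension
        csv_path = output_dir / f"{txt_path.stem}.csv"
        df[REQUIRED_COLUMNS].to_csv(csv_path, index=False)
        written.append(csv_path)

    return written
```

Because all CSV files land in one flat folder, two TXT files with the same name in different subdirectories would overwrite each other. An empty TXT file would also make `pd.read_csv` raise an error, which this sketch does not handle.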

In [3]:
import os
import pandas as pd

In [19]:
ROOT_PATH = "../"

# Path to the root folder containing the dataset
dataset_path = ROOT_PATH + "Dataset"

# Path to the folder where you want to save the CSV files
output_path = ROOT_PATH + "Data/Annotations"

In [21]:
# Make sure the output folder exists before writing to it
os.makedirs(output_path, exist_ok=True)

# Iterate through all subdirectories in the dataset path
for root, dirs, files in os.walk(dataset_path):
    for file in files:
        if file.endswith(".txt"):
            txt_file_path = os.path.join(root, file)

            # Read the tab-separated TXT file
            df = pd.read_csv(txt_file_path, sep='\t')

            # Select the desired columns
            df = df[['file', 'start', 'end', 'specie']]

            # Define the CSV file path in the Annotations folder,
            # swapping only the trailing .txt extension for .csv
            csv_file_name = os.path.splitext(file)[0] + '.csv'
            csv_file_path = os.path.join(output_path, csv_file_name)

            # Save the selected columns to the CSV file
            df.to_csv(csv_file_path, index=False)

            print(f"Created CSV file: {csv_file_path}")


README.md
AM1_20230510_060000.WAV
AM1_20230510_061000.WAV
AM1_20230510_070000.WAV
AM1_20230510_073000.WAV
AM1_20230510_080000.WAV
AM1_20230510_083000.WAV
AM1_20230510_090000.WAV
AM1_20230510_093000.WAV
AM1_20230510_100000.WAV
AM1_20230510_103000.WAV
AM1_20230510_110000.WAV
AM01_20230510_Labels_without frequencies_mod.txt
Created CSV file: ../Data/Annotations\AM01_20230510_Labels_without frequencies_mod.csv
AM1_20230511_060000.WAV
AM1_20230511_063000.WAV
AM1_20230511_070000.WAV
AM1_20230511_080000.WAV
AM1_20230511_083000.WAV
AM1_20230511_090000.WAV
AM1_20230511_093000.WAV
AM1_20230511_100000.WAV
AM1_20230511_103000.WAV
AM1_20230511_110000.WAV
AM01_20230511_Labels_without frequencies_mod.txt
Created CSV file: ../Data/Annotations\AM01_20230511_Labels_without frequencies_mod.csv
AM1_20230512_060000.WAV
AM1_20230512_063000.WAV
AM1_20230512_070000.WAV
AM1_20230512_080000.WAV
AM1_20230512_083000.WAV
AM1_20230512_090000.WAV
AM1_20230512_093000.WAV
AM1_20230512_100000.WAV
AM1_20230512_103000.WA