# Add Information to Base CSV Files

This Jupyter Notebook adds further annotations to the base CSV files.

The new annotations record the date and time of each birdsong recording. The goal is to create new CSV files that store this information in a structured format.

## Process Overview

1. The notebook will recursively search for CSV annotation files within a specified data directory.

2. For each CSV file found, it will extract relevant information, including the audio file name, start time, end time, and bird species.

3. The extracted data will be organized into a structured DataFrame.

4. New data of interest, such as the recording date and time, will be derived and added for each audio file.

5. The data will be saved as a new CSV file (`c01_audio_annotations.csv`) in the "Data/Annotations" directory.
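
The recursive search in step 1 could be sketched as follows. This is only a minimal illustration using `pathlib`; the helper name `find_csv_files` and the demo file names are hypothetical and do not come from the notebook.

```python
from pathlib import Path
import tempfile


def find_csv_files(data_dir):
    """Return every .csv file below data_dir (searched recursively), sorted."""
    return sorted(Path(data_dir).rglob("*.csv"))


# Demo on a throwaway directory tree (hypothetical file names)
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "AM1" / "2023_05_10").mkdir(parents=True)
    (root / "AM1" / "2023_05_10" / "a.csv").write_text("path\n")
    (root / "top.csv").write_text("path\n")
    (root / "notes.txt").write_text("ignored\n")  # not a CSV, so it is skipped
    found = [p.relative_to(root).as_posix() for p in find_csv_files(root)]

print(found)
```

`rglob` descends into every subdirectory, so files at any depth are picked up, while non-CSV files are skipped.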

In [1]:
import pandas as pd

In [2]:
ROOT_PATH = "../../../desarrollo/"

# Load the CSV file
input_file = ROOT_PATH + "Data/Annotations/" + "b01_audio_annotations.csv"
df = pd.read_csv(input_file)

# Path of the output CSV file
output_file = ROOT_PATH + "Data/Annotations/" + "c01_audio_annotations.csv"

In [3]:
# Rename columns 'start' and 'end' to 'start_time' and 'end_time'
# df.rename(columns={"start": "start_time", "end": "end_time"}, inplace=True)

In [3]:
# Ensure the 'path' column is of string type
df['path'] = df['path'].astype(str)

# Define a function to construct the 'recorder' value (first path component)
def create_recorder(row):
    return row['path'].split("/")[0]

# Define a function to construct the 'path' value
def create_path(row):
    # Drop the EtiquetasAudios subdirectory (third component) from the path
    parts = row['path'].split("/")
    return f"{parts[0]}/{parts[1]}/{parts[3]}"

# Define a function to construct the 'date' value (YYYY/mm/dd)
def create_date(row):
    # The second path component holds the date fields separated by underscores
    year, month, day = row['path'].split("/")[1].split("_")[:3]
    return f"{year}/{month}/{day}"

# Define a function to construct the 'time' value (HH:MM:SS)
def create_time(row):
    # Relies on 'path' having already been rewritten by create_path, so that
    # the third component is the file name; its third "_"-separated field
    # starts with the HHMMSS digits
    stamp = row['path'].split("/")[2].split("_")[2]
    return f"{stamp[:2]}:{stamp[2:4]}:{stamp[4:6]}"

# Apply the function to create the 'recorder' column
df['recorder'] = df.apply(create_recorder, axis=1)

# Apply the function to create the 'path' column
df['path'] = df.apply(create_path, axis=1)

# Apply the function to create the 'date' column
df['date'] = df.apply(create_date, axis=1)

# Apply the function to create the 'time' column
df['time'] = df.apply(create_time, axis=1)
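
As a worked example of the parsing above, the same columns can also be derived with vectorized string operations instead of row-wise `apply`. This is a sketch, not the notebook's method. The sample path is hypothetical, and its layout (`recorder/YYYY_mm_dd/EtiquetasAudios/<name>_<date>_<HHMMSS>.wav`) is an assumption inferred from the indices used in the functions.

```python
import pandas as pd

# Hypothetical sample row; the path layout is an assumption (see above)
demo = pd.DataFrame(
    {"path": ["AM1/2023_05_10/EtiquetasAudios/AM1_20230510_073000.wav"]}
)

# Split the original path once: columns 0..3 hold the four components
parts = demo["path"].str.split("/", expand=True)

demo["recorder"] = parts[0]

# Date: join the three "_"-separated fields of the second component with "/"
ymd = parts[1].str.split("_", expand=True)
demo["date"] = ymd[0] + "/" + ymd[1] + "/" + ymd[2]

# Time: the third "_"-separated field of the file name starts with HHMMSS
stamp = parts[3].str.split("_", expand=True)[2]
demo["time"] = stamp.str[:2] + ":" + stamp.str[2:4] + ":" + stamp.str[4:6]

# Path: drop the third component (the EtiquetasAudios subdirectory)
demo["path"] = parts[0] + "/" + parts[1] + "/" + parts[3]

result = demo.iloc[0].to_dict()
print(result)
```

Because the path is split only once, from the original value, this variant does not depend on the order in which the columns are assigned.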

In [4]:
# Rearrange the columns
df = df[[
    'path', 'annotator', 'recorder', 'date', 'time',
    'start_time', 'end_time', 'low_frequency', 'high_frequency', 'specie',
]]

In [5]:
# Save the transformed file
df.to_csv(output_file, index=False)
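
One possible sanity check on the derived columns is to combine `date` and `time` into a timestamp and count the rows that fail to parse. The sketch below uses a small hypothetical frame rather than the real annotations; the names `check`, `timestamp`, and `n_bad` are illustrative only.

```python
import pandas as pd

# Hypothetical values: the second row has an impossible month and day
check = pd.DataFrame(
    {
        "date": ["2023/05/10", "2023/13/40"],
        "time": ["07:30:00", "07:30:00"],
    }
)

# Rows that do not match the expected format become NaT instead of raising
check["timestamp"] = pd.to_datetime(
    check["date"] + " " + check["time"],
    format="%Y/%m/%d %H:%M:%S",
    errors="coerce",
)

n_bad = int(check["timestamp"].isna().sum())
print(f"{n_bad} row(s) could not be parsed")
```

A non-zero count suggests that some paths do not follow the layout the parsing functions expect.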