# Generate Metadata for Species Audio Files

This program creates metadata for the audio files in a specified directory. It is built from the following functions:

CalcAudioDuration(length): This function calculates the audio duration in seconds, given the length of the audio in milliseconds. It converts milliseconds to seconds and returns the duration.
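As a small illustration of this conversion, the sketch below divides a millisecond length by 1000. It also shows one way the hours-and-minutes idea could look. The names `ms_to_seconds` and `format_duration` are hypothetical and not part of the script.

```python
def ms_to_seconds(length_ms):
    """Convert a length in milliseconds to seconds."""
    return length_ms / 1000


def format_duration(length_ms):
    """Hypothetical helper: render milliseconds as H:MM:SS.mmm."""
    total_seconds, ms = divmod(int(length_ms), 1000)
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{ms:03d}"


print(ms_to_seconds(2650))       # 2.65
print(format_duration(2650))     # 0:00:02.650
print(format_duration(3725500))  # 1:02:05.500
```

Keeping the stored value as plain seconds, as the script does, makes the CSV column easy to sort and sum. A formatted string like the one above is better suited to display.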

ComputeAudioDuration(filename): This function retrieves the duration of a single audio file. It first checks that the file extension is a supported format ('.wav' or '.mp3'). If so, it loads the audio file, takes its length in milliseconds, and uses CalcAudioDuration to get the duration in seconds. Any error during this process, including an unsupported format or a missing file, is caught and printed, and the function returns None.

ExtractClassAndFile(path): This function extracts class names and filenames from the subdirectories of a specified directory. Each subdirectory name is treated as a class. The function lists the subdirectories, iterates through them, and records one class and filename pair per file.
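To make the expected directory layout concrete, here is a minimal, self-contained sketch. It builds a tiny throwaway tree with empty placeholder files and collects the class and filename pairs from it. The folder names, file names, and the snake_case function name are assumptions for illustration. Unlike a plain directory listing, this variant skips top-level entries that are not directories and sorts the listings so the output is reproducible.

```python
import os
import tempfile


def extract_class_and_file(path):
    """Collect (class, filename) pairs from the immediate subdirectories of path."""
    classes, filenames = [], []
    for folder in sorted(os.listdir(path)):
        folder_path = os.path.join(path, folder)
        if not os.path.isdir(folder_path):
            continue  # skip loose files such as a previously written CSV
        for name in sorted(os.listdir(folder_path)):
            classes.append(folder)
            filenames.append(name)
    return classes, filenames


def build_demo_tree(root):
    """Create a hypothetical dataset layout made of empty placeholder files."""
    layout = {
        "class_a": ["a (1).wav", "a (2).wav"],
        "class_b": ["b (1).wav"],
    }
    for folder, names in layout.items():
        os.makedirs(os.path.join(root, folder))
        for name in names:
            open(os.path.join(root, folder, name), "w").close()
    # A loose top-level file, like the metadata CSV after a first run.
    open(os.path.join(root, "species_metadata.csv"), "w").close()


with tempfile.TemporaryDirectory() as root:
    build_demo_tree(root)
    demo_classes, demo_filenames = extract_class_and_file(root)

print(list(zip(demo_classes, demo_filenames)))
```

The two returned lists are parallel: position i in one list belongs with position i in the other. This is why they can later be zipped together to rebuild each file path.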

GetAudioDurations(path, classes, filenames): This function calculates the durations of audio files in the specified directory. It uses the ComputeAudioDuration function to process each audio file, collecting the durations into a list.

CreateMetaData(path): This function generates metadata for the audio files. It combines the filename, duration, and class information into a pandas DataFrame with the columns 'Filename', 'Seconds', and 'Class', and returns it.

WriteMetadata(path): This function writes the metadata DataFrame to a CSV file named 'species_metadata.csv' in the specified directory.

The script sets the 'path' variable to the directory containing the audio files and then calls 'WriteMetadata' to generate the metadata and save it as a CSV file. Each row records the duration of one file in seconds, ready for further analysis. A file that cannot be loaded gets an empty duration instead of stopping the run.

In [1]:
# Import the libraries
import os
from pydub import AudioSegment
import pandas as pd



In [2]:
def CalcAudioDuration(length):
    """
    Function to compute the duration. You could add more features like hours and minutes here, 
    but for now it would just be seconds.
    """
    seconds = length / 1000  # Convert milliseconds to seconds
    return seconds

def ComputeAudioDuration(filename):
    """
    This function will retrieve the duration from any file passed to it
    """
    try:
        # Check that the file extension is a supported format
        ext = os.path.splitext(filename)[-1].lower()
        if ext not in ['.wav', '.mp3']:
            raise ValueError("Unsupported audio format")

        audio = AudioSegment.from_file(filename)
        length = len(audio)
        seconds = CalcAudioDuration(length)
        return seconds
    except Exception as e:
        print(f"Error loading {filename}: {e}")
        return None

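def ExtractClassAndFile(path):
    """
    Collect the class (subdirectory name) and filename of every file one level below path
    """
    classes = []
    filenames = []

    for folder in os.listdir(path):
        filepath = os.path.join(path, folder)
        # Skip loose files such as a previously written species_metadata.csv
        if not os.path.isdir(filepath):
            continue
        for file in os.listdir(filepath):
            classes.append(folder)
            filenames.append(file)
    return classes, filenames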

def GetAudioDurations(path, classes, filenames):
    seconds = []
    for cls, filename in zip(classes, filenames):
        fp = os.path.join(path, cls, filename)
        second = ComputeAudioDuration(fp)
        seconds.append(second)
    return seconds

def CreateMetaData(path):
    classes, filenames = ExtractClassAndFile(path)
    seconds = GetAudioDurations(path, classes, filenames)
    
    # One row per audio file; the column names become the CSV header
    df = pd.DataFrame({'Filename': filenames, 'Seconds': seconds, 'Class': classes})
    return df

def WriteMetadata(path):
    df = CreateMetaData(path)
    df.to_csv(os.path.join(path, 'species_metadata.csv'), index=False)

path = r"./Data2/"
WriteMetadata(path)


Finally, the script sets 'path' to "./Data2/" and calls 'WriteMetadata(path)', which generates the metadata for the audio files under "./Data2/" and saves it in that directory as 'species_metadata.csv'.
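The script loads audio through pydub. As an independent cross-check for WAV files only, the sketch below computes a duration with Python's standard-library `wave` module, as the frame count divided by the frame rate. This is a swapped-in technique, not the script's own method, and it assumes uncompressed PCM WAV input. The helper names and the synthetic test tone are hypothetical.

```python
import math
import os
import struct
import tempfile
import wave


def wav_duration_seconds(filename):
    """Return the duration of a PCM WAV file as frames / frame rate."""
    with wave.open(filename, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def write_demo_tone(filename, seconds=0.75, rate=8000, freq=440.0):
    """Write a mono 16-bit sine tone so the example needs no real recording."""
    n_frames = int(seconds * rate)
    samples = (
        int(32767 * 0.3 * math.sin(2 * math.pi * freq * i / rate))
        for i in range(n_frames)
    )
    with wave.open(filename, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"".join(struct.pack("<h", s) for s in samples))


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.wav")
    write_demo_tone(demo_path)
    demo_seconds = wav_duration_seconds(demo_path)

print(demo_seconds)  # 0.75
```

If the two methods disagree noticeably for a given file, that file is worth inspecting by hand before its duration is trusted.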

In [7]:
pd.read_csv(r'Data2/species_metadata.csv')

Unnamed: 0,Filename,Seconds,Class
0,powerful owl (1).wav,0.75,Ninox strenua - powerful owl
1,powerful owl (10).wav,4.00,Ninox strenua - powerful owl
2,powerful owl (100).wav,2.65,Ninox strenua - powerful owl
3,powerful owl (101).wav,3.65,Ninox strenua - powerful owl
4,powerful owl (102).wav,3.35,Ninox strenua - powerful owl
...,...,...,...
156,ground parrot (54).wav,3.99,Pezoporus wallicus - ground parrot
157,ground parrot (6).wav,4.00,Pezoporus wallicus - ground parrot
158,ground parrot (7).wav,4.00,Pezoporus wallicus - ground parrot
159,ground parrot (8).wav,1.60,Pezoporus wallicus - ground parrot
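As one example of further use, the sketch below counts files and totals the seconds per class with only the standard library. It uses just the nine rows displayed above, not the full file, so the totals describe this sample only. The function name `summarize_by_class` is hypothetical. Rows with an empty 'Seconds' cell are skipped, on the assumption that a missing duration is written as an empty field.

```python
import csv
import io
from collections import defaultdict

# Only the rows visible in the output above, in the CSV's column order.
sample_csv = """Filename,Seconds,Class
powerful owl (1).wav,0.75,Ninox strenua - powerful owl
powerful owl (10).wav,4.00,Ninox strenua - powerful owl
powerful owl (100).wav,2.65,Ninox strenua - powerful owl
powerful owl (101).wav,3.65,Ninox strenua - powerful owl
powerful owl (102).wav,3.35,Ninox strenua - powerful owl
ground parrot (54).wav,3.99,Pezoporus wallicus - ground parrot
ground parrot (6).wav,4.00,Pezoporus wallicus - ground parrot
ground parrot (7).wav,4.00,Pezoporus wallicus - ground parrot
ground parrot (8).wav,1.60,Pezoporus wallicus - ground parrot
"""


def summarize_by_class(rows):
    """Return {class: (file_count, total_seconds)}, skipping empty durations."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for row in rows:
        if not row["Seconds"]:
            continue  # no duration recorded for this file
        counts[row["Class"]] += 1
        totals[row["Class"]] += float(row["Seconds"])
    return {cls: (counts[cls], round(totals[cls], 2)) for cls in counts}


summary = summarize_by_class(csv.DictReader(io.StringIO(sample_csv)))
for cls, (n_files, total) in summary.items():
    print(f"{cls}: {n_files} files, {total} s")
```

Passing the rows of the real 'species_metadata.csv' to the same function would give per-class totals for the whole dataset.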
