# Batch Audio Analysis

This notebook demonstrates batch analysis of multiple audio files with the CTC-SpeechRefinement package. It covers loading a directory of recordings, comparing their time-domain, spectral, pitch, and timbre characteristics, detecting anomalies, and generating a summary report.

## Setup

First, let's import the necessary libraries and set up the environment.

In [None]:
# Add the project root to the Python path
import sys
import os
sys.path.append(os.path.abspath('..'))

# Import libraries
import numpy as np
import matplotlib.pyplot as plt
import librosa
import librosa.display
import pandas as pd
import seaborn as sns
from IPython.display import Audio, display, HTML
import glob
from pathlib import Path
from tqdm.notebook import tqdm

# Import from the project
from ctc_speech_refinement.core.preprocessing.audio import load_audio
from ctc_speech_refinement.core.eda.descriptive_stats import analyze_descriptive_stats, batch_analyze_descriptive_stats
from ctc_speech_refinement.core.eda.time_domain import analyze_time_domain, batch_analyze_time_domain
from ctc_speech_refinement.core.eda.frequency_domain import analyze_frequency_domain, batch_analyze_frequency_domain
from ctc_speech_refinement.core.eda.pitch_timbre import analyze_pitch_timbre, batch_analyze_pitch_timbre
from ctc_speech_refinement.core.eda.anomaly_detection import analyze_anomalies, batch_analyze_anomalies
from ctc_speech_refinement.core.utils.file_utils import get_audio_files
from ctc_speech_refinement.core.eda.audio_eda import analyze_directory

# Set up plotting
plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['figure.dpi'] = 100

## 1. Load Multiple Audio Files

Let's load multiple audio files from a directory.

In [None]:
# Define the path to the directory containing audio files
audio_dir = "../data/speech2text/input"  # Path to the audio directory

# Get list of audio files
audio_files = get_audio_files(audio_dir)
print(f"Found {len(audio_files)} audio files in {audio_dir}")
for file in audio_files:
    print(f"- {file}")

# Load audio files
audio_data_dict = {}
for file in audio_files:
    audio_data, sample_rate = load_audio(file)
    audio_data_dict[file] = (audio_data, sample_rate)
    
print(f"\nLoaded {len(audio_data_dict)} audio files")
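
For readers who want to see what file discovery might involve, here is a minimal sketch using only the standard library. The function `find_audio_files` and the extension list are hypothetical stand-ins; the actual behavior of `get_audio_files` (supported formats, recursion, ordering) is not shown in this notebook and may differ.

```python
from pathlib import Path
import tempfile

# Assumed set of audio extensions; adjust to the formats in your dataset.
AUDIO_EXTENSIONS = {".wav", ".flac", ".mp3", ".ogg"}

def find_audio_files(directory, extensions=AUDIO_EXTENSIONS):
    """Return sorted paths of files under `directory` whose suffix is in `extensions`."""
    root = Path(directory)
    return sorted(
        str(path)
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in extensions
    )

# Demonstrate on a temporary directory so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["b.wav", "a.WAV", "notes.txt"]:
        (Path(tmp) / name).touch()
    found = [Path(p).name for p in find_audio_files(tmp)]

print(found)
```

Matching on the lower-cased suffix picks up files such as `a.WAV`, and sorting keeps the order of later tables and plots stable between runs.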

## 2. Basic Information

Let's display basic information about each audio file.

In [None]:
# Create a DataFrame with basic information
basic_info = []
for file, (audio_data, sample_rate) in audio_data_dict.items():
    duration = len(audio_data) / sample_rate
    basic_info.append({
        'File': os.path.basename(file),
        'Sample Rate (Hz)': sample_rate,
        'Duration (s)': duration,
        'Num Samples': len(audio_data),
        # Mean of the signed samples (the DC offset), typically close to zero
        'Mean Amplitude': np.mean(audio_data),
        'RMS': np.sqrt(np.mean(np.square(audio_data)))
    })
    
basic_info_df = pd.DataFrame(basic_info)
basic_info_df.set_index('File', inplace=True)
basic_info_df
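
To make the duration, mean, and RMS columns concrete, the sketch below applies the same formulas to a synthetic tone. The sample rate, frequency, and amplitude are arbitrary values chosen for illustration, not values from the dataset.

```python
import numpy as np

sample_rate = 16000  # Hz, assumed for this illustration
t = np.arange(sample_rate * 2) / sample_rate  # sample times for two seconds
signal = 0.5 * np.sin(2 * np.pi * 440.0 * t)  # 440 Hz tone with amplitude 0.5

duration = len(signal) / sample_rate            # number of samples / samples per second
mean_amplitude = np.mean(signal)                # signed mean, i.e. the DC offset
rms = np.sqrt(np.mean(np.square(signal)))       # root mean square, tracks signal energy

print(f"Duration: {duration:.2f} s")
print(f"Mean amplitude: {mean_amplitude:.6f}")
print(f"RMS: {rms:.4f}")
```

For a sine wave of amplitude A the RMS is A/√2, while the signed mean stays near zero. This is why RMS is usually the more informative of the two columns when comparing how loud recordings are.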

## 3. Batch Descriptive Statistics Analysis

Let's perform descriptive statistics analysis on all audio files.

In [None]:
# Perform batch descriptive statistics analysis
descriptive_stats_results = batch_analyze_descriptive_stats(audio_data_dict)

# Create a DataFrame with descriptive statistics
stats_df = pd.DataFrame()
for file, results in descriptive_stats_results.items():
    file_name = os.path.basename(file)
    stats = results['stats']
    stats_df[file_name] = pd.Series(stats)
    
# Display the statistics
stats_df.T

## 4. Compare Waveforms

Let's compare the waveforms of all audio files.

In [None]:
# Plot waveforms of all audio files
n_files = len(audio_data_dict)
# squeeze=False always returns a 2-D array of axes, so a single file needs no special case
fig, axes = plt.subplots(n_files, 1, figsize=(14, 4 * n_files), squeeze=False)
axes = axes[:, 0]
    
for i, (file, (audio_data, sample_rate)) in enumerate(audio_data_dict.items()):
    file_name = os.path.basename(file)
    librosa.display.waveshow(audio_data, sr=sample_rate, ax=axes[i])
    axes[i].set_title(f'Waveform: {file_name}')
    axes[i].set_xlabel('Time (s)')
    axes[i].set_ylabel('Amplitude')
    
plt.tight_layout()
plt.show()

## 5. Compare Spectrograms

Let's compare the spectrograms of all audio files.

In [None]:
# Plot spectrograms of all audio files
n_files = len(audio_data_dict)
# squeeze=False always returns a 2-D array of axes, so a single file needs no special case
fig, axes = plt.subplots(n_files, 1, figsize=(14, 5 * n_files), squeeze=False)
axes = axes[:, 0]
    
for i, (file, (audio_data, sample_rate)) in enumerate(audio_data_dict.items()):
    file_name = os.path.basename(file)
    D = librosa.amplitude_to_db(np.abs(librosa.stft(audio_data)), ref=np.max)
    img = librosa.display.specshow(D, sr=sample_rate, x_axis='time', y_axis='log', ax=axes[i])
    fig.colorbar(img, ax=axes[i], format='%+2.0f dB')
    axes[i].set_title(f'Spectrogram: {file_name}')
    
plt.tight_layout()
plt.show()

## 6. Batch Time Domain Analysis

Let's perform time domain analysis on all audio files.

In [None]:
# Perform batch time domain analysis
time_domain_results = batch_analyze_time_domain(audio_data_dict)

# Display silent regions for each file
for file, results in time_domain_results.items():
    file_name = os.path.basename(file)
    silent_regions = results['silent_regions']
    print(f"\nSilent regions in {file_name}:")
    for i, (start, end) in enumerate(silent_regions):
        print(f"Region {i+1}: {start:.2f}s - {end:.2f}s (duration: {end-start:.2f}s)")
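
One common way to find silent regions is to threshold the RMS energy of short frames. The sketch below illustrates that idea under stated assumptions: `find_silent_regions`, the frame length, and the threshold are hypothetical, and the package's own detection inside `batch_analyze_time_domain` may work differently.

```python
import numpy as np

def find_silent_regions(audio, sample_rate, frame_length=400, threshold=0.01):
    """Return (start_s, end_s) for runs of frames whose RMS is below `threshold`."""
    n_frames = len(audio) // frame_length
    # Split the signal into non-overlapping frames, dropping any trailing remainder
    frames = audio[: n_frames * frame_length].reshape(n_frames, frame_length)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    silent = rms < threshold

    regions = []
    start = None
    for i, is_silent in enumerate(silent):
        if is_silent and start is None:
            start = i  # a silent run begins at this frame
        elif not is_silent and start is not None:
            regions.append((start * frame_length / sample_rate,
                            i * frame_length / sample_rate))
            start = None
    if start is not None:  # the signal ended while still silent
        regions.append((start * frame_length / sample_rate,
                        n_frames * frame_length / sample_rate))
    return regions

# Synthetic test signal: 1 s of tone, 0.5 s of silence, 1 s of tone
sr = 16000
tone = 0.5 * np.sin(2 * np.pi * 220.0 * np.arange(sr) / sr)
audio = np.concatenate([tone, np.zeros(sr // 2), tone])

regions = find_silent_regions(audio, sr)
for start, end in regions:
    print(f"Silent: {start:.2f}s - {end:.2f}s")
```

The threshold is in the same units as the samples, so it should be tuned to the recording level. On normalized speech, a value relative to the file's overall RMS is often more robust than a fixed constant.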

## 7. Batch Frequency Domain Analysis

Let's perform frequency domain analysis on all audio files.

In [None]:
# Perform batch frequency domain analysis
frequency_domain_results = batch_analyze_frequency_domain(audio_data_dict)

# Create a DataFrame with spectral features
spectral_df = pd.DataFrame()
for file, results in frequency_domain_results.items():
    file_name = os.path.basename(file)
    spectral_features = results['spectral_features']
    spectral_df[file_name] = pd.Series(spectral_features)
    
# Display the spectral features
spectral_df.T
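
The spectral centroid is the magnitude-weighted mean frequency of a spectrum. The sketch below computes it over a whole signal with NumPy. The helper `spectral_centroid` is hypothetical and written only for illustration. A feature named `spectral_centroid_mean` suggests the package averages per-frame centroids, which would give slightly different values.

```python
import numpy as np

def spectral_centroid(audio, sample_rate):
    """Magnitude-weighted mean frequency of the whole signal, in Hz."""
    magnitudes = np.abs(np.fft.rfft(audio))
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / sample_rate)
    total = magnitudes.sum()
    if total == 0:
        return 0.0  # an all-zero signal has no defined centroid
    return float((freqs * magnitudes).sum() / total)

sr = 16000
t = np.arange(sr) / sr  # one second of sample times
centroid_low = spectral_centroid(np.sin(2 * np.pi * 500 * t), sr)
centroid_high = spectral_centroid(np.sin(2 * np.pi * 3000 * t), sr)

print(f"500 Hz tone centroid:  {centroid_low:.1f} Hz")
print(f"3000 Hz tone centroid: {centroid_high:.1f} Hz")
```

For a pure tone the centroid sits at the tone's frequency. For speech, a higher centroid generally indicates more high-frequency energy, for example from fricatives or broadband noise.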

## 8. Batch Pitch and Timbre Analysis

Let's perform pitch and timbre analysis on all audio files.

In [None]:
# Perform batch pitch and timbre analysis
pitch_timbre_results = batch_analyze_pitch_timbre(audio_data_dict)

# Create DataFrames with pitch and MFCC statistics
pitch_df = pd.DataFrame()
mfcc_df = pd.DataFrame()
for file, results in pitch_timbre_results.items():
    file_name = os.path.basename(file)
    pitch_stats = results['pitch_stats']
    mfcc_stats = results['mfcc_stats']
    pitch_df[file_name] = pd.Series(pitch_stats)
    mfcc_df[file_name] = pd.Series(mfcc_stats)
    
# Display the pitch statistics
print("Pitch Statistics:")
pitch_df.T

In [None]:
# Display the MFCC statistics
print("MFCC Statistics:")
mfcc_df.T
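
As background for the pitch statistics, the sketch below estimates a fundamental frequency from the autocorrelation peak, one classic approach. It is not taken from the package. `estimate_pitch_autocorr` and its search range are assumptions, and `batch_analyze_pitch_timbre` may use a different pitch tracker.

```python
import numpy as np

def estimate_pitch_autocorr(audio, sample_rate, fmin=50.0, fmax=500.0):
    """Estimate a single fundamental frequency (Hz) from the autocorrelation peak.

    Assumes `audio` is longer than sample_rate / fmin samples.
    """
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered
    lags = np.arange(min_lag, max_lag + 1)
    # Autocorrelation at each candidate lag: dot product of the signal with a shifted copy
    scores = np.array([np.dot(audio[:-lag], audio[lag:]) for lag in lags])
    best_lag = lags[np.argmax(scores)]
    return float(sample_rate / best_lag)

sr = 16000
t = np.arange(4000) / sr  # 0.25 s of sample times
tone = np.sin(2 * np.pi * 200.0 * t)

f0 = estimate_pitch_autocorr(tone, sr)
print(f"Estimated pitch: {f0:.1f} Hz")
```

Real speech is only pitched during voiced segments, so practical trackers run this kind of estimate per frame and discard unvoiced frames before computing statistics such as the mean pitch and pitch range.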

## 9. Batch Anomaly Detection

Let's perform anomaly detection on all audio files.

In [None]:
# Perform batch anomaly detection
anomaly_results = batch_analyze_anomalies(audio_data_dict)

# Display anomalies for each file
for file, results in anomaly_results.items():
    file_name = os.path.basename(file)
    amplitude_anomalies = results['amplitude_anomalies']
    spectral_anomalies = results['spectral_anomalies']
    
    print(f"\nAnomalies in {file_name}:")
    
    print("Amplitude Anomalies:")
    for i, (start, end) in enumerate(amplitude_anomalies):
        print(f"Anomaly {i+1}: {start:.2f}s - {end:.2f}s (duration: {end-start:.2f}s)")
    
    print("Spectral Anomalies:")
    for i, (start, end) in enumerate(spectral_anomalies):
        print(f"Anomaly {i+1}: {start:.2f}s - {end:.2f}s (duration: {end-start:.2f}s)")
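
To illustrate how amplitude anomalies could be expressed as time intervals, here is a sketch that flags samples with a large z-score and merges nearby ones. The function `amplitude_anomalies`, the threshold, and the merge gap are hypothetical; the method used by `batch_analyze_anomalies` is not shown in this notebook.

```python
import numpy as np

def amplitude_anomalies(audio, sample_rate, z_threshold=4.0, max_gap_s=0.01):
    """Return (start_s, end_s) intervals where the sample z-score exceeds the threshold."""
    std = audio.std()
    if std == 0:
        return []  # a constant signal has no outliers
    z = np.abs((audio - audio.mean()) / std)
    idx = np.flatnonzero(z > z_threshold)
    if idx.size == 0:
        return []
    max_gap = int(max_gap_s * sample_rate)
    # Start a new interval wherever consecutive flagged samples are more than max_gap apart
    breaks = np.flatnonzero(np.diff(idx) > max_gap)
    starts = np.concatenate(([idx[0]], idx[breaks + 1]))
    ends = np.concatenate((idx[breaks], [idx[-1]]))
    return [(float(s / sample_rate), float((e + 1) / sample_rate))
            for s, e in zip(starts, ends)]

# Quiet tone with a 1 ms full-scale click inserted at 0.5 s
sr = 16000
audio = 0.05 * np.sin(2 * np.pi * 220.0 * np.arange(sr) / sr)
audio[8000:8016] = 1.0

anomalies = amplitude_anomalies(audio, sr)
for start, end in anomalies:
    print(f"Anomaly: {start:.3f}s - {end:.3f}s")
```

Because the click itself inflates the standard deviation, very long or frequent anomalies can mask themselves. A robust scale estimate such as the median absolute deviation is a common alternative.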

## 10. Generate Summary Report

Let's generate a summary report of the batch analysis.

In [None]:
# Create a summary DataFrame, starting from the basic information
summary_df = basic_info_df[['Duration (s)', 'Sample Rate (Hz)', 'RMS']].copy()

# Add key statistics
for file_name in stats_df.columns:
    summary_df.loc[file_name, 'Mean Amplitude'] = stats_df[file_name]['mean']
    summary_df.loc[file_name, 'Std Dev'] = stats_df[file_name]['std']
    summary_df.loc[file_name, 'Dynamic Range'] = stats_df[file_name]['dynamic_range']
    summary_df.loc[file_name, 'Zero Crossings'] = stats_df[file_name]['zero_crossings']

# Add key spectral features
for file_name in spectral_df.columns:
    summary_df.loc[file_name, 'Spectral Centroid (Hz)'] = spectral_df[file_name]['spectral_centroid_mean']
    summary_df.loc[file_name, 'Spectral Bandwidth (Hz)'] = spectral_df[file_name]['spectral_bandwidth_mean']
    summary_df.loc[file_name, 'Spectral Flatness'] = spectral_df[file_name]['spectral_flatness_mean']

# Add key pitch statistics
for file_name in pitch_df.columns:
    summary_df.loc[file_name, 'Mean Pitch (Hz)'] = pitch_df[file_name]['mean_pitch']
    summary_df.loc[file_name, 'Pitch Range (Hz)'] = pitch_df[file_name]['pitch_range']

# Add anomaly counts
for file, results in anomaly_results.items():
    file_name = os.path.basename(file)
    summary_df.loc[file_name, 'Amplitude Anomalies'] = len(results['amplitude_anomalies'])
    summary_df.loc[file_name, 'Spectral Anomalies'] = len(results['spectral_anomalies'])

# Display the summary report
summary_df
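
A summary table like this can also be screened for outlier recordings. The sketch below uses a robust z-score based on the median absolute deviation. The file names and numbers are made up for illustration and do not come from the dataset, and the 3.5 cutoff is a common rule of thumb, not a package setting.

```python
import pandas as pd

# Made-up summary values standing in for summary_df
summary = pd.DataFrame(
    {
        "Duration (s)": [4.8, 5.1, 5.0, 4.9, 12.0],
        "RMS": [0.10, 0.12, 0.09, 0.11, 0.10],
    },
    index=["clip_1.wav", "clip_2.wav", "clip_3.wav", "clip_4.wav", "clip_5.wav"],
)

# Robust z-score per column: distance from the median, scaled by the MAD
median = summary.median()
mad = (summary - median).abs().median()
robust_z = 0.6745 * (summary - median) / mad

# Flag any file with at least one column beyond the cutoff
is_outlier = robust_z.abs().gt(3.5).any(axis=1)
flagged = list(summary.index[is_outlier])
print("Outlier files:", flagged)
```

The median and MAD are barely affected by the outlier itself, so a single unusually long recording stands out clearly. A column whose MAD is zero (many identical values) would need special handling before dividing.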

## 11. Visualize Summary Statistics

Let's visualize some key statistics across all audio files.

In [None]:
# Plot key statistics
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Duration
summary_df['Duration (s)'].plot(kind='bar', ax=axes[0, 0])
axes[0, 0].set_title('Duration (s)')
axes[0, 0].set_ylabel('Seconds')
axes[0, 0].tick_params(axis='x', rotation=45)

# RMS
summary_df['RMS'].plot(kind='bar', ax=axes[0, 1])
axes[0, 1].set_title('RMS Amplitude')
axes[0, 1].tick_params(axis='x', rotation=45)

# Spectral Centroid
summary_df['Spectral Centroid (Hz)'].plot(kind='bar', ax=axes[1, 0])
axes[1, 0].set_title('Spectral Centroid (Hz)')
axes[1, 0].set_ylabel('Frequency (Hz)')
axes[1, 0].tick_params(axis='x', rotation=45)

# Mean Pitch
summary_df['Mean Pitch (Hz)'].plot(kind='bar', ax=axes[1, 1])
axes[1, 1].set_title('Mean Pitch (Hz)')
axes[1, 1].set_ylabel('Frequency (Hz)')
axes[1, 1].tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.show()

## 12. Using the analyze_directory Function

Let's use the `analyze_directory` function to run the full analysis on every audio file in a directory and save the results to an output directory.

In [None]:
# Define output directory for results
output_dir = "../data/speech2text/eda_results"
os.makedirs(output_dir, exist_ok=True)

# Perform comprehensive analysis of all audio files in the directory
results = analyze_directory(
    audio_dir,
    output_dir=output_dir,
    normalize=True,
    remove_silence=False
)

print(f"Analysis completed. Results saved to {output_dir}")

## Conclusion

In this notebook, we ran a batch analysis of multiple audio files with the CTC-SpeechRefinement package: descriptive statistics, waveform and spectrogram comparisons, time- and frequency-domain analysis, pitch and timbre analysis, anomaly detection, and a summary report. This approach is useful for understanding the overall characteristics of a dataset, identifying outliers, and ensuring consistency across multiple recordings.