# Basic Audio Exploratory Data Analysis

This notebook demonstrates how to perform basic exploratory data analysis on audio files using the CTC-SpeechRefinement package.

## Setup

First, let's import the necessary libraries and set up the environment.

In [None]:
# Add the project root to the Python path
import sys
import os
sys.path.append(os.path.abspath('..'))

# Import libraries
import numpy as np
import matplotlib.pyplot as plt
import librosa
import librosa.display
import pandas as pd
import seaborn as sns
from IPython.display import Audio, display
import glob
from pathlib import Path

# Import from the project
from ctc_speech_refinement.core.preprocessing.audio import load_audio
from ctc_speech_refinement.core.eda.descriptive_stats import analyze_descriptive_stats
from ctc_speech_refinement.core.eda.time_domain import analyze_time_domain
from ctc_speech_refinement.core.eda.frequency_domain import analyze_frequency_domain
from ctc_speech_refinement.core.eda.pitch_timbre import analyze_pitch_timbre
from ctc_speech_refinement.core.eda.anomaly_detection import analyze_anomalies

# Set up plotting
plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['figure.dpi'] = 100

## Load Audio Data

Let's load an audio file and examine its basic properties using our package's load_audio function.

In [None]:
# Define the path to an audio file (relative to the current working directory)
audio_file = "../data/speech2text/input/test1_01.wav"

# Load the audio file using our package's function
audio_data, sample_rate = load_audio(audio_file)

# Print basic information
print(f"Audio file: {audio_file}")
print(f"Sample rate: {sample_rate} Hz")
print(f"Duration: {len(audio_data) / sample_rate:.2f} seconds")
print(f"Number of samples: {len(audio_data)}")

# Play the audio
display(Audio(audio_data, rate=sample_rate))
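
As a quick sanity check on a freshly loaded signal, the properties printed above can be gathered in one place. The following is a minimal sketch, assuming the signal is a NumPy array that is either mono (1-D) or channel-first (2-D). `describe_signal` is a hypothetical helper, not part of the package, and a synthetic tone stands in for the real file.

```python
import numpy as np

def describe_signal(y, sr):
    """Summarize basic properties of a signal (hypothetical helper)."""
    y = np.asarray(y)
    return {
        "n_samples": y.shape[-1],
        # Assumption: 2-D arrays are laid out as (channels, samples)
        "n_channels": 1 if y.ndim == 1 else y.shape[0],
        "duration_s": y.shape[-1] / sr,
        "dtype": str(y.dtype),
        "peak": float(np.max(np.abs(y))) if y.size else 0.0,
    }

# Synthetic stand-in for a loaded file: 1 second of a 440 Hz tone at 16 kHz
sr = 16000
t = np.arange(sr) / sr
y = (0.5 * np.sin(2 * np.pi * 440 * t)).astype(np.float32)

info = describe_signal(y, sr)
for key, value in info.items():
    print(f"{key}: {value}")
```

A peak close to 1.0 on a float signal is worth noting at this stage, since it may indicate clipping in the recording.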

## Descriptive Statistics Analysis

Let's use our package's analyze_descriptive_stats function to compute and visualize descriptive statistics of the audio data.

In [None]:
# Use our package's function to analyze descriptive statistics
descriptive_stats_results = analyze_descriptive_stats(audio_data, sample_rate, title_prefix="Sample Audio")

# Display the statistics
print("\nDescriptive Statistics:")
for stat, value in descriptive_stats_results['stats'].items():
    print(f"{stat}: {value}")

# Display the figures
for fig_name, fig in descriptive_stats_results['figures'].items():
    plt.figure(fig.number)
    plt.tight_layout()
    plt.show()
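
The exact keys in the stats dictionary returned by the package are not listed in this notebook. For orientation, here is a small sketch of statistics that are commonly computed for a waveform, written with NumPy only. The function name `basic_stats` and the choice of statistics are assumptions for illustration, not the package's implementation.

```python
import numpy as np

def basic_stats(y):
    """Compute a few common amplitude statistics (illustrative only)."""
    y = np.asarray(y, dtype=np.float64)
    rms = np.sqrt(np.mean(y ** 2))
    peak = np.max(np.abs(y))
    return {
        "mean": float(np.mean(y)),   # DC offset; should be near 0 for clean audio
        "std": float(np.std(y)),
        "rms": float(rms),           # overall signal level
        "peak": float(peak),
        # Crest factor: how "peaky" the signal is relative to its level
        "crest_factor": float(peak / rms) if rms > 0 else float("nan"),
    }

# Synthetic stand-in: 1 second of a 440 Hz tone with amplitude 0.5
sr = 16000
t = np.arange(sr) / sr
y = 0.5 * np.sin(2 * np.pi * 440 * t)

stats = basic_stats(y)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

For a pure sine wave the crest factor is the square root of 2; speech typically has a much higher crest factor because of its pauses and transients.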

## Time Domain Analysis

Let's use our package's analyze_time_domain function to analyze the audio in the time domain.

In [None]:
# Use our package's function to analyze time domain features
time_domain_results = analyze_time_domain(audio_data, sample_rate, title_prefix="Sample Audio")

# Display the figures
for fig_name, fig in time_domain_results['figures'].items():
    plt.figure(fig.number)
    plt.tight_layout()
    plt.show()
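
Time domain analysis of speech usually relies on short-time features computed over overlapping frames. The sketch below shows two of the most common ones, frame-wise RMS energy and zero-crossing rate, using NumPy only. The frame and hop sizes (25 ms and 10 ms at 16 kHz) and the function names are assumptions; the package may compute these differently.

```python
import numpy as np

def frame_signal(y, frame_length=400, hop_length=160):
    """Slice a 1-D signal into overlapping frames (no padding)."""
    if len(y) < frame_length:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(y) - frame_length) // hop_length
    # Row i holds the sample indices of frame i
    idx = np.arange(frame_length)[None, :] + hop_length * np.arange(n_frames)[:, None]
    return y[idx]

def short_time_features(y, frame_length=400, hop_length=160):
    """Return per-frame RMS energy and zero-crossing rate."""
    frames = frame_signal(np.asarray(y, dtype=np.float64), frame_length, hop_length)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    # Fraction of adjacent sample pairs whose sign differs
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    return rms, zcr

# Synthetic stand-in: half a second of silence, then a 1 kHz tone
sr = 16000
t = np.arange(sr) / sr
y = 0.5 * np.sin(2 * np.pi * 1000 * t)
y[: sr // 2] = 0.0

rms, zcr = short_time_features(y)
print(f"frames: {len(rms)}")
print(f"RMS first/last frame: {rms[0]:.3f} / {rms[-1]:.3f}")
print(f"ZCR first/last frame: {zcr[0]:.3f} / {zcr[-1]:.3f}")
```

Low RMS suggests silence or very quiet speech, while a high zero-crossing rate combined with low energy is typical of unvoiced sounds and noise.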

## Frequency Domain Analysis

Let's use our package's analyze_frequency_domain function to analyze the audio in the frequency domain.

In [None]:
# Use our package's function to analyze frequency domain features
frequency_domain_results = analyze_frequency_domain(audio_data, sample_rate, title_prefix="Sample Audio")

# Display the figures
for fig_name, fig in frequency_domain_results['figures'].items():
    plt.figure(fig.number)
    plt.tight_layout()
    plt.show()
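
To make the idea behind frequency domain analysis concrete, the following sketch computes a windowed magnitude spectrum with `np.fft.rfft` and derives the spectral centroid from it. This is an illustration under simple assumptions (one FFT over the whole signal, Hann window); the function names are hypothetical and the package's own analysis may use a short-time transform instead.

```python
import numpy as np

def magnitude_spectrum(y, sr):
    """Return (frequencies in Hz, magnitude spectrum) for a real signal."""
    y = np.asarray(y, dtype=np.float64)
    window = np.hanning(len(y))  # reduces spectral leakage
    spectrum = np.abs(np.fft.rfft(y * window))
    freqs = np.fft.rfftfreq(len(y), d=1.0 / sr)
    return freqs, spectrum

def spectral_centroid(freqs, spectrum):
    """Magnitude-weighted mean frequency, a rough measure of brightness."""
    total = spectrum.sum()
    return float((freqs * spectrum).sum() / total) if total > 0 else 0.0

# Synthetic stand-in: 1 second of a 440 Hz tone at 16 kHz
sr = 16000
t = np.arange(sr) / sr
y = 0.5 * np.sin(2 * np.pi * 440 * t)

freqs, spectrum = magnitude_spectrum(y, sr)
peak_freq = float(freqs[np.argmax(spectrum)])
centroid = spectral_centroid(freqs, spectrum)
print(f"Peak frequency: {peak_freq:.1f} Hz")
print(f"Spectral centroid: {centroid:.1f} Hz")
```

For a pure tone the peak frequency and the centroid nearly coincide; for speech the centroid moves with the balance between low-frequency voiced energy and high-frequency frication.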

## Pitch and Timbre Analysis

Let's use our package's analyze_pitch_timbre function to analyze the pitch and timbre characteristics of the audio.

In [None]:
# Use our package's function to analyze pitch and timbre features
pitch_timbre_results = analyze_pitch_timbre(audio_data, sample_rate, title_prefix="Sample Audio")

# Display the figures
for fig_name, fig in pitch_timbre_results['figures'].items():
    plt.figure(fig.number)
    plt.tight_layout()
    plt.show()
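
The notebook does not show which pitch tracker the package uses. As a rough illustration of how a fundamental frequency estimate can be obtained, here is a basic autocorrelation method applied to a single frame. The function name, the 50 to 500 Hz search range, and the frame size are assumptions, and production pitch trackers add voicing decisions and smoothing that this sketch omits.

```python
import numpy as np

def estimate_f0_autocorr(frame, sr, fmin=50.0, fmax=500.0):
    """Estimate F0 of one frame from its autocorrelation peak (illustrative)."""
    frame = np.asarray(frame, dtype=np.float64)
    frame = frame - frame.mean()
    # Full autocorrelation; keep non-negative lags only
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    # Search only lags that correspond to the allowed pitch range
    lag_min = int(sr / fmax)
    lag_max = min(int(sr / fmin), len(ac) - 1)
    if lag_max <= lag_min or ac[0] <= 0:
        return None  # frame too short or silent
    best_lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sr / best_lag

# Synthetic stand-in: one 2048-sample frame of a 200 Hz tone at 16 kHz
sr = 16000
n = np.arange(2048)
frame = 0.5 * np.sin(2 * np.pi * 200 * n / sr)

f0 = estimate_f0_autocorr(frame, sr)
print(f"Estimated F0: {f0:.1f} Hz")
```

The lag resolution limits accuracy: at 16 kHz, neighbouring lags around 200 Hz differ by about 2.5 Hz, which is why interpolation around the peak is often added in practice.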

## Anomaly Detection

Let's use our package's analyze_anomalies function to detect anomalies in the audio.

In [None]:
# Use our package's function to detect anomalies
anomaly_results = analyze_anomalies(audio_data, sample_rate, title_prefix="Sample Audio")

# Display the figures
for fig_name, fig in anomaly_results['figures'].items():
    plt.figure(fig.number)
    plt.tight_layout()
    plt.show()
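
The criteria the package applies for anomalies are not described in this notebook. Two checks that are frequently used on audio are sketched below under stated assumptions: runs of clipped samples, and outliers in a per-frame feature flagged with a robust z-score based on the median absolute deviation. The function names and thresholds are hypothetical.

```python
import numpy as np

def clipped_runs(y, threshold=0.99, min_run=3):
    """Return (start, end) sample indices (end exclusive) of clipped runs."""
    mask = np.abs(np.asarray(y)) >= threshold
    # Pad with False so every run has both a rising and a falling edge
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = edges[0::2], edges[1::2]
    return [(int(s), int(e)) for s, e in zip(starts, ends) if e - s >= min_run]

def robust_outliers(values, z_thresh=3.5):
    """Flag values whose MAD-based z-score exceeds z_thresh."""
    values = np.asarray(values, dtype=np.float64)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    if mad == 0:
        return np.zeros(len(values), dtype=bool)
    # 0.6745 scales the MAD to match the standard deviation of a normal distribution
    z = 0.6745 * (values - median) / mad
    return np.abs(z) > z_thresh

# Synthetic stand-in: a tone with ten artificially clipped samples
sr = 16000
t = np.arange(sr) / sr
y = 0.5 * np.sin(2 * np.pi * 440 * t)
y[1000:1010] = 1.0
runs = clipped_runs(y)
print("Clipped runs (samples):", runs)

# Example per-frame feature values with one obvious outlier
flags = robust_outliers([1.0, 1.1, 0.9, 1.05, 0.95, 10.0])
print("Outlier flags:", flags.tolist())
```

The median-based score is used instead of a mean-based one because a single large spike inflates the standard deviation and can hide itself.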

## Silence Detection

The time domain analysis above already returns the silent regions it detected, so we can reuse that result here and overlay the regions on the waveform.

In [None]:
# Reuse the silent regions already computed by the time domain analysis
silence_regions = time_domain_results['silent_regions']

# Print silent regions
print("Silent regions:")
for i, (start, end) in enumerate(silence_regions, start=1):
    print(f"Region {i}: {start:.2f}s - {end:.2f}s (duration: {end - start:.2f}s)")

# Plot waveform with silent regions highlighted
plt.figure(figsize=(14, 5))
librosa.display.waveshow(audio_data, sr=sample_rate, alpha=0.5)

# Highlight silent regions
for start, end in silence_regions:
    plt.axvspan(start, end, color='red', alpha=0.3)

plt.title('Waveform with Silent Regions Highlighted')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.tight_layout()
plt.show()
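
How the package derives these regions is not shown in this notebook. One common approach, sketched here as an assumption rather than as the package's method, is to threshold the frame-wise RMS level relative to the loudest frame and merge consecutive quiet frames into (start, end) intervals in seconds, the same shape as the regions iterated over above. The function name, frame sizes, and the -40 dB threshold are hypothetical.

```python
import numpy as np

def silent_intervals(y, sr, frame_length=400, hop_length=160, threshold_db=-40.0):
    """Return (start_s, end_s) intervals whose level is below threshold_db
    relative to the loudest frame (illustrative energy-based detector)."""
    y = np.asarray(y, dtype=np.float64)
    if len(y) < frame_length:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(y) - frame_length) // hop_length
    starts = hop_length * np.arange(n_frames)
    rms = np.array([np.sqrt(np.mean(y[s:s + frame_length] ** 2)) for s in starts])
    ref = rms.max()
    if ref == 0:
        return [(0.0, len(y) / sr)]  # the whole signal is digital silence
    level_db = 20 * np.log10(np.maximum(rms, 1e-10) / ref)
    silent = level_db < threshold_db
    # Pad with False so every silent run has a rising and a falling edge
    padded = np.concatenate(([False], silent, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return [
        (float(starts[a] / sr), float((starts[b - 1] + frame_length) / sr))
        for a, b in zip(edges[0::2], edges[1::2])
    ]

# Synthetic stand-in: a 1 kHz tone between 0.25 s and 0.75 s, silence elsewhere
sr = 16000
t = np.arange(sr) / sr
y = 0.5 * np.sin(2 * np.pi * 1000 * t)
y[:4000] = 0.0
y[12000:] = 0.0

intervals = silent_intervals(y, sr)
for start, end in intervals:
    print(f"{start:.3f}s - {end:.3f}s")
```

A relative threshold adapts to the recording level, but it treats a uniformly quiet file as all speech, so an absolute floor is sometimes combined with it.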

## Conclusion

In this notebook, we performed a basic exploratory data analysis of a single audio file using the CTC-SpeechRefinement package. We loaded and played the file, computed descriptive statistics, analyzed the signal in the time and frequency domains, examined its pitch and timbre characteristics, ran anomaly detection, and highlighted the silent regions reported by the time domain analysis.

This analysis is a starting point for understanding the characteristics of the audio data, and its results can inform preprocessing and feature extraction choices for speech recognition tasks.