# Anomaly Detection in Audio

This notebook demonstrates how to detect anomalies in audio files using the CTC-SpeechRefinement package. We'll explore various techniques for identifying unusual patterns, outliers, and artifacts in audio signals.

## Setup

First, let's import the necessary libraries and set up the environment.

In [None]:
# Add the project root to the Python path
import sys
import os
sys.path.append(os.path.abspath('..'))

# Import libraries
import numpy as np
import matplotlib.pyplot as plt
import librosa
import librosa.display
import pandas as pd
import seaborn as sns
from IPython.display import Audio, display
import glob
from pathlib import Path
from scipy import stats
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

# Import from the project
from ctc_speech_refinement.core.preprocessing.audio import load_audio
from ctc_speech_refinement.core.eda.anomaly_detection import analyze_anomalies

# Set up plotting
plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['figure.dpi'] = 100

## Load Audio Data

Let's load an audio file and examine its basic properties.

In [None]:
# Define the path to an audio file
audio_file = "../data/speech2text/input/test1_01.wav"  # Path to the audio file

# Load the audio file using our package's function
audio_data, sample_rate = load_audio(audio_file)

# Print basic information
print(f"Audio file: {audio_file}")
print(f"Sample rate: {sample_rate} Hz")
print(f"Duration: {len(audio_data) / sample_rate:.2f} seconds")
print(f"Number of samples: {len(audio_data)}")

# Play the audio
display(Audio(audio_data, rate=sample_rate))
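
If the sample file is not available, a synthetic signal with a known artifact can stand in for experimentation. The sketch below is an assumption for illustration only: `make_test_signal` is a hypothetical helper, not part of CTC-SpeechRefinement, and the tone frequency, noise level, and click length are arbitrary choices.

```python
import numpy as np

def make_test_signal(sample_rate=16000, duration=2.0, click_time=1.0, seed=0):
    """Return a quiet 220 Hz tone with low-level noise and one loud click.

    Hypothetical helper for illustration; not part of CTC-SpeechRefinement.
    """
    rng = np.random.default_rng(seed)
    n_samples = int(sample_rate * duration)
    t = np.arange(n_samples) / sample_rate
    signal = 0.1 * np.sin(2 * np.pi * 220.0 * t)      # steady tone
    signal += 0.01 * rng.standard_normal(n_samples)   # background noise
    click_start = int(click_time * sample_rate)
    signal[click_start:click_start + 32] = 0.9        # 32-sample click (2 ms at 16 kHz)
    return signal.astype(np.float32), sample_rate

# Separate names so the real audio_data / sample_rate are not overwritten.
synthetic_audio, synthetic_sr = make_test_signal()
print(f"Duration: {len(synthetic_audio) / synthetic_sr:.2f} seconds")
print(f"Number of samples: {len(synthetic_audio)}")
```

Because the click position is known, a signal like this makes it easy to check whether a detector flags the right region.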

## Comprehensive Anomaly Detection

Let's use our package's `analyze_anomalies` function to run a comprehensive anomaly analysis on the audio.

In [None]:
# Use our package's function to detect anomalies
anomaly_results = analyze_anomalies(audio_data, sample_rate, title_prefix="Sample Audio")

# Display the amplitude anomalies
print("\nAmplitude Anomalies:")
for i, (start, end) in enumerate(anomaly_results['amplitude_anomalies']):
    print(f"Anomaly {i+1}: {start:.2f}s - {end:.2f}s (duration: {end-start:.2f}s)")

# Display the spectral anomalies
print("\nSpectral Anomalies:")
for i, frame_idx in enumerate(anomaly_results['spectral_anomalies']):
    # Convert frame index to time (assumes analyze_anomalies used hop_length=512)
    time = librosa.frames_to_time(frame_idx, sr=sample_rate, hop_length=512)
    print(f"Anomaly {i+1}: at time {time:.2f}s")

# Display the figures
for fig_name, fig in anomaly_results['figures'].items():
    plt.figure(fig.number)
    plt.tight_layout()
    plt.show()
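
The amplitude anomalies printed above can also be tabulated, which makes it easier to sort them or total their duration. This is a minimal sketch that assumes the entries are `(start, end)` pairs in seconds, as the loop above implies. `amplitude_anomalies_to_frame` is a hypothetical helper and the example values are stand-ins, not real results.

```python
import pandas as pd

def amplitude_anomalies_to_frame(regions):
    """Tabulate (start, end) pairs in seconds as a DataFrame with durations."""
    df = pd.DataFrame(list(regions), columns=["start_s", "end_s"])
    df["duration_s"] = df["end_s"] - df["start_s"]
    return df

# Stand-in values with the same shape the loop above iterates over.
example_regions = [(0.50, 0.52), (1.20, 1.45)]
summary = amplitude_anomalies_to_frame(example_regions)
print(summary)
print(f"Total anomalous time: {summary['duration_s'].sum():.2f}s")
```

With real results, the same helper could be called on `anomaly_results['amplitude_anomalies']`.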

## Detailed Anomaly Detection

Now let's reproduce the two main checks step by step: amplitude anomalies with Z-scores and spectral anomalies with Isolation Forest.

### 1. Amplitude Anomalies using Z-Score

The Z-score measures how many standard deviations a data point lies from the mean. We can flag samples whose absolute amplitude has a Z-score above a chosen threshold as amplitude anomalies.

In [None]:
# Compute Z-scores for amplitude
z_scores = stats.zscore(np.abs(audio_data))

# Define threshold for anomalies
threshold = 3.0  # 3 standard deviations from the mean

# Find anomalies
anomalies = np.where(z_scores > threshold)[0]

# Group consecutive anomalies
anomaly_regions = []
if len(anomalies) > 0:
    start = anomalies[0]
    for i in range(1, len(anomalies)):
        if anomalies[i] - anomalies[i-1] > 1:  # Not consecutive
            anomaly_regions.append((start, anomalies[i-1]))
            start = anomalies[i]
    anomaly_regions.append((start, anomalies[-1]))

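# Convert sample indices to time (the end index is inclusive, so add 1 so that
# single-sample anomalies span one sample instead of having zero width)
anomaly_regions_time = [(start / sample_rate, (end + 1) / sample_rate) for start, end in anomaly_regions]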

# Print anomaly regions
print("Amplitude Anomalies (Z-score method):")
for i, (start, end) in enumerate(anomaly_regions_time):
    print(f"Anomaly {i+1}: {start:.2f}s - {end:.2f}s (duration: {end-start:.2f}s)")

# Plot waveform with anomalies highlighted
plt.figure(figsize=(14, 7))
librosa.display.waveshow(audio_data, sr=sample_rate, alpha=0.5)

# Highlight anomaly regions
for start, end in anomaly_regions_time:
    plt.axvspan(start, end, color='red', alpha=0.3)

plt.title('Waveform with Amplitude Anomalies Highlighted (Z-score method)')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.tight_layout()
plt.show()
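
To sanity-check the Z-score method, it helps to run it on a signal where the answer is known. The sketch below uses synthetic noise with one injected click (all values are assumptions chosen for illustration) and also spells out the formula that `stats.zscore` applies with its default settings.

```python
import numpy as np
from scipy import stats

sr = 16000
rng = np.random.default_rng(0)
signal = 0.01 * rng.standard_normal(sr)   # 1 s of low-level noise
signal[8000:8032] = 0.9                    # injected click at 0.5 s

# Z-score of the absolute amplitude: (|x| - mean(|x|)) / std(|x|)
abs_amp = np.abs(signal)
z = stats.zscore(abs_amp)
manual_z = (abs_amp - abs_amp.mean()) / abs_amp.std()
print(f"Matches manual formula: {np.allclose(z, manual_z)}")

# Flag samples more than 3 standard deviations above the mean.
flagged = np.where(z > 3.0)[0]
click_hits = flagged[(flagged >= 8000) & (flagged < 8032)]
print(f"Flagged samples: {len(flagged)}, of which inside the click: {len(click_hits)}")
```

Note that a loud artifact also inflates the mean and standard deviation it is measured against, so a threshold of 3 is a heuristic rather than a calibrated probability, especially since absolute amplitudes are not normally distributed.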

### 2. Spectral Anomalies using Isolation Forest

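Isolation Forest is an unsupervised learning algorithm that isolates observations using random splits on randomly chosen features; anomalies tend to be isolated in fewer splits than normal points. We can apply it to per-frame spectral features to detect spectral anomalies.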

In [None]:
# Compute spectral features
hop_length = 512
frame_length = 2048

# Compute various spectral features
spectral_centroid = librosa.feature.spectral_centroid(y=audio_data, sr=sample_rate, n_fft=frame_length, hop_length=hop_length)[0]
spectral_bandwidth = librosa.feature.spectral_bandwidth(y=audio_data, sr=sample_rate, n_fft=frame_length, hop_length=hop_length)[0]
spectral_rolloff = librosa.feature.spectral_rolloff(y=audio_data, sr=sample_rate, n_fft=frame_length, hop_length=hop_length)[0]
spectral_flatness = librosa.feature.spectral_flatness(y=audio_data, n_fft=frame_length, hop_length=hop_length)[0]

# Create a feature matrix
features = np.vstack([spectral_centroid, spectral_bandwidth, spectral_rolloff, spectral_flatness]).T

# Standardize features
scaler = StandardScaler()
features_scaled = scaler.fit_transform(features)

# Apply Isolation Forest
contamination = 0.05  # Expected proportion of anomalies
iso_forest = IsolationForest(contamination=contamination, random_state=42)
anomaly_scores = iso_forest.fit_predict(features_scaled)

# fit_predict returns labels (1: normal, -1: anomaly); keep the indices of the anomalous frames
anomalies = np.where(anomaly_scores == -1)[0]

# Group consecutive anomalies
anomaly_regions = []
if len(anomalies) > 0:
    start = anomalies[0]
    for i in range(1, len(anomalies)):
        if anomalies[i] - anomalies[i-1] > 1:  # Not consecutive
            anomaly_regions.append((start, anomalies[i-1]))
            start = anomalies[i]
    anomaly_regions.append((start, anomalies[-1]))

# Convert frame indices to time
anomaly_regions_time = [(start * hop_length / sample_rate, (end + 1) * hop_length / sample_rate) for start, end in anomaly_regions]

# Print anomaly regions
print("Spectral Anomalies (Isolation Forest method):")
for i, (start, end) in enumerate(anomaly_regions_time):
    print(f"Anomaly {i+1}: {start:.2f}s - {end:.2f}s (duration: {end-start:.2f}s)")

# Plot spectrogram with anomalies highlighted
plt.figure(figsize=(14, 7))

# Compute spectrogram
D = librosa.amplitude_to_db(np.abs(librosa.stft(audio_data, n_fft=frame_length, hop_length=hop_length)), ref=np.max)

# Plot spectrogram
librosa.display.specshow(D, sr=sample_rate, x_axis='time', y_axis='log', hop_length=hop_length)
plt.colorbar(format='%+2.0f dB')

# Highlight anomaly regions
for start, end in anomaly_regions_time:
    plt.axvspan(start, end, color='red', alpha=0.3)

plt.title('Spectrogram with Spectral Anomalies Highlighted (Isolation Forest method)')
plt.tight_layout()
plt.show()
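
The loop that groups consecutive anomalies appears in both the Z-score and the Isolation Forest cells. One way to factor it out is sketched below with `np.diff` and `np.split`. `group_consecutive` and `frame_runs_to_time` are hypothetical helper names, not functions from the package.

```python
import numpy as np

def group_consecutive(indices):
    """Group sorted integer indices into inclusive (start, end) runs."""
    indices = np.asarray(indices)
    if indices.size == 0:
        return []
    # Split wherever the gap to the previous index is larger than 1.
    breaks = np.where(np.diff(indices) > 1)[0] + 1
    runs = np.split(indices, breaks)
    return [(int(run[0]), int(run[-1])) for run in runs]

def frame_runs_to_time(runs, hop_length, sample_rate):
    """Convert inclusive frame runs to (start, end) times in seconds."""
    return [(start * hop_length / sample_rate, (end + 1) * hop_length / sample_rate)
            for start, end in runs]

runs = group_consecutive([3, 4, 5, 9, 12, 13])
print(runs)
print(frame_runs_to_time(runs, hop_length=512, sample_rate=16000))
```

The `end + 1` in the time conversion mirrors the cell above: the end index is inclusive, so the region extends to the start of the following frame.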

### 3. Visualizing Anomaly Scores

Let's visualize the anomaly scores from the Isolation Forest algorithm.

In [None]:
# Negate the decision function so that higher values indicate more anomalous frames;
# frames with a score above 0 are the ones labeled as anomalies
anomaly_scores_raw = -iso_forest.decision_function(features_scaled)
frame_times = librosa.frames_to_time(np.arange(len(anomaly_scores_raw)), sr=sample_rate, hop_length=hop_length)

# Plot anomaly scores
plt.figure(figsize=(14, 7))
plt.plot(frame_times, anomaly_scores_raw)
plt.axhline(y=0, color='r', linestyle='--', label='Threshold')
plt.title('Spectral Anomaly Scores (Isolation Forest)')
plt.xlabel('Time (s)')
plt.ylabel('Anomaly Score')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
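
The dashed line at 0 is meaningful because the labels and the scores come from the same quantity: scikit-learn labels a row `-1` when its `decision_function` value is negative. The sketch below checks this on synthetic data; the random matrix is only a stand-in for the standardized spectral features, and the shifted rows are an assumption made for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 4))   # stand-in for standardized spectral features
X[:10] += 6.0                        # ten rows shifted far from the rest

forest = IsolationForest(contamination=0.05, random_state=42)
labels = forest.fit_predict(X)
scores = -forest.decision_function(X)   # negated: higher means more anomalous

# Rows labeled -1 are exactly those whose negated score is above zero.
agree = np.array_equal(labels == -1, scores > 0)
print(f"Labels match score threshold: {agree}")
print(f"Flagged {np.sum(labels == -1)} of {len(X)} rows")
print(f"Mean score, shifted rows: {scores[:10].mean():.3f}")
print(f"Mean score, other rows:   {scores[10:].mean():.3f}")
```

Since `contamination` fixes the share of rows that fall above the threshold, the number of flagged frames mostly reflects that setting; the score curve itself is more informative about how unusual each frame is.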

### 4. Extracting and Playing Anomalous Segments

Let's extract and play the anomalous segments to hear what they sound like.

In [None]:
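# Extract and play the anomalous segments.
# Note: anomaly_regions_time was reassigned in section 2, so at this point it holds
# the spectral (Isolation Forest) regions, not the Z-score amplitude regions.
print("Spectral Anomalies:")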
for i, (start, end) in enumerate(anomaly_regions_time):
    start_sample = int(start * sample_rate)
    end_sample = int(end * sample_rate)
    anomaly_segment = audio_data[start_sample:end_sample]
    
    print(f"Anomaly {i+1} ({start:.2f}s - {end:.2f}s):")
    display(Audio(anomaly_segment, rate=sample_rate))
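
Very short anomalies, such as a click lasting a few milliseconds, can be hard to judge by ear in isolation. A possible refinement is to add some context on both sides before playback. `extract_with_padding` below is a hypothetical helper, and the 0.25 s padding is an arbitrary default.

```python
import numpy as np

def extract_with_padding(audio, sample_rate, start, end, pad=0.25):
    """Return the audio between start - pad and end + pad (seconds), clipped to the signal."""
    start_sample = max(0, int(round((start - pad) * sample_rate)))
    end_sample = min(len(audio), int(round((end + pad) * sample_rate)))
    return audio[start_sample:end_sample]

sr = 16000
audio = np.zeros(2 * sr, dtype=np.float32)   # 2 s stand-in signal
segment = extract_with_padding(audio, sr, start=1.00, end=1.01)
print(f"Segment length: {len(segment) / sr:.2f}s")
```

In the notebook, the returned array can be passed to `Audio(..., rate=sample_rate)` in place of `anomaly_segment`.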

## Conclusion

In this notebook, we've performed comprehensive anomaly detection on an audio file using the CTC-SpeechRefinement package. We've explored various techniques for detecting amplitude and spectral anomalies, including Z-score analysis and Isolation Forest. These techniques can be useful for identifying unusual patterns, outliers, and artifacts in audio signals, which is important for quality control, preprocessing, and feature extraction in speech recognition tasks.