# Audio Classification: CNN Baseline Model

**Course:** CSCI 6366 (Neural Networks and Deep Learning)  
**Project:** Audio Classification using CNN  
**Notebook:** Baseline CNN Model Implementation

## Overview

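This notebook implements a baseline Convolutional Neural Network (CNN) for audio classification. We will:
1. Load the audio files and convert them to mel-spectrograms
2. Define and train a simple CNN on those spectrograms
3. Evaluate the trained model on held-out test data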

The goal is to establish a baseline model that classifies each audio sample into one of three categories: dog, cat, or bird.


In [None]:
import numpy as np
import matplotlib.pyplot as plt
import librosa
import librosa.display
from pathlib import Path
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
import warnings
# Silence all warnings to keep notebook output readable (this also hides deprecation notices)
warnings.filterwarnings('ignore')


## Configuration and Data Paths

Set up paths and hyperparameters for the baseline CNN model.


In [None]:
# Data directory
DATA_DIR = Path("../data").resolve()

# Audio processing parameters (matching the exploration notebook)
SAMPLE_RATE = 16000
N_FFT = 1024
HOP_LENGTH = 512
N_MELS = 128

# Model parameters
BATCH_SIZE = 32
EPOCHS = 50
VALIDATION_SPLIT = 0.2
TEST_SPLIT = 0.2

# Set random seeds for reproducibility
np.random.seed(42)
tf.random.set_seed(42)
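
The audio parameters above fix the shape of every mel-spectrogram, which in turn fixes the CNN input shape. As a rough sketch of that arithmetic: with librosa's default centered framing, a clip of `n` samples yields `1 + n // HOP_LENGTH` frames, and each frame advances by `HOP_LENGTH / SAMPLE_RATE` = 32 ms. The helper name `expected_spectrogram_shape` and the clip durations used below are illustrative assumptions, not values taken from the dataset.

```python
# Constants copied from the configuration cell.
SAMPLE_RATE = 16000
HOP_LENGTH = 512
N_MELS = 128


def expected_spectrogram_shape(duration_s, sr=SAMPLE_RATE,
                               hop_length=HOP_LENGTH, n_mels=N_MELS):
    """Return (n_mels, n_frames) for a clip of the given duration.

    Assumes centered framing (librosa's default), where the number of
    frames is 1 + n_samples // hop_length.
    """
    n_samples = int(round(duration_s * sr))
    n_frames = 1 + n_samples // hop_length
    return n_mels, n_frames


# Hypothetical clip lengths, for illustration only.
print(expected_spectrogram_shape(1.0))  # (128, 32)
print(expected_spectrogram_shape(5.0))  # (128, 157)
```

Because the frame count depends on clip length, clips of different durations produce spectrograms of different widths.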


## Data Loading and Preprocessing

Load audio files and convert them to mel-spectrogram representations suitable for CNN input.


In [None]:
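def load_mel_spectrogram(
    audio_path: Path,
    sr: int = SAMPLE_RATE,
    n_fft: int = N_FFT,
    hop_length: int = HOP_LENGTH,
    n_mels: int = N_MELS,
) -> np.ndarray:
    """Load an audio file and compute its mel-spectrogram in dB.

    Returns an array of shape (n_mels, n_frames); n_frames grows with clip length.
    """
    # Resample to `sr` on load; librosa mixes down to mono by default.
    y, sr = librosa.load(audio_path, sr=sr)
    # power=2.0 gives a power (squared-magnitude) spectrogram.
    S = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels, power=2.0
    )
    # Convert to dB relative to the clip's own peak, so the maximum is 0 dB.
    S_db = librosa.power_to_db(S, ref=np.max)
    return S_db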
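
A CNN with a fixed input shape needs every spectrogram to have the same width, but the number of frames varies with clip length. One possible way to handle this, sketched below, is to truncate long spectrograms and right-pad short ones. The helpers `pad_or_truncate` and `stack_spectrograms` and the `target_frames` value are hypothetical names introduced for illustration. The padding value of -80 dB assumes the default `top_db=80.0` of `librosa.power_to_db`, which with `ref=np.max` makes -80 dB the quietest possible value.

```python
import numpy as np


def pad_or_truncate(S_db, target_frames, pad_value=-80.0):
    """Force a (n_mels, n_frames) spectrogram to exactly target_frames columns."""
    n_frames = S_db.shape[1]
    if n_frames >= target_frames:
        # Keep only the first target_frames columns.
        return S_db[:, :target_frames]
    # Right-pad the time axis with the assumed dB floor (silence).
    pad_width = target_frames - n_frames
    return np.pad(S_db, ((0, 0), (0, pad_width)),
                  mode="constant", constant_values=pad_value)


def stack_spectrograms(specs, target_frames):
    """Stack spectrograms into a (N, n_mels, target_frames, 1) float32 array."""
    X = np.stack([pad_or_truncate(s, target_frames) for s in specs])
    # Add a trailing channel axis, as expected by 2-D convolution layers.
    return X[..., np.newaxis].astype(np.float32)


# Usage with synthetic spectrograms of different widths.
specs = [np.zeros((128, 20)), np.zeros((128, 40))]
X = stack_spectrograms(specs, target_frames=32)
print(X.shape)  # (2, 128, 32, 1)
```

Truncation discards audio past the target length, so `target_frames` should be chosen after inspecting the clip durations in the dataset.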


## Baseline CNN Architecture

Define a simple CNN architecture as the baseline model for audio classification.
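
As a starting point, a small CNN of this kind might look like the sketch below: two convolution and pooling stages, global average pooling, and a softmax layer over the three classes. This is an illustrative guess at a reasonable baseline, not the project's final architecture. The function name `build_baseline_cnn`, the filter counts, and the input shape `(128, 32, 1)` (128 mel bands by 32 frames, one channel) are all assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_baseline_cnn(input_shape=(128, 32, 1), num_classes=3):
    """Build and compile a small CNN for spectrogram classification.

    input_shape is (n_mels, n_frames, channels); the default values are
    assumptions chosen for illustration.
    """
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Conv2D(16, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(2),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(2),
        # Average each feature map to a single value to keep the head small.
        layers.GlobalAveragePooling2D(),
        layers.Dense(num_classes, activation="softmax"),
    ])
    # Sparse loss: labels are integer class indices, not one-hot vectors.
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_baseline_cnn()
model.summary()
```

Global average pooling keeps the parameter count low, which suits a baseline that later models are meant to improve on.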


In [None]:
# Model definition will go here


## Training

Train the baseline CNN model on the audio classification dataset.
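
One way the training step could be organized is sketched below: a stratified train/validation/test split followed by `model.fit` with early stopping. The sketch assumes that `VALIDATION_SPLIT` and `TEST_SPLIT` are both fractions of the full dataset (a 60/20/20 split); the configuration cell does not say which interpretation is intended. The helper names `split_dataset` and `train_baseline` and the patience of 5 epochs are also assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow import keras

# Values copied from the configuration cell.
VALIDATION_SPLIT = 0.2
TEST_SPLIT = 0.2


def split_dataset(X, y, test_split=TEST_SPLIT, val_split=VALIDATION_SPLIT, seed=42):
    """Stratified split into (train, val, test) pairs of (X, y)."""
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=test_split, stratify=y, random_state=seed
    )
    # Rescale so the validation set is val_split of the FULL dataset.
    val_fraction = val_split / (1.0 - test_split)
    X_train, X_val, y_train, y_val = train_test_split(
        X_rest, y_rest, test_size=val_fraction, stratify=y_rest, random_state=seed
    )
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)


def train_baseline(model, train, val, epochs=50, batch_size=32):
    """Fit a compiled Keras model, stopping when validation loss stalls."""
    early_stop = keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=5, restore_best_weights=True
    )
    return model.fit(
        train[0], train[1],
        validation_data=val,
        epochs=epochs,
        batch_size=batch_size,
        callbacks=[early_stop],
    )
```

Stratifying on the labels keeps the dog, cat, and bird proportions similar across the three subsets, which matters if the classes are imbalanced.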


In [None]:
# Training code will go here


## Evaluation

Evaluate the trained model's performance on test data.
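
For a three-class problem, overall accuracy alone can hide which classes get confused with each other, so a confusion matrix is a useful companion. The sketch below computes both from predicted class probabilities, such as the output of `model.predict`. The helper name `summarize_predictions` is hypothetical, and the class order assumes the labels were encoded with `LabelEncoder`, which sorts class names alphabetically.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Assumed label order: LabelEncoder sorts class names alphabetically.
CLASS_NAMES = ["bird", "cat", "dog"]


def summarize_predictions(y_true, y_prob):
    """Return (accuracy, confusion_matrix) from class probabilities.

    y_true: integer labels, shape (N,)
    y_prob: predicted probabilities, shape (N, num_classes)
    """
    y_pred = np.argmax(y_prob, axis=1)
    acc = accuracy_score(y_true, y_pred)
    # Rows are true classes, columns are predicted classes.
    cm = confusion_matrix(y_true, y_pred, labels=np.arange(len(CLASS_NAMES)))
    return acc, cm


# Usage with synthetic predictions (not real results).
y_true = np.array([0, 1, 2, 2])
y_prob = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.6, 0.3],  # a dog clip misclassified as cat
])
acc, cm = summarize_predictions(y_true, y_prob)
print(acc)  # 0.75
print(cm)
```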


In [None]:
# Evaluation code will go here
