# **Classification of Birds Based on Audio of Their Calls**

# § I. Introduction

Bird species classification from audio recordings has significant implications for ecological monitoring, biodiversity studies, and conservation efforts, and it is also of interest to bird watchers and nature lovers. This project focuses on developing a robust machine learning pipeline capable of identifying bird species from a recording and the basic metadata that a cell phone can capture with it. By combining audio signal processing techniques with classification algorithms, the project aims to create a reliable and scalable solution for bird call recognition.

## A. Overview of project goals

The primary objectives of this project are:

 1. Data Collection and Integration: Analyze, preprocess, and merge audio datasets containing bird calls to create a comprehensive, high-quality dataset suitable for training machine learning models.
 2. Feature Engineering: Extract meaningful features from raw audio signals, including frequency-domain and time-frequency representations.
 3. Model Training and Evaluation: Train and compare multiple classification models, including neural networks and traditional classification algorithms, to identify the most effective approach for bird call classification.
 4. Practical Application: Build a functional tool capable of processing new audio files to classify bird species throughout an audio file.
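The two feature types named in the feature engineering objective can be illustrated with a minimal sketch. It assumes NumPy only and a synthetic 3 kHz test tone in place of a real bird call. The function names `frequency_spectrum` and `spectrogram`, the frame length, and the hop size are hypothetical choices made for illustration, not the project's actual implementation.

```python
import numpy as np


def frequency_spectrum(signal):
    """Frequency-domain representation: magnitude spectrum of the whole signal."""
    return np.abs(np.fft.rfft(signal))


def spectrogram(signal, frame_len=1024, hop=512):
    """Time-frequency representation: magnitude of a Hann-windowed short-time FFT.

    frame_len and hop are illustrative values, not tuned for bird calls.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # One row per time frame, one column per frequency bin.
    return np.abs(np.fft.rfft(frames, axis=1))


# Synthetic stand-in for a recording: one second of a 3 kHz sine tone.
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 3000 * t)

spectrum = frequency_spectrum(tone)
freqs = np.fft.rfftfreq(len(tone), d=1 / sample_rate)
peak_hz = freqs[np.argmax(spectrum)]

spec = spectrogram(tone)
print(peak_hz, spec.shape)
```

The spectrum summarizes which frequencies are present across the whole clip, while the spectrogram keeps track of when they occur, which matters for calls whose pitch changes over time.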

# § II. Data Preparation and Exploration

## A. Initial Setup

### 1. Environment Setup

#### a) Building the Virtual Environment

### 2. Importing Libraries

#### a) Overview of Libraries and their purposes

## B. Dataset Exploration and Enhancement

### 1. Initial Dataset Assessment

#### a) Downloading and Viewing Primary Dataset

### 2. Acquisition and Integration of Additional Data

#### a) Acquiring Secondary Dataset and Verifying Dataset Compatibility

#### b) Merging the Datasets

## C. Data Cleaning and Preprocessing

### 1. Data Cleaning

#### a) Removing Unnecessary Columns

#### b) Removing Outliers

#### c) Removing Problematic Sample Rates

#### d) Audio Metadata Verification

### 2. Data Preprocessing

#### a) Normalizing Date and Time values

#### b) Normalizing Latitude, Longitude and Elevation values

### 3. Final Columns

#### a) Explanation for Feature Columns

#### b) Explanation for Target Columns

#### c) Explanation for Remaining Columns

# § III. Establishing Utility Functions

## A. Simple Aliases

## B. Data Conversions

## C. Process Monitoring

# § IV. Audio Processing Techniques

## A. Audio Metering and Signal Analysis

### 1. Understanding Different Scales

#### a) Decibel (dB)

#### b) Frequency Scaling

##### (i) Mel Scale

##### (ii) Bark Scale

##### (iii) Equivalent Rectangular Bandwidth Scale (ERB)

##### (iv) Visualizing Different Scales

### 2. Plotting Waveforms

#### a) RMS Calculations

### 3. Understanding Different Signal Transformations

#### a) The Fourier Transformation

##### (i) Understanding The Complex Result

##### (ii) Advantage of GPU Parallelization when Performing Fourier Transformations

##### (iii) Grouping Frequencies Together

##### (iv) Example Frequency Spectrums

#### b) The Short-Time Fourier Transformation (STFT)

##### (i) Advantage of GPU Parallelization when Performing Short-Time Fourier Transformations

##### (ii) The Inverse Short-Time Fourier Transformation

##### (iii) The Constant Overlap-Add (COLA) Constraint

##### (iv) The Nonzero Overlap-Add (NOLA) Constraint

##### (v) Grouping Frequencies Together

##### (vi) Example Spectrograms

#### c) The Hilbert Transformation

##### (i) Understanding the Analytic Signal

##### (ii) Advantage of GPU Parallelization when Performing Hilbert Transformations

##### (iii) Extracting the Real Signal

##### (iv) Extracting the Hilbert Envelope

##### (v) Extracting the Instantaneous Phase Angle

##### (vi) Example Plots of Hilbert Envelopes

##### (vii) Example Plots of Signal Phase Angles

## B. Generation of Simple Frequency Representation Dataset

### 1. Performing the Bulk Calculation

### 2. Visualizations of Resulting Data

#### a) Individual Frequency Spectrums

#### b) Heatmaps of Frequency Spectrums

## C. Audio Signal Processing Techniques

### 1. Reduction of the Noise Floor

#### a) Issues Caused by Noise Floor

#### b) Method of Noise Reduction

#### c) Noise Reduction Examples

### 2. Reduction of Momentary Clicks

#### a) Issues Caused by Clicks

#### b) Means of Detecting Clicks

##### (i) Peak Detection

##### (ii) Calculating the Click-Sensitive Signal

#### c) Reducing Magnitude around Clicks

### 3. Reduction of Transient Response

#### a) Introduction to Signal Transients

#### b) Issues Caused by Transients

#### c) Calculating the Transient-Sensitive Signal

#### d) Reducing Magnitude around Transients

### 4. Segmentation of Audio into Normalized Windows

## D. Generation of Filtered Frequency Representation Dataset

## E. Generation of Filtered Time-Frequency Representation Dataset

# § V. Model Training and Evaluation

## A. Simple Frequency Representation Dataset

## B. Filtered Frequency Representation Dataset

# § VI. Advanced Model Training and Species Detection

## A. Training a Convolutional Neural Network (CNN) Against the Filtered Time-Frequency Representation Dataset

## B. Model Training with Species as a Target

## C. Training a Convolutional Neural Network (CNN) with Species as a Target

# § VII. Practical Application and Use of Given Test Data

## A. Recreating the Audio Processing Pipeline

## B. Making the Processing Function

## C. Testing Against the Given Test Data