# YAMNet Audio Classification Model

This repository contains code for a deep learning model that performs audio classification using the YAMNet architecture, an audio event classifier built on MobileNet-style depthwise-separable convolutions. YAMNet classifies audio into a fixed set of predefined classes, which makes it suitable for tasks such as identifying musical instruments, vocal sounds, and other audio events.

## Overview

The code in `yamnet.py` is organised as follows:

### Importing Libraries

The code begins by importing the libraries it needs: `csv`, `numpy`, `tensorflow`, and two modules from the YAMNet library (`yamnet.features` and `yamnet.params`).

### Utility Functions

`_batch_norm(name)`: Defines a batch normalisation layer with the given name.

`_conv(name, kernel, stride, filters)`: Defines a convolutional layer with the given name, kernel size, stride, and number of filters.

`_separable_conv(name, kernel, stride, filters)`: Defines a separable convolutional layer, which consists of a depthwise convolution followed by a pointwise convolution.
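To make the depthwise/pointwise split concrete, the following sketch compares weight counts for a standard convolution and a depthwise-separable one. It is plain arithmetic rather than code from `yamnet.py`: the function names and the example sizes (3 x 3 kernel, 64 input channels, 128 filters) are illustrative assumptions, and bias terms are ignored.

```python
def standard_conv_weights(kernel, in_channels, filters):
    """Weights in a standard k x k convolution (biases ignored)."""
    return kernel * kernel * in_channels * filters


def separable_conv_weights(kernel, in_channels, filters):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise one."""
    depthwise = kernel * kernel * in_channels  # one k x k filter per input channel
    pointwise = in_channels * filters          # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative sizes, not taken from yamnet.py.
standard = standard_conv_weights(3, 64, 128)
separable = separable_conv_weights(3, 64, 128)
print(standard, separable)  # 73728 8768
```

For these sizes the separable layer needs roughly an eighth of the weights, which is the usual motivation for this layer type in compact models.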

### YAMNet Layer Definitions

`_YAMNET_LAYER_DEFS`: A list of tuples, each describing one layer in the YAMNet architecture. These definitions are used to build the YAMNet model layer by layer.

`yamnet(features)`: Defines the core YAMNet model in Keras. It reshapes the input features, applies the layers listed in `_YAMNET_LAYER_DEFS` in order, and then performs global average pooling followed by classification.
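The table-driven construction can be illustrated without TensorFlow. The sketch below walks a shortened, hypothetical layer table and tracks the feature-map shape after each layer. The table contents, the `'same'` padding rule (output size = ceil(input / stride)), and the 96 x 64 input patch are all assumptions for illustration; the real values live in `_YAMNET_LAYER_DEFS` and the params module.

```python
import math

# Hypothetical, shortened layer table: (layer_kind, kernel, stride, filters).
LAYER_DEFS = [
    ("conv", (3, 3), 2, 32),
    ("separable_conv", (3, 3), 1, 64),
    ("separable_conv", (3, 3), 2, 128),
    ("separable_conv", (3, 3), 1, 128),
]


def trace_shapes(height, width, layer_defs):
    """Return the (height, width, channels) shape after each layer.

    Assumes 'same' padding, so each spatial size becomes ceil(size / stride).
    """
    shapes = []
    for _kind, _kernel, stride, filters in layer_defs:
        height = math.ceil(height / stride)
        width = math.ceil(width / stride)
        shapes.append((height, width, filters))
    return shapes


shapes = trace_shapes(96, 64, LAYER_DEFS)
for (kind, _kernel, stride, _filters), shape in zip(LAYER_DEFS, shapes):
    print(f"{kind:<15} stride={stride} -> {shape}")

# Global average pooling then collapses the two spatial axes, leaving one
# value per channel as input to the final classification layer.
```

Keeping the architecture in a table like this means a layer can be added or resized by editing one tuple rather than the model-building loop.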

`yamnet_frames_model(feature_params)`: Defines the YAMNet waveform-to-class-scores model. It accepts waveform data, computes the log mel spectrogram, splits the spectrogram into patches, and feeds these patches to the `yamnet` function for classification. The returned model has two outputs: the class scores for each time frame and the spectrogram feature matrix.
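A rough way to anticipate how many score rows a clip produces is to count the patches by hand. The sketch below does that under stated assumptions: a 16 kHz sample rate, a 25 ms STFT window with a 10 ms hop, and 0.96 s patches taken every 0.48 s. These are the defaults published with upstream YAMNet and may differ from this repository's params module. The helper names are hypothetical, and any padding the implementation applies is ignored.

```python
def num_frames(length, window, hop):
    """Number of full windows of size `window` taken every `hop` steps."""
    if length < window:
        return 0
    return 1 + (length - window) // hop


def num_patches(duration_s, sample_rate=16000, stft_window_s=0.025,
                stft_hop_s=0.010, patch_window_s=0.96, patch_hop_s=0.48):
    """Estimate how many patches (rows of class scores) a clip yields."""
    samples = round(duration_s * sample_rate)
    window = round(stft_window_s * sample_rate)  # 400 samples with the defaults
    hop = round(stft_hop_s * sample_rate)        # 160 samples with the defaults
    spectrogram_frames = num_frames(samples, window, hop)

    patch_frames = round(patch_window_s / stft_hop_s)  # 96 spectrogram frames
    patch_hop = round(patch_hop_s / stft_hop_s)        # 48 spectrogram frames
    return num_frames(spectrogram_frames, patch_frames, patch_hop)


for seconds in (0.5, 1.0, 3.0):
    print(seconds, "s ->", num_patches(seconds), "patches")
```

Under these assumptions a clip shorter than about one second yields no complete patch, so very short inputs may need padding before classification.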

`class_names(class_map_csv)`: Reads a class name definition file in CSV format and returns the class names. It skips the header row and extracts the display name from each remaining row.

## Usage

To use this code for audio classification, follow these steps:

1. Ensure you have the required libraries installed, namely numpy and tensorflow (csv is part of the Python standard library and needs no installation).

2. Import the necessary modules from this repository.

3. Prepare your audio data or waveform.

4. Use the `yamnet_frames_model(feature_params)` function to create the YAMNet model.

5. Pass your audio data to the model for classification, and it will produce class scores.
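Step 5 yields one row of class scores per time frame. A common way to turn these into a clip-level result is to average the rows and rank the classes. The sketch below does this with plain Python lists; `top_classes` and the toy data are illustrative and not part of this repository. In practice `frame_scores` would be the model's score output (converted with `.tolist()` if it is a NumPy array) and `names` would come from `class_names(class_map_csv)`.

```python
def top_classes(frame_scores, names, k=3):
    """Average per-frame scores and return the k best (name, score) pairs."""
    if not frame_scores:
        return []
    frame_count = len(frame_scores)
    # zip(*rows) regroups the scores by class, so each column is one class.
    mean_scores = [sum(column) / frame_count for column in zip(*frame_scores)]
    ranked = sorted(zip(names, mean_scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy data: 2 frames, 3 classes.
names = ["Speech", "Music", "Dog"]
frame_scores = [
    [0.75, 0.25, 0.0],
    [0.25, 0.25, 0.25],
]
print(top_classes(frame_scores, names, k=2))  # [('Speech', 0.5), ('Music', 0.25)]
```

Averaging is only one choice; taking the per-class maximum over frames is an alternative when short events matter more than sustained ones.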

## Class Names

Class names are defined in a CSV file. The `class_names(class_map_csv)` function reads this file and returns the display names, which give each class score a human-readable label. Make sure your class name definition file follows the expected format: a header row, which is skipped, followed by rows that each contain a display name.
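The exact column layout is not spelled out here. The sketch below assumes the three-column layout used by the class map published with upstream YAMNet (`index,mid,display_name`); if this repository's file differs, the unpacking needs to change accordingly. `read_display_names` is a hypothetical stand-in, not the repository's `class_names`, and the `mid` values in the sample are placeholders.

```python
import csv
import io

# Assumed layout, modelled on the upstream YAMNet class map.
# The mid values are placeholders, not real identifiers.
SAMPLE_CSV = """index,mid,display_name
0,mid_0,Speech
1,mid_1,"Child speech, kid speaking"
"""


def read_display_names(csv_file):
    """Return the display_name column of an open CSV file, skipping the header."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    return [display_name for (_index, _mid, display_name) in reader]


display_names = read_display_names(io.StringIO(SAMPLE_CSV))
print(display_names)  # ['Speech', 'Child speech, kid speaking']
```

With a file on disk, the same function can be called as `read_display_names(open(path, newline=""))` inside a `with` block. Using the `csv` module rather than splitting on commas matters because display names may themselves contain commas, as the second row shows.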

## Acknowledgments

This code is based on the YAMNet audio classification model developed by Google.

Created by Sizhe Wang.