Cry Baby

I recently had my first kid. To celebrate, I built Cry Baby, a small piece of software that provides a probability that your baby is crying by continuously recording audio, chunking it into 4-second clips, and feeding them into a Convolutional Neural Network (CNN).

Installation

Create a virtual environment and install the dependencies with Poetry using the command poetry install.

Depending on your hardware architecture, Poetry should automatically install the correct version of TensorFlow or TensorFlow Lite. This has been tested on a 2022 M1 MacBook Pro and an Intel NUC running Ubuntu 20.04.

Usage

You will need a Hugging Face account and an API token. Once you have the token, copy example.env to .env and add your token there.
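How Cry Baby consumes the token is internal to the application, but a minimal sketch of the usual pattern, assuming the token is loaded from .env with python-dotenv and used to fetch model weights from the Hugging Face Hub (the environment variable name, repository id, and filename below are hypothetical), looks like this:

```python
# Minimal sketch: load a Hugging Face token from .env and fetch model weights.
# The env var, repo id, and filename are hypothetical placeholders, not Cry Baby's actual ones.
import os

from dotenv import load_dotenv
from huggingface_hub import hf_hub_download

load_dotenv()  # reads variables from .env into the process environment

token = os.environ["HUGGING_FACE_TOKEN"]  # assumed variable name
model_path = hf_hub_download(
    repo_id="your-username/cry-baby-model",  # hypothetical repository id
    filename="model.tflite",                 # hypothetical weights file
    token=token,
)
print(f"Model weights downloaded to {model_path}")
```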

A Makefile is provided for running Cry Baby.

make run

Every 4 seconds, Cry Baby prints the probability that a baby is crying in the audio clip it just recorded, and saves the timestamp, a pointer to the audio file, and the probability to a CSV file.
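The actual loop lives in the application code, but the behaviour described above can be sketched roughly like this (record_clip, predict_probability, and predictions.csv are illustrative stand-ins, not Cry Baby's real API):

```python
# Rough sketch of the record -> classify -> log loop described above.
# record_clip and predict_probability are illustrative stand-ins, not Cry Baby's real API.
import csv
import datetime

CLIP_SECONDS = 4
CSV_PATH = "predictions.csv"  # illustrative output location


def run_loop(record_clip, predict_probability):
    """Continuously record 4-second clips, then print and log the crying probability."""
    with open(CSV_PATH, "a", newline="") as f:
        writer = csv.writer(f)
        while True:
            audio_path = record_clip(seconds=CLIP_SECONDS)  # returns the path of the saved clip
            probability = predict_probability(audio_path)   # CNN inference on that clip
            timestamp = datetime.datetime.now().isoformat()
            print(f"{timestamp}  p(crying) = {probability:.2f}  ({audio_path})")
            writer.writerow([timestamp, audio_path, probability])
            f.flush()  # make rows visible immediately
```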

About the model

The codebase for training the model is currently not included in this repository due to its preliminary state. If there is interest, I plan to refine and share it.

Training data

The model was trained on 1.2 GB of evenly distributed labeled audio files, consisting of both crying and non-crying samples.

Data sources include:

Preprocessing

The raw audio files were processed into 4-second Mel spectrogram clips using Librosa. A Mel spectrogram is a two-dimensional representation of the sound, with frequency mapped onto the mel scale.

The preprocessing routine is integral to Cry Baby's runtime operations and is available here.
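While the real routine lives in the linked module, a minimal sketch of this kind of Librosa-based preprocessing looks like the following (the sample rate, number of mel bands, and function name are assumptions, not Cry Baby's exact settings):

```python
# Sketch of turning a 4-second audio clip into a mel spectrogram with Librosa.
# Sample rate, n_mels, and other parameters are assumed values, not Cry Baby's exact settings.
import librosa
import numpy as np


def clip_to_mel_spectrogram(path: str, sample_rate: int = 16_000, n_mels: int = 128) -> np.ndarray:
    """Load a clip, pad/trim it to 4 seconds, and return a log-scaled mel spectrogram."""
    audio, _ = librosa.load(path, sr=sample_rate)
    target_len = 4 * sample_rate
    audio = librosa.util.fix_length(audio, size=target_len)  # pad or trim to exactly 4 s
    mel = librosa.feature.melspectrogram(y=audio, sr=sample_rate, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)  # 2-D array: (n_mels, time_frames)
```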

Dataset partitioning

The dataset was first split 80/20: 80% for training, with the remaining 20% divided equally between validation and test sets.
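Since the training code is not published, here is an illustrative way to reproduce that 80/10/10 split with scikit-learn, assuming the labeled clips are held in simple Python lists:

```python
# Illustrative 80/10/10 split of labeled clips using scikit-learn.
# This only mirrors the split described above; it is not the original training code.
from sklearn.model_selection import train_test_split


def split_dataset(paths, labels, seed=42):
    # First split off 80% for training.
    train_x, rest_x, train_y, rest_y = train_test_split(
        paths, labels, test_size=0.2, stratify=labels, random_state=seed
    )
    # Then divide the remaining 20% equally into validation and test sets.
    val_x, test_x, val_y, test_y = train_test_split(
        rest_x, rest_y, test_size=0.5, stratify=rest_y, random_state=seed
    )
    return (train_x, train_y), (val_x, val_y), (test_x, test_y)
```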

Model architecture

The model's architecture is inspired by the design presented in this research paper. Below is a visualization of the model structure:

CNN Model visualized
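The exact layer configuration lives in the model code, but a small Keras CNN in the same spirit would look roughly like this (the layer sizes and input shape below are illustrative, not the published architecture):

```python
# Illustrative Keras CNN over mel-spectrogram inputs; layer sizes and input shape
# are assumptions, not the exact architecture shown in the diagram above.
import tensorflow as tf


def build_model(input_shape=(128, 126, 1)) -> tf.keras.Model:
    """Binary classifier: mel spectrogram in, crying probability out."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # probability of crying
    ])
```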

Training and evaluation

Training was conducted over 10 epochs with a batch size of 32. The corresponding training and validation loss and accuracy metrics are illustrated below.

loss and accuracy metrics
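In Keras terms, training with those settings would look approximately like the sketch below. The optimizer, loss, and placeholder data are assumptions consistent with a binary classifier, not confirmed details of the original training run:

```python
# Sketch of the training setup described above (10 epochs, batch size 32).
# Optimizer, loss, and the random placeholder data are assumptions, not confirmed details.
import numpy as np

model = build_model()  # the illustrative CNN sketched in the previous section
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Placeholder arrays standing in for the real mel-spectrogram dataset.
x_train = np.random.rand(320, 128, 126, 1).astype("float32")
y_train = np.random.randint(0, 2, size=320)
x_val = np.random.rand(40, 128, 126, 1).astype("float32")
y_val = np.random.randint(0, 2, size=40)

history = model.fit(
    x_train, y_train,
    validation_data=(x_val, y_val),
    epochs=10,
    batch_size=32,
)
```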
