RibbitRadar

RibbitRadar is a Python-based application designed to accurately identify the presence of specific frog species in audio recordings. Leveraging my fine-tuned version of the Audio Spectrogram Transformer (AST), RibbitRadar processes audio data, performs inference, and generates reports with detailed information on each detection.

Overview

RibbitRadar is part of a broader project focused on automated frog call recognition. The application performs the following key tasks:

Preprocessing: Converts audio files into a format suitable for model inference.
Inference: Uses pre-trained models to identify frog species in the recordings.
Reporting: Generates results in various report formats, providing both detailed and summary-level information.
Features: Adjustable prediction mode, thresholds, and report formatting.
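The adjustable threshold and the summary-level report can be pictured with a minimal sketch. This is not RibbitRadar's actual code: the function names, the score layout (one dictionary of per-species scores per audio segment), and the example values are all assumptions made for illustration.

```python
from collections import Counter


def label_segments(scores, threshold=0.5):
    """Return, for each segment, the species whose score meets the threshold.

    `scores` is a hypothetical structure: a list with one dict per audio
    segment, mapping species name to a model score between 0 and 1.
    """
    labels = []
    for segment_scores in scores:
        detected = sorted(
            species for species, score in segment_scores.items()
            if score >= threshold
        )
        labels.append(detected)
    return labels


def summarize(labels):
    """Count how many segments each species was detected in."""
    counts = Counter(species for detected in labels for species in detected)
    return dict(counts)


# Made-up scores for three segments (not real model output).
example_scores = [
    {"Rana draytonii": 0.91, "Rana catesbeiana": 0.12},
    {"Rana draytonii": 0.40, "Rana catesbeiana": 0.75},
    {"Rana draytonii": 0.05, "Rana catesbeiana": 0.08},
]

labels = label_segments(example_scores, threshold=0.5)
summary = summarize(labels)
```

Raising the threshold in a sketch like this trades recall for precision: fewer segments are labeled, but the ones that are labeled carry higher model confidence.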

A more detailed flowchart of the application logic is below.

Functionality

Performance

Rana draytonii: Accuracy: 96.52% - Precision: 96.09% - Recall: 91.87%
Rana catesbeiana: Accuracy: 94.60% - Precision: 95.61% - Recall: 82.43%

These metrics are based on a test set of 10-second audio files: 455 Rana draytonii, 370 Rana catesbeiana, and 1111 negative.
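For readers less familiar with these metrics, the sketch below shows the standard definitions of accuracy, precision, and recall for a single class. It is not the project's evaluation code, and the confusion counts are invented for illustration; they are not RibbitRadar's results.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, and recall from confusion counts.

    tp: files correctly flagged as containing the species
    fp: files flagged that do not contain the species
    tn: files correctly left unflagged
    fn: files containing the species that were missed
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = binary_metrics(tp=90, fp=10, tn=880, fn=20)
```

With these made-up counts, precision is 90 / 100 and recall is 90 / 110, which shows how a model can be precise while still missing a noticeable share of true calls.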

Getting Started

To use RibbitRadar, download the latest release from the Releases page. The release includes a packaged application for macOS and Windows.

Prerequisites

macOS or Windows operating system.
Audio recordings in WAV format to analyze.
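Before running the application, it can help to confirm that a folder of recordings really contains readable WAV files. The sketch below is an optional helper and not part of RibbitRadar; it uses Python's standard `wave` module, which reads uncompressed PCM WAV files, and the function names are assumptions for illustration.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration in seconds of a PCM WAV file."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_recordings(folder):
    """Map each *.wav file name in `folder` to its duration in seconds.

    Files that the `wave` module cannot parse are mapped to None.
    Note that the "*.wav" pattern is case-sensitive on some systems.
    """
    results = {}
    for path in sorted(Path(folder).glob("*.wav")):
        try:
            results[path.name] = wav_duration_seconds(path)
        except (wave.Error, EOFError):
            results[path.name] = None
    return results
```

Calling `check_recordings` on a folder of recordings (the folder path is up to you) returns a dictionary in which any `None` value marks a file worth re-exporting as WAV before analysis.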

Running RibbitRadar

Extract the RibbitRadar.zip file.
Navigate to the RibbitRadar directory.
Double-click on main.exe to run the application.

Citing

If you utilize RibbitRadar in your research, please consider citing the original AST paper and any subsequent works that this project builds upon.

The first paper proposes the Audio Spectrogram Transformer, while the second paper describes the training pipeline that the authors applied to AST to achieve state-of-the-art results on AudioSet.

@inproceedings{gong21b_interspeech,
  author={Yuan Gong and Yu-An Chung and James Glass},
  title={{AST: Audio Spectrogram Transformer}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={571--575},
  doi={10.21437/Interspeech.2021-698}
}

@article{gong_psla,
  author={Gong, Yuan and Chung, Yu-An and Glass, James},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  title={PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation},
  year={2021},
  doi={10.1109/TASLP.2021.3120633}
}

Contact

If you have a question, would like to develop something similar for another species, or just want to share how you have used this, send me an email at tylerschwenk1@yahoo.com.
