jsreed2003/gunthar

Gunthar

Elephant infrasound analysis and denoising. Gunthar processes low-frequency elephant vocalizations (0–500 Hz), running a full pipeline from raw field recordings through preprocessing, visualization, and neural denoising.


Overview

Gunthar's denoising model was trained with transfer learning on pseudo-clean vocalization audio (recordings from which obvious background noise had been removed), paired against samples of background noise that contain no vocalization. During training, noisy and pseudo-clean audio are paired at random, and training iterates until the model converges on removing the background noise.
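
The random pairing described above can be illustrated with a small sketch. This is not the project's training code: it assumes that a noisy input is synthesized by adding a randomly chosen noise clip to a pseudo-clean clip at a random signal-to-noise ratio, which the README does not state. The function names and the 0 to 20 dB SNR range are hypothetical.

```python
import numpy as np


def rms(x):
    """Root-mean-square level of a 1-D signal."""
    return float(np.sqrt(np.mean(np.square(x))))


def mix_at_snr(clean, noise, snr_db):
    """Add noise to clean so that the clean/noise RMS ratio equals snr_db.

    Assumes both arrays have the same length and the noise is not silent.
    """
    gain = rms(clean) / (rms(noise) * 10 ** (snr_db / 20))
    return clean + gain * noise


def random_pairs(clean_clips, noise_clips, rng, snr_range_db=(0.0, 20.0)):
    """Yield (noisy_input, clean_target) training pairs.

    Each pseudo-clean clip is paired with a randomly chosen noise clip.
    The SNR range is a hypothetical choice, not taken from the project.
    Assumes every noise clip is at least as long as the clean clip.
    """
    for clean in clean_clips:
        noise = noise_clips[rng.integers(len(noise_clips))]
        snr_db = rng.uniform(*snr_range_db)
        yield mix_at_snr(clean, noise[: len(clean)], snr_db), clean
```

Under these assumptions, a denoiser would be trained to map each `noisy_input` back to its `clean_target`; drawing a fresh pairing each epoch keeps the model from memorizing one particular noise background per call.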


Project Structure

gunthar/
├── main.py                  # FastAPI backend
├── run.sh                   # Start frontend + backend together
├── functions/
│   ├── dataset.py           # Build dataset from CSV + raw audio
│   ├── preprocess.py        # High-pass filter → STFT → PCEN pipeline
│   ├── analyze.py           # Waveform, spectrogram, FFT, and stats for visualization
│   └── denoising.py         # Biodenoising model inference
├── frontend/
│   ├── app.py               # Streamlit entry point
│   ├── home.py              # Clip browser + call-type chart
│   ├── settings.py          # Dataset management UI
│   └── api.py               # HTTP client for the backend
├── data/
│   ├── raw_audio/           # Source WAV files
│   ├── dataset/             # Extracted clips
│   ├── new_spectro/         # Spectrogram PNGs
│   ├── processed/           # PCEN arrays (.npy)
│   ├── denoised_data/       # Denoised dataset clips
│   └── user_uploads/        # User-uploaded files
├── checkpoint/
│   └── checkpoint_step3.th  # Trained denoising model
└── biodenoising/            # Local editable install of biodenoising

Setup

uv sync

Running

./run.sh

Starts both the backend and the Streamlit frontend.

| Service | URL |
| --- | --- |
| Backend | http://localhost:8000 |
| Frontend | http://localhost:8501 |
| API Docs | http://localhost:8000/docs |

Preprocessing Pipeline

Each clip is processed through three stages:

  1. High-pass filter (5 Hz, 4th-order Butterworth): removes DC offset and slow drift
  2. STFT (n_fft=8192, hop=512): long window for fine frequency resolution in the infrasound range (~0.6 Hz bins)
  3. PCEN (per-channel energy normalization): adaptive background suppression
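
The three stages can be sketched with NumPy and SciPy as follows. This is an illustration under stated assumptions, not the code in functions/preprocess.py: the README names neither the libraries used nor the sample rate, so the zero-phase filtering, the SciPy STFT (whose scaling differs from other libraries), the simplified PCEN, and all PCEN parameter values are choices made for this example.

```python
import numpy as np
from scipy import signal


def highpass(y, sr, cutoff_hz=5.0, order=4):
    """Butterworth high-pass; applied forward and backward (zero phase).

    Zero-phase filtering is an assumption of this sketch.
    """
    sos = signal.butter(order, cutoff_hz, btype="highpass", fs=sr, output="sos")
    return signal.sosfiltfilt(sos, y)


def magnitude_stft(y, sr, n_fft=8192, hop=512):
    """Return (bin frequencies, magnitude spectrogram).

    The spectrogram has shape (n_fft // 2 + 1, n_frames).
    """
    freqs, _, Z = signal.stft(y, fs=sr, nperseg=n_fft, noverlap=n_fft - hop)
    return freqs, np.abs(Z)


def pcen(S, sr, hop=512, gain=0.98, bias=2.0, power=0.5,
         time_constant=0.4, eps=1e-6):
    """Simplified PCEN with illustrative parameter values.

    Each frequency bin is divided by a smoothed copy of itself along time,
    then compressed with a power law.
    """
    # Coefficient of the first-order smoother, derived from the time constant.
    t_frames = time_constant * sr / hop
    s = (np.sqrt(1 + 4 * t_frames**2) - 1) / (2 * t_frames**2)
    # M[t] = s * S[t] + (1 - s) * M[t - 1], computed along the time axis.
    M = signal.lfilter([s], [1, s - 1], S, axis=-1)
    return (S / (eps + M) ** gain + bias) ** power - bias**power
```

The frequency resolution follows from bin width = sample rate / n_fft. At a hypothetical sample rate of 5000 Hz, `freqs[1] - freqs[0]` is 5000 / 8192, or about 0.61 Hz, and each 8192-sample window spans about 1.6 s.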

API Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /dataset/build | Extract clips from CSV + raw audio |
| POST | /preprocess | Run preprocessing pipeline on a file/folder |
| POST | /raw-audio/rename | Rename raw audio to friendly convention |
| POST | /upload | Upload a WAV file |
| GET | /clips | List all clips |
| GET | /clips/{filename}/wav | Stream a WAV clip |
| GET | /clips/{filename}/npy | Stream a PCEN array |
| GET | /clips/{filename}/analysis | Full analysis (waveform, spectrogram, FFT) |
| GET | /clips/{filename}/denoise | Run denoising and stream the result |
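
A minimal client for the GET endpoints might look like the sketch below, using only the standard library. The helper names are hypothetical, and the assumption that GET /clips returns a JSON body is not confirmed by this README; check the API docs at /docs for the actual response schemas. Clip filenames contain spaces, so they are percent-encoded before being placed in the path.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # backend address from the Running section


def clip_url(filename, resource, base_url=BASE_URL):
    """Build the URL for /clips/{filename}/{resource}.

    resource is one of: wav, npy, analysis, denoise.
    """
    return f"{base_url}/clips/{quote(filename, safe='')}/{resource}"


def list_clips(base_url=BASE_URL):
    """Call GET /clips; assumes the response body is JSON."""
    with urlopen(f"{base_url}/clips") as resp:
        return json.load(resp)


def download(url, dest_path):
    """Save a response body (for example a WAV stream) to a local file."""
    with urlopen(url) as resp, open(dest_path, "wb") as out:
        out.write(resp.read())
```

With the backend running, `download(clip_url("001 rumble airplane 01 99-22A 130.wav", "denoise"), "denoised.wav")` would request a denoised copy of the example clip.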

File Naming Convention

Output clips follow the pattern:

{index} {call_type} {description} {recording_id} {selection}.wav

Example: 001 rumble airplane 01 99-22A 130.wav

| Field | Example | Source |
| --- | --- | --- |
| index | 001 | Row order in the CSV |
| call_type | rumble | Label from the annotation CSV |
| description | airplane 01 | Descriptive part of source name |
| recording_id | 99-22A | Recording ID from source name |
| selection | 130 | Selection number from CSV |
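
Because the description field can itself contain spaces, a clip name cannot be split into five fixed parts. The sketch below parses names by position from both ends. It assumes that call_type, recording_id, and selection never contain spaces, which holds for the example above but is not guaranteed by this README. The function name is hypothetical.

```python
from pathlib import Path


def parse_clip_name(filename):
    """Split a clip filename into its five naming-convention fields."""
    parts = Path(filename).stem.split(" ")
    if len(parts) < 5:
        raise ValueError(f"unexpected clip name: {filename!r}")
    # First two and last two tokens are fixed fields; the rest is description.
    index, call_type, *description, recording_id, selection = parts
    return {
        "index": index,
        "call_type": call_type,
        "description": " ".join(description),
        "recording_id": recording_id,
        "selection": selection,
    }
```

For the example name, `parse_clip_name("001 rumble airplane 01 99-22A 130.wav")` yields `airplane 01` as the description and `99-22A` as the recording ID.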

Data

Place raw WAV recordings in data/raw_audio/ and the annotation CSV in data/. The CSV must have columns: Selection, Sound_file, Start_time, End_time, Call_type.

Trigger a build via POST /dataset/build or through the Settings page in the frontend.
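
Before triggering a build, it can help to confirm that the annotation CSV has the required columns. The check below is a standalone sketch, not part of the project. It assumes a comma-delimited file with a header row, and the function name is hypothetical.

```python
import csv

REQUIRED_COLUMNS = ("Selection", "Sound_file", "Start_time", "End_time", "Call_type")


def missing_columns(csv_file):
    """Return the required columns absent from the CSV header, in order."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    return [name for name in REQUIRED_COLUMNS if name not in header]
```

Open the CSV with `open(path, newline="")` and pass the file object in; an empty list means every required column is present.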
