Gunthar

Elephant infrasound analysis and denoising. Gunthar processes low-frequency elephant vocalizations (0–500 Hz), running a full pipeline from raw field recordings through preprocessing, visualization, and neural denoising.

Overview

Gunthar's denoising model was trained with transfer learning on pseudo-clean vocalization audio (recordings with the obvious background noise removed), paired with samples of background noise that contain no vocalization. During training, noisy and pseudo-clean audio are paired at random, and training continues until the model converges on removing the background noise.

Project Structure

gunthar/
├── main.py                  # FastAPI backend
├── run.sh                   # Start frontend + backend together
├── functions/
│   ├── dataset.py           # Build dataset from CSV + raw audio
│   ├── preprocess.py        # High-pass filter → STFT → PCEN pipeline
│   ├── analyze.py           # Waveform, spectrogram, FFT, and stats for visualization
│   └── denoising.py         # Biodenoising model inference
├── frontend/
│   ├── app.py               # Streamlit entry point
│   ├── home.py              # Clip browser + call-type chart
│   ├── settings.py          # Dataset management UI
│   └── api.py               # HTTP client for the backend
├── data/
│   ├── raw_audio/           # Source WAV files
│   ├── dataset/             # Extracted clips
│   ├── new_spectro/         # Spectrogram PNGs
│   ├── processed/           # PCEN arrays (.npy)
│   ├── denoised_data/       # Denoised dataset clips
│   └── user_uploads/        # User-uploaded files
├── checkpoint/
│   └── checkpoint_step3.th  # Trained denoising model
└── biodenoising/            # Local editable install of biodenoising

Setup

uv sync

Running

./run.sh

Starts both the backend and the Streamlit frontend.

Service	URL
Backend	http://localhost:8000
Frontend	http://localhost:8501
API Docs	http://localhost:8000/docs

Preprocessing Pipeline

Each clip is processed through three stages:

High-pass filter (5 Hz, 4th-order Butterworth) — removes DC drift
STFT (n_fft=8192, hop=512) — long window for infrasound frequency resolution (~0.6 Hz bins)
PCEN normalization — adaptive background suppression
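
The relationship between the STFT settings and the frequency resolution can be checked with a few lines of arithmetic. This is a sketch only: the README does not state the sample rate of the clips, so the 4800 Hz used here is an assumption chosen for illustration, and `stft_resolution` is a hypothetical helper, not a function from `functions/preprocess.py`. A bin width of about 0.6 Hz with `n_fft=8192` implies a sample rate somewhere near 4.9 kHz.

```python
import math


def stft_resolution(sample_rate_hz, n_fft=8192, hop=512, max_freq_hz=500.0):
    """Return basic resolution figures for an STFT with the given settings."""
    bin_width_hz = sample_rate_hz / n_fft  # spacing between frequency bins
    return {
        "bin_width_hz": bin_width_hz,
        "window_s": n_fft / sample_rate_hz,  # duration of one analysis window
        "hop_s": hop / sample_rate_hz,       # time step between frames
        # number of bins covering 0..max_freq_hz, including the DC bin
        "bins_in_band": math.floor(max_freq_hz / bin_width_hz) + 1,
    }


# 4800 Hz is an assumed sample rate, not a value taken from the project.
res = stft_resolution(4800)
print(res)
```

The long window buys fine frequency resolution at the cost of time resolution: under the assumed sample rate each window spans roughly 1.7 seconds, which suits slowly varying rumbles better than short transient calls.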

API Endpoints

Method	Path	Description
`POST`	`/dataset/build`	Extract clips from CSV + raw audio
`POST`	`/preprocess`	Run preprocessing pipeline on a file/folder
`POST`	`/raw-audio/rename`	Rename raw audio files to the friendly naming convention
`POST`	`/upload`	Upload a WAV file
`GET`	`/clips`	List all clips
`GET`	`/clips/{filename}/wav`	Stream a WAV clip
`GET`	`/clips/{filename}/npy`	Stream a PCEN array
`GET`	`/clips/{filename}/analysis`	Full analysis (waveform, spectrogram, FFT)
`GET`	`/clips/{filename}/denoise`	Run denoising and stream the result
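
Clip filenames contain spaces, so they need to be percent-encoded when placed in a URL path. The sketch below shows one way to call the `GET` clip endpoints using only the standard library. The helper names are hypothetical, and it assumes that `/analysis` returns JSON and that `/denoise` returns WAV bytes; neither response format is confirmed here, so check the API docs at `/docs` before relying on it.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # backend URL from the Running section


def clip_url(filename, resource):
    """Build /clips/{filename}/{resource}, percent-encoding the filename."""
    return f"{BASE_URL}/clips/{quote(filename, safe='')}/{resource}"


def fetch_analysis(filename):
    """GET the analysis for a clip (assumes the response body is JSON)."""
    with urlopen(clip_url(filename, "analysis")) as resp:
        return json.load(resp)


def download_denoised(filename, out_path):
    """GET the denoised clip and write the streamed bytes to out_path."""
    with urlopen(clip_url(filename, "denoise")) as resp, open(out_path, "wb") as out:
        out.write(resp.read())


print(clip_url("001 rumble airplane 01 99-22A 130.wav", "wav"))
```

The `POST` endpoints are left out because their request bodies are not described in this README.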

File Naming Convention

Output clips follow the pattern:

{index} {call_type} {description} {recording_id} {selection}.wav

Example: 001 rumble airplane 01 99-22A 130.wav

Field	Example	Source
`index`	`001`	Row order in the CSV
`call_type`	`rumble`	Label from the annotation CSV
`description`	`airplane 01`	Descriptive part of source name
`recording_id`	`99-22A`	Recording ID from source name
`selection`	`130`	Selection number from CSV
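
The naming pattern can be built and parsed with a small helper, sketched below. These functions are hypothetical and are not taken from `functions/dataset.py`. Two assumptions are made: the index is zero-padded to three digits (inferred from the `001` example), and only the description may contain spaces, so the first two and last two space-separated tokens are unambiguous.

```python
from typing import NamedTuple


class ClipName(NamedTuple):
    index: str
    call_type: str
    description: str
    recording_id: str
    selection: str


def format_clip_name(index, call_type, description, recording_id, selection):
    """Build a clip filename; three-digit zero padding is an assumption."""
    return f"{index:03d} {call_type} {description} {recording_id} {selection}.wav"


def parse_clip_name(filename):
    """Split a clip filename into its fields.

    Assumes call_type, recording_id, and selection contain no spaces;
    everything between them is treated as the description.
    """
    if not filename.endswith(".wav"):
        raise ValueError(f"not a .wav clip name: {filename!r}")
    parts = filename[: -len(".wav")].split(" ")
    if len(parts) < 5:
        raise ValueError(f"too few fields in clip name: {filename!r}")
    return ClipName(
        index=parts[0],
        call_type=parts[1],
        description=" ".join(parts[2:-2]),
        recording_id=parts[-2],
        selection=parts[-1],
    )


name = format_clip_name(1, "rumble", "airplane 01", "99-22A", 130)
parsed = parse_clip_name(name)
print(name)
print(parsed)
```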

Data

Place raw WAV recordings in data/raw_audio/ and the annotation CSV in data/. The CSV must have columns: Selection, Sound_file, Start_time, End_time, Call_type.
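
Before triggering a build, it can help to confirm that the annotation CSV has the required columns. The check below is a minimal sketch, assuming a comma-delimited file with a header row; `missing_columns` is a hypothetical helper and the sample row is made up for illustration.

```python
import csv
import io

REQUIRED_COLUMNS = ["Selection", "Sound_file", "Start_time", "End_time", "Call_type"]


def missing_columns(csv_file):
    """Return the required columns absent from the CSV header, in order."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []  # fieldnames is None for an empty file
    return [col for col in REQUIRED_COLUMNS if col not in header]


# Made-up sample that lacks the Call_type column.
sample = io.StringIO("Selection,Sound_file,Start_time,End_time\n130,example.wav,1.0,2.5\n")
missing = missing_columns(sample)
print(missing)
```

For a file on disk, pass an open handle instead, for example `open(path, newline="")`.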

Trigger a build via POST /dataset/build or through the Settings page in the frontend.
