Elephant infrasound analysis and denoising. Gunthar processes low-frequency elephant vocalizations (0–500 Hz), running a full pipeline from raw field recordings through preprocessing, visualization, and neural denoising.
Gunthar was trained with transfer learning on pseudo-clean vocalization audio (recordings from which obvious background noise had been removed), paired against samples of background noise that contain no vocalization. During training, noisy and pseudo-clean audio are paired at random, and training continues until the model converges on removing the background noise.
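The random pairing described above can be illustrated with a minimal sketch. It shows one plausible reading, not the actual training code: a noise-only sample is scaled and added to a pseudo-clean clip to form the noisy input, and the pseudo-clean clip is the target. The function names, the SNR range, and the 4800 Hz sample rate are all assumptions made for illustration.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean`, scaled so the mixture has the requested SNR in dB."""
    if len(noise) < len(clean):
        raise ValueError("noise sample must be at least as long as the clean clip")
    noise = noise[: len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12  # guard against silent noise
    gain = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + gain * noise


def sample_pair(clean_clips, noise_clips, rng, snr_range=(-5.0, 15.0)):
    """Pick one pseudo-clean clip and one noise sample at random.

    Returns (noisy_input, target). The SNR range is a hypothetical choice.
    """
    clean = clean_clips[rng.integers(len(clean_clips))]
    noise = noise_clips[rng.integers(len(noise_clips))]
    snr_db = rng.uniform(*snr_range)
    return mix_at_snr(clean, noise, snr_db), clean


# Synthetic stand-ins for real clips (1 s each at an assumed 4800 Hz).
sr = 4800
rng = np.random.default_rng(0)
t = np.arange(sr) / sr
clean_clips = [np.sin(2 * np.pi * 20.0 * t)]
noise_clips = [rng.standard_normal(sr)]

noisy, target = sample_pair(clean_clips, noise_clips, rng)
```

Drawing a fresh pair on every step means the model rarely sees the same noise under the same vocalization twice, which is the usual motivation for random pairing.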
gunthar/
├── main.py # FastAPI backend
├── run.sh # Start frontend + backend together
├── functions/
│ ├── dataset.py # Build dataset from CSV + raw audio
│ ├── preprocess.py # High-pass filter → STFT → PCEN pipeline
│ ├── analyze.py # Waveform, spectrogram, FFT, and stats for visualization
│ └── denoising.py # Biodenoising model inference
├── frontend/
│ ├── app.py # Streamlit entry point
│ ├── home.py # Clip browser + call-type chart
│ ├── settings.py # Dataset management UI
│ └── api.py # HTTP client for the backend
├── data/
│ ├── raw_audio/ # Source WAV files
│ ├── dataset/ # Extracted clips
│ ├── new_spectro/ # Spectrogram PNGs
│ ├── processed/ # PCEN arrays (.npy)
│ ├── denoised_data/ # Denoised dataset clips
│ └── user_uploads/ # User-uploaded files
├── checkpoint/
│ └── checkpoint_step3.th # Trained denoising model
└── biodenoising/ # Local editable install of biodenoising
Install dependencies with `uv sync`, then run `./run.sh`, which starts both the backend and the Streamlit frontend.
| Service | URL |
|---|---|
| Backend | http://localhost:8000 |
| Frontend | http://localhost:8501 |
| API Docs | http://localhost:8000/docs |
Each clip is processed through three stages:
- High-pass filter (5 Hz, 4th-order Butterworth): removes DC drift
- STFT (`n_fft=8192`, `hop=512`): long window for infrasound frequency resolution (~0.6 Hz bins)
- PCEN normalization: adaptive background suppression
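The three stages can be sketched with NumPy and SciPy as follows. This is an illustration under stated assumptions, not the contents of `functions/preprocess.py`: the function names are hypothetical, zero-phase filtering is a choice made here, the PCEN parameters are commonly used defaults rather than the project's values, and the 4800 Hz sample rate is assumed only because it gives bins of about 0.59 Hz with `n_fft=8192`, close to the figure quoted above.

```python
import numpy as np
from scipy import signal


def highpass(x, sr, cutoff_hz=5.0, order=4):
    """Butterworth high-pass. Zero-phase (forward-backward) filtering is an assumption."""
    sos = signal.butter(order, cutoff_hz, btype="highpass", fs=sr, output="sos")
    return signal.sosfiltfilt(sos, x)


def stft_magnitude(x, sr, n_fft=8192, hop=512):
    """Magnitude STFT with a Hann window; bin width is sr / n_fft."""
    freqs, times, Z = signal.stft(
        x, fs=sr, nperseg=n_fft, noverlap=n_fft - hop, boundary=None, padded=False
    )
    return freqs, times, np.abs(Z)


def pcen(E, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Per-channel energy normalization with a first-order smoother M.

    PCEN = (E / (eps + M)^alpha + delta)^r - delta^r
    Parameter values are illustrative defaults, not the project's settings.
    """
    M = np.empty_like(E)
    M[:, 0] = E[:, 0]
    for t in range(1, E.shape[1]):
        M[:, t] = (1.0 - s) * M[:, t - 1] + s * E[:, t]
    return (E / (eps + M) ** alpha + delta) ** r - delta ** r


def preprocess(x, sr):
    """High-pass filter -> STFT magnitude -> PCEN."""
    y = highpass(x, sr)
    freqs, times, mag = stft_magnitude(y, sr)
    return freqs, times, pcen(mag)


# Synthetic 10 s test signal: a 20 Hz tone riding on a DC offset.
sr = 4800  # assumed sample rate
t = np.arange(10 * sr) / sr
x = 0.5 + np.sin(2 * np.pi * 20.0 * t)
freqs, times, features = preprocess(x, sr)
print(features.shape, freqs[1] - freqs[0])
```

The long window trades time resolution for frequency resolution, which suits rumbles whose energy is concentrated in the lowest bins.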
| Method | Path | Description |
|---|---|---|
| POST | `/dataset/build` | Extract clips from CSV + raw audio |
| POST | `/preprocess` | Run preprocessing pipeline on a file/folder |
| POST | `/raw-audio/rename` | Rename raw audio to friendly convention |
| POST | `/upload` | Upload a WAV file |
| GET | `/clips` | List all clips |
| GET | `/clips/{filename}/wav` | Stream a WAV clip |
| GET | `/clips/{filename}/npy` | Stream a PCEN array |
| GET | `/clips/{filename}/analysis` | Full analysis (waveform, spectrogram, FFT) |
| GET | `/clips/{filename}/denoise` | Run denoising and stream the result |
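A small standard-library client for the GET endpoints might look like the sketch below. The helper names are hypothetical, and the response format of `/clips` (assumed here to be JSON) is not documented above. Clip filenames contain spaces, so they need percent-encoding before being placed in the path.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"
CLIP_RESOURCES = {"wav", "npy", "analysis", "denoise"}


def clip_url(filename, resource, base_url=BASE_URL):
    """Build the URL for one of the /clips/{filename}/... endpoints."""
    if resource not in CLIP_RESOURCES:
        raise ValueError(f"unknown resource: {resource!r}")
    # safe="" so that spaces and any slashes in the name are percent-encoded.
    return f"{base_url}/clips/{quote(filename, safe='')}/{resource}"


def list_clips(base_url=BASE_URL):
    """GET /clips. Assumes the backend answers with JSON."""
    with urlopen(f"{base_url}/clips") as resp:
        return json.load(resp)


def download(filename, resource, dest, base_url=BASE_URL):
    """Fetch a clip resource (for example the denoised WAV) and save it to `dest`."""
    with urlopen(clip_url(filename, resource, base_url)) as resp, open(dest, "wb") as out:
        out.write(resp.read())


print(clip_url("001 rumble airplane 01 99-22A 130.wav", "denoise"))
```

`list_clips` and `download` need the backend running; `clip_url` can be used on its own to check how a filename is encoded.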
Output clips follow the pattern:
{index} {call_type} {description} {recording_id} {selection}.wav
Example: 001 rumble airplane 01 99-22A 130.wav
| Field | Example | Source |
|---|---|---|
| `index` | `001` | Row order in the CSV |
| `call_type` | `rumble` | Label from the annotation CSV |
| `description` | `airplane 01` | Descriptive part of source name |
| `recording_id` | `99-22A` | Recording ID from source name |
| `selection` | `130` | Selection number from CSV |
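Because the description can itself contain spaces, splitting a clip name back into fields needs some care. The sketch below assumes that fields are separated by spaces exactly as in the example, and that `index`, `call_type`, `recording_id`, and `selection` are each a single token, so everything in between is the description. `ClipName` and `parse_clip_name` are hypothetical names, not part of the project.

```python
from dataclasses import dataclass
from pathlib import PurePath


@dataclass(frozen=True)
class ClipName:
    index: str
    call_type: str
    description: str
    recording_id: str
    selection: str


def parse_clip_name(filename):
    """Split a clip filename into its five fields.

    Assumes space-separated fields with a multi-word description in the middle.
    """
    parts = PurePath(filename).stem.split()
    if len(parts) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(parts)}: {filename!r}")
    index, call_type, *description, recording_id, selection = parts
    return ClipName(index, call_type, " ".join(description), recording_id, selection)


clip = parse_clip_name("001 rumble airplane 01 99-22A 130.wav")
print(clip.call_type, "|", clip.description, "|", clip.recording_id)
```

If a call type or recording ID ever contains a space, this split becomes ambiguous and the fields would have to be taken from the annotation CSV instead.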
Place raw WAV recordings in data/raw_audio/ and the annotation CSV in data/. The CSV must have columns: Selection, Sound_file, Start_time, End_time, Call_type.
Trigger a build via POST /dataset/build or through the Settings page in the frontend.
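Before triggering a build, it can help to check that the annotation CSV has the required columns. The sketch below is a hypothetical pre-flight check, not part of the project. It assumes a comma-delimited file, and `trigger_build` assumes `POST /dataset/build` needs no request body, which is not confirmed above. The sample row is invented for illustration.

```python
import csv
import io
from urllib.request import Request, urlopen

REQUIRED_COLUMNS = ("Selection", "Sound_file", "Start_time", "End_time", "Call_type")


def missing_columns(csv_file):
    """Return the required column names absent from the CSV header."""
    header = csv.DictReader(csv_file).fieldnames or []
    return [name for name in REQUIRED_COLUMNS if name not in header]


def trigger_build(base_url="http://localhost:8000"):
    """POST /dataset/build with an empty body (assumption) and return the HTTP status."""
    request = Request(f"{base_url}/dataset/build", method="POST")
    with urlopen(request) as resp:
        return resp.status


# Illustrative header with one column missing.
sample = io.StringIO("Selection,Sound_file,Start_time,End_time\n130,rec.wav,1.0,4.5\n")
print(missing_columns(sample))  # ['Call_type']
```

For a real file, pass an open handle instead, for example `open(path, newline="")` as recommended by the `csv` module.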