A command-line tool that generates spectrogram images (PNG) from WAV audio files. It processes directories recursively and handles both mono and stereo files; stereo channels are averaged to mono for simplicity. It is optimized for RIFF (little-endian) WAVE audio in Microsoft PCM format (16-bit signed integers, mono, 16000 Hz), but supports general WAV files via normalization.
- Recursive Processing: Scans input directories and subdirectories for .wav files.
- Spectrogram Generation: Computes the Short-Time Fourier Transform (STFT) using Hann windowing and FFT, rendering frequency vs. time as grayscale images (brighter = higher amplitude in dB).
- Stereo Support: Averages left/right channels to mono.
- 16-bit PCM Handling: Reads signed 16-bit integers and normalizes to f32 in [-1.0, 1.0] for accurate processing.
- Output Mirroring: Preserves input directory structure in the output folder.
- Error Handling: Errors are reported with detailed context via anyhow.
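The stereo averaging and 16-bit normalization described above can be sketched as follows. This is a minimal illustration using only the standard library, not the tool's actual code: the function names `normalize_i16` and `to_mono` are hypothetical, and the sketch assumes samples arrive interleaved (left, right, left, right, ...).

```rust
/// Hypothetical helper: scales a signed 16-bit PCM sample by 1.0 / 32768.0.
/// i16::MIN maps to exactly -1.0; i16::MAX maps to just under 1.0.
fn normalize_i16(sample: i16) -> f32 {
    sample as f32 / 32768.0
}

/// Hypothetical helper: averages interleaved channels down to mono.
/// Assumes the input is interleaved; an incomplete trailing frame is dropped.
fn to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    if channels == 0 {
        return Vec::new();
    }
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}
```

With `channels` set to 1, `to_mono` returns the samples unchanged, so the same path can serve mono and stereo input.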
- Ensure Rust is installed (via rustup).
- Clone the repository:
git clone https://github.com/RustedBytes/wav-files-spectrogram.git
cd wav-files-spectrogram
- Build and install:
cargo install --path .
Alternatively, install directly from crates.io (once published):
cargo install wav-files-spectrogram
wav-files-spectrogram -i <INPUT_DIR> -o <OUTPUT_DIR>
- -i, --input <INPUT_DIR>: Path to the directory containing WAV files (required).
- -o, --output <OUTPUT_DIR>: Path to the directory where PNG spectrograms will be saved (required).
Process all WAV files in ./audio/ and save spectrograms to ./spectrograms/:
wav-files-spectrogram -i ./audio -o ./spectrograms
This will generate PNG files like ./spectrograms/subdir/song.wav.png, mirroring the input structure.
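One way to derive such a mirrored output path is sketched below, using only the standard library. The helper name `mirrored_output_path` is hypothetical and the tool's real implementation may differ; the sketch only reproduces the naming pattern shown in the example, where ".png" is appended to the full file name.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: maps an input WAV path to its mirrored PNG path.
/// Returns None if `wav_path` is not located under `input_root`.
fn mirrored_output_path(
    input_root: &Path,
    output_root: &Path,
    wav_path: &Path,
) -> Option<PathBuf> {
    // Path relative to the input root, e.g. "subdir/song.wav".
    let relative = wav_path.strip_prefix(input_root).ok()?;
    // Append ".png" to the whole file name: "song.wav" becomes "song.wav.png".
    let mut file_name = relative.file_name()?.to_os_string();
    file_name.push(".png");
    Some(output_root.join(relative).with_file_name(file_name))
}
```

Before writing the image, the parent directory of the returned path would still need to exist, for example via `std::fs::create_dir_all`.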
- FFT Parameters: Fixed at 1024-point FFT with 75% overlap (hop size 256) for balance between resolution and smoothness. Adjustable in source if needed.
- Image Scaling: dB range clamped to -100..0 for visual contrast; frequencies from DC to Nyquist on y-axis.
- Normalization: For 16-bit PCM, samples are scaled by 1.0 / 32768.0 to the f32 range [-1.0, 1.0].
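The parameters in these notes can be made concrete with a small sketch. It is standard-library only and makes several assumptions that the notes do not confirm: the Hann window uses the periodic form, frames are taken without padding (so files shorter than 1024 samples yield no frames), and 0 dB corresponds to a magnitude of 1.0. All function names here are hypothetical.

```rust
use std::f32::consts::PI;

const FFT_SIZE: usize = 1024; // 1024-point FFT
const HOP_SIZE: usize = 256; // 75% overlap: 1024 - 256 = 768 shared samples

/// Periodic Hann window of length `n` (assumed variant).
fn hann_window(n: usize) -> Vec<f32> {
    (0..n)
        .map(|i| 0.5 - 0.5 * (2.0 * PI * i as f32 / n as f32).cos())
        .collect()
}

/// Number of full frames that fit, assuming no padding at the edges.
fn frame_count(num_samples: usize) -> usize {
    if num_samples < FFT_SIZE {
        0
    } else {
        (num_samples - FFT_SIZE) / HOP_SIZE + 1
    }
}

/// Maps a linear magnitude to a grayscale value: -100 dB -> 0, 0 dB -> 255.
/// The 1e-10 floor avoids taking the logarithm of zero.
fn magnitude_to_gray(magnitude: f32) -> u8 {
    let db = 20.0 * magnitude.max(1e-10).log10();
    let clamped = db.clamp(-100.0, 0.0);
    ((clamped + 100.0) / 100.0 * 255.0).round() as u8
}
```

Under these assumptions, one second of audio at 16000 Hz yields `frame_count(16000)` = 59 columns, and a 1024-point FFT of real input gives 513 rows from DC to Nyquist, spaced 16000 / 1024 = 15.625 Hz apart.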
- clap: Argument parsing.
- hound: WAV reading.
- image: PNG generation.
- rustfft: Efficient FFT computation.
- walkdir: Recursive directory traversal.
- anyhow: Error handling with context.
- num-complex: Complex numbers for FFT.

See Cargo.toml for versions.
Run the test suite:
cargo test
Tests cover spectrogram computation, file processing (including 16-bit PCM normalization), and edge cases (e.g., empty/short files).
Contributions welcome! Fork the repo, create a feature branch, and submit a PR. Ensure code passes cargo fmt, cargo clippy, and cargo test.
- Report issues: GitHub Issues.
- Discussions: GitHub Discussions.
MIT License - see LICENSE file for details.