TimbreMind is a hands-on exploration of how audio descriptors can be
translated into visually rich artwork. The project ships with a
commented Python implementation that analyses .wav files and renders
Perlin-noise-based flow fields whose colours and motion respond to the
sound's loudness and brightness.
- Detailed audio descriptors – RMS loudness and spectral centroid are extracted with clear explanations in the code so the maths stays understandable.
- Perlin noise flow field – A small custom implementation of Perlin noise drives flowing line trajectories reminiscent of particle motion in a fluid.
- Accessible CLI – Generate artwork with a single command and tune parameters such as resolution, grid density, and random seed.
- Live visualiser – A Tkinter window lets you watch the Perlin field react in real time while the audio plays.
- Python 3.10+
- `pip` for installing dependencies
```bash
python -m venv .venv
source .venv/bin/activate  # On Windows use `.venv\Scripts\activate`
pip install -r requirements.txt
```

The dependencies stay approachable: NumPy for numerical computations,
SciPy for robust `.wav` loading, Matplotlib for rendering, and
Pygame for cross-platform playback inside the live viewer.
- Prepare a `.wav` file you would like to visualise.
- Run the generator:

  ```bash
  python -m timbremind.main path/to/audio.wav --output artwork.png
  ```

- Inspect `artwork.png` to see the audio translated into flowing lines.
Prefer to experience the artwork as it evolves? Launch the Tkinter interface:

```bash
python -m timbremind.ui
```

- Click Load WAV and pick the file you want to explore.
- Press Start to play the audio and animate the flow field in real time.
- Use Stop to pause playback and freeze the current field state.
The window is intentionally basic: a couple of buttons, a status label, and a
Tkinter `Canvas` that redraws short line segments 30 times per second. The
heavily commented code in `timbremind/ui.py` walks through the full process so
you can tweak the look without much prior GUI experience.
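If you want to adapt the drawing loop, the core of that pattern is Tkinter's `after` scheduling. The snippet below is a minimal, self-contained sketch of the idea with random angles standing in for the Perlin field; it is illustrative only, not the project's actual `ui.py`.

```python
# Minimal sketch of the Canvas redraw pattern (illustrative, not timbremind/ui.py):
# draw short line segments and refresh roughly 30 times per second.
import math
import random
import tkinter as tk

root = tk.Tk()
canvas = tk.Canvas(root, width=640, height=360, bg="black")
canvas.pack()

def redraw():
    canvas.delete("all")  # clear the previous frame
    for _ in range(200):
        x = random.uniform(0, 640)
        y = random.uniform(0, 360)
        angle = random.uniform(0, 2 * math.pi)  # in the real app this comes from the flow field
        length = 12
        canvas.create_line(
            x, y,
            x + length * math.cos(angle), y + length * math.sin(angle),
            fill="#ff8800", width=1,
        )
    root.after(33, redraw)  # schedule the next frame (~30 fps)

redraw()
root.mainloop()
```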
Audio playback uses Pygame's mixer module, which ships as pre-built wheels on all major platforms so Windows users do not need to install extra compilers.
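For reference, playback itself only needs a handful of `pygame.mixer` calls; the sketch below is illustrative and the file path is a placeholder.

```python
# Play a WAV file with pygame's mixer and wait for it to finish.
import time
import pygame

pygame.mixer.init()
pygame.mixer.music.load("path/to/audio.wav")  # placeholder path
pygame.mixer.music.play()

while pygame.mixer.music.get_busy():  # block until playback ends
    time.sleep(0.1)
```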
The short line segments show the direction and energy of the flow. Loud moments stretch the lines and make the field move faster, while bright timbres push the colours toward warm orange tones.
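The exact mapping lives in the rendering code, but conceptually it can be thought of as a small helper along the lines of the hypothetical sketch below, assuming both descriptors are normalised to the 0–1 range (names and constants are illustrative, not taken from the project).

```python
# Hypothetical mapping from normalised descriptors to visual parameters.
# The real timbremind/visualize.py may use different names and curves.
def map_descriptors(rms: float, centroid: float) -> tuple[float, float, float]:
    """rms and centroid are assumed to be normalised to 0..1."""
    length = 5.0 + 30.0 * rms      # louder -> longer, faster-moving segments
    hue = 0.6 - 0.5 * centroid     # brighter -> shift from blue (0.6) toward warm orange (~0.1)
    opacity = 0.3 + 0.7 * rms      # quiet passages fade into the background
    return length, hue, opacity
```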
The full list of options is available via `--help`:

```bash
python -m timbremind.main --help
```
Key switches include:
- `--resolution WIDTH HEIGHT` – change the output image size.
- `--grid COLUMNS ROWS` – adjust how many flow lines are seeded.
- `--seed` – choose a different random seed to explore new textures.
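For orientation, the switches above could be declared with `argparse` roughly as in the sketch below; the defaults shown are assumptions, not the values actually used by `timbremind/main.py`.

```python
# Illustrative argparse setup for the switches listed above; the real
# timbremind/main.py may organise this differently.
import argparse

parser = argparse.ArgumentParser(prog="timbremind.main")
parser.add_argument("input", help="path to the .wav file to analyse")
parser.add_argument("--output", default="artwork.png", help="where to save the rendered PNG")
parser.add_argument("--resolution", nargs=2, type=int, metavar=("WIDTH", "HEIGHT"),
                    default=(1920, 1080), help="output image size in pixels")
parser.add_argument("--grid", nargs=2, type=int, metavar=("COLUMNS", "ROWS"),
                    default=(80, 45), help="how many flow lines are seeded")
parser.add_argument("--seed", type=int, default=0, help="random seed for repeatable textures")
args = parser.parse_args()
```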
- Audio analysis – `timbremind/audio.py` loads the `.wav` file, normalises it to mono, and splits it into short overlapping windows. For each window we compute (see the sketch after this list):
  - RMS loudness – approximates perceived volume and smooths transients to avoid jittery visuals.
  - Spectral centroid – indicates whether the sound is dark or bright, which later influences the colour palette.
- Perlin noise flow field – `timbremind/perlin.py` implements a compact version of Ken Perlin's gradient noise. By sampling the noise at grid points we derive angles that represent the direction of the flow at each location.
- Visual synthesis – `timbremind/visualize.py` seeds lines across the canvas, walks each line by following the flow direction, and uses the audio descriptors to modulate:
  - the number of steps taken (the energy of the path),
  - the colour hue and opacity, and
  - the line thickness.

  The result is a high-resolution PNG rendered with Matplotlib.
- Command line orchestration – `timbremind/main.py` ties everything together, providing an ergonomic interface for running the analysis and saving the artwork.
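To make the audio-analysis step above concrete, here is a rough NumPy/SciPy sketch of the per-window maths. The real `timbremind/audio.py` additionally handles overlapping windows and smoothing, so treat this as an approximation rather than the project's code.

```python
# Rough sketch of the per-window descriptors; timbremind/audio.py also
# handles overlapping windows and transient smoothing.
import numpy as np
from scipy.io import wavfile

def describe_window(window: np.ndarray, sample_rate: int) -> tuple[float, float]:
    """Return (rms, spectral_centroid) for one mono window of samples."""
    rms = float(np.sqrt(np.mean(window ** 2)))                  # proxy for perceived loudness
    spectrum = np.abs(np.fft.rfft(window))                      # magnitude spectrum
    freqs = np.fft.rfftfreq(len(window), d=1.0 / sample_rate)
    centroid = float(np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12))  # "brightness"
    return rms, centroid

sample_rate, samples = wavfile.read("path/to/audio.wav")         # placeholder path
if samples.ndim > 1:
    samples = samples.mean(axis=1)                               # normalise to mono
samples = samples.astype(np.float64) / (np.abs(samples).max() + 1e-12)

window_size = 2048
descriptors = [
    describe_window(samples[start:start + window_size], sample_rate)
    for start in range(0, len(samples) - window_size, window_size)
]
```

The flow-field stage then only needs to turn the noise value sampled at each grid point into an angle, for example `angle = 2 * math.pi * noise(x * scale, y * scale)`, so that neighbouring points flow in similar directions.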
- Modify `FlowFieldConfig` in `visualize.py` to adjust resolution, grid density, or step sizes globally (a hypothetical sketch of such a config follows this list).
- Experiment with different Perlin `scale` values or angle mappings to produce tighter or more relaxed flows.
- Incorporate additional audio descriptors (e.g. spectral roll-off) by expanding `audio.py` and mapping them to new visual dimensions.
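As a rough mental model, `FlowFieldConfig` might look something like the dataclass below; the actual field names and defaults in `visualize.py` may differ, so check the source before relying on them.

```python
# Hypothetical shape of FlowFieldConfig; the real fields in
# timbremind/visualize.py may be named and valued differently.
from dataclasses import dataclass

@dataclass
class FlowFieldConfig:
    width: int = 1920           # output resolution in pixels
    height: int = 1080
    columns: int = 80           # how densely flow lines are seeded
    rows: int = 45
    step_size: float = 4.0      # distance a line advances per step
    noise_scale: float = 0.005  # Perlin "zoom"; smaller values give smoother flows
```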
```bash
python -m timbremind.main demo.wav --output demo.png --resolution 1280 720 --grid 100 60 --seed 42
```

The command above creates a 1280×720 artwork using 100×60 seed points and a deterministic random seed so the result is repeatable.
- Ensure the input file is a `.wav`. Compressed formats such as MP3 are not directly supported; see the conversion tip below.
- Very long files can take a few seconds to analyse because the Fourier transform scales with the window count. Lower the `--grid` values or use a shorter excerpt for faster iteration.
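If `ffmpeg` is installed on your system, converting a compressed file to WAV first is a one-liner (file names below are placeholders):

```bash
ffmpeg -i input.mp3 output.wav
```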