Ozen

A Python-based acoustic analysis and annotation tool inspired by Praat, built for rapid waveform annotation with extended acoustic measurements.

Authors

Uriel Cohen Priva (@ucpresearch) - Design, testing, and vibe-coding
Claude (Anthropic) - Implementation

Features

Waveform and Spectrogram Display - Synchronized views with zoom/pan
Acoustic Overlays - Pitch, formants, intensity, center of gravity, HNR
Audio Playback - Play selections, visual cursor tracking
Annotation System - Multiple tiers, Praat TextGrid import/export
Data Collection Points - Click to mark positions and capture acoustic measurements
Undo Support - Ctrl+Z for boundary and label changes

Installation

macOS / Linux

# Clone the repository
git clone https://github.com/ucpresearch/ozen.git
cd ozen

# Create and activate virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

Windows

# Clone the repository
git clone https://github.com/ucpresearch/ozen.git
cd ozen

# Create and activate virtual environment
python -m venv .venv
.venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Note: On Windows, sounddevice includes PortAudio automatically. On macOS/Linux, you may need to install it separately:

macOS: brew install portaudio
Ubuntu/Debian: sudo apt install portaudio19-dev

Updating

cd ozen
git pull
pip install -r requirements.txt  # if dependencies changed

Remember to activate the virtual environment first: source .venv/bin/activate on macOS/Linux, or .venv\Scripts\activate on Windows.

Usage

Basic Usage

# Open the application
python -m ozen

# Open with an audio file
python -m ozen audio.wav

With TextGrid File

# Import existing TextGrid annotations
python -m ozen audio.wav annotations.TextGrid

With Predefined Tier Names

# Create tiers automatically when audio loads
python -m ozen audio.wav -t words,phones

With Custom Config File

# Use a custom configuration file
python -m ozen audio.wav -c myconfig.yaml

Config files can customize colors, formant presets, default tiers, and more. See ozen/config.py for available options.

Keyboard Shortcuts

Playback

Key	Action
Space	Play selection / pause
Escape	Stop playback
Tab	Play visible window

Navigation

Key	Action
↑ (Up arrow)	Zoom in
↓ (Down arrow)	Zoom out
← (Left arrow)	Pan left
→ (Right arrow)	Pan right
Scroll wheel	Zoom in/out (centered on cursor)
Horizontal scroll	Pan left/right

Annotation

Key	Action
Double-click	Add boundary at position
Enter	Add boundary at cursor position
Delete	Delete hovered boundary (highlighted in orange)
Ctrl+Z	Undo (add/delete boundary, text changes)
Escape	Deselect interval / close text editor
1-5	Switch to annotation tier 1-5

Annotation Workflow

Select an interval - Click on a tier to select an interval
Edit text - Type to add/edit the interval label
Add boundaries - Double-click or press Enter to split intervals
Delete boundaries - Hover over a boundary (turns orange) and press Delete
Play interval - Click the green play button on selected intervals

Data Collection Points

Action	Method
Add point	Double-click on spectrogram
Move point	Click and drag
Remove point	Right-click → "Remove"
Copy all points	Ctrl+C (copies visible measurements as TSV)
Export points	File > Export Point Information...
Import points	File > Import Point Information...

Data points capture acoustic measurements at specific time-frequency positions on the spectrogram. Pressing Ctrl+C copies all points to the clipboard as tab-separated values, limited to the measurements that are currently visible (checked in the overlay toggles). This makes it quick to paste the data into a spreadsheet.
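
For working with the copied data in a script rather than a spreadsheet, the tab-separated text can be read with Python's standard csv module. The following is a minimal sketch: the column names in the sample (time, pitch, F1, F2) are assumptions made for illustration, not Ozen's documented header, and the actual columns depend on which overlays are visible when you copy.

```python
import csv
import io

# Hypothetical clipboard contents. The header names are assumptions;
# check the first line of your own export for the real column names.
SAMPLE_TSV = (
    "time\tpitch\tF1\tF2\n"
    "0.512\t182.4\t640.0\t1720.5\n"
    "1.034\t175.9\t\t1810.2\n"
)


def to_number(cell):
    """Convert a cell to float; empty cells become None, other text is kept."""
    if cell is None or cell == "":
        return None
    try:
        return float(cell)
    except ValueError:
        return cell


def parse_points(tsv_text):
    """Parse tab-separated point data into a list of dicts, one per point."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [{key: to_number(value) for key, value in row.items()} for row in reader]


points = parse_points(SAMPLE_TSV)
for point in points:
    print(point)
```

Empty cells are mapped to None so that a measurement that could not be computed at a point (for example, pitch in an unvoiced stretch) does not break numeric processing later.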

File Operations

Key	Action
Ctrl+O	Open audio file
Ctrl+S	Save TextGrid (to current path, or prompts if none)
Ctrl+Shift+S	Save TextGrid as...
Ctrl+C	Copy all data points to clipboard (visible measurements only)

Save Behavior

Ctrl+S saves to the current TextGrid path if one exists (from opening a file or previous save)
If no path is set, Ctrl+S prompts for a location (same as Save As)
Auto-save: Every 60 seconds, annotations are saved to a .autosave backup file
Exit confirmation: If you have unsaved changes, you'll be prompted to save before closing
When starting with a non-existing TextGrid path, you'll be asked if you want to create it

Offline Rendering

Ozen includes a headless spectrogram renderer for generating publication-quality figures without the GUI. Useful for batch processing, scripting, and paper figures.

python -m ozen.render recording.wav -o fig.png --overlays pitch,formants --legend

Quick Examples

# Windowed view with annotations
python -m ozen.render recording.wav -o fig.pdf \
    --start 0.5 --end 2.0 \
    --overlays pitch,formants,intensity \
    --textgrid recording.TextGrid --tiers words,phones

# Wideband spectrogram with custom colormap
python -m ozen.render recording.wav -o fig.png \
    --bandwidth wideband --colormap inferno \
    --overlays pitch,formants --preset male

# Multiple colored data point sets
python -m ozen.render recording.wav -o fig.png \
    --overlays pitch,formants \
    --points red=midvowels.tsv --points blue=pitch-peaks.tsv

# Points as markers only (no vertical lines), semi-transparent
python -m ozen.render recording.wav -o fig.png \
    --overlays pitch,formants \
    --points "#4488CC"=vowels.tsv --point-markers-only --point-alpha 0.6

# Custom font for publication
python -m ozen.render recording.wav -o fig.pdf \
    --overlays pitch,formants --legend --font "Times New Roman"
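
For batch work, the same commands can be assembled from a script. The sketch below builds one command per time window using only the flags shown in the examples above (-o, --start, --end, --overlays, --textgrid, --tiers). The helper names render_command and run_all, the window list, and the output file names are hypothetical and chosen for illustration.

```python
import subprocess
import sys


def render_command(wav, out, start, end, overlays, textgrid=None, tiers=None):
    """Build the argument list for one `python -m ozen.render` call."""
    cmd = [
        sys.executable, "-m", "ozen.render", wav,
        "-o", out,
        "--start", str(start),
        "--end", str(end),
        "--overlays", ",".join(overlays),
    ]
    if textgrid:
        cmd += ["--textgrid", textgrid]
        if tiers:
            cmd += ["--tiers", ",".join(tiers)]
    return cmd


# Hypothetical time windows (in seconds) to render from one recording.
windows = [(0.5, 2.0), (2.0, 3.5)]

commands = [
    render_command(
        "recording.wav", f"fig_{i:02d}.pdf", start, end,
        ["pitch", "formants"], "recording.TextGrid", ["words", "phones"],
    )
    for i, (start, end) in enumerate(windows, start=1)
]


def run_all(cmds):
    """Run each command in turn, stopping at the first failure."""
    for cmd in cmds:
        subprocess.run(cmd, check=True)


for cmd in commands:
    print(" ".join(cmd))
# run_all(commands)  # uncomment to render; requires Ozen and the input files
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.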

Available Overlays

Name	Description	Display Range
`pitch`	Fundamental frequency (F0)	pitch-floor–ceiling Hz (log scale)
`formants`	Formant frequencies F1–F4	direct Hz (red=narrow, pink=wide bandwidth)
`intensity`	Sound pressure level	30–90 dB
`cog`	Center of gravity (spectral centroid)	direct Hz
`hnr`	Harmonics-to-noise ratio	−10–40 dB
`spectral_tilt`	Low vs high frequency energy	−20–+40 dB
`a1p0`	A1–P0 nasal ratio	−20–+20 dB
`nasal_murmur`	Low-frequency energy ratio	0–1

Data Point Options

Option	Description
`--points [COLOR=]FILE`	Data point TSV file (repeatable). Colors: names (`red`), hex (`#4488CC`, `#FF880080` with alpha), grayscale (`0.6`)
`--point-markers-only`	Draw only circle markers, omit vertical lines
`--point-alpha FLOAT`	Opacity for all points, 0.0–1.0 (default: 1.0)
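
To make the [COLOR=]FILE syntax concrete, the sketch below splits a --points value into its color and file parts and classifies the color by the three forms listed in the table. It only illustrates the documented syntax and is not Ozen's own parser; the function names split_point_spec and color_kind are hypothetical.

```python
import re

# Six hex digits (RGB) or eight (RGB plus alpha), as in #4488CC or #FF880080.
HEX_RE = re.compile(r"^#(?:[0-9A-Fa-f]{6}|[0-9A-Fa-f]{8})$")


def split_point_spec(spec):
    """Split 'COLOR=FILE' at the first '='; return (None, FILE) if no color.

    Caveat of this sketch: a file name that itself contains '=' and has no
    color prefix would be split incorrectly.
    """
    color, sep, path = spec.partition("=")
    if not sep:
        return None, spec
    return color, path


def color_kind(color):
    """Classify a color string as 'hex', 'grayscale', 'name', or 'invalid'."""
    if color.startswith("#"):
        return "hex" if HEX_RE.match(color) else "invalid"
    try:
        value = float(color)
    except ValueError:
        return "name"
    return "grayscale" if 0.0 <= value <= 1.0 else "invalid"


for spec in ["red=midvowels.tsv", "#FF880080=peaks.tsv", "0.6=all.tsv", "vowels.tsv"]:
    color, path = split_point_spec(spec)
    print(path, color, color_kind(color) if color is not None else "default")
```

On the command line, a hex color needs quoting (as in the "#4488CC"=vowels.tsv example above) because most shells treat an unquoted # as the start of a comment.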

Figure Options

Option	Description
`--font FAMILY`	Font family for all text (e.g., `Times New Roman`, `Helvetica`)
`--legend`	Show legend with overlay names, colors, and value ranges
`--title TEXT`	Figure title
`--width`, `--height`	Figure dimensions in inches
`--dpi NUMBER`	DPI for raster output (default: 300)

Output Formats

PNG, PDF, SVG, and EPS.

Python API

from ozen.render import render_spectrogram

render_spectrogram(
    'recording.wav', 'fig.png',
    overlays=['pitch', 'formants'],
    textgrid_path='recording.TextGrid',
    legend=True,
    font='Helvetica',
    point_markers_only=True,
    point_alpha=0.8,
)

For the full manual with all options: python -m ozen.render --man
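
The render_spectrogram call above extends naturally to a folder of recordings. The sketch below pairs each WAV file with an output path and with a TextGrid of the same name if one exists, then renders them in a loop. It uses only the keyword arguments shown above; the helper names plan_renders and render_all and the directory layout are assumptions for illustration, and whether textgrid_path may be omitted is assumed from the CLI, where --textgrid is optional.

```python
from pathlib import Path


def plan_renders(audio_dir, out_dir, suffix=".png"):
    """Pair each WAV file with an output path and a same-named TextGrid, if any."""
    audio_dir, out_dir = Path(audio_dir), Path(out_dir)
    jobs = []
    for wav in sorted(audio_dir.glob("*.wav")):
        textgrid = wav.with_suffix(".TextGrid")
        jobs.append({
            "wav": str(wav),
            "out": str(out_dir / (wav.stem + suffix)),
            "textgrid": str(textgrid) if textgrid.exists() else None,
        })
    return jobs


def render_all(jobs):
    """Render every planned job; requires Ozen to be installed."""
    # Imported here so that planning can be tested without Ozen installed.
    from ozen.render import render_spectrogram

    for job in jobs:
        kwargs = {"overlays": ["pitch", "formants"], "legend": True}
        if job["textgrid"]:
            kwargs["textgrid_path"] = job["textgrid"]
        render_spectrogram(job["wav"], job["out"], **kwargs)


# Example (hypothetical directories):
# render_all(plan_renders("recordings", "figures"))
```

Separating planning from rendering lets you inspect the list of jobs before committing to a long batch run.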

Supported Formats

Audio

WAV, FLAC, OGG, MP3

Annotations

Praat TextGrid (.TextGrid, .txt)

Requirements

Python 3.9+
PyQt6
pyqtgraph
praatfan (acoustic analysis - MIT licensed)
sounddevice
numpy, scipy
soundfile

Optional Acoustic Backends

Ozen supports multiple acoustic analysis backends. The default (praatfan) is pure Python and works everywhere. For better performance or compatibility, install additional backends:

Backend	Install	License	Notes	Repository
Praatfan (slow)	Included	MIT	Pure Python, portable	Praatfan
Praatfan (fast)	use the release page	MIT	Rust, ~10x faster	Praatfan
Praatfan (GPL)	use the release page	GPL	Rust, from praatfan-core-rs	Praatfan GPL
Praat (via Parselmouth)	`pip install praat-parselmouth`	GPL	Original Praat bindings	Praat, Parselmouth Website

Switch backends in the UI via the Backend dropdown, or set analysis.acoustic_backend in your config file.

Known Issues

macOS: Audio noise during playback

Setting waveform_line_width to a value greater than 1 in the config causes audio static/noise during playback on macOS. This appears to be a bug in the interaction between Qt/pyqtgraph rendering and CoreAudio rather than an issue with Ozen itself. The default of 1 works fine; if you use a custom config file, keep this value at 1.

Acknowledgments

Ozen relies on the following projects for acoustic analysis:

praatfan - Clean-room reimplementation of Praat's acoustic algorithms:

https://github.com/ucpresearch/praatfan-core-clean

Praat - The gold standard for phonetic analysis:

Boersma, Paul & Weenink, David (2024). Praat: doing phonetics by computer [Computer program]. Retrieved from http://www.praat.org/

Parselmouth - Python bindings for Praat (optional backend):

Jadoul, Y., Thompson, B., & de Boer, B. (2018). Introducing Parselmouth: A Python interface to Praat. Journal of Phonetics, 71, 1-15. https://doi.org/10.1016/j.wocn.2018.07.001

License

MIT - see LICENSE for details.

Note: The default acoustic backend (praatfan) is MIT-licensed, making Ozen fully MIT-compatible out of the box. If you install optional GPL backends (praat-parselmouth or praatfan_gpl), a deployment that uses those backends is subject to the terms of the GPL.
