CrowCrow main UI (screenshot)
CrowCrow is an experimental, browser-based parametric articulatory speech synthesizer.
This repo is not under active development, but is preserved as an artifact for reference, inspiration, or further research.
Note:
Like many experimental projects, some paths lead nowhere but still inform future work.
CrowCrow began as an attempt to build a fully parametric TTS synth that could be AI-driven for future embodiment, as an alternative to current dataset/diffusion-based TTS.
However, the synth itself was never the main focus, and it lacks essential features such as a phonemizer, coarticulation, and prosody (which are better served by a different architecture), so a full rewrite would be needed for production use.
Companion Post:
Embodied speech for AIs — A short introduction to the ideas behind this project.
- Real-time articulatory speech synthesis in the browser (Web Audio API + AudioWorklet)
- Modular, extensible audio engine (Glottis, Tract, Nasal, Click, Tap, Trill, etc.)
- Partial IPA phoneme coverage (vowels, plosives, fricatives, nasals, clicks, trills, taps, etc.)
- Interactive UI: phoneme buttons, tract visualizer, oscilloscope, parameter sliders
- Phoneme text input and sequencer for custom utterances
- Designed for research, education, and creative coding
👉 Try CrowCrow on GitHub Pages
Just visit: https://kenoleon.github.io/crowcrow/
git clone https://github.com/kenoleon/crowcrow.git
cd crowcrow
npm install
npm run build
npm start
# Then open http://localhost:8080/
- Click "Start Synth" first, or you won't hear anything.
- Click phoneme buttons to hear individual sounds.
- Move the master Gain to hear a continuous waveform.
- Adjust sliders for pitch, tenseness, tract shape, etc.
- Enter a sequence in the phoneme text input and press "Play" to synthesize custom utterances.
- Watch the tract visualizer and oscilloscope update in real time.
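The phoneme text input above accepts a sequence of symbols to synthesize. A minimal sketch of how such an input could be tokenized into playable phonemes (the actual CrowCrow parser may differ; the function name and greedy longest-match strategy are assumptions):

```javascript
// Tokenize a phoneme string into known symbols, matching multi-character
// IPA symbols (e.g. affricates) greedily before single characters.
function tokenizePhonemes(input, knownPhonemes) {
  const sorted = [...knownPhonemes].sort((a, b) => b.length - a.length); // longest first
  const tokens = [];
  let i = 0;
  while (i < input.length) {
    if (input[i] === ' ') { i++; continue; } // spaces separate phonemes
    const match = sorted.find(p => input.startsWith(p, i));
    if (!match) throw new Error(`Unknown phoneme at index ${i}: "${input[i]}"`);
    tokens.push(match);
    i += match.length;
  }
  return tokens;
}

// tokenizePhonemes('tʃa', ['t', 'tʃ', 'a']) → ['tʃ', 'a']
```

Each token can then be looked up in the phoneme map and scheduled on the audio graph.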
[subglottalNode]
│
[noiseNode]─────┐
│ │
[glottisNode] │
│ │
[chestinessNode]│
│ │
[transientNode] │
│ │ │
│ └─────────────┐
│ │
[tractNode] [nasalNode]
│ │
└─────┬─────┬───────┘
│ │
[clickNode] │
[tapNode] │
[trillNode] │
│ │
[summingNode]
│
[gainNode]
│
[masterGainNode]
│
[analyserNode]
│
[audioContext.destination]
Legend:
- subglottalNode: Simulates subglottal pressure (lungs)
- noiseNode: Generates noise for fricatives, aspiration, etc.
- glottisNode: Simulates vocal fold vibration and voicing
- chestinessNode: Adds chest resonance
- transientNode: Handles plosive bursts and transients
- tractNode: Main vocal tract filter (8 zones, 44 segments)
- nasalNode: Simulates nasal tract coupling
- clickNode, tapNode, trillNode: Special consonant bursts
- summingNode: Sums oral, nasal, click, tap, and trill outputs
- gainNode, masterGainNode: Output gain control
- analyserNode: For oscilloscope/visualization
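The graph above can be sketched in code. In the real app each stage is an AudioWorkletNode; here a tiny stub stands in for illustration so only the topology is shown (the wiring follows the diagram above; the stub class and helper names are assumptions, not CrowCrow's actual API):

```javascript
// Minimal stand-in for an audio node: records outgoing connections.
class StubNode {
  constructor(name) { this.name = name; this.outputs = []; }
  connect(dest) { this.outputs.push(dest); return dest; }
}

// In the browser this would instead be something like:
//   const ctx = new AudioContext();
//   await ctx.audioWorklet.addModule('audio/processors/glottis.js'); // path assumed
//   const glottisNode = new AudioWorkletNode(ctx, 'glottis');
const nodes = Object.fromEntries(
  ['subglottal', 'noise', 'glottis', 'chestiness', 'transient', 'tract', 'nasal',
   'click', 'tap', 'trill', 'summing', 'gain', 'masterGain', 'analyser', 'destination']
    .map(n => [n, new StubNode(n)])
);

// Source chain: lungs -> noise -> vocal folds -> chest resonance -> transients -> tract
nodes.subglottal.connect(nodes.noise);
nodes.noise.connect(nodes.glottis);
nodes.glottis.connect(nodes.chestiness);
nodes.chestiness.connect(nodes.transient);
nodes.transient.connect(nodes.tract);

// Noise also feeds the nasal branch directly.
nodes.noise.connect(nodes.nasal);

// Oral, nasal, and special-consonant generators are summed together.
[nodes.tract, nodes.nasal, nodes.click, nodes.tap, nodes.trill]
  .forEach(src => src.connect(nodes.summing));

// Output chain: summing -> gain -> master gain -> analyser -> speakers.
nodes.summing.connect(nodes.gain);
nodes.gain.connect(nodes.masterGain);
nodes.masterGain.connect(nodes.analyser);
nodes.analyser.connect(nodes.destination);
```

Keeping the analyser as the last stage before the destination is what lets the oscilloscope visualize exactly the signal being played.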
Processors:
- public/audio/processors/: AudioWorkletProcessors for each stage
- src/main.js: Main controller, node creation, UI, parameter flow
- src/state/phonemeMap.js: Phoneme definitions and parameters
- All phonemes are defined in src/state/phonemeMap.js using articulator parameters.
- You can add or modify phonemes by editing this file and the UI.
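As a rough illustration of what articulator-parameter entries could look like (the field names and values below are hypothetical; the real src/state/phonemeMap.js will differ):

```javascript
// Hypothetical phoneme map: each symbol maps to articulator targets.
const phonemeMap = {
  // Open vowel: voiced, open tract, no constriction noise.
  a: { voiced: true, tenseness: 0.6, tract: { tongueIndex: 14, tongueDiameter: 2.9 } },
  // Voiceless fricative: noise source at a narrow constriction.
  s: { voiced: false, frication: 0.9, tract: { constrictionIndex: 36, constrictionDiameter: 0.4 } },
  // Nasal: voiced, oral closure with the nasal port open.
  m: { voiced: true, tenseness: 0.7, nasal: true, tract: { lipClosure: true } },
};

// Looking up a phoneme's parameters before sending them to the audio graph.
function getPhonemeParams(symbol) {
  const entry = phonemeMap[symbol];
  if (!entry) throw new Error(`Unknown phoneme: ${symbol}`);
  return entry;
}
```

Adding a new phoneme then amounts to adding an entry here and wiring a UI button to it.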
- Build: npm run build
- Dev server: npm start
- Deploy: push to the main branch; GitHub Actions will auto-deploy to Pages.
MIT License (see LICENSE)
- Created by Keno Leon
- Inspired by Pink Trombone and other open-source speech synthesis projects
This repo is experimental and will not be developed further.
It is preserved as a reference artifact for the community.
- Coarticulation and expressive prosody
- Voice customization and presets
- AI-driven parameterization
- Improved UI/UX and accessibility
Feel free to fork or reference for your own projects!

