catpea/plosive

Plosive

A browser-based voice recording app for reading scripts aloud, one cue card at a time.

Paste a script, poem, or narration. Plosive splits it into cue cards on blank lines. Step through each card, record your voice, trim silence, preview playback, and export the final audio.

Getting Started

npm run dev

Opens on http://localhost:8070. Paste your text, click Start, and allow microphone access.

No build step. No dependencies. Just Node.js.

How It Works

  1. Paste text -- blank lines (\n\n) become card boundaries, --- becomes a section break
  2. Start -- creates a project, opens the microphone once (hot mic, no re-init)
  3. Record -- each card holds one or more audio segments
  4. Trim -- auto-detect silence or manually select regions to trim
  5. Preview -- click-and-hold the time ruler to scrub, or press P/Space to play
  6. Export -- Build merges all segments, applies trims, and encodes to a downloadable file
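
The first step can be sketched as a small parser. This is only an illustration, not the code in cue-cards.js: the function name `parseScript` and the `breakAfter` field are hypothetical, and it assumes `---` sits in its own block, separated from the cards by blank lines.

```javascript
// Hypothetical helper: split a pasted script into cue cards.
// Blank lines separate cards; a block containing only "---" marks a
// section break after the preceding card.
function parseScript(text) {
  const cards = [];
  // Normalize Windows line endings, then split on runs of blank lines.
  const blocks = text.replace(/\r\n/g, "\n").split(/\n\s*\n/);
  for (const block of blocks) {
    const trimmed = block.trim();
    if (trimmed === "") continue;
    if (trimmed === "---") {
      // Flag the previous card instead of creating a new one.
      if (cards.length > 0) cards[cards.length - 1].breakAfter = true;
      continue;
    }
    cards.push({ text: trimmed, breakAfter: false });
  }
  return cards;
}
```

Calling `parseScript("Line one\n\nLine two\n\n---\n\nLine three")` under these assumptions yields three cards, with the break flag set on the second.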

Keyboard Shortcuts

Key       Action
R         Record / Stop recording
P         Play current card
Space     Play selection, or full card
C         Clear all segments on current card
E         Edit card text
I         Attach image to card
D         Delete attached image
X         Create trim from selection / Remove selected trim
A         Auto-trim silence
Q         Toggle section break after card
L         Toggle fluid / fixed layout
,         Open config
Esc       Cancel editing / Close config
Arrows    Navigate between cards

The shortcut bar at the bottom of the screen updates with context-sensitive actions.

Features

Spectrogram

Every recorded card gets a real-time spectrogram computed via Radix-2 FFT with Hann windowing. The spectrogram is scrollable, zoomable (auto-zooms when audio exceeds canvas width), and supports:

  • Time ruler with tick marks and second labels, click-and-hold to preview
  • Trim markers with draggable handles on both edges
  • Selection for targeted playback or trimming
  • Play position indicator with center-biased auto-scrolling
  • Scrollbar with thumb drag and track click
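
For readers unfamiliar with the technique, one spectrogram column can be computed roughly as follows. This is a generic sketch of Hann windowing followed by an iterative radix-2 FFT, not the code in components/spectrogram.js; the names `hannWindow`, `fftRadix2`, and `spectrumColumn` are hypothetical, and the frame length is assumed to be a power of two.

```javascript
// Symmetric Hann window of the given size.
function hannWindow(size) {
  const w = new Float64Array(size);
  for (let n = 0; n < size; n++) {
    w[n] = 0.5 * (1 - Math.cos((2 * Math.PI * n) / (size - 1)));
  }
  return w;
}

// In-place iterative radix-2 FFT. re and im hold the real and imaginary
// parts and must share a power-of-two length.
function fftRadix2(re, im) {
  const n = re.length;
  if (n < 2 || (n & (n - 1)) !== 0) {
    throw new Error("length must be a power of two");
  }
  // Bit-reversal permutation.
  for (let i = 1, j = 0; i < n; i++) {
    let bit = n >> 1;
    for (; j & bit; bit >>= 1) j ^= bit;
    j ^= bit;
    if (i < j) {
      [re[i], re[j]] = [re[j], re[i]];
      [im[i], im[j]] = [im[j], im[i]];
    }
  }
  // Butterfly passes over sub-transforms of length 2, 4, 8, ...
  for (let len = 2; len <= n; len <<= 1) {
    const ang = (-2 * Math.PI) / len;
    for (let start = 0; start < n; start += len) {
      for (let k = 0; k < len / 2; k++) {
        const wr = Math.cos(ang * k);
        const wi = Math.sin(ang * k);
        const a = start + k;
        const b = a + len / 2;
        const tr = re[b] * wr - im[b] * wi;
        const ti = re[b] * wi + im[b] * wr;
        re[b] = re[a] - tr;
        im[b] = im[a] - ti;
        re[a] += tr;
        im[a] += ti;
      }
    }
  }
}

// Magnitudes of the first n/2 bins for one windowed frame of samples.
function spectrumColumn(frame) {
  const n = frame.length;
  const win = hannWindow(n);
  const re = new Float64Array(n);
  const im = new Float64Array(n);
  for (let i = 0; i < n; i++) re[i] = frame[i] * win[i];
  fftRadix2(re, im);
  const mags = new Float64Array(n / 2);
  for (let k = 0; k < n / 2; k++) mags[k] = Math.hypot(re[k], im[k]);
  return mags;
}
```

A full spectrogram would repeat `spectrumColumn` over successive (usually overlapping) frames and map each magnitude to a pixel color; the hop size and color mapping are left out here.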

Non-destructive Trimming

Trim markers mask regions of audio without modifying the original recording. Trimmed regions are completely skipped during playback -- no silent gaps. Press X on a canvas selection to create a trim, or click a trim marker and press X to remove it. Press A to auto-detect and trim silence.
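
One way to picture non-destructive trimming: the recording stays intact, and playback walks only the regions that no trim covers. The sketch below assumes trims are stored as `{ start, end }` pairs in seconds; that shape and the name `keptRegions` are hypothetical.

```javascript
// Given a recording duration and a list of trim markers, return the
// regions that should actually be played, in order. Overlapping or
// out-of-range trims are tolerated.
function keptRegions(duration, trims) {
  const sorted = trims
    .map((t) => ({
      start: Math.max(0, t.start),
      end: Math.min(duration, t.end),
    }))
    .filter((t) => t.end > t.start)
    .sort((a, b) => a.start - b.start);

  const kept = [];
  let cursor = 0; // end of the last trimmed region seen so far
  for (const t of sorted) {
    if (t.start > cursor) kept.push({ start: cursor, end: t.start });
    cursor = Math.max(cursor, t.end);
  }
  if (cursor < duration) kept.push({ start: cursor, end: duration });
  return kept;
}
```

Playing the kept regions back to back is what removes the gaps; deleting a trim marker simply changes the result of the next call.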

Cue Breaks

Press Q to toggle a section break (dashed line) after the current card. Breaks insert 500ms of silence in the final export.
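
To make the 500ms figure concrete, the sketch below converts a break into a block of silent samples. It assumes the export works on raw PCM at a known sample rate, which the README does not state; `silenceForBreak` is a hypothetical name.

```javascript
const BREAK_MS = 500; // break length stated above

// Return a zero-filled buffer covering one section break.
function silenceForBreak(sampleRate, breakMs = BREAK_MS) {
  return new Float32Array(Math.round((sampleRate * breakMs) / 1000));
}
```

At 48000 samples per second this gives a 24000-sample buffer.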

Image Attachments

Press I to attach an image to a card. Images appear as thumbnails and are used by the slideshow export to time visuals to audio.

Live Preview

During recording, a live waveform or spectrogram preview appears on the current card. Toggle between modes in the config panel.

Project Persistence

When running with the server, projects are saved automatically:

  • Audio segments saved on recording stop
  • Images saved on attach
  • Trim markers, card state, and breaks saved on every change
  • Reload the page (F5) to restore full state

Project Structure

plosive/
  index.html              App shell
  style.css               All styles
  cue-cards.js            Main application (plugin architecture)
  server.js               Project management server
  plosive.config.json     Tunable parameters
  lib/
    bus.js                EventEmitter, Signal, App
  components/
    spectrogram.js        <plosive-spectrogram> web component
    preview.js            <plosive-preview> live recording visualization
    shortcut-bar.js       <plosive-shortcuts> context-sensitive shortcut bar
    config-modal.js       <plosive-config> settings modal with focus trap
  samples/
    poem-example.md       Example script
  projects/               Server-managed project storage

Architecture

The app uses a plugin architecture built on a minimal EventEmitter:

var app = new App();
app.use(parserPlugin);
app.use(navigationPlugin);
app.use(keyboardPlugin);
app.use(micPlugin);
app.use(recorderPlugin);
app.use(playerPlugin);
// ...

Each plugin receives the app instance and wires up event listeners, keyboard shortcuts, and state mutations. Plugins communicate through the event bus -- no direct coupling.
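
The pattern can be illustrated with a stripped-down bus. This is a sketch of the idea only, not the contents of lib/bus.js: `MiniEmitter`, `MiniApp`, the two demo plugins, and the event names are all hypothetical.

```javascript
// Minimal event emitter: named events, ordered listeners.
class MiniEmitter {
  constructor() {
    this.listeners = new Map();
  }
  on(event, fn) {
    if (!this.listeners.has(event)) this.listeners.set(event, []);
    this.listeners.get(event).push(fn);
    return this;
  }
  emit(event, payload) {
    for (const fn of this.listeners.get(event) || []) fn(payload);
    return this;
  }
}

// The app is just an emitter that lets plugins wire themselves in.
class MiniApp extends MiniEmitter {
  use(plugin) {
    plugin(this);
    return this;
  }
}

// Two plugins that know only event names, never each other.
function demoRecorderPlugin(app) {
  app.on("key:R", () => app.emit("recording:toggled"));
}
function demoLogPlugin(app) {
  app.log = [];
  app.on("recording:toggled", () => app.log.push("toggled"));
}

const demoApp = new MiniApp();
demoApp.use(demoRecorderPlugin).use(demoLogPlugin);
demoApp.emit("key:R"); // demoApp.log now holds one entry
```

Because the plugins share only event names, either one can be removed or replaced without touching the other.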

Web components (<plosive-spectrogram>, <plosive-shortcuts>, <plosive-preview>, <plosive-config>) encapsulate UI that needs its own lifecycle.

Server Routes

Method    Path                            Description
GET       /                               Redirect to /new
GET       /new                            Empty project page
GET       /projects                       Project list (HTML)
GET       /project/:id                    Load project
GET       /project/:id.json               Full project state (JSON)
POST      /api/projects                   Create project
PUT       /api/project/:id/state          Save state
PUT/GET   /api/project/:id/audio/:file    Audio segments
PUT/GET   /api/project/:id/image/:file    Image attachments
GET       /project/:id/filelist.txt       FFmpeg concat demuxer file list
GET       /project/:id/slideshow.sh       Generated slideshow build script
GET       /health                         Health check
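
As an example of talking to these routes, the sketch below builds a request for the state route. The JSON body, the Content-Type header, and the shape of `state` are assumptions, since the README does not document the payload; `buildSaveRequest` is a hypothetical helper.

```javascript
// Build the URL and fetch options for PUT /api/project/:id/state.
// The payload format is assumed to be JSON.
function buildSaveRequest(projectId, state) {
  return {
    url: "/api/project/" + encodeURIComponent(projectId) + "/state",
    options: {
      method: "PUT",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(state),
    },
  };
}
```

In the browser this could be used as `const { url, options } = buildSaveRequest(id, state); fetch(url, options);`.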

FFmpeg Integration

Concatenation

filelist.txt is served in ffmpeg concat demuxer format, so it can be passed directly to ffmpeg:

ffmpeg -f concat -safe 0 -i filelist.txt -c copy combined.webm
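
The concat demuxer expects one `file '<path>'` directive per line. A list in that format could be generated along these lines; `buildFileList` and the example paths are hypothetical, and the real server may order or name segments differently.

```javascript
// Produce concat demuxer input: one "file '<path>'" line per segment.
// A single quote inside a path is written as '\'' per ffmpeg's quoting rules.
function buildFileList(paths) {
  return (
    paths
      .map((p) => "file '" + p.replace(/'/g, "'\\''") + "'")
      .join("\n") + "\n"
  );
}
```

For instance, `buildFileList(["audio/card-1.webm", "audio/card-2.webm"])` returns two `file` lines, one per segment.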

Slideshow

slideshow.sh is a generated bash script that:

  1. Concatenates all audio segments
  2. Probes per-card durations
  3. Overlays attached images timed to their cards
  4. Outputs an H.264/AAC MP4 with loudness normalization

cd projects/<uuid>/
bash slideshow.sh output.mp4

Configuration

plosive.config.json provides tunable defaults:

{
  "autoTrim": {
    "threshold": 0.01,
    "maxSilenceMs": 500,
    "paddingMs": 250
  },
  "ruler": {
    "tickIntervalSec": 0.5
  }
}

  • threshold -- RMS level below which audio is considered silence
  • maxSilenceMs -- minimum silence duration to create a trim marker
  • paddingMs -- padding kept on each side of a silence region
  • tickIntervalSec -- time ruler tick spacing
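
How the three autoTrim parameters might interact is sketched below. This is one plausible reading of the descriptions above, not the app's actual algorithm: the 10ms analysis window, the name `detectSilence`, and the `{ startMs, endMs }` output shape are assumptions.

```javascript
// Scan mono samples in short windows, find quiet runs, and return trim
// regions shrunk by the configured padding on each side.
function detectSilence(samples, sampleRate, cfg, windowMs = 10) {
  const win = Math.max(1, Math.round((sampleRate * windowMs) / 1000));
  const trims = [];
  let runStart = -1; // sample index where the current quiet run began

  const closeRun = (endSample) => {
    const startMs = (runStart / sampleRate) * 1000;
    const endMs = (endSample / sampleRate) * 1000;
    // Only runs at least maxSilenceMs long produce a trim.
    if (endMs - startMs >= cfg.maxSilenceMs) {
      const s = startMs + cfg.paddingMs;
      const e = endMs - cfg.paddingMs;
      if (e > s) trims.push({ startMs: s, endMs: e });
    }
    runStart = -1;
  };

  for (let i = 0; i < samples.length; i += win) {
    const end = Math.min(i + win, samples.length);
    let sum = 0;
    for (let j = i; j < end; j++) sum += samples[j] * samples[j];
    const rms = Math.sqrt(sum / (end - i));
    if (rms < cfg.threshold) {
      if (runStart < 0) runStart = i;
    } else if (runStart >= 0) {
      closeRun(i);
    }
  }
  if (runStart >= 0) closeRun(samples.length);
  return trims;
}
```

With the defaults shown above, a one-second pause would become a 500ms trim in its middle, leaving 250ms of the pause on each side.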

License

MIT
