GUI text-based speech and music editor for creating radio/audio stories

Integrated Radio/Audio Editor




Text-based speech editor

  • Text/speech alignment
  • Standard cut/copy/paste/delete metaphors
  • Pause and breath identification and insertion
  • Duplicate sentence detection

Music selection

Music remixing


The speech editor app uses a client/server model. The client, a JavaScript web app, handles all of the interaction. As you edit the speech, the web app updates the underlying state of the audio composition. To actually generate the audio for the composition, the web app sends a request to the server (/reauthor), which builds the audio.
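As a rough illustration of that round trip, the sketch below packages the composition state into a request for /reauthor. Only the endpoint path comes from this README. The helper name buildReauthorRequest, the field names speechText and musicTracks, and the choice of a JSON POST body are assumptions made for illustration, not the app's actual wire format.

```javascript
// Hypothetical sketch: package the composition state for the /reauthor endpoint.
// The field names below are illustrative assumptions, not the real payload.
function buildReauthorRequest(composition) {
    return {
        url: "/reauthor",
        options: {
            method: "POST",
            headers: { "Content-Type": "application/json" },
            body: JSON.stringify({
                speechText: composition.speechText,
                musicTracks: composition.musicTracks || []
            })
        }
    };
}

// Possible use in a browser (not executed here):
//   const req = buildReauthorRequest(state);
//   fetch(req.url, req.options).then(function (response) { /* play audio */ });
```

Keeping the request construction separate from the network call makes the client state easy to inspect before anything is sent to the server.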


speecheditor.js - main front-end JavaScript code

Defines TAAPP, a global variable that controls the state of the app. Key functions of TAAPP include loadSite, newProject, generateAudio, createUnderlay, and drawScript. Most of the functions in this file have fairly descriptive names.

Here are the key things that happen when the site loads (this can be found at the end of the file):

    // launch the project creation modal dialog
    ...
        show: (speech === "")
    ...
    .click(function () { ... });

    // start a new project if it was specified in the url
    if (speech !== "") { ... }

    // initialize everything that doesn't depend on the speech track
    ...

edible - timeline and waveform plugin

edible is a jQuery UI plugin that I wrote to represent the waveforms and timeline in the interface. The app has a few different kinds of waveforms: edible.musicWaveform.js, edible.textAlignedWaveform.js, and edible.waveform.js, all of which inherit from edible.wfBase.js. There's also edible.timeline.js, which is the timeline itself.

The waveforms are rendered as HTML5 canvas objects.

textAreaManager - manage the text areas that contain speech

textAreaManager (often referred to as TAM throughout the code) manages the text areas in the UI. These contain the text that can be edited. It's responsible for editing and highlighting the text as the audio plays.
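Highlighting text during playback comes down to mapping the current playback time to a word. The sketch below shows one way to do that lookup with a binary search. It is not taken from textAreaManager: the function name wordIndexAt and the alignment format (one entry per word with start and end times in seconds, sorted and non-overlapping) are assumptions for illustration.

```javascript
// Hypothetical alignment entries: one per word, sorted by start time (seconds),
// with non-overlapping [start, end) intervals.
// Returns the index of the word spoken at `time`, or -1 if none (e.g. a pause).
function wordIndexAt(words, time) {
    let lo = 0;
    let hi = words.length - 1;
    while (lo <= hi) {
        const mid = (lo + hi) >> 1;
        if (time < words[mid].start) {
            hi = mid - 1;       // target is earlier than this word
        } else if (time >= words[mid].end) {
            lo = mid + 1;       // target is later than this word
        } else {
            return mid;         // start <= time < end
        }
    }
    return -1;
}

// Made-up example data for illustration only.
const exampleWords = [
    { text: "hello", start: 0.0, end: 0.4 },
    { text: "radio", start: 0.5, end: 1.1 },
    { text: "world", start: 1.1, end: 1.6 }
];
```

A caller could run wordIndexAt on each playback time update and highlight the word at the returned index, clearing the highlight when the result is -1.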

You can, for example, see the keyboard shortcuts defined in the ScriptArea constructor. This gives you a sense of what you can do within a textarea.

The TAM is created in the TAAPP.reset function in speecheditor.js.

musicbrowser - sub-app for the music browser

This folder contains the entire music browser app.

Python back-end

This is the main server for the web app. Its primary functions are to serve the static web app pages and to generate audio (and do any intense background processing, like music retargeting). The server exposes these routes:

  • /: serves the main web app (index.html)
  • /reauthor: generates the complete audio for the edited story (activated by rendering/pressing play/pressing enter in the web app)
  • /download/<name>: used to download generated audio (activated by download button in web app)
  • /dupes: detects duplicate lines in the script (activated when a script is loaded in the web app)
  • /changepoints/<song_name>: finds music change points in a song
  • /underlayRetarget...: generates a retargeted musical underlay for the story
  • /uploadSong: uploads and analyzes a song
  • /alignment/<name>: returns the pre-computed transcript-to-speech alignment for the speech track
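For the routes above that take a name in the path, a client needs to build the URL from a track or song name. The sketch below shows one way to do that. Only the route paths come from the list. The helper names routeUrl and routes are hypothetical, and percent-encoding the name is an assumption about what the server accepts.

```javascript
// Build URLs for the parameterized routes listed above.
// Helper names are hypothetical; only the route prefixes come from the list.
function routeUrl(prefix, name) {
    // Assumption: the server accepts percent-encoded names in the path.
    return prefix + "/" + encodeURIComponent(name);
}

const routes = {
    download: function (name) { return routeUrl("/download", name); },
    changepoints: function (songName) { return routeUrl("/changepoints", songName); },
    alignment: function (name) { return routeUrl("/alignment", name); }
};
```

Encoding the name keeps spaces and other reserved characters from breaking the path.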