a video theremin based on PoseNet


Veremin is a video theremin based on PoseNet and the brainchild of John Cohn.

It builds upon the PoseNet Camera Demo and modifies it to allow you to make music by moving your hands/arms in front of a web camera.

PoseNet is used to predict the location of your wrists within the video. The app takes the predictions and converts them to tones in the browser or to MIDI values which get sent to a connected MIDI device.
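As a rough illustration of the first half of that pipeline, the sketch below pulls the wrist positions out of a pose object. The pose shape (a `keypoints` array of `{ part, score, position }` entries) follows PoseNet's single-pose output; the `findWrists` helper, the `minScore` threshold, and the sample data are hypothetical and are not taken from the Veremin source.

```javascript
// Hypothetical helper: pick the wrist keypoints out of a PoseNet-style pose.
// minScore is an assumed confidence threshold, not a value from Veremin.
function findWrists(pose, minScore = 0.5) {
  const wrists = { leftWrist: null, rightWrist: null };
  for (const keypoint of pose.keypoints) {
    const isWrist = keypoint.part === 'leftWrist' || keypoint.part === 'rightWrist';
    // Keep a wrist only when the model is reasonably confident about it.
    if (isWrist && keypoint.score >= minScore) {
      wrists[keypoint.part] = keypoint.position;
    }
  }
  return wrists;
}

// Hand-written sample pose (not real model output).
const samplePose = {
  score: 0.9,
  keypoints: [
    { part: 'nose', score: 0.99, position: { x: 320, y: 120 } },
    { part: 'leftWrist', score: 0.85, position: { x: 150, y: 300 } },
    { part: 'rightWrist', score: 0.2, position: { x: 480, y: 310 } }
  ]
};

const wrists = findWrists(samplePose);
console.log(wrists);
```

In this sample the right wrist falls below the assumed threshold, so only the left wrist position is returned.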

Browsers must allow access to the webcam and support the Web Audio API. Optionally, to integrate with a MIDI device the browser will need to support the Web MIDI API (e.g., Chrome browser version 43 or later).

Watch the video

Featured tools & technologies

  • PoseNet - a machine learning model which allows for real-time human pose estimation in the browser
  • TensorFlow.js - a JavaScript library for training and deploying ML models in the browser and on Node.js
  • Web MIDI API - an API supporting the MIDI protocol, enabling web applications to enumerate and select MIDI input and output devices on the client system and send and receive MIDI messages
  • Web Audio API - a high-level Web API for processing and synthesizing audio in web applications
  • Tone.js - a framework for creating interactive music in the browser

Live demo

To see Veremin in action without installing anything, simply visit:

For best results, you may want to use the Chrome browser and have a MIDI synthesizer (hardware or software) connected. See the Using the app section below for more information.


Follow one of these steps to deploy your own instance of Veremin.

Deploy to IBM Cloud


To deploy to the IBM Cloud, from a terminal run:

  1. Clone the veremin repo locally:

    $ git clone
  2. Change to the directory of the cloned repo:

    $ cd veremin
  3. Log in to your IBM Cloud account:

    $ ibmcloud login
  4. Target a Cloud Foundry org and space:

    $ ibmcloud target --cf
  5. Push the app to IBM Cloud:

    $ ibmcloud cf push

    Deploying can take a few minutes.

  6. View the app with a browser at the URL listed in the output.

    Note: Depending on your browser, you may need to access the app using the https protocol instead of http.

Run locally

To run the app locally:

  1. From a terminal, clone the veremin repo locally:

    $ git clone
  2. Point your web server to the cloned repo directory (/veremin)

    For example:

    • using the Web Server for Chrome extension (available from the Chrome Web Store)

      1. Go to your Chrome browser's Apps page (chrome://apps)
      2. Click on the Web Server
      3. From the Web Server, click CHOOSE FOLDER and browse to the cloned repo directory
      4. Start the Web Server
      5. Make note of the Web Server URL(s) (e.g.,
    • using the Python HTTP server module

      1. From a terminal shell, go to the cloned repo directory
      2. Depending on your Python version, enter one of the following commands:
        • Python 2.x: python -m SimpleHTTPServer 8080
        • Python 3.x: python -m http.server 8080
      3. Once started, the Web Server URL should be
  3. From your browser, go to the Web Server's URL

Using the app

At a minimum, your browser must allow access to the web camera and support the Web Audio API.

In addition, if your browser supports the Web MIDI API, you may connect a MIDI synthesizer to your computer. If you do not have a MIDI synthesizer, you can download and run a software synthesizer such as SimpleSynth.

If your browser does not support the Web MIDI API or no (hardware or software) synthesizer is detected, the app defaults to using the Web Audio API to generate tones in the browser.
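The fallback rule described above can be sketched as a small decision function. This is only an illustration under stated assumptions: `chooseOutput` and its return shape are hypothetical names, `nav` stands in for the browser's `navigator` object, and `midiOutputs` is assumed to be an array of already-discovered outputs. The only real API referenced is `navigator.requestMIDIAccess`, whose presence is used as the support check.

```javascript
// Hypothetical decision helper, not Veremin's actual code.
// nav: a navigator-like object; midiOutputs: array of detected MIDI outputs.
function chooseOutput(nav, midiOutputs) {
  const midiSupported = Boolean(nav) && typeof nav.requestMIDIAccess === 'function';
  if (midiSupported && midiOutputs.length > 0) {
    // Web MIDI is available and at least one synthesizer was found.
    return { mode: 'midi', device: midiOutputs[0] };
  }
  // No Web MIDI support or no synthesizer: generate tones in the browser.
  return { mode: 'webaudio', device: null };
}

// Stand-in objects so the sketch runs outside a browser.
const midiCapableNav = { requestMIDIAccess: () => Promise.resolve({}) };
const plainNav = {};

const withSynth = chooseOutput(midiCapableNav, [{ name: 'SimpleSynth virtual input' }]);
const noSynth = chooseOutput(midiCapableNav, []);
const noMidiApi = chooseOutput(plainNav, []);
console.log(withSynth.mode, noSynth.mode, noMidiApi.mode);
```

In a browser, the list of outputs could be built with `Array.from(access.outputs.values())`, where `access` is the object that `navigator.requestMIDIAccess()` resolves to.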

Open your browser and go to the app URL. Depending on your browser, you may need to access the app using the https protocol instead of http. You may also have to accept the browser's prompt to allow access to the web camera. Once access is allowed, the PoseNet model is loaded (this may take a few seconds).
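The https requirement comes from browsers restricting camera access to secure contexts, with local addresses such as `localhost` generally treated as trustworthy. The sketch below approximates that check for a given app URL. The `isLikelySecureOrigin` helper and its host list are hypothetical simplifications; in a real page, the browser's own `window.isSecureContext` property is the authoritative answer.

```javascript
// Hypothetical approximation of the browser's secure-context rule.
// Real browsers apply more cases; prefer window.isSecureContext in a page.
function isLikelySecureOrigin(urlString) {
  const url = new URL(urlString);
  const localHosts = ['localhost', '127.0.0.1'];
  return url.protocol === 'https:' || localHosts.includes(url.hostname);
}

// Example URLs are placeholders, not the actual Veremin deployment.
const httpsOk = isLikelySecureOrigin('https://example.com/veremin');
const localOk = isLikelySecureOrigin('http://localhost:8080');
const plainHttp = isLikelySecureOrigin('http://example.com/veremin');
console.log(httpsOk, localOk, plainHttp);
```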

After the model is loaded, the video stream from the web camera will appear and include an overlay with skeletal and joint information detected by PoseNet. The overlay will also include two adjacent zones/boxes. When your wrists are detected within each of the zones, you should hear some sound.

  • Move your right hand/arm up and down (in the right zone) to generate different notes.
  • Move your left hand/arm left and right (in the left zone) to adjust the velocity of the note.
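One way to picture the two controls is as linear mappings from wrist coordinates to MIDI values. The sketch below is a minimal illustration, not Veremin's actual mapping: the zone rectangles, the note range (48 to 72), the choice that a higher hand gives a higher note, and all function names are assumptions. The note-to-frequency formula is the standard equal-temperament conversion with A4 (note 69) at 440 Hz, and `0x90` is the MIDI note-on status byte for channel 1.

```javascript
// Clamp a value into the range 0..1.
function clamp01(value) {
  return Math.min(1, Math.max(0, value));
}

// Vertical wrist position in the (assumed) right zone -> MIDI note number.
// Video y grows downward, so a smaller y (hand higher up) maps to a higher note.
function yToMidiNote(y, zone, lowNote = 48, highNote = 72) {
  const t = 1 - clamp01((y - zone.top) / (zone.bottom - zone.top));
  return Math.round(lowNote + t * (highNote - lowNote));
}

// Horizontal wrist position in the (assumed) left zone -> MIDI velocity 0..127.
function xToVelocity(x, zone) {
  const t = clamp01((x - zone.left) / (zone.right - zone.left));
  return Math.round(t * 127);
}

// Equal-temperament frequency in Hz for a MIDI note (A4 = note 69 = 440 Hz).
function midiNoteToFrequency(note) {
  return 440 * Math.pow(2, (note - 69) / 12);
}

// Note-on message bytes for MIDI channel 1.
function noteOnMessage(note, velocity) {
  return [0x90, note, velocity];
}

// Illustrative zones for a 640x480 video frame (assumed layout).
const leftZone = { left: 0, right: 320, top: 0, bottom: 480 };
const rightZone = { left: 320, right: 640, top: 0, bottom: 480 };

const note = yToMidiNote(120, rightZone);   // right wrist a quarter of the way down
const velocity = xToVelocity(160, leftZone); // left wrist in the middle of its zone
const message = noteOnMessage(note, velocity);
console.log(note, velocity, midiNoteToFrequency(note), message);
```

The frequency value would feed the in-browser tone path, while the three-byte message is the form a connected MIDI output would receive.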

Click on the Controls icon (top right) to open the control panel. In the control panel you are able to change MIDI devices (if more than one is connected), configure PoseNet settings, set what is shown in the overlay, and configure additional options. More information about the control panel options is available here.

