Capturing piano performance with depth vision

Example of tracked difference image

This program uses OpenCV 3.2.0 and the depth stream from a Kinect V2 camera to track keypress events on a piano keyboard. Crucially, the information captured includes the velocity of each keypress.
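
The text above does not say how velocity is derived, so the following is only a sketch of one plausible approach, not the method VMIDI2 uses. It assumes the camera looks down on the keyboard, that depth samples are in millimetres, and that the depth stream runs at roughly 30 frames per second. The helper names `peakKeySpeedMmPerSec` and `speedToVelocity`, and the 1 to 127 output scale, are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Assumption: the Kinect V2 depth stream delivers about 30 frames per second.
constexpr double kFramesPerSecond = 30.0;

// Hypothetical helper: peak downward speed (mm/s) of one key, given the depth
// of its surface in millimetres for consecutive frames. With the camera above
// the keyboard, depth increases while the key travels down.
double peakKeySpeedMmPerSec(const std::vector<double>& depthMm) {
    double peak = 0.0;
    for (std::size_t i = 1; i < depthMm.size(); ++i) {
        const double deltaMm = depthMm[i] - depthMm[i - 1];
        peak = std::max(peak, deltaMm * kFramesPerSecond);
    }
    return peak;
}

// Hypothetical mapping from speed to a 1..127 scale. maxSpeedMmPerSec is a
// tuning value that would have to be calibrated against real footage.
int speedToVelocity(double speedMmPerSec, double maxSpeedMmPerSec) {
    if (speedMmPerSec <= 0.0 || maxSpeedMmPerSec <= 0.0) {
        return 0;  // no downward motion detected
    }
    const double ratio = std::min(speedMmPerSec / maxSpeedMmPerSec, 1.0);
    return std::max(1, static_cast<int>(ratio * 127.0));
}
```

Taking the peak frame-to-frame change rather than the average keeps a slow key release from diluting a fast attack, at the cost of being more sensitive to depth noise.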


In its current state, VMIDI2 solely performs keypress detection. Keyboard registration, MIDI output and automatic music transcription are not implemented.

The perspective transform points and keyboard overlay in the code are set manually to suit my recordings, so they will have to be tweaked for different footage. I am more than happy to provide the main recordings (.xef files) I used, so get in touch if you are interested. They are many gigabytes in size, so I decided not to host them.
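
To make clear what those points control: four corner points define a 3x3 homography, and in OpenCV such points are typically passed to `cv::getPerspectiveTransform`, with the result applied by `cv::warpPerspective`. Whether VMIDI2 uses exactly those calls is not stated here. The standard-library-only sketch below shows what such a matrix does to a single pixel coordinate; the `Homography` and `Point` types and the `applyHomography` helper are hypothetical names for illustration.

```cpp
#include <array>

// 3x3 perspective transform in row-major order.
using Homography = std::array<std::array<double, 3>, 3>;

struct Point {
    double x;
    double y;
};

// Maps a pixel through the homography: multiply in homogeneous coordinates,
// then divide by the third component. The caller must ensure that component
// is non-zero for the points being mapped.
Point applyHomography(const Homography& h, const Point& p) {
    const double w = h[2][0] * p.x + h[2][1] * p.y + h[2][2];
    return {(h[0][0] * p.x + h[0][1] * p.y + h[0][2]) / w,
            (h[1][0] * p.x + h[1][1] * p.y + h[1][2]) / w};
}
```

Mapping the four keyboard corners of a new recording through the existing matrix is a quick way to see how far the manual points are off before editing them.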

Visual Studio Setup

VMIDI2 was developed in Visual Studio 2017 and has also been tested on the 2015 edition. The key thing is to make sure OpenCV is referenced correctly. Ensure OpenCV and the Kinect SDK are properly installed, as detailed in the Installation section.

If using the provided .sln file, set the configuration to Release/x64 and you should be good to go.

Otherwise, if working from the .cpp source, set up the Visual Studio solution as follows:

  1. Right click the project in Solution Explorer and open properties.
  2. In the drop-down menus at the top of the window, select "All Configurations" and "All Platforms".
  3. Edit "VC++ Directories/Include Directories" to add $(OPENCV_DIR)\..\..\include.
  4. Edit "Linker/General/Additional Library Directories" to add $(OPENCV_DIR)\lib and $(KINECTSDK20_DIR)\Lib\x64.
  5. Edit "Linker/Input/Additional Dependencies" to add opencv_world320.lib and Kinect20.lib.

Running the Code

  1. Open Kinect recording file (.xef) in Kinect Studio.
  2. Press the connection button while in the PLAY tab. This connects the recording file to the computer as if it were a physical Kinect device.
  3. Set inpoints and outpoints as desired within the recording file using the timeline. Be sure that there are no hands or pressed keys present at the start of playback.
  4. Ensure playback is stopped.
  5. Run VMIDI2 application. The command line will show some matrices. After these are shown, the program is ready to receive playback.
  6. Press the play button in Kinect Studio. Many OpenCV windows will show. Keypresses will be logged to the command line in the format {keyname},{velocity}.
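
Since MIDI output is not implemented, the logged lines are the only output other tools can consume. The sketch below parses one `{keyname},{velocity}` line. It assumes the velocity is numeric and that the key name itself is free-form text; the `KeyPress` struct, the `parseKeyPress` helper, and the example key name `C4` are hypothetical and not taken from the VMIDI2 source.

```cpp
#include <cstddef>
#include <exception>
#include <string>

// One logged event in the form {keyname},{velocity}.
struct KeyPress {
    std::string key;
    double velocity;
};

// Parses a single log line into `out`. Returns false, leaving `out` untouched,
// if the line has no comma, an empty field, or a non-numeric velocity.
bool parseKeyPress(const std::string& line, KeyPress& out) {
    // Split on the last comma so a key name containing commas still parses.
    const std::size_t comma = line.rfind(',');
    if (comma == std::string::npos || comma == 0 || comma + 1 >= line.size()) {
        return false;
    }
    const std::string number = line.substr(comma + 1);
    try {
        std::size_t used = 0;
        const double velocity = std::stod(number, &used);
        if (used != number.size()) {
            return false;  // trailing characters after the number
        }
        out = KeyPress{line.substr(0, comma), velocity};
        return true;
    } catch (const std::exception&) {
        return false;  // std::stod throws on non-numeric or out-of-range input
    }
}
```

Calling `parseKeyPress("C4,87", kp)` would fill `kp.key` with `C4` and `kp.velocity` with `87`. Lines captured from a Windows console may end in a carriage return, which this sketch rejects, so strip it first if needed.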

Installation (Windows)

Kinect Studio

  1. Get the Kinect for Windows SDK 2.0 from Microsoft.
  2. Use the included installer.
  3. Double check that there is a system environment variable named KINECTSDK20_DIR with a value pointing to the Kinect SDK (e.g. C:\Program Files\Microsoft SDKs\Kinect\v2.0_1409). The installer should set this automatically.


OpenCV

  1. Get pre-built OpenCV 3.2.0 from SourceForge.
  2. Install to a convenient location such as C:\opencv
  3. Edit system environment variables:
    1. Create a variable named OPENCV_DIR with a path to the OpenCV build (e.g. C:\opencv\build\x64\vc14).
    2. Add to the PATH variable the OpenCV binary directory (e.g. C:\opencv\build\x64\vc14\bin).
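
A missing environment variable usually surfaces later as a confusing include or linker error, so it can help to check both variables before opening Visual Studio. The sketch below uses only the standard library; the helper names `missingVariables` and `openCvIncludeDir` are hypothetical, and the second helper simply mirrors the `$(OPENCV_DIR)\..\..\include` pattern used in the project settings.

```cpp
// MSVC may reject std::getenv as deprecated (C4996) when SDL checks are on;
// this define silences that and is harmless on other compilers.
#define _CRT_SECURE_NO_WARNINGS

#include <cstdlib>
#include <string>
#include <vector>

// Returns the names from `names` that are unset or empty in the environment.
std::vector<std::string> missingVariables(const std::vector<std::string>& names) {
    std::vector<std::string> missing;
    for (const std::string& name : names) {
        const char* value = std::getenv(name.c_str());
        if (value == nullptr || *value == '\0') {
            missing.push_back(name);
        }
    }
    return missing;
}

// Builds the include directory the same way the project settings do:
// two levels above OPENCV_DIR, then "include".
std::string openCvIncludeDir(const std::string& openCvDir) {
    return openCvDir + "\\..\\..\\include";
}
```

Calling `missingVariables({"OPENCV_DIR", "KINECTSDK20_DIR"})` from a small console program and printing the result gives a quick pass or fail. Remember that a terminal or Visual Studio instance opened before the variables were created will not see them until it is restarted.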

Useful Links