Real Time Avateering Face Tracking

University of Maryland - College Park
Eashaan Kumar, Professor Matthias Zwicker



This research project aims to create a remote-presence system in which a user is represented by a 3D virtual avatar and can establish eye contact with that avatar. Using the Kinect SDK and facial landmark detection techniques, the system performs full-body tracking and facial expression tracking. For a more immersive experience, it uses virtual reality through the Oculus Rift. The project is implemented in Unity3D, a popular game engine, which renders the avatar and communicates with the Kinect sensor. Skeleton tracking and avatar mesh deformation are done directly by the Kinect. Facial landmark detection is done by an open source library called the Deformable Shape Tracking (DEST) library. Blend weights are used to alter the avatar’s facial expressions so that they mirror the user’s.

Skeleton Tracking

The Kinect skeleton tracking system can be broken down into two stages: computing a depth map, then inferring body position. The depth map is constructed by the Kinect’s time-of-flight camera, which “emits light signals and then measures how long it takes them to return” (Meisner 1). Because light travels so fast, this requires timing on the order of 1/10,000,000,000 of a second. With that precision, the camera can differentiate between light reflected from different objects in the surrounding environment. From a single input depth image, a per-pixel body part distribution is then calculated.
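The relationship between return time and depth can be sketched as follows. This is an idealized single-pulse model, not the sensor's actual signal processing; the function names are hypothetical and chosen for illustration only.

```python
# Speed of light in vacuum, in metres per second (exact by SI definition).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_from_round_trip(round_trip_seconds: float) -> float:
    """Distance to a surface, given the emit-to-return time of a light pulse."""
    if round_trip_seconds < 0:
        raise ValueError("round-trip time must be non-negative")
    # The light travels to the surface and back, so halve the path length.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


def depth_resolution(timing_resolution_seconds: float) -> float:
    """Smallest depth difference that a given timing resolution can separate."""
    return distance_from_round_trip(timing_resolution_seconds)
```

Under this model, a round trip of 20 nanoseconds corresponds to a surface about 3 m away, and a timing resolution of 1/10,000,000,000 of a second corresponds to a depth difference of roughly 1.5 cm.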

Facial Feature Tracking

DEST, the facial landmark tracking library used in this research, depends on OpenCV’s CascadeClassifier class, which relies on Haar feature-based cascades to perform face detection.
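To give a sense of what a Haar-like feature is, here is a minimal pure-Python sketch of a two-rectangle feature evaluated with an integral image (summed-area table). It is an illustration of the feature type only, not OpenCV's or DEST's code; the function names and the "left minus right" sign convention are assumptions made for this example.

```python
from typing import List

Image = List[List[int]]


def integral_image(img: Image) -> Image:
    """Summed-area table with an extra leading row and column of zeros."""
    height, width = len(img), len(img[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            # Sum of everything above this row plus the running row total.
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def rect_sum(table: Image, x: int, y: int, w: int, h: int) -> int:
    """Sum of the pixels in the w-by-h rectangle whose top-left corner is (x, y)."""
    return table[y + h][x + w] - table[y][x + w] - table[y + h][x] + table[y][x]


def two_rect_feature(table: Image, x: int, y: int, w: int, h: int) -> int:
    """Left-half pixel sum minus right-half pixel sum (assumed sign convention)."""
    if w % 2 != 0:
        raise ValueError("width must be even to split into two equal halves")
    half = w // 2
    return rect_sum(table, x, y, half, h) - rect_sum(table, x + half, y, half, h)
```

A strongly negative or positive value indicates a vertical brightness edge inside the window. A cascade chains many such cheap tests so that most non-face windows are rejected early.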


Blendshapes are a simple linear model of facial expressions used for realistic facial animation. Rendering in Unity3D is done through the “Mesh Renderer” component. A Mesh Renderer was used to render the image of the user’s face captured by the Kinect’s own camera. That image was then sent to the DEST library and OpenCV for facial tracking.
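The linear model behind blendshapes can be written as: each output vertex equals the neutral vertex plus the weighted sum of (target vertex minus neutral vertex) over all expression targets. A minimal sketch follows, assuming weights in the range 0 to 1 and meshes given as plain lists of (x, y, z) tuples; the names are hypothetical and this is not Unity's implementation.

```python
from typing import List, Sequence, Tuple

Vertex = Tuple[float, float, float]


def blend(
    neutral: Sequence[Vertex],
    targets: Sequence[Sequence[Vertex]],
    weights: Sequence[float],
) -> List[Vertex]:
    """Return the neutral mesh displaced by a weighted sum of target offsets."""
    if len(targets) != len(weights):
        raise ValueError("need exactly one weight per target shape")
    result: List[Vertex] = []
    for i, (bx, by, bz) in enumerate(neutral):
        x, y, z = bx, by, bz
        for target, weight in zip(targets, weights):
            tx, ty, tz = target[i]
            # Add this target's offset from the neutral pose, scaled by its weight.
            x += weight * (tx - bx)
            y += weight * (ty - by)
            z += weight * (tz - bz)
        result.append((x, y, z))
    return result
```

With all weights at zero the function returns the neutral face; with a single weight at 1 it returns that target shape exactly. Intermediate weights interpolate between expressions.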


Windows 10 and a Kinect v2 sensor are required.

  • Download Unity Personal Edition, Visual Studio 2015, and CMake-GUI.
  • Compile DEST and its other dependencies as listed on the DEST GitHub page.
  • Buy Kinect v2 Examples with MS-SDK so that the scripts for skeleton tracking are added to the Unity project. These scripts are not present in this repository due to the license provided by the creator of this asset.
  • Download the Testing Avatar Unity Project and open it in Unity. Import the Kinect v2 Examples Asset. Then hit run. If the program crashes, it might be due to the ultimate_vision.dll in Assets/Plugins. The dll might need to be compiled again on your computer.
    • To compile the dll, download the UltimateVision project and open it in Visual Studio. Set the build target to Release and change the settings to generate a dll. Then hit "Build Solution". Replace the old dll in the Plugins folder with the newly built one.


To test out the demo, stand in front of the Kinect sensor and make sure it can see all parts of your body. Within 2 seconds, a character should appear on the screen imitating your movements. Two windows should also pop up: one showing your face in grayscale and one generated by DEST showing the areas of interest found by the facial-landmark algorithm. Together, these should result in the character smiling, opening his mouth, and scrunching his nose whenever you make those expressions. This is the "virtual avatar" effect.
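One way the step from tracked landmarks to an avatar expression could work is sketched below for a single "mouth open" blend weight. This is an assumption-laden illustration, not the project's actual mapping: the function name, the choice of landmarks, the two threshold ratios, and the 0 to 100 output range are all hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def mouth_open_weight(
    upper_lip: Point,
    lower_lip: Point,
    left_eye: Point,
    right_eye: Point,
    closed_ratio: float = 0.05,  # assumed lip gap / eye distance when closed
    open_ratio: float = 0.45,    # assumed lip gap / eye distance when fully open
) -> float:
    """Map the lip gap, normalized by eye distance, to a weight from 0 to 100."""
    eye_distance = math.dist(left_eye, right_eye)
    if eye_distance == 0:
        raise ValueError("eye landmarks coincide; cannot normalize")
    # Dividing by the eye distance makes the measure independent of how far
    # the user stands from the camera.
    ratio = math.dist(upper_lip, lower_lip) / eye_distance
    t = (ratio - closed_ratio) / (open_ratio - closed_ratio)
    # Clamp so noisy landmarks never push the weight outside the valid range.
    return 100.0 * min(1.0, max(0.0, t))
```

The same pattern (measure a distance between landmarks, normalize it, clamp it) could be repeated for other expressions such as a smile, each driving its own blend weight.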

Read the full research paper - Final Report.pdf


This is a Computer Vision research project that I worked on in Summer 2017.





