McGill-CodeJam-2019

This repository includes the source code for our work on a Real-time Gesture Detection (RtGD) system. The work was developed with my teammates (Kaiwen Xu, Tom Sarry, and Elsa Emilien) during the McGill CodeJam 2019 event (https://devpost.com/software/mr-bean), where it won the second-place award in the Optional Theme - Internet of Things. I would also like to acknowledge Brenner Heintz (https://github.com/athena15/project_kojak) for his pre-trained hand-gesture model and for the inspiration for this project. This repository is structured as follows:

    1. Our goal for the project.  
    2. Functions for each script.  
    3. Results of the RtGD model.  
    4. Future work and reflections.

Note that the code provided here covers only the computer vision part of the CodeJam project. For the detailed project description and full source code, please refer to the Devpost page linked above.

Our Goal for the Project

The goal of this project is to develop an RtGD system that is minimally affected by environmental noise. Compared to the work done by Brenner Heintz, our system is designed to function properly outside of a dark room. We tackle this with a combination of a motion detection algorithm (OpenCV) and model retraining on noisy data. We started from the pretrained model in Brenner's repo and targeted the recognition of the same five gestures: fist, L, okay, palm, and peace. Examples of the retraining gesture data are shown in the figure below:
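To make the motion detection idea concrete, below is a minimal OpenCV sketch of background subtraction. This is an illustrative approximation rather than the exact code in this repository; the MOG2 subtractor and its parameters are assumptions.

    import cv2

    # Pixels that stay constant (lamps, windows, furniture) become background;
    # only moving pixels (the hand) survive in the foreground mask.
    bg_subtractor = cv2.createBackgroundSubtractorMOG2(history=100, varThreshold=50,
                                                       detectShadows=False)

    cap = cv2.VideoCapture(0)
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        fg_mask = bg_subtractor.apply(frame)                     # motion mask
        hand_only = cv2.bitwise_and(frame, frame, mask=fg_mask)  # keep moving pixels only
        cv2.imshow("motion", hand_only)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()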

Functions for Each Script

In this section, I will introduce the function of each script and zip file based on my experimental flow. An illustration of the experimental flow is presented below:

    1.  pretrain VGG model                 (/models.zip/VGG_cross_validated.h5)  
    2.  record images for downstream task  (1_Record_Image.py) (NewTraining20191216.zip)  
    3.  retrain the model                  (2_Retrain model.py) (retrained_20200506.h5)  
    4.  operate the RtGD system            (3_RtGD.py)

The pretrained and retrained models are stored in the models.zip file. If you are interested in recording new images for your own downstream task, the script provided in (1_Record_Image.py) serves this purpose. The recording script includes a motion detection algorithm that captures only moving parts and omits environmental noise (e.g., light from a window or lamp). An example of the recorded images, with 20 images per class, is stored in NewTraining20191216.zip. These images are then used to retrain our model with the script (2_Retrain model.py). Due to the time limit of the CodeJam competition, we froze the convolutional layers of the VGG model and used the recorded images to retrain only the dense layer, which has 3,252,997 trainable parameters.
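The retraining step follows a standard transfer-learning recipe. Below is a minimal Keras sketch of freezing the convolutional base and retraining only the dense head; the file paths, input size, and hyperparameters are assumptions, not the exact contents of 2_Retrain model.py.

    from tensorflow.keras.layers import Dense
    from tensorflow.keras.models import load_model
    from tensorflow.keras.optimizers import Adam
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Load the pretrained VGG-based gesture model (path is an assumption).
    model = load_model("models/VGG_cross_validated.h5")

    # Freeze everything except the dense layers.
    for layer in model.layers:
        layer.trainable = isinstance(layer, Dense)

    model.compile(optimizer=Adam(learning_rate=1e-4),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])

    # Retrain on the newly recorded images (directory layout assumed to be
    # one sub-folder per gesture class).
    train_gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
        "NewTraining20191216", target_size=(224, 224), class_mode="categorical")
    model.fit(train_gen, epochs=10)
    model.save("retrained_20200506.h5")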

After retraining on our task, real-time inference can be performed by running the script (3_RtGD.py); the same motion detection algorithm is used here as well. Thanks to the retraining process, our model can recognize different hand gestures with minimal interference from environmental light noise. A test accuracy of 85% is achieved on the recorded dataset.
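A simplified sketch of what such a real-time loop looks like is given below: each frame is motion-masked, resized, and classified. The class order, input size, and model path are assumptions; the actual logic lives in 3_RtGD.py.

    import cv2
    import numpy as np
    from tensorflow.keras.models import load_model

    GESTURES = ["fist", "L", "okay", "palm", "peace"]   # class order is an assumption
    model = load_model("retrained_20200506.h5")

    bg_subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
    cap = cv2.VideoCapture(0)

    while True:
        ret, frame = cap.read()
        if not ret:
            break
        # Suppress static light sources by keeping only moving pixels.
        mask = bg_subtractor.apply(frame)
        masked = cv2.bitwise_and(frame, frame, mask=mask)

        # Resize and normalize to the model's expected input (224x224 assumed).
        x = cv2.resize(masked, (224, 224)).astype("float32") / 255.0
        probs = model.predict(np.expand_dims(x, axis=0), verbose=0)[0]
        label = GESTURES[int(np.argmax(probs))]
        conf = float(np.max(probs))

        cv2.putText(frame, f"{label}: {conf:.2%}", (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow("RtGD", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    cap.release()
    cv2.destroyAllWindows()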

Results of the RtGD Model

The result of the RtGD model is illustrated in the following figure. The left image shows the processed image after motion detection, while the middle images show the real-time captured data. We can observe that the RtGD model outputs the gesture in real time (an L gesture is recognized with a confidence score of 99.40%). Notably, the light sources in the real-time image no longer appear after the data is processed by the algorithm. This cancellation of background noise results in a more robust RtGD system and enables it to be applied in various environments.

Future Work and Reflections

This project presents an RtGD system with suppressed background noise. By leveraging a motion detection algorithm, our retrained model learns to ignore background light noise and improves its classification accuracy on real-world data. Although the project shows promising improvements, several directions could be explored to further boost model performance:

    1.  Increase the retraining dataset size.  
    2.  Implement data augmentation methods (see the sketch after this list).  
    3.  Retrain the whole model instead of only the dense layer.  
    4.  Introduce moving objects other than hand gestures (e.g., a moving head) during training to make the model robust to such noise.
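For the data augmentation item above, a minimal Keras sketch is given below; the transformation ranges are illustrative assumptions.

    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Randomly perturb the recorded gesture images to enlarge the small training set.
    augmenter = ImageDataGenerator(
        rescale=1.0 / 255,
        rotation_range=15,             # small rotations of the hand
        width_shift_range=0.1,         # horizontal jitter
        height_shift_range=0.1,        # vertical jitter
        zoom_range=0.1,
        brightness_range=(0.7, 1.3),   # vary lighting conditions
    )

    train_gen = augmenter.flow_from_directory(
        "NewTraining20191216", target_size=(224, 224),
        class_mode="categorical", batch_size=16)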
