The goal of this project is to extract spoken words as text from video input, using visual lip tracking and machine learning techniques.
LipReading is the final project of Ben Gurion University Software Engineering students:
This project is guided by Dr. Kobi Gal and Dr. Gavi Kohlberg.
We aspire to provide patients who have larynx or vocal cord conditions with an easy and intuitive way to communicate. The system captures the user’s lip movements on video and converts them into audio output of the spoken word. It works on any computer equipped with a basic camera.

The system is based on image-processing algorithms and machine learning. Given a video segment, the image-processing algorithm identifies the lips and extracts the coordinates of a number of points on the lips from each frame. This data then goes through a series of normalization steps to improve the classification results. The final classification stage is performed by machine learning algorithms from third-party packages. Once a word has been classified, the audio output of the result is played.
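
As an illustration of the normalization stage, here is a minimal sketch in plain Python. The description above does not list the actual normalization steps, so everything here is an assumption: the function names (`normalize_frame`, `resample_frames`, `to_feature_vector`) are hypothetical, and the chosen steps (centering on the centroid, scaling by lip width, resampling to a fixed number of frames) are common choices rather than this project's confirmed method.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Frame = List[Point]  # lip points extracted from one video frame


def normalize_frame(points: Frame) -> Frame:
    """Center the lip points on their centroid and scale by lip width.

    Assumed step: this removes the effect of where the face sits in the
    image and how far the speaker is from the camera.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    xs = [x for x, _ in centered]
    width = max(xs) - min(xs)
    scale = width if width > 0 else 1.0  # avoid division by zero
    return [(x / scale, y / scale) for x, y in centered]


def resample_frames(frames: List[Frame], target_len: int) -> List[Frame]:
    """Pick target_len frames by nearest index so all clips share one length."""
    if not frames or target_len < 1:
        raise ValueError("need at least one frame and target_len >= 1")
    if target_len == 1:
        return [frames[0]]
    last = len(frames) - 1
    return [frames[round(i * last / (target_len - 1))] for i in range(target_len)]


def to_feature_vector(frames: List[Frame], target_len: int = 10) -> List[float]:
    """Flatten a clip into one fixed-length vector for a classifier."""
    fixed = resample_frames([normalize_frame(f) for f in frames], target_len)
    return [coord for frame in fixed for point in frame for coord in point]


# Two made-up frames with the same lip shape at different positions and scales.
frame_a = [(0, 0), (2, 0), (1, 1), (1, -1)]
frame_b = [(10, 10), (14, 10), (12, 12), (12, 8)]
features = to_feature_vector([frame_a, frame_b, frame_a], target_len=5)
```

In this sketch, `frame_a` and `frame_b` normalize to the same points, and every clip yields a vector of `target_len * points_per_frame * 2` values. A vector of that shape could then be passed to whichever third-party classifier is used.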