adlyZaroui/Event-Lip-Reading

                        Event Video Classification


This GitHub repository showcases an academic project on classifying event data recorded by an event-based sensor (an event camera, also known as a neuromorphic sensor).

Please note that the dataset used for this project is provided by Prophesee and will not be published in this repository due to ownership rights. However, the same dataset can be publicly accessed and downloaded from the following Kaggle link: Kaggle Dataset.


Introduction

An event is a $4$-tuple $(x,y,p,t)$ where

  • $(x,y)$ denotes the pixel's position associated with the event.
  • $p$ is a boolean indicating whether the luminosity is increasing or decreasing.
  • $t$ represents the timestamp (in $\mu s$) from the start of the recording.

Event data are stored as DataFrames, with each row representing an event, sorted in ascending order of timestamp.

Note: In the hardware configuration provided by the manufacturer, $x$ ranges from $0$ to $480$, $y$ ranges from $0$ to $640$, $p$ is either $0$ (decrease of luminosity) or $1$ (increase of luminosity), and $t$ is a floating-point number.
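As a minimal sketch of this data layout, the snippet below reads one recording into a pandas DataFrame and checks the properties described above. The column names `x`, `y`, `p`, `t` mirror the tuple notation and are an assumption about the CSV header. The sample values and the helper `load_events` are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample recording. The header x,y,p,t is an assumption
# that mirrors the tuple notation used in this README.
SAMPLE_CSV = """x,y,p,t
120,300,1,15.0
121,300,0,42.0
119,301,1,97.0
"""


def load_events(source):
    """Read one recording and return its events sorted by timestamp."""
    events = pd.read_csv(source)
    missing = {"x", "y", "p", "t"} - set(events.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    # A stable sort keeps the original order of events sharing a timestamp.
    return events.sort_values("t", kind="stable").reset_index(drop=True)


events = load_events(io.StringIO(SAMPLE_CSV))
print(events.dtypes)
print(events["p"].value_counts())
```

For a real recording, a file path can be passed instead of the in-memory buffer.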


Project Objective

The primary goal of this project is to address the following problem:

Problem: Given 10 distinct classes, each with 32 examples, our goal is to construct a classifier that can accurately determine the class of a new, unseen example.

The main metric that will be used to assess the performance of the models is accuracy.
This problem is the central focus of our project and all subsequent work will be aimed at solving it.


Illustration

This illustration represents the mentioned data type, which has been converted into a format similar to video data. In this format, events with polarity $0$ and polarity $1$ are distinguished by different color maps:



Usage

To use this project, follow these steps:

  1. Clone the repository: First, clone this repository to your local machine using

    git clone https://github.com/adlyZaroui/Event-camera-classification.git
  2. Download the dataset: You can download the dataset from this Kaggle link. Locate the train10 folder and download it. Once downloaded, place it in the root directory of your local repository. This folder contains another folder, also named train10, which in turn contains 10 subfolders. The result should look as follows:

    local_repo/
    ├──── train10/
    │       ├── train10/
    │             ├── Addition/
    │             ├── Carnaval/
    │             ├── Decider/
    │             ├── Ecole/
    │             ├── Fillette/
    │             ├── Huitre/
    │             ├── Joyeux/
    │             ├── Musique/
    │             ├── Pyjama/
    │             └── Ruisseau/
    ├──── .venv/
    ├──── .gitignore
    ├──── .LICENSE
    ├──── ...
    └──── *.ipynb
    

Every folder within train10/train10/ holds 32 CSV files, named from 0.csv to 31.csv. These files contain event data focused on the face of a speaker uttering a specific French word, which is also the name of the parent folder.

For instance, the folder train10/train10/Musique contains 32 CSV files, each capturing event data of someone pronouncing the French word Musique.

Further details about the methodology employed to record this dataset can be found at the provided Kaggle Competition link.
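The folder layout above can be traversed with a few lines of Python. The following is a sketch, not code taken from the notebooks. The helper name `list_recordings` is hypothetical, and it assumes every file stem is an integer, as in 0.csv to 31.csv.

```python
from pathlib import Path


def list_recordings(root):
    """Return (csv_path, label) pairs, using each subfolder name as the label."""
    root = Path(root)
    pairs = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Sort numerically so that 2.csv comes before 10.csv.
        # Assumes every file stem is an integer.
        csv_paths = sorted(class_dir.glob("*.csv"), key=lambda p: int(p.stem))
        pairs.extend((csv_path, class_dir.name) for csv_path in csv_paths)
    return pairs


root = Path("train10/train10")
if root.is_dir():
    recordings = list_recordings(root)
    print(len(recordings), "recordings found")
```

With the full dataset in place, this should list 10 classes with 32 recordings each.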

  3. Install dependencies: This project requires certain Python libraries. You can install them using pip:

    pip install -r requirements.txt
  4. Virtual Environment: You can now enable the virtual environment by typing

    source .venv/bin/activate
  5. Notebook: You can now run the notebooks using your preferred notebook server, such as Jupyter, VSCode, or Google Colab.


Project Outline

Data Exploration

The initial phase of the project involves preprocessing the raw event data to make it suitable for visualization. This process includes transforming the data into pixel matrices, enabling the application of standard image processing techniques.
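One way to turn events into pixel matrices is to split a recording into equal time slices and count events per pixel, with one channel per polarity. The sketch below illustrates that idea and is not necessarily the transformation used in the notebooks. The function name `events_to_frames`, the number of frames, and the default grid of 481 by 641 (which treats the coordinate ranges given in the introduction as inclusive) are assumptions.

```python
import numpy as np


def events_to_frames(x, y, p, t, n_frames=10, shape=(481, 641)):
    """Count events per pixel in n_frames equal time slices.

    Returns an integer array of shape (n_frames, 2, shape[0], shape[1]),
    where axis 1 is the polarity (0 = decrease, 1 = increase).
    """
    x = np.asarray(x, dtype=np.int64)
    y = np.asarray(y, dtype=np.int64)
    p = np.asarray(p, dtype=np.int64)
    t = np.asarray(t, dtype=np.float64)

    frames = np.zeros((n_frames, 2) + tuple(shape), dtype=np.int32)
    if t.size == 0:
        return frames

    span = t.max() - t.min()
    if span > 0:
        idx = ((t - t.min()) / span * n_frames).astype(np.int64)
    else:
        idx = np.zeros(t.size, dtype=np.int64)
    # The latest event would otherwise land in slice n_frames.
    idx = np.minimum(idx, n_frames - 1)

    # np.add.at accumulates correctly when several events hit the same cell.
    np.add.at(frames, (idx, p, x, y), 1)
    return frames


frames = events_to_frames(
    x=[0, 0, 5], y=[0, 0, 7], p=[1, 1, 0], t=[0.0, 0.5, 10.0],
    n_frames=2, shape=(10, 10),
)
print(frames.shape)
```

Each slice `frames[i]` can then be displayed or processed like a two-channel image.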

Before delving into model training or selection, our focus is on analyzing trajectories. This crucial step aids in uncovering effective strategies for event data visualization and further investigating noise reduction techniques.
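This README does not specify which noise reduction technique is used. As one common option for event cameras, the sketch below implements a simple background activity filter: an event is kept only if one of its 8 neighbouring pixels fired shortly before it. The function name, the time window `dt` of 5000 microseconds, and the grid size are assumptions chosen for illustration.

```python
import numpy as np


def background_activity_filter(x, y, t, dt=5000.0, shape=(481, 641)):
    """Return a boolean mask of events supported by a recent neighbour.

    Events must be sorted by timestamp. An event is kept when one of its
    8 neighbouring pixels fired at most dt microseconds before it.
    """
    x = np.asarray(x, dtype=np.int64)
    y = np.asarray(y, dtype=np.int64)
    t = np.asarray(t, dtype=np.float64)

    # Last timestamp seen at each pixel, padded by one pixel on every side
    # so that border pixels need no special handling.
    last = np.full((shape[0] + 2, shape[1] + 2), -np.inf)
    keep = np.zeros(t.size, dtype=bool)

    for i in range(t.size):
        xi, yi = x[i] + 1, y[i] + 1
        patch = last[xi - 1:xi + 2, yi - 1:yi + 2].copy()
        patch[1, 1] = -np.inf  # ignore the pixel itself
        keep[i] = (t[i] - patch.max()) <= dt
        last[xi, yi] = t[i]
    return keep


mask = background_activity_filter(
    x=[10, 11, 50, 10], y=[10, 10, 50, 10], t=[0.0, 100.0, 200.0, 100000.0],
)
print(mask)
```

The Python loop is easy to read but slow on recordings with millions of events, so a vectorised or compiled version would be preferable in practice.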

Data Preprocessing

In the preprocessing phase, we first implement noise reduction methods to refine the event data. Following this, we use Principal Component Analysis (PCA) to efficiently manage the high-dimensional data. PCA simplifies the data's dimensionality while retaining essential information, thereby enhancing computational efficiency and improving model performance.
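The PCA step can be sketched with scikit-learn as follows. The feature matrix here is random stand-in data with one row per recording (10 classes with 32 examples each), and the 95% explained-variance target is an assumption, not a value taken from the notebooks.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Stand-in features: 320 recordings, each flattened to 2,000 values.
# Real features would come from the preprocessed event data.
X = rng.poisson(1.0, size=(320, 2000)).astype(np.float64)

# A float n_components keeps the smallest number of components whose
# cumulative explained variance reaches that fraction.
pca = PCA(n_components=0.95, svd_solver="full")
X_reduced = pca.fit_transform(X)

print(X.shape, "->", X_reduced.shape)
print("explained variance kept:", pca.explained_variance_ratio_.sum())
```

When a model is evaluated afterwards, fitting the PCA on the training split only and applying `pca.transform` to the test split avoids leaking test information into the projection.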

Model Selection and Training

After preprocessing and exploring the data, the next step is model selection and training. This phase involves choosing the most suitable machine learning or deep learning models based on the problem at hand and the insights gained from the data exploration phase. We also take the literature on the subject into account to inform our approach.

After the model is trained, it is evaluated on a separate test set to ensure that it can generalize well to unseen data. If the model's performance on the test set is satisfactory, the model is then ready to be deployed for making predictions on new data.


The next step involves comparing the performance of a Random Forest classifier on the raw data versus the reduced data. Random Forest is a powerful ensemble learning algorithm known for its ability to handle high-dimensional data and deliver robust results. By evaluating the model performance on both versions of the data, valuable insights are gained regarding the impact of dimensionality reduction on classification accuracy.
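A comparison of this kind can be set up with scikit-learn as sketched below. The data is synthetic, and the number of features, the 20 PCA components, and the 4-fold split are assumptions for illustration. Placing the PCA inside a pipeline means it is refit on each training fold.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in: 10 classes with 32 examples each, 500 features.
n_classes, n_per_class, n_features = 10, 32, 500
y = np.repeat(np.arange(n_classes), n_per_class)
centers = rng.normal(0.0, 1.0, size=(n_classes, n_features))
X = centers[y] + rng.normal(0.0, 3.0, size=(y.size, n_features))

cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)

raw_model = RandomForestClassifier(n_estimators=100, random_state=0)
reduced_model = make_pipeline(
    PCA(n_components=20, random_state=0),
    RandomForestClassifier(n_estimators=100, random_state=0),
)

raw_scores = cross_val_score(raw_model, X, y, cv=cv, scoring="accuracy")
reduced_scores = cross_val_score(reduced_model, X, y, cv=cv, scoring="accuracy")

print("raw features:     mean accuracy", raw_scores.mean())
print("PCA + forest:     mean accuracy", reduced_scores.mean())
```

The scores on synthetic data say nothing about the real dataset. The point is only the structure of the comparison: same folds, same classifier, with and without the reduction step.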

Building upon these findings, a Bagging Random Forest classifier is constructed using the reduced data. A Random Forest already aggregates decision trees trained on bootstrap samples; bagging adds a second level of resampling by training several forests on bootstrap samples of the dataset and combining their votes. This ensemble approach aims to reduce variance and to improve stability and classification accuracy.
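A bagged Random Forest can be sketched with scikit-learn's `BaggingClassifier`, as shown below on synthetic stand-in data. The ensemble sizes, the 75/25 split, and the 20 reduced features are assumptions, not the settings used in the notebooks.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the reduced data: 10 classes x 32 examples,
# 20 features per example.
y = np.repeat(np.arange(10), 32)
centers = rng.normal(0.0, 1.0, size=(10, 20))
X = centers[y] + rng.normal(0.0, 1.0, size=(y.size, 20))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Five forests, each trained on its own bootstrap sample of the training set.
# The base estimator is passed positionally because its keyword name differs
# between scikit-learn versions.
bagged = BaggingClassifier(
    RandomForestClassifier(n_estimators=25, random_state=0),
    n_estimators=5,
    random_state=0,
)
bagged.fit(X_train, y_train)

accuracy = accuracy_score(y_test, bagged.predict(X_test))
print("test accuracy:", accuracy)
```

The stratified split keeps the same number of test examples per class, which matters with only 32 examples per class.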

Throughout this project, code samples, data preprocessing techniques, dimensionality reduction methods, model training, and evaluation procedures are provided, allowing researchers, data scientists, and machine learning enthusiasts to replicate the experiments and gain deeper insights into event data classification.

The repository also includes detailed documentation explaining the project's objectives, methodology, and results, along with relevant visualizations and performance metrics. By combining Random Forest with dimensionality reduction techniques, this project offers knowledge and resources for tackling event data classification challenges.

The Kaggle Challenge is available at the following link: https://www.kaggle.com/competitions/smemi309-final-evaluation-challenge-2022


Contributing

Contributions to this project are highly encouraged and welcome. If you are interested in further enhancing the capabilities of event data classification, there are several areas where you can make valuable contributions. One possible area of contribution is to consider the raw data as time series and handle the problem as a multivariate time series classification.
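As a possible starting point for the time series direction, the sketch below converts one recording into a multivariate series with four channels per time bin: number of polarity-1 events, number of polarity-0 events, mean $x$, and mean $y$. The function name, the choice of channels, and the number of bins are assumptions; other summaries may work better.

```python
import numpy as np


def events_to_series(x, y, p, t, n_bins=50):
    """Summarise a recording as an array of shape (n_bins, 4).

    Columns: ON count, OFF count, mean x, mean y. Empty bins are zero.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    p = np.asarray(p, dtype=np.int64)
    t = np.asarray(t, dtype=np.float64)

    series = np.zeros((n_bins, 4))
    if t.size == 0:
        return series

    span = t.max() - t.min()
    if span > 0:
        idx = ((t - t.min()) / span * n_bins).astype(np.int64)
    else:
        idx = np.zeros(t.size, dtype=np.int64)
    idx = np.minimum(idx, n_bins - 1)

    total = np.bincount(idx, minlength=n_bins)
    on = np.bincount(idx, weights=(p == 1).astype(np.float64), minlength=n_bins)
    series[:, 0] = on
    series[:, 1] = total - on

    nonempty = total > 0
    sum_x = np.bincount(idx, weights=x, minlength=n_bins)
    sum_y = np.bincount(idx, weights=y, minlength=n_bins)
    series[nonempty, 2] = sum_x[nonempty] / total[nonempty]
    series[nonempty, 3] = sum_y[nonempty] / total[nonempty]
    return series


series = events_to_series(
    x=[0, 10, 4], y=[2, 2, 6], p=[1, 0, 1], t=[0.0, 1.0, 10.0], n_bins=2
)
print(series)
```

Using the same `n_bins` for every recording gives series of equal length, which most multivariate time series classifiers expect.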


License

This project is open source, under the terms of the MIT License.