# Hands-on Perception: Learning Event-based Data and Motion Compensation

## Event-based Cameras
Event-based cameras (EBCs) are imaging sensors that respond to local brightness changes. Unlike conventional frame-based cameras, EBCs are asynchronous: each pixel operates independently of the others. A pixel reports a change in intensity when one occurs and stays silent otherwise. For example, a pixel stores a reference intensity at time $t$ and compares the current intensity against that reference at time $t + 1$. If the difference exceeds a threshold, the pixel outputs an **event**.
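
The thresholding behavior described above can be sketched for a single pixel. This is a simplified illustration, not a sensor model: the function name `generate_events`, the sample values, and the threshold are hypothetical, and the sketch assumes the reference is reset to the current intensity each time an event fires.

```python
def generate_events(intensities, threshold):
    """Return (sample_index, polarity) pairs for one pixel.

    Assumption: the reference intensity is reset whenever an event fires.
    """
    events = []
    reference = intensities[0]  # reference intensity stored at the first sample
    for index in range(1, len(intensities)):
        difference = intensities[index] - reference
        if abs(difference) > threshold:
            polarity = 1 if difference > 0 else -1
            events.append((index, polarity))
            reference = intensities[index]  # new reference after the event
        # Otherwise the pixel stays silent and keeps its reference.
    return events


# Hypothetical intensity samples for one pixel
signal = [0.50, 0.52, 0.70, 0.71, 0.40, 0.41]
pixel_events = generate_events(signal, threshold=0.15)
print(pixel_events)  # [(2, 1), (4, -1)]
```

Only the two large jumps produce output; the small fluctuations in between stay below the threshold and generate nothing.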

An **event** is a discrete packet of information containing:
* Event location ($x$ and $y$ coordinates, across the width and height dimensions respectively).
* Event timestamp $t$, the time at which the event occurred.
* Event polarity $p$, the direction of the intensity change. It is an output spike that is either $+1$ or $-1$, and it carries no information about the intensity value itself.
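
As a minimal sketch, a small batch of events can be held in a NumPy structured array with one field per component. The field names, data types, and sample values here are assumptions chosen for illustration; the layout actually used by a given dataset or library may differ.

```python
import numpy as np

# Hypothetical layout: one record per event with fields x, y, t, p
event_dtype = np.dtype([
    ("x", np.uint16),   # column index (width dimension)
    ("y", np.uint16),   # row index (height dimension)
    ("t", np.float64),  # timestamp in seconds
    ("p", np.int8),     # polarity, +1 or -1
])

events = np.array(
    [(120, 45, 0.000010, 1), (121, 45, 0.000025, -1), (300, 200, 0.000031, 1)],
    dtype=event_dtype,
)

# Select only the events that signal an intensity increase
positive = events[events["p"] == 1]
print(positive["x"], positive["y"])
```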

## Data Structure
### Dataset
For this lab, we will use the **MVSEC** dataset, which is designed for the development of novel 3D perception algorithms for EBCs. Its stereo event-based data is collected from cars, motorbikes, hexacopters, and handheld rigs, and is fused with LiDAR, IMU, motion-capture, and GPS systems to provide ground truth pose and depth images.

> **Note:**
> You can read more about the MVSEC dataset [here](https://daniilidis-group.github.io/mvsec/). Before starting this lab, it is important to understand the data structure of MVSEC. This dataset was recorded with a **DAVIS346B** sensor with a resolution of **(260, 346)**, given as (height, width).

## eWiz
To load and manipulate the dataset, we will be using our own **eWiz** library for event-based data reading and manipulation. The **eWiz** library contains a series of modules, which include, but are not limited to:
* **Data Reading:** Reads and clips event-based data and ground truth optical flow data stored in the eWiz format.
* **Data Rendering:** Renders event-based sequences.
* **Data Augmentations:** Applies temporal and spatial augmentations to event-based data and corresponding grayscale images.
* **Accuracy Metrics:** Provides several accuracy metrics for optical flow evaluation, such as the *Average Endpoint Error (AEE)*, and the percentage of outliers.
* **Visualizations:** Provides visualization functions for optical flow, event-based data, and grayscale images.
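
To make the accuracy metrics concrete, here is a minimal NumPy sketch of the average endpoint error and an outlier percentage. It is not the eWiz implementation: the function names, the `(2, H, W)` flow layout, and the 3-pixel outlier threshold are assumptions for illustration.

```python
import numpy as np


def average_endpoint_error(flow_pred, flow_gt):
    """Mean Euclidean distance between predicted and ground truth flow vectors.

    Both inputs are assumed to have shape (2, H, W).
    """
    endpoint_error = np.linalg.norm(flow_pred - flow_gt, axis=0)
    return float(endpoint_error.mean())


def outlier_percentage(flow_pred, flow_gt, threshold=3.0):
    """Percentage of pixels whose endpoint error exceeds `threshold` pixels."""
    endpoint_error = np.linalg.norm(flow_pred - flow_gt, axis=0)
    return float(100.0 * (endpoint_error > threshold).mean())


# Toy 2x2 flow field: one pixel is off by the vector (3, 4), i.e. 5 pixels
flow_gt = np.zeros((2, 2, 2))
flow_pred = np.zeros((2, 2, 2))
flow_pred[:, 0, 0] = [3.0, 4.0]

aee = average_endpoint_error(flow_pred, flow_gt)
outliers = outlier_percentage(flow_pred, flow_gt)
print(aee, outliers)  # 1.25 25.0
```

In practice, these metrics are usually computed only over pixels with valid ground truth, which would require an additional mask.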

## **Part 1:** Data Loading
In this part, you will be looking at the full sequence of one of the datasets. Your tasks are:
1. Choose any dataset of your liking, and render it using the `VideoRendererBase` class.
2. Choose a sequence of *0.2 s*, in which you observe a clear movement in one direction.
3. Extract that sequence using the data reader of eWiz, and print the extracted events.
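
Conceptually, extracting a 0.2 s sequence amounts to selecting the events whose timestamps fall inside a time window. The sketch below shows this with plain NumPy rather than the eWiz data reader; the `clip_events` helper, the `[x, y, t, p]` column order, and the sample events are hypothetical, and timestamps are assumed to be sorted in ascending order.

```python
import numpy as np


def clip_events(events, t_start, duration):
    """Return the events with t_start <= t < t_start + duration.

    `events` is assumed to be an (N, 4) array with columns [x, y, t, p],
    sorted by timestamp.
    """
    timestamps = events[:, 2]
    start = np.searchsorted(timestamps, t_start, side="left")
    end = np.searchsorted(timestamps, t_start + duration, side="left")
    return events[start:end]


# Hypothetical events, timestamps in seconds
events = np.array([
    [10, 20, 0.00, 1],
    [11, 20, 0.05, -1],
    [12, 21, 0.15, 1],
    [13, 21, 0.26, 1],
    [14, 22, 0.40, -1],
])

clip = clip_events(events, t_start=0.05, duration=0.2)
print(clip)
```

Because the timestamps are sorted, a binary search finds both window boundaries without scanning the whole sequence, which matters when a recording contains millions of events.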


### Data Renderer
Write the data renderer code in the code block below.

### Data Clipping
Clip the sequence in the code block below.

### Questions
1. What could cause an event to have a positive or negative polarity?
2. Given the timestamp resolution, calculate the maximum theoretical event rate this sensor can handle. How would this impact applications like high-speed object tracking?
3. Event-based sensors can produce noise events (false positives) due to thermal noise or flickering light sources. Based on the initial event samples, how could you identify and filter out noise without significantly impacting true events?

## **Part 2:** Data Manipulation
Event-based data is **asynchronous**, meaning that, unlike frame-based cameras, EBCs do not output data at a fixed frequency. This can be a problem for traditional neural network architectures and computer vision algorithms, which expect input at a fixed rate. Moreover, EBCs are considered a paradigm shift because they do not output traditional images. As such, if we want to work with convolutional neural networks (CNNs), this event-based data (the array you printed previously) should be converted to an **image-like** equivalent. This is what we call events-to-image encoding, which converts event-based data to a traditional image (grayscale, RGB, or 2-channel).

For this lab, let's start by creating a simple events image. Using the **events** data you printed earlier, we want to place each event at its pixel location, with each channel holding a different polarity. As such, the output image will have the shape **(C, H, W)**, where:
* **C:** The channel dimension (its value is 2). The first channel holds **positive** polarities, whereas the second channel holds **negative** polarities.
* **H:** The **height** dimension, which corresponds to the y-dimension of the sensor.
* **W:** The **width** dimension, which corresponds to the x-dimension of the sensor.
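
One possible way to build such an image is sketched below. It assumes the event components are available as separate integer arrays `x`, `y`, and `p` (hypothetical sample values) and that each pixel accumulates a count of its events. Counting is a design choice made here for illustration; a binary image that only marks whether a pixel fired is an equally valid variant.

```python
import numpy as np


def events_to_image(x, y, p, height=260, width=346):
    """Accumulate events into a (2, H, W) count image.

    Channel 0 counts positive events, channel 1 counts negative events.
    """
    image = np.zeros((2, height, width), dtype=np.int32)
    channel = np.where(p > 0, 0, 1)
    # np.add.at handles repeated indices correctly, unlike `image[...] += 1`
    np.add.at(image, (channel, y, x), 1)
    return image


# Hypothetical events: two positive events share the pixel (x=5, y=3)
x = np.array([5, 5, 7, 100])
y = np.array([3, 3, 9, 200])
p = np.array([1, 1, -1, 1])

image = events_to_image(x, y, p)
print(image.shape)
```

Note that the image is indexed as `image[channel, y, x]`: the row index comes from the y-coordinate and the column index from the x-coordinate.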