# Chapter 17 - Event based vision

Feature-based Visual Odometry has been a field of research for a few decades now. Great improvements in efficiency and robustness were made, while accuracy stayed roughly the same for most of the time between 1980 and 2007. It was the combination of VO and *IMU* sensors that raised accuracy significantly, while improving robustness as well. The latest big step in accuracy and efficiency came with the use of event-based cameras in 2014, which is why we will take a deeper look at event cameras in this article.

As we have seen in the last article, IMUs are very helpful for predicting the camera's position when the camera moves so quickly that images become blurry and untrackable. However, the IMU is an intrinsic sensor, meaning it does not look at the world around it but measures data internally. We therefore still have the downsides of cameras, which we can hardly overcome with intrinsic measurements. Sure, the IMU can help with high-speed motion up to a certain degree, and probably with dynamic environments as well. But it cannot help with **High Dynamic Range** scenes, the long **latency** of the camera sensor, or **Low-texture** scenes. These challenges can all be overcome by using event cameras.

## Dynamic Vision Sensor

Event-Based Cameras (EBC) are inspired by the human eye. While traditional cameras capture one frame every x microseconds, the human eye has about 130 million photoreceptors that do not forward signals to their axons periodically, but only when they register a change in their receptive field. This leads to a very low information flow, since the eye is mostly still and moves only a fraction of the time. **Dynamic Vision Sensors (DVS)** work the same way. They are image sensors with:

- **Low latency** (~1 microsecond)
- **High dynamic range (HDR)** (140 dB instead of 60 dB)
- **High refresh rate** (1 MHz)
- **Low power** (0.01 W compared to 1 W)

They are so special because they do not register frames every few milliseconds, but output a high-frequency signal only for those pixels that registered an intensity change above a certain threshold. So if you film a static scene with an event camera, it will NOT produce any output signal, since no pixel registers a change. Only when you move the camera will the pixels whose value changes significantly produce an output, and only for the duration of the movement.

In the following image, you can see the output over time of a standard camera and an event-based camera, both filming a rotating plate with a dark marker on it. We can note a few things here. First, the output of the standard camera is a sequence of whole frames (images) over time, while the output of the event-based camera consists of individual pixels over time, at a much higher temporal density. Second, the event-based camera does not register the central point of the spinning plate, since it does not change. Third, the DVS output pixels are either red or blue: red marks an increase in intensity, blue a decrease. The output of the DVS is therefore a set of asynchronous events, one whenever a pixel registers a change in intensity. An event can be characterized by (time, (u,v), sign).

![Rotating plate, filmed by std and event-based camera](https://github.com/joelbarmettlerUZH/PyVisualOdometry/raw/master/img/chapter_17/cameras_filming_rotating_plate.png)
*Figure 1: Rotating plate, filmed by std and event-based camera. [source](http://rpg.ifi.uzh.ch/docs/teaching/2019/http://rpg.ifi.uzh.ch/docs/teaching/2019/14_event_based_vision.pdf)*
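The (time, (u,v), sign) characterization maps naturally onto a small record type. The following is a minimal sketch: the class name `Event`, its field names, the use of seconds for the timestamp, and the sample values are all assumptions made for illustration, not the format of any particular sensor or driver.

```python
from typing import NamedTuple


class Event(NamedTuple):
    """One DVS event: timestamp, pixel coordinates, and polarity."""
    t: float       # timestamp in seconds (assumed unit)
    u: int         # pixel column
    v: int         # pixel row
    polarity: int  # +1 for an intensity increase, -1 for a decrease


# A tiny hand-made stream, ordered by time (values are made up).
events = [
    Event(t=0.000012, u=64, v=40, polarity=+1),
    Event(t=0.000015, u=65, v=40, polarity=-1),
    Event(t=0.000021, u=64, v=41, polarity=+1),
]

# Events are independent records, so filtering by polarity is trivial.
on_events = [e for e in events if e.polarity > 0]
```

Because every event carries its own timestamp, there is no notion of a frame in this representation; any frame-like view has to be built by grouping events afterwards.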

## Event based Cameras

Event-based cameras are simply cameras that use a DVS sensor. To understand the camera's functionality in detail, we will examine one single pixel over time and see how it creates events. This is a valid simplification since pixels are asynchronous and independent of each other anyway, so by looking at a single pixel, we can explain the camera's functionality as a whole.

First of all, we plot the events generated by a single pixel of an event camera over time, together with a curve showing the light intensity it receives. The intensity *I* is measured on a logarithmic scale. An event is generated whenever the log intensity increases or decreases by a constant threshold C. Positive changes result in a red event *(ON-event)*, negative changes in a blue one *(OFF-event)*. For constant intensity, no events are triggered. An event camera therefore samples the intensity, while a standard camera samples the time.

![Sampling (Event Camera, Std. Camera)](https://github.com/joelbarmettlerUZH/PyVisualOdometry/raw/master/img/chapter_17/difference_in_sampling.png)
*Figure 2: Sampling (Event Camera, Std. Camera). [source](http://rpg.ifi.uzh.ch/docs/teaching/2019/http://rpg.ifi.uzh.ch/docs/teaching/2019/14_event_based_vision.pdf)*
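The threshold-crossing behaviour of a single pixel can be sketched in a few lines. This is a simplified simulation under stated assumptions, not a model of real sensor electronics: the function name `generate_events`, the threshold value `C=0.2`, and the sampled intensity values are hypothetical, and the continuous intensity curve is approximated by discrete samples.

```python
import math


def generate_events(times, intensities, C=0.2):
    """Return (t, polarity) events for one simulated pixel.

    An event fires each time the log intensity has moved by C away
    from the reference level stored at the previous event.
    """
    events = []
    ref = math.log(intensities[0])  # reference log intensity
    for t, intensity in zip(times[1:], intensities[1:]):
        log_i = math.log(intensity)
        # A large jump between two samples may cross the threshold
        # several times, so each loop can emit more than one event.
        while log_i - ref >= C:
            ref += C
            events.append((t, +1))  # ON-event
        while ref - log_i >= C:
            ref -= C
            events.append((t, -1))  # OFF-event
    return events


times = [0, 1, 2, 3, 4]
intensities = [1.0, 1.0, math.exp(0.5), math.exp(0.5), math.exp(0.1)]
pixel_events = generate_events(times, intensities)
# pixel_events == [(2, 1), (2, 1), (4, -1)]
```

Note that the samples at `t = 1` and `t = 3` produce nothing, because the intensity did not change: the pixel is sampled by intensity level, not by time.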

This explains why event cameras have such a high refresh rate: the sensor does not have to process all pixels every few milliseconds, but only the ones triggering events, and it does so at a much higher rate. The question remains why event-based cameras are superior to normal ones with regard to dynamic range. Remember that a high dynamic range means that a camera can see very bright and very dark regions at the same time. Since event-based pixels are asynchronous, we do not need a global shutter time that defines the exposure. Instead, each pixel only keeps track of the current intensity in comparison to the intensity at its last threshold crossing. When a new event is generated by a large enough intensity change, the reference is reset and the comparison starts from this new anchor point. The range to examine is therefore much smaller and handled per pixel, which makes it easy to deal with a high dynamic range within the scene.

Using this knowledge, we can now examine a picture taken by an event-based camera that is rotating to the right. The result is an image with blue pixels wherever bright pixels became darker, and red pixels wherever dark pixels became brighter. We can see that an event-based camera is therefore automatically an edge detector.

![Picture taken by an event based camera](https://github.com/joelbarmettlerUZH/PyVisualOdometry/raw/master/img/chapter_17/event_based_camera_picture.png)
*Figure 3: Picture taken by an event based camera. [source](http://rpg.ifi.uzh.ch/docs/teaching/2019/http://rpg.ifi.uzh.ch/docs/teaching/2019/14_event_based_vision.pdf)*

Note that in order to create this picture, we had to accumulate all events over a specific period of time (in this case 40 ms), otherwise the output would be too sparse and barely perceivable for us. 

Another method to visualize the output of an event-based camera is to aggregate (sum up) all positive (+1) and negative (-1) events in a given time interval and display the result as a greyscale image.

![Aggregated event image](https://github.com/joelbarmettlerUZH/PyVisualOdometry/raw/master/img/chapter_17/aggregated_event_image.png)
*Figure 3: Aggregated event image. [source](http://rpg.ifi.uzh.ch/docs/teaching/2019/http://rpg.ifi.uzh.ch/docs/teaching/2019/14_event_based_vision.pdf)*
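The aggregation described above can be sketched as follows. This is an illustrative example only: the function names `accumulate_events` and `to_greyscale`, the event tuple layout `(t, u, v, polarity)`, and the mapping to the 0..255 range are assumptions, and plain nested lists stand in for a proper image array.

```python
def accumulate_events(events, height, width, t_start, t_end):
    """Sum event polarities per pixel over the window [t_start, t_end)."""
    image = [[0] * width for _ in range(height)]
    for t, u, v, polarity in events:
        if t_start <= t < t_end:
            image[v][u] += polarity  # row index is v, column index is u
    return image


def to_greyscale(image):
    """Map summed polarities to 0..255 with 'no change' at mid-grey."""
    peak = max(max(abs(p) for p in row) for row in image) or 1
    return [[round(127.5 + 127.5 * p / peak) for p in row] for row in image]


# Made-up events as (t, u, v, polarity); the last one falls outside the window.
events = [
    (0.010, 1, 0, +1),
    (0.020, 1, 0, +1),
    (0.030, 2, 1, -1),
    (0.050, 0, 0, +1),
]
summed = accumulate_events(events, height=2, width=3, t_start=0.0, t_end=0.040)
grey = to_greyscale(summed)
```

Pixels without events end up at mid-grey, pixels with mostly ON-events become brighter, and pixels with mostly OFF-events become darker. The length of the window (40 ms here) trades off sparsity against motion blur in the resulting image.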

Event-based cameras are great for IoT applications since they are low-power and produce little data. They are also used in automotive applications, where high dynamic range and low memory requirements are key, in AR/VR for low-latency and low-power reasons, and in various industries involving fast-moving parts.

In many regards, event-based cameras are even superior to professional high-speed cameras.
![Comparing High-speed, Standard and Event-Based Cameras](https://github.com/joelbarmettlerUZH/PyVisualOdometry/raw/master/img/chapter_17/high_speed_camera_comparison.png)
*Figure 4: Comparing High-speed, Standard and Event-Based Cameras. [source](http://rpg.ifi.uzh.ch/docs/teaching/2019/http://rpg.ifi.uzh.ch/docs/teaching/2019/14_event_based_vision.pdf)*

## Traditional VO Algorithms on EB-Cameras
Event-based cameras are still pinhole cameras after all, so most algorithms should work for EB-cameras as well. Indeed, we can calibrate an event-based camera using a grid with corner detection, with the only difference that the grid must be blinking in order to generate constant ON-OFF-ON events. Optical flow algorithms work the same way: all event pixels follow the direction of movement. An edge moving at constant speed in one direction produces a line of events. Plotted over time, we see a plane of events whose slope corresponds to the speed **v = dx / dt**, with dx being the traveled distance and dt the elapsed time.

![Moving Edge over Time](https://github.com/joelbarmettlerUZH/PyVisualOdometry/raw/master/img/chapter_17/moving_edge.png)
*Figure 5: Moving Edge over Time. [source](http://rpg.ifi.uzh.ch/docs/teaching/2019/http://rpg.ifi.uzh.ch/docs/teaching/2019/14_event_based_vision.pdf)*
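To make the relation v = dx / dt concrete, the speed of a moving edge can be recovered by fitting a straight line through its events. The sketch below is a simplified 1D version of the plane fit (only the x coordinate and time are used); the function name `edge_speed` and the sample events are hypothetical, and ordinary least squares is just one possible fitting choice.

```python
def edge_speed(events):
    """Least-squares slope dx/dt of a line fitted to (t, x) event pairs."""
    n = len(events)
    mean_t = sum(t for t, _ in events) / n
    mean_x = sum(x for _, x in events) / n
    cov_tx = sum((t - mean_t) * (x - mean_x) for t, x in events)
    var_t = sum((t - mean_t) ** 2 for t, _ in events)
    return cov_tx / var_t  # pixels per second if t is in seconds


# Made-up events of an edge that advances one pixel per millisecond.
edge_events = [(0.000, 10), (0.001, 11), (0.002, 12), (0.003, 13)]
speed = edge_speed(edge_events)  # roughly 1000 pixels per second
```

With real data, the events of a moving edge scatter around the line (or plane), so a fit over many events gives a more stable estimate than dividing a single dx by a single dt.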

### Fundamentals of EB-Cameras
The most fundamental theorem of EB-cameras is that the negative gradient of the **logarithmic intensity dL(x, y)**, multiplied by the **motion u** = (du, dv), equals the **contrast threshold C** at which we sample the intensity values.

To put it in short: **-dL * u = C**
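A small numeric check can help build intuition for this relation. The sketch below evaluates the left-hand side -dL * u as a dot product for two motions. It assumes that u is the pixel displacement since the last event, and the function name `brightness_change`, the gradient, and the threshold value are made up for illustration.

```python
def brightness_change(grad_L, u):
    """Return -grad_L . u, the left-hand side of the relation above."""
    return -(grad_L[0] * u[0] + grad_L[1] * u[1])


C = 0.2              # contrast threshold (illustrative value)
grad_L = (0.4, 0.0)  # log-intensity gradient, non-zero only in u-direction

# Motion in the direction of the gradient: the change reaches C, an event fires.
across = brightness_change(grad_L, u=(-0.5, 0.0))

# Motion perpendicular to the gradient: no change, so no event is triggered.
along = brightness_change(grad_L, u=(0.0, -0.5))
```

Only the component of the motion that lies along the gradient contributes to the brightness change, which is why the same speed can trigger an event in one direction and nothing in another.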

Note that the gradient dL always follows the direction of the edge, but that need not always be the direction of movement. So we factor in the intensity change at the pixel itself, which gives us