# 3D Object Detection via Sensor Fusion (Lidar and Camera) 

Originally written by: **PixelOverflow**

If you prefer to follow along on video, the tutorial series is available at the following links:
### Sensor Fusion Tutorial:

- Part 1 - [3D Object Detection Overview](https://www.youtube.com/watch?v=hXpXKRnnM9o&t=0s)
- Part 2 - [Coordinate Transformations](https://www.youtube.com/watch?v=EfiYr61RGUA&t=0s) 
- Part 3 - [Loading Calibration Data](https://www.youtube.com/watch?v=pRAPXfWy-3A&t=0s)     
- Part 4 - [Sensor Fusion Pipeline](https://www.youtube.com/watch?v=vVtpKzEwEFM&t=0s)  
- Part 5 - [Check the Math](https://www.youtube.com/watch?v=lpjQnIrnt20&t=0s)  

In this tutorial we will dive into the KITTI dataset and detect objects in 3D using sensor fusion. Early sensor fusion (or early fusion) fuses raw data from multiple sources and then performs detection. Late fusion, on the other hand, first detects objects and then fuses the detections. Here we will perform a modified fusion, where we detect objects in the camera images and then fuse their centers with the LiDAR data to get depth.

The main steps are summarized as:

- Detect objects in the camera images (Detection)
- Project 3D LiDAR point clouds to 2D Image space (Fusion)
- Associate LiDAR depth with each Detected object (Association to get Depth)

Detection in 3D is much more useful to an autonomous vehicle than detection in 2D, since 3D detection lets the system know where objects are physically located in the world.
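
The fusion and association steps above can be sketched in a few lines of NumPy. This is only a minimal illustration, not the implementation used later in the tutorial: the function names `project_lidar_to_image` and `depth_at_centers` are hypothetical, the matrix `P` is assumed to be a 3x4 matrix that already maps LiDAR coordinates to image coordinates, and the toy points are assumed to be expressed in a camera-style frame (z pointing forward) so that a plain pinhole matrix can stand in for the real calibration.

```python
import numpy as np


def project_lidar_to_image(points_xyz, P):
    """Project (N, 3) points to pixels with a 3x4 projection matrix P.

    Returns (uv, depth) for the points that lie in front of the camera.
    """
    n = points_xyz.shape[0]
    homogeneous = np.hstack([points_xyz, np.ones((n, 1))])  # (N, 4)
    projected = homogeneous @ P.T                            # (N, 3)
    depth = projected[:, 2]
    in_front = depth > 0                                     # drop points behind the camera
    uv = projected[in_front, :2] / depth[in_front, None]     # perspective divide
    return uv, depth[in_front]


def depth_at_centers(centers_uv, uv, depth):
    """For each detection center, take the depth of the nearest projected point."""
    # Pairwise pixel distances, shape (num_centers, num_points)
    dists = np.linalg.norm(centers_uv[:, None, :] - uv[None, :, :], axis=2)
    return depth[np.argmin(dists, axis=1)]


# Toy pinhole matrix (assumed values): focal length 100 px, principal point (50, 50)
P = np.array([[100.0, 0.0, 50.0, 0.0],
              [0.0, 100.0, 50.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

# Three synthetic points; the last one is behind the camera and gets dropped
points = np.array([[0.0, 0.0, 10.0],
                   [1.0, 0.0, 5.0],
                   [0.0, 0.0, -2.0]])

# Synthetic detection centers in pixel coordinates
centers = np.array([[69.0, 51.0],
                    [50.0, 49.0]])

uv, depth = project_lidar_to_image(points, P)
depths = depth_at_centers(centers, uv, depth)
print(uv)
print(depths)
```

Taking the single nearest projected point is the simplest possible association rule; it is sensitive to stray points, so a more robust variant might take the median depth of the points that fall inside each bounding box.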


For more information, a readme for the KITTI data can be found [here](https://github.com/yanii/kitti-pcl/blob/master/KITTI_README.TXT), and a paper that details the data collection and coordinate systems can be found [here](http://www.cvlibs.net/publications/Geiger2013IJRR.pdf).


Now let's get the data and get started.

### Data preparation

In [1]:
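# Download the raw drive data and the calibration archive that is unzipped in the next cell
!wget https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_10_03_drive_0047/2011_10_03_drive_0047_sync.zip
!wget https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_10_03_calib.zip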


--2024-08-30 16:08:06--  https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_10_03_drive_0047/2011_10_03_drive_0047_sync.zip
Resolving s3.eu-central-1.amazonaws.com (s3.eu-central-1.amazonaws.com)... 3.5.135.248, 52.219.170.29, 3.5.134.36, ...
Connecting to s3.eu-central-1.amazonaws.com (s3.eu-central-1.amazonaws.com)|3.5.135.248|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3103291675 (2,9G) [application/zip]
Saving to: ‘2011_10_03_drive_0047_sync.zip’


2024-08-30 16:09:34 (33,8 MB/s) - ‘2011_10_03_drive_0047_sync.zip’ saved [3103291675/3103291675]



In [6]:
# Unzip them
!unzip  2011_10_03_drive_0047_sync.zip
!unzip  2011_10_03_calib.zip

Archive:  2011_10_03_drive_0047_sync.zip
   creating: 2011_10_03/2011_10_03_drive_0047_sync/oxts/
   creating: 2011_10_03/2011_10_03_drive_0047_sync/oxts/data/
 extracting: 2011_10_03/2011_10_03_drive_0047_sync/oxts/data/0000000339.txt  
 extracting: 2011_10_03/2011_10_03_drive_0047_sync/oxts/data/0000000227.txt  
 extracting: 2011_10_03/2011_10_03_drive_0047_sync/oxts/data/0000000468.txt  
 extracting: 2011_10_03/2011_10_03_drive_0047_sync/oxts/data/0000000284.txt  
 extracting: 2011_10_03/2011_10_03_drive_0047_sync/oxts/data/0000000736.txt  
 extracting: 2011_10_03/2011_10_03_drive_0047_sync/oxts/data/0000000187.txt  
 extracting: 2011_10_03/2011_10_03_drive_0047_sync/oxts/data/0000000447.txt  
 extracting: 2011_10_03/2011_10_03_drive_0047_sync/oxts/data/0000000270.txt  
 extracting: 2011_10_03/2011_10_03_drive_0047_sync/oxts/data/0000000134.txt  
 extracting: 2011_10_03/2011_10_03_drive_0047_sync/oxts/data/0000000410.txt  
 extracting: 2011_10_03/2011_10_03_drive_0047_sync/oxts/data

### Base Library Import

In [2]:
# Base Library Import
import os
from glob import glob
import cv2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline
plt.rcParams["figure.figsize"] = (20, 10)

### Import KITTI Utility functions

In [7]:
!wget https://github.com/itberrios/CV_tracking/raw/main/kitti_tracker/kitti_utils.py
from kitti_utils import *

--2024-08-30 16:12:17--  https://github.com/itberrios/CV_tracking/raw/main/kitti_tracker/kitti_utils.py
Resolving github.com (github.com)... 140.82.121.3
Connecting to github.com (github.com)|140.82.121.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/itberrios/CV_tracking/main/kitti_tracker/kitti_utils.py [following]
--2024-08-30 16:12:17--  https://raw.githubusercontent.com/itberrios/CV_tracking/main/kitti_tracker/kitti_utils.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9759 (9,5K) [text/plain]
Saving to: ‘kitti_utils.py’


2024-08-30 16:12:18 (77,8 MB/s) - ‘kitti_utils.py’ saved [9759/9759]



### Data Overview
In the KITTI raw dataset we get images from four cameras (two grayscale and two RGB), point clouds from the Velodyne LiDAR, and measurements from the OXTS GPS navigation system.

The update rates are as follows:

- RGB camera: 15 Hz (15 fps)
- OXTS GPS navigation system: 100 Hz
- Velodyne LiDAR: 10 Hz

The data is synced to the LiDAR, since it has the lowest update rate, but the synchronization between the camera, GPS/IMU (navigation), and LiDAR is not exact, even though we are using the synced raw data. Per the KITTI [description](http://www.cvlibs.net/publications/Geiger2013IJRR.pdf), the time difference between the camera/Velodyne and the GPS/IMU is at most 5 ms. More precise measurements can be obtained with interpolation, but for simplicity we will neglect these differences, since the small error from the imprecise sync will not greatly impact our measurements. We will see later, when we project LiDAR points onto the camera images, that there is no noticeable difference.
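
If you want to check the sync error yourself, one option is to compare per-frame timestamps from two sensors. The sketch below is only an illustration: the helper names `parse_kitti_timestamp` and `worst_offset_ms` are hypothetical, the timestamp strings are synthetic values made up for the example, and the format is assumed to be a date and time followed by a nanosecond fraction, one line per frame.

```python
from datetime import datetime


def parse_kitti_timestamp(line):
    """Parse one timestamp line such as '2011-10-03 12:55:34.997992704'.

    datetime only keeps microseconds, so the fractional part is truncated
    to 6 digits, which is ample for millisecond-level checks.
    """
    return datetime.strptime(line.strip()[:26], "%Y-%m-%d %H:%M:%S.%f")


def worst_offset_ms(lines_a, lines_b):
    """Largest absolute per-frame time difference, in milliseconds."""
    offsets = [
        abs((parse_kitti_timestamp(a) - parse_kitti_timestamp(b)).total_seconds()) * 1e3
        for a, b in zip(lines_a, lines_b)
    ]
    return max(offsets)


# Synthetic timestamps (not taken from the dataset), one entry per frame
velodyne_lines = ["2011-10-03 12:55:34.997992704",
                  "2011-10-03 12:55:35.101300000"]
oxts_lines = ["2011-10-03 12:55:35.000500000",
              "2011-10-03 12:55:35.100000000"]

worst = worst_offset_ms(velodyne_lines, oxts_lines)
print(f"worst offset: {worst:.3f} ms")
```

On the real data, the two lists would be read from each sensor's timestamp file; the file locations are left out here because they depend on how the archive was extracted.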