# Peking University/Baidu - Autonomous Driving
Can you predict vehicle angle in different settings?

[https://www.kaggle.com/c/pku-autonomous-driving/overview](https://www.kaggle.com/c/pku-autonomous-driving/overview)

## Description
Who do you think hates traffic more - humans or self-driving cars? The position of nearby automobiles is a key question for autonomous vehicles, and it's at the heart of our newest challenge.

Self-driving cars have come a long way in recent years, but they're still not flawless. Consumers and lawmakers remain wary of adoption, in part because of doubts about vehicles’ ability to accurately perceive objects in traffic.

Baidu's Robotics and Autonomous Driving Lab (RAL), along with Peking University, hopes to close the gap once and for all with this challenge. They’re providing Kagglers with more than 60,000 labeled 3D car instances from 5,277 real-world images, based on industry-grade CAD car models.

Your challenge: develop an algorithm to estimate the absolute pose of vehicles (6 degrees of freedom) from a single image in a real-world traffic environment.

Succeed and you'll help improve computer vision. That, in turn, will bring autonomous vehicles a big step closer to widespread adoption, so they can help reduce the environmental impact of our growing societies.

## Evaluation

Submissions are evaluated on [mean average precision](https://en.wikipedia.org/wiki/Evaluation_measures_(information_retrieval)#Mean_average_precision) between the predicted pose information and the correct position and rotation.

We use the following C# code to determine the translation and rotation distances:

```c#
    public static double RotationDistance(Object3D o1, Object3D o2)
    {
        Quaternion q1 = Quaternion.CreateFromYawPitchRoll(o1.yaw, o1.pitch, 
             o1.roll);
        Quaternion q2 = Quaternion.CreateFromYawPitchRoll(o2.yaw, o2.pitch, 
             o2.roll);
        Quaternion diff = Quaternion.Normalize(q1) * 
             Quaternion.Inverse(Quaternion.Normalize(q2));

        diff.W = Math.Clamp(diff.W, -1.0f, 1.0f);

        return Object3D.RadianToDegree( Math.Acos(diff.W) );
    }

    public static double TranslationDistance(Object3D o1, Object3D o2)
    {
        var dx = o1.x - o2.x;
        var dy = o1.y - o2.y;
        var dz = o1.z - o2.z;

        return Math.Sqrt(dx * dx + dy * dy + dz * dz);
    }
```
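For readers working in Python, the two distances can be sketched with NumPy as below. This is an approximation under stated assumptions, not the official scorer: `quaternion_from_yaw_pitch_roll` mirrors what `Quaternion.CreateFromYawPitchRoll` in .NET's `System.Numerics` is understood to compute, and all function names here are hypothetical. For unit quaternions, the `W` component of `q1 * inverse(q2)` equals their dot product, so the sketch uses the dot product directly.

```python
import math

import numpy as np


def quaternion_from_yaw_pitch_roll(yaw, pitch, roll):
    """Return a unit quaternion as (x, y, z, w).

    Assumption: follows the System.Numerics convention
    (yaw about Y, pitch about X, roll about Z), angles in radians.
    """
    sy, cy = math.sin(yaw / 2), math.cos(yaw / 2)
    sp, cp = math.sin(pitch / 2), math.cos(pitch / 2)
    sr, cr = math.sin(roll / 2), math.cos(roll / 2)
    return np.array([
        cy * sp * cr + sy * cp * sr,  # x
        sy * cp * cr - cy * sp * sr,  # y
        cy * cp * sr - sy * sp * cr,  # z
        cy * cp * cr + sy * sp * sr,  # w
    ])


def rotation_distance(pose1, pose2):
    """Rotation distance in degrees between two (yaw, pitch, roll) tuples."""
    q1 = quaternion_from_yaw_pitch_roll(*pose1)
    q2 = quaternion_from_yaw_pitch_roll(*pose2)
    # W of q1 * inverse(q2) equals dot(q1, q2) for unit quaternions.
    w = float(np.clip(np.dot(q1, q2), -1.0, 1.0))
    return math.degrees(math.acos(w))


def translation_distance(xyz1, xyz2):
    """Euclidean distance between two (x, y, z) positions."""
    diff = np.asarray(xyz1, dtype=float) - np.asarray(xyz2, dtype=float)
    return float(np.linalg.norm(diff))
```

Because `acos(W)` is half of the geometric rotation angle, a pure 90-degree yaw difference yields a distance of 45 under this formula.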

We then take the resulting distances between all pairs of objects, determine which predicted objects are closest to solution objects, and apply thresholds for both translation and rotation. Confidence scores are used to sort submission objects. Rotation angles in the data are given in radians, while the rotation distance computed above is converted to degrees (via `RadianToDegree`); translation is in meters.

If both of the distances between prediction and solution (as calculated above) are less than the threshold, then that prediction object is counted as a true positive for that threshold. If not, the predicted object is counted as a false positive for that threshold.

Finally, mAP is calculated using these TP/FP determinations across all thresholds.

The thresholds are as follows:

Rotation: `50, 45, 40, 35, 30, 25, 20, 15, 10, 5`

Translation: `0.1, 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01`
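As a rough illustration of the thresholding step, the sketch below checks one matched prediction against each threshold. It assumes the rotation and translation thresholds are paired in the order listed (50 with 0.1, 45 with 0.09, and so on); the description does not state the pairing explicitly, and the function name is hypothetical.

```python
ROTATION_THRESHOLDS = [50, 45, 40, 35, 30, 25, 20, 15, 10, 5]
TRANSLATION_THRESHOLDS = [0.1, 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01]


def true_positive_flags(rotation_dist, translation_dist):
    """Return one True/False flag per threshold pair for a matched prediction.

    Assumption: the i-th rotation threshold is paired with the i-th
    translation threshold; both distances must be below their threshold.
    """
    return [
        rotation_dist < rot_thr and translation_dist < trans_thr
        for rot_thr, trans_thr in zip(ROTATION_THRESHOLDS, TRANSLATION_THRESHOLDS)
    ]


# A prediction 12 degrees and 0.035 m away from its matched solution object.
flags = true_positive_flags(12.0, 0.035)
```

Each `False` entry corresponds to the prediction being counted as a false positive at that threshold.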

### Kernel Submissions
You can make submissions directly from Kaggle Kernels. By adding your teammates as collaborators on a kernel, you can share and edit code privately with them.

### Submission File
For each image ID in the test set, you must predict a pose (position and rotation) for all unmasked cars in the image. The file should contain a header and have the following format:

```txt
ImageId,PredictionString
ID_1d7bc9b31,0.5 0.25 0.5 0.0 0.5 0.0 1.0
ID_f9c21a4e3,0.5 0.5 0.5 0.0 0.0 0.0 0.9
ID_e83dd7c22,0.5 0.5 0.5 0.0 0.0 0.0 1.0
ID_1a050c9a4,0.5 0.5 0.5 0.0 0.0 0.0 0.25
ID_d943d1083,0.5 0.5 0.5 0.0 0.0 0.0 1.0 0.5 0.5 0.5 0.0 0.0 0.0 1.0
ID_3155084f7,0.5 0.5 0.5 0.0 0.0 0.0 1.0
ID_f74dcaa3d,0.5 0.5 0.5 0.0 0.0 0.0 1.0
ID_b183b55dd,0.5 0.5 0.5 0.0 0.0 0.0 1.0
ID_ff5ea7211,0.5 0.5 0.5 0.0 0.0 0.0 1.0
```

Each 7-value element in `PredictionString` corresponds to `pitch`, `yaw`, `roll`, `x`, `y`, `z`, and `confidence` for one car in the scene.
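A minimal sketch of building such a string from per-car predictions is shown below. The helper name `encode_prediction_string` and the dictionary layout are hypothetical; the field order follows the sentence above.

```python
FIELD_ORDER = ("pitch", "yaw", "roll", "x", "y", "z", "confidence")


def encode_prediction_string(cars):
    """Join the 7 values of every car into one space-separated string."""
    values = []
    for car in cars:
        values.extend(str(float(car[name])) for name in FIELD_ORDER)
    return " ".join(values)


cars = [
    {"pitch": 0.5, "yaw": 0.25, "roll": 0.5,
     "x": 0.0, "y": 0.5, "z": 0.0, "confidence": 1.0},
]
prediction_string = encode_prediction_string(cars)
```

An image with several cars simply concatenates more 7-value groups into the same string, as in the `ID_d943d1083` row of the sample.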

## Timeline

January 14, 2020 - Entry deadline. You must accept the competition rules before this date in order to compete.

January 14, 2020 - Pre-trained model and external data disclosure deadline. Participants must disclose any external data or pre-trained models used in the official forum thread in adherence with competition rules.

January 14, 2020 - Team merger deadline. This is the last day participants may join or merge teams.

January 21, 2020 - Final submission deadline.

All deadlines are at 11:59 PM UTC on the corresponding day unless otherwise noted. The competition organizers reserve the right to update the contest timeline if they deem it necessary.

## Data

This dataset contains photos of streets, taken from the roof of a car. We're attempting to predict the position and orientation of all unmasked cars in the test images. You should also provide a `confidence` score indicating how sure you are of your prediction.

### Pose Information (train.csv)

**Note that rotation values are angles expressed in radians, relative to the camera.**

The primary data is images of cars and related `pose` information. The `pose` information is formatted as strings, as follows:

`model type, yaw, pitch, roll, x, y, z`

A concrete example with two cars in the photo:

`5 0.5 0.5 0.5 0.0 0.0 0.0 32 0.25 0.25 0.25 0.5 0.4 0.7`
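One compact way to read such a string is to reshape it into one row per car, as sketched below with NumPy. The function name `pose_string_to_array` is hypothetical; the column order is the one given above.

```python
import numpy as np


def pose_string_to_array(pose_string):
    """Return an (n_cars, 7) array: model type, yaw, pitch, roll, x, y, z."""
    values = np.array([float(v) for v in pose_string.split()])
    if values.size % 7 != 0:
        raise ValueError("expected a multiple of 7 values")
    return values.reshape(-1, 7)


example = "5 0.5 0.5 0.5 0.0 0.0 0.0 32 0.25 0.25 0.25 0.5 0.4 0.7"
poses = pose_string_to_array(example)
model_types = poses[:, 0].astype(int)  # first column holds the model type
```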

Submissions (per `sample_submission.csv`) are very similar, with the addition of a confidence score, and the removal of the model type. You are not required to predict the model type of the vehicle in question.

`ID, PredictionString`

`ID_1d7bc9b31,0.5 0.5 0.5 0.0 0.0 0.0 1.0`

indicating that this prediction has a `confidence` score of `1.0`.

### Image Masks (test_masks.zip / train_masks.zip)
Some cars in the images are not of interest (too far away, etc.). Binary masks are provided to allow competitors to remove them from consideration.
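The sketch below shows one way a mask might be applied before training or inference. It assumes that bright mask pixels mark the cars to ignore and that the mask has the same height and width as the image; neither is stated here, so both should be checked against the actual files. The function name is hypothetical.

```python
import numpy as np


def apply_ignore_mask(img, mask, threshold=128):
    """Return a copy of `img` with masked-out pixels set to zero.

    Assumption: mask pixels brighter than `threshold` mark ignored cars.
    A missing mask (None) leaves the image unchanged.
    """
    if mask is None:
        return img.copy()
    if mask.ndim == 3:
        # Collapse a color mask to a single channel.
        mask = mask.max(axis=2)
    out = img.copy()
    out[mask > threshold] = 0
    return out


# Tiny synthetic example: a 2x2 color image with one masked pixel.
img = np.full((2, 2, 3), 7, dtype=np.uint8)
mask = np.array([[255, 0], [0, 0]], dtype=np.uint8)
masked = apply_ignore_mask(img, mask)
```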

### Car Models
3D models of all cars of interest are available for download as pickle files - they can be compared against cars in images, used as references for rotation, etc.

The pickles were created in Python 2. For Python 3 users, the following code will load a given model:
```python
import pickle

with open(model, "rb") as file:
    car_model = pickle.load(file, encoding="latin1")
```

### File descriptions
- train.csv - pose information for all of the images in the training set.
- train_images.zip - images in the training set.
- train_masks.zip - ignore masks for the training set. (Not all images have a mask.)
- test_images.zip - images in the test set.
- test_masks.zip - ignore masks for the test set. (Not all images have a mask.)
- sample_submission.csv - a sample submission file in the correct format
    - ImageId - a unique identifier for each image (and related mask, if one exists).
    - PredictionString - a collection of poses and confidence scores.
- car_models.zip - 3D models of the unmasked cars in the training / test images. They can be used for pose estimation, etc.
- camera.zip - camera intrinsic parameters.
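Camera intrinsics are typically used to relate the 3D positions (`x`, `y`, `z`) to pixel coordinates through a pinhole projection. The sketch below illustrates that relation; the matrix values are placeholders chosen for illustration, not the contents of `camera.zip`, and the axis convention (x right, y down, z forward) is an assumption.

```python
import numpy as np


def project_to_image(points_xyz, camera_matrix):
    """Project camera-frame 3D points to pixel coordinates (u, v)."""
    points_xyz = np.asarray(points_xyz, dtype=float).reshape(-1, 3)
    projected = points_xyz @ camera_matrix.T
    # Divide by depth to obtain pixel coordinates.
    return projected[:, :2] / projected[:, 2:3]


# Placeholder intrinsics (fx, fy, cx, cy are hypothetical values).
K = np.array([
    [1000.0, 0.0, 500.0],
    [0.0, 1000.0, 400.0],
    [0.0, 0.0, 1.0],
])
pixels = project_to_image([[1.0, 2.0, 10.0]], K)
```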

## A First Look at the Data

```python
import json
import os
import pickle

import cv2
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
```

### Utility Functions

```python
def prediction_string_decoding(prediction_string):
    """Decode a train.csv pose string into one dict per car."""
    values = prediction_string.split()
    # Each car contributes 7 values: model type, yaw, pitch, roll, x, y, z.
    assert len(values) % 7 == 0
    decoded_data = [
        {
            "model-type": int(values[i]),
            "yaw": float(values[i + 1]),
            "pitch": float(values[i + 2]),
            "roll": float(values[i + 3]),
            "x": float(values[i + 4]),
            "y": float(values[i + 5]),
            "z": float(values[i + 6]),
        }
        for i in range(0, len(values), 7)
    ]

    return decoded_data
```

```python
def prase_csv_data(csv_file):
    """Read a pose CSV and decode the PredictionString of every row."""
    train_csv_data = pd.read_csv(csv_file)
    parsed_data = [
        {
            "ImageId": row["ImageId"],
            "PoseInfo": prediction_string_decoding(row["PredictionString"]),
        }
        for _, row in train_csv_data.iterrows()
    ]

    return parsed_data
```

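```python
def imshow(img):
    """Display an image loaded with OpenCV using matplotlib."""
    fig = plt.figure(figsize=(10, 10))
    ax = fig.add_subplot(111)
    if len(img.shape) != 2:
        # OpenCV loads color images as BGR; matplotlib expects RGB.
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    ax.imshow(img)
```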

```python
def load_car_model(car_model_path):
    # The pickles were created in Python 2, hence encoding="latin1".
    with open(car_model_path, "rb") as file:
        car_model = pickle.load(file, encoding="latin1")
    return car_model
```

```python
def load_car_model_from_json(car_model_json_path):
    with open(car_model_json_path) as f:
        car_model = json.load(f)
    return car_model
```

### Global Path Variables

```python
data_root = "/opt/data/dataset/kaggle/pku-autonomous-driving"
camera_intrinsic_parameters_path = os.path.join(data_root, "camera", "camera_intrinsic.txt")
car_models_dir = os.path.join(data_root, "car_models")
car_models_json_dir = os.path.join(data_root, "car_models_json")
train_csv_path = os.path.join(data_root, "train.csv")
train_images_dir = os.path.join(data_root, "train_images")
train_masks_dir = os.path.join(data_root, "train_masks")
test_images_dir = os.path.join(data_root, "test_images")
test_masks_dir = os.path.join(data_root, "test_masks")
```

### train.csv

```python
train_csv_data = pd.read_csv(train_csv_path)
train_csv_data
```

```txt
      ImageId       PredictionString
0     ID_8a6e65317  16 0.254839 -2.57534 -3.10256 7.96539 3.20066 ...
1     ID_337ddc495  66 0.163988 0.192169 -3.12112 -3.17424 6.55331...
2     ID_a381bf4d0  43 0.162877 0.00519276 -3.02676 2.1876 3.53427...
3     ID_7c4a3e0aa  43 0.126957 -3.04442 -3.10883 -14.738 24.6389 ...
4     ID_8b510fad6  37 0.16017 0.00862796 -3.0887 -3.04548 3.4977 ...
...   ...           ...
4257  ID_de17ab626  70 0.177583 -0.023215 -3.08003 -25.3682 7.5732...
4258  ID_5a669e211  12 0.23817 -3.12745 3.13929 -7.21988 3.09626 1...
4259  ID_aa6ffba0a  35 0.166437 -0.497963 -3.12063 12.6792 5.48256...
4260  ID_29454123f  70 0.14292 0.0290822 -3.12594 -3.42749 3.38674...
```


```python
data = prase_csv_data(csv_file=train_csv_path)
```

### Car Model

#### From pkl file

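```python
car_model_path = os.path.join(car_models_dir, "019-SUV.pkl")
load_car_model(car_model_path)
```

```txt
AttributeError: Can't get attribute 'CHJ_tiny_obj' on <module 'objloader' from '/usr/local/lib/python3.6/dist-packages/objloader/__init__.py'>
```

Unpickling fails in this environment because the pickle refers to a `CHJ_tiny_obj` class that the installed `objloader` module does not provide.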

#### From json file

```python
car_model_json_path = os.path.join(car_models_json_dir, "019-SUV.json")
load_car_model_from_json(car_model_json_path)
```

```txt
{'car_type': 'SUV',
 'vertices': [[1.0678626249999998, -0.33273001, 0.5176333949999999],
  [1.062867205, -0.27874485, 0.5143008749999999],
  [1.062789385, -0.27568073000000004, 0.513222385],
  [1.062035905, -0.27597393, 0.49501865499999986],
  [1.0561096950000002, -0.38283093, 0.5140805449999999],
  [1.0536881249999999, -0.38128223, 0.5148062549999999],
  [1.0529605850000001, -0.38780529, 0.49427829499999987],
  [1.051455155, -0.3321797, 0.508986545],
  [1.051182325, -0.33268225, 0.5090134049999999],
  [1.046754985, -0.28120716, 0.5148452749999999],
  [1.0455310850000001, -0.27938246, 0.5153341649999998],
  [1.0417432450000002, -0.26219318, 0.4979226649999999],
  [1.0363166050000001, -0.39474224, 0.5144259649999999],
  [1.0359186550000001, -0.33183737, 0.5994631949999999],
  [1.0333496850000001, -0.38323588999999997, 0.5091470349999999],
  [1.031425855, -0.27130232, 0.5197933949999999],
  [1.030421825, -0.38535107, 0.5097286949999998],
  [1.026869275, -0.38780315000000004, 0.5841842649
```
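Once a model is loaded as a dictionary, its `vertices` list can be turned into an array for quick inspection. The sketch below computes the axis-aligned extent of a model; it uses a small made-up dictionary with the same `car_type` and `vertices` keys as the output above, and the function name `model_extent` is hypothetical. Units and axis meaning are not stated in the data description, so the numbers are only relative sizes.

```python
import numpy as np


def model_extent(car_model):
    """Return the size of the model's bounding box along each axis."""
    vertices = np.asarray(car_model["vertices"], dtype=float)
    return vertices.max(axis=0) - vertices.min(axis=0)


# Toy stand-in for a dictionary returned by json.load (values are made up).
toy_model = {
    "car_type": "SUV",
    "vertices": [[1.0, -0.3, 0.5], [-1.0, 0.2, -0.5], [0.0, 0.0, 0.0]],
}
extent = model_extent(toy_model)
```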

### Train Image

```python
img_path = os.path.join(train_images_dir, "ID_8a6e65317.jpg")
mask_path = os.path.join(train_masks_dir, "ID_8a6e65317.jpg")

img = cv2.imread(img_path, cv2.IMREAD_UNCHANGED)
# cv2.imread returns None if the file is missing (not all images have a mask).
mask = cv2.imread(mask_path, cv2.IMREAD_UNCHANGED)
```