# Passive Vehicular Sensors Datasets (PVS) - Data Exploration

In this notebook, the PVS 1-9 datasets are detailed in the form of maps and tables. For more information on Passive Vehicular Sensors Datasets (PVS) and the project Intelligent Vehicle Perception Based on Inertial Sensing and Artificial Intelligence, visit the page on [GitHub](https://github.com/Intelligent-Vehicle-Perception/Intelligent-Vehicle-Perception-Based-on-Inertial-Sensing-and-Artificial-Intelligence) or [Kaggle](https://www.kaggle.com/jefmenegazzo/pvs-passive-vehicular-sensors-datasets).


## Sensor Network

Several passive sensors were used to collect the data for the nine Passive Vehicular Sensors (PVS) datasets, as detailed in the following table:


|      Hardware     |     Sensor    |                   Data                  | Sampling Rate |
|:-----------------|:-------------|:---------------------------------------|:-------------|
| HP Webcam HD-4110 | Camera        | 720p Video                              |     30 Hz     |
| Xiaomi Mi 8       | GPS           | Speed in m/s, latitude, longitude, etc. |      1 Hz     |
| MPU-9250          | Accelerometer | 3-axis acceleration in m/s²             |     100 Hz    |
| MPU-9250          | Gyroscope     | 3-axis rotation rate in deg/s           |     100 Hz    |
| MPU-9250          | Magnetometer  | 3-axis ambient geomagnetic field in µT  |     100 Hz    |
| MPU-9250          | Temperature   | Sensor temperature in °C                |     100 Hz    |
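
Because the GPS reports at 1 Hz while the MPU-9250 modules report at 100 Hz, combining the two streams requires aligning them in time. The following is a minimal sketch of one way to do this with `pandas.merge_asof`, using synthetic data. The `timestamp`, `acc_z`, and `speed` column names are assumptions for illustration, and this is not necessarily how the combined `dataset_gps_mpu_*.csv` files were produced.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the real streams (column names are assumptions).
mpu = pd.DataFrame({
    "timestamp": np.arange(0, 300) / 100.0,  # 100 Hz for 3 seconds
    "acc_z": np.random.default_rng(0).normal(9.81, 0.5, 300),
})
gps = pd.DataFrame({
    "timestamp": [0.0, 1.0, 2.0],  # 1 Hz
    "speed": [5.0, 6.0, 7.0],      # m/s
})

# For each inertial sample, attach the most recent GPS fix at or before it.
# Both frames must already be sorted by the key column.
combined = pd.merge_asof(mpu, gps, on="timestamp", direction="backward")
```

With this approach every block of 100 inertial samples carries the same GPS values until the next fix arrives.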

<br>
All the hardware was attached to the vehicle as shown in the next figure. The camera was mounted on the outside of the car roof (1), and the GPS receiver was placed inside, on the dashboard (2). Two networks of MPU-9250 modules were distributed across the vehicle to capture data from different points. Each side of the front axle (right and left) received one sensor network: one module was attached to the control arm (4), below and near the vehicle’s suspension system; another was attached to the body immediately above the tire (3), above and near the suspension system; and a third was attached to the vehicle’s dashboard (2), inside the cabin.

<div align="center">
    <img src="https://github.com/Intelligent-Vehicle-Perception/Intelligent-Vehicle-Perception-Based-on-Inertial-Sensing-and-Artificial-Intelligence/raw/master/img/sensor_network.png" alt="Sensor Hardware Network Placement" align="center"/>
</div>

<br>
The data were collected with three different vehicles and three different drivers, in three different environments (scenarios). Each environment contains three different surface types, along with variations in conservation state and the presence of obstacles and anomalies such as speed bumps and potholes. The following table details the data collection contexts:

| Dataset |       Vehicle      |  Driver  |  Scenario  | Distance |
|:-------|:------------------|:--------|:----------|:--------|
| PVS 1   | Volkswagen Saveiro | Driver 1 | Scenario 1 | 13.81 km |
| PVS 2   | Volkswagen Saveiro | Driver 1 | Scenario 2 | 11.62 km |
| PVS 3   | Volkswagen Saveiro | Driver 1 | Scenario 3 | 10.72 km |
| PVS 4   | Fiat Bravo         | Driver 2 | Scenario 1 | 13.81 km |
| PVS 5   | Fiat Bravo         | Driver 2 | Scenario 2 | 11.63 km |
| PVS 6   | Fiat Bravo         | Driver 2 | Scenario 3 | 10.73 km |
| PVS 7   | Fiat Palio         | Driver 3 | Scenario 1 | 13.78 km |
| PVS 8   | Fiat Palio         | Driver 3 | Scenario 2 | 11.63 km |
| PVS 9   | Fiat Palio         | Driver 3 | Scenario 3 | 10.74 km |
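
A travelled distance like the ones in the table can be approximated from a GPS track by summing the great-circle distance between consecutive fixes. The sketch below assumes the haversine formula with a mean Earth radius of 6371 km; the helper name `haversine_km` is hypothetical, and this is not necessarily how the distances above were computed.

```python
import numpy as np

def haversine_km(lat, lon):
    """Total path length in km along consecutive (lat, lon) points in degrees."""
    lat = np.radians(np.asarray(lat, dtype=float))
    lon = np.radians(np.asarray(lon, dtype=float))
    dlat = np.diff(lat)
    dlon = np.diff(lon)
    # Haversine term for each pair of consecutive points.
    a = np.sin(dlat / 2) ** 2 + np.cos(lat[:-1]) * np.cos(lat[1:]) * np.sin(dlon / 2) ** 2
    # 6371 km is an assumed mean Earth radius.
    return float(np.sum(2 * 6371.0 * np.arcsin(np.sqrt(a))))

# One degree of latitude along a meridian is roughly 111 km.
one_degree = haversine_km([0.0, 1.0], [0.0, 0.0])
```

On a loaded dataset this could be called with the `latitude` and `longitude` columns, ideally after dropping repeated fixes so that GPS jitter while stationary does not inflate the total.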

Each dataset consists of the following files:

| File                       | Description                                                                                                            |
|:----------------------------|:------------------------------------------------------------------------------------------------------------------------|
| dataset_gps.csv            | GPS data, including latitude, longitude, altitude, speed, accuracy, etc.                                               |
| dataset_gps_mpu_left.csv   | Inertial sensor data on the left side of the vehicle, combined with GPS data.                                          |
| dataset_gps_mpu_right.csv  | Inertial sensor data on the right side of the vehicle, combined with GPS data.                                         |
| dataset_labels.csv         | Data classes for each sample in the dataset (for both sides).                                                          |
| dataset_mpu_left.csv       | Inertial sensor data on the left side of the vehicle.                                                                  |
| dataset_mpu_right.csv      | Inertial sensor data on the right side of the vehicle.                                                                 |
| dataset_settings_left.csv  | Settings of the inertial sensors placed on the left side of the vehicle. Includes measurement range, resolution, etc.  |
| dataset_settings_right.csv | Settings of the inertial sensors placed on the right side of the vehicle. Includes measurement range, resolution, etc. |
| map.html                   | Interactive maps with data classes.                                                                                  |
| video_dataset_left.mp4     | Video with data plotted from inertial sensors and speed, sampled on the left side of the vehicle.                      |
| video_dataset_right.mp4    | Video with data plotted from inertial sensors and speed, sampled on the right side of the vehicle.                     |
| video_environment.mp4      | External environment video.                                                                                            |
| video_environment_dataset_left.mp4      | Videos side by side from video_environment.mp4 and video_dataset_left.mp4                                 |
| video_environment_dataset_right.mp4     | Videos side by side from video_environment.mp4 and video_dataset_right.mp4                                 |

## Data Classes

The data classes, available in the **dataset_labels.csv** file, are one-hot encoded. The following labels are available:

#### Road Surface Type Labels

|    Description   |    Label    |
|:----------------|:-----------|
| Dirt Road        | dirt_road        |
| Cobblestone Road | cobblestone_road |
| Asphalt Road     | asphalt_road     |

#### Road Surface Condition

|    Description   |    Label    |
|:----------------|:-----------|
| Paved Road       | paved_road   |
| Unpaved Road     | unpaved_road |

#### Road Roughness Condition

|    Description   |    Label    |
|:----------------|:-----------|
| Good Road        | good_road_left, good_road_right        |
| Regular Road | regular_road_left, regular_road_right |
| Bad Road     | bad_road_left, bad_road_right     |

#### Speed Bump

|    Description   |    Label    |
|:----------------|:-----------|
| No Speed Bump        | no_speed_bump        |
| Speed Bump in Asphalt   | speed_bump_asphalt   |
| Speed Bump in Cobblestone | speed_bump_cobblestone |
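
Since each label group is one-hot encoded, a group of columns can be collapsed into a single categorical column when that is more convenient. The following is a minimal sketch on a small synthetic frame that reuses the road surface type label names from the tables above; the row values are made up for illustration.

```python
import pandas as pd

# Synthetic labels with the same column names as the road surface type group.
labels = pd.DataFrame({
    "dirt_road":        [1, 0, 0, 0],
    "cobblestone_road": [0, 1, 0, 0],
    "asphalt_road":     [0, 0, 1, 1],
})
surface_cols = ["dirt_road", "cobblestone_road", "asphalt_road"]

# In a one-hot group, exactly one column is expected to be active per row.
assert (labels[surface_cols].sum(axis=1) == 1).all()

# idxmax returns, for each row, the name of the column holding the 1.
surface_type = labels[surface_cols].idxmax(axis=1)
```

The same pattern should apply to the other groups, taking the left and right roughness columns as separate groups.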


## Defining Utility Functions

In [None]:
# Importing required packages
import os
import folium
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, HTML, Video
%matplotlib inline
# %matplotlib notebook
pd.set_option("display.float_format", '{:0.2f}'.format)
pd.set_option('display.max_columns', 30)

In [None]:
# Returns all loaded datasets
def getDatasets():

    pvs = {}
    
    for i in range(1,10):
    
        folder = os.path.join(datasets_folder, "PVS " + str(i))
        data_left = pd.read_csv(os.path.join(folder, 'dataset_gps_mpu_left.csv'), float_precision="high")
        data_right = pd.read_csv(os.path.join(folder, 'dataset_gps_mpu_right.csv'), float_precision="high")
        data_labels = pd.read_csv(os.path.join(folder, 'dataset_labels.csv'))

        pvs["pvs_" + str(i)] = {
            "data_left": data_left,
            "data_right": data_right,
            "data_labels": data_labels
        }

    return pvs

# Shows data classes on a plot
def plotDataClass(pvs, classes):
    
    data_labels = datasets["pvs_" + str(pvs)]["data_labels"]
    plt.figure(figsize=(16,6)) 
    
    for i in range(0, len(classes)):
        classe = classes[i]
        (data_labels[classe] * (i+1)).plot(linewidth=2)

    plt.legend()

# Shows data classes on a map
def mapDataClass(pvs, classes, colors, zoom_start=14):
    
    dataset = datasets["pvs_" + str(pvs)]
    data = pd.concat([dataset["data_left"], dataset["data_labels"]], axis=1)

    gps = data[['latitude', 'longitude']]
    focolat = (gps['latitude'].min() + gps['latitude'].max()) / 2
    focolon = (gps['longitude'].min() + gps['longitude'].max()) / 2
    maps = folium.Map(location=[focolat, focolon], zoom_start=zoom_start)

    grouper = data.groupby(["latitude","longitude"]).mean().round(0)

    for i in range(0, len(classes)):
    
        classe = classes[i]
        color = colors[i]
        points = grouper[grouper[classe] == 1].index.values.reshape(-1)
        
        for point in points:
            folium.Circle(point, color=color, radius=0.1).add_to(maps)

    return maps

# Shows data class maps side by side
def mapDataClassSideBySyde(pvs, classes, colors):
    
    html = ""
    
    for i in pvs:
        maps = mapDataClass(i, classes, colors, 13)
        html += """
        <iframe srcdoc="{}" style="float:left; width: {}px; height: {}px; display:inline-block; width:33%; margin: 0 auto; border: 1px solid black"></iframe>
        """.format(maps.get_root().render().replace('"', '&quot;'),500,500)

    display(HTML(html))
    
# Shows legend for data classes
def makeDataClassLegend(classes_names, colors):
    
    html_legend = """
    <style>
        .legend { list-style: none; }
        .legend li { float: left; margin-right: 10px; }
        .legend span { border: 1px solid #ccc; float: left; width: 12px; height: 12px; margin: 2px; }
    </style>
    <ul class="legend" style="list-style: none;">
    """
    
    for i in range(0, len(classes_names)):
        name = classes_names[i]
        color = colors[i]
        
        html_legend += """
        <li><span style="background-color: {}"></span> {}</li>
        """.format(color, name)
    
    html_legend += """
    </ul>
    """
    
    display(HTML(html_legend))
    
# Measure the quantity and distribution metrics of the data classes
def metricsDataClass(classes):
    
    list_data = []
    
    for pvs in range(1,10):
        data = datasets["pvs_" + str(pvs)]
        list_data.append(data["data_labels"][classes].sum())
       
    data = pd.DataFrame(list_data)
    data["Total"] = data.sum(axis=1)
    
    for classe in classes:
        data[classe + "_distribution_%"] = round(data[classe]/data["Total"] * 100, 2)
        
    data.index = np.arange(1, len(data) + 1)
    data.index = data.index.rename("PVS")
    return data

In [None]:
# Datasets Location
datasets_folder = "../input/pvs-passive-vehicular-sensors-datasets/"
# In-memory Datasets
datasets = getDatasets()

## Exploring the Data Classes

### Road Surface Type

There are three road surface types in the datasets: asphalt, cobblestone, and dirt road. All three types are present in every PVS dataset.

In [None]:
makeDataClassLegend(["Dirt Road", "Cobblestone Road", "Asphalt Road"], ["red", "green", "blue"])
mapDataClass(1, ["dirt_road", "cobblestone_road", "asphalt_road"], ["red", "green", "blue"])

In [None]:
makeDataClassLegend(["Dirt Road", "Cobblestone Road", "Asphalt Road"], ["red", "green", "blue"])
mapDataClassSideBySyde([1,2,3,4,5,6,7,8,9], ["dirt_road", "cobblestone_road", "asphalt_road"], ["red", "green", "blue"])
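
Many inertial samples share the same GPS coordinate, so `mapDataClass` groups the samples by coordinate, averages the one-hot columns, and rounds the result. In effect this draws the majority class at each coordinate. The sketch below reproduces that step on synthetic data; the coordinate values are made up for illustration.

```python
import pandas as pd

# Five samples over two GPS fixes; the first fix has mixed labels.
data = pd.DataFrame({
    "latitude":     [-27.71, -27.71, -27.71, -27.72, -27.72],
    "longitude":    [-51.10, -51.10, -51.10, -51.11, -51.11],
    "dirt_road":    [1, 1, 0, 0, 0],
    "asphalt_road": [0, 0, 1, 1, 1],
})

# Mean of a one-hot column is the fraction of samples in that class;
# rounding to 0 decimals keeps the class that holds the majority.
per_fix = data.groupby(["latitude", "longitude"]).mean().round(0)
```

Note that an exact 50/50 split yields a mean of 0.5, which rounds to 0 for both classes, so such a coordinate would not be drawn for either class.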

Sample quantity metrics and the distribution of data classes are detailed in the table below. The metrics are the same for the right and left sides.

In [None]:
metricsDataClass(["dirt_road", "cobblestone_road", "asphalt_road"])

### Road Surface Condition

There are two road surface conditions in the datasets: paved and unpaved road. Both conditions are present in every PVS dataset.

In [None]:
makeDataClassLegend(["Paved Road", "Unpaved Road"], ["blue", "red"])
mapDataClass(1, ["paved_road", "unpaved_road"], [ "blue", "red"])

In [None]:
makeDataClassLegend(["Paved Road", "Unpaved Road"], ["blue", "red"])
mapDataClassSideBySyde([1,2,3,4,5,6,7,8,9], ["paved_road", "unpaved_road"], [ "blue", "red"])

Sample quantity metrics and the distribution of data classes are detailed in the table below. The metrics are the same for the right and left sides.

In [None]:
metricsDataClass(["paved_road", "unpaved_road"])

### Road Roughness Condition

There are three road roughness conditions in the datasets: good, regular, and bad road. All three conditions are present in every PVS dataset.

In [None]:
makeDataClassLegend(["Good Road", "Regular Road", "Bad Road"], ["green", "yellow", "red"])
mapDataClass(1, ["good_road_left", "regular_road_left", "bad_road_left"], ["green", "yellow", "red"])

In [None]:
makeDataClassLegend(["Good Road", "Regular Road", "Bad Road"], ["green", "yellow", "red"])
mapDataClassSideBySyde([1,2,3,4,5,6,7,8,9], ["good_road_left", "regular_road_left", "bad_road_left"], ["green", "yellow", "red"])

Sample quantity metrics and the distribution of data classes are detailed in the table below. The metrics are for the left side.

In [None]:
metricsDataClass(["good_road_left", "regular_road_left", "bad_road_left"])

Sample quantity metrics and the distribution of data classes are detailed in the table below. The metrics are for the right side.

In [None]:
metricsDataClass(["good_road_right", "regular_road_right", "bad_road_right"])
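
Because roughness is labelled separately for each side, it can be useful to check how often the left and right labels agree. The following is a minimal sketch on synthetic labels that reuses the column names from the tables above; the row values are made up, and the variable names are only illustrative.

```python
import pandas as pd

# Synthetic roughness labels for four samples.
labels = pd.DataFrame({
    "good_road_left":     [1, 1, 0, 0],
    "regular_road_left":  [0, 0, 1, 0],
    "bad_road_left":      [0, 0, 0, 1],
    "good_road_right":    [1, 0, 0, 0],
    "regular_road_right": [0, 1, 1, 0],
    "bad_road_right":     [0, 0, 0, 1],
})
classes = ["good_road", "regular_road", "bad_road"]

# Collapse each side to one class name per sample, then strip the side suffix.
left = labels[[c + "_left" for c in classes]].idxmax(axis=1).str.replace("_left", "", regex=False)
right = labels[[c + "_right" for c in classes]].idxmax(axis=1).str.replace("_right", "", regex=False)

# Fraction of samples where both sides share the same class.
agreement = (left == right).mean()

# Confusion-style table of left vs. right classes.
crosstab = pd.crosstab(left, right, rownames=["left"], colnames=["right"])
```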

### Speed Bump

There are three speed bump conditions in the datasets: speed bump in asphalt, speed bump in cobblestone, and no speed bump. Only the two speed bump classes are drawn on the maps below.

In [None]:
makeDataClassLegend(["Speed Bump Asphalt", "Speed Bump Cobblestone"], ["blue", "red"])
mapDataClass(1, ["speed_bump_asphalt", "speed_bump_cobblestone"], [ "blue", "red"])

In [None]:
makeDataClassLegend(["Speed Bump Asphalt", "Speed Bump Cobblestone"], ["blue", "red"])
mapDataClassSideBySyde([1,2,3,4,5,6,7,8,9], ["speed_bump_asphalt", "speed_bump_cobblestone"], [ "blue", "red"])

Sample quantity metrics and the distribution of data classes are detailed in the table below. The metrics are the same for the right and left sides.

In [None]:
metricsDataClass(["speed_bump_asphalt", "speed_bump_cobblestone", "no_speed_bump"])