# Scenario K-projection Metric for German Traffic Highway

In this metric, we are interested in understanding the completeness of the test data (more generally, the metric can also be used to measure the completeness of training data). Rather than taking a random, unsystematic approach of driving as many kilometers as possible, we carefully selected a segment of the A9 in order to capture as many variations as possible of the same stretch of road.

The selected road segment runs from **Munich Schwabing** to **Munich Freimann**. As the map below shows, this short segment contains left curves, right curves, and straight sections.

Data completeness plays an important role in understanding the performance of the network.


To start, the following imports are standard. Make sure the required libraries are installed.

In [1]:
# Put these at the top of every notebook, to get automatic reloading and inline plotting
%reload_ext autoreload
%autoreload 2
%matplotlib inline

import os
import numpy as np
import cv2
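# Note: imread and imresize have been deprecated since SciPy 1.0 and are removed in recent releases, so this import needs an older SciPy
from scipy.misc import imread, imresize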
from matplotlib import pyplot as plt


![German Highway A9 from Schwabing to Freimann](img/A9_Schwabing_Freimann.jpg?)
![Variations](img/Variations.jpg?)

The above 6 images show the vehicle driving on the same road. They are taken under different lighting conditions (day, night), on different road surfaces (dry, wet), in different lanes (inner, outer), and in different weather conditions (sunny, cloudy, rainy). One can easily observe that a complete data sampling requires exponentially many videos. Even in this simple setting, and for the same place, we already need 2x2x2x3=24 videos.
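
The count of 24 can be reproduced by enumerating every full combination of the four conditions listed above. The following is a minimal sketch; the dictionary keys are illustrative labels and are not taken from the tool's description file.

```python
from itertools import product

# Conditions named in the paragraph above; the keys are illustrative labels.
conditions = {
    "lighting": ["day", "night"],
    "road_surface": ["dry", "wet"],
    "lane": ["inner", "outer"],
    "weather": ["sunny", "cloudy", "rainy"],
}

# Each full combination corresponds to one video that would have to be recorded.
all_combinations = list(product(*conditions.values()))
print(len(all_combinations))  # 2 * 2 * 2 * 3 = 24
```

Every additional condition multiplies this number by its count of values, which is why exhaustive sampling quickly becomes impractical.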

If one also considers the road curvature (left, straight, right), the total number of lanes (2, 3, 4), additional weather conditions such as snow, and the orientation of the sun (right, back, left), one can easily need up to 1000 videos for systematic testing, even on such a small segment.

The **scenario k-projection metric** demonstrated below defines a weaker form of completeness that avoids combinatorial explosion while still guaranteeing *diversity* of the test data. This diversity can later also be used to understand the effects of physical transformations. More importantly, **data completeness that is demonstrated by diversity and proven by a science-driven metric allows a certification authority to believe that the data sampling is approached in a disciplined manner**.
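
To make the idea concrete, here is a minimal sketch of how a 2-projection coverage could be counted: instead of requiring every full combination of values, it only asks that every combination of values for each *pair* of criteria appears in at least one scenario. The criteria, the two scenarios, and the function name `k_projection_coverage` are hypothetical and are not part of `nndependability`.

```python
from itertools import combinations

# Toy criteria (hypothetical; the real ones are defined in the description file).
criteria = {
    "day": ["day", "night"],
    "road_surface": ["dry", "wet"],
    "lane": ["inner", "outer"],
    "weather": ["sunny", "cloudy", "rainy"],
}

# Two hypothetical recorded scenarios, each assigning one value per criterion.
scenarios = [
    {"day": "day", "road_surface": "dry", "lane": "inner", "weather": "sunny"},
    {"day": "night", "road_surface": "wet", "lane": "outer", "weather": "rainy"},
]

def k_projection_coverage(criteria, scenarios, k=2):
    """Return (covered, total) value combinations over all groups of k criteria."""
    covered = set()
    total = 0
    for names in combinations(sorted(criteria), k):
        # Number of possible value tuples for this group of k criteria.
        size = 1
        for name in names:
            size *= len(criteria[name])
        total += size
        # Value tuples that actually occur in the recorded scenarios.
        for scenario in scenarios:
            covered.add((names, tuple(scenario[n] for n in names)))
    return len(covered), total

covered, total = k_projection_coverage(criteria, scenarios)
print(f"2-projection coverage: {covered}/{total}")  # 2-projection coverage: 12/30
```

For a fixed k, the total grows polynomially with the number of criteria, whereas the number of full combinations grows exponentially.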

The following *description* file specifies all possible categorizations under consideration:

In [2]:
from nndependability.metrics import ScenarioKProjection

metric2 = ScenarioKProjection.Scenario_KProjection_Metric("data/scenario_coverage/A9/description.xml")

The file below specifies restrictions for the A9 Schwabing to Freimann segment, where certain combinations are not possible. We refer readers to the file itself, which contains a detailed explanation.

In [3]:
metric2.addDomainRestrictionsFromFile("data/scenario_coverage/A9/domain-restrictions_A9_Freimann.xml")

0.0 <=  + 1.0 C0_4 <= 0.0
0.0 <=  + 1.0 C1_1 + 1.0 C0_2 <= 1.0
0.0 <=  + 1.0 C1_1 + 1.0 C0_3 <= 1.0
0.0 <=  + 1.0 C2_0 + 1.0 C3_1 <= 1.0
0.0 <=  + 1.0 C2_0 + 1.0 C3_2 <= 1.0
0.0 <=  + 1.0 C2_0 <= 0.0
0.0 <=  + 1.0 C2_2 + 1.0 C4_1 <= 1.0
0.0 <=  + 1.0 C2_2 + 1.0 C4_0 <= 1.0
0.0 <=  + 1.0 C2_2 + 1.0 C4_3 <= 1.0
0.0 <=  + 1.0 C2_2 + 1.0 C4_4 <= 1.0
0.0 <=  + 1.0 C2_1 + 1.0 C3_2 <= 1.0
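
One plausible reading of this output, stated here as an assumption rather than as documented tool behavior, is that each variable `C<i>_<j>` is a 0/1 indicator meaning "criterion i takes its j-th value". Under that reading, `0.0 <= C0_4 <= 0.0` forbids a value outright, and `0.0 <= C1_1 + C0_2 <= 1.0` forbids two values from occurring together. The sketch below parses two of the printed lines and checks hypothetical assignments against them; the helper names are illustrative.

```python
import re

# Two constraint lines copied from the output above.
lines = [
    "0.0 <=  + 1.0 C0_4 <= 0.0",
    "0.0 <=  + 1.0 C1_1 + 1.0 C0_2 <= 1.0",
]

def parse_constraint(line):
    """Split 'lo <= + a X + b Y <= hi' into (lo, {variable: coefficient}, hi)."""
    lo, body, hi = line.split("<=")
    terms = {var: float(coef)
             for coef, var in re.findall(r"\+\s*([\d.]+)\s+(\w+)", body)}
    return float(lo), terms, float(hi)

def satisfies(selected, constraint):
    """selected: set of indicator variables equal to 1 (ASSUMED 0/1 semantics)."""
    lo, terms, hi = constraint
    value = sum(coef for var, coef in terms.items() if var in selected)
    return lo <= value <= hi

constraints = [parse_constraint(line) for line in lines]

# Hypothetical assignments: picking C1_1 together with C0_2 breaks the second line.
print(all(satisfies({"C1_1", "C0_2"}, c) for c in constraints))  # False
print(all(satisfies({"C1_1", "C0_0"}, c) for c in constraints))  # True
```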


Now we load the existing scenarios. Each scenario has a field stating which video file it is associated with (the video clips are not included here). The solver then automatically computes the 2-projection coverage (k=2 is the default).

In [4]:
metric2.addScenariosFromFile("data/scenario_coverage/A9/scenarios.xml")


2-projection coverage (without considering domain restrictions): 111/162
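
The denominator of a 2-projection coverage is the sum, over all pairs of criteria, of the product of their numbers of categories. The sketch below illustrates this. The category counts are an assumption chosen only because they are one combination that sums to 162; they are not read from `description.xml`.

```python
from itertools import combinations
from math import prod

def total_k_projections(sizes, k=2):
    """Sum, over every group of k criteria, of the product of their category counts."""
    return sum(prod(group) for group in combinations(sizes, k))

# ASSUMED category counts for six criteria (not read from description.xml).
assumed_sizes = [5, 2, 3, 3, 5, 2]
print(total_k_projections(assumed_sizes))  # 162
```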


The denominator 162 does not account for scenario combinations that are impossible. One can compute the number of impossible cases (the tool provides a semi-automatic, iterative way to do that); according to this computation, 42 cases are not possible.

Taking the remaining 120 feasible cases (162 minus 42) as the total, the above value shows that we have already covered quite a bit: 111 of 120. The remaining question is **"what is the next sample to be taken, in order to maximally increase coverage?"**

The automatic test case generator (see below) can provide this information. Here it specifies that we need a cloudy night video; the road needs to be dry and straight, and it needs to be a three-lane road with the vehicle in the 2nd lane.

In [5]:
from nndependability.atg.scenario import scenariogen

In [6]:
variableAssignment = scenariogen.proposeScenariocandidate(metric2)

Optimal solution found
Maximum possibility for improvement = 51
Optimal objective value computed from IP = 4

for criterion weather, set it to cloudy
for criterion day, set it to night
for criterion total_lanes, set it to 3
for criterion current_lane, set it to 2
for criterion straight_curvy, set it to straight
for criterion road_surface, set it to dry
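
The output above indicates that the tool solves an integer program (IP). To illustrate the underlying question on a small example, the sketch below uses plain exhaustive search instead, which is only practical for a handful of criteria. The criteria, recorded scenarios, domain restriction, and function names are all hypothetical.

```python
from itertools import combinations, product

# Hypothetical toy criteria and recorded scenarios (not the A9 data).
criteria = {
    "day": ["day", "night"],
    "road_surface": ["dry", "wet"],
    "weather": ["sunny", "cloudy", "rainy"],
}
recorded = [
    {"day": "day", "road_surface": "dry", "weather": "sunny"},
    {"day": "night", "road_surface": "wet", "weather": "rainy"},
]

def pairs(scenario):
    """All 2-projections of a scenario, as pairs of (criterion, value) items."""
    return set(combinations(sorted(scenario.items()), 2))

def is_feasible(scenario):
    # Hypothetical domain restriction: "sunny" cannot occur at night.
    return not (scenario["day"] == "night" and scenario["weather"] == "sunny")

def propose_next(criteria, recorded):
    """Exhaustively search for the feasible scenario adding the most uncovered pairs."""
    covered = set().union(*(pairs(s) for s in recorded))
    names = list(criteria)
    best, best_gain = None, -1
    for values in product(*(criteria[n] for n in names)):
        candidate = dict(zip(names, values))
        if not is_feasible(candidate):
            continue
        gain = len(pairs(candidate) - covered)
        if gain > best_gain:
            best, best_gain = candidate, gain
    return best, best_gain

best, gain = propose_next(criteria, recorded)
print(best, gain)
# {'day': 'day', 'road_surface': 'wet', 'weather': 'cloudy'} 3
```

Exhaustive search enumerates every full combination, so it suffers from the same combinatorial explosion described earlier; formulating the choice as an optimization problem avoids listing all candidates explicitly.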


We can store the proposed scenario in a scenario file; after adding it, the coverage indeed increases by 4.

In [7]:
metric2.writeScenarioToFile(variableAssignment, "tmp.xml")