# Benchmarking Behavior Planners in BARK

This notebook walks through the benchmarking workflow of BARK.

Systematically benchmarking behavior planners requires three components:
1. A reproducible set of scenarios (we call it the **BenchmarkDatabase**)
2. Metrics for studying performance (we call them **Evaluators**)
3. The behavior model(s) under test

Our **BenchmarkRunner** can then run the benchmark and produce the results.

In [1]:
import os
import matplotlib.pyplot as plt
from IPython.display import Video

from load.benchmark_database import BenchmarkDatabase
from serialization.database_serializer import DatabaseSerializer
from bark.benchmark.benchmark_runner import BenchmarkRunner, BenchmarkConfig, BenchmarkResult
from bark.benchmark.benchmark_analyzer import BenchmarkAnalyzer

from bark.runtime.commons.parameters import ParameterServer

from bark.runtime.viewer.matplotlib_viewer import MPViewer
from bark.runtime.viewer.video_renderer import VideoRenderer


from bark.models.behavior import BehaviorIDMClassic, BehaviorConstantVelocity

ModuleNotFoundError: No module named 'benchmark_database'

# Database

The benchmark database provides a reproducible set of scenarios. A scenario is created by a ScenarioGenerator (we have a couple of them). The scenarios are serialized into binary files (ending in .bark_scenarios) and packed together with the map file and the parameter files into a .zip archive. We call this zipped archive a release, which can be published on GitHub or processed locally.
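
To illustrate the packaging idea only, here is a minimal sketch using the standard library. It is not the DatabaseSerializer implementation: the helper `pack_release`, the folder layout, and the file names (including the `.xodr` map) are hypothetical stand-ins.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import List


def pack_release(source_dir: Path, archive_path: Path) -> List[str]:
    """Zip every file under source_dir, keeping paths relative to it."""
    packed = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                relative = path.relative_to(source_dir).as_posix()
                archive.write(path, arcname=relative)
                packed.append(relative)
    return packed


# Build a toy scenario set with hypothetical file names, then pack it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "database1"
    (root / "scenario_sets").mkdir(parents=True)
    (root / "maps").mkdir()
    (root / "scenario_sets" / "highway.bark_scenarios").write_bytes(b"\x00\x01")
    (root / "scenario_sets" / "highway.json").write_text("{}")
    (root / "maps" / "highway.xodr").write_text("<OpenDRIVE/>")

    # The archive lives outside the packed folder, so it never packs itself.
    release_path = Path(tmp) / "release_tutorial.zip"
    packed_files = pack_release(root, release_path)
    with zipfile.ZipFile(release_path) as archive:
        archived_names = sorted(archive.namelist())

print(packed_files)
```

Storing paths relative to the database folder keeps the archive portable, so it can be unpacked anywhere before the scenarios are loaded.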



## Starting with the DatabaseSerializer

The **DatabaseSerializer** recursively serializes all scenario parameter file sets within a folder.

We will process the database directory from GitHub.

In [None]:
#dbs = DatabaseSerializer(test_scenarios=4, test_world_steps=5, num_serialize_scenarios=10)
dbs = DatabaseSerializer(test_scenarios=1, test_world_steps=10, num_serialize_scenarios=1)
dbs.process("../../../benchmark_database/data/database1")
local_release_filename = dbs.release(version="tutorial")

print('Filename:', local_release_filename)

Then reload the release to test that it parses correctly.

In [None]:
db = BenchmarkDatabase(database_root=local_release_filename)
scenario_generation, _ = db.get_scenario_generator(scenario_set_id=0)

for scenario_generation, _ in db:
  print('Scenario: ', scenario_generation)

## Evaluators

Evaluators compute a Boolean, integer, or real-valued metric from the current simulation world state.

The current evaluators available in BARK are:
- StepCount: returns the step count the scenario is at.
- GoalReached: checks if a controlled agent’s goal definition is satisfied.
- DrivableArea: checks whether the agent is inside its RoadCorridor.
- Collision(ControlledAgent): checks whether any agent, or only the currently controlled agent, collided.
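
As a rough mental model, an evaluator can be pictured as a small object that maps a world state to one value. The sketch below uses a toy one-dimensional world; `ToyWorld`, the three classes, and the `evaluate` method name are hypothetical and do not reproduce BARK's evaluator interface.

```python
from dataclasses import dataclass


@dataclass
class ToyWorld:
    """Hypothetical stand-in for a simulation world state."""
    step: int
    ego_position: float
    goal_position: float
    other_positions: tuple


class StepCount:
    """Integer metric: the step the scenario is at."""

    def evaluate(self, world):
        return world.step


class GoalReached:
    """Boolean metric: has the ego agent passed its goal position?"""

    def evaluate(self, world):
        return world.ego_position >= world.goal_position


class CollisionEgo:
    """Boolean metric: is the ego agent closer than min_gap to another agent?"""

    def __init__(self, min_gap=1.0):
        self.min_gap = min_gap

    def evaluate(self, world):
        return any(abs(world.ego_position - other) < self.min_gap
                   for other in world.other_positions)


world = ToyWorld(step=12, ego_position=48.0, goal_position=50.0,
                 other_positions=(60.0, 20.0))
toy_evaluators = {"max_steps": StepCount(),
                  "success": GoalReached(),
                  "collision": CollisionEgo()}
results = {name: evaluator.evaluate(world)
           for name, evaluator in toy_evaluators.items()}
print(results)
```

Each evaluator returns a single value per world state, which is what makes it possible to attach simple predicates to the results.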

Let's now map those evaluators to symbols that are easier to interpret.

In [None]:
evaluators = {"success": "EvaluatorGoalReached",
              "collision": "EvaluatorCollisionEgoAgent",
              "max_steps": "EvaluatorStepCount"}

We will now define the terminal conditions of our benchmark. A scenario ends if
- a collision occurred,
- the number of time steps exceeds the limit, or
- the definition of success becomes true (which we defined as reaching the goal, using EvaluatorGoalReached).

In [None]:
terminal_when = {"collision": lambda x: x,
                 "max_steps": lambda x: x > 40,
                 "success": lambda x: x}

# Behaviors Under Test
Let's now define the behavior models under test; both share one ParameterServer.

In [None]:
params = ParameterServer() 
behaviors_tested = {"IDM": BehaviorIDMClassic(params), "Const" : BehaviorConstantVelocity(params)}

# Benchmark Runner

The BenchmarkRunner evaluates behavior models with different parameter configurations over the entire benchmarking database.

In [None]:
benchmark_runner = BenchmarkRunner(benchmark_database=db,
                                   evaluators=evaluators,
                                   terminal_when=terminal_when,
                                   behaviors=behaviors_tested,
                                   log_eval_avg_every=10)

result = benchmark_runner.run(maintain_history=True)
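
Conceptually, such a run pairs every scenario with every behavior and steps each pair until a terminal condition holds. The following sketch shows that idea on a toy one-dimensional world. It is an assumption about the general pattern, not BARK's runner code: `is_terminal`, `run_config`, and the toy scenarios, behaviors, and evaluators are hypothetical.

```python
from itertools import product


def is_terminal(eval_results, terminal_when):
    """Return True as soon as any terminal predicate accepts its metric."""
    return any(check(eval_results[name]) for name, check in terminal_when.items())


def run_config(scenario, behavior, toy_evaluators, terminal_when):
    """Step one (scenario, behavior) pair until a terminal condition holds."""
    state = {"step": 0, "position": scenario["start"]}
    while True:
        state["position"] += behavior(state)  # toy behavior: distance per step
        state["step"] += 1
        eval_results = {name: evaluate(state, scenario)
                        for name, evaluate in toy_evaluators.items()}
        if is_terminal(eval_results, terminal_when):
            return eval_results


# Toy evaluators keyed like the notebook's evaluator symbols.
toy_evaluators = {
    "success": lambda state, sc: state["position"] >= sc["goal"],
    "collision": lambda state, sc: abs(state["position"] - sc["obstacle"]) < 1.0,
    "max_steps": lambda state, sc: state["step"],
}
terminal_when = {
    "collision": lambda x: x,
    "max_steps": lambda x: x > 40,
    "success": lambda x: x,
}
scenarios = [
    {"name": "free_road", "start": 0.0, "goal": 30.0, "obstacle": 1000.0},
    {"name": "blocked", "start": 0.0, "goal": 30.0, "obstacle": 15.0},
]
behaviors = {"slow": lambda state: 0.5, "fast": lambda state: 3.0}

rows = []
for scenario, (behavior_name, behavior) in product(scenarios, behaviors.items()):
    row = run_config(scenario, behavior, toy_evaluators, terminal_when)
    row.update(scenario=scenario["name"], behavior=behavior_name)
    rows.append(row)

for row in rows:
    print(row)
```

In this toy setup the slow behavior runs out of steps on the free road, the fast one reaches the goal, and both collide in the blocked scenario, so all three terminal conditions are exercised.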

We now dump the results to a file so that they can be post-processed later.

In [None]:
result.dump(os.path.join("./benchmark_results.pickle"))


# Benchmark Results

Benchmark results contain
- the evaluated metrics of each simulation run, as a pandas DataFrame
- the world state of every simulation (optional)

In [None]:
result_loaded = BenchmarkResult.load(os.path.join("./benchmark_results.pickle"))
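
The dump and load pair follows the usual pickle round-trip pattern. A minimal sketch of that pattern, assuming a hypothetical `ToyResult` container rather than BARK's BenchmarkResult class:

```python
import os
import pickle
import tempfile


class ToyResult:
    """Hypothetical stand-in for a benchmark result container."""

    def __init__(self, rows):
        self.rows = rows

    def dump(self, filename):
        # Serialize the rows so they can be analyzed in a later session.
        with open(filename, "wb") as handle:
            pickle.dump(self.rows, handle)

    @staticmethod
    def load(filename):
        # Rebuild the container from a previously dumped file.
        with open(filename, "rb") as handle:
            return ToyResult(pickle.load(handle))


rows = [{"behavior": "IDM", "success": True},
        {"behavior": "Const", "success": False}]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_results.pickle")
    ToyResult(rows).dump(path)
    restored = ToyResult.load(path)

print(restored.rows == rows)
```

Keeping the run and the analysis in separate steps means an expensive benchmark only has to be executed once.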

We first analyze the DataFrame.

In [None]:
df = result_loaded.get_data_frame()

df.head()

# Benchmark Analyzer

The benchmark analyzer filters the results so that you can visualize what really happened. Filters are set via a dictionary of lambda functions specifying the evaluation criteria that must be fulfilled.

A config is essentially a single simulation run for which the step size, controlled agent, terminal conditions, and metrics have been defined.

Let us first load the results into the BenchmarkAnalyzer and then filter the results.

In [None]:
analyzer = BenchmarkAnalyzer(benchmark_result=result_loaded)


configs_idm = analyzer.find_configs(criteria={"behavior": lambda x: x=="IDM", "success": lambda x : not x})
configs_const = analyzer.find_configs(criteria={"behavior": lambda x: x=="Const", "success": lambda x : not x})
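
The criteria dictionary can be read as a conjunction: a config matches only if every lambda accepts the value in its column. A minimal sketch of that reading on plain dictionaries follows. The helper `find_matching` and the sample rows are hypothetical, and returning row indices is an assumption, not a statement about what `find_configs` returns.

```python
def find_matching(rows, criteria):
    """Return indices of rows for which every criterion accepts its column."""
    return [
        index
        for index, row in enumerate(rows)
        if all(check(row[column]) for column, check in criteria.items())
    ]


rows = [
    {"behavior": "IDM", "success": True, "collision": False},
    {"behavior": "IDM", "success": False, "collision": True},
    {"behavior": "Const", "success": False, "collision": True},
    {"behavior": "Const", "success": False, "collision": False},
]

failed_idm = find_matching(
    rows, {"behavior": lambda x: x == "IDM", "success": lambda x: not x})
failed_const = find_matching(
    rows, {"behavior": lambda x: x == "Const", "success": lambda x: not x})

print(failed_idm, failed_const)
```

Adding a further key such as `"collision": lambda x: x` would narrow the selection to failures that ended in a collision.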

We now create a video from the filtered configs, using the Matplotlib viewer to render everything to a video file.

In [None]:
sim_step_time = 5

params2 = ParameterServer()

fig = plt.figure(figsize=[10, 10])
viewer = MPViewer(params=params2, y_length=80, enforce_y_length=True, enforce_x_length=False,
                  follow_agent_id=True, axis=fig.gca())
video_exporter = VideoRenderer(renderer=viewer, world_step_time=sim_step_time)

analyzer.visualize(viewer=video_exporter, configs_idx_list=configs_idm[1:3],
                   real_time_factor=10, fontsize=6)

video_exporter.export_video(filename="./tutorial_video")


In [None]:
Video("./tutorial_video.mp4")