# Getting started with bb_tracking

This notebook shows you how to use the code in [bb_tracking](https://github.com/BioroboticsLab/bb_tracking).

It covers the following topics:

 - Loading data from [bb_binary](https://github.com/BioroboticsLab/bb_binary)
 - Using predefined scoring functions to perform tracking
 - Defining features and training your own classifier for tracking
 - Validating the results of tracking

In [1]:
# python standard imports
from collections import OrderedDict

# sklearn imports
from sklearn.ensemble import RandomForestClassifier

# bb_binary imports
from bb_binary import Repository

# bb_tracking imports
from bb_tracking.data.constants import DETKEY
from bb_tracking.data import DataWrapperBinary, DataWrapperTruthBinary
from bb_tracking.tracking import make_detection_score_fun, SimpleWalker, score_id_sim_v,\
    distance_orientations_v, distance_positions_v, train_bin_clf
from bb_tracking.validation import Validator, validate_plot, plot_fragments

import matplotlib.pyplot as plt
%matplotlib inline

plt.rcParams['figure.figsize'] = (20.0, 10.0)
plt.rcParams.update({'font.size': 12})

## Loading the data

If you are a member of the Biorobotics Lab, ask where you can get access to the ground truth data.
For everyone else, there *might* be a web interface with data access later.

For this tutorial you'll need the original *bb_binary* pipeline output and the corresponding *bb_binary* ground truth data.

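There is a separate `DataWrapper` class for each type: `DataWrapperBinary` wraps the pipeline output and is used for tracking, while `DataWrapperTruthBinary` combines the pipeline output with the ground truth and is used for training and validation. Let's start by loading the data.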

> **Note:** The directory you are pointing to is the one with a file named `.bbb_repo_config.json` in it!
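
If you are unsure which directory to pass, a small check like the following can help. It is only a sketch that uses the standard library; `is_repo_root` and `find_repo_root` are hypothetical helpers written for this tutorial and are not part of *bb_binary*.

```python
from pathlib import Path

# Name of the marker file mentioned in the note above.
REPO_CONFIG_NAME = ".bbb_repo_config.json"


def is_repo_root(path):
    """Return True if `path` directly contains the repository config file."""
    return (Path(path) / REPO_CONFIG_NAME).is_file()


def find_repo_root(path):
    """Walk upwards from `path` and return the first directory with the config file.

    Returns None if no such directory is found.
    (Hypothetical helper, not part of bb_binary.)
    """
    start = Path(path).resolve()
    for candidate in [start, *start.parents]:
        if is_repo_root(candidate):
            return candidate
    return None
```

Calling `find_repo_root` on a path somewhere inside a repository should give you the directory to hand to `Repository`.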

In [3]:
repo_test = Repository("/home/franziska/Documents/BA/bb_analysis_local/data/interim/20150918_Truth/repo_pipeline")
dw_tracking = DataWrapperBinary(repo_test)

merge_radius = 30  # used for merging truth detections with pipeline output
repo_truth = Repository("/home/franziska/Documents/BA/bb_analysis_local/data/interim/20150918_Truth/repo_truth")
dw_truth = DataWrapperTruthBinary(repo_test, repo_truth, merge_radius)

Let's see if we really have some data loaded:

In [4]:
print("\
 Detections tracking: {}\n\
 Detections truth: {}\n\
 Positives: {}\n\
 False Positives: {}".format(
    len(dw_tracking.detections_dict),
    len(dw_truth.detections_dict),
    len(dw_truth.positives),
    len(dw_truth.false_positives)))

 Detections tracking: 18085
 Detections truth: 18085
 Positives: 17863
 False Positives: 222
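
To get an intuition for what a merge radius does, here is a minimal sketch of radius-based matching within a single frame. It assumes that a pipeline detection counts as a positive when an unused truth detection lies within `merge_radius` of it, and as a false positive otherwise. The function `split_by_merge_radius` is a hypothetical illustration; the actual merging in *bb_tracking* may work differently.

```python
import math


def split_by_merge_radius(pipeline, truth, merge_radius):
    """Match each pipeline detection to the nearest unused truth detection.

    `pipeline` and `truth` are lists of (x, y) positions from the same frame.
    Returns (positives, false_positives) as lists of pipeline indices.
    """
    unused = set(range(len(truth)))
    positives, false_positives = [], []
    for i, (px, py) in enumerate(pipeline):
        best, best_dist = None, merge_radius
        for j in sorted(unused):
            dist = math.hypot(px - truth[j][0], py - truth[j][1])
            if dist <= best_dist:
                best, best_dist = j, dist
        if best is None:
            # nothing within the radius: treat as a false positive
            false_positives.append(i)
        else:
            # each truth detection may be matched only once
            unused.remove(best)
            positives.append(i)
    return positives, false_positives


pipeline = [(100, 100), (400, 250), (900, 900)]
truth = [(105, 98), (410, 260)]
positives, false_positives = split_by_merge_radius(pipeline, truth, 30)
print(positives, false_positives)  # [0, 1] [2]
```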


## Perform tracking

### Ready to use Classifier

We provide a ready-to-use classifier that combines a suitable machine learning algorithm with a preselected set of features and hyperparameters.

In [5]:
score_fun_ready, _ = make_detection_score_fun(dw_truth, verbose=True)

ROC AUC: 0.9992
Accuracy on training set: 0.9963
Accuracy on testing set: 0.9958.
Classification Report (10 fold cross validation):
             precision    recall  f1-score   support

          0     0.9779    0.9866    0.9822      2015
          1     0.9984    0.9974    0.9979     17237

avg / total     0.9963    0.9963    0.9963     19252



### Train your own Classifier

You may use any classifier that is compatible with [scikit-learn](http://scikit-learn.org/stable/index.html).
In this example we use a [RandomForest](http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn-ensemble-randomforestclassifier). The features are organized in an ``OrderedDict`` because we depend on the order of the feature functions. For testing purposes we define some ``lambdas`` that take the tracks and the candidate detections, pick the last detection of each track, and pass both to existing scoring functions.

The training function will not only train the classifier, it will also evaluate it and return a scoring function for testing purposes.
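
The reason the order matters can be shown with a toy example: each feature function produces one column, and the columns have to appear in the same order during training and tracking. The sketch below assumes that every feature function receives two parallel lists and returns one value per pair. `build_feature_matrix` and the toy features are hypothetical and only illustrate the idea; they are not the code used inside `train_bin_clf`.

```python
from collections import OrderedDict


def build_feature_matrix(features, tracks, detections):
    """Apply every feature function and return one row per (track, detection) pair."""
    columns = [fun(tracks, detections) for fun in features.values()]
    return [list(row) for row in zip(*columns)]


# Toy stand-ins: tracks and detections are both plain (x, y, id) tuples.
toy_features = OrderedDict()
toy_features['distance'] = lambda tracks, dets: [
    abs(t[0] - d[0]) + abs(t[1] - d[1]) for t, d in zip(tracks, dets)]
toy_features['id_match'] = lambda tracks, dets: [
    1.0 if t[2] == d[2] else 0.0 for t, d in zip(tracks, dets)]

tracks = [(0, 0, 7), (10, 10, 3)]
detections = [(3, 4, 7), (10, 12, 5)]
matrix = build_feature_matrix(toy_features, tracks, detections)

print(list(toy_features))  # ['distance', 'id_match']
print(matrix)              # [[7, 1.0], [2, 0.0]]
```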

In [6]:
frame_diff = 1  # difference between frames, 1 means no gaps
gap = frame_diff - 1  # the length of gaps that are allowed to occur
training_radius = 200  # search radius in image coordinates in training
tracking_radius = 200  # search radius in image coordinates in tracking

features = OrderedDict()
features['distance'] = lambda tracks, detections:\
    distance_positions_v([track.meta[DETKEY][-1] for track in tracks], detections)

features['id_score'] = lambda tracks, detections:\
    score_id_sim_v([track.meta[DETKEY][-1] for track in tracks], detections)

features['rotation'] = lambda tracks, detections:\
    distance_orientations_v([track.meta[DETKEY][-1] for track in tracks], detections)

clf = RandomForestClassifier(n_estimators=10)
_, _, score_fun_test = train_bin_clf(clf, dw_truth, features, frame_diff, training_radius, verbose=True)

ROC AUC: 0.9994
Accuracy on training set: 0.9998
Accuracy on testing set: 0.9983.
Classification Report (10 fold cross validation):
             precision    recall  f1-score   support

          0     0.9976    0.9986    0.9981     18895
          1     0.9984    0.9974    0.9979     17273

avg / total     0.9980    0.9980    0.9980     36168



### Calculate tracks using a Walker

A Walker takes a scoring function and a `DataWrapper` and tries to assign the best-fitting frame objects within a given search radius.
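
The general idea can be sketched as a greedy frame-by-frame linker. This is a simplified illustration and not the implementation of `SimpleWalker`: it assumes that `score_fun` returns a cost where lower is better, that detections are plain `(x, y)` tuples, and that gaps are not allowed. `greedy_tracks` and `euclidean` are hypothetical names.

```python
import math


def greedy_tracks(frames, score_fun, radius):
    """Link detections frame by frame, extending each track with its best candidate.

    `frames` is a list of frames, each a list of (x, y) detections.
    `score_fun(last, candidate)` returns a cost; lower is better.
    """
    tracks = [[det] for det in frames[0]] if frames else []
    open_tracks = list(tracks)
    for frame in frames[1:]:
        free = list(frame)
        still_open = []
        for track in open_tracks:
            last = track[-1]
            in_radius = [d for d in free
                         if math.hypot(d[0] - last[0], d[1] - last[1]) <= radius]
            if not in_radius:
                continue  # no candidate in the search radius: the track ends here
            best = min(in_radius, key=lambda d: score_fun(last, d))
            track.append(best)
            free.remove(best)
            still_open.append(track)
        # every unclaimed detection starts a new track
        new_tracks = [[det] for det in free]
        tracks.extend(new_tracks)
        open_tracks = still_open + new_tracks
    return tracks


def euclidean(a, b):
    """Toy scoring function: the Euclidean distance between two positions."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


frames = [
    [(0, 0), (100, 100)],
    [(5, 0), (104, 98)],
    [(9, 2), (300, 300)],
]
toy_tracks = greedy_tracks(frames, euclidean, radius=20)
print(toy_tracks)
# [[(0, 0), (5, 0), (9, 2)], [(100, 100), (104, 98)], [(300, 300)]]
```

In this toy run the second track ends after two frames because no detection lies within the radius, and the far-away detection in the last frame starts a new track.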

In [None]:
%%time
walker = SimpleWalker(dw_tracking, score_fun_ready, frame_diff, tracking_radius)
tracks = walker.calc_tracks()

Now run the walker again with the scoring function we trained ourselves.

In [None]:
%%time
walker = SimpleWalker(dw_tracking, score_fun_test, frame_diff, tracking_radius)
tracks_test = walker.calc_tracks()

## Validate the results

To validate the results there is a `Validator` class that performs some sanity checks and validates tracks. You may provide a custom scoring function or use the integrated one.

In this scoring we also want the validation algorithm to check whether the tracking algorithm was able to *close* the gaps on the left and right. Finally, we visualize some metrics of the validation scores and the distribution of fragments.
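
To illustrate what a fragment is, the following sketch counts into how many predicted tracks each ground truth track was split. It is a deliberately simplified stand-in and not the metric computed by `Validator`: tracks are assumed to be plain lists of detection ids, and `count_fragments` is a hypothetical helper.

```python
def count_fragments(truth_tracks, predicted_tracks):
    """For each truth track, count how many predicted tracks its detections fall into.

    Both arguments map a track id to a list of detection ids. A count of 1 means
    the truth track was recovered in one piece. Detections that appear in no
    predicted track are ignored in this simplified version.
    """
    owner = {det: tid for tid, dets in predicted_tracks.items() for det in dets}
    return {
        truth_id: len({owner[det] for det in dets if det in owner})
        for truth_id, dets in truth_tracks.items()
    }


truth = {'bee_a': ['d1', 'd2', 'd3', 'd4'], 'bee_b': ['d5', 'd6']}
predicted = {0: ['d1', 'd2'], 1: ['d3', 'd4'], 2: ['d5', 'd6']}
fragments = count_fragments(truth, predicted)
print(fragments)  # {'bee_a': 2, 'bee_b': 1}
```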

In [None]:
validator = Validator(dw_truth)
scores = validator.validate(tracks, gap, gap_l=True, gap_r=True, cam_gap=True)
scores_test = validator.validate(tracks_test, gap, gap_l=True, gap_r=True, cam_gap=True)

In [None]:
_ = validate_plot(tracks, scores, validator, gap)
_ = validate_plot(tracks_test, scores_test, validator, gap)

In [None]:
_ = plot_fragments(scores, validator, gap)
_ = plot_fragments(scores_test, validator, gap)