# Computer Vision III: Detection, Segmentation and Tracking (CV3DST) GNN Exercise

In this exercise, we will first develop an extension of the ReID-based tracker we created in the previous exercise that will make it more robust to occlusions by allowing it to recover from missed detections. 

We will then implement a Message Passing Network from scratch, and we will use it to build a model that learns to combine position information and ReID features to directly predict associations between past tracks and detections. We will use this model to create a robust tracker.

Your tasks are the following:
- Adapt the track management scheme of our ReIDTracker to allow it to recover from missed detections.
- Implement a Message Passing Network from scratch to operate on bipartite graphs
- Implement the pairwise feature computation to obtain features for our Message Passing Network
- Train the Message Passing Network and improve your tracker's IDF1 score


## Setup

### Download and extract project data to your Google Drive

1.   **Required**: Please follow all instructions of exercise 0 before running this notebook.
2.   Save this notebook to your Google Drive by clicking `Save a copy in Drive` from the `File` menu.
3.   Download [this](https://vision.in.tum.de/webshare/u/brasoand/cv3dst/cv3dst_gnn_exercise.zip) zip file to your desktop, extract it and upload it into the `Colab Notebooks` folder in your Google Drive.

#### Connect the notebook to your Google Drive

In [1]:
from google.colab import drive

drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [11]:
!wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=117OyrCIPF1sPGICZBNg0s6zLuOTufN7e' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=117OyrCIPF1sPGICZBNg0s6zLuOTufN7e" -O cv3dst_exercise.zip && rm -rf /tmp/cookies.txt
!unzip -q cv3dst_exercise.zip

!wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1-5QgzZhaW5VUhiuXsibWa5miIV8c9ZPU' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1-5QgzZhaW5VUhiuXsibWa5miIV8c9ZPU" -O cv3dst_gnn_exercise.zip && rm -rf /tmp/cookies.txt
!unzip -q cv3dst_gnn_exercise.zip

--2021-08-15 15:52:19--  https://docs.google.com/uc?export=download&confirm=Ko86&id=117OyrCIPF1sPGICZBNg0s6zLuOTufN7e
Resolving docs.google.com (docs.google.com)... 108.177.97.101, 108.177.97.102, 108.177.97.100, ...
Connecting to docs.google.com (docs.google.com)|108.177.97.101|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://doc-0k-6k-docs.googleusercontent.com/docs/securesc/skfap0a91jtc73tabjh6gnu46un918aj/p3j83edj5c80idk1qg2ho6c3hcvu19nd/1629042675000/13103035794560160676/09000045850581459307Z/117OyrCIPF1sPGICZBNg0s6zLuOTufN7e?e=download [following]
--2021-08-15 15:52:19--  https://doc-0k-6k-docs.googleusercontent.com/docs/securesc/skfap0a91jtc73tabjh6gnu46un918aj/p3j83edj5c80idk1qg2ho6c3hcvu19nd/1629042675000/13103035794560160676/09000045850581459307Z/117OyrCIPF1sPGICZBNg0s6zLuOTufN7e?e=download
Resolving doc-0k-6k-docs.googleusercontent.com (doc-0k-6k-docs.googleusercontent.com)... 64.233.189.132, 2404:6800:4008:c07::84
Connecting

In [13]:
!wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=12uf2AurrI4u_qmFMFw3xUUgeN8BNdoyl' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=12uf2AurrI4u_qmFMFw3xUUgeN8BNdoyl" -O cv3.zip && rm -rf /tmp/cookies.txt
!unzip -q cv3.zip

--2021-08-15 16:05:14--  https://docs.google.com/uc?export=download&confirm=&id=12uf2AurrI4u_qmFMFw3xUUgeN8BNdoyl
Resolving docs.google.com (docs.google.com)... 108.177.97.139, 108.177.97.101, 108.177.97.113, ...
Connecting to docs.google.com (docs.google.com)|108.177.97.139|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://doc-04-bo-docs.googleusercontent.com/docs/securesc/1v6jmote4ih52v923475k82blb3k6sam/a8u5rjacng9a5k8qcojr1u2adr8esjs4/1629043500000/13103035794560160676/15486079759782646868Z/12uf2AurrI4u_qmFMFw3xUUgeN8BNdoyl?e=download [following]
--2021-08-15 16:05:15--  https://doc-04-bo-docs.googleusercontent.com/docs/securesc/1v6jmote4ih52v923475k82blb3k6sam/a8u5rjacng9a5k8qcojr1u2adr8esjs4/1629043500000/13103035794560160676/15486079759782646868Z/12uf2AurrI4u_qmFMFw3xUUgeN8BNdoyl?e=download
Resolving doc-04-bo-docs.googleusercontent.com (doc-04-bo-docs.googleusercontent.com)... 64.233.189.132, 2404:6800:4008:c07::84
Connecting to 

In [14]:
!cp -f "data_track.py" "/content/cv3dst_gnn_exercise/src/tracker/"
!cp -f "dataset.py" "/content/cv3dst_gnn_exercise/src/gnn/"
!cp -f "tracker.py" "/content/cv3dst_gnn_exercise/src/tracker/"
!cp -f "trainer.py" "/content/cv3dst_gnn_exercise/src/gnn/"
!cp -f "utils.py" "/content/cv3dst_gnn_exercise/src/tracker/"

In [None]:
# !rm *.zip

In [None]:
# root_dir = "gdrive/My Drive/Colab Notebooks/cv3dst_exercise/"
# gnn_root_dir = "gdrive/My Drive/Colab Notebooks/cv3dst_gnn_exercise/"

root_dir = "./cv3dst_exercise/"
gnn_root_dir = "./cv3dst_gnn_exercise/"

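The `root_dir` path points to the directory that holds the exercise data. In the cell above it is set to the locally extracted copy; the commented-out paths point to the same content in your Google Drive.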

In [None]:
# !ls "gdrive/My Drive/Colab Notebooks/cv3dst_gnn_exercise/data"
!ls "./cv3dst_gnn_exercise/data"

preprocessed_data_test_2.pth  preprocessed_data_train_2.pth


#### Install and import Python libraries

In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

!pip install tqdm lap
!pip install https://github.com/timmeinhardt/py-motmetrics/archive/fix_pandas_deprecating_warnings.zip

Collecting lap
  Downloading https://files.pythonhosted.org/packages/bf/64/d9fb6a75b15e783952b2fec6970f033462e67db32dc43dfbb404c14e91c2/lap-0.4.0.tar.gz (1.5MB)
     |████████████████████████████████| 1.5MB 7.7MB/s
Building wheels for collected packages: lap
  Building wheel for lap (setup.py) ... done
  Created wheel for lap: filename=lap-0.4.0-cp37-cp37m-linux_x86_64.whl size=1590143 sha256=19994cbeed5fcdc5b285b1c00221f0298ed5af870ea1e7c87eee0297bc7857b7
  Stored in directory: /root/.cache/pip/wheels/da/3e/af/eddcd6ffaa27df8d0ddac573758f8953c4e57c64c4c8c8b7d0
Successfully built lap
Installing collected packages: lap
Successfully installed lap-0.4.0
     | 276kB 7.7MB/s
Building wheels for collected packages: motmetrics
  Building wheel for motmetrics (setup.py) ... done
  Created wheel for motmetrics: filename=motmetrics-1.1.3-cp37-none-any.whl size=134200 sha256=c4d5b683846c41f5b2bb30e30d1b96c59ca7f32d41f289114b93e2c3990550ea
  Stored in di

In [None]:
import os
import sys
sys.path.append(os.path.join(gnn_root_dir, 'src'))


import matplotlib.pyplot as plt
import numpy as np
import time
from tqdm.autonotebook import tqdm

import torch
from torch.utils.data import DataLoader

from tracker.data_track import MOT16Sequences
from tracker.tracker import Tracker, ReIDTracker
from tracker.utils import run_tracker, cosine_distance
from scipy.optimize import linear_sum_assignment as linear_assignment
import os.path as osp

import motmetrics as mm
mm.lap.default_solver = 'lap'



In [None]:
# !ls "gdrive/My Drive/Colab Notebooks/cv3dst_exercise/data/MOT16/train"
# !ls "gdrive/My Drive/Colab Notebooks/cv3dst_exercise/data/MOT16/test"

!ls "./cv3dst_exercise/data/MOT16/train"
!ls "./cv3dst_exercise/data/MOT16/test"

MOT16-001  MOT16-003  MOT16-04	MOT16-09  MOT16-11
MOT16-002  MOT16-02   MOT16-05	MOT16-10  MOT16-13
MOT16-01  MOT16-03  MOT16-06  MOT16-07	MOT16-08  MOT16-12  MOT16-14


## Speed-Ups
To speed up training and inference, in this exercise we will work with pre-computed detections and ReID embeddings. We ran the object detector provided in Exercise 0 on all frames. We also computed ReID embeddings for all boxes in every frame of the dataset so that they don't need to be recomputed every time you run your tracker. This yields over 10x speed improvements. You will not have to work directly with the resulting files, as we have internally adapted the boilerplate code to work with them.

In [None]:
train_db = torch.load(osp.join(gnn_root_dir, 'data/preprocessed_data_train_2.pth'))

## Exercise Part 0 - Assignment 1's ReIDHungarianTracker

We start by providing a sample solution of the ``ReIDTracker`` from Exercise 1.
It will serve as our baseline.


Recall that this tracker works by performing frame-to-frame bipartite matching between newly detected boxes and past tracks based on ReID distance. Whenever a past track cannot be matched, it is killed. And whenever a newly detected box cannot be matched, it starts a new trajectory.

**NOTE**: We have modified the ``compute_distance`` function in ``data_association`` from last week to include a threshold on ReID distance (if ReID distance > 0.1, matching is not possible). This is important to prevent our tracker from reusing tracks for very dissimilar objects.
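
To make the effect of the threshold concrete, here is a minimal NumPy sketch of the gating step. It is not the code in `tracker/tracker.py`: the helper name `gate_distances` and the toy distance values are assumptions made for illustration, while the threshold 0.1 and the sentinel `_UNMATCHED_COST = 255` are the values used in this notebook.

```python
import numpy as np

_UNMATCHED_COST = 255  # sentinel used in this notebook for "matching not possible"


def gate_distances(distance, max_reid_distance=0.1):
    """Replace every ReID distance above the threshold with the sentinel cost.

    `gate_distances` is a hypothetical helper, not part of the exercise code.
    """
    distance = np.asarray(distance, dtype=float)
    return np.where(distance > max_reid_distance, _UNMATCHED_COST, distance)


# Toy distance matrix: rows are tracks, columns are detections.
toy_distance = np.array([[0.02, 0.40],
                         [0.35, 0.08],
                         [0.50, 0.60]])
gated = gate_distances(toy_distance)
print(gated)
```

In this toy case the third track has no admissible detection, so whichever column the assignment step gives it, the cost equals `_UNMATCHED_COST` and `update_tracks` treats it as unmatched.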


In [None]:
from tracker.tracker import Tracker, ReIDTracker
_UNMATCHED_COST=255
class ReIDHungarianTracker(ReIDTracker):
    def data_association(self, boxes, scores, pred_features):  
        """Refactored from previous implementation to split it into distance computation and track management"""
        if self.tracks:
            # print("!!!origin!!!")
            track_boxes = torch.stack([t.box for t in self.tracks], axis=0)
            track_features = torch.stack([t.get_feature() for t in self.tracks], axis=0)
            
            distance = self.compute_distance_matrix(track_features, pred_features,
                                                    track_boxes, boxes, metric_fn=cosine_distance)

            # Perform Hungarian matching.
            row_idx, col_idx = linear_assignment(distance)            
            self.update_tracks(row_idx, col_idx,distance, boxes, scores, pred_features)

            
        else:
            # No tracks exist.
            self.add(boxes, scores, pred_features)
        
    def update_tracks(self, row_idx, col_idx, distance, boxes, scores, pred_features):
        """Updates existing tracks and removes unmatched tracks.
           Reminder: If the costs are equal to _UNMATCHED_COST, it's not a 
           match. 
        """
        track_ids = [t.id for t in self.tracks]

        unmatched_track_ids = []
        seen_track_ids = []
        seen_box_idx = []
        for track_idx, box_idx in zip(row_idx, col_idx):
            costs = distance[track_idx, box_idx] 
            internal_track_id = track_ids[track_idx]
            seen_track_ids.append(internal_track_id)
            if costs == _UNMATCHED_COST:
                unmatched_track_ids.append(internal_track_id)
            else:
                self.tracks[track_idx].box = boxes[box_idx]  # update the box stored for this track
                self.tracks[track_idx].add_feature(pred_features[box_idx])  # update the track's features
                seen_box_idx.append(box_idx)

        unseen_track_ids = set(track_ids) - set(seen_track_ids)
        unmatched_track_ids.extend(list(unseen_track_ids))
        self.tracks = [t for t in self.tracks
                       if t.id not in unmatched_track_ids]


        # Add new tracks.
        new_boxes_idx = set(range(len(boxes))) - set(seen_box_idx)
        new_boxes = [boxes[i] for i in new_boxes_idx]
        new_scores = [scores[i] for i in new_boxes_idx]
        new_features = [pred_features[i] for i in new_boxes_idx]
        self.add(new_boxes, new_scores, new_features)


In [None]:
val_sequences = MOT16Sequences('MOT16-reid', root_dir = osp.join(root_dir, 'data/MOT16'), vis_threshold=0.)

In [None]:
tracker = ReIDHungarianTracker(None)
run_tracker(val_sequences, db=train_db, tracker=tracker, output_dir=None)

Tracking: MOT16-02
Tracks found: 314
Runtime for MOT16-02: 6.4 s.
Tracking: MOT16-05
Tracks found: 295
Runtime for MOT16-05: 1.8 s.
Tracking: MOT16-09
Tracks found: 87
Runtime for MOT16-09: 1.7 s.
Tracking: MOT16-11
Tracks found: 187
Runtime for MOT16-11: 2.7 s.
Runtime for all sequences: 12.6 s.
          IDF1   IDP   IDR  Rcll  Prcn  GT  MT  PT ML   FP    FN IDs   FM  MOTA  MOTP
MOT16-02 41.6% 59.1% 32.1% 52.2% 96.1%  62  11  39 12  390  8873 203  216 49.1% 0.096
MOT16-05 57.9% 68.5% 50.1% 68.8% 94.0% 133  56  65 12  305  2156 176  146 61.9% 0.142
MOT16-09 52.2% 64.6% 43.8% 66.3% 97.7%  26  13  12  1   82  1793  72   79 63.4% 0.083
MOT16-11 63.4% 69.9% 58.0% 80.2% 96.6%  75  44  24  7  266  1871  88   90 76.4% 0.083
OVERALL  51.6% 64.8% 42.8% 63.5% 96.1% 296 124 140 32 1043 14693 539  531 59.6% 0.099


## Exercise Part I - Long-Term ReID Tracker


The tracker above has an obvious limitation: whenever a track cannot be matched with the detections of a given frame the track will be killed. This means that if our detector misses an object in a single frame (due to e.g. occlusion), we will not be able to recover that track, and we will start a new one. 

To fix this issue, we would like to allow our tracker to maintain tracks that are not matched during data association. We will refer to these tracks as **inactive**. During data association, we will try to match the detected boxes for the current frame to both tracks that are active (i.e. tracks that we were able to match in the previous frame) as well as those that are inactive. Therefore, if a detector misses an object in a frame and the object reappears after a few frames, we will still be able to match it to its corresponding track, instead of creating a new one.

In order to adapt our tracker to have this behavior, we will use the `inactive` attribute from the `track` class (see `tracker/tracker.py`). This attribute will be assigned an integer indicating for how many frames a track has remained unmatched. Whenever we are able to match the track `t`, we will set `t.inactive=0` and, naturally, when tracks are initialized, the class constructor sets `inactive=0`.

Your job is to maintain the `inactive` attribute of all tracks kept by the tracker so that its value represents the number of frames for which the track has been unmatched. Additionally, we introduce a `patience` parameter. Whenever a track has been inactive for more than `patience` frames, it will need to be killed.
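
The bookkeeping described above can be sketched on a toy example without any detector or ReID model. The `ToyTrack` class and the `step` function are hypothetical stand-ins introduced only for illustration; they are not the classes in `tracker/tracker.py`, and the sets of matched ids are made up.

```python
from dataclasses import dataclass


@dataclass
class ToyTrack:
    """Stand-in for the tracker's track objects (hypothetical, illustration only)."""
    id: int
    inactive: int = 0


def step(tracks, matched_ids, patience):
    """Apply one frame of track management and return the surviving tracks."""
    for t in tracks:
        if t.id in matched_ids:
            t.inactive = 0      # matched: the track is active again
        else:
            t.inactive += 1     # unmatched: one more frame without a detection
    # Kill tracks that have been inactive for more than `patience` frames.
    return [t for t in tracks if t.inactive <= patience]


# Track 1 is missed for two frames and then re-detected; track 0 is always matched.
frames = [{0}, {0}, {0, 1}]

tracks = [ToyTrack(id=0), ToyTrack(id=1)]
for matched in frames:
    tracks = step(tracks, matched, patience=2)
recovered = sorted((t.id, t.inactive) for t in tracks)

# With patience=1 the same track is killed after its second missed frame.
strict = [ToyTrack(id=0), ToyTrack(id=1)]
for matched in frames:
    strict = step(strict, matched, patience=1)
survivors = [t.id for t in strict]

print(recovered, survivors)
```

With `patience=2` the missed track survives the gap and is recovered with `inactive == 0`; with `patience=1` it is dropped, and a real tracker would start a new identity for the reappearing detection.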

In [None]:
class LongTermReIDHungarianTracker(ReIDHungarianTracker):
    def __init__(self, patience, *args, **kwargs):
        """ Add a patience parameter"""
        self.patience=patience
        super().__init__(*args, **kwargs)

    def update_results(self):
        """Only store boxes for tracks that are active"""
        for t in self.tracks:
            if t.id not in self.results.keys():
                self.results[t.id] = {}
            if t.inactive == 0: # Only change
                self.results[t.id][self.im_index] = np.concatenate([t.box.cpu().numpy(), np.array([t.score])])

        self.im_index += 1        
        
    def update_tracks(self, row_idx, col_idx, distance, boxes, scores, pred_features):
        track_ids = [t.id for t in self.tracks]

        unmatched_track_ids = []
        seen_track_ids = []
        seen_box_idx = []
        for track_idx, box_idx in zip(row_idx, col_idx):
            costs = distance[track_idx, box_idx] 
            internal_track_id = track_ids[track_idx]
            seen_track_ids.append(internal_track_id)
            if costs == _UNMATCHED_COST:
                unmatched_track_ids.append(internal_track_id)

            else:
                self.tracks[track_idx].box = boxes[box_idx]
                self.tracks[track_idx].add_feature(pred_features[box_idx])
                
                # Note: the track is matched, therefore, inactive is set to 0
                self.tracks[track_idx].inactive=0
                seen_box_idx.append(box_idx)
                

        unseen_track_ids = set(track_ids) - set(seen_track_ids)
        unmatched_track_ids.extend(list(unseen_track_ids))
        ##################
        ### TODO starts
        ##################
        
        # Increment the `inactive` counter of every track that has not been
        # matched, then kill the tracks whose counter is > self.patience.
        for unmatched_track_id in unmatched_track_ids:
            track = self.tracks[track_ids.index(unmatched_track_id)]
            track.inactive += 1

        self.tracks = [t for t in self.tracks
                       if t.inactive <= self.patience]
        
        ##################
        ### TODO ends
        ##################        
        
        new_boxes_idx = set(range(len(boxes))) - set(seen_box_idx)
        new_boxes = [boxes[i] for i in new_boxes_idx]
        new_scores = [scores[i] for i in new_boxes_idx]
        new_features = [pred_features[i] for i in new_boxes_idx]
        self.add(new_boxes, new_scores, new_features)

In [None]:
tracker = LongTermReIDHungarianTracker(patience=20, obj_detect=None)
run_tracker(val_sequences, db=train_db, tracker=tracker, output_dir=None)

Tracking: MOT16-02
Tracks found: 130
Runtime for MOT16-02: 6.7 s.
Tracking: MOT16-05
Tracks found: 155
Runtime for MOT16-05: 2.0 s.
Tracking: MOT16-09
Tracks found: 51
Runtime for MOT16-09: 1.4 s.
Tracking: MOT16-11
Tracks found: 91
Runtime for MOT16-11: 2.9 s.
Runtime for all sequences: 13.0 s.
          IDF1   IDP   IDR  Rcll  Prcn  GT  MT  PT ML   FP    FN IDs   FM  MOTA  MOTP
MOT16-02 47.2% 67.0% 36.4% 52.2% 96.1%  62  11  38 13  390  8873 142  220 49.4% 0.095
MOT16-05 62.5% 74.0% 54.2% 68.8% 94.0% 133  56  65 12  305  2156 119  149 62.7% 0.142
MOT16-09 56.6% 70.0% 47.5% 66.3% 97.7%  26  12  13  1   82  1793  41   83 64.0% 0.085
MOT16-11 69.4% 76.5% 63.5% 80.2% 96.6%  75  44  25  6  266  1871  48   90 76.8% 0.083
OVERALL  56.9% 71.5% 47.3% 63.5% 96.1% 296 123 141 32 1043 14693 350  542 60.0% 0.099


## Exercise Part II - Building a tracker based on Neural Message Passing

Our ``LongTermReIDHungarianTracker`` is still limited when compared to modern trackers.

Firstly, it relies solely on appearance to predict similarity scores between objects. This can be problematic whenever appearance alone is not discriminative, and it would be best to also take into account object position and size attributes. Secondly, our tracker can only account for pairwise similarities among objects. Ideally, we would like it to also consider higher-order information.

To address these limitations, we will now build a tracker that combines both appearance and position information with a Message Passing Neural Network, inspired by the approach presented in [Learning a Neural Solver for Multiple Object Tracking, CVPR 2020](https://arxiv.org/abs/1912.07515).

The overall idea will be to build, for every tracking step, a bipartite graph containing two sets of nodes: past tracks, and detections in the current frame. We will initialize node features with ReID embeddings, and edge features with relative position features and ReID distance. We will use an MPN to refine these edge embeddings. The learning task will be to classify the edge embeddings in this graph, which is equivalent to predicting the entries of our data association similarity matrix.
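
As a small illustration of the learning task, the sketch below builds the binary target for every edge of such a bipartite graph. It assumes that ground-truth identity ids are available for both the past tracks and the current detections; the id values are made up, and the actual target construction in the provided training code may differ.

```python
import numpy as np

track_ids = np.array([7, 3, 9])       # hypothetical identities of past tracks
det_ids = np.array([3, 7, 12, 9])     # hypothetical identities of current detections

# Entry (i, j) is 1 when track i and detection j belong to the same identity,
# which is the label the edge classifier should predict for edge (i, j).
edge_labels = (track_ids[:, None] == det_ids[None, :]).astype(np.float32)
print(edge_labels)
```

The detection with id 12 matches no track, so its column is all zeros; at tracking time such a detection would start a new trajectory.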


### Building an MPN for Bipartite Graphs

We will first build a Neural Message Passing layer based on the Graph Networks framework introduced in [Relational inductive biases, deep learning, and graph networks, arXiv 2018](https://arxiv.org/abs/1806.01261), as explained in the *A More General Framework* slides of [Lecture 5](https://www.moodle.tum.de/pluginfile.php/2928927/mod_resource/content/1/5.MOT2.pdf) (slides 70 to 75).

We will be using a bipartite graph, i.e., we will have two sets of nodes $A$ (past tracks), and $B$ (detections), and our set of edges will be $A\times B$. That is, we will connect every pair of past tracks and detections.

We will have initial node feature matrices (i.e. ReID embeddings) $X_A$ and $X_B$, and an initial edge feature tensor $E$.

$X_A$ and $X_B$ have shape $|A|\times \text{node_dim}$ and $|B|\times \text{node_dim}$, respectively.

$E$ has shape $|A| \times |B| \times \text{edge_dim}$. Its $(i, j)$ entry contains the edge features of node $i$ in $A$ and node $j$ in $B$.

With the given layer, we will produce new node feature matrices $X_A'$ and $X_B'$ and edge features $E'$ with the same dimensions. 
Please refer to the formulas in the slides and figure out how to apply them in this setting.

You are asked to implement both the node and edge update steps in the class below.

**NOTE 1**: Working with a bipartite graph allows us to vectorize all operations in the formulas in a straightforward manner (keep in mind that we store edge features in a matrix). Given a node in $A$, it is connected to all nodes in $B$.

**NOTE 2**: You do not need to care about batching several graphs. This implementation will only work with a single graph at a time.
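
The vectorization mentioned in NOTE 1 can be sketched with plain NumPy arrays. This is only a shape-level illustration under simplifying assumptions: the sizes are arbitrary, the MLPs are omitted, and the edge tensor has `edge_dim` channels (the layer in this notebook additionally concatenates the initial edge features, which is why its edge input has `2*edge_dim` channels).

```python
import numpy as np

num_a, num_b, node_dim, edge_dim = 3, 4, 8, 6   # toy sizes, chosen arbitrarily
rng = np.random.default_rng(0)
x_a = rng.normal(size=(num_a, node_dim))         # X_A
x_b = rng.normal(size=(num_b, node_dim))         # X_B
e = rng.normal(size=(num_a, num_b, edge_dim))    # E

# Node-to-edge: pair every node in A with every node in B via broadcasting.
x_a_tiled = np.broadcast_to(x_a[:, None, :], (num_a, num_b, node_dim))
x_b_tiled = np.broadcast_to(x_b[None, :, :], (num_a, num_b, node_dim))
edge_in = np.concatenate([x_a_tiled, x_b_tiled, e], axis=-1)

# Edge-to-node with 'sum' aggregation: each node sums its incident edges.
agg_a = e.sum(axis=1)   # (|A|, edge_dim): sum over all nodes in B
agg_b = e.sum(axis=0)   # (|B|, edge_dim): sum over all nodes in A
node_a_in = np.concatenate([x_a, agg_a], axis=-1)
node_b_in = np.concatenate([x_b, agg_b], axis=-1)

print(edge_in.shape, node_a_in.shape, node_b_in.shape)
```

Because every node in $A$ is connected to every node in $B$, the aggregation is a plain sum over one axis of the edge tensor and no index lists are needed.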

In [None]:
from torch import nn

class BipartiteNeuralMessagePassingLayer(nn.Module):    
    def __init__(self, node_dim, edge_dim, dropout=0.):
        super().__init__()

        edge_in_dim  = 2*node_dim + 2*edge_dim # 2*edge_dim since we always concatenate initial edge features
        self.edge_mlp = nn.Sequential(*[nn.Linear(edge_in_dim, edge_dim), nn.ReLU(), nn.Dropout(dropout), 
                                    nn.Linear(edge_dim, edge_dim), nn.ReLU(), nn.Dropout(dropout)])

        node_in_dim  = node_dim + edge_dim
        # self.node_mlp = nn.Sequential(*[nn.Linear(node_in_dim, node_dim), nn.ReLU(), nn.Dropout(dropout),  
        #                                 nn.Linear(node_dim, node_dim), nn.ReLU(), nn.Dropout(dropout)])

        # dk
        self.node_mlp_1 = nn.Sequential(*[nn.Linear(node_in_dim, node_dim), nn.ReLU(), nn.Dropout(dropout),  
                                nn.Linear(node_dim, node_dim), nn.ReLU(), nn.Dropout(dropout)])
        self.node_mlp_2 = nn.Sequential(*[nn.Linear(node_in_dim, node_dim), nn.ReLU(), nn.Dropout(dropout),  
                                nn.Linear(node_dim, node_dim), nn.ReLU(), nn.Dropout(dropout)])

    def edge_update(self, edge_embeds, nodes_a_embeds, nodes_b_embeds):
        """
        Node-to-edge updates, as described in slide 71, lecture 5.
        Args:
            edge_embeds: torch.Tensor with shape (|A|, |B|, 2 x edge_dim) 
            nodes_a_embeds: torch.Tensor with shape (|A|, node_dim)
            nodes_b_embeds: torch.Tensor with shape (|B|, node_dim)
            
        returns:
            updated_edge_feats = torch.Tensor with shape (|A|, |B|, edge_dim) 
        """
        
        n_nodes_a, n_nodes_b, _  = edge_embeds.shape
        
        ########################
        #### TODO starts
        ########################
        _, node_dim = nodes_a_embeds.shape
        # edge_in = ... # has shape (|A|, |B|, 2*node_dim + 2*edge_dim) 
        tmp_1 = nodes_a_embeds.reshape(n_nodes_a,1,node_dim)
        tmp_1 = tmp_1.repeat(1,n_nodes_b,1)
        tmp_2 = nodes_b_embeds.reshape(1,n_nodes_b,node_dim)
        tmp_2 = tmp_2.repeat(n_nodes_a,1,1)

        edge_in = torch.cat((tmp_1,tmp_2,edge_embeds),2)

        
        ########################
        #### TODO ends
        ########################
        
        
        return self.edge_mlp(edge_in)

    # dk
    def node_update(self, edge_embeds_1, edge_embeds_2, nodes_a_embeds, nodes_b_embeds, nodes_c_embeds):
        """
        Edge-to-node updates, as described in slide 75, lecture 5.

        Args:
            edge_embeds_1: torch.Tensor with shape (|A|, |B|, edge_dim),
                           the updated edge embeddings between A and B
            edge_embeds_2: torch.Tensor with shape (|B|, |C|, edge_dim),
                           the updated edge embeddings between B and C
            nodes_a_embeds: torch.Tensor with shape (|A|, node_dim)
            nodes_b_embeds: torch.Tensor with shape (|B|, node_dim)
            nodes_c_embeds: torch.Tensor with shape (|C|, node_dim)
            
        returns:
            tuple(
                updated_nodes_a_embeds: torch.Tensor with shape (|A|, node_dim),
                updated_nodes_b_embeds: torch.Tensor with shape (|B|, node_dim),
                updated_nodes_c_embeds: torch.Tensor with shape (|C|, node_dim)
                )
        """
        
        ########################
        #### TODO starts
        ########################
        
        # NOTE: Use 'sum' as aggregation function
        _, _, edge_dim = edge_embeds_1.shape
        # edge_dim = edge_dim / 2
        # edge_embeds = edge_embeds[:,:,:edge_dim]
        edge_embeds_a = torch.sum(edge_embeds_1,dim=1)
        edge_embeds_b = torch.sum(edge_embeds_1,dim=0)
        
        nodes_a_in = torch.cat((nodes_a_embeds,edge_embeds_a),1) # Has shape (|A|, node_dim + edge_dim) 
        nodes_b_in = torch.cat((nodes_b_embeds,edge_embeds_b),1) # Has shape (|B|, node_dim + edge_dim) 

        ########################
        #### TODO ends
        ########################

        nodes_a = self.node_mlp_1(nodes_a_in)
        nodes_b = self.node_mlp_1(nodes_b_in)

        ## next 

        edge_embeds_b = torch.sum(edge_embeds_2,dim=1)
        edge_embeds_c = torch.sum(edge_embeds_2,dim=0)
        
        nodes_b_in = torch.cat((nodes_b,       edge_embeds_b),1) # Has shape (|B|, node_dim + edge_dim) 
        nodes_c_in = torch.cat((nodes_c_embeds,edge_embeds_c),1) # Has shape (|C|, node_dim + edge_dim) 

        nodes_b = self.node_mlp_2(nodes_b_in)
        nodes_c = self.node_mlp_2(nodes_c_in)


        return nodes_a, nodes_b, nodes_c

    def forward(self, edge_embeds_1, edge_embeds_2, nodes_a_embeds, nodes_b_embeds, nodes_c_embeds):
        # edge_embeds_latent = self.edge_update(edge_embeds, nodes_a_embeds, nodes_b_embeds)
        # nodes_a_latent, nodes_b_latent = self.node_update(edge_embeds_latent, nodes_a_embeds, nodes_b_embeds)
        edge_embeds_1_ = self.edge_update(edge_embeds_1, nodes_a_embeds, nodes_b_embeds)
        edge_embeds_2_ = self.edge_update(edge_embeds_2, nodes_b_embeds, nodes_c_embeds)
        nodes_a_latent, nodes_b_latent, nodes_c_latent = self.node_update(edge_embeds_1_, edge_embeds_2_, nodes_a_embeds, nodes_b_embeds, nodes_c_embeds)

        return edge_embeds_1_, edge_embeds_2_, nodes_a_latent, nodes_b_latent, nodes_c_latent


## Building the entire network to predict similarities
We now build the network that generates initial node and edge features, performs neural message passing, and classifies edges in order to produce the final costs that we will use for data association.

You need to implement the method that computes the initial edge features. You can follow [1] and, given two bounding boxes $(x_i, y_i, w_i, h_i)$ and $(x_j, y_j, w_j, h_j)$ with timestamps $t_i$ and $t_j$, compute an initial 5-dimensional edge feature vector as:
$$ E_{(i, j)} = \left( \frac{2(x_j - x_i)}{h_i + h_j}, \frac{2(y_j - y_i)}{h_i + h_j}, \log{\frac{h_i}{h_j}}, \log{\frac{w_i}{w_j}}, t_j - t_i \right) $$


Feel free to engineer your own features (e.g. use IoU, etc.)
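
As a worked instance of the formula above, the following sketch evaluates the five features for a single pair of boxes in plain Python. The helper names `edge_features` and `to_center_size` and the box values are assumptions made for illustration. Following the recommendation in the code cell below, $x$ and $y$ are taken to be box centers, so the corner coordinates are converted first.

```python
import math


def to_center_size(x1, y1, x2, y2):
    """Convert corner coordinates to (center_x, center_y, width, height)."""
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def edge_features(box_i, box_j, t_i, t_j):
    """Five-dimensional edge feature for boxes given as (x, y, w, h)."""
    x_i, y_i, w_i, h_i = box_i
    x_j, y_j, w_j, h_j = box_j
    mean_h = (h_i + h_j) / 2  # 2 * dx / (h_i + h_j) == dx / mean_h
    return (
        (x_j - x_i) / mean_h,
        (y_j - y_i) / mean_h,
        math.log(h_i / h_j),
        math.log(w_i / w_j),
        t_j - t_i,
    )


box_i = to_center_size(100, 50, 140, 150)   # center (120, 100), w=40, h=100
box_j = to_center_size(110, 60, 150, 160)   # center (130, 110), w=40, h=100
feats = edge_features(box_i, box_j, t_i=3, t_j=4)
print(feats)  # (0.1, 0.1, 0.0, 0.0, 1)
```

Dividing the offsets by the mean height makes them roughly scale-invariant: the same 10-pixel shift counts for more between small, distant boxes than between large, nearby ones.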

In [None]:
from torch.nn import functional as F
class AssignmentSimilarityNet(nn.Module):
    def __init__(self, reid_network, node_dim, edge_dim, reid_dim, edges_in_dim, num_steps, dropout=0.):
        super().__init__()
        self.reid_network = reid_network
        self.graph_net = BipartiteNeuralMessagePassingLayer(node_dim=node_dim, edge_dim=edge_dim, dropout=dropout)
        self.num_steps = num_steps
        self.cnn_linear = nn.Linear(reid_dim, node_dim)
        self.edge_in_mlp = nn.Sequential(*[nn.Linear(edges_in_dim, edge_dim), nn.ReLU(), nn.Dropout(dropout), nn.Linear(edge_dim, edge_dim), nn.ReLU(),nn.Dropout(dropout)])
        self.classifier = nn.Sequential(*[nn.Linear(edge_dim, edge_dim), nn.ReLU(), nn.Linear(edge_dim, 1)])
        
    
    def compute_edge_feats(self, track_coords, current_coords, track_t, curr_t):    
        """
        Computes initial edge feature tensor

        Args:
            track_coords: track's frame box coordinates, given by top-left and bottom-right coordinates
                          torch.Tensor with shape (num_tracks, 4)
            current_coords: current frame box coordinates, given by top-left and bottom-right coordinates
                            has shape (num_boxes, 4)
                          
            track_t: track's timestamps, torch.Tensor with shape (num_tracks, )
            curr_t: current frame's timestamps, torch.Tensor with shape (num_boxes,)
            
        
        Returns:
            tensor with shape (num_tracks, num_boxes, 5) containing pairwise
            position and time difference features 
        """

        ########################
        #### TODO starts
        ########################
        
        # NOTE 1: we recommend you to use box centers to compute distances
        # in the x and y coordinates.

        # NOTE 2: Check out the  code inside train_one_epoch function and 
        # LongTrackTrainingDataset class a few cells below to debug this
        
        track_n, _ = track_coords.shape
        current_n, _ = current_coords.shape

        track_coords_ = torch.zeros(track_coords.shape)
        current_coords_ = torch.zeros(current_coords.shape)
        track_coords_[:,0] = (track_coords[:,0]+track_coords[:,2])/2
        track_coords_[:,1] = (track_coords[:,1]+track_coords[:,3])/2
        track_coords_[:,2] = (track_coords[:,2]-track_coords[:,0])
        track_coords_[:,3] = (track_coords[:,3]-track_coords[:,1])
        current_coords_[:,0] = (current_coords[:,0]+current_coords[:,2])/2
        current_coords_[:,1] = (current_coords[:,1]+current_coords[:,3])/2
        current_coords_[:,2] = (current_coords[:,2]-current_coords[:,0])
        current_coords_[:,3] = (current_coords[:,3]-current_coords[:,1])

        track_coords_ = track_coords_.reshape(track_n,1,4)
        current_coords_ = current_coords_.reshape(1,current_n,4)
        track_coords_ = track_coords_.repeat(1,current_n,1)
        current_coords_ = current_coords_.repeat(track_n,1,1)

        track_t_ = track_t.reshape(track_n,1)
        track_t_ = track_t_.reshape(track_n,1,1)
        track_t_ = track_t_.repeat(1,current_n,1)
        curr_t_ = curr_t.reshape(current_n,1)
        curr_t_ = curr_t_.reshape(1,current_n,1)
        curr_t_ = curr_t_.repeat(track_n,1,1)
        # print("track_t_.shape: ",track_t_.shape)
        # print("curr_t_.shape: ",curr_t_.shape)

        edge_feats = torch.zeros(track_n,current_n,5).cuda()
        edge_feats[:,:,0] = 2*(-track_coords_[:,:,0]+current_coords_[:,:,0])/(track_coords_[:,:,3]+current_coords_[:,:,3])
        edge_feats[:,:,1] = 2*(-track_coords_[:,:,1]+current_coords_[:,:,1])/(track_coords_[:,:,3]+current_coords_[:,:,3])
        edge_feats[:,:,2] = torch.log(track_coords_[:,:,3]/current_coords_[:,:,3])
        edge_feats[:,:,3] = torch.log(track_coords_[:,:,2]/current_coords_[:,:,2])
        edge_feats[:,:,4] = (-track_t_+curr_t_).squeeze(-1)
        
        ########################
        #### TODO ends
        ########################

        return edge_feats # has shape (num_tracks, num_boxes, 5)


    def forward(self, track_app, current_app, next_app, track_coords, current_coords, next_coords, track_t, curr_t, next_t):
        """
        Args:
            track_app: tracks' reid embeddings, torch.Tensor with shape (num_tracks, 512)
            current_app: current frame detections' reid embeddings, torch.Tensor with shape (num_boxes, 512)
            next_app: next frame detections' reid embeddings, torch.Tensor with shape (num_next_boxes, 512)
            track_coords: tracks' box coordinates, given by top-left and bottom-right coordinates,
                          torch.Tensor with shape (num_tracks, 4)
            current_coords: current frame box coordinates, given by top-left and bottom-right coordinates,
                            torch.Tensor with shape (num_boxes, 4)
            next_coords: next frame box coordinates, in the same format,
                         torch.Tensor with shape (num_next_boxes, 4)
            track_t: tracks' timestamps, torch.Tensor with shape (num_tracks,)
            curr_t: current frame's timestamps, torch.Tensor with shape (num_boxes,)
            next_t: next frame's timestamps, torch.Tensor with shape (num_next_boxes,)

        Returns:
            classified edges: torch.Tensor with shape (num_steps, num_tracks, num_boxes),
                             containing at entry (step, i, j) the unnormalized probability (logit) that track i and
                             detection j are a match, according to the classifier at the given neural message passing step
        """
        
        # Get initial edge embeddings from position features and reid distances
        # print("track_app.device: ",track_app.device)
        # print("current_app.device: ",current_app.device)
        track_app = track_app.cuda()
        current_app = current_app.cuda()
        next_app = next_app.cuda()
        dist_reid_1 = cosine_distance(track_app, current_app).cuda()
        dist_reid_2 = cosine_distance(current_app, next_app).cuda()
        pos_edge_feats_1 = self.compute_edge_feats(track_coords, current_coords, track_t, curr_t)
        pos_edge_feats_2 = self.compute_edge_feats(current_coords, next_coords, curr_t, next_t)
        # print("pos_edge_feats.device: ",pos_edge_feats.device)
        # print("dist_reid.device: ",dist_reid.device)
        edge_feats_1 = torch.cat((pos_edge_feats_1, dist_reid_1.unsqueeze(-1)), dim=-1) # 5 position features + 1 reid distance = 6
        edge_feats_2 = torch.cat((pos_edge_feats_2, dist_reid_2.unsqueeze(-1)), dim=-1)
        edge_embeds_1 = self.edge_in_mlp(edge_feats_1)
        edge_embeds_2 = self.edge_in_mlp(edge_feats_2)
        initial_edge_embeds_1 = edge_embeds_1.clone()
        initial_edge_embeds_2 = edge_embeds_2.clone()

        # Get initial node embeddings, reduce dimensionality from 512 to node_dim
        track_embeds = F.relu(self.cnn_linear(track_app))
        curr_embeds = F.relu(self.cnn_linear(current_app))
        next_embeds = F.relu(self.cnn_linear(next_app))

        classified_edges = []
        for _ in range(self.num_steps):
            edge_embeds_1 = torch.cat((edge_embeds_1, initial_edge_embeds_1), dim=-1)    
            edge_embeds_2 = torch.cat((edge_embeds_2, initial_edge_embeds_2), dim=-1)         
            edge_embeds_1, edge_embeds_2, track_embeds, curr_embeds, next_embeds = self.graph_net(edge_embeds_1=edge_embeds_1, 
                                                                                                edge_embeds_2=edge_embeds_2, 
                                                                                                nodes_a_embeds=track_embeds, 
                                                                                                nodes_b_embeds=curr_embeds,
                                                                                                nodes_c_embeds=next_embeds
                                                                                                )

            # edge_embeds_1, edge_embeds_2, nodes_a_embeds, nodes_b_embeds, nodes_c_embeds

            classified_edges.append(self.classifier(edge_embeds_1))

        return torch.stack(classified_edges).squeeze(-1)
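
The pairwise features built in `compute_edge_feats` can also be written with broadcasting instead of explicit `repeat` calls. The following is a minimal standalone sketch, not the exercise solution: the function name `pairwise_edge_feats` is hypothetical, and it assumes boxes are given as `(x1, y1, x2, y2)` and that position offsets are measured between box centers.

```python
import torch


def pairwise_edge_feats(track_boxes, det_boxes, track_t, det_t):
    """Hypothetical helper: returns a (num_tracks, num_dets, 5) feature tensor.

    Assumes boxes are (x1, y1, x2, y2) and offsets are taken between centers.
    """
    def to_center_size(boxes):
        boxes = boxes.float()
        w = boxes[:, 2] - boxes[:, 0]
        h = boxes[:, 3] - boxes[:, 1]
        return boxes[:, 0] + 0.5 * w, boxes[:, 1] + 0.5 * h, w, h

    # Tracks become column vectors (T, 1), detections row vectors (1, D),
    # so every arithmetic operation below broadcasts to shape (T, D).
    t_cx, t_cy, t_w, t_h = (v[:, None] for v in to_center_size(track_boxes))
    d_cx, d_cy, d_w, d_h = (v[None, :] for v in to_center_size(det_boxes))

    height_sum = t_h + d_h
    time_diff = det_t[None, :].float() - track_t[:, None].float()
    return torch.stack(
        [
            2 * (d_cx - t_cx) / height_sum,  # normalized x offset
            2 * (d_cy - t_cy) / height_sum,  # normalized y offset
            torch.log(t_h / d_h),            # log height ratio
            torch.log(t_w / d_w),            # log width ratio
            time_diff,                       # elapsed time between track and detection
        ],
        dim=-1,
    )
```

Broadcasting avoids materializing the repeated coordinate tensors, and the result lives on whatever device the inputs are on.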

## Putting everything together

Finally, we incorporate our ``AssignmentSimilarityNet`` into our tracker. We can keep everything as in ``LongTermReIDHungarianTracker`` except for the distance computation, which is now directly obtained via a forward pass through AssignmentSimilarityNet.
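
To make the cost construction concrete, here is a small sketch of how match logits could be turned into assignments. It uses NumPy and SciPy's `linear_sum_assignment` in place of the `linear_assignment` helper from the exercise code. The function name `match_from_logits` and the rule of dropping pairs that land on the unmatched cost are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

UNMATCHED_COST = 255.0  # large cost that marks a pair as "do not match"


def match_from_logits(logits, min_sim=0.5):
    """Hypothetical helper: logits has shape (num_tracks, num_boxes)."""
    sim = 1.0 / (1.0 + np.exp(-logits))  # sigmoid turns logits into similarities
    # Cost is 1 - similarity; pairs below the threshold get the unmatched cost.
    cost = np.where(sim < min_sim, UNMATCHED_COST, 1.0 - sim)
    rows, cols = linear_sum_assignment(cost)  # minimum-cost one-to-one assignment
    # Assumption: pairs assigned at the unmatched cost are discarded.
    keep = cost[rows, cols] < UNMATCHED_COST
    return list(zip(rows[keep].tolist(), cols[keep].tolist()))


logits = np.array([[4.0, -3.0, 0.5],
                   [-2.0, -4.0, 3.0]])
matches = match_from_logits(logits)  # list of (track index, box index) pairs
```

Tracks that receive no pair in the returned list would be treated as unmatched for the current frame.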

In [None]:
_UNMATCHED_COST=255
class MPNTracker(LongTermReIDHungarianTracker):
    def __init__(self, assign_net, *args, **kwargs):
        self.assign_net = assign_net
        super().__init__(*args, **kwargs)
        
    def data_association(self, boxes, scores, pred_features, boxes_2, scores_2, pred_features_2):  
        if self.tracks:  
            track_boxes = torch.stack([t.box for t in self.tracks], axis=0).cuda()
            track_features = torch.stack([t.get_feature() for t in self.tracks], axis=0)
            
            # Hacky way to recover the timestamps of boxes and tracks
            curr_t = self.im_index * torch.ones((pred_features.shape[0],)).cuda()
            next_t = self.im_index * torch.ones((pred_features_2.shape[0],)).cuda()
            track_t = torch.as_tensor([self.im_index - t.inactive - 1 for t in self.tracks]).cuda()

            ########################
            #### TODO starts
            ########################
            
            # Do a forward pass through self.assign_net to obtain our costs.
            # Note: self.assign_net will return unnormalized probabilities. 
            # Make sure to apply the sigmoid function to them!

            # Argument order: appearance (track, current, next), boxes (track, current, next),
            # timestamps (track, current, next)
            pred_sim = self.assign_net(track_features, pred_features, pred_features_2,
                                       track_boxes, boxes, boxes_2,
                                       track_t, curr_t, next_t)
            pred_sim = torch.sigmoid(pred_sim)

            ########################
            #### TODO ends
            ########################

            pred_sim = pred_sim[-1]  # Use predictions at last message passing step
            pred_sim = pred_sim.cpu().numpy()
            distance = (1- pred_sim) 
            
            # Do not allow matches when sim < 0.5, to avoid low-confidence associations
            distance = np.where(pred_sim < 0.5, _UNMATCHED_COST, distance) 

            # Perform Hungarian matching.
            row_idx, col_idx = linear_assignment(distance)            
            self.update_tracks(row_idx, col_idx, distance, boxes, scores, pred_features)

            
        else:
            # No tracks exist.
            self.add(boxes, scores, pred_features)

## Training and evaluating our model

We provide all boilerplate code for training our neural message passing based tracker, as well as for evaluating it.

Under the hood, we sample frames randomly from our training sequences, and then sample boxes from past frames as past_tracks to generate our training data. Check out `LongTrackTrainingDataset` for details.

We train the model with a weighted cross-entropy loss to account for the class imbalance. Check out `train_one_epoch` if you're interested.
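
As a rough illustration of such a weighting (the actual loss lives in `train_one_epoch` and may differ), the sketch below scales the positive class by the negative-to-positive ratio using the `pos_weight` argument of PyTorch's `binary_cross_entropy_with_logits`. The function name `weighted_edge_loss` and the choice of ratio are assumptions.

```python
import torch
import torch.nn.functional as F


def weighted_edge_loss(logits, labels):
    """Hypothetical helper: logits and labels share the same shape."""
    labels = labels.float()
    num_pos = labels.sum()
    num_neg = labels.numel() - num_pos
    # Assumption: weight positives by how much rarer they are than negatives.
    pos_weight = num_neg / num_pos.clamp(min=1.0)
    return F.binary_cross_entropy_with_logits(logits, labels, pos_weight=pos_weight)


# One true match among six candidate edges: the positive edge counts 5x.
logits = torch.zeros(2, 3)
labels = torch.tensor([[1, 0, 0], [0, 0, 0]])
loss = weighted_edge_loss(logits, labels)
```

Without the weight, a classifier that rejects every edge would already reach a low loss, since most track-detection pairs are not matches.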

No need to write any code from your side here!


In [None]:
from gnn.dataset import LongTrackTrainingDataset
from torch.utils.data import DataLoader
from gnn.trainer import train_one_epoch

MAX_PATIENCE = 20 # 20
MAX_EPOCHS = 2 # 15
EVAL_FREQ = 1


# Define our model, and init 
# assign_net = AssignmentSimilarityNet(reid_network=None, # Not needed since we work with precomputed features
#                                      node_dim=32, 
#                                      edge_dim=64, 
#                                      reid_dim=512, 
#                                      edges_in_dim=6, 
#                                      num_steps=10).cuda()

assign_net = AssignmentSimilarityNet(reid_network=None, # Not needed since we work with precomputed features
                                     node_dim=32, 
                                     edge_dim=16, 
                                     reid_dim=512, 
                                     edges_in_dim=6, 
                                     num_steps=10).cuda()

# # We only keep two sequences for validation. You can
# dataset = LongTrackTrainingDataset(dataset='MOT16-train_wo_val2', 
#                                    db=train_db, 
#                                    root_dir= osp.join(root_dir, 'data/MOT16'),
#                                    max_past_frames = MAX_PATIENCE,
#                                    vis_threshold=0.25)

# data_loader = DataLoader(dataset, batch_size=8, collate_fn = lambda x: x, 
#                          shuffle=True, num_workers=2, drop_last=True)
device = torch.device('cuda')
optimizer = torch.optim.Adam(assign_net.parameters(), lr=0.0015) # lr=0.001
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3)

We only leave 2 sequences for validation in order to maximize the amount of training data. For your convenience, here are the LongTermReIDTracker results on them. Your validation IDF1 scores should show an improvement of ~0.5 over them.
```

          IDF1   IDP   IDR  Rcll  Prcn  GT MT PT ML  FP    FN IDs   FM  MOTA  MOTP
MOT16-02 47.2% 67.0% 36.4% 52.2% 96.1%  62 11 38 13 390  8873 142  220 49.4% 0.095
MOT16-11 69.4% 76.5% 63.5% 80.2% 96.6%  75 44 25  6 266  1871  48   90 76.8% 0.083
OVERALL  55.5% 71.2% 45.5% 61.7% 96.3% 137 55 63 19 656 10744 190  310 58.6% 0.090
```



Let's start training!

Note that we have observed quite a lot of noise in validation scores across epochs and runs. This can be explained by the small size of our training and validation sets.
We recommend performing early stopping to keep the model that performs best on validation.
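
One possible way to organize early stopping is sketched below. The class `BestModelKeeper` and its parameter `max_bad_epochs` are hypothetical names, and how the validation IDF1 is extracted from the evaluation output is left open here.

```python
import copy


class BestModelKeeper:
    """Hypothetical helper: remembers the best-scoring state seen so far."""

    def __init__(self, max_bad_epochs=3):
        self.max_bad_epochs = max_bad_epochs
        self.best_score = float("-inf")
        self.best_state = None
        self.bad_epochs = 0

    def update(self, score, state):
        """Record a validation score; return True once training should stop."""
        if score > self.best_score:
            self.best_score = score
            # Deep copy so later training steps cannot overwrite the snapshot.
            self.best_state = copy.deepcopy(state)
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.max_bad_epochs


keeper = BestModelKeeper(max_bad_epochs=2)
# Made-up scores, used only to show the control flow.
for epoch, val_idf1 in enumerate([55.5, 57.0, 56.1, 56.8], start=1):
    should_stop = keeper.update(val_idf1, {"epoch": epoch})
    if should_stop:
        break
```

In the training loop, the state passed to `update` would typically be `assign_net.state_dict()`, and the stored snapshot can be restored with `assign_net.load_state_dict(keeper.best_state)`.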

In [None]:

# for epoch in range(1, MAX_EPOCHS + 1):
#     print(f"-------- EPOCH {epoch:2d} --------")
#     train_one_epoch(model = assign_net, data_loader=data_loader, optimizer=optimizer, print_freq=100)
#     scheduler.step()

#     if epoch % EVAL_FREQ == 0:
#         tracker =  MPNTracker(assign_net=assign_net.eval(), obj_detect=None, patience=MAX_PATIENCE)
#         val_sequences = MOT16Sequences('MOT16-val2', osp.join(root_dir, 'data/MOT16'), vis_threshold=0.)
#         run_tracker(val_sequences, db=train_db, tracker=tracker, output_dir=None)

train_datasets = ['MOT16-train_wo_04','MOT16-train_wo_09', 'MOT16-train_wo_10','MOT16-train_wo_05',
                  'MOT16-train_wo_11','MOT16-train_wo_13', 'MOT16-train_wo_02']
val_datasets = ['MOT16-04', 'MOT16-09', 'MOT16-10', 'MOT16-05', 'MOT16-11', 'MOT16-13', 'MOT16-02']

for epoch in range(1, MAX_EPOCHS + 1):
    print(f"-------- EPOCH {epoch:2d} --------")
    for j in range(len(train_datasets)):
        dataset = LongTrackTrainingDataset(dataset=train_datasets[j], 
                                   db=train_db, 
                                   root_dir= osp.join(root_dir, 'data/MOT16'),
                                   max_past_frames = MAX_PATIENCE,
                                   vis_threshold=0.25)

        data_loader = DataLoader(dataset, batch_size=8, collate_fn = lambda x: x, 
                                shuffle=True, num_workers=2, drop_last=True)
        train_one_epoch(model = assign_net, data_loader=data_loader, optimizer=optimizer, print_freq=100)
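        # Note: the scheduler advances once per training split here, so with step_size=3
        # the learning rate drops every 3 splits rather than every 3 epochs
        scheduler.step()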

        tracker =  MPNTracker(assign_net=assign_net.eval(), obj_detect=None, patience=MAX_PATIENCE)
        val_sequences = MOT16Sequences(val_datasets[j], osp.join(root_dir, 'data/MOT16'), vis_threshold=0.)
        run_tracker(val_sequences, db=train_db, tracker=tracker, output_dir=None)


-------- EPOCH  1 --------


Iter 100. Loss: 1.206. Accuracy: 0.842. Recall: 0.622. Precision: 0.298
Iter 200. Loss: 0.184. Accuracy: 0.990. Recall: 0.984. Precision: 0.902
Iter 300. Loss: 0.082. Accuracy: 0.997. Recall: 0.990. Precision: 0.972
Iter 400. Loss: 0.039. Accuracy: 0.998. Recall: 0.996. Precision: 0.978
Iter 500. Loss: 0.029. Accuracy: 0.998. Recall: 0.998. Precision: 0.979

Tracking: MOT16-04
Tracks found: 94
Runtime for MOT16-04: 43.2 s.
Runtime for all sequences: 43.2 s.
          IDF1   IDP   IDR  Rcll  Prcn GT MT PT ML  FP    FN IDs   FM  MOTA  MOTP
MOT16-04 71.2% 82.8% 62.4% 73.9% 97.9% 83 40 30 13 741 12434  73  287 72.1% 0.102
OVERALL  71.2% 82.8% 62.4% 73.9% 97.9% 83 40 30 13 741 12434  73  287 72.1% 0.102


Iter 100. Loss: 0.035. Accuracy: 0.999. Recall: 0.993. Precision: 0.985
Iter 200. Loss: 0.023. Accuracy: 0.999. Recall: 0.997. Precision: 0.988
Iter 300. Loss: 0.023. Accuracy: 0.997. Recall: 0.999. Precision: 0.977
Iter 400. Loss: 0.012. Accuracy: 0.999. Recall: 0.999. Precision: 0.990
Iter 500. Loss: 0.013. Accuracy: 0.999. Recall: 0.999. Precision: 0.990

Tracking: MOT16-09
Tracks found: 30
Runtime for MOT16-09: 8.8 s.
Runtime for all sequences: 8.8 s.
          IDF1   IDP   IDR  Rcll  Prcn GT MT PT ML FP   FN IDs  FM  MOTA  MOTP
MOT16-09 51.8% 64.1% 43.5% 66.3% 97.7% 26 12 13  1 82 1793  31  76 64.2% 0.081
OVERALL  51.8% 64.1% 43.5% 66.3% 97.7% 26 12 13  1 82 1793  31  76 64.2% 0.081


Iter 100. Loss: 0.013. Accuracy: 0.999. Recall: 0.999. Precision: 0.991
Iter 200. Loss: 0.013. Accuracy: 0.999. Recall: 0.999. Precision: 0.990
Iter 300. Loss: 0.015. Accuracy: 0.999. Recall: 0.998. Precision: 0.988
Iter 400. Loss: 0.011. Accuracy: 0.999. Recall: 0.999. Precision: 0.991
Iter 500. Loss: 0.010. Accuracy: 0.999. Recall: 0.999. Precision: 0.994

Tracking: MOT16-10
Tracks found: 133
Runtime for MOT16-10: 13.7 s.
Runtime for all sequences: 13.7 s.
          IDF1   IDP   IDR  Rcll  Prcn GT MT PT ML   FP   FN IDs   FM  MOTA  MOTP
MOT16-10 61.5% 65.0% 58.3% 80.3% 89.5% 57 37 19  1 1205 2527 221  261 69.2% 0.150
OVERALL  61.5% 65.0% 58.3% 80.3% 89.5% 57 37 19  1 1205 2527 221  261 69.2% 0.150


Iter 100. Loss: 0.008. Accuracy: 1.000. Recall: 0.999. Precision: 0.994
Iter 200. Loss: 0.007. Accuracy: 1.000. Recall: 0.999. Precision: 0.995
Iter 300. Loss: 0.007. Accuracy: 0.999. Recall: 1.000. Precision: 0.992
Iter 400. Loss: 0.006. Accuracy: 1.000. Recall: 1.000. Precision: 0.994
Iter 500. Loss: 0.007. Accuracy: 1.000. Recall: 0.999. Precision: 0.994

Tracking: MOT16-05
Tracks found: 103
Runtime for MOT16-05: 13.1 s.
Runtime for all sequences: 13.1 s.
          IDF1   IDP   IDR  Rcll  Prcn  GT MT PT ML  FP   FN IDs   FM  MOTA  MOTP
MOT16-05 62.9% 74.4% 54.5% 68.8% 94.0% 133 56 65 12 305 2156  65  148 63.5% 0.142
OVERALL  62.9% 74.4% 54.5% 68.8% 94.0% 133 56 65 12 305 2156  65  148 63.5% 0.142


Iter 100. Loss: 0.014. Accuracy: 0.999. Recall: 0.998. Precision: 0.990
Iter 200. Loss: 0.012. Accuracy: 0.999. Recall: 0.999. Precision: 0.989
Iter 300. Loss: 0.014. Accuracy: 0.999. Recall: 0.999. Precision: 0.986
Iter 400. Loss: 0.009. Accuracy: 0.999. Recall: 0.999. Precision: 0.991
Iter 500. Loss: 0.012. Accuracy: 0.999. Recall: 0.999. Precision: 0.988

Tracking: MOT16-11
Tracks found: 89
Runtime for MOT16-11: 14.4 s.
Runtime for all sequences: 14.4 s.
          IDF1   IDP   IDR  Rcll  Prcn GT MT PT ML  FP   FN IDs  FM  MOTA  MOTP
MOT16-11 70.9% 78.1% 64.8% 80.2% 96.6% 75 44 24  7 266 1871  34  90 77.0% 0.083
OVERALL  70.9% 78.1% 64.8% 80.2% 96.6% 75 44 24  7 266 1871  34  90 77.0% 0.083


Iter 100. Loss: 0.010. Accuracy: 0.999. Recall: 0.999. Precision: 0.992
Iter 200. Loss: 0.008. Accuracy: 0.999. Recall: 1.000. Precision: 0.991
Iter 300. Loss: 0.011. Accuracy: 0.999. Recall: 0.999. Precision: 0.990
Iter 400. Loss: 0.009. Accuracy: 0.999. Recall: 0.999. Precision: 0.992
Iter 500. Loss: 0.008. Accuracy: 0.999. Recall: 0.999. Precision: 0.992

Tracking: MOT16-13
Tracks found: 133
Runtime for MOT16-13: 14.8 s.
Runtime for all sequences: 14.8 s.
          IDF1   IDP   IDR  Rcll  Prcn  GT MT PT ML   FP   FN IDs   FM  MOTA  MOTP
MOT16-13 66.0% 67.1% 64.9% 85.5% 88.4% 110 84 22  4 1311 1692 113  196 73.2% 0.138
OVERALL  66.0% 67.1% 64.9% 85.5% 88.4% 110 84 22  4 1311 1692 113  196 73.2% 0.138


Iter 100. Loss: 0.011. Accuracy: 0.999. Recall: 0.998. Precision: 0.992
Iter 200. Loss: 0.009. Accuracy: 0.999. Recall: 0.999. Precision: 0.993
Iter 300. Loss: 0.010. Accuracy: 0.999. Recall: 0.999. Precision: 0.991
Iter 400. Loss: 0.011. Accuracy: 0.999. Recall: 0.999. Precision: 0.991
Iter 500. Loss: 0.009. Accuracy: 0.999. Recall: 0.999. Precision: 0.991

Tracking: MOT16-02
Tracks found: 98
Runtime for MOT16-02: 14.5 s.
Runtime for all sequences: 14.5 s.
          IDF1   IDP   IDR  Rcll  Prcn GT MT PT ML  FP   FN IDs   FM  MOTA  MOTP
MOT16-02 47.9% 68.0% 37.0% 52.2% 96.1% 62 11 38 13 390 8873  97  216 49.6% 0.094
OVERALL  47.9% 68.0% 37.0% 52.2% 96.1% 62 11 38 13 390 8873  97  216 49.6% 0.094
-------- EPOCH  2 --------


Iter 100. Loss: 0.013. Accuracy: 0.999. Recall: 0.998. Precision: 0.988
Iter 200. Loss: 0.012. Accuracy: 0.999. Recall: 0.999. Precision: 0.987
Iter 300. Loss: 0.011. Accuracy: 0.999. Recall: 0.999. Precision: 0.989
Iter 400. Loss: 0.013. Accuracy: 0.999. Recall: 0.999. Precision: 0.988
Iter 500. Loss: 0.009. Accuracy: 0.999. Recall: 0.999. Precision: 0.991

Tracking: MOT16-04
Tracks found: 94
Runtime for MOT16-04: 41.3 s.
Runtime for all sequences: 41.3 s.
          IDF1   IDP   IDR  Rcll  Prcn GT MT PT ML  FP    FN IDs   FM  MOTA  MOTP
MOT16-04 73.0% 84.9% 64.0% 73.9% 97.9% 83 40 30 13 741 12434  62  287 72.2% 0.102
OVERALL  73.0% 84.9% 64.0% 73.9% 97.9% 83 40 30 13 741 12434  62  287 72.2% 0.102


Iter 100. Loss: 0.013. Accuracy: 0.999. Recall: 0.998. Precision: 0.989
Iter 200. Loss: 0.009. Accuracy: 0.999. Recall: 0.999. Precision: 0.990
Iter 300. Loss: 0.008. Accuracy: 0.999. Recall: 0.999. Precision: 0.992
Iter 400. Loss: 0.011. Accuracy: 0.999. Recall: 0.998. Precision: 0.991
Iter 500. Loss: 0.007. Accuracy: 0.999. Recall: 1.000. Precision: 0.990

Tracking: MOT16-09
Tracks found: 31
Runtime for MOT16-09: 8.3 s.
Runtime for all sequences: 8.3 s.
          IDF1   IDP   IDR  Rcll  Prcn GT MT PT ML FP   FN IDs  FM  MOTA  MOTP
MOT16-09 51.0% 63.1% 42.8% 66.3% 97.7% 26 12 13  1 82 1793  33  78 64.2% 0.083
OVERALL  51.0% 63.1% 42.8% 66.3% 97.7% 26 12 13  1 82 1793  33  78 64.2% 0.083


Iter 100. Loss: 0.009. Accuracy: 0.999. Recall: 0.999. Precision: 0.990
Iter 200. Loss: 0.007. Accuracy: 0.999. Recall: 1.000. Precision: 0.992
Iter 300. Loss: 0.011. Accuracy: 0.999. Recall: 0.999. Precision: 0.991
Iter 400. Loss: 0.009. Accuracy: 0.999. Recall: 0.999. Precision: 0.990
Iter 500. Loss: 0.008. Accuracy: 0.999. Recall: 0.999. Precision: 0.992

Tracking: MOT16-10
Tracks found: 124
Runtime for MOT16-10: 13.9 s.
Runtime for all sequences: 13.9 s.
          IDF1   IDP   IDR  Rcll  Prcn GT MT PT ML   FP   FN IDs   FM  MOTA  MOTP
MOT16-10 64.9% 68.6% 61.6% 80.3% 89.5% 57 37 19  1 1206 2528 190  258 69.4% 0.150
OVERALL  64.9% 68.6% 61.6% 80.3% 89.5% 57 37 19  1 1206 2528 190  258 69.4% 0.150


Iter 100. Loss: 0.006. Accuracy: 1.000. Recall: 1.000. Precision: 0.993
Iter 200. Loss: 0.007. Accuracy: 0.999. Recall: 0.999. Precision: 0.991
Iter 300. Loss: 0.008. Accuracy: 1.000. Recall: 0.999. Precision: 0.994
Iter 400. Loss: 0.007. Accuracy: 1.000. Recall: 1.000. Precision: 0.994
Iter 500. Loss: 0.007. Accuracy: 0.999. Recall: 0.999. Precision: 0.993

Tracking: MOT16-05
Tracks found: 102
Runtime for MOT16-05: 12.8 s.
Runtime for all sequences: 12.8 s.
          IDF1   IDP   IDR  Rcll  Prcn  GT MT PT ML  FP   FN IDs   FM  MOTA  MOTP
MOT16-05 62.9% 74.4% 54.5% 68.8% 94.0% 133 56 65 12 305 2156  64  148 63.5% 0.142
OVERALL  62.9% 74.4% 54.5% 68.8% 94.0% 133 56 65 12 305 2156  64  148 63.5% 0.142


Iter 100. Loss: 0.013. Accuracy: 0.999. Recall: 0.999. Precision: 0.988
Iter 200. Loss: 0.011. Accuracy: 0.999. Recall: 0.999. Precision: 0.989
Iter 300. Loss: 0.012. Accuracy: 0.999. Recall: 0.998. Precision: 0.990
Iter 400. Loss: 0.010. Accuracy: 0.999. Recall: 0.999. Precision: 0.990
Iter 500. Loss: 0.009. Accuracy: 0.999. Recall: 0.999. Precision: 0.989

Tracking: MOT16-11
Tracks found: 89
Runtime for MOT16-11: 14.6 s.
Runtime for all sequences: 14.6 s.
          IDF1   IDP   IDR  Rcll  Prcn GT MT PT ML  FP   FN IDs  FM  MOTA  MOTP
MOT16-11 70.9% 78.1% 64.8% 80.2% 96.6% 75 44 24  7 266 1871  34  90 77.0% 0.083
OVERALL  70.9% 78.1% 64.8% 80.2% 96.6% 75 44 24  7 266 1871  34  90 77.0% 0.083


Iter 100. Loss: 0.009. Accuracy: 0.999. Recall: 0.999. Precision: 0.992
Iter 200. Loss: 0.009. Accuracy: 0.999. Recall: 0.999. Precision: 0.990
Iter 300. Loss: 0.007. Accuracy: 0.999. Recall: 1.000. Precision: 0.992
Iter 400. Loss: 0.008. Accuracy: 0.999. Recall: 0.999. Precision: 0.992
Iter 500. Loss: 0.006. Accuracy: 0.999. Recall: 1.000. Precision: 0.992

Tracking: MOT16-13
Tracks found: 134
Runtime for MOT16-13: 14.7 s.
Runtime for all sequences: 14.7 s.
          IDF1   IDP   IDR  Rcll  Prcn  GT MT PT ML   FP   FN IDs   FM  MOTA  MOTP
MOT16-13 66.9% 68.1% 65.8% 85.5% 88.4% 110 84 22  4 1311 1692 110  196 73.3% 0.138
OVERALL  66.9% 68.1% 65.8% 85.5% 88.4% 110 84 22  4 1311 1692 110  196 73.3% 0.138


Iter 100. Loss: 0.010. Accuracy: 0.999. Recall: 0.999. Precision: 0.991
Iter 200. Loss: 0.010. Accuracy: 0.999. Recall: 0.999. Precision: 0.993
Iter 300. Loss: 0.011. Accuracy: 0.999. Recall: 0.999. Precision: 0.990
Iter 400. Loss: 0.009. Accuracy: 0.999. Recall: 1.000. Precision: 0.990
Iter 500. Loss: 0.009. Accuracy: 0.999. Recall: 0.999. Precision: 0.991

Tracking: MOT16-02
Tracks found: 98
Runtime for MOT16-02: 13.9 s.
Runtime for all sequences: 13.9 s.
          IDF1   IDP   IDR  Rcll  Prcn GT MT PT ML  FP   FN IDs   FM  MOTA  MOTP
MOT16-02 47.9% 68.0% 37.0% 52.2% 96.1% 62 11 38 13 390 8873  97  216 49.6% 0.094
OVERALL  47.9% 68.0% 37.0% 52.2% 96.1% 62 11 38 13 390 8873  97  216 49.6% 0.094


In [None]:
torch.save(assign_net, "assign_net_070701.pkl")
!cp assign_net_070701.pkl "/content/gdrive/MyDrive/cv3"

# Exercise submission

The `seq.write_results(results, os.path.join(output_dir))` statement saves the predicted tracks to files. After executing this notebook, the `output` directory in your Google Drive should contain multiple `MOT16-XY.txt` files.

For the final submission, you have to process the test sequences and upload the zipped prediction files to our server. See Moodle for a guide on how to upload the results.
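
If you prefer to build the archive from Python rather than with a shell command, a sketch along the following lines should work. The helper name `zip_predictions` and the archive name are hypothetical, and the check only confirms that prediction files exist, not that their contents are valid.

```python
import shutil
from pathlib import Path


def zip_predictions(output_dir, archive_base):
    """Hypothetical helper: zip a folder of MOT16-XY.txt prediction files.

    Returns the archive path and the names of the prediction files found.
    """
    output_dir = Path(output_dir).resolve()
    files = sorted(output_dir.glob("MOT16-*.txt"))
    if not files:
        raise FileNotFoundError(f"no MOT16-*.txt files found in {output_dir}")
    # Archive the folder itself so the zip unpacks into e.g. `output/`.
    archive = shutil.make_archive(
        str(archive_base),
        "zip",
        root_dir=str(output_dir.parent),
        base_dir=output_dir.name,
    )
    return archive, [f.name for f in files]
```

For example, `zip_predictions("output", "output_submission")` would write `output_submission.zip` next to the notebook and fail early if tracking produced no files.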

In [None]:
!ls

cv3dst_exercise      cv3dst_gnn_exercise      sample_data
cv3dst_exercise.zip  cv3dst_gnn_exercise.zip


In [None]:
# root_dir= osp.join(root_dir, 'data/MOT16')
test_root_dir= "./cv3dst_exercise/data/MOT16"

In [None]:
MAX_PATIENCE = 60
tracker =  MPNTracker(assign_net=assign_net.eval(), obj_detect=None, patience=MAX_PATIENCE)
test_db = torch.load(osp.join(gnn_root_dir, 'data/preprocessed_data_test_2.pth'))
val_sequences = MOT16Sequences('MOT16-test', test_root_dir, vis_threshold=0.)
run_tracker(val_sequences, db=test_db, tracker=tracker, output_dir='output')

Tracking: MOT16-01
No GT evaluation data available.
Tracks found: 53
Runtime for MOT16-01: 5.0 s.
Writing predictions to: output/MOT16-01.txt
Tracking: MOT16-03
No GT evaluation data available.
Tracks found: 82839
Runtime for MOT16-03: 283.7 s.
Writing predictions to: output/MOT16-03.txt
Tracking: MOT16-08
No GT evaluation data available.
Tracks found: 80
Runtime for MOT16-08: 7.0 s.
Writing predictions to: output/MOT16-08.txt
Tracking: MOT16-12
No GT evaluation data available.
Tracks found: 102
Runtime for MOT16-12: 9.7 s.
Writing predictions to: output/MOT16-12.txt
Runtime for all sequences: 305.5 s.


In [None]:
!zip -q -r output_070701.zip ./output

In [None]:
!cp output_070701.zip /content/gdrive/MyDrive/cv3/

In [None]:
patiences = [10, 20, 30, 40, 50]
val_sequences = MOT16Sequences('MOT16-train', osp.join(root_dir, 'data/MOT16'), vis_threshold=0.)
for patience in patiences:
    print("patience: ", patience)
    tracker =  MPNTracker(assign_net=assign_net.eval(), obj_detect=None, patience=patience)    
    run_tracker(val_sequences, db=train_db, tracker=tracker, output_dir=None)

patience:  10
Tracking: MOT16-04
Tracks found: 120
Runtime for MOT16-04: 41.1 s.
Tracking: MOT16-13
Tracks found: 174
Runtime for MOT16-13: 14.4 s.
Tracking: MOT16-10
Tracks found: 177
Runtime for MOT16-10: 13.9 s.
Tracking: MOT16-05
Tracks found: 140
Runtime for MOT16-05: 12.8 s.
Tracking: MOT16-11
Tracks found: 109
Runtime for MOT16-11: 14.6 s.
Tracking: MOT16-02
Tracks found: 134
Runtime for MOT16-02: 13.5 s.
Tracking: MOT16-09
Tracks found: 38
Runtime for MOT16-09: 8.2 s.
Runtime for all sequences: 118.6 s.
          IDF1   IDP   IDR  Rcll  Prcn  GT  MT  PT ML   FP    FN IDs    FM  MOTA  MOTP
MOT16-04 70.7% 82.2% 62.0% 73.9% 97.9%  83  40  30 13  741 12434  87   287 72.1% 0.102
MOT16-13 69.7% 70.9% 68.6% 85.5% 88.4% 110  84  22  4 1311  1692 108   196 73.3% 0.138
MOT16-10 61.4% 64.9% 58.2% 80.3% 89.5%  57  37  19  1 1205  2527 202   261 69.4% 0.150
MOT16-05 59.1% 69.9% 51.2% 68.8% 94.0% 133  56  65 12  305  2156  79   148 63.3% 0.142
MOT16-11 70.4% 77.6% 64.4% 80.2% 96.6%  75  44  

In [None]:
patiences = [60, 70, 80, 90]
val_sequences = MOT16Sequences('MOT16-train', osp.join(root_dir, 'data/MOT16'), vis_threshold=0.)
for patience in patiences:
    print("patience: ", patience)
    tracker =  MPNTracker(assign_net=assign_net_2.eval(), obj_detect=None, patience=patience)    
    run_tracker(val_sequences, db=train_db, tracker=tracker, output_dir=None)

patience:  60
Tracking: MOT16-04
Tracks found: 76
Runtime for MOT16-04: 33.0 s.
Tracking: MOT16-13
Tracks found: 115
Runtime for MOT16-13: 12.0 s.
Tracking: MOT16-10
Tracks found: 100
Runtime for MOT16-10: 10.4 s.
Tracking: MOT16-05
Tracks found: 88
Runtime for MOT16-05: 10.6 s.
Tracking: MOT16-11
Tracks found: 80
Runtime for MOT16-11: 11.1 s.
Tracking: MOT16-02
Tracks found: 71
Runtime for MOT16-02: 10.6 s.
Tracking: MOT16-09
Tracks found: 26
Runtime for MOT16-09: 6.2 s.
Runtime for all sequences: 93.9 s.
          IDF1   IDP   IDR  Rcll  Prcn  GT  MT  PT ML   FP    FN IDs    FM  MOTA  MOTP
MOT16-04 75.3% 87.6% 66.0% 73.9% 97.9%  83  40  30 13  741 12434  49   288 72.2% 0.102
MOT16-13 65.1% 66.2% 64.0% 85.5% 88.4% 110  84  22  4 1311  1692 108   196 73.3% 0.138
MOT16-10 67.1% 71.0% 63.7% 80.3% 89.5%  57  37  19  1 1206  2528 184   258 69.5% 0.150
MOT16-05 61.2% 72.4% 53.0% 68.8% 94.0% 133  56  65 12  305  2156  63   148 63.5% 0.142
MOT16-11 73.8% 81.4% 67.6% 80.2% 96.6%  75  44  24  7