4 changes: 2 additions & 2 deletions README.md
@@ -1,7 +1,7 @@
<img src="scenarios/media/logo_cvbp.png" align="right" alt="" width="300"/>

```diff
+ Update June 24: Added action recognition as new core scenario.
+ Object tracking coming soon (in 2-4 weeks).
```

@@ -37,7 +37,7 @@ Our target audience for this repository includes data scientists and machine lea
To get started, navigate to the [Setup Guide](SETUP.md), which lists
instructions on how to set up the compute environment and dependencies needed to run the
notebooks in this repo. Once your environment is set up, navigate to the
[Scenarios](scenarios) folder and start exploring the notebooks. We recommend starting with the *image classification* notebooks, since they introduce concepts which are also used by the other scenarios (e.g. pre-training on ImageNet).

Alternatively, we support Binder
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/PatrickBue/computervision-recipes/master?filepath=scenarios%2Fclassification%2F01_training_introduction_BINDER.ipynb)
10 changes: 5 additions & 5 deletions scenarios/tracking/README.md
@@ -1,23 +1,23 @@
# Multi-Object Tracking

```diff
+ June 2020: All notebooks/code in this directory is work-in-progress and might not fully execute.
```

This directory provides examples and best practices for building multi-object tracking systems. Our goal is to enable users to bring their own datasets and easily train a high-accuracy tracking model. While there are many open-source trackers available, we have implemented the [FairMOT tracker](https://github.com/ifzhang/FairMOT) specifically, as its algorithm has shown competitive tracking performance in recent MOT benchmarking challenges, at fast inference speed.

## Technology
Multi-object tracking (MOT) is one of the most active research topics in computer vision, due to its wide range of applications in autonomous driving, traffic surveillance, etc. It builds on object detection technology to detect and track all objects in a dynamic scene over time. Inferring target trajectories correctly across successive image frames remains challenging: occlusion happens when objects overlap, and the number and appearance of objects can change. Compared to object detection algorithms, which aim to output rectangular bounding boxes around the objects, MOT algorithms additionally associate an ID number with each box to identify that specific object across the image frames.
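
To make that last distinction concrete, here is a minimal sketch of the two record types involved: a detection is a per-frame bounding box, while a tracked object carries the same box plus a persistent identity. The class names are illustrative only and are not part of this repository's code.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """Axis-aligned bounding box from an object detector, valid for one frame."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float  # detector confidence

@dataclass
class TrackedObject(Detection):
    """The same box, plus the identity that an MOT algorithm maintains over time."""
    track_id: int  # stays constant for this object across successive frames
```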

As seen in the figure below ([Ciaparrone, 2019](https://arxiv.org/pdf/1907.12740.pdf)), a typical multi-object-tracking algorithm performs part or all of the following steps:
* Detection: Given the input raw image frames (step 1), the detector identifies object(s) on each image frame as bounding box(es) (step 2).
* Feature extraction/motion prediction: For every detected object, visual appearance and motion features are extracted (step 3). Sometimes, a motion predictor (e.g. Kalman Filter) is also added to predict the next position of each tracked target.
* Affinity: The feature and motion predictions are used to calculate similarity/distance scores between pairs of detections and/or tracklets, or the probabilities of detections belonging to a given target or tracklet (step 4).
* Association: Based on these scores/probabilities, a specific numerical ID is assigned to each detected object as it is tracked across successive image frames (step 5). A toy sketch of the affinity and association steps follows the figure below.

<p align="center">
<img src="./media/figure_motmodules2.jpg" width="700" align="center"/>
</p>
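
To illustrate steps 4-5, below is a minimal sketch that scores track/detection pairs by intersection-over-union (IoU) and matches them with the Hungarian algorithm (SciPy's `linear_sum_assignment`). This is a toy illustration under simplified assumptions, not this repository's implementation; a tracker like FairMOT additionally uses learned appearance embeddings and a Kalman-filter motion model to compute affinity.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(tracks, detections, iou_threshold=0.3):
    """Match existing track boxes to new detection boxes.

    Returns (track_index, detection_index) pairs; unmatched detections would
    typically start new tracks, and unmatched tracks would age out.
    """
    if len(tracks) == 0 or len(detections) == 0:
        return []
    # Affinity (step 4): cost = 1 - IoU, so higher overlap means lower cost.
    cost = np.array([[1.0 - iou(t, d) for d in detections] for t in tracks])
    # Association (step 5): Hungarian algorithm minimizes the total cost.
    track_idx, det_idx = linear_sum_assignment(cost)
    # Keep only matches whose overlap clears the threshold.
    return [(t, d) for t, d in zip(track_idx, det_idx)
            if 1.0 - cost[t, d] >= iou_threshold]
```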


## State-of-the-art (SoTA)