linear assignment

Object tracking using the linear assignment framework

Object tracking is the process of identifying and following the same object through different frames of a movie. Tracking is usually accomplished by comparing the properties of objects (such as position, size, intensity, or a combination of these properties) and finding objects that are most similar. The data from these objects are grouped together (linked) into a single track.

Example of object tracking

[Figure: Linking objects by nearest distance]

The image above shows the process of tracking bee A from the first frame. In frame 2, there are two candidate positions, B and C (new detections). To determine the new position of bee A, we compute the distance between A in frame 1 and each of the candidate positions B and C in frame 2. In this case, since d2 < d1, it is most likely that bee A has moved to position B.
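
To make the comparison concrete, here is a small Python sketch of this nearest-distance rule; the coordinates are invented for illustration and do not come from the figure.

```python
import numpy as np

# Hypothetical centroids (in pixels): A is the bee in frame 1,
# B and C are the candidate detections in frame 2.
A = np.array([10.0, 12.0])
candidates = {"B": np.array([14.0, 15.0]), "C": np.array([40.0, 30.0])}

# Euclidean (straight-line) distance from A to each candidate.
distances = {name: float(np.linalg.norm(pos - A)) for name, pos in candidates.items()}

# Link A to the closest candidate.
best = min(distances, key=distances.get)
print(distances)            # {'B': 5.0, 'C': ~35.0}
print("Link A ->", best)    # Link A -> B
```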

The linear assignment approach

In the linear assignment framework, each object is assigned a cost that represents how likely it is to be the same object detected in the previous frame. These costs can be either the distance between centroid positions (as in the bee example above) or the ratio of overlapping pixels.

Alternative outcomes, such as the object being lost (e.g. by drifting out of the field of view) or a new object being detected (e.g. by entering the field of view or by cell division), are also considered. These outcomes are assigned an alternative cost C equal to 1.05 × the maximum linking cost. All of these costs are combined into a single cost matrix:

[Figure: The combined cost matrix]

A linear assignment algorithm is then used to assign each row to a column, such that the total cost is minimized. The assignment therefore represents the most likely outcome of each object.
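
As a rough sketch of that step (not the toolbox's own code), SciPy's linear_sum_assignment solver can find the minimum-cost assignment for a small, made-up cost matrix:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Made-up cost matrix: rows are tracked objects, columns are candidate outcomes.
# Lower cost = more likely to be the correct outcome for that object.
cost = np.array([
    [ 5.0, 35.0, 12.0],
    [30.0,  4.0, 11.0],
    [20.0, 25.0,  3.0],
])

rows, cols = linear_sum_assignment(cost)       # minimizes the total cost
for r, c in zip(rows, cols):
    print(f"row {r} -> column {c} (cost {cost[r, c]})")
print("total cost:", cost[rows, cols].sum())   # 5.0 + 4.0 + 3.0 = 12.0
```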

The cost matrix

In this section, we assume that there are M objects tracked from the previous frame and N newly detected objects in the current frame.

The cost matrix consists of four distinct quadrants:

  1. Linking matrix (top left) is an M-by-N matrix containing the cost of linking newly detected objects to objects already tracked from previous frames. These costs are typically either the Euclidean (straight-line) distance between centroid positions or the ratio of overlapping pixels (described below).

  2. Cost for missing objects (top right) is an M-by-M matrix containing the alternative cost of a previously tracked object being absent in the current frame. This could happen if the object left the field of view or is obscured.

    The missing cost matrix contains the alternative cost $C$ along its diagonal and is infinite (Inf) everywhere else.

    The alternative cost is defined as

    $$C = 1.05 \times \max(L)$$

    where $L$ is the linking matrix. Tracks are assigned to this outcome if no better links can be formed.

  3. Cost for new tracks (bottom left) is an N-by-N matrix containing the alternative cost of a new detection being a new object. As above, this matrix contains the alternative cost $C$ along its diagonal and is infinite elsewhere.

    If a new detection is assigned in this quadrant, it starts a new track and is tested for division.

  4. Auxiliary matrix (bottom right) is an N-by-M matrix that completes the cost matrix. It is computed by taking the transpose of the linking matrix and then replacing every element that is less than infinity with the minimum linking cost (i.e. $\min(L)$). Assignments in this quadrant have no meaning; it is only required so that the block cost matrix is complete and every row can be assigned to a column. A sketch of the full construction follows this list.
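
Putting the four quadrants together, a minimal sketch of the construction could look like the following; the function name build_cost_matrix is hypothetical, NumPy/SciPy stand in for the toolbox's own implementation, and a large finite value replaces Inf so the forbidden entries stay solvable:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

BIG = 1e9  # stands in for Inf (a forbidden assignment)

def build_cost_matrix(link):
    """Assemble the (M+N)-by-(M+N) block cost matrix from an M-by-N linking matrix."""
    M, N = link.shape
    finite = link[link < BIG]
    C = 1.05 * finite.max()            # alternative cost: 1.05 x max linking cost

    # Top right (M-by-M): previously tracked object missing in the current frame.
    missing = np.full((M, M), BIG)
    np.fill_diagonal(missing, C)

    # Bottom left (N-by-N): new detection starts a new track.
    new_track = np.full((N, N), BIG)
    np.fill_diagonal(new_track, C)

    # Bottom right (N-by-M): transpose of the linking matrix with every
    # finite entry replaced by the minimum linking cost.
    aux = link.T.copy()
    aux[aux < BIG] = finite.min()

    return np.block([[link, missing], [new_track, aux]])

# Toy example: 2 existing tracks, 3 new detections (costs are made-up distances).
link = np.array([
    [ 5.0, 35.0, 18.0],
    [30.0,  4.0, 22.0],
])
cost = build_cost_matrix(link)
rows, cols = linear_sum_assignment(cost)
for r, c in zip(rows, cols):
    print(f"row {r} -> column {c} (cost {cost[r, c]:.2f})")
```

In the output, rows 0 to M-1 are existing tracks and columns 0 to N-1 are new detections; a track assigned to a right-hand column is treated as missing, a detection assigned in the bottom-left quadrant starts a new track, and assignments in the auxiliary quadrant carry no meaning.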

Changes from original publication

This toolbox introduces several changes to Jaqaman's original description of the algorithm:

  • This code was originally developed to track non-motile rod-shaped bacteria. Because of the rod shape, using the centroid position to link cells resulted in linking errors with parallel cells. We introduced the pxintersect metric, which computes the ratio of overlapping pixels and performed much better.
  • The original code used the Munkres/Hungarian algorithm to solve the linear assignment problem. Here, we use the Jonker-Volgenant (JV) algorithm. In internal testing, we found that the JV algorithm solved the assignment problem about 10x faster than the Hungarian algorithm.
  • In the original implementation, tracks were linked throughout the whole movie and a correction step was then introduced to find merging or division events. Here, we test for division events whenever a new track is created rather than waiting until the end of the movie.

Ratio of overlapping pixels

[Figure: Overlapping pixels]

Since we work with rod-shaped, non-motile bacteria, using the centroid positions to link cells generated many errors between parallel cells. Hence, we developed a metric named pxintersect to calculate the ratio of overlapping pixels. The metric is

$$L = \frac{|A \cup B|}{|A \cap B|} = \frac{\mathrm{Number\ pixels\ union}}{\mathrm{Number\ pixels\ intersect}}$$

The number of pixels in the union is the total number of unique pixels covered by the two masks, while the number of pixels in the intersection is the number of pixels that appear in both masks.
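
A minimal sketch of this metric on boolean masks (illustrative only; the toolbox's pxintersect implementation may differ):

```python
import numpy as np

def pxintersect(mask_a, mask_b):
    """Ratio of overlapping pixels: union / intersection of two boolean masks.

    Identical masks give 1.0; masks with no overlap give inf (no possible link).
    """
    union = np.count_nonzero(mask_a | mask_b)
    intersect = np.count_nonzero(mask_a & mask_b)
    return np.inf if intersect == 0 else union / intersect

# Two made-up one-pixel-wide rods that share part of their length.
img_shape = (5, 10)
cell_prev = np.zeros(img_shape, dtype=bool); cell_prev[2, 0:6] = True  # previous frame
cell_curr = np.zeros(img_shape, dtype=bool); cell_curr[2, 2:8] = True  # current frame

print(pxintersect(cell_prev, cell_curr))   # union 8 / intersection 4 = 2.0
```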

During cell division, this metric allows a division event to be detected because two cells in the current frame overlap the same cell in the previous frame.
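
As a rough sketch of that test (the toolbox's actual division logic may differ), two new detections that both have a finite overlap ratio with the same cell from the previous frame can be flagged as daughters, reusing the pxintersect function above:

```python
# Made-up masks: one mother cell from the previous frame and two detections in
# the current frame that each cover roughly half of the mother's footprint.
mother = np.zeros((5, 12), dtype=bool);     mother[2, 1:11] = True
daughter_a = np.zeros((5, 12), dtype=bool); daughter_a[2, 1:6] = True
daughter_b = np.zeros((5, 12), dtype=bool); daughter_b[2, 6:11] = True

# Both detections overlap the same previous cell, so flag a division event.
overlapping = [d for d in (daughter_a, daughter_b)
               if np.isfinite(pxintersect(mother, d))]
if len(overlapping) >= 2:
    print("division detected: two new cells overlap the same previous cell")
```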

[Figure: Overlapping pixels]

< Back to User Guide