This repository is related to the publication "Improving Vehicle Re-Identification using CNN Latent Spaces: Metrics Comparison and Track-to-track Extension" (https://arxiv.org/abs/1910.09458). This paper is a postprint of the paper submitted to and accepted by IET Computer Vision (https://digital-library.theiet.org/content/journals/iet-cvi).
We define a *track* of a vehicle as a set $T = \{I_1, \dots, I_N\}$ of $N$ images of a vehicle recorded by a given camera. For a given image $I_i$, we extract its latent representation (LR) $x_i$ by projecting it into the latent space of a neural network (in our experiments, the second-to-last layer of a CNN). We construct the matrix $X_T = [x_1, \dots, x_N]$, the LR of the $N$ images of the track $T$.
Given a distance metric $d$, a query track $T_q$, and a set of test tracks $\mathcal{T} = \{T_1, \dots, T_M\}$, the track-to-track (T2T) ranking process consists in ranking every track of $\mathcal{T}$ to construct an ordered set $\{T_{(1)}, \dots, T_{(M)}\}$, such that a track $T_{(k)}$ is the $k$-th nearest track from the query according to the distance $d$, $T_{(1)}$ being the first match (i.e. the nearest) and $T_{(M)}$ being the last (i.e. the farthest).
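Once a track-to-track distance is available, the ranking step itself reduces to sorting the test tracks by their distance to the query. A minimal NumPy sketch (function and variable names here are illustrative, not the package's API):

```python
import numpy as np

def rank_tracks(X_q, test_tracks, t2t_distance):
    """Rank test tracks by increasing distance to the query track.

    X_q          -- LR matrix of the query track (one image LR per row)
    test_tracks  -- list of LR matrices, one per test track
    t2t_distance -- any track-to-track distance function d(X_q, X_T)
    """
    dists = np.array([t2t_distance(X_q, X_T) for X_T in test_tracks])
    order = np.argsort(dists)  # order[0] -> nearest track, order[-1] -> farthest
    return order, dists[order]
```

The ordered set $\{T_{(1)}, \dots, T_{(M)}\}$ is then `[test_tracks[i] for i in order]`.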
The image-to-track (I2T) ranking corresponds to the T2T ranking procedure but with a query track composed of only one image $I_q$ and its corresponding LR $x_q$ (only the distance metric $d$ used differs). In the I2T ranking process, the distance is computed between a query composed of one image, with LR $x_q$, and a test track $T$, with LR matrix $X_T$.
- MED : Minimal Euclidean Distance, $d_{MED}(x_q, X_T) = \min_{i} \|x_q - x_i\|_2$
- MCD : Minimal Cosine Distance, $d_{MCD}(x_q, X_T) = \min_{i} \left(1 - \frac{x_q^\top x_i}{\|x_q\|_2 \|x_i\|_2}\right)$
- RSCR : Residual of the Sparse Coding Reconstruction, $d_{RSCR}(x_q, X_T) = \|x_q - X_T \alpha^*\|_2$, with $\alpha^* = \operatorname{arg\,min}_{\alpha} \|x_q - X_T \alpha\|_2^2 + \lambda \|\alpha\|_1$
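The three I2T distances can be sketched with NumPy and scikit-learn as follows. This is a sketch, not the package's code: the row-per-image LR layout and the lasso penalty value `lam` are assumptions of this example.

```python
import numpy as np
from sklearn.linear_model import Lasso

def med(x_q, X_T):
    # Minimal Euclidean Distance between the query LR and any image LR of the track
    return np.min(np.linalg.norm(X_T - x_q, axis=1))

def mcd(x_q, X_T):
    # Minimal Cosine Distance: 1 - cosine similarity, minimized over the track's LRs
    sims = (X_T @ x_q) / (np.linalg.norm(X_T, axis=1) * np.linalg.norm(x_q))
    return np.min(1.0 - sims)

def rscr(x_q, X_T, lam=0.01):
    # Residual of the Sparse Coding Reconstruction: reconstruct x_q as a sparse
    # linear combination of the track's LRs (lasso), return the residual norm.
    lasso = Lasso(alpha=lam, fit_intercept=False, max_iter=10000)
    lasso.fit(X_T.T, x_q)  # columns of X_T.T are the track's image LRs
    return np.linalg.norm(x_q - X_T.T @ lasso.coef_)
```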
In the track-to-track (T2T) ranking process, the distance is computed between a query track $T_q$, with LR matrix $X_{T_q}$, and a test track $T$, with LR matrix $X_T$.

- If the distance metric is based on MED or MCD, an aggregation function $g$ is used to aggregate the set of I2T distances $\{d(x_i, X_T)\}$ between each image LR $x_i$ of the query and the test track $T$:
- min : minimum of the distances
- mean : average of the distances
- med : median of the distances
- mean50 : average of the 50% smallest distances
- med50 : median of the 50% smallest distances
- If the distance metric is RSCR, the distance is computed directly between the two LR matrices: $d_{RSCR}(X_{T_q}, X_T) = \|X_{T_q} - X_T A^*\|_F$, with $A^* = \operatorname{arg\,min}_{A} \|X_{T_q} - X_T A\|_F^2 + \lambda \|A\|_1$

Note : $\|\cdot\|_F$ denotes the Frobenius norm
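The five aggregation functions for the MED/MCD case can be sketched directly in NumPy (names are illustrative, not the package's API):

```python
import numpy as np

AGGREGATIONS = {
    "min":    np.min,
    "mean":   np.mean,
    "med":    np.median,
    # mean50/med50: keep only the 50% smallest I2T distances before aggregating
    "mean50": lambda d: np.mean(np.sort(d)[: max(1, len(d) // 2)]),
    "med50":  lambda d: np.median(np.sort(d)[: max(1, len(d) // 2)]),
}

def t2t_distance(X_Q, X_T, i2t_distance, agg="min"):
    # Aggregate the I2T distances between each query image LR and the test track
    d = np.array([i2t_distance(x, X_T) for x in X_Q])
    return AGGREGATIONS[agg](d)
```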
The python package `vehicle_reid` contains code to:
- Extract the latent representation of vehicle images using the second-to-last layer of our CNN fine-tuned on the task of vehicle recognition, as proposed in our paper. The CNN considered here is based on the DenseNet201 architecture (https://arxiv.org/abs/1608.06993), which has been fine-tuned using the VeRI dataset (https://github.com/VehicleReId/VeRidataset). The corresponding weights are given in `data/cnn_weights/VeRI_densenet_ft50.pth`.
- Compute the vehicle re-identification ranking between vehicle tracks using the various distance metrics studied in the paper.
- Compute the performance metrics rank1, rank5, and mAP.
The package `vehicle_reid` is composed of 3 modules:
`latent_representation.py`
- Extracts the latent representation (LR) of each vehicle track -> returns a json file containing the LR of each track

`ranking.py`
- Computes the ranking for each query track -> returns a json file containing the ranking for each query track

`performance.py`
- Computes the performance metrics, namely rank1, rank5, and mAP (see paper for details)
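For reference, the three performance metrics can be sketched as follows. This is a generic rank-k / mAP implementation, not the exact code of `performance.py`; each query is represented by its ranked list of match/non-match booleans.

```python
import numpy as np

def rank_k(rankings, k):
    # Fraction of queries whose correct match appears in the top-k results.
    # rankings: one list per query, True where the ranked track is a true match.
    return np.mean([any(r[:k]) for r in rankings])

def mean_average_precision(rankings):
    # mAP: for each query, average the precision at each true-match position,
    # then average over all queries.
    aps = []
    for r in rankings:
        hits, precisions = 0, []
        for i, is_match in enumerate(r, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / i)
        aps.append(np.mean(precisions) if precisions else 0.0)
    return np.mean(aps)
```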
- numpy==1.19.2
- torchvision==0.7.0
- torch==1.6.0
- scikit_learn==0.23.2
The directory `data` contains data to test the module `vehicle_reid`. Note that to reproduce the VeRI experiments presented in the paper, you'll need the VeRI dataset, which can be obtained by simple request to the authors here: https://github.com/JDAI-CV/VeRidataset
- data/cnn_weights/VeRI_densenet_ft50.pth : pre-trained weights for the DenseNet201 architecture. The model has been trained to classify the vehicles of the VeRI training set. Only its latent space (the second-to-last layer) is used to extract features.
- data/image_sample : some VeRI vehicle tracks (split into query and test).
```
python3 run_example.py
```