# Person Search

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vqpy/vqpy/blob/main/examples/person_search/demo.ipynb)

## Introduction

Person search retrieves a target person from videos captured across different camera views.

In this example, we use models from [Fast-ReID](https://github.com/JDAI-CV/fast-reid/) to match the query persons against the gallery persons.

## Environment setup

1. Install VQPy from GitHub

In [None]:
!pip install torch torchvision numpy==1.23.5 cython
!pip install 'vqpy @ git+https://github.com/vqpy/vqpy.git'


2. Install the additional dependencies required by the Fast-ReID models, as listed [here](https://github.com/JDAI-CV/fast-reid/blob/master/INSTALL.md).


In [None]:
!pip install -r https://raw.githubusercontent.com/JDAI-CV/fast-reid/master/docs/requirements.txt

3. Download the video from [here](https://drive.google.com/file/d/1xkCr6uY-wp0ZdhJfEkhd_7XOc0qNmOcq/view?usp=sharing) and save it as `./video/camera.mp4`, the path used in the next step.

4. Set paths

In [None]:
video_path = "./video/camera.mp4"   # path to video
query_folder = "./query/"           # folder containing query images
save_folder = "./vqpy_outputs"      # folder to save the query result

For reference, the working directory should look like:

```text
.
├── video           # camera video examples
│   └── camera.mp4  # video to query on
├── query           # query person
│   └── query_0.jpg # person images for query
└── vqpy            # VQPy repo
    └── vqpy        # VQPy library
```

## Person Search with VQPy

### Step 1: Define `VObj` type for person

Since we are interested in person search, we create a `Person` VObj with two additional properties.


- To store person features, we create a `feature` property that uses a pretrained model to extract features from the person image. Other pretrained models can be substituted by following the Fast-ReID instructions [here](https://github.com/JDAI-CV/fast-reid/blob/master/MODEL_ZOO.md). In our example, we use the [BoT](http://openaccess.thecvf.com/content_CVPRW_2019/papers/TRMTMCT/Luo_Bag_of_Tricks_and_a_Strong_Baseline_for_Deep_Person_CVPRW_2019_paper.pdf) (Bag of Tricks) baseline pretrained on the [MSMT17](https://openaccess.thecvf.com/content_cvpr_2018/papers/Wei_Person_Transfer_GAN_CVPR_2018_paper.pdf) dataset. In addition, we decorate the `feature` property with `@stateful(30)`, which keeps the person features of the last 30 frames so that a match can be evaluated over several frames rather than a single one.

- Since person search usually involves more than one query object, we create a `candidate` property for the `Person` VObj that stores the ID of the best-matching query object and the corresponding similarity score.


In [None]:
import os
import numpy as np
import vqpy
import sys

class Person(vqpy.VObjBase):
    required_fields = ['class_id', 'image']

    feature_predictor = None
    gallery_features = None

    @vqpy.property()
    @vqpy.stateful(30)
    def feature(self):
        """
        extract the feature of person image
        :return: feature vector, shape = (N,)
        """
        image = self.getv('image')
        if image is None:
            return None
        return Person.feature_predictor(image).reshape(-1)

    @vqpy.property()
    def candidate(self):
        """
        retrieve the top-1 matching query object as the search candidate
        :returns:
            ids (int): ID of the query image matched most often
            dist (float): mean similarity of the top-1 matches
        """
        query_features = [self.getv('feature', (-1) * i) for i in range(1, 31)]
        gallery_features = self.getv('gallery_features')

        # find the best-matching query image for each stored frame
        past_ids, past_dist = [], []
        for query_feature in query_features:
            # iterate over features from the last 30 frames
            if query_feature is not None:
                # dot product, used as cosine similarity
                # (assumes the features are L2-normalized)
                dist = np.dot(gallery_features, query_feature)
                past_ids.append(np.argmax(dist))  # index of the most similar query image
                past_dist.append(np.max(dist))  # similarity of that match

        if not past_ids:
            # no feature available yet, so report no match
            return None, 0.0

        ids = np.argmax(np.bincount(past_ids))  # the most frequently matched ID
        dist = np.mean(past_dist)  # the mean similarity over past matches

        return ids, dist


# load the pre-trained model for person feature extraction
sys.path.append("vqpy/examples/person_search/")  # VQPy repo, see the directory layout above
from models import ReIDPredictor
feature_predictor = ReIDPredictor(cfg="MSMT17/bagtricks_R50.yml")

# extract the feature of query images
gallery_features = []
for file_name in os.listdir(query_folder):
    # extract features for all images from given directory
    img_path = os.path.join(query_folder, file_name)
    preds = feature_predictor(img_path)
    gallery_features.append(preds)

gallery_features = np.concatenate(gallery_features, axis=0)

Person.feature_predictor = feature_predictor
Person.gallery_features = gallery_features
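
The matching logic in `candidate` can be tried in isolation. The snippet below is a minimal sketch using plain NumPy and toy 3-dimensional features, not VQPy's own implementation: `deque(maxlen=30)` only stands in for the history that `@stateful(30)` keeps, and the helper names `l2_normalize` and `vote` are hypothetical. It assumes that features are L2-normalized, so that a dot product equals cosine similarity.

```python
from collections import Counter, deque

import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale vectors along the last axis to unit length."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def vote(history, gallery):
    """Return (most frequent top-1 gallery index, mean top-1 similarity)."""
    best_ids, best_sims = [], []
    for feature in history:
        if feature is None:       # frames without a feature contribute nothing
            continue
        sims = gallery @ feature  # cosine similarity, since rows are unit length
        best_ids.append(int(np.argmax(sims)))
        best_sims.append(float(np.max(sims)))
    if not best_ids:              # no usable frame yet
        return None, 0.0
    winner = Counter(best_ids).most_common(1)[0][0]
    return winner, float(np.mean(best_sims))


# Toy data: two query identities in a 3-dimensional feature space.
gallery = l2_normalize(np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]))

# Rolling window of per-frame features; maxlen mirrors the 30-frame history.
history = deque(maxlen=30)
for raw in ([0.9, 0.1, 0.0], None, [0.8, 0.2, 0.1], [0.1, 0.9, 0.0]):
    history.append(None if raw is None else l2_normalize(np.array(raw)))

match_id, score = vote(history, gallery)
```

Two of the three usable frames are closest to identity `0`, so the vote picks it even though the latest frame points at identity `1`. Voting over a window is what makes the match tolerant of a single noisy frame.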

### Step 2: Query on `Person` retrieval

The `candidate` property of the `Person` VObj gives the ID of the most similar query object and the corresponding similarity score. We set a threshold of `0.97` on the candidate score to keep only the persons in the video that match a query.

`filter_cons` is:

In [None]:
filter_cons = {
    '__class__': lambda x: x == Person,
    'candidate': lambda x: x[1] >= 0.97,  # similar threshold
}

For output, we select:

- tracker id, selected with `track_id`
- candidate id, selected with `candidate[0]`. It needs to be converted to `str` before serializing.
- bounding box, as the coordinates of the top-left and bottom-right corners, selected with `tlbr`. It needs to be converted to `str` before serializing.

`select_cons` is:

In [None]:
select_cons = {
    'track_id': None,
    'candidate': lambda x: str(x[0]),  # convert IDs to string
                                       # for JSON serialization
    'tlbr': lambda x: str(x),  # convert to string
                               # for JSON serialization
}
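
The `str` conversions matter because the standard `json` module rejects NumPy integers and arrays. The snippet below is a small illustration of that behavior. It assumes that the candidate ID is a NumPy integer (as returned by `np.argmax`) and that `tlbr` is a NumPy array; the sample values are made up for the demonstration.

```python
import json

import numpy as np

# Hypothetical values shaped like the `candidate` and `tlbr` properties.
candidate = (np.int64(0), 0.98)  # (query ID, similarity score)
tlbr = np.array([155.7875, 232.65001, 224.5375, 431.75], dtype=np.float32)

try:
    json.dumps({"candidate": candidate[0], "tlbr": tlbr})
    raw_ok = True
except TypeError:
    raw_ok = False  # NumPy integers and arrays are not JSON serializable

# Converting to str first, as select_cons does, makes the record serializable.
as_text = json.dumps({"candidate": str(candidate[0]), "tlbr": str(tlbr)})
```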

The query could be:

In [None]:
class PersonSearch(vqpy.QueryBase):
    """The class searching target person from videos"""

    @staticmethod
    def setting() -> vqpy.VObjConstraint:
        return vqpy.VObjConstraint(
            filter_cons=filter_cons,
            select_cons=select_cons,
            filename='person_search'
        )

## Running the query

With the `Person` VObj and the query defined, we can run the query, with:

- `cls_name` is a tuple that maps the numerical outputs of the object detector to literal detection class names

	Here we use `COCO_CLASSES` since it includes all the class names of interest in the person search query, i.e. `"person"`.

- dictionary `cls_type` is then used to map detection class names (as `str`) to the VObj types defined

	`{"person": Person}` means we wish to map the COCO class `person` to the VObj type `Person`

- `tasks` is a list of queries to run on the video
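
The two mappings described above can be pictured as a two-step lookup. The following is only a conceptual sketch in plain Python, and VQPy's internal handling may differ: `COCO_SUBSET` is a shortened stand-in for `vqpy.COCO_CLASSES`, the empty `Person` class is a placeholder, and `vobj_type_for` is a hypothetical helper.

```python
# First three COCO class names; the full list has 80 entries.
COCO_SUBSET = ("person", "bicycle", "car")


class Person:
    """Placeholder for the Person VObj defined earlier."""


cls_type = {"person": Person}


def vobj_type_for(class_id):
    """Map a detector class index to a VObj type, or None if unmapped."""
    name = COCO_SUBSET[class_id]  # index -> class name
    return cls_type.get(name)     # class name -> VObj type


person_type = vobj_type_for(0)  # class 0 is "person"
car_type = vobj_type_for(2)     # "car" has no VObj type in cls_type
```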

In [None]:
vqpy.launch(
    cls_name=vqpy.COCO_CLASSES,
    cls_type={"person": Person},
    tasks=[PersonSearch()],
    video_path=video_path,
    save_folder=save_folder,
)

## Expected result

The result of the query is saved to `{save_folder}/{video_name}_{task_name}_{detector_name}.json`. For this example, that is `./vqpy_outputs/camera_person_search_yolox.json`.

One entry is created for each frame in which the filter condition is satisfied.

For example, the query person images are captured in advance:

<img src="./demo.assets/query.jpg">

Retrieving the target person from the camera video produces entries such as:

```json
{
  "frame_id": 11,
  "data": [
    {
       "track_id": 2,
       "candidate": "0",
       "tlbr": "[155.7875 232.65001 224.5375 431.75 ]"
    }
  ]
}
```
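
Because `tlbr` is stored as a string, it has to be parsed before the coordinates can be used, for example to draw the box. The snippet below is a minimal sketch that reads one entry shaped like the one above from an inline string rather than from the output file; `parse_tlbr` is a hypothetical helper, and the entry is trimmed to the fields it needs.

```python
import json

# One entry in the format shown above, inlined for the example.
entry_text = """
{
  "frame_id": 11,
  "data": [
    {"track_id": 2, "tlbr": "[155.7875 232.65001 224.5375 431.75 ]"}
  ]
}
"""


def parse_tlbr(text):
    """Turn a string such as "[x1 y1 x2 y2]" into a list of four floats."""
    return [float(v) for v in text.strip("[] ").split()]


entry = json.loads(entry_text)

# Map each track ID in this frame to its numeric bounding box.
boxes = {obj["track_id"]: parse_tlbr(obj["tlbr"]) for obj in entry["data"]}
```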

Visualizing the result on the video, the person in the red bounding box is the retrieved object:

<img src="./demo.assets/marked.jpg">

