
Cosine similarity between CLIP-Reid features of this repo and original repo #1288

Closed · sourabh-patil opened this issue Feb 6, 2024 · 2 comments
Labels: question (Further information is requested)

Comments

@sourabh-patil

Search before asking

  • I have searched the Yolo Tracking issues and found no similar bug report.

Question

Hi! Thanks for sharing this awesome work. I want to improve the features of the CLIP-ReID model. The reason is that it is trained on the Market1501 dataset, which has similar lighting conditions across different instances of the same person. When we try to re-identify a person under different lighting conditions (the camera setup is in a different area), it sometimes fails. So I wanted to fine-tune the CLIP model on our dataset. Before doing so, as a sanity check, I compared features from the CLIP model used in your repo with features from the original CLIP repo. There was a huge difference in terms of cosine similarity (calculated between different IDs): the features from this repo were much better (a considerable gap between same-ID and different-ID similarities) compared to the features from the original repo (the gap was very small). So my question is: did you retrain or fine-tune the original CLIP model, or use it directly as is? (I assumed you used it as is.) Also, do you have any suggestions or comments on the real-world problem we face (different lighting conditions for the same person ID) when doing ReID on people?
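For reference, the sanity check was roughly of the following form (a minimal sketch; `feats` is assumed to map person IDs to lists of feature vectors produced by whichever model is being tested, and `cosine`/`similarity_gap` are illustrative helper names, not functions from either repo):

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def similarity_gap(feats: dict) -> float:
    """Mean same-ID cosine similarity minus mean different-ID similarity.

    feats: dict mapping person ID -> list of feature vectors extracted
    from that person's crops by the ReID model under test.
    """
    same, diff = [], []
    ids = list(feats)
    for pid in ids:
        vecs = feats[pid]
        # pairs of crops belonging to the same identity
        for i in range(len(vecs)):
            for j in range(i + 1, len(vecs)):
                same.append(cosine(vecs[i], vecs[j]))
        # pairs of crops belonging to different identities
        for other in ids:
            if other == pid:
                continue
            for v in vecs:
                for w in feats[other]:
                    diff.append(cosine(v, w))
    # a larger gap means the model separates identities better
    return float(np.mean(same) - np.mean(diff))
```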

sourabh-patil added the question label on Feb 6, 2024
@mikel-brostrom (Owner) commented Feb 6, 2024

So my question is: did you retrain or fine-tune the original CLIP model, or use it directly as is?

I used it as is.

There was a huge difference in terms of cosine similarity (calculated between different IDs): the features from this repo were much better (a considerable gap between same-ID and different-ID similarities) compared to the features from the original repo (the gap was very small)

The only thing that I can think of is that the features are always normalized when inferring with the ReID models in this repo, as seen here:

https://github.com/mikel-brostrom/yolo_tracking/blob/df424189f658dfeecac16eb67a816bc987271dfa/boxmot/appearance/reid_multibackend.py#L310

but this should not affect cosine similarity, since pre-normalizing features to unit length does not change the result. Cosine similarity measures the cosine of the angle between two vectors, so it depends only on their orientation in space, not on their magnitude.
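A quick way to check this yourself (a minimal sketch assuming PyTorch; the feature vectors here are just random placeholders):

```python
import torch
import torch.nn.functional as F

# two arbitrary, unnormalized feature vectors
a = torch.randn(512)
b = torch.randn(512)

# cosine similarity on the raw features
raw = F.cosine_similarity(a, b, dim=0)

# cosine similarity after L2-normalizing each feature to unit length
normed = F.cosine_similarity(F.normalize(a, dim=0), F.normalize(b, dim=0), dim=0)

print(torch.allclose(raw, normed))  # True: normalization does not change the angle
```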

Also, do you have any suggestions or comments on the real-world problem we face (different lighting conditions for the same person ID) when doing ReID on people?

The best results are always achieved by fine-tuning for your specific use case.

@sourabh-patil (Author)

Thanks for the reply. As you rightly said, cosine similarity should not be affected by normalization, and in any case I am comparing normalized features from both models. I suppose I need to check the way I am getting the embeddings.
