Create clusters of unassigned faces with variable "similarity score" #945

g3n35i5 · 2023-08-14T08:28:38Z

Describe the feature you'd like to request

First of all, I would like to thank all the developers for this awesome app! I use it in combination with Memories and it works really well. I suggested in pulsejet/memories#721 that it would be a cool feature to cluster unassigned faces by a variable similarity score. The developer made a few suggestions and comments, but pointed out that it would make more sense to implement this feature directly in recognize.

Describe the solution you'd like

A UI slider (with a sane default value, let's say 70%) that defines a similarity threshold for persons. The list/grid of unassigned faces is then clustered so that faces whose similarity to each other is greater than the value selected with the slider are grouped together.
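To make the idea concrete, here is a rough sketch of what such a threshold could do. It is not how Recognize or Memories work internally: the cosine similarity measure, the function names, and the toy 2-D vectors are all assumptions for illustration (real face descriptors have many more dimensions). Faces are linked whenever their similarity reaches the slider value, and linked faces end up in the same cluster.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors, in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def cluster_by_similarity(vectors, threshold):
    """Group indices so that any two vectors with similarity >= threshold
    share a cluster (transitively), using a small union-find."""
    parent = list(range(len(vectors)))

    def find(i):
        # Walk up to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine_similarity(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(vectors)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical face descriptors: two faces near each axis.
faces = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]

default_clusters = cluster_by_similarity(faces, 0.7)    # slider at 70%
strict_clusters = cluster_by_similarity(faces, 0.999)   # slider near 100%
```

With the slider at 70% the four toy faces fall into two groups; pushed close to 100%, every face stays on its own. Note that the linking is transitive, so a low threshold can chain loosely related faces into one large cluster.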

Describe alternatives you've considered

N/A

marcelklehr · 2023-08-15T15:34:01Z

Hello @g3n35i5

Thanks for your kind words and for giving your input on the face clustering. We're always open to improving the clustering algorithm. We used to use an implementation of the DBSCAN algorithm, which basically takes the vector distance between two face description vectors to decide whether they should be in the same cluster or not. This did not lead to great results, however. We now use HDBSCAN, which is more involved than DBSCAN in that it also factors in the density of face detections in the vector space. There are some variables that can be tweaked, but they are non-trivial. So far I've opted to go for a good default experience rather than implementing knobs for people to tweak the system themselves. We have calibrated the current variables on a dataset of 20,000 pictures from IMDB, and I really think the current settings yield the best results we are going to get with the current face descriptor model, i.e. if we want better results, I think we need to invest more effort than simply changing a variable.
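For readers unfamiliar with DBSCAN, a minimal textbook-style sketch follows. It is not the Recognize implementation, and the parameter names `eps` and `min_samples` as well as the toy 2-D points are assumptions for illustration. It shows how a single fixed distance radius decides cluster membership, which is the part HDBSCAN replaces with a density-aware approach.

```python
import math


def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue  # already assigned
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


# Toy data: a tight group, a pair, and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
labels = dbscan(points, eps=1.5, min_samples=2)
```

Here the first three points form one cluster, the pair forms a second, and the isolated point is marked as noise. Everything hinges on the one global `eps` value, so groups of differing density are hard to capture with a single setting.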

Also, feel free to chime in on the discussion thread at #754 ♥️

g3n35i5 · 2023-08-16T19:08:02Z

Thanks @marcelklehr for your quick feedback! I can completely understand why under these circumstances further feature development from your side has a lower priority than improving the algorithm itself. Unfortunately, I can only provide limited input on the topic of clustering algorithms.

g3n35i5 added the "enhancement" (New feature or request) label on Aug 14, 2023

github-actions bot added this to Backlog in Recognize Aug 14, 2023
