feat(server): separate face clustering job #5598
This PR is quite large lol. Can you give a summary of the changes in the description? Like renamed the queue, added a second queue, etc.?
Haha yeah, I added some more details to the description
I managed to address the clustering simplifications. (1) is solved by moving non-core faces to the back of the queue, effectively making it a priority queue. This gives a guarantee (at least for All jobs) that any person a core face finds is from another core face. (2) is solved by making a separate search for a person. This will work regardless of library size as the query will only return a face with a person. The results on a library with a few thousand images are almost perfect and better than the current algorithm. Now to make it not fail all of our checks...
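As a rough illustration of the deferral idea in (1), here is a minimal sketch, not the code in this PR. It assumes each queued face already knows whether it is a core face (one with at least minFaces faces around it); the `QueuedFace` type, the `deferred` flag and `processingOrder` are hypothetical names used only for this example.

```typescript
// Hypothetical shape of a queued face; `isCore` is assumed to be known up front.
type QueuedFace = { id: string; isCore: boolean; deferred: boolean };

// Returns the order in which faces would actually be processed when
// non-core faces are sent to the back of the queue once.
function processingOrder(faceQueue: QueuedFace[]): string[] {
  const queue = [...faceQueue];
  const order: string[] = [];
  while (queue.length > 0) {
    const item = queue.shift();
    if (item === undefined) {
      break;
    }
    if (!item.isCore && !item.deferred) {
      // First time we see a non-core face: requeue it behind everything else.
      queue.push({ ...item, deferred: true });
      continue;
    }
    order.push(item.id);
  }
  return order;
}

const deferredOrder = processingOrder([
  { id: 'a', isCore: false, deferred: false },
  { id: 'b', isCore: true, deferred: false },
  { id: 'c', isCore: false, deferred: false },
  { id: 'd', isCore: true, deferred: false },
]);
console.log(deferredOrder); // core faces b and d come first, then a and c
```

Because every non-core face is pushed back exactly once, all core faces that were queued at the start are handled before any non-core face, which is what gives the priority-queue behavior without a dedicated priority field.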
I obviously can't speak for the Python code, let alone the whole facial recognition logic. However, the code I could assess looks really good imo!
Great job :)
This is a huge change, but I think it looks really good. Being able to run the clustering algorithm independently of the embedding computation will be very nice to have. LGTM!
formatting linting
remove unused imports
formatting
remove unused import
fix tests
Description
This PR separates the current facial recognition job into two parts: generating faces and clustering faces, or Face Detection and Facial Recognition. This forms the basis for future improvements to clustering that can leverage the fact that each job has access to the full set of face embeddings.
The queue job for facial recognition now waits for face detection to finish, while face detection jobs queue this queue job. The queue facial recognition job is assigned a job ID so BullMQ only queues the first instance of it.
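To make the de-duplication concrete, here is a small in-memory sketch of a queue that ignores a job whose ID is already pending. It only imitates the behavior relied on above (BullMQ ignoring an added job whose ID already exists); `DedupQueue`, the job name and the ID string are hypothetical and this is not the BullMQ API. As a simplification, the stand-in frees an ID as soon as the job is taken off the queue.

```typescript
// Hypothetical job shape for this sketch only.
type Job = { name: string; jobId?: string };

class DedupQueue {
  private readonly pending: Job[] = [];
  private readonly seenIds = new Set<string>();

  // Returns true if the job was queued, false if a job with the same ID is already pending.
  add(job: Job): boolean {
    if (job.jobId !== undefined) {
      if (this.seenIds.has(job.jobId)) {
        return false;
      }
      this.seenIds.add(job.jobId);
    }
    this.pending.push(job);
    return true;
  }

  // Removes and returns the next job, freeing its ID so it can be queued again later.
  next(): Job | undefined {
    const job = this.pending.shift();
    if (job !== undefined && job.jobId !== undefined) {
      this.seenIds.delete(job.jobId);
    }
    return job;
  }

  get size(): number {
    return this.pending.length;
  }
}

// Three face detection jobs each try to queue the same recognition job.
const recognitionQueue = new DedupQueue();
const queuedResults = [1, 2, 3].map(() =>
  recognitionQueue.add({ name: 'queue-facial-recognition', jobId: 'queue-facial-recognition' }),
);
console.log(queuedResults, recognitionQueue.size);
```

Only the first `add` call succeeds, so no matter how many detection jobs finish, a single recognition queueing job is waiting.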
The clustering job no longer requires embeddings to be re-generated, so it's very quick to change options and re-cluster. For both face detection and facial recognition, running on all assets still requires re-clustering.
The clustering algorithm is also overhauled. Due to the increased number of faces during search, looking at just the nearest neighbor led to many duplicate people.
As part of this, minFaces is now incorporated into the clustering algorithm. When minFaces is set to 1, this algorithm performs like a better version of the one on main (no distinction between core and non-core points, but all faces have access to the full set of embeddings and cluster sequentially). When set higher, it increases the precision of the clustering at the cost of increasing the chance that a face is not assigned to a person. The default is increased to 3 for higher precision; this produced the best results during testing without excluding relevant images.
A perk of this change is that default thumbnails for people will now be better on average, with a lower chance of blurred or off-angle faces. This is because only core faces (described below) can generate thumbnails.
Some other minor changes:
- getAll and getAllFaces queries are now paginated
- waitForQueueCompletion method for job hierarchy (e.g. running all detection jobs before recognition jobs can start)
- unlink method now warns if a file doesn't exist instead of throwing
- The embedding column of asset_faces is now (unfortunately) selected by default
  - Trying to explicitly select this column doesn't work because the column "doesn't exist" according to TypeORM, but it works if it's selected by default
Algorithm
The clustering algorithm has been updated to a variant of DBSCAN, including a concept of "core" points (points with a minimum number of faces around them). During search, a person is only created if there are no points around with an assigned person and the current point is a core point. Core points are additionally allowed to assign a person to all un-assigned points around them, while non-core points can only assign to themselves.
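The rules in the paragraph above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: it uses a brute-force Euclidean neighbor search instead of an index, counts a face as its own neighbor, and skips faces that already have a person. `Face`, `clusterFaces`, `maxDistance` and `makeFaces` are hypothetical names.

```typescript
type Face = { id: string; embedding: number[]; personId: number | null };

// Euclidean distance between two embeddings of equal length.
function distance(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((sum, value, i) => sum + (value - (b[i] ?? 0)) ** 2, 0));
}

// Assigns personId in place, processing faces sequentially.
function clusterFaces(faces: Face[], maxDistance: number, minFaces: number): void {
  let nextPersonId = 1;
  for (const face of faces) {
    if (face.personId !== null) {
      continue; // already assigned by an earlier core face
    }
    // Neighbors within range, nearest first; the face itself is included (distance 0).
    const neighbors = faces
      .map((other) => ({ other, dist: distance(face.embedding, other.embedding) }))
      .filter((n) => n.dist <= maxDistance)
      .sort((x, y) => x.dist - y.dist);
    const isCore = neighbors.length >= minFaces;
    const assigned = neighbors.find((n) => n.other.personId !== null);
    let personId = assigned ? assigned.other.personId : null;
    if (personId === null && isCore) {
      personId = nextPersonId++; // only a core face may create a new person
    }
    if (personId === null) {
      continue; // non-core face with no person nearby stays unassigned
    }
    face.personId = personId;
    if (isCore) {
      // Core faces also assign the person to unassigned faces around them.
      for (const n of neighbors) {
        if (n.other.personId === null) {
          n.other.personId = personId;
        }
      }
    }
  }
}

// Toy 1-D embeddings: three faces close together and one far away.
function makeFaces(): Face[] {
  return [0, 1, 2, 50].map((x, i) => ({ id: `face-${i}`, embedding: [x], personId: null }));
}

const clustered = makeFaces();
clusterFaces(clustered, 1.5, 3);
console.log(clustered.map((f) => f.personId)); // the isolated face stays unassigned
```

With minFaces set to 3 the three nearby faces end up with one person and the outlier gets none; with minFaces set to 1 every face is a core point, so the outlier becomes its own person, which matches the precision trade-off described for minFaces.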
There were two simplifications here:
1. Core faces are allowed to extend from the people of non-core faces during search (normally non-core points cannot extend a cluster)
   - This would require looking up the density of each neighbor, which is difficult to do efficiently with the current job system
2. Core faces can only reassign 100 - 1000 faces at a time depending on library size (normally all points in range would be extended, not just the top K)
   - HNSW indices are not optimized for range queries, so a limit is needed for performance. This can theoretically cause duplicates if the number of faces for a person is very high, so might need to be tweaked in the future.
I also experimented with clustering with the ML service using HDBSCAN, a more sophisticated algorithm than DBSCAN, but ran into some issues:
How Has This Been Tested?
Results on a toy dataset are perfect. Results on a library with a few thousand images are nearly perfect and better than the current clustering algorithm.
The only thing I'm unsure about is how the algorithm performs for other libraries. It's possible there are edge cases I haven't encountered.
Fixes #6441
Fixes #4087