Clustering fails for large number of faces #725

Open
WayneBooth opened this issue Feb 26, 2024 · 1 comment


@WayneBooth

Expected behaviour

Resource consumption should scale with the number of new faces to process, not with the number of faces already processed. Adding one new face should take the same processing power, time and memory whether it is the first face ever processed or a new face added to a database of 99 million.

Actual behaviour

Clustering takes exponentially longer as more images are added to the system. Once around 100k images have been processed, clustering can take more than 4-5 hours, or fail outright, just to add a single new face.
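
To put rough numbers on that growth, here is a minimal sketch. It assumes the clustering step compares every face with every other face. That is an assumption for illustration only; nothing in this issue shows how the clustering is actually implemented. Under that assumption the work grows with the number of face pairs, which is quadratic rather than strictly exponential, but is still enough to make each run much slower than the last. `pair_count` is a hypothetical helper, not part of the app.

```python
def pair_count(n: int) -> int:
    """Number of unordered pairs among n faces: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# 65_831 is the face count reported in the background task log;
# the other sizes match the 10k batches and the ~100k failure point.
for n in (10_000, 65_831, 100_000):
    print(f"{n:>7} faces -> {pair_count(n):>13,} pairs")
```

With 65,831 faces, an all-pairs comparison would touch more than two billion pairs, which would be consistent with the long runtimes and with the process being killed.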

Steps to reproduce

1. Perform face recognition on images in batches of 10k.
2. On each subsequent run, observe that the clustering step takes exponentially longer.
3. After 100k images, clustering starts to fail.

Server configuration

Logs

None logged

Background task log with debug.

sudo -u apache php occ -vvv face:background_job
1/8 - Executing task CheckRequirementsTask (Check all requirements)
2/8 - Executing task CheckCronTask (Check that service is started from either cron or from command)
3/8 - Executing task DisabledUserRemovalTask (Purge all the information of a user when disable the analysis.)
4/8 - Executing task StaleImagesRemovalTask (Crawl for stale images (either missing in filesystem or under .nomedia) and remove them from DB)
5/8 - Executing task CreateClustersTask (Create new persons or update existing persons)
	Face clustering will be recreated with new information or changes
	65831 faces found for clustering
Killed
@matiasdelellis
Owner

Hi @WayneBooth
Unfortunately everything you say is true. 😞
I am looking into how to optimize these cases, but today the clustering is not progressive (incremental), so time and memory consumption can be excessive.
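
For discussion, a minimal sketch of what progressive clustering could look like. This is not the app's current algorithm and not a proposed patch. It assumes each face is represented by a numeric descriptor vector and that faces closer than some distance threshold belong to the same person. The class name `IncrementalClusters` and the `threshold` value of 0.5 are hypothetical. Each new face is compared only against one centroid per existing cluster, so the cost of adding a face depends on the number of persons rather than on the number of faces already processed.

```python
import math


class IncrementalClusters:
    """Greedy nearest-centroid clustering that processes one face at a time."""

    def __init__(self, threshold: float = 0.5):
        # Hypothetical threshold: maximum distance to join an existing cluster.
        self.threshold = threshold
        self.centroids: list[list[float]] = []  # running mean per cluster
        self.counts: list[int] = []              # faces per cluster

    def add(self, descriptor: list[float]) -> int:
        """Assign a descriptor to a cluster and return the cluster index."""
        best_idx, best_dist = None, math.inf
        for idx, centroid in enumerate(self.centroids):
            dist = math.dist(centroid, descriptor)
            if dist < best_dist:
                best_idx, best_dist = idx, dist

        if best_idx is not None and best_dist <= self.threshold:
            # Update the running mean of the matched cluster in place.
            n = self.counts[best_idx]
            old = self.centroids[best_idx]
            self.centroids[best_idx] = [
                (c * n + x) / (n + 1) for c, x in zip(old, descriptor)
            ]
            self.counts[best_idx] = n + 1
            return best_idx

        # No cluster is close enough: start a new one.
        self.centroids.append(list(descriptor))
        self.counts.append(1)
        return len(self.centroids) - 1


clusters = IncrementalClusters(threshold=0.5)
for face in ([0.0, 0.0], [0.1, 0.0], [5.0, 5.0]):
    print(face, "-> cluster", clusters.add(face))
```

The trade-off is that a greedy assignment like this depends on the order in which faces arrive and can produce different groupings from a full re-clustering, so an occasional full pass might still be needed.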
