Face recognition blocked for 1 month: Large batch size causes job to be stopped during preview generation #967
Comments
Do you have more than 120 new faces? You can check your database with the following query |
Hello, the values returned by the commands have not changed.
|
Hello
I did the complete face reset to be sure that new faces are recognized. The newly detected faces are not shown in the Memories app, only in Photos. |
This comment was marked as off-topic.
SQL command values are always identical to the last time. #967 (comment) |
So, the issue is that face recognition is stuck. Can you check if there are jobs in the oc_jobs table for recognize and maybe post the list? (something like |
Ah, sorry about the wrong query. It looks like the job is intact and should be running, last_run indicates 'Sun Sep 17 04:58:21 2023 UTC'. Can you find any errors in the nextcloud log related to recognize (try something like |
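As an aside on reading last_run: the oc_jobs table stores it as a Unix timestamp in seconds, so it has to be converted before it can be compared with a date like the one quoted above. The following is a minimal sketch, not part of Recognize or Nextcloud. The function name, the one-hour staleness threshold, and the timestamp 1694926701 (derived here from the quoted date) are assumptions for illustration.

```python
from datetime import datetime, timezone


def describe_last_run(last_run: int, now: int, stale_after_seconds: int = 3600) -> tuple[str, bool]:
    """Return a readable UTC date for last_run and whether it looks stale.

    last_run and now are Unix timestamps in seconds.
    stale_after_seconds is a hypothetical threshold, not a Nextcloud setting.
    """
    when = datetime.fromtimestamp(last_run, tz=timezone.utc)
    text = when.strftime("%a %b %d %H:%M:%S %Y UTC")
    is_stale = (now - last_run) > stale_after_seconds
    return text, is_stale


# Timestamp derived from the date quoted in the thread (assumed value).
text, stale = describe_last_run(1694926701, now=1694926701 + 600)
print(text, "stale" if stale else "recent")
```

A job whose last_run keeps advancing is being picked up by cron. Whether it completes its work is a separate question, which is why the logs are requested next.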
Here are the attached logs, thank you. I see errors related to the "Welcome App". An update of Welcome (1.0.10) has been available for an hour, and I have just installed it. |
Nah, I think that's unrelated. It's also just warnings. We can ignore those. I think the reason for this is that we limit the run time of the job to 5 minutes, but by the time the 5 minutes are over it still hasn't finished preparing the previews for classification. :/ |
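The interaction between the 5-minute limit and the batch size can be illustrated with a toy model. This is only a sketch under stated assumptions, not Recognize's actual code: it assumes that previews for the whole batch are prepared before any classification starts, and the function name and the 8 seconds per preview are made-up values.

```python
def files_classified(batch_size: int, preview_seconds_per_file: float,
                     budget_seconds: float = 300.0) -> int:
    """Toy model of one time-limited job run.

    Assumption: all previews for the batch are prepared first, and the run
    is stopped when the time budget is used up. If preview preparation alone
    consumes the budget, nothing gets classified in that run.
    """
    preview_time = batch_size * preview_seconds_per_file
    if preview_time >= budget_seconds:
        return 0  # stopped during preview generation, no progress
    return batch_size


# Hypothetical cost of 8 seconds per preview.
for batch in (50, 25, 5):
    print(batch, files_classified(batch, preview_seconds_per_file=8.0))
```

In this model a batch of 50 never finishes its previews inside 300 seconds, so every run makes zero progress, while smaller batches do get through. That matches the pattern reported in this thread, where lowering the number of files per run unblocked the queue.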
…Time to 0 see #967 Signed-off-by: Marcel Klehr <mklehr@gmx.net>
okay, do you think I should decrease the value of "The number of files to process per job run"? I'm currently at 50, because I'm in WASM mode. |
If anything, try reducing it, but I will publish a new release soon with a proper fix that will hopefully solve this for you |
Okay! I've just reduced it to "5" to see if there's any change there. Thanks, I'll be waiting impatiently for this new version :) |
Do let me know if the reduced value changes things, to confirm that I've identified the bug 🙏 |
Yes yes, of course, I'll come back here to tell you if it works or not. |
Chiming in to say that I've had the same issue, and this seems to have fixed it 🎉 (I think?) Lowered (roughly halving the defaults) the number of files to process and it appears to be working well over the past 24h with "Schedule jobs" now having values some of the time and "Last background job execution" showing values for some 🙃 The Whereas now I ended up with a large backlog (don't have original numbers, but "Object recognition" has 22,740 queued files right now) because I disabled the app for a few months while I worked out better hardware. Cheers 🙏🏻 (Edit: Had to drop object/landmark to ~25 since that seems to re-occur once the face recognition queue clears; and this is in native Tensorflow mode, not WASM mode like the parent) |
Great! Thanks for letting me know. I'll soon release v5 of recognize which will fix this permanently, so you don't have to reduce the batch size anymore |
Reducing the number of "files to process per job run" solved my problem. New faces are identified! Thanks @marcelklehr |
Hello @marcelklehr After lowering the value and doing a full reindex, new faces appeared. However, it's blocked again :( I lowered the value again, but it doesn't change anything. |
On my side, I've restarted indexing, but the system is still blocked. |
@EVOTk Which version are you on now? |
recognize.txt I cleaned up the jobs several days ago and restarted a complete indexing (over 40k files). It takes a long time. |
Can you up the number of files to process per run? |
Would you like me to set it to 50 as originally proposed? |
We can do 500 as recommended in the settings, now |
Okay, I'm using WASM mode |
Ah, right. Then 50 |
@EVOTk Can you post the nextcloud log again, now that you're on v5.x |
Also, could you check if some of your images have 0 bytes? |
Hello ! There are a lot of photo files, but I haven't seen any empty files |
Interesting! Could you post the oc_jobs table again? |
And then, if you're comfortable with that, could you try applying the following patch to your nextcloud? nextcloud/server#41295 |
Alright, then let's wait and see if it trips up again with the change applied. Thank you for bearing with me :) |
Hello @marcelklehr Sounds good. New detections are now present in Memories. Once I've added more photos, I'll see whether the recognition works properly. Thx! |
How is it looking? @EVOTk |
Hello ! |
Did you update your nextcloud in the meantime? Maybe my patch was reverted on your instance because of that. 19 Queued files, 1 scheduled job and last classification 12/11/2023 with last execution 5min ago sounds like it's stuck again... Could you post some logs again? 🙏 |
Hello, However, the container doesn't keep the patch, so if I re-create the container, I have to re-apply the patch manually.
|
Mine is also stalled now, it seems. However, I actually found an error in the log.
Is this the same problem or a different one? |
Yep, this is the same problem.
And this is the symptom. |
@EVOTk Ah, the concurrent execution limits us to one job from all classifiers, so you can either have the imagenet classifier running or the face recognition classifier. If you want to run as many as possible at the same time, you can check |
Nextcloud 27.1.4 should fix the last bug in the concurrency detection, I think. So one recognize classifier job should be running at all times (as much as possible via cron, at least), but it may not be the face recognition one, so some stalling is normal, unless Do let me know here if you have a case where
or
|
I will activate this. Thank you |
Which version of recognize are you using?
4.3.2
Enabled Modes
Face recognition
TensorFlow mode
WASM mode
Downstream App
Memories App
Which Nextcloud version do you have installed?
27.0.2, 27.1.0
Which Operating system do you have installed?
Debian 12
Which database are you running Nextcloud on?
MariaDB
10.11.4, 10.11.5
Which Docker container are you using to run Nextcloud? (if applicable)
Linuxserver
27.0.2, 27.1.0
PHP 8.2.10
How much RAM does your server have?
32 GB
What processor Architecture does your CPU have?
x86_64
Describe the Bug
For the past month, I've had no new face detection on the photos I've added.
The process appears to be "blocked":
"Last classification: 06/08/2023"
Running occ recognize:recrawl or occ recognize:classify does not resolve the issue.
I've also tried "clear the classifier queues and clear all background jobs" and relaunching a complete classification, but it always comes back to the same thing.
I didn't run occ recognize:reset-face-clusters and occ recognize:reset-faces because I don't want to lose the current classification.
Thx for your help
Expected Behavior
Face detection on new photos
To Reproduce
Debug log
No response