Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checking photos repeatedly after index completed #1618

Closed
iwinstar opened this issue Oct 12, 2021 · 34 comments
Closed

Checking photos repeatedly after index completed #1618

iwinstar opened this issue Oct 12, 2021 · 34 comments

Comments

@iwinstar
Copy link

PhotoPrism Version: 211007-8f55d6f8-Linux-x86_64

......
INFO[2021-10-09T08:14:07Z] indexing completed in 1378 s
INFO[2021-10-09T08:28:47Z] photo pqtivkj34ertw2oe: keeping meta title “Miami Downtown”
INFO[2021-10-09T08:28:51Z] photo pqtivl67m2fjbyqk: keeping meta title “Giant's Causeway”
INFO[2021-10-09T08:28:53Z] photo pqtivlqwnlsd1gcg: keeping meta title “Rising from Orange”
......
......
......
INFO[2021-10-12T08:51:11Z] photo pqtjsgzlrhvcdkwp: keeping meta title “Country Horizon”
INFO[2021-10-12T08:51:18Z] metadata: optimized 110 photos
INFO[2021-10-12T08:51:41Z] index: updating counts
INFO[2021-10-12T08:51:46Z] index: updating covers
INFO[2021-10-12T08:58:28Z] photo pqtivkj34ertw2oe: keeping meta title “Miami Downtown”
INFO[2021-10-12T08:58:30Z] photo pqtivl67m2fjbyqk: keeping meta title “Giant's Causeway”
INFO[2021-10-12T08:58:32Z] photo pqtivlqwnlsd1gcg: keeping meta title “Rising from Orange”
......

I don't add or modifiy any photo, seem it's checking all photos again.

@graciousgrey
Copy link
Member

That's expected PhotoPrism regularly runs an optimize command in the background

@iwinstar
Copy link
Author

That's expected PhotoPrism regularly runs an optimize command in the background

I checked https://docs.photoprism.org/getting-started/config-options/, couldn't find any option related to this task. How could I adjust it? I have more than 263k photos, it'll need 3 days to run once each time.

@lastzero
Copy link
Member

Are you sure it needs 3 days to go through "just" 263k database rows? It's not indexing, just updating based on existing index data.

You can increase the interval of the background worker via PHOTOPRISM_WAKEUP_INTERVAL:

https://docs.photoprism.org/getting-started/config-options/

For example, PhotoPrism generates titles based on metadata or recognized faces. Since not all photos can be instantly updated when you change the name of a person, this happens in the background.

@iwinstar
Copy link
Author

MariaDB [photoprism]> select date_format(checked_at, '%m-%d %H'), count(1) from photos where deleted_at is null group by date_format(checked_at, '%m-%d %H');
+-------------------------------------+----------+
| date_format(checked_at, '%m-%d %H') | count(1) |
+-------------------------------------+----------+
| 10-09 11                            |       50 |
| 10-09 12                            |      750 |
| 10-09 13                            |     7076 |
| 10-09 14                            |     8974 |
| 10-09 15                            |     9822 |
| 10-09 16                            |     9884 |
| 10-09 17                            |     6211 |
| 10-09 18                            |     3823 |
| 10-09 19                            |     6822 |
| 10-09 20                            |     9038 |
| 10-09 21                            |     8740 |
| 10-09 22                            |     7809 |
| 10-09 23                            |     6664 |
| 10-10 00                            |     3219 |
| 10-10 01                            |     3483 |
| 10-10 02                            |     2001 |
| 10-10 03                            |     1868 |
| 10-10 04                            |     1739 |
| 10-10 05                            |     1851 |
| 10-10 06                            |     1774 |
| 10-10 07                            |     1780 |
| 10-10 08                            |     1623 |
| 10-10 09                            |     1497 |
| 10-10 10                            |     1439 |
| 10-10 11                            |     1298 |
| 10-10 12                            |     1654 |
| 10-10 13                            |     1785 |
| 10-10 14                            |     1839 |
| 10-10 15                            |     1529 |
| 10-10 16                            |     1554 |
| 10-10 17                            |     1830 |
| 10-10 18                            |     1828 |
| 10-10 19                            |     1769 |
| 10-10 20                            |     1670 |
| 10-10 21                            |     1557 |
| 10-10 22                            |     1645 |
| 10-10 23                            |     1883 |
| 10-11 00                            |     1877 |
| 10-11 01                            |     1860 |
| 10-11 02                            |     1841 |
| 10-11 03                            |     1849 |
| 10-11 04                            |     1803 |
| 10-11 05                            |     1854 |
| 10-11 06                            |     1795 |
| 10-11 07                            |     1615 |
| 10-11 08                            |     1417 |
| 10-11 09                            |     1288 |
| 10-11 10                            |     1606 |
| 10-11 11                            |     1402 |
| 10-11 12                            |     1541 |
| 10-11 13                            |     1414 |
| 10-11 14                            |     1959 |
| 10-11 15                            |     3399 |
| 10-11 16                            |     3305 |
| 10-11 17                            |     3946 |
| 10-11 18                            |     4754 |
| 10-11 19                            |     4861 |
| 10-11 20                            |     4345 |
| 10-11 21                            |     3969 |
| 10-11 22                            |     3720 |
| 10-11 23                            |     3576 |
| 10-12 00                            |     3323 |
| 10-12 01                            |     2953 |
| 10-12 02                            |     3590 |
| 10-12 03                            |     3447 |
| 10-12 04                            |     3758 |
| 10-12 05                            |     3843 |
| 10-12 06                            |     3288 |
| 10-12 07                            |     1856 |
| 10-12 08                            |     2527 |
| 10-12 09                            |    10430 |
| 10-12 10                            |    10184 |
| 10-12 11                            |     9924 |
| 10-12 12                            |     9602 |
| 10-12 13                            |     4224 |
+-------------------------------------+----------+
75 rows in set (1.115 sec)

Yes, It truly use 3 days.

@iwinstar
Copy link
Author

iwinstar commented Oct 12, 2021

SELECT * FROM `photos`  WHERE (place_id <> '' AND place_id <> 'zz' AND place_src <> '' AND place_src <> 'estimate') ORDER BY ABS(DATEDIFF(taken_at, '2017-02-20 15:29:32')) ASC,`photos`.`id` ASC LIMIT 1

It usually used more than 1.5s to complete this query before I add the index.
It only use 0.1s after I add an index to place_src column.

@lastzero
Copy link
Member

May I ask what hardware you have? Our test server here has 100k files, but uses an 8 years old Core i3 processor. It's not doing this for days. Note that the first run may be longer. Was this the first time?

@iwinstar
Copy link
Author

I use a Synology DS918+ sever, run PhotoPrism in docker, all HDD drivers, just a weak J3455 processor.
I don't know whether it's the first time. I will check if it's faster this time.

@lastzero
Copy link
Member

Surprised place_src helps as the selection is by taken_at... if the database filters by place_src first then yes, it may be slow. Note the cardinality of place_src should be between 1 and 4. Typically doesn't make sense to add an index in this case. You're welcome to contribute if you know how to optimize the query for the databases we currently support.

May I ask how long the initial indexing took?

Looking at the hardware specs of the DS918+, in particular the processor, I'm not surprised you have performance issues (like 70% slower than even a Core i3):

benchmark

@lastzero
Copy link
Member

With these hardware specs, feel free to go ahead and set PHOTOPRISM_WAKEUP_INTERVAL to a higher value and disable location estimates in Settings. This is probably using the most CPU power and you may not want it anyway. 👍

@iwinstar
Copy link
Author

I understand your point, this is not the first time. it used 24 days to generate thumbs and complete index for the first time.

MariaDB [photoprism]> select date_format(created_at, '%m-%d'), count(1) from photos group by date_format(created_at, '%m-%d');
+----------------------------------+----------+
| date_format(created_at, '%m-%d') | count(1) |
+----------------------------------+----------+
| 05-22                            |     9215 |
| 05-23                            |    35012 |
| 05-24                            |    10156 |
| 05-25                            |     9364 |
| 05-26                            |    11411 |
| 05-27                            |    10138 |
| 05-28                            |     8152 |
| 05-29                            |    10640 |
| 05-30                            |    11437 |
| 05-31                            |    10041 |
| 06-01                            |     9060 |
| 06-02                            |     6694 |
| 06-03                            |     9975 |
| 06-04                            |     9369 |
| 06-05                            |     9008 |
| 06-06                            |     9775 |
| 06-07                            |     9803 |
| 06-08                            |     9437 |
| 06-09                            |     9526 |
| 06-10                            |     9275 |
| 06-11                            |     9024 |
| 06-12                            |     8461 |
| 06-13                            |     9158 |
| 06-14                            |     6803 |
| 07-04                            |     3212 |
| 09-25                            |       10 |
| 09-26                            |     8391 |
| 09-27                            |     1027 |
| 10-04                            |      167 |
+----------------------------------+----------+
29 rows in set (0.856 sec)

@lastzero
Copy link
Member

So when it took 24 days to index, it's not surprising location estimates also take long. It's not a cheap query, but usually also no issue. I'd disable it in your case, which should also speed up the background check. Note the worker will also compare faces when you use facial recognition etc, so I wouldn't get rid of it completely unless you know what implications this has and are OK with it.

@iwinstar
Copy link
Author

Screen Shot 2021-10-12 at 22 22 40

Screen Shot 2021-10-12 at 22 20 17

Most resources are used by MariaDB, PhotoPrism uses just a few for the last 7 days.

@lastzero
Copy link
Member

Yes, this is because we use a lot of SQL. We know queries and tables can be optimized further, but don't consider it the most important issue right now. You're welcome to make specific suggestions and sponsor them to be included in a future release.

@iwinstar
Copy link
Author

Sure, I will check SQLs at weekends, and I will set PHOTOPRISM_WAKEUP_INTERVAL to 3 months or even higher. This will solve my problem, Thanks.

@srett
Copy link
Contributor

srett commented Oct 15, 2021

Adding this here instead of opening yet another issue. I found that mysql gets hammered with this query thousands of times according to show processlist after updating the index with even just a few new photos:
UPDATE photos SET checked_at = NULL WHERE id IN (SELECT f.photo_id FROM files f JOIN markers m ON ...

The query gets truncated there, but maybe it's enough to know where it comes from. The individual query seems cheap and finishes almost immediately, it just runs many many times. Maybe this can be aggregated?

This makes mysql occupy almost two cores for 15 minutes in my case. Not a dealbreaker per se, but I cannot start a new indexing run while this is happening, even though the UI suggests PP is idle at this point, i.e. nothing happening in the log output, "Start" button enabled again.

@lastzero
Copy link
Member

The query is supposed to trigger a photo metadata / title / keywords refresh after people visible on a picture have been added, changed or removed. If it was running for 15 full minutes then maybe it needs optimization. Didn't observe any issues so far, otherwise we wouldn't have released it like that.

@lastzero
Copy link
Member

@srett If it was running thousands of times then probably by a background worker while faces on photos were matched to people. Does it happen every day or did it stop?

@lastzero
Copy link
Member

The background worker is independent of the indexing worker so that it doesn't get blocked randomly. Of course this doesn't mean the load caused by the background doesn't slow down indexing or that database table locking will never be an issue. We fix everything we can understand and reproduce unless we need historic hardware for this.

@srett
Copy link
Contributor

srett commented Oct 15, 2021

Ah, so it seems it was just coincidental, that the worker kicked off after the indexing was done. However, on a first glance, the log output now looks like it's doing the same thing over and over again with no apparent progress:

time="2021-10-15T21:53:51+02:00" level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: conflict at dist 0.021458, Ø 0.810000 from 5832 samples, collision Ø 0.224095"
time="2021-10-15T21:53:51+02:00" level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: collision has been resolved"
time="2021-10-15T21:53:51+02:00" level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: conflict at dist 0.234095, Ø 0.810000 from 5832 samples, collision Ø 0.011458"
time="2021-10-15T21:53:51+02:00" level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: collision has been resolved"
time="2021-10-15T21:53:51+02:00" level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: conflict at dist 0.064502, Ø 0.810000 from 4541 samples, collision Ø 0.272797"
time="2021-10-15T21:53:51+02:00" level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: collision has been resolved"
time="2021-10-15T21:53:51+02:00" level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: conflict at dist 0.282797, Ø 0.810000 from 4541 samples, collision Ø 0.054502"
time="2021-10-15T21:53:51+02:00" level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: collision has been resolved"
time="2021-10-15T21:53:51+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: conflict at dist 0.021458, Ø 0.810000 from 5665 samples, collision Ø 0.377361"
time="2021-10-15T21:53:51+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: collision has been resolved"
time="2021-10-15T21:53:51+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: conflict at dist 0.064502, Ø 0.810000 from 5665 samples, collision Ø 0.011458"
time="2021-10-15T21:53:51+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: collision has been resolved"
time="2021-10-15T21:53:51+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: conflict at dist 0.621333, Ø 0.810000 from 5665 samples, collision Ø 0.054502"
time="2021-10-15T21:53:51+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: collision has been resolved"
time="2021-10-15T21:53:51+02:00" level=info msg="faces: resolved 7 / 7 collisions [400.754893ms]"
time="2021-10-15T22:11:27+02:00" level=info msg="faces: revised 3431 matches after collision"
time="2021-10-15T22:11:27+02:00" level=info msg="marker: collision reported for mr075xd3bib7lks2, face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB, source manual, subject jr0d5jc29ngukzho <> jr0d5p91qq3hz39w"
time="2021-10-15T22:11:31+02:00" level=info msg="faces: 3431 markers updated, 3432 faces recognized, 0 unknown [17m40.013685589s]"
time="2021-10-15T22:23:11+02:00" level=info msg="metadata: optimized 31 photos"
time="2021-10-15T22:23:26+02:00" level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: conflict at dist 0.021458, Ø 0.810000 from 5832 samples, collision Ø 0.224095"
time="2021-10-15T22:23:26+02:00" level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: collision has been resolved"
time="2021-10-15T22:23:26+02:00" level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: conflict at dist 0.234095, Ø 0.810000 from 5832 samples, collision Ø 0.011458"
time="2021-10-15T22:23:26+02:00" level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: collision has been resolved"
time="2021-10-15T22:23:26+02:00" level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: conflict at dist 0.064502, Ø 0.810000 from 4541 samples, collision Ø 0.272797"
time="2021-10-15T22:23:26+02:00" level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: collision has been resolved"
time="2021-10-15T22:23:26+02:00" level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: conflict at dist 0.282797, Ø 0.810000 from 4541 samples, collision Ø 0.054502"
time="2021-10-15T22:23:26+02:00" level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: collision has been resolved"
time="2021-10-15T22:23:26+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: conflict at dist 0.021458, Ø 0.810000 from 5665 samples, collision Ø 0.377361"
time="2021-10-15T22:23:26+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: collision has been resolved"
time="2021-10-15T22:23:26+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: conflict at dist 0.064502, Ø 0.810000 from 5665 samples, collision Ø 0.011458"
time="2021-10-15T22:23:26+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: collision has been resolved"
time="2021-10-15T22:23:26+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: conflict at dist 0.621333, Ø 0.810000 from 5665 samples, collision Ø 0.054502"
time="2021-10-15T22:23:26+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: collision has been resolved"
time="2021-10-15T22:23:27+02:00" level=info msg="faces: resolved 7 / 7 collisions [362.195357ms]"
time="2021-10-15T22:42:56+02:00" level=info msg="faces: revised 3808 matches after collision"
time="2021-10-15T22:42:56+02:00" level=info msg="marker: collision reported for mr075xd3bib7lks2, face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB, source manual, subject jr0d5jc29ngukzho <> jr0d5p91qq3hz39w"
time="2021-10-15T22:42:57+02:00" level=info msg="faces: 3808 markers updated, 3809 faces recognized, 0 unknown [19m30.288843766s]"
time="2021-10-15T22:53:08+02:00" level=info msg="metadata: optimized 9 photos"
time="2021-10-15T22:53:23+02:00" level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: conflict at dist 0.021458, Ø 0.810000 from 5832 samples, collision Ø 0.224095"
time="2021-10-15T22:53:23+02:00" level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: collision has been resolved"
time="2021-10-15T22:53:23+02:00" level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: conflict at dist 0.234095, Ø 0.810000 from 5832 samples, collision Ø 0.011458"
time="2021-10-15T22:53:23+02:00" level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: collision has been resolved"
time="2021-10-15T22:53:23+02:00" level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: conflict at dist 0.064502, Ø 0.810000 from 4541 samples, collision Ø 0.272797"
time="2021-10-15T22:53:23+02:00" level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: collision has been resolved"
time="2021-10-15T22:53:23+02:00" level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: conflict at dist 0.282797, Ø 0.810000 from 4541 samples, collision Ø 0.054502"
time="2021-10-15T22:53:23+02:00" level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: collision has been resolved"
time="2021-10-15T22:53:23+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: conflict at dist 0.021458, Ø 0.810000 from 5665 samples, collision Ø 0.377361"
time="2021-10-15T22:53:23+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: collision has been resolved"
time="2021-10-15T22:53:23+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: conflict at dist 0.064502, Ø 0.810000 from 5665 samples, collision Ø 0.011458"
time="2021-10-15T22:53:23+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: collision has been resolved"
time="2021-10-15T22:53:23+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: conflict at dist 0.621333, Ø 0.810000 from 5665 samples, collision Ø 0.054502"
time="2021-10-15T22:53:23+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: collision has been resolved"
time="2021-10-15T22:53:23+02:00" level=info msg="faces: resolved 7 / 7 collisions [372.3249ms]"
time="2021-10-15T23:12:39+02:00" level=info msg="faces: revised 3803 matches after collision"
time="2021-10-15T23:12:39+02:00" level=info msg="marker: collision reported for mr075xd3bib7lks2, face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB, source manual, subject jr0d5jc29ngukzho <> jr0d5p91qq3hz39w"
time="2021-10-15T23:12:41+02:00" level=info msg="faces: 3803 markers updated, 3804 faces recognized, 0 unknown [19m17.466337963s]"
time="2021-10-15T23:22:59+02:00" level=info msg="metadata: optimized 15 photos"
time="2021-10-15T23:23:13+02:00" level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: conflict at dist 0.021458, Ø 0.810000 from 5832 samples, collision Ø 0.224095"
time="2021-10-15T23:23:13+02:00" level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: collision has been resolved"
time="2021-10-15T23:23:13+02:00" level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: conflict at dist 0.234095, Ø 0.810000 from 5832 samples, collision Ø 0.011458"
time="2021-10-15T23:23:13+02:00" level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: collision has been resolved"
time="2021-10-15T23:23:13+02:00" level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: conflict at dist 0.064502, Ø 0.810000 from 4541 samples, collision Ø 0.272797"
time="2021-10-15T23:23:13+02:00" level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: collision has been resolved"
time="2021-10-15T23:23:13+02:00" level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: conflict at dist 0.282797, Ø 0.810000 from 4541 samples, collision Ø 0.054502"
time="2021-10-15T23:23:13+02:00" level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: collision has been resolved"
time="2021-10-15T23:23:13+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: conflict at dist 0.021458, Ø 0.810000 from 5665 samples, collision Ø 0.377361"
time="2021-10-15T23:23:13+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: collision has been resolved"
time="2021-10-15T23:23:13+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: conflict at dist 0.064502, Ø 0.810000 from 5665 samples, collision Ø 0.011458"
time="2021-10-15T23:23:13+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: collision has been resolved"
time="2021-10-15T23:23:13+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: conflict at dist 0.621333, Ø 0.810000 from 5665 samples, collision Ø 0.054502"
time="2021-10-15T23:23:13+02:00" level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: collision has been resolved"
time="2021-10-15T23:23:14+02:00" level=info msg="faces: resolved 7 / 7 collisions [403.645743ms]"

I guess whatever was up with the photos I added today confused the faces engine...

Also, is this something to worry about?
time="2021-10-15T21:31:02+02:00" level=warning msg="faces: Error 1213: Deadlock found when trying to get lock; try restarting transaction while setting a face for marker mr0a7wa1fx5r1xr8"

Btw. I should mention I'm still running on 210925-96168e4b-Linux-x86_64, in case there's been any improvements there already. Hopefully have time to upgrade this weekend.

@lastzero
Copy link
Member

You may try updating as we fixed and improved a few things - before we invest more time in debugging. Shouldn't endlessly resolve collisions, but I can't prove it: https://en.wikipedia.org/wiki/Halting_problem

@lastzero
Copy link
Member

Running this command in a terminal may help when there are conflicts, although the latest version should be able to resolve all of them automatically:

docker-compose exec photoprism photoprism faces audit --fix

See https://docs.photoprism.org/getting-started/docker-compose/#command-line-interface for a general command overview. You can skip the prefix if not using Docker Compose.

@srett
Copy link
Contributor

srett commented Oct 16, 2021

I left the old version running over night to see if anything improves; it just kep doing this over and over though. Then I just updated to the 2021-10-10 and after one more iteration, it was fixed. Although after updating I thought that I should have tested whether a simple restart of the old version would have fixed it. I'll check the logs next time I import new photos. :-)

That's what I saw after updating, and after that, everything was back to normal:

level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: ambiguous subject at dist 0.021458, Ø 0.810000 from 5832 samples, collision Ø 0.224095"
level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: conflict has been resolved"
level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: ambiguous subject at dist 0.234095, Ø 0.810000 from 5832 samples, collision Ø 0.011458"
level=info msg="face OFULY2WJ5HNGD546GAWKWZNT2PX6BKS4: conflict has been resolved"
level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: ambiguous subject at dist 0.064502, Ø 0.810000 from 4541 samples, collision Ø 0.272797"
level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: conflict has been resolved"
level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: ambiguous subject at dist 0.282797, Ø 0.810000 from 4541 samples, collision Ø 0.054502"
level=info msg="face QJU7ORPBMMFKXLY7UJLTFG4YBQQWR33E: conflict has been resolved"
level=info msg="face 25W3B4OEHEJSNGZNX5YE23XNJ2OTMIF3: ambiguous subject at dist 0.801216, Ø 0.810000 from 17 samples, collision Ø 0.000000"
level=info msg="faces: revised 2 matches after conflict"
level=info msg="face 25W3B4OEHEJSNGZNX5YE23XNJ2OTMIF3: conflict has been resolved"
level=info msg="face 25W3B4OEHEJSNGZNX5YE23XNJ2OTMIF3: ambiguous subject at dist 0.774640, Ø 0.810000 from 17 samples, collision Ø 0.791216"
level=info msg="faces: revised 6 matches after conflict"
level=info msg="face 25W3B4OEHEJSNGZNX5YE23XNJ2OTMIF3: conflict has been resolved"
level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: ambiguous subject at dist 0.021458, Ø 0.810000 from 5665 samples, collision Ø 0.377361"
level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: conflict has been resolved"
level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: ambiguous subject at dist 0.064502, Ø 0.810000 from 5665 samples, collision Ø 0.011458"
level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: conflict has been resolved"
level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: ambiguous subject at dist 0.621333, Ø 0.810000 from 5665 samples, collision Ø 0.054502"
level=info msg="face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB: conflict has been resolved"
level=info msg="faces: resolved 9 / 9 ambiguous subjects [1.779080189s]"
level=info msg="faces: found 10 new clusters"
level=info msg="faces: ignoring H5YOFHKT6DWSGBZLAWMY2K5WQOMEK4NB, cluster unsuitable for matching"
level=info msg="faces: ignoring 36LCG6JH25OY2TBILSVXOLK7HVYLQSQB, cluster unsuitable for matching"
level=info msg="faces: added 8 new faces [1m15.831891885s]"
level=info msg="faces: revised 3802 matches after conflict"
level=warning msg="marker mr075xd3bib7lks2: face EYURNBZGN5PWKH6VOKCIID6ELC7FZXVB has ambiguous subjects jr0d5jc29ngukzho <> jr0d5p91qq3hz39w, subject source manual"
level=info msg="faces: updated 3,914 markers, recognized 3,809 faces, 106 unknown [20m19.577985911s]"
level=info msg="metadata: optimized 2 photos"

@lastzero
Copy link
Member

Doesn't surprise me much, clustering and matching are a mathematical hell: A tiny mistake and you end up with infinite loops or bad results. Overall, it's really not too bad for a first version and the resources we have right now.

@iwinstar
Copy link
Author

PHOTOPRISM_WAKEUP_INTERVAL doesn't work as I expected. I config it to 31536000 (one year). But task started up just after 15mins.

version: 211010-83b4f783-Linux-x86_64 (latest)

started as root
usermod: no changes
running as uid 1024:100
photoprism start
INFO[2021-10-16T08:06:01Z] start: read-only mode enabled
INFO[2021-10-16T08:06:01Z] webdav: server disabled
INFO[2021-10-16T08:06:01Z] http: starting web server at 0.0.0.0:2342
INFO[2021-10-16T08:21:13Z] photo pqulc6n1ydu3lqpy: keeping meta title Sheffield
INFO[2021-10-16T08:22:34Z] photo pr01cna2mosf0wwj: keeping meta title “Ford Mustang”
INFO[2021-10-16T08:24:02Z] photo pr01lv01yg3eex4y: keeping meta title “Street photo in Dahab , South Sinai , Egypt”
INFO[2021-10-16T08:24:18Z] metadata: optimized 39 photos
INFO[2021-10-16T08:24:46Z] index: updating counts
INFO[2021-10-16T08:24:51Z] index: updating covers
......
MariaDB [photoprism]> select checked_at from photos where checked_at >= '2021-10-16 08:06:01' limit 10;
+---------------------+
| checked_at          |
+---------------------+
| 2021-10-16 08:21:06 |
| 2021-10-16 08:21:07 |
| 2021-10-16 08:21:07 |
| 2021-10-16 08:21:08 |
| 2021-10-16 08:21:08 |
| 2021-10-16 08:21:09 |
| 2021-10-16 08:21:10 |
| 2021-10-16 08:21:10 |
| 2021-10-16 08:21:11 |
| 2021-10-16 08:21:11 |
+---------------------+
10 rows in set (0.001 sec)

@iwinstar
Copy link
Author

iwinstar commented Oct 16, 2021

root@PhotoPrism:~# photoprism config | grep "wakeup-interval"
wakeup-interval           900
admin@nas:~$ sudo docker inspect 148de4aa2f41 | grep "PHOTOPRISM_WAKEUP_INTERVAL"
                "PHOTOPRISM_WAKEUP_INTERVAL=31536000",
                "PHOTOPRISM_WAKEUP_INTERVAL=31536000",

a little strange?

@iwinstar
Copy link
Author

"PHOTOPRISM_WORKERS=0",
"PHOTOPRISM_WAKEUP_INTERVAL=31536000",
"PHOTOPRISM_AUTO_INDEX=-1",
"PHOTOPRISM_AUTO_IMPORT=-1",

vs

workers                   2
wakeup-interval           900
auto-index                0
auto-import               0

Those configs don't take effect.

@lastzero
Copy link
Member

You may use the GitHub search to find the code that handles specific config values.

In this case, you'll find that the max value currently is 86400 (24 hours):

func (c *Config) WakeupInterval() time.Duration {

This is because not running the worker will break certain functionality, like location estimates and facial recognition. We can add the ability to disable the worker completely (which is what you really want) in combination with the new PHOTOPRISM_UNSAFE env variable we recently introduced.

@lastzero
Copy link
Member

Would you be able to test the runtime of the worker one more time with Estimates disabled in Library > Settings and PHOTOPRISM_DISABLE_FACES: "true" in your docker-compose.yml?

@lastzero
Copy link
Member

Setting workers to 0 also wouldn't work BTW because these are not background but index workers.... so you couldn't index a single file.

@iwinstar
Copy link
Author

config.go explains my confusion. I add some new photos and will check how long will it take after I disable Estimates and FACES.

I hoped PP will finish its work ASAP before, so I config workers to 0.

@iwinstar
Copy link
Author

MariaDB [photoprism]> select count(*) from lenses WHERE lens_slug <> 'zz' AND id NOT IN (SELECT lens_id FROM photos);
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.291 sec)

MariaDB [photoprism]> DELETE FROM lenses WHERE lens_slug <> 'zz' AND id NOT IN (SELECT lens_id FROM photos);
Query OK, 0 rows affected (12 min 38.555 sec)

MariaDB [photoprism]> DELETE l FROM lenses l LEFT JOIN (SELECT DISTINCT lens_id FROM photos) p ON l.id = p.lens_id WHERE l.lens_slug <> 'zz' AND p.lens_id IS NULL;
Query OK, 0 rows affected (0.743 sec)

Please consider rewrite this SQL, maybe my optimized version?

@lastzero
Copy link
Member

Sure, did you test it on SQLite as well? I don't remember how we came up with every single query, but it looks like this is something SQLite doesn't support so we implemented it in an "unoptimized" way:

https://stackoverflow.com/questions/24511153/how-delete-table-inner-join-with-other-table-in-sqlite

It's now possible to use different queries depending on the SQL dialect, so we can optimize it for MySQL / MariaDB only.

@iwinstar
Copy link
Author

iwinstar commented Oct 17, 2021

I test it on SQLite v3.32.3, SSD driver

sqlite> .timer ON

sqlite> select count(*) from lenses WHERE lens_slug <> 'zz' AND id NOT IN (SELECT lens_id FROM photos);
0
Run Time: real 0.054 user 0.053050 sys 0.000524

sqlite> DELETE FROM lenses WHERE lens_slug <> 'zz' AND id NOT IN (SELECT lens_id FROM photos);
Run Time: real 0.054 user 0.052925 sys 0.000601

sqlite> DELETE l FROM lenses l LEFT JOIN (SELECT DISTINCT lens_id FROM photos) p ON l.id = p.lens_id WHERE l.lens_slug <> 'zz' AND p.lens_id IS NULL;
Run Time: real 0.000 user 0.000029 sys 0.000005
Error: near "l": syntax error

Subquery version works fine for SQLite. It still doesn't support DELETE FROM table JOIN now.

BTW, I use https://github.com/dumblob/mysql2sqlite to dump all data to SQLite from MariaDB

@lastzero
Copy link
Member

Please create a new GitHub issue for "SQL Optimization" when you're done so that we can put it on our roadmap 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants