Problems purging the cache (purge_outdated)? #439

Closed
interplanetarychris opened this issue Aug 31, 2017 · 9 comments
Labels
bug Bug reports. macOS Related to running on macOS.

Comments

@interplanetarychris

I'm doing a very large Picture/Content dupe check on repositories that include Lightroom and Aperture directories on local and AFP filesystems.

The file at the end was not included in the current search, but apparently there was a problem in purging the cache.

Application Identifier: com.hardcoded-software.dupeguru
Application Version: 4.0.3
Mac OS X Version: Version 10.12.6 (Build 16G29)

Traceback (most recent call last):
File "build/dupeGuru.app/Contents/Resources/py/shelve.py", line 111, in __getitem__
KeyError: 'path:/Volumes/storage/Aperture Libraries/Genealogy Library (active) 3.aplibrary/Masters/2010/01/06/20100106-121355/00774_n_9aek3kmar0118.jpg'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "build/dupeGuru.app/Contents/Resources/py/cocoa/inter.py", line 259, in pulse
File "build/dupeGuru.app/Contents/Resources/py/hscommon/gui/progress_window.py", line 101, in pulse
File "build/dupeGuru.app/Contents/Resources/py/core/app.py", line 323, in _job_error
File "build/dupeGuru.app/Contents/Resources/py/hscommon/jobprogress/performer.py", line 43, in _async_run
File "build/dupeGuru.app/Contents/Resources/py/core/app.py", line 780, in do
File "build/dupeGuru.app/Contents/Resources/py/core/scanner.py", line 137, in get_dupe_groups
File "build/dupeGuru.app/Contents/Resources/py/core/pe/scanner.py", line 31, in _getmatches
File "build/dupeGuru.app/Contents/Resources/py/core/pe/matchblock.py", line 167, in getmatches
File "build/dupeGuru.app/Contents/Resources/py/core/pe/matchblock.py", line 65, in prepare_pictures
File "build/dupeGuru.app/Contents/Resources/py/core/pe/cache_shelve.py", line 121, in purge_outdated
File "build/dupeGuru.app/Contents/Resources/py/shelve.py", line 113, in __getitem__
KeyError: b'path:/Volumes/storage/Aperture Libraries/Genealogy Library (active) 3.aplibrary/Masters/2010/01/06/20100106-121355/00774_n_9aek3kmar0118.jpg'
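
The pair of KeyErrors above (first with a str key, then with a bytes key) matches how Python's standard `shelve.Shelf.__getitem__` works: it first looks in its in-memory cache using the str key, and on a miss it looks up the encoded bytes key in the underlying database. The following is a minimal sketch of that behaviour only, not dupeGuru's code. `StaleIndex`, `read_all`, and the example paths are hypothetical names made up for illustration; `StaleIndex` simulates a backend whose key listing names an entry it can no longer return.

```python
import shelve


class StaleIndex(dict):
    """Hypothetical backend: its key listing names an entry it cannot return."""

    def keys(self):
        return list(super().keys()) + [b"path:/example/missing.jpg"]


def read_all(shelf):
    """Walk every listed key and read its value, like a naive purge pass."""
    return {key: shelf[key] for key in shelf}


shelf = shelve.Shelf(StaleIndex())
shelf["path:/example/present.jpg"] = (1504137600, "blocks")

try:
    read_all(shelf)
except KeyError as err:
    # Second error: the bytes key was not found in the underlying mapping.
    outer_key = err.args[0]
    # First error: the str key was not found in the Shelf's in-memory cache.
    inner_key = err.__context__.args[0]
```

Under these assumptions, any code that iterates the shelf's keys and then reads each value fails on the first listed key the backend cannot return, whether or not that key belongs to the current scan.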

@silentnyte

silentnyte commented Sep 2, 2017

I think that you are on to the problem. I changed my search path to try to narrow the problem down. Originally I was searching /Volumes/Local01/Pictures/2008 as a test, which worked. Then I moved on to a larger directory, /Volumes/Local01/Pictures/2015. That is when I got this error, and it does not go away on its own. Notice that the path in the error is from the first search path, not the current one.

Deleting ~/Library/Application Support/dupeGuru/cached_pictures.shelve.db fixed this.

Application Identifier: com.hardcoded-software.dupeguru
Application Version: 4.0.3
Mac OS X Version: Version 10.12.4 (Build 16E195)

Traceback (most recent call last):
File "build/dupeGuru.app/Contents/Resources/py/shelve.py", line 111, in __getitem__
KeyError: 'path:/Volumes/Local01/Pictures/2008/07/20080706_124659-5.jpg'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "build/dupeGuru.app/Contents/Resources/py/cocoa/inter.py", line 259, in pulse
File "build/dupeGuru.app/Contents/Resources/py/hscommon/gui/progress_window.py", line 101, in pulse
File "build/dupeGuru.app/Contents/Resources/py/core/app.py", line 323, in _job_error
File "build/dupeGuru.app/Contents/Resources/py/hscommon/jobprogress/performer.py", line 43, in _async_run
File "build/dupeGuru.app/Contents/Resources/py/core/app.py", line 780, in do
File "build/dupeGuru.app/Contents/Resources/py/core/scanner.py", line 137, in get_dupe_groups
File "build/dupeGuru.app/Contents/Resources/py/core/pe/scanner.py", line 31, in _getmatches
File "build/dupeGuru.app/Contents/Resources/py/core/pe/matchblock.py", line 167, in getmatches
File "build/dupeGuru.app/Contents/Resources/py/core/pe/matchblock.py", line 65, in prepare_pictures
File "build/dupeGuru.app/Contents/Resources/py/core/pe/cache_shelve.py", line 121, in purge_outdated
File "build/dupeGuru.app/Contents/Resources/py/shelve.py", line 113, in __getitem__
KeyError: b'path:/Volumes/Local01/Pictures/2008/07/20080706_124659-5.jpg'

@ghost

ghost commented Sep 17, 2017

I tried again today to reproduce the error, to no avail. My guess is that the issue has to do with a hash collision in the cache, and that reproducing it requires a large number of photos. I don't have a large photo collection, so I can't reproduce it.

As I wrote in #402, I really don't like the idea of fixing a problem blindly, but then again, because many users have a large photo collection (why would you use dupeGuru otherwise?), I'm going to do it. @silentnyte @interplanetarychris would you be willing to confirm or refute the fix if I created a test build?

@interplanetarychris
Author

interplanetarychris commented Sep 17, 2017 via email

@silentnyte

silentnyte commented Sep 19, 2017 via email

ghost pushed a commit that referenced this issue Sep 19, 2017
@ghost

ghost commented Sep 19, 2017

https://download.hardcoded.net/dupeguru_osx_4_0_3_shelvetest.dmg

This test version is the same as v4.0.3, but with the addition of the commit referenced above. @interplanetarychris @silentnyte Could you confirm that it works properly in situations where the vanilla v4.0.3 failed?

@ghost ghost mentioned this issue Sep 21, 2017
@interplanetarychris
Author

I've since been able to try the shelve version in scenarios similar to the prior failures. I have added and removed folders from the scan list to exercise additions to and removals from the cache. It has yet to crash with file/image repositories ranging from a few thousand to about 35K. Thanks for the fix!

@fuzzy76

fuzzy76 commented Mar 25, 2020

I got this error on 4.0.3, upgraded to 4.0.4, got the same error, deleted the cache manually, reran, and now it seems to be working. Does that mean the bug is gone from 4.0.4 or not?

@fuzzy76

fuzzy76 commented Mar 26, 2020

Application Identifier: com.hardcoded-software.dupeguru
Application Version: 4.0.4
Mac OS X Version: Version 10.15.4 (Build 19E266)

Traceback (most recent call last):
  File "build/py/shelve.py", line 111, in __getitem__
KeyError: 'path:/Volumes/bertha/!Sorteres/39.jpg'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "build/py/cocoa/inter.py", line 259, in pulse
  File "build/py/hscommon/gui/progress_window.py", line 101, in pulse
  File "build/py/core/app.py", line 323, in _job_error
  File "build/py/hscommon/jobprogress/performer.py", line 43, in _async_run
  File "build/py/core/app.py", line 780, in do
  File "build/py/core/scanner.py", line 137, in get_dupe_groups
  File "build/py/core/pe/scanner.py", line 31, in _getmatches
  File "build/py/core/pe/matchblock.py", line 167, in getmatches
  File "build/py/core/pe/matchblock.py", line 65, in prepare_pictures
  File "build/py/core/pe/cache_shelve.py", line 121, in purge_outdated
  File "build/py/shelve.py", line 113, in __getitem__
KeyError: b'path:/Volumes/bertha/!Sorteres/39.jpg'

Parts of the path have been cut from the message. This happens when scanning a collection of 200,000 images.

@fuzzy76

fuzzy76 commented Mar 26, 2020

Weird. I tried re-running dupeGuru on another folder, and got the same crash referencing the old folder (which was not part of the scan) again...
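
A crash that names a folder outside the current scan is consistent with the traceback: the failure is in `purge_outdated`, which by its name cleans up the whole cache rather than only the entries for the folders being scanned. The following is a hedged sketch of a purge pass that skips unreadable entries instead of aborting. It is not the fix from the commit referenced in this thread (its contents are not shown here); `purge_outdated_tolerant`, the `is_outdated` callback, and the example keys are hypothetical.

```python
import shelve


def purge_outdated_tolerant(shelf, is_outdated):
    """Delete outdated entries; skip keys that are listed but unreadable.

    Returns (removed, unreadable) as two lists of keys.
    """
    removed, unreadable = [], []
    for key in list(shelf.keys()):  # snapshot: the shelf is mutated below
        try:
            value = shelf[key]
        except KeyError:
            # The key is listed but the backend cannot return it.
            unreadable.append(key)
            continue
        if is_outdated(key, value):
            del shelf[key]
            removed.append(key)
    return removed, unreadable


# Usage with an in-memory shelf (a plain dict as the backend).
cache = shelve.Shelf({})
cache["path:/old/a.jpg"] = {"mtime": 1}
cache["path:/new/b.jpg"] = {"mtime": 2}
removed, unreadable = purge_outdated_tolerant(
    cache, lambda key, value: key.startswith("path:/old/")
)
```

Skipping unreadable keys keeps the scan running, but it hides the underlying inconsistency. Collecting them in `unreadable` at least makes it possible to log them or to rebuild the cache, which is what deleting the cache file by hand achieves.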
