Save paths to temp file / moveToTrash requested path is not a file #7

Open
chapmanjacobd opened this issue Jul 25, 2023 · 2 comments

chapmanjacobd commented Jul 25, 2023

I'm not sure how this is possible (because of the unique index media_path_index), but it seems like cbird tried to trash a file that it had already trashed a moment earlier:

$cbird -p.dht 1 -similar -select-result -sort-rev resolution -chop -nuke
[D][Database::Database] "."                                                                                                                                                                                                       
[D][Database::connect] thread:0x5637468b13b0 sqlite_0_0 ~/.local/cbird/art/_index/media0.db                                                                                                                                       
[D][Database::connect] thread:0x5637468b13b0 sqlite_1_1 ~/.local/cbird/art/_index/media1.db                                                                                                                                       
[D][Database::connect] thread:0x5637468b13b0 sqlite_2_2 ~/.local/cbird/art/_index/media2.db                                                                                                                                       
[D][Database::connect] thread:0x5637468b13b0 sqlite_3_3 ~/.local/cbird/art/_index/media3.db                                                                                                                                       
[I][Database::fillMediaGroup] sql query: 1400001                                                                                                                                                                                  
[D][Database::similar] loading index for algo 0                                                                                                                                                                                   
[I][DctHashIndex::load] sql query: 96% 1,399,808 hashes                                                                                                                                                                           
[I][DctHashIndex::load] 1400443 hashes, 426ms                                                                                                                                                                                     
[I][Database::similar] index loaded in 4105ms                                                                                                                                                                                     
[I][Database::similar]  41000 1400443                                                                                                                                                                                             
[W][DctHashIndex::find] no hash for needle: "~/.local/cbird/art/91_New_Art/emoji-kitchen/u1f429_u1f90d.jpg"    
...
[I][Database::similar]  1346000 1400443                                                                                                                                                                                           
[W][DctHashIndex::find] no hash for needle: "~/.local/cbird/art/95_Memes/stars.jpg"                                                                                                                                               
[I][Database::similar]  1400000 1400443                                                                                                                                                                                           
[I][Database::similar] searched 1400443 items and found 1400443 matches in 488ms                                                                                                                                                  
[D][Database::similar] filter matches                                                                                                                                                                                             
[D][Database::loadWeeds] loaded 0 weeds                                                                                                                                                                                           
[I][Database::similar] filtered 1400443 matches to 28487 in 1427ms                                                                                                                                                                
                                                                                                                                                                                                                                  
nuke: about to move 69923 items to trash, proceed? [y/N]: y    
...
[I][DesktopHelper::runProgram] QList("trash-put", "~/.local/cbird/art/91_New_Art/jonathanmccabe/20262979831_20262979831_54b4f72f6f_o.jpg")
[D][DesktopHelper::runProgram] portable PATH: "/tmp/.mount_cbirdrlRyyk/cbird/bin/" LD_LIBRARY_PATH: "/tmp/.mount_cbirdrlRyyk/cbird/lib/"
[I][DesktopHelper::runProgram] QList("trash-put", "~/.local/cbird/art/91_New_Art/jonathanmccabe/19634149424_19634149424_27dd7ce56c_o.jpg")
[D][DesktopHelper::runProgram] portable PATH: "/tmp/.mount_cbirdrlRyyk/cbird/bin/" LD_LIBRARY_PATH: "/tmp/.mount_cbirdrlRyyk/cbird/lib/"
[W][DesktopHelper::moveToTrash] requested path is not a file: "~/.local/cbird/art/91_New_Art/jonathanmccabe/20262979831_20262979831_54b4f72f6f_o.jpg"

# exit 0

I'm not sure how to reproduce this bug. It has only happened once and I'm not too concerned about it, but:

  1. cbird exits with code 0 after this. The exit code should probably be non-zero?
  2. it would be convenient to have an option to save the selected paths to a random or named NUL-delimited file
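
To make the second request concrete, here is a minimal Python sketch of what "save the selected paths to a random or named NUL-delimited file" could look like. It is not cbird code: `save_paths_nul` and `load_paths_nul` are hypothetical helper names, and the `cbird-selection-` prefix is only an illustration. NUL is used as the delimiter because it is the one byte that cannot appear inside a path, so names containing spaces or newlines survive.

```python
import os
import tempfile


def save_paths_nul(paths, out_path=None):
    """Write paths to a NUL-delimited file and return its name.

    If out_path is None, a uniquely named temp file is created
    (the "random" case); otherwise out_path is used (the "named" case).
    """
    if out_path is None:
        # Hypothetical prefix/suffix, chosen only for this sketch.
        fd, out_path = tempfile.mkstemp(prefix="cbird-selection-", suffix=".lst")
        f = os.fdopen(fd, "wb")
    else:
        f = open(out_path, "wb")
    with f:
        for p in paths:
            # Encode with the filesystem encoding and terminate with NUL.
            f.write(os.fsencode(p) + b"\0")
    return out_path


def load_paths_nul(file_name):
    """Read a NUL-delimited file back into a list of paths."""
    with open(file_name, "rb") as f:
        data = f.read()
    return [os.fsdecode(p) for p in data.split(b"\0") if p]
```

A file in this format can be consumed by tools that accept NUL-separated input, for example `xargs -0`.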

my short term workaround:

cbird -p.dht 1 -similar -select-result -sort-rev resolution -chop -dump > out
grep path out | sed -e 's|.*  = ||' -e 's|~/.local/cbird/art/||' | string unescape | parallel -j20 rm {}

chapmanjacobd commented Jul 25, 2023

Also, instead of exiting when this happens, you might consider doing what rmlint does: check that the "original" exists before each delete/trash operation:

  • if the "original" exists, ignore ENOENT / ENOTDIR when trying to delete the duplicate(s), print a warning, but don't exit the program
  • if the "original" doesn't exist, skip any operations on the duplicate(s), print a warning, but don't exit the program
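
The two rules above can be sketched in Python as follows. This is only an illustration of the described behavior, not rmlint's or cbird's implementation; `remove_duplicates_safely` is a hypothetical name, and `os.remove` stands in for whatever delete/trash call is actually used. In Python, ENOENT surfaces as `FileNotFoundError` and ENOTDIR as `NotADirectoryError`.

```python
import os


def remove_duplicates_safely(original, duplicates, remove=os.remove):
    """Delete duplicates only while the original still exists.

    Returns (removed, warnings). Never raises for a missing file,
    so a batch run keeps going instead of exiting.
    """
    removed, warnings = [], []
    if not os.path.exists(original):
        # Rule 2: the original is gone, so leave the duplicates alone.
        warnings.append(f"original missing, skipping duplicates: {original}")
        return removed, warnings
    for dup in duplicates:
        try:
            remove(dup)
            removed.append(dup)
        except (FileNotFoundError, NotADirectoryError):
            # Rule 1: ENOENT / ENOTDIR is only worth a warning.
            warnings.append(f"already gone: {dup}")
    return removed, warnings
```

Passing the same duplicate twice, as happened in the log above, would then yield one removal and one warning rather than an early exit.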

However, I don't think cbird currently checks that the "original" exists before deleting, so the current behavior of exiting the whole program makes sense to me.

@scrubbbbs
Owner

There would have to be two groups that contain the same thing, e.g. A=>C and B=>C, so with the chop you get { C, C } in the list of deletions. This would mean that A,C and B,C are closer to each other than A,B. These could be false matches, especially if we know that A,B are unique.
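
A small Python sketch of that scenario, assuming (as the comment implies) that chop drops the first item of each group and keeps the rest for deletion. The functions `chop` and `unique_in_order` are hypothetical names for this illustration, not cbird internals.

```python
def chop(groups):
    """Drop the first item (the needle) of each group; keep the rest."""
    return [path for group in groups for path in group[1:]]


def unique_in_order(paths):
    """Split paths into first occurrences and repeats, preserving order."""
    seen = set()
    unique, repeats = [], []
    for p in paths:
        if p in seen:
            repeats.append(p)
        else:
            seen.add(p)
            unique.append(p)
    return unique, repeats


# Two groups that share the same match C: A=>C and B=>C.
groups = [["A", "C"], ["B", "C"]]
deletions = chop(groups)                      # C appears twice
unique, repeats = unique_in_order(deletions)  # C once, plus one repeat
```

Deleting only `unique` would avoid the second trash attempt on C, and a non-empty `repeats` is a signal that two needles matched the same file.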

Exit code is -1 for -nuke; I'm not sure where the 0 status is coming from.

Yeah, there is no concept of "original" anywhere really, there is only the "needle". By the time you get to the "nuke" phase, it is more of a "trust me bro" option for cases where you have some known duplicates to discard.

Ideally nuke would never exit early unless there was a problem it couldn't otherwise detect or resolve.

I am moving towards something where we can reliably batch-delete duplicates. Based on your comments it seems clear that the system would need:

  1. some concept of "original"
    • some way to elect which one should be the original (digikam has this)
    • don't delete originals with -nuke, or dups if the original went missing
  2. if something seems off like copies in the list, put up an "are you sure ...?" prompt
  3. add a flag to "just say yes" to all warnings that would stop -nuke from finishing
  4. add another version of -nuke that would take originals into account
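
Items 2 and 3 of this list could interact roughly as in the Python sketch below. It is a guess at the shape of such a feature, not a description of anything cbird does today; `confirm_suspicious` and the `assume_yes` parameter are hypothetical names.

```python
def confirm_suspicious(warnings, assume_yes=False, ask=input):
    """Decide whether a batch delete may proceed.

    warnings:   list of human-readable problems found before deleting
                (for example, the same path listed twice).
    assume_yes: the hypothetical "just say yes" flag.
    ask:        prompt function, replaceable for testing.
    """
    if not warnings:
        return True  # nothing looks off, no prompt needed
    for w in warnings:
        print("warning:", w)
    if assume_yes:
        return True  # warnings are shown but do not block the run
    reply = ask("are you sure you want to continue? [y/N]: ")
    return reply.strip().lower() in ("y", "yes")
```

Defaulting to "no" on anything other than an explicit yes mirrors the existing `[y/N]` prompt shown in the log above.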
