I have a 633.9 GiB dataset on one filesystem and tried this project on it. It looks like it takes more than 5 minutes (I didn't let it finish).
I wrote a simple single-threaded Python script that first filters files by (size, head bytes, tail bytes) and then filters out entries sharing the same inode; after that, only 9 GiB of files need a full hash 🤔
Then I tried this project, and it seems it needs to hash about 40 GiB of files:
INFO:pydupes:Size filter reduced file count to: 20 (40.2GiB)
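For reference, here is a minimal sketch of the pre-filtering approach described above. The function names, chunk size, and hash choice are assumptions for illustration, not the reporter's actual script:

```python
import os
import hashlib
from collections import defaultdict

CHUNK = 4096  # bytes read from head and tail; an assumed value


def fingerprint(path):
    """Cheap key: (size, first CHUNK bytes, last CHUNK bytes)."""
    size = os.stat(path).st_size
    with open(path, 'rb') as f:
        head = f.read(CHUNK)
        tail = b''
        if size > CHUNK:
            f.seek(size - CHUNK)
            tail = f.read(CHUNK)
    return (size, head, tail)


def candidate_groups(paths):
    """Group by fingerprint, collapsing hardlinks (same inode) first."""
    groups = defaultdict(dict)  # fingerprint -> {inode: path}
    for path in paths:
        ino = os.stat(path).st_ino
        # Hardlinks share an inode, so they never need hashing against
        # each other; keep one representative path per inode.
        groups[fingerprint(path)].setdefault(ino, path)
    # Only groups with more than one distinct inode still need a full hash.
    return [g for g in groups.values() if len(g) > 1]


def full_hash(path):
    """Full-content hash, only run on the surviving candidates."""
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        for block in iter(lambda: f.read(1 << 20), b''):
            h.update(block)
    return h.hexdigest()
```

The point of the head/tail read is that most same-size files differ within the first or last few KiB, so the expensive full-content hash is reserved for the small remainder that survives all three filters.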
trim21 changed the title from "poor performance compared to sinple size, head, tail and inode pre filtering" to "poor performance compared to simple size, head, tail and inode pre filtering" on Apr 21, 2023
Late response, but that log statement only refers to the size filter, not necessarily what gets hashed. I didn't add an explicit log of the total size that needs to be hashed, since computing that would require a full pass over all the files to complete.