Feature Description
To avoid redundant scans, I suggest a feature for loading/importing saved result lists as-is, without re-comparing hashes; at most it would check whether the listed files still exist physically or as symlinks.
I spent a week scanning all my drives, and after identifying all the duplicate files I saved them to a list, but I cannot import that list back into Czkawka.
So after a crash I lost all progress and am forced to repeat the scan; the cached .bin hashes do not help, because Czkawka still re-compares all hashes from scratch, which takes days.
It is genuinely downright depressing to lose a week's worth of progress on a whim and to live with the constant fear of it happening again.
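Conceptually, the import I am asking for would only re-validate paths instead of re-hashing anything. Here is a rough sketch in plain Python of what that check could look like (not Czkawka code; it assumes the same JSON layout the conversion script below reads, i.e. size keys mapping to groups of entries with a "path" field):

import json
from pathlib import Path

def load_saved_duplicates(json_file):
    """Load a previously saved duplicate list and keep only entries whose
    files still exist on disk (physically or as symlinks), without re-hashing."""
    with open(json_file, 'r', encoding='utf-8') as f:
        data = json.load(f)

    surviving = {}
    for size, groups in data.items():
        kept_groups = []
        for group in groups:
            # Keep a file if it is still present, either as a real file or a symlink
            # (is_symlink() also catches symlinks whose target is gone).
            still_there = [entry for entry in group
                           if Path(entry["path"]).exists() or Path(entry["path"]).is_symlink()]
            # A group only remains a duplicate group if at least two files survive.
            if len(still_there) >= 2:
                kept_groups.append(still_there)
        if kept_groups:
            surviving[size] = kept_groups
    return surviving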
I made this Python script to convert a saved duplicate finder result from Czkawka into a Dupeguru file, so you can keep working in Dupeguru if Czkawka crashes, without wasting time rescanning the drives from scratch.
Dupeguru also lets you choose which file to keep as the original (right-click > "Mark Selected into Reference") when you symlink groups (Actions > "Send Marked to Recycle Bin" > "Link deleted files" > "Symlink"), as requested in #903 and #149.
import json
import xml.etree.ElementTree as ET

def convert_json_to_xml(json_file, xml_file):
    # Read JSON data from the input file
    with open(json_file, 'r', encoding='utf-8') as f:
        data = json.load(f)

    # Create the root element of the XML document
    results = ET.Element("results")

    # Iterate over the data and create the XML structure
    for size_group in data.values():
        for group in size_group:
            group_element = ET.SubElement(results, "group")
            for file in group:
                file_element = ET.SubElement(group_element, "file")
                file_element.set("path", file["path"])
                file_element.set("words", "")
                file_element.set("is_ref", "n")
                file_element.set("marked", "n")

    # Create an ElementTree object and write it to the XML file
    tree = ET.ElementTree(results)
    tree.write(xml_file, encoding='utf-8', xml_declaration=True)

# Convert JSON to XML
convert_json_to_xml('czkawka_duplicates.json', 'dupeguru_duplicates.dupeguru')
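The output is a minimal Dupeguru results file: one group element per duplicate group and one file element per path, with words left empty and is_ref/marked set to "n" so nothing is pre-marked. You can then open it in Dupeguru through the results-loading entry in the File menu (the exact label may vary by version) and continue marking references and symlinking from there.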