Interesting (non-cryptographic) hashes implemented in pure Python.
Duplicates Detector is a cross-platform GUI utility for finding duplicate files, allowing you to delete or link them to save space. Duplicate files are displayed and processed on two synchronized panels for efficient and convenient operation.
Advanced Duplicate File Finder for Python
Tool for removing duplicate documents from Elasticsearch
Proof of concept for advanced similarity and duplicate detection in source code, built for our research efforts.
QGIS Processing plugin to add an algorithm for upserting features from a source vector layer to an existing target vector layer.
A command-line tool that automates the deletion of duplicate files based on their hash or perceptual hash.
Find, remove and avoid duplicates with dugu: The Duplicates Guru
A Python program to locate duplicate files, and do it fast
Uses SSIM and MSE to get rid of duplicates and near-duplicates
This program finds duplicate files in a folder and its subfolders. Duplicates are moved to a separate folder. A few other modes of operation are also planned/available.
Cleans up duplicate SMS entries in a backup created by the SMS Backup & Restore Android app.
Remove duplicates from parallel corpora
Compares the files in a folder by their MD5 checksums and either deletes the duplicates or moves them to a folder of your choice.
A project aiming to leverage text embeddings and Milvus, a high-performance vector search engine, to detect duplicate job postings.
🐻 The decluttering deduplicator
A fast random-number file generator that produces a text document of randomized integers (unique values are also supported) within the specified constraints. This project was created to aid in benchmarking my own Double-Edged Sorting algorithm.
Identification of gene paralogs in genomes, and calculation of dS and dN/dS values for paralogous gene pairs
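Several of the tools listed above describe finding duplicate files by comparing content hashes, for example MD5 checksums across a folder and its subfolders. The following is a minimal sketch of that general idea, not code taken from any of the listed projects. The function names `file_md5` and `find_duplicates` and the 64 KiB chunk size are assumptions made for illustration.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group regular files under root (subfolders included) by content digest.

    Returns a list of groups; each group is a sorted list of paths whose
    contents hash to the same MD5 value. Symbolic links are skipped.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_digest[file_md5(path)].append(path)
    # Only digests shared by two or more files indicate duplicates.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_duplicates("some/folder")` returns the groups without deleting or moving anything; what to do with each group is left to the caller. MD5 is used here only because the description above names it. Matching digests make identical content very likely, but a byte-by-byte comparison is the safer final check before removing files.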