Interact, analyze and structure massive text, image, embedding, audio and video datasets
Updated Jun 27, 2024 - Python
Interactive code for image similarity using the SIFT algorithm
Near Duplicate Video Detection (Perceptual Video Hashing) - Get a 64-bit comparable hash-value for any video.
Fast Near-Duplicate Image Search and Delete using pHash, t-SNE and KDTree.
Proof of concept for advanced similarity and duplicate detection in source code, built for our research efforts.
Advanced Duplicate File Finder for Python
CLI utility to find near duplicate images and remove all but the best copy.
Find similar audio files easily
An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows.
Duplicates Detector is a cross-platform GUI utility for finding duplicate files, allowing you to delete or link them to save space. Duplicate files are displayed and processed on two synchronized panels for efficient and convenient operation.
Findm is a Python script to find duplicate file copies in a given directory.
A discord bot for automatically detecting duplicate images using perceptual image hashing and similar techniques.
Find, remove and avoid duplicates with dugu: The Duplicates Guru
Python tool to help you knock out duplicate entries from multiple files and generate the final output
This Python package identifies duplicate files in a folder of interest.
An End-to-End Evaluation Framework for Entity Resolution Systems
A basic duplicate image detection service using perceptual image hash functions and nearest neighbor search, implemented using faiss, fastapi, and imagehash
When importing multiple CSV files, Bitwarden creates duplicate entries. This Python script removes the duplicates and keeps one copy of each entry.
A duplicate file finder like rdfind, fdupes, et al. that may be faster in environments with millions of files and terabytes of data, or over high-latency filesystems (e.g. NFS).
Uses SSIM and MSE to get rid of duplicates and near duplicates
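Several of the tools listed above look for exact duplicate files. As a rough illustration of that general idea, and not the method of any listed project, the sketch below groups files by size and then by SHA-256 digest using only the Python standard library. The function names `file_digest` and `find_duplicates` are hypothetical, chosen for this example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return sorted groups of paths under root that share size and SHA-256 digest."""
    # Pass 1: bucket regular files by size, which is cheap to obtain.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files whose size is shared with at least one other file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have an exact duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Calling `find_duplicates("some/folder")` returns a list of path groups, each with two or more files that have matching contents by digest. Checking sizes first avoids hashing files that cannot match. Near-duplicate detection for images, audio, or video, as in the perceptual-hashing tools above, needs a different technique and is not covered by this sketch.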