ISCC: International Standard Content Code
Updated Apr 30, 2024 - Python
A Simple Image Clustering Script using CLIP and Hierarchical Clustering
Bachelor's Thesis on Near-Duplicate Image Detection. This repo contains all resources, code, and documentation developed during the process.
Python library for detecting near-duplicate texts in a corpus at scale using Locality Sensitive Hashing, as described in chapter three of Mining of Massive Datasets.
Language of Vectors (LangVec) is a simple Python library designed for transforming numerical vector data into a language-like structure using a predefined set of words (lexicon).
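For readers unfamiliar with the Locality Sensitive Hashing approach named in the near-duplicate text library above, here is a minimal standard-library sketch of the general MinHash-plus-banding idea. It is not taken from any repository listed here; the function names (`shingles`, `minhash_signature`, `candidate_pairs`, `estimated_jaccard`) and the parameters (5-character shingles, 128 hashes, 32 bands) are illustrative assumptions.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=5):
    """Return the set of k-character shingles of a lowercased, whitespace-normalized text."""
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}


def _hash(seed, shingle):
    # Hypothetical helper: one 64-bit hash function per seed, built on BLAKE2b.
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(shingle_set, num_hashes=128):
    """MinHash signature: for each hash function, keep the minimum value over the set."""
    return [min(_hash(seed, s) for s in shingle_set) for seed in range(num_hashes)]


def candidate_pairs(signatures, bands=32):
    """Split signatures into bands; documents sharing any whole band become candidates."""
    if not signatures:
        return set()
    rows = len(next(iter(signatures.values()))) // bands
    pairs = set()
    for b in range(bands):
        buckets = defaultdict(list)
        for doc_id, sig in signatures.items():
            buckets[tuple(sig[b * rows:(b + 1) * rows])].append(doc_id)
        for ids in buckets.values():
            pairs.update(combinations(sorted(ids), 2))
    return pairs


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

In use, one would build a signature per document, call `candidate_pairs` on the resulting dictionary, and then confirm each candidate with `estimated_jaccard` (or an exact comparison). More bands with fewer rows each make the filter more permissive, and fewer bands with more rows make it stricter.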