PHP script to recursively find duplicates of images in a directory to facilitate easy removal of them
aziraphale/Image-De-duplicator
Image De-Duplicator
=======================================

This PHP script scans a specified directory of images and builds a database of information about each image in that directory (or its subdirectories), including each image's MD5 hash, file size, dimensions and an approximation of its graphical content.

This database is then cross-referenced with itself to generate a list of images that are either identical (matching MD5 hashes, etc.) or visually similar: perhaps two sizes of the same image, two formats of the same image content (e.g. GIF vs. PNG), or re-compressed JPEG images (which will therefore have different compression artefacts).

The generated list of duplicate images can then be used to delete images manually, as desired.

Initial Setup
-------------------

Before you can use this script, you must first create the two MySQL database tables it requires and change the configuration lines at the top of the script.

The required database tables are:

    CREATE TABLE `comp` (
      `a` varchar(255) NOT NULL,
      `b` varchar(255) NOT NULL,
      `pixeldiffs` text NOT NULL,
      `avgpixeldiff` float DEFAULT NULL,
      PRIMARY KEY (`a`,`b`),
      KEY `avgpixeldiff` (`avgpixeldiff`)
    );

    CREATE TABLE `image` (
      `name` varchar(255) NOT NULL,
      `size` mediumint(9) NOT NULL,
      `hash` char(32) NOT NULL,
      `isimage` tinyint(1) NOT NULL,
      `w` smallint(6) NOT NULL,
      `h` smallint(6) NOT NULL,
      `pixels` text,
      PRIMARY KEY (`name`),
      KEY `size` (`size`),
      KEY `hash` (`hash`)
    );

Usage
-------------------

The first execution of this script should be from the command line, in order to generate the database of images:

`php image-deduplicator.php`

You can then access the same script via a Web browser (and associated httpd) to view its results.

The Web half of this script supports a few GET arguments:

* "limit" - An integer specifying the number of results to show.
* "offset" - An integer specifying the offset to pass to the MySQL "SELECT" query. This can be used to paginate the results.
* "rm" - If this argument is present, the script will return a string of quoted filenames that can be passed directly to Linux's `rm` command to delete the images that would have otherwise been displayed if this argument wasn't present.
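
As a rough illustration of how the three GET arguments above could fit together, here is a minimal sketch. It is not taken from the script itself: the helper names `intArg` and `rmArguments`, the default values and the sample filenames are all assumptions, and an in-memory array stands in for the rows a MySQL query would return.

```php
<?php
// Hypothetical helpers for illustration only; the names and defaults are
// assumptions and do not come from image-deduplicator.php.

/**
 * Read a non-negative integer argument from a GET-style array,
 * falling back to a default when it is missing or not purely digits.
 */
function intArg(array $query, string $key, int $default): int
{
    if (!isset($query[$key]) || !is_scalar($query[$key])
        || !ctype_digit((string) $query[$key])) {
        return $default;
    }
    return (int) $query[$key];
}

/**
 * Shell-quote each filename and join them with spaces, so that the
 * result can be appended to an `rm` command.
 */
function rmArguments(array $filenames): string
{
    return implode(' ', array_map('escapeshellarg', $filenames));
}

// Stands in for $_GET on a request such as ?limit=2&offset=1&rm
$query  = ['limit' => '2', 'offset' => '1', 'rm' => ''];
$limit  = intArg($query, 'limit', 50);   // assumed default of 50
$offset = intArg($query, 'offset', 0);

// Stand-in for the rows a "SELECT ... LIMIT ... OFFSET ..." query would return.
$duplicates = ['a.jpg', 'holiday photo.png', "it's.gif", 'z.jpg'];
$page = array_slice($duplicates, $offset, $limit);

// "rm" only needs to be present; its value is ignored.
if (array_key_exists('rm', $query)) {
    echo rmArguments($page), "\n";
}
```

Using `escapeshellarg` keeps filenames containing spaces or quotes intact when the output is pasted after `rm`. The script's real quoting and output format may differ, so review the generated list before deleting anything.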