Andrew Binstock edited this page Sep 28, 2020 · 7 revisions

What it is

FileDedupe is a utility that checks one or more directories for duplicate files. Run it with a list of directories on the command line. By default it also checks all subdirectories; this can be changed with an option (see below). The output is plain text written to stdout, listing the names of files that have duplicates: each such file is given, followed by its duplicates.
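To make the idea concrete, here is a minimal sketch of content-based duplicate detection in Java 8. It is not FileDedupe's actual implementation (the Java Magazine articles describe that); the class name `DedupeSketch`, the method names, and the choice of SHA-256 as the content fingerprint are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not the FileDedupe source code.
public class DedupeSketch {

    /** Returns groups of files with identical content; every group has at least two entries. */
    static List<List<Path>> findDuplicates(List<Path> roots, boolean includeSubdirs)
            throws IOException {
        Map<String, List<Path>> byDigest = new TreeMap<>();
        Set<Path> seen = new HashSet<>(); // guards against overlapping root directories
        int depth = includeSubdirs ? Integer.MAX_VALUE : 1; // depth 1 = files directly in root
        for (Path root : roots) {
            try (Stream<Path> walk = Files.walk(root, depth)) {
                List<Path> files = walk.filter(Files::isRegularFile)
                                       .sorted()
                                       .collect(Collectors.toList());
                for (Path p : files) {
                    if (!seen.add(p.toAbsolutePath().normalize())) {
                        continue; // already visited via another root
                    }
                    byDigest.computeIfAbsent(sha256(p), k -> new ArrayList<>()).add(p);
                }
            }
        }
        return byDigest.values().stream()
                .filter(group -> group.size() > 1)
                .collect(Collectors.toList());
    }

    /** Streams the file through SHA-256 and returns the digest as a hex string. */
    static String sha256(Path file) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every Java platform
        }
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), md)) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // reading drives the digest; the bytes themselves are not needed
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws IOException {
        List<Path> roots = new ArrayList<>();
        for (String arg : args) {
            roots.add(Paths.get(arg));
        }
        // Print each file followed by its duplicates, indented.
        for (List<Path> group : findDuplicates(roots, true)) {
            System.out.println(group.get(0));
            for (Path dup : group.subList(1, group.size())) {
                System.out.println("    " + dup);
            }
        }
    }
}
```

The sketch hashes every file, which is simple but slow on large trees. A common refinement is to group files by size first and hash only those whose sizes collide; whether FileDedupe does this is not stated here.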

An article on this utility and how it was designed and written appears in Oracle's Java Magazine.

A subsequent article explains version 2.0, which contains the optimizations that led to a 9x-11x performance improvement.

How to run

FileDedupe is written in Java 8. To run it, run the JAR file and pass the directory or directories to scan for duplicates. Note that a directory of . (the current directory) is supported. Options:

-nosubdirs: prevents FileDedupe from checking subdirectories for duplicates.

-help or -h or --h: shows this usage information.

Testing

The tests included here provide 80% code coverage. FileDedupe has also been tested repeatedly on directories containing more than 600,000 files.

A big thank you to JetBrains for its support for open source via a license for IntelliJ IDEA, which was used on this project (and many others).

www.jetbrains.com/idea
