codeburd/Confero


Confero is a utility for searching large collections of files and finding ones that are similar, but not identical.

It is not as capable at finding duplicate images as an image-specific utility, nor as capable at finding duplicate audio as an audio-specific one, but it works on any kind of file.

For example, it will find different archives whose contents overlap, provided the archives use the same compression, or different edits all derived from the same source document.

It uses common algorithms to identify files that share a large proportion of their byte sequences, regardless of where those sequences occur within each file.
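As an illustration of offset-independent matching, here is a minimal sketch of one common approach: hash every fixed-size window of bytes, collect the distinct hashes for each file, and compare the two sets. This is not necessarily what Confero does internally. The names `fingerprint` and `similarity`, the `WINDOW` size, and the choice of FNV-1a hashing with a Jaccard score are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define WINDOW 8 /* bytes per hashed window; an assumed value for the sketch */

/* FNV-1a hash of one window of n bytes. */
static uint64_t window_hash(const unsigned char *p, size_t n) {
    uint64_t h = 14695981039346656037ULL;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

static int cmp_u64(const void *a, const void *b) {
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: builds a sorted, de-duplicated array of the hashes of
 * every WINDOW-byte window in data. Stores it in *out (caller frees) and
 * returns the number of entries. Inputs shorter than WINDOW yield 0. */
size_t fingerprint(const unsigned char *data, size_t len, uint64_t **out) {
    assert(out != NULL);
    *out = NULL;
    if (len < WINDOW) return 0;
    size_t windows = len - WINDOW + 1;
    uint64_t *h = malloc(windows * sizeof *h);
    if (h == NULL) return 0;
    for (size_t i = 0; i < windows; i++)
        h[i] = window_hash(data + i, WINDOW);
    qsort(h, windows, sizeof *h, cmp_u64);
    size_t unique = 0;
    for (size_t i = 0; i < windows; i++)
        if (unique == 0 || h[i] != h[unique - 1]) h[unique++] = h[i];
    *out = h;
    return unique;
}

/* Hypothetical helper: Jaccard similarity of two sorted hash sets, i.e. the
 * number of shared hashes divided by the number of distinct hashes overall. */
double similarity(const uint64_t *a, size_t na, const uint64_t *b, size_t nb) {
    size_t i = 0, j = 0, common = 0;
    while (i < na && j < nb) {
        if (a[i] == b[j]) { common++; i++; j++; }
        else if (a[i] < b[j]) i++;
        else j++;
    }
    size_t total = na + nb - common;
    return total ? (double)common / (double)total : 0.0;
}
```

Because each window is hashed on its own, inserting a few bytes at the start of a file changes only the windows that touch the insertion, so a shifted copy still scores close to 1 while unrelated data scores close to 0. Storing every window hash is too expensive for large collections; a practical tool would sample or sketch the hash set instead.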

The mman.c and mman.h files are needed only when compiling on Windows. They simply wrap the Windows API functions, and are copied for convenience from https://github.com/boldowa/mman-win32/
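To show where such a wrapper fits, here is a sketch of a read-only file mapping that compiles against `<sys/mman.h>` on POSIX systems and against the bundled `mman.h` on Windows. It assumes the wrapper exposes the usual `mmap`, `munmap`, `PROT_READ`, `MAP_PRIVATE` and `MAP_FAILED` names, as mman-win32 does. The helper names `map_file` and `unmap_file` are hypothetical and are not taken from Confero's source.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>
#ifdef _WIN32
#include <io.h>
#include "mman.h" /* bundled wrapper around the Windows API */
#else
#include <sys/mman.h>
#include <unistd.h>
#endif

/* Hypothetical helper: maps a whole file read-only. On success returns a
 * pointer to its bytes and stores the size in *len. On failure, or for an
 * empty file, returns NULL and sets *len to 0. */
const unsigned char *map_file(const char *path, size_t *len) {
    assert(path != NULL && len != NULL);
    *len = 0;
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror(path);
        return NULL;
    }
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* on POSIX the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED) return NULL;
    *len = (size_t)st.st_size;
    return p;
}

/* Hypothetical helper: releases a mapping returned by map_file. */
void unmap_file(const unsigned char *p, size_t len) {
    if (p != NULL) munmap((void *)p, len);
}
```

Mapping files rather than reading them into buffers lets the operating system page data in on demand, which suits a tool that scans many large files. The Windows branch is untested here and is included only to show the intent of the conditional include.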

About

A general-purpose near-duplicate file finder.
