
A simple and fast web app to remove duplicate images from your datasets.

rajatdv/DIF

Duplicate Image Finder

When creating an image dataset for a deep learning project, there is a high chance that the dataset contains duplicate images.
This standalone tool helps you find and remove those duplicates through a simple interface. If you are a developer, you can easily customize the code to suit your needs.

Demo

Installation

  1. git clone https://github.com/rajat-1994/DIF.git
  2. cd DIF
  3. pip install -r requirements.txt

Usage

After installation, run the command below and you are good to go.

  • python app.py

NOTE: As you delete images in the interface, the backend saves a file named files.csv. Once you have finished cleaning your dataset, you can read the CSV and filter out the deleted images.

import pandas as pd
df = pd.read_csv('files.csv')
df = df[df.is_deleted == 0]  # keep only the images that were not deleted
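
As a slightly fuller illustration of the same filtering step, here is a minimal sketch that splits the records into kept and deleted sets. Only the `is_deleted` column is confirmed by the snippet above. The `filename` column, the sample rows, and the `split_by_deletion` helper are assumptions made for illustration, and the real files.csv may use different column names.

```python
import io

import pandas as pd

# Hypothetical stand-in for files.csv. Only `is_deleted` is known from the
# README; the `filename` column and these rows are assumed for illustration.
SAMPLE_CSV = """filename,is_deleted
cat_001.jpg,0
cat_001_copy.jpg,1
dog_004.jpg,0
"""


def split_by_deletion(df):
    """Return (kept, deleted) DataFrames based on the is_deleted flag."""
    kept_mask = df["is_deleted"] == 0
    return df[kept_mask], df[~kept_mask]


# Replace io.StringIO(SAMPLE_CSV) with 'files.csv' to read the real file.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))
kept, deleted = split_by_deletion(df)
print(f"kept {len(kept)} of {len(df)} images")
```

Keeping the deleted rows in a separate frame makes it easy to double-check what was removed before building the final dataset.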
