Image Difference Segmentation

Automatically generate segmentation masks and train segmentation models from image differences.

This project enables creation of large binary segmentation datasets through use of image differences. Certain domains, such as comic books or manga, take particularly well to the proposed approach. Creating a dataset and training a segmentation model involves two manual steps (outside of the code in this repository):

Finding and sorting suitable data. Ideally, your data should have two or more classes wherein the only difference between the classes should be the subject that is to be segmented. An example would be an English page from a comic and a French page from the same comic.
Segmentation masks must be manually created for a small number of image differences. Using a pretrained DiffNet requires only 20-50 new masks. Re-training DiffNet from scratch requires 100-200 masks. For quickly generating binary segmentation masks, simple-masker was written/used.

Prerequisites

The following must be on your system:

Python 3.6+
An accompanying Pip installation
Python and Pip must be accessible from the command line
An NVIDIA GPU that is CUDA-capable (6GB+ of VRAM likely needed)

Installation

Clone the repository to a convenient location.

git clone https://github.com/Brikwerk/image-difference-segmentation

Navigate to the repository on your command line and install the required packages with pip. Please note, you may need to install PyTorch/Torchvision using installation links at https://pytorch.org

pip install -r requirements.txt

Creating a Dataset

Before beginning, make sure you have your data sorted into class-based folders. In the below example, the Comic_Pages dataset contains two classes, ENG and FRENCH. Each class should contain images that roughly match. For example, a page from the same comic book but in different languages.

Comic_Pages
    |
    --> ENG
    |    |
    |     --> 00001.png
    |     --> 00002.png
    |         ...
    --> FRENCH
        |
        --> 00001.png
        --> 00002.png
            ...

Run the match_images.py folder to (A) match up all images between the two classes and (B) generates image differences. match_images.py takes CLI arguments that determine where data comes from and where data goes. For details on the CLI arguments, run the file with the --help flag.

python match_images.py --help

The strictly necessary flags are the a_root, b_root and dst_root flags. Other flags can be used to finetune output for better results.

You should have two folders in your destination directory: a_images and b_images. Both will contain the warped/matched/renamed input images and the respective differences images for each class.

Creating Masks from Difference Images

To create masks for a folder full of difference images, run the batch_process.py file. The main CLI arguments associated with this file are:

weights_path: The path to a weights file for a DiffNet instance
input_path: The to path to a folder that contains difference images
output_path: The path to a folder where masks will be output. This folder does not necessarily need to exist.

For more info on other configuration options, run batch_process.py with the --help flag:

python batch_process.py --help

Training a DiffNet or SegNet instance

To train an instance of DiffNet or SegNet, run the train.py file. The following arguments must be specified for a proper training run:

images_path: The to path to a folder that contains input images
masks_path: The path to a folder that contains masks.

The above image/mask pairs should match name-wise.

For more info on other configuration options, run train.py with the --help flag:

python train.py --help

Using a Pretrained Model

Downloading the Weights File

Weights for this project are hosted at HuggingFace under weights directory. Currently, a DiffNet instance trained on text differences is provided. To use this model, download it and move it to the weights directory in your local copy of this repository.

Using Pretrained Weights

Pretrained weights can be used in the batch_process.py file and the evaluate.py file. For both files, specify the path to your weights file using the --weights_path CLI argument.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
imgs		imgs
src		src
weights		weights
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
batch_process.py		batch_process.py
clean_dataset.py		clean_dataset.py
evalute.py		evalute.py
match_images.py		match_images.py
requirements.txt		requirements.txt
train.py		train.py

License

Brikwerk/image-difference-segmentation

Folders and files

Latest commit

History

Repository files navigation

Image Difference Segmentation

Automatically generate segmentation masks and train segmentation models from image differences.

Prerequisites

Installation

Creating a Dataset

Creating Masks from Difference Images

Training a DiffNet or SegNet instance

Using a Pretrained Model

Downloading the Weights File

Using Pretrained Weights

License

About

Resources

License

Stars

Watchers

Forks

Languages