This repository provides a script and a set of instructions for constructing datasets (e.g. Male-to-Female or Glasses Removal) suitable for training unpaired image-to-image translation models (CycleGAN, CouncilGAN, UVCGAN, etc.).
- Download the CelebA dataset. The following files are required:
  - Train/Val/Test Partitions. File name: list_eval_partition.txt
  - Attributes Annotations. File name: list_attr_celeba.txt
  - Aligned and Cropped Images. Archive name: img_align_celeba_png.7z
- Unpack the CelebA image archive img_align_celeba_png.7z. For example,
    7z x img_align_celeba_png.7z # requires 7zip installed
- Use the provided convert_celeba.py script to convert the raw CelebA dataset into the CycleGAN form.
To create a Male-to-Female dataset you can use the following command:
python3 convert_celeba.py \
--list-attr PATH_TO_list_attr_celeba.txt \
--list-part PATH_TO_list_eval_partition.txt \
--attr Male \
PATH_TO_EXTRACTED_CELEBA_IMAGES \
OUTPUT_DIRECTORY
Or, to create a Glasses removal dataset, you can run the following command:
python3 convert_celeba.py \
--list-attr PATH_TO_list_attr_celeba.txt \
--list-part PATH_TO_list_eval_partition.txt \
--attr Eyeglasses \
PATH_TO_EXTRACTED_CELEBA_IMAGES \
OUTPUT_DIRECTORY
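To illustrate what such a conversion involves, here is a minimal sketch, not the actual implementation of convert_celeba.py. It relies only on the layout of the two CelebA annotation files: list_attr_celeba.txt starts with an image count and a line of attribute names, followed by rows of 1/-1 values, and list_eval_partition.txt maps each image to 0 (train), 1 (val) or 2 (test). The function names, the choice to put images that have the attribute into domain A, and the valA/valB directory names are assumptions made for illustration. The sketch uses only the standard library, whereas the real script depends on pandas and may behave differently.

```python
from pathlib import Path

# Partition codes used in list_eval_partition.txt.
PARTITIONS = {0: "train", 1: "val", 2: "test"}


def read_attributes(path):
    """Return {image_name: {attribute_name: bool}} from list_attr_celeba.txt."""
    lines = Path(path).read_text().splitlines()
    # Line 0 holds the image count, line 1 the attribute names.
    names = lines[1].split()
    table = {}
    for line in lines[2:]:
        fields = line.split()
        if not fields:
            continue
        # Values are 1 (attribute present) or -1 (attribute absent).
        table[fields[0]] = {n: int(v) == 1 for n, v in zip(names, fields[1:])}
    return table


def read_partitions(path):
    """Return {image_name: 'train' | 'val' | 'test'} from list_eval_partition.txt."""
    result = {}
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if fields:
            result[fields[0]] = PARTITIONS[int(fields[1])]
    return result


def plan_split(attrs, parts, attr):
    """Map each image to a CycleGAN-style subdirectory name such as 'trainA'."""
    plan = {}
    for name, split in parts.items():
        # Assumption: domain A holds the images that have the attribute.
        domain = "A" if attrs[name][attr] else "B"
        plan[name] = split + domain
    return plan
```

With a plan like this, copying each image into OUTPUT_DIRECTORY/<subdirectory>/ would produce the trainA/trainB/testA/testB layout that CycleGAN-style data loaders expect.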
To run the convert_celeba.py script, you need a working python3 interpreter and the following additional packages:
- pandas
- tqdm
To unpack the original CelebA dataset, you need 7zip installed.
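A quick way to check these requirements before starting is sketched below. The helper name missing_requirements is hypothetical, and the sketch assumes the 7zip executable is called 7z; on some systems it is installed under a different name (for example 7za or 7zz), in which case the tools argument should be adjusted.

```python
import importlib.util
import shutil


def missing_requirements(packages=("pandas", "tqdm"), tools=("7z",)):
    """Return the names of Python packages and executables that cannot be found."""
    # find_spec returns None when a top-level package is not importable.
    missing = [p for p in packages if importlib.util.find_spec(p) is None]
    # shutil.which returns None when no executable with that name is on PATH.
    missing += [t for t in tools if shutil.which(t) is None]
    return missing
```

Calling missing_requirements() returns an empty list when everything is available, and otherwise the names that still need to be installed.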
The checksums directory contains the reference checksums of the Male-to-Female and Glasses removal datasets. These checksums were calculated over the datasets provided by CouncilGAN (with the CouncilGAN file names changed to match those of CelebA).
The checksums directory also contains the reference checksums of the original CelebA dataset archive.
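To compare a downloaded or generated file against a reference value, a digest can be computed as in the sketch below. The hash algorithm used for the files in the checksums directory is not stated here, so the sha256 default is an assumption; pass the algorithm name that matches the reference files. The helper name file_digest is hypothetical.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    # Assumption: sha256 is only a default; use the algorithm of the reference checksums.
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

The resulting string can then be compared with the corresponding entry in the checksums directory; a mismatch suggests a corrupted download or a dataset that differs from the reference.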