Creating your own pix2pix dataset
=================================



## Installation requirements

To run this and the following Pix2Pix notebooks you may need to install some new Python packages. To do so, open a terminal and first make sure your environment is active:
```
conda activate dmlap
```
This is not obligatory, but since these new dependencies were tested only recently, you may want to clone your current `dmlap` environment with
```
conda create -n dmlap2 --clone dmlap
```
and then use the new `dmlap2` environment instead.
With the environment active:
```
pip install face_recognition
pip install pyglet
```
Then:
```
conda install conda-forge::cairo
```
Followed by:
```
conda install conda-forge::pycairo
```

If you have not done so already, you will also need to install the [py5canvas](https://github.com/colormotor/py5canvas) module. To do so use 
```
pip install git+https://github.com/colormotor/py5canvas.git
```

**NOTE** There is a chance that the pycairo installation will fail on Mac. If that is the case, you may need to run:
```
xcode-select --install
``` 
from the command line.

### Updating py5canvas
If you already installed py5canvas for the previous examples, you will need to update it to the latest version. To do so use 
```
pip install --upgrade  --force-reinstall --no-deps git+https://github.com/colormotor/py5canvas.git
```
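
To confirm that the packages above are visible from the active environment, a quick check along these lines may help. This is only a sketch: the helper name `check_modules` is hypothetical, and it assumes the import names `face_recognition`, `pyglet`, `cairo` (the module provided by pycairo) and `py5canvas`.

```python
import importlib.util

def check_modules(names):
    """Return a dict mapping each module name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

# Import names assumed for the packages installed above
required = ['face_recognition', 'pyglet', 'cairo', 'py5canvas']
for name, found in check_modules(required).items():
    print('%-18s %s' % (name, 'ok' if found else 'MISSING'))
```

Any module reported as `MISSING` was most likely installed into a different environment than the one the notebook is running in.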

## Pix2pix datasets
A pix2pix dataset consists of a series of image pairs. Each pair consists of a *source* (or input) image and a *target* image. The Pix2Pix model learns the transformation from source to target images and, hopefully after sufficient training, learns to transform images similar to the training sources into images similar to the training targets.

The standard Pix2Pix implementation operates on input and output images with a size of 256x256 pixels. There are different ways in which a pix2pix training set may be organized. Here we adopt the convention that source and target images are laid next to each other in a single training image of 512x256 pixels. The code below allows you to create such a training set from arbitrarily sized images. Let's begin by importing some necessary modules:

In [1]:
## Modules
import matplotlib.pyplot as plt
import numpy as np
from skimage import io, transform
from skimage import feature, filters
import cv2
import glob
from tqdm.auto import tqdm
import random
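
Before moving on, the side-by-side layout described above can be sketched with NumPy alone. This is an illustration with placeholder arrays rather than real images, and the variable names are chosen only for this example.

```python
import numpy as np

# Placeholder 256x256 RGB images standing in for a real source/target pair
example_source = np.zeros((256, 256, 3), dtype=np.uint8)      # e.g. an edge map
example_target = np.full((256, 256, 3), 255, dtype=np.uint8)  # e.g. the original image

# Source on the left, target on the right
example_pair = np.hstack([example_source, example_target])
print(example_pair.shape)  # (256, 512, 3)
```

NumPy reports shapes as (height, width, channels), so a 512x256 pixel training image appears as `(256, 512, 3)`.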

## Setting up 
The following code is designed to work with different kinds of inputs. It requires setting the following parameters:

-   `target_path` defines where your **target** images are located.
-   `source_path` defines where your **source** images are located, if you already have these. Otherwise, set this to an empty string `''`.
-   `dataset_path` defines where your pix2pix dataset will be saved.
-   `is_input_pix_to_pix` should be set to `True` if the input dataset already consists of source and target pairs merged into single images. This will be the case if you want to modify the source of an existing pix2pix dataset. For example, we may have a dataset that translates edges to images of faces and want to replace the input with face landmarks. In this case we need to extract only the target.
-   `target_index` indicates where the target is located in each training image. Set it to `0` if the target image is on the left, or to `1` if it is on the right.

Note that you must give the exact path to each image directory, since this code does not search for images recursively. Also note that the most common use case is one where you provide a dataset of targets (desired outputs) and process them to create the corresponding inputs (e.g. with edge detection or by finding face landmarks). In that case you do not need to worry about the `source_path` directory below.
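
Because only the top level of each directory is searched, it can be worth checking what will be found there before loading anything. A small sketch, assuming a hypothetical helper `list_image_files`, an assumed set of file extensions, and a placeholder directory name:

```python
import glob
import os

def list_image_files(path, extensions=('.png', '.jpg', '.jpeg')):
    """List image files directly inside `path` (no recursion), sorted by name."""
    files = sorted(glob.glob(os.path.join(path, '*')))
    return [f for f in files
            if os.path.isfile(f) and f.lower().endswith(extensions)]

# './datasets/my_targets' is a placeholder; use your own target directory
found_files = list_image_files('./datasets/my_targets')
print('%d image files found' % len(found_files))
```

If this reports zero files, the path probably points at a parent folder rather than at the folder that directly contains the images.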

Here, by default, we will load the "Face 2 comics" dataset. Download the dataset from [https://www.kaggle.com/datasets/defileroff/comic-faces-paired-synthetic](https://www.kaggle.com/datasets/defileroff/comic-faces-paired-synthetic), unzip it, and place the `face2comics_v1.0.0_by_Sxela` subdirectory of the archive in the `datasets` directory relative to this notebook. This is already a "pix2pix-friendly" dataset, although its source and target images are stored as separate files. We will use the images to create an "Edges to comics" dataset, where we apply edge detection to a subset of the source images and leave the corresponding comic versions unchanged.



In [2]:
import os

target_path = './datasets/face2comics_v1.0.0_by_Sxela/comics/'
source_path = './datasets/face2comics_v1.0.0_by_Sxela/face/'  # Only used if we already have source image examples
dataset_path = './datasets/edge2comics'
max_images = 500
is_input_pix_to_pix = False
target_index = 1

# Uncomment and adjust paths to use face landmarks as the source
# target_path = './datasets/edges2rembrandt'
# source_path = ''
# dataset_path = './datasets/landmarks2rembrandt'
# max_images = 500
# is_input_pix_to_pix = True
# target_index = 1

The code above also contains a commented section with paths for the case in which you operate on an existing pix2pix dataset consisting of 512x256 images and want to replace the source with a custom one. Later in the code you will find a commented section that identifies face landmarks in the targets of the dataset and uses the polylines of the face landmarks as the source. For this specific example to work, you need to download the ["rembrandt pix2pix dataset"](https://www.kaggle.com/datasets/grafstor/rembrandt-pix2pix-dataset/code) and unzip the images into an `edges2rembrandt` folder inside the `datasets` folder relative to this notebook.
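
Extracting the target from an existing 512x256 pair amounts to keeping one half of the image. A minimal sketch with a placeholder array, where the helper name `extract_target` is hypothetical:

```python
import numpy as np

def extract_target(pair, target_index=1):
    """Return the target half of a side-by-side pair (left if 0, right if 1)."""
    half = pair.shape[1] // 2  # split along the width axis
    if target_index == 0:
        return pair[:, :half]
    return pair[:, half:]

# Placeholder 512x256 pair: black left half, white right half
demo_pair = np.zeros((256, 512, 3), dtype=np.uint8)
demo_pair[:, 256:] = 255
demo_target = extract_target(demo_pair, target_index=1)
print(demo_target.shape)  # (256, 256, 3)
```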

## Load the images to process



Now let's load our target images and, optionally, our source images if we have set the `source_path` directory.

In [None]:

def load_image(path):
    w, h = (256, 256)
    if is_input_pix_to_pix: # In case we are already loading a pix2pix image
        w, h = (512, 256)
    img = io.imread(path)
    img = transform.resize(img, (h, w), anti_aliasing=True)
    # If we are loading a pix2pix dataset just extract the target
    if is_input_pix_to_pix:
        if target_index==0:
            img = img[:,:h,:]
        else:
            img = img[:,h:,:]
    return (img*255).astype(np.uint8)

def load_images_in_path(path):
    # Sort so that source and target files pair up by filename
    files = sorted(glob.glob(path + '/*'))
    images = []
    if max_images:
        n = len(files)
        files = files[:max_images]
        print('%d of %d images'%(len(files), n))
    else:
        print('%d images'%len(files))
    for imgfile in tqdm(files): #, desc='Loading images in ' + path):
        img = load_image(imgfile)
        images.append(img)
    return images

print('Loading targets')
target_images = load_images_in_path(target_path)
if source_path:
    print('Loading sources')
    source_images = load_images_in_path(source_path)


## Defining sources and targets

The next step is to create our source images. Usually this is done by applying a **transformation** to the target. The network will then learn to "translate" between this transformed source and the target. Note that for the pix2pix model to learn something useful, there must be some form of correlation between source and target pairs. The code below has a number of transformations already set up for you. These are:

-   `apply_canny_cv2` Applies Canny edge detection to an image. This uses OpenCV to apply the edge detection filter. You can set two parameters (thresholds between 0 and 255) that will determine the result of the edge detection: `thresh1` and `thresh2`. Experiment with these values to adjust the results to your liking. Additional details can be seen [here](https://docs.opencv.org/4.x/dd/d1a/group__imgproc__feature.html#ga04723e007ed888ddf11d9ba04e2232de).
-   `apply_canny_skimage` Also applies Canny edge detection to an image, but it uses [scikit-image](https://scikit-image.org) for the edge detection, which has different parameters from OpenCV. You can set one parameter, `sigma`, that determines the number of edges. In general, a higher value will produce fewer edges. See [this](https://scikit-image.org/docs/stable/auto_examples/edges/plot_canny.html) for additional details.
-   `apply_face_landmarks` Finds face landmarks in an image by using [face_recognition](https://pypi.org/project/face-recognition/) and uses the Canvas API to draw the landmark polygons. Note that this function will fail if the face detector cannot find a face in the image. The code is set up so the image won't be included in the generated dataset if face detection fails.
-   `apply_nothing` Leaves an image unchanged.
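
Any function with the same contract can be used alongside these: it takes an RGB `uint8` image and returns an image of the same size, or `None` to skip that image. As an illustration only, here is a hypothetical `apply_gradient_edges` that thresholds the brightness gradient with NumPy. It is a much cruder edge detector than Canny and is not part of the notebook.

```python
import numpy as np

def apply_gradient_edges(img, thresh=40):
    """Hypothetical transformation: mark pixels where brightness changes sharply.

    Takes an (H, W, 3) uint8 RGB image and returns an (H, W, 3) uint8 image,
    white on edges and black elsewhere. A plain gradient threshold, not Canny.
    """
    gray = img.astype(np.float32).mean(axis=2)  # simple grayscale conversion
    gy, gx = np.gradient(gray)                  # brightness change along rows, columns
    magnitude = np.sqrt(gx ** 2 + gy ** 2)
    edges = (magnitude > thresh).astype(np.uint8) * 255
    return np.stack([edges] * 3, axis=-1)       # back to three channels

# Synthetic test image: a white square on a black background
demo_img = np.zeros((256, 256, 3), dtype=np.uint8)
demo_img[64:192, 64:192] = 255
demo_edges = apply_gradient_edges(demo_img)
print(demo_edges.shape, demo_edges.dtype)
```

The `thresh` parameter plays a role loosely similar to the Canny thresholds: raising it keeps only the strongest brightness changes.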



In [None]:
def apply_canny_cv2(img, thresh1=160, thresh2=250):
    import cv2
    invert = False
    grayimg = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    edges = cv2.Canny(grayimg, thresh1, thresh2)
    if invert:
        edges = cv2.bitwise_not(edges)
    return cv2.cvtColor(edges, cv2.COLOR_GRAY2RGB)

def apply_canny_skimage(img, sigma=1.5):
    import cv2
    from skimage import feature
    invert = False
    grayimg = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    edges = (feature.canny(grayimg, sigma=sigma)*255).astype(np.uint8)
    if invert:
        edges = cv2.bitwise_not(edges)
    return cv2.cvtColor(edges, cv2.COLOR_GRAY2RGB)

def apply_face_landmarks(img, stroke_weight=2):
    from py5canvas import canvas
    import face_recognition
    
    c = canvas.Canvas(256, 256)
    c.background(0)
    landmarks = face_recognition.face_landmarks(img)

    if not landmarks:
        # print('Failed to find landmarks')
        return None
    c.stroke_weight(stroke_weight)
    c.no_fill()
    c.stroke(255)
    for points in landmarks[0].values():
        c.polyline(points)
    return c.get_image()

def apply_nothing(img):
    return img

# Used to assign the transformations to source or target
def transform_source(func):
    
    def transform_func(index):
        return func(source_images[index])
    return transform_func

def transform_target(func):
    def transform_func(index):
        return func(target_images[index])
    return transform_func



You select how to combine these functions and how to generate the source and target images by assigning two variables, `source_fun` and `target_fun`. We assign a transformation to them via the `transform_source` and `transform_target` functions, which take one of the transformations above (or one you define) as a parameter. For example:
```Python
source_fun = transform_target(apply_canny_skimage)
target_fun = transform_target(apply_nothing)
```
Means:
- Create a new source image by applying `apply_canny_skimage` to the provided target image
- Create a new target image by applying `apply_nothing` to the provided target image (that is leave it unchanged)

Or 
```Python
source_fun = transform_source(apply_canny_skimage)
target_fun = transform_target(apply_nothing)
```
Means:
- Create a new source image by applying `apply_canny_skimage` to the provided **source** image (assumes we specified one)
- Create a new target image by applying `apply_nothing` to the provided target image (that is leave it unchanged)

And so forth...
By default, the code below takes the source images from the face2comics dataset and creates a new "edges2comics" dataset by applying edge detection to the original source images (simply images of faces). The commented section instead finds face landmarks in the target images; you can use it if you decide to build the "landmarks2rembrandt" dataset.
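
To see what the two wrappers do, it may help to look at a toy version that uses strings in place of images. The names `make_transform`, `tag_edges`, `keep`, `toy_sources` and `toy_targets` are hypothetical and exist only for this sketch; `transform_source(func)` behaves like `make_transform(source_images, func)`, and `transform_target(func)` like `make_transform(target_images, func)`.

```python
def make_transform(images, func):
    """Return a function that applies `func` to `images[index]`."""
    def transform_func(index):
        return func(images[index])
    return transform_func

# Strings stand in for images so the result is easy to read
toy_sources = ['face_0', 'face_1']
toy_targets = ['comic_0', 'comic_1']

def tag_edges(img):
    return 'edges(' + img + ')'

def keep(img):
    return img

toy_source_fun = make_transform(toy_sources, tag_edges)  # reads from the sources
toy_target_fun = make_transform(toy_targets, keep)       # leaves the target unchanged
print(toy_source_fun(1), toy_target_fun(1))  # edges(face_1) comic_1
```

The wrapper only fixes which list the image is read from; the transformation itself never needs to know whether it is working on a source or a target.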

In [None]:
source_fun = transform_source(apply_canny_skimage)
target_fun = transform_target(apply_nothing)

# landmarks2rembrandt example
# source_fun = transform_target(apply_face_landmarks)
# target_fun = transform_target(apply_nothing)


# Show an example for the transformation that creates the source, 
# we give the function the index of the image we want to process
index = 1
target = target_images[index]
source = source_fun(index)
plt.figure()
plt.subplot(1, 2, 1)
plt.imshow(source)
plt.subplot(1, 2, 2)
plt.imshow(target)
plt.show()


## Create the dataset!



Once we have defined all these settings, we only need to apply the transformations and save our dataset. We loop through all the input source and target images, apply the transformations, and stitch the new source and target into a single image.
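
The core of that loop, without the file saving, can be sketched as follows. The helper `build_pairs` is hypothetical and the inputs are placeholder arrays; the point is how a failed transformation (a `None` source) is skipped and how `target_index` decides the left/right order.

```python
import numpy as np

def build_pairs(sources, targets, target_index=1):
    """Stitch source/target pairs side by side, skipping failed sources (None)."""
    pairs = []
    for source, target in zip(sources, targets):
        if source is None:  # e.g. no face was found in this image
            continue
        if target_index == 1:
            pairs.append(np.hstack([source, target]))  # target on the right
        else:
            pairs.append(np.hstack([target, source]))  # target on the left
    return pairs

black = np.zeros((256, 256, 3), dtype=np.uint8)
white = np.full((256, 256, 3), 255, dtype=np.uint8)
demo_pairs = build_pairs([black, None, black], [white, white, white])
print(len(demo_pairs), demo_pairs[0].shape)  # 2 (256, 512, 3)
```

Because skipped images leave no gap, the number of saved pairs can be smaller than the number of input images.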

In [None]:

# target_index = 1 # You can redefine this if you wish for example to flip target and source

num_images = max_images 
shuffle = False
image_indices = list(range(len(target_images)))
if shuffle:
    random.shuffle(image_indices)
if num_images != 0:
    image_indices = image_indices[:num_images]

os.makedirs(dataset_path, exist_ok=True)

index = 1
for i in tqdm(image_indices, desc='Saving dataset to ' + dataset_path):
    target = target_fun(i)
    source = source_fun(i)
    if source is None:
        # print('Failed to transform image %d of %d'%(i+1, len(image_indices)))
        continue

    # Concatenate images into one and save
    if target_index==1:
        combined = np.hstack([source, target])
    else:
        combined = np.hstack([target, source])
    io.imsave(os.path.join(dataset_path, '%d.png'%(index)), combined)
    index += 1

# index starts at 1 and is incremented after each save, so the count is index - 1
print("Final dataset contains %d images"%(index - 1))