# Working with Synthesis AI face dataset
First, we add some imports for visualisation.

In [None]:
%matplotlib inline

In [None]:
import matplotlib.pyplot as plt
import cv2
import numpy as np

We use the `FaceApiDataset` class to access Synthesis AI datasets.


In [None]:
from face_api_dataset import FaceApiDataset, Modality

**Warning!** Some modalities require additional libraries to be installed:
the `SEGMENTS` and `RGB` modalities use the `opencv-python` library,
while the `DEPTH`, `ALPHA` and `NORMALS` modalities
use the `tiffile` and `imagecodecs` libraries to work efficiently with floating-point TIFF files.
If a dataset with these modalities is created without the corresponding libraries installed, an `ImportError` is raised.
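Before creating the dataset, it can help to check which of these optional libraries are available. The snippet below is a minimal sketch: the helper `missing_modules` is hypothetical (not part of `face_api_dataset`), and the import names are assumptions based on the libraries listed above (`opencv-python` is imported as `cv2`).

```python
import importlib.util

# Assumed import names for the optional libraries mentioned above.
OPTIONAL_MODULES = ("cv2", "tiffile", "imagecodecs")


def missing_modules(names=OPTIONAL_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means every optional library can be imported.
print(missing_modules())
```

If the list is not empty, installing the missing packages before constructing the dataset should avoid the `ImportError`.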


In [None]:
data_root = "../test_dataset"
dataset = FaceApiDataset(data_root)

The only required parameter is the dataset root. By default, all modalities are loaded.

In [None]:
len(dataset)

There are 16 items in the test dataset. Let's take a closer look at them.

In [None]:
item = dataset[0]
item2 = dataset[1]

Each item is a dict with different modalities as keys.

In [None]:
print(item.keys())

`RENDER_ID` is the id of the image (number used in the stem of the file).

In [None]:
item[Modality.RENDER_ID]

`RGB` modality is the rendered image.

In [None]:
plt.figure(figsize=(20,20))
plt.imshow(item[Modality.RGB])

The `FACE_BBOX` modality is the face bounding box in the format `x0, y0, x1, y1`.

In [None]:
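def plot_bbox(image, face_bbox):
    plt.figure(figsize=(20,20))

    # cv2.rectangle expects the corners as tuples of plain Python ints.
    x0, y0, x1, y1 = (int(v) for v in face_bbox)
    image = cv2.rectangle(image.copy(), (x0, y0), (x1, y1), (255, 0, 0), 2)
    plt.imshow(image)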

In [None]:
plot_bbox(item[Modality.RGB], item[Modality.FACE_BBOX])
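Since the box is given as `x0, y0, x1, y1`, it can also be used to crop the face with NumPy slicing. The following is a minimal sketch on a synthetic array; `crop_bbox` is a hypothetical helper, and it assumes the coordinates are pixel values in image space.

```python
import numpy as np


def crop_bbox(image, bbox):
    """Crop an (H, W, ...) image to a box given as (x0, y0, x1, y1) in pixels."""
    x0, y0, x1, y1 = (int(v) for v in bbox)
    h, w = image.shape[:2]
    # Clamp to the image so that boxes touching the border stay valid.
    x0, x1 = max(0, x0), min(w, x1)
    y0, y1 = max(0, y0), min(h, y1)
    # Rows are indexed by y and columns by x.
    return image[y0:y1, x0:x1]


demo = np.zeros((100, 200, 3), dtype=np.uint8)
print(crop_bbox(demo, (50, 20, 150, 80)).shape)  # (60, 100, 3)
```

With a real item, the call would be `crop_bbox(item[Modality.RGB], item[Modality.FACE_BBOX])`.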

The `SEGMENTS` modality is the segmentation map. We can see how the different segments map to numbers with the `segments`
property of the dataset.

In [None]:
dataset.segments

**Warning!** The `render_id.segments.png` file should not be loaded directly as a segmentation map, because the segment mapping differs
from file to file. Fortunately, the `FaceApiDataset` class solves this issue for us.

The segmentation is an integer NumPy array with the same height and width as the original image.
To display it, we write a simple helper function:

In [None]:
def discrete_show(data):
    cmap = plt.get_cmap('RdBu', np.max(data) - np.min(data) + 1)
    plt.imshow(data, cmap=cmap, vmin=np.min(data) - .5,
               vmax = np.max(data) + .5, interpolation="nearest")

Let's look at segmentation results.

In [None]:
plt.figure(figsize=(20,20))
discrete_show(item[Modality.SEGMENTS])
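Because the segmentation is an integer array, a binary mask for a group of segments can be built with NumPy alone. The sketch below uses a toy mapping and a toy array; `segment_mask` is a hypothetical helper, and it assumes that `dataset.segments` is a dictionary from segment names to integer labels.

```python
import numpy as np


def segment_mask(segmentation, mapping, names):
    """Boolean mask of the pixels whose label belongs to any of the named segments."""
    ids = [mapping[name] for name in names]
    return np.isin(segmentation, ids)


# Toy data standing in for dataset.segments and item[Modality.SEGMENTS].
toy_mapping = {"background": 0, "hair": 1, "nose": 2}
toy_seg = np.array([[0, 1, 1],
                    [2, 2, 0]])

mask = segment_mask(toy_seg, toy_mapping, ["hair", "nose"])
print(mask.sum())  # 4
```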

Different datasets define different segments. If we need to comply with a particular standard, we can define our own
segmentation mapping:

In [None]:
segments = { 'default': 0,
             'background': 0,
             'beard': 1,
             'body': 3,
             'brow': 1,
             'cheek_left': 1,
             'cheek_right': 1,
             'chin': 1,
             'clothing': 3,
             'ear_left': 1,
             'ear_right': 1,
             'eye_left': 1,
             'eye_right': 1,
             'eyelashes': 1,
             'eyelid': 1,
             'eyes': 1,
             'forehead': 1,
             'glasses': 0,
             'hair': 2,
             'head': 1,
             'headphones': 0,
             'headwear': 0,
             'jaw': 1,
             'jowl': 1,
             'lip_lower': 1,
             'lip_upper': 1,
             'mask': 0,
             'mouth': 1,
             'mouthbag': 1,
             'mustache': 1,
             'neck': 3,
             'nose': 1,
             'nose_outer': 1,
             'nostrils': 1,
             'shoulders': 3,
             'smile_line': 1,
             'teeth': 1,
             'temples': 1,
             'tongue': 1,
             'undereye': 1 }

We can provide this mapping during the creation of the dataset to change the segmentation modality output.

In [None]:
plt.figure(figsize=(20,20))
discrete_show(FaceApiDataset(data_root, segments=segments)[0][Modality.SEGMENTS])
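Conceptually, changing the mapping amounts to relabelling each pixel through a lookup table. The snippet below is only an illustration of that idea on toy data, not the way `FaceApiDataset` implements it; `remap_labels` and both toy mappings are hypothetical.

```python
import numpy as np


def remap_labels(segmentation, old_mapping, new_mapping, default=0):
    """Relabel an integer segmentation from old_mapping to new_mapping by segment name."""
    # lut[old_id] holds the new label; names missing from new_mapping get `default`.
    lut = np.full(max(old_mapping.values()) + 1, default, dtype=np.int64)
    for name, old_id in old_mapping.items():
        lut[old_id] = new_mapping.get(name, default)
    # Integer-array indexing applies the table to every pixel at once.
    return lut[segmentation]


old = {"background": 0, "hair": 1, "nose": 2, "neck": 3}
new = {"background": 0, "hair": 2, "nose": 1}  # "neck" falls back to the default
seg = np.array([[0, 1],
                [2, 3]])
print(remap_labels(seg, old, new))
```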

The normals modality is a 3-channel NumPy array with the values of each channel ranging from `-1.0` to `1.0`.

In [None]:
plt.figure(figsize=(20,20))
plt.imshow(((item[Modality.NORMALS] + 1) / 2 * 255).astype(np.uint8))
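The display code maps the range `[-1, 1]` to `[0, 255]`. A minimal sketch of that encoding and its approximate inverse is shown below; both helper names are hypothetical, and the sketch rounds to the nearest integer instead of truncating as the cell above does.

```python
import numpy as np


def normals_to_rgb(normals):
    """Map components in [-1, 1] to 8-bit values in [0, 255]."""
    return np.round((normals + 1.0) / 2.0 * 255.0).astype(np.uint8)


def rgb_to_normals(rgb):
    """Approximate inverse of normals_to_rgb (quantisation loses some precision)."""
    return rgb.astype(np.float32) / 255.0 * 2.0 - 1.0


n = np.array([[[-1.0, 0.0, 1.0]]], dtype=np.float32)
encoded = normals_to_rgb(n)
print(encoded)
print(rgb_to_normals(encoded))
```

The round trip is exact only up to the 8-bit quantisation step, so the original floating-point array should be kept for any numerical work.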

The alpha channel is a grayscale image, useful, for example, for matting.
Unlike the segmentation, it can represent semi-transparent parts of the face (e.g. hair).

In [None]:
plt.figure(figsize=(20,20))
plt.imshow((item[Modality.ALPHA]).astype(np.uint8), cmap="gray")
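As an illustration of matting, the alpha channel can be used to place the rendered face on a new background. This is a minimal sketch on toy arrays; `composite` is a hypothetical helper, and it assumes that alpha is stored in the range 0 to 255 (the cast to `uint8` above suggests this, but it is an assumption).

```python
import numpy as np


def composite(rgb, alpha, background):
    """Blend rgb over background; alpha is assumed to lie in the 0..255 range."""
    a = alpha.astype(np.float32) / 255.0
    if a.ndim == 2:
        a = a[..., None]  # add a channel axis so it broadcasts over the colours
    out = a * rgb.astype(np.float32) + (1.0 - a) * background.astype(np.float32)
    return np.clip(np.round(out), 0, 255).astype(np.uint8)


fg = np.full((1, 2, 3), 200, dtype=np.uint8)
bg = np.zeros((1, 2, 3), dtype=np.uint8)
alpha = np.array([[0, 255]], dtype=np.uint8)
print(composite(fg, alpha, bg))
```

Pixels with alpha `0` take the background colour, pixels with alpha `255` keep the rendered colour, and intermediate values (such as hair strands) are mixed.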

The depth modality is an array of non-negative floats. The background has depth `0`,
and for the rest of the image the value is the distance from the camera in centimeters.

We write a simple helper function and display it.

In [None]:
def depth_show(img, shift=0.1):
    eps = 0.003
    d_min = img[img > eps].min()
    d_max = img[img > eps].max()
    d_img = np.copy(img)
    d_img[d_img < eps] = 0
    d_img[d_img > eps] = (d_img[d_img > eps] - d_min) / (d_max - d_min) *  (1 - shift) + shift
    plt.imshow((d_img * 255).astype(np.uint8), cmap="gray_r")

In [None]:
plt.figure(figsize=(20,20))
depth_show(item[Modality.DEPTH])
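Since the background has zero depth, the depth map also gives a foreground mask and the range of distances covered by the subject. A minimal sketch on a toy array follows; `depth_summary` is a hypothetical helper, and the threshold `eps` is borrowed from the display helper above.

```python
import numpy as np


def depth_summary(depth, eps=0.003):
    """Return the foreground mask and the (min, max) foreground depth, or None if empty."""
    foreground = depth > eps
    if not foreground.any():
        return foreground, None
    values = depth[foreground]
    return foreground, (float(values.min()), float(values.max()))


toy_depth = np.array([[0.0, 80.0],
                      [95.5, 0.0]])
fg_mask, depth_range = depth_summary(toy_depth)
print(fg_mask.sum(), depth_range)  # 2 (80.0, 95.5)
```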

Landmarks are in the iBUG format. Each of the 68 landmarks is represented by its `x` and `y` coordinates in image space,
with the `y` coordinate increasing from top to bottom.

In [None]:
def landmark_show(img, landmarks, radius=2, labels=True):
    l_img = np.copy(img)
    # Each landmark is an (x, y) pair in image coordinates.
    for x, y in landmarks:
        int_p = (int(x), int(y))
        cv2.circle(l_img, int_p, radius=radius, color=(255, 0, 0), thickness=cv2.FILLED)
    plt.imshow(l_img)

In [None]:
plt.figure(figsize=(20,20))
landmark_show(item[Modality.RGB], item[Modality.LANDMARKS_IBUG68])
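Because the landmarks are plain `x`, `y` coordinates, simple geometry can be computed with NumPy. The sketch below derives a tight bounding box around a set of points; `landmarks_bbox` is a hypothetical helper, and it assumes the landmarks can be converted to an array of shape `(N, 2)`.

```python
import numpy as np


def landmarks_bbox(landmarks):
    """Return (x0, y0, x1, y1) of the tightest axis-aligned box around the points."""
    pts = np.asarray(landmarks, dtype=np.float64).reshape(-1, 2)
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    return float(x0), float(y0), float(x1), float(y1)


toy_landmarks = [(10, 20), (30, 5), (25, 40)]
print(landmarks_bbox(toy_landmarks))  # (10.0, 5.0, 30.0, 40.0)
```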

**Warning!** Contour landmarks are slightly different from iBUG.
In our dataset, instead of following the visible contour of the face, they have a fixed position on the face.

This looks a bit strange on rotated images, but these landmarks are more useful for several tasks,
such as facial pose retrieval and special effects.

In [None]:
plt.figure(figsize=(20,20))
landmark_show(item[Modality.RGB], item[Modality.LANDMARKS_CONTOUR_IBUG68])

iBUG does not provide landmarks for pupils, but they can be important in many tasks.
We therefore provide two additional landmarks for them:

In [None]:
item[Modality.PUPILS]

In [None]:
plt.figure(figsize=(20,20))
landmark_show(item[Modality.RGB], item[Modality.PUPILS])

Gaze direction can be important in many tasks. We provide the gaze direction in the
format `(horizontal_angle, vertical_angle)`.

In [None]:
item[Modality.GAZE]

In [None]:
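def gaze_show(img, pupils, gaze, length=100):
    g_img = np.copy(img)
    # Use signed integers: an arrow end may fall outside the image,
    # and negative coordinates would wrap around in an unsigned type.
    start = pupils.astype(np.int32)
    end = (pupils + np.sin(gaze * np.pi / 180.) * length).astype(np.int32)
    for eye in [0, 1]:
        # OpenCV expects points as tuples of plain Python ints.
        cv2.arrowedLine(g_img, tuple(map(int, start[eye])), tuple(map(int, end[eye])),
                        color=(255, 0, 0), thickness=2, tipLength=1)
    plt.imshow(g_img)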

In [None]:
plt.figure(figsize=(20,20))
gaze_show(item[Modality.RGB], item[Modality.PUPILS], item[Modality.GAZE])

In [None]:
plt.figure(figsize=(20,20))
gaze_show(item2[Modality.RGB], item2[Modality.PUPILS], item2[Modality.GAZE])

In addition, there are modalities that represent different kinds of metadata:

Identity (for face_id tasks)

In [None]:
item[Modality.IDENTITY]

Identity metadata, such as age or gender.

In [None]:
item[Modality.IDENTITY_METADATA]

Information about hairstyle.

In [None]:
item[Modality.HAIR]

Information about facial hair.

In [None]:
item2[Modality.FACIAL_HAIR]

If the corresponding attribute is not present in the image, the modality is `None`.

In [None]:
item[Modality.FACIAL_HAIR]

Expression modality shows facial expression and its intensity.

In [None]:
item[Modality.EXPRESSION]

Usually not all modalities are needed, so we can load only selected modalities in the dataset.

In [None]:
dataset2 = FaceApiDataset(data_root, modalities=[Modality.RGB, Modality.SEGMENTS], segments=segments)

In [None]:
dataset2[0].keys()

The number of synthetic images is usually not a problem,
but augmentations are still useful because they help to bridge the reality gap.
We can pass a transformation to the dataset constructor to implement the augmentations we need.

Below we show how to use the `albumentations` library with a Synthesis AI dataset for the segmentation task.

In [None]:
import albumentations as A

In [None]:
aug = A.Sequential([A.RandomRotate90(p=1), A.GridDistortion(p=1)])

def transform(item):
    augmented = aug(image=item[Modality.RGB], mask=item[Modality.SEGMENTS])
    return {
        Modality.RGB: augmented['image'],
        Modality.SEGMENTS: augmented['mask']
    }

dataset3 = FaceApiDataset(data_root, modalities=[Modality.RGB, Modality.SEGMENTS],
                          segments=segments, transform=transform)

aug_item = dataset3[0]
aug_item2 = dataset3[0]

In [None]:
plt.figure(figsize=(20,20))
plt.imshow(aug_item[Modality.RGB])


In [None]:
plt.figure(figsize=(20,20))
discrete_show(aug_item[Modality.SEGMENTS])


In [None]:
plt.figure(figsize=(20,20))
plt.imshow(aug_item2[Modality.RGB])

In [None]:
plt.figure(figsize=(20,20))
discrete_show(aug_item2[Modality.SEGMENTS])
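A transformation does not have to come from `albumentations`: as in the example above, it is a function that takes an item dict and returns a dict. The sketch below builds a horizontal flip with plain NumPy that is applied to the image and the mask together. `make_flip_transform` is a hypothetical helper, and string keys are used here only to keep the example self-contained.

```python
import numpy as np


def make_flip_transform(image_key, mask_key):
    """Build a transform that mirrors the image and its mask left to right."""
    def transform(item):
        return {
            # Reverse the column axis; the copy makes the result contiguous.
            image_key: np.ascontiguousarray(item[image_key][:, ::-1]),
            mask_key: np.ascontiguousarray(item[mask_key][:, ::-1]),
        }
    return transform


flip = make_flip_transform("rgb", "segments")
toy_item = {
    "rgb": np.arange(12).reshape(2, 2, 3),
    "segments": np.array([[1, 2], [3, 4]]),
}
flipped = flip(toy_item)
print(flipped["segments"])
```

With the real dataset, the keys would be `Modality.RGB` and `Modality.SEGMENTS`. Note that mirroring swaps the left and right sides of the face, so if the mapping assigns different numbers to segments such as `ear_left` and `ear_right`, those labels would have to be swapped as well.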
