# Introduction
In this era of rapidly advancing digital technologies, the intersection of privacy and technological innovation has become a focal point of concern. Our project, "Incognito Face," conceived and developed as part of the Medienverarbeitung 2023/2024 curriculum, addresses these concerns by developing filters that disrupt automated face detection and recognition. This introduction outlines the objective, approach, and structure of our project and explains its division into two primary components: Face Detection (FD) and Face Recognition (FR).

#### Project Objective
Our primary objective is to implement undetectable filters that significantly hinder state-of-the-art face detection and recognition technologies. We aim to achieve this by increasing the false-positive rates of these technologies without changing how the image appears to a human viewer. In essence, our goal is to enhance online privacy by reducing the accuracy of automated facial recognition systems, particularly in scenarios where individuals share images on digital platforms.

#### Approach and Methodology
Our approach involves a user-centric design where individuals can upload custom photos and apply a selection of our implemented filters. They can then test how effectively each filter disrupts face detection and recognition technologies. We have incorporated a range of filters: filters that prevent face detection, filters that increase false positives, and artistic filters. All of them were developed with the intention of preserving the original image's aesthetic while impairing automated recognition systems.

#### Project Structure
**Face Detection (FD)**

The FD component of our project focuses on developing filters that prevent state-of-the-art face detection algorithms from detecting faces. This involves understanding and experimenting with algorithms such as Viola-Jones, Histogram of Oriented Gradients (HOG) + Support Vector Machine (SVM), MTCNN, SSD, and CNN. Our efforts are geared towards artistic modifications that either fool these detection algorithms or create the illusion of additional faces to cause false positives.

**Face Recognition (FR)**

In the FR segment, we extend our work to challenge face recognition systems. This part of the project is dedicated to implementing filters that can effectively mask or morph key facial features in a manner that causes recognition algorithms to fail or misidentify subjects.

# Face Detection

This section first introduces the state-of-the-art face detection algorithms. Afterwards, it investigates the filters that prevent the algorithms from detecting faces, the filters that increase the false-positive rate of face detection, and the artistic filters.

## Face Detection Algorithms

### Viola Jones

The Viola-Jones face detection algorithm is a widely used and efficient method for detecting faces in images.
It was proposed by Paul Viola and Michael Jones in their 2001 paper, "Rapid Object Detection using a Boosted Cascade of Simple Features."

The Viola-Jones algorithm employs a machine learning approach, specifically a variant of the AdaBoost algorithm, to train a cascade of classifiers for face detection. The training process involves selecting a set of Haar-like features, which are simple rectangular patterns that can be computed quickly. These features capture local intensity variations in the image.

In the following, we will give a brief overview of the steps in Viola-Jones.

#### Step 1 - Selecting Haar-like features

Haar-like features are essential building blocks in the Viola-Jones face detection algorithm,
capturing distinctive patterns in faces. These features are rectangular and can take various forms,
such as edges, lines, or rectangles with different orientations.

For example, a Haar-like feature might capture the contrast between the eyes and the nose. The choice
of these features is crucial as they serve as the basis for distinguishing between positive (faces) and
negative (non-faces) examples during the training phase.

Here's a simple example image illustrating a Haar-like feature capturing the vertical contrast
between the left and right sides of a face:

<div>
<img src="images/haar-like-features.png" width="400"/>
</div>

#### Step 2 - Creating an integral image

To efficiently compute Haar-like features, the Viola-Jones algorithm uses an integral image. The integral
image is a transformed version of the original image, where each pixel represents the cumulative sum of
all pixels above and to the left of it.

<div>
<img src="images/integral-image.png" width="400"/>
</div>

The integral image enables rapid calculation of the sum of pixel values within any rectangular region,
which is essential for evaluating Haar-like features in constant time.
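The constant-time lookup described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used inside OpenCV: the function names `integral_image`, `rect_sum`, and `two_rect_feature` are hypothetical, the 3x4 test image is made up, and the integral image is padded with one extra row and column of zeros (an assumption that keeps the four-corner lookup free of special cases).

```python
def integral_image(img):
    """Return ii where ii[y][x] is the sum of all img pixels above and left of (x, y).

    ii has one extra row and column of zeros so that lookups at the border work.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w x h rectangle with top-left pixel (x, y), using four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: left half minus right half (w must be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
ii = integral_image(image)
total = rect_sum(ii, 0, 0, 4, 3)            # whole image: 78
centre = rect_sum(ii, 1, 1, 2, 2)           # 6 + 7 + 10 + 11 = 34
feature = two_rect_feature(ii, 0, 0, 4, 3)  # 33 - 45 = -12
```

Regardless of the rectangle size, `rect_sum` touches exactly four entries of the integral image, which is why a Haar-like feature made of two or three rectangles costs only a handful of additions.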

#### Step 3 - Running AdaBoost training

AdaBoost is a machine learning algorithm employed by the Viola-Jones face detection method to create
a robust and accurate classifier. In this context, the weak classifiers are decision stumps based on
Haar-like features.

The AdaBoost training process involves iteratively selecting the best weak classifiers while assigning
higher weights to misclassified examples from the previous iteration. This iterative process continues
until a predefined number of weak classifiers are trained.

Consider an example image dataset with positive examples (faces) and negative examples (non-faces).
During AdaBoost training, the algorithm learns to focus on the features that effectively discriminate
between the two classes, building a strong classifier that is adept at face detection.
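The re-weighting loop can be sketched as follows. This is a simplified, textbook-style AdaBoost with decision stumps in plain Python, not the training code behind OpenCV's cascades. The function names, the tiny dataset, and the two "feature responses" per sample are made up for illustration; labels are +1 for faces and -1 for non-faces.

```python
import math


def stump_predict(row, feature_index, threshold, polarity):
    """Decision stump: vote +1 (face) if the feature lies on the chosen side of the threshold."""
    return 1 if polarity * row[feature_index] < polarity * threshold else -1


def best_stump(features, labels, weights):
    """Exhaustively pick the stump with the lowest weighted classification error."""
    best = None
    for j in range(len(features[0])):
        for threshold in sorted({row[j] for row in features}):
            for polarity in (1, -1):
                err = sum(w for w, row, y in zip(weights, features, labels)
                          if stump_predict(row, j, threshold, polarity) != y)
                if best is None or err < best[0]:
                    best = (err, j, threshold, polarity)
    return best


def adaboost(features, labels, rounds):
    n = len(labels)
    weights = [1.0 / n] * n                      # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        err, j, threshold, polarity = best_stump(features, labels, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # avoid division by zero / log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote strength of this stump
        ensemble.append((alpha, j, threshold, polarity))
        # Misclassified samples gain weight, correctly classified ones lose weight
        weights = [w * math.exp(-alpha * y * stump_predict(row, j, threshold, polarity))
                   for w, row, y in zip(weights, features, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def strong_classify(ensemble, row):
    score = sum(alpha * stump_predict(row, j, t, p) for alpha, j, t, p in ensemble)
    return 1 if score >= 0 else -1


# Made-up responses of two Haar-like features for three faces and three non-faces
X = [[-12, 3], [-10, 1], [-9, 4], [2, 2], [5, 3], [1, 0]]
y = [1, 1, 1, -1, -1, -1]
model = adaboost(X, y, rounds=3)
predictions = [strong_classify(model, row) for row in X]
```

In this toy dataset the first feature alone separates the classes, so every round selects a stump on that feature. On real data, the re-weighting step pushes later rounds towards features that handle the samples earlier stumps got wrong.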

#### Step 4 - Creating classifier cascades

The trained AdaBoost classifier is organized into a cascade of stages in the Viola-Jones algorithm.
Each stage consists of multiple weak classifiers applied sequentially. The cascade structure allows
for the rapid rejection of non-face regions, contributing to the algorithm's efficiency.

<div>
<img src="images/cascade-classifier.png" width="900"/>
</div>

The cascade is constructed in such a way that a region of the image must pass every stage
to be considered a potential face region. As soon as a region fails a stage, it is rejected
and the remaining stages are skipped, which saves computational resources because most regions
of an image contain no face. This cascade structure enhances the Viola-Jones algorithm's speed,
making it well-suited for real-time face detection applications.
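The early-rejection logic can be illustrated with a short sketch. It assumes, as in the original Viola-Jones paper, that each stage is a small boosted classifier whose weighted stump votes are compared against a stage threshold. The stages, thresholds, and feature values below are invented for the example, and `stage_passes` and `cascade_classify` are hypothetical helper names.

```python
def stage_passes(stage, feature_values):
    """One stage: sum the weights of all stumps that vote 'face' and compare to the stage threshold."""
    stumps, stage_threshold = stage
    score = 0.0
    for weight, feature_index, threshold in stumps:
        if feature_values[feature_index] < threshold:
            score += weight
    return score >= stage_threshold


def cascade_classify(stages, feature_values):
    """Return (is_face, number_of_stages_evaluated); stop at the first failing stage."""
    for evaluated, stage in enumerate(stages, start=1):
        if not stage_passes(stage, feature_values):
            return False, evaluated
    return True, len(stages)


# Two made-up stages: a cheap one with a single stump, then a stricter one with two stumps
stages = [
    ([(1.0, 0, 0.0)], 1.0),
    ([(0.6, 1, 5.0), (0.5, 2, 2.0)], 1.0),
]

face_like = cascade_classify(stages, [-12.0, 3.0, 1.0])      # passes both stages
background = cascade_classify(stages, [4.0, 3.0, 1.0])       # rejected by stage 1
hard_negative = cascade_classify(stages, [-3.0, 9.0, 1.0])   # rejected by stage 2
```

The second element of each result shows how much work was spent: the background window costs a single stump evaluation, while only windows that look face-like reach the more expensive later stages.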

#### Practical Applications
Our Python implementation of Viola-Jones uses the following code in the backend:


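In [9]:
import cv2
import numpy
from PIL import Image

# viola_jones_detector is created elsewhere in the backend: a cv2.CascadeClassifier
# loaded with a pre-trained Haar cascade for frontal faces.

def highlight_face_viola_jones(img: Image):
    # PIL delivers RGB, OpenCV works on BGR; detection runs on the grayscale image
    img = cv2.cvtColor(numpy.array(img), cv2.COLOR_RGB2BGR)
    gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = viola_jones_detector.detectMultiScale(gray_image, scaleFactor=1.1, minNeighbors=5, minSize=(40, 40))

    # Outline every detected face in blue (the color tuple is in BGR order)
    for (x, y, w, h) in faces:
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 6)

    # Viola-Jones reports no confidence value, hence the '?' placeholder
    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    return Image.fromarray(img_rgb), len(faces), '?'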


This implementation of Viola-Jones face detection processes a given PIL Image object. It converts the image to BGR format, then to grayscale. Using a pre-trained Haar cascade classifier for frontal faces, it detects faces in the grayscale image. This pre-trained classifier is already provided by the library, so we do not need to train or create our own. Detected faces are outlined with blue rectangles, and the modified image is converted back to RGB format before being returned. The algorithm provides a visual representation of the input image with highlighted face regions.

Here is an example output of the algorithm:

<div>
<img src="images/detected-faces-examples/detected_face_viola_jones.png" width="400"/>
</div>

### HOG and SVM
Another approach to detecting faces is to use a Histogram of Oriented Gradients (HOG) in combination with a support vector machine (SVM) as the classifier.
HOG is a feature descriptor that was published by Dalal and Triggs and is commonly used in image processing. The algorithm typically consists of the following steps:
1. Image Preprocessing
2. Calculate Gradient
3. Create Histogram of Oriented Gradients
4. Normalise Histogram Vectors

In the following, we will give a brief overview of the steps in HOG.

#### Step 1 - Image Preprocessing

HOG (Histogram of Oriented Gradients) is only used on the part of the image that is relevant for examining a particular subject (a face in our case). For this purpose, it is necessary to first crop this part of the image. For the subsequent calculation of gradients in blocks, the cropped image is resized to a width-to-height ratio of 1:2. In the publication by Dalal and Triggs, 64:128 was chosen because it provided enough information for pedestrian recognition, which was the primary focus of the publication. This image is then divided into blocks of size 8x8, as features are extracted from blocks of pixels rather than individual pixels. Graphically, one can envision this as an 8x16 grid of 8x8 blocks drawn on the image.

#### Step 2 - Calculate Gradient

Since edges represent the boundaries between regions in the image with a significant change in intensity, they are essential for determining the contours of an image. The contours often suffice to classify the objects in the image (a face in our case). To determine the edges, the gradient vector is used because it points in the direction of the greatest local change in intensity, and its magnitude represents the extent of the change. Mathematically, the gradient vector of a 2-dimensional image consists of the partial derivatives in the x and y directions. Since images in computers are represented by discrete intensity values rather than continuous ones, the change in the x and y directions is approximated as follows:
Let $I$ be a function that takes as input the x and y positions of a pixel in the image and outputs the intensity (between 0 and 255). Then, the partial derivative in the x-direction and hence the gradient component $G_x$ is calculated as follows: $G_x = I(x+1, y) - I(x-1, y)$. Similarly, the partial derivative in the y-direction and hence the gradient component $G_y$ is calculated as follows: $G_y = I(x, y+1) - I(x, y-1)$. Thus, the changes in intensity are calculated by considering the horizontal and vertical neighbors of a pixel.
Consider the following image as an example for the pixel with intensity 60, where only 4 out of the 64 values of the 8x8 block are displayed:

![Alt text](images/Hog8x8Grid.png)

For the pixel 60 on the image, the gradient in x-direction will be:
$$G_x = I(x+1, y) - I(x-1, y) = 70 - 40 = 30$$

and for the y-direction:
$$G_y = I(x, y+1) - I(x, y-1) = 70 - 20 = 50$$

Using the gradient in x and y direction, the magnitude and direction of the gradient vector will be calculated using:
$$\text{magnitude} = \sqrt{G_x^2 + G_y^2}$$
$$\text{direction} = \arctan\left(\frac{G_y}{G_x}\right)$$

Here, it should be noted that arctan has a range of values from -90 to 90 degrees, which does not cover a full circle of 360 degrees. In practice, the function arctan2 is often used, which has a range of values from -180 degrees to 180 degrees, thus allowing a bijective mapping to 0-360 degrees. For the example, the calculation looks as follows:
$$\sqrt{G_x^2 + G_y^2} = \sqrt{30^2 + 50^2} \approx 58.31 $$

and the direction would be:
$$\arctan\left(\frac{50}{30}\right) \approx 59.04^\circ $$

This calculation is performed for each pixel in the 8x8 grid, resulting in an 8x8 matrix for the magnitude and an 8x8 matrix for the direction of the gradient vectors. Pixels at the border of the image lack some neighbors and need special treatment (e.g., by padding the image). If the image has colors, the calculation is performed for each color channel of a pixel, and the gradient vector with the greatest magnitude is selected from the color channels. The direction of the selected vector is then assigned to the 8x8 direction matrix, and the magnitude of the selected vector is assigned to the 8x8 magnitude matrix for this pixel.
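The worked example can be reproduced with a few lines of Python. This is a minimal sketch for a single interior pixel: the helper name `gradient_at` is hypothetical, the image is indexed as `img[y][x]` with y growing downwards, and the four corner values of the 3x3 patch are placeholders set to 0 because they do not enter the calculation.

```python
import math


def gradient_at(img, x, y):
    """Central-difference gradient of an interior pixel; img is indexed as img[y][x]."""
    gx = img[y][x + 1] - img[y][x - 1]
    gy = img[y + 1][x] - img[y - 1][x]
    magnitude = math.hypot(gx, gy)
    # atan2 covers the full circle; the modulo maps (-180, 180] to [0, 360)
    direction = math.degrees(math.atan2(gy, gx)) % 360
    return gx, gy, magnitude, direction


# Neighborhood of the pixel with intensity 60 from the example above
patch = [
    [0, 20, 0],
    [40, 60, 70],
    [0, 70, 0],
]
gx, gy, magnitude, direction = gradient_at(patch, 1, 1)
# gx = 30, gy = 50, magnitude is about 58.31, direction is about 59.04 degrees
```

Using `math.atan2` instead of `math.atan` also avoids the division by zero that would occur for pixels with $G_x = 0$.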

#### Step 3 - Create Histogram of Oriented Gradients

The next step involves creating histograms from the 8x8 matrices of magnitude and direction for all 8x8 blocks obtained in Step 2.

<div>
<img src="images/HOGDia.png" width="500"/>
</div>

On the x-axis are the various directions of the gradient vectors of the respective pixels within an 8x8 block, and on the y-axis is the sum of the magnitude of the gradient vectors for each direction. Usually, only directions between 0-180 degrees are considered, and anything beyond is reduced to this interval due to the symmetry of the gradient. The symmetry of the gradient implies that a strong change in intensity within the range of 180-360 degrees only differs in sign from a strong change in intensity within the range of 0-180 degrees. This means that angles greater than 180 degrees can be brought into the interval between 0-180 degrees by subtracting 180 degrees beforehand without losing important information. The 180 degrees are divided into 9 different bins (0, 20, 40, 60, 80, 100, 120, 140, 160) on the x-axis, and the calculation of the magnitude for these bins is as follows:

Case 1) Precise Allocation Possible
If precise allocation into a bin is possible (e.g., if a pixel has a magnitude of 50 and a direction of 20 degrees), then 50 is added to the sum of the 20-degree bin.

Case 2) Precise Allocation Not Possible
If precise allocation into a bin is not possible (e.g., if a pixel has a magnitude of 50 and a direction of 30 degrees), then the proximity of the direction to the two bins between which it lies (here 20 and 40) is taken as a weight (here, $\frac{1}{2}$ each, as 30 is exactly between 20 and 40). The weight is multiplied by the magnitude and added to the sum of the respective bin. In this example, $\frac{1}{2} \cdot 50 = 25$ is added to the sum of both the 20-degree and 40-degree bins.

Case 3) Angle Between 160 and 180 Degrees:
This case works like Case 2), except that the upper neighbor of the 160-degree bin would be 180 degrees, which is the same direction as 0 degrees due to symmetry. The result of $\text{weight for 160} \cdot \text{magnitude of the vector}$ is added to the sum of the 160-degree bin as in Case 2), while the result of $\text{weight for 180} \cdot \text{magnitude of the vector}$ is added to the sum of the 0-degree bin.

When performing this calculation for each pixel of the 8x8 block, the resulting output is the histogram. This histogram can be transformed into a 9x1 vector containing the weighted sum of magnitudes as entries. For an image with dimensions of 64x128, divided into an 8x16 grid of 8x8 blocks, there would then be $8 \cdot 16 = 128$ such 9x1 vectors.
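The three cases can be handled by one short routine. The sketch below is an illustration in plain Python, not the code of a particular HOG library. The function name `cell_histogram` is hypothetical, and the input is a tiny 1x3 "block" with made-up values so that each case appears exactly once.

```python
def cell_histogram(magnitudes, directions, bin_width=20, n_bins=9):
    """Distribute each gradient magnitude over the two nearest direction bins."""
    hist = [0.0] * n_bins
    for mag_row, dir_row in zip(magnitudes, directions):
        for mag, angle in zip(mag_row, dir_row):
            angle %= 180.0                        # fold 180-360 degrees onto 0-180
            lower = int(angle // bin_width)       # index of the bin at or below the angle
            upper = (lower + 1) % n_bins          # bin above; wraps from 160 back to 0
            upper_weight = (angle - lower * bin_width) / bin_width
            hist[lower] += mag * (1.0 - upper_weight)
            hist[upper] += mag * upper_weight
    return hist


magnitudes = [[50, 50, 40]]
directions = [[20, 30, 170]]   # Case 1, Case 2, Case 3
hist = cell_histogram(magnitudes, directions)
# Bins:  0     20    40    60   80   100  120  140  160
# hist = [20.0, 75.0, 25.0, 0.0, 0.0, 0.0, 0.0, 0.0, 20.0]
```

The pixel at 20 degrees contributes its full magnitude to the 20-degree bin, the pixel at 30 degrees is split evenly between 20 and 40, and the pixel at 170 degrees is split between the 160-degree bin and the 0-degree bin.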

#### Step 4 - Normalise Histogram Vectors

The gradient of an image is sensitive to the overall illumination of the image. When darkening the image (e.g., by halving the intensity values), the length of the gradient vector also halves, resulting in the values in the histogram being halved as well. However, a face should not have different features with half the intensity, which is why the vector needs to be normalized. For normalization, Dalal and Triggs tested various methods. A typical method frequently used for HOG nowadays constructs a 16x16 block from four 8x8 blocks and combines the information into a 36x1 vector (four 9x1 vectors). This vector with 36 entries ($v_1$ to $v_{36}$) is normalized using the L2-norm:

$$\text{magnitude} = \sqrt{v_1^2 + v_2^2 + \dots + v_{36}^2}$$
$$\text{normalised vector} = \left[\frac{v_1}{\text{magnitude}}, \frac{v_2}{\text{magnitude}}, \dots, \frac{v_{36}}{\text{magnitude}}\right]$$

To extract information from the entire 64x128 image, which is divided into an 8x16 grid of 8x8 blocks, the 16x16 block is first placed at the top left of the image. It is then moved from left to right in steps of one 8x8 block (8 pixels), so that consecutive positions overlap by half. Once a row is completed, the block moves down by one 8x8 block and the process repeats until the whole image has been traversed (similar to a sliding window). The 16x16 block fits into 7 positions per row and 15 positions per column, which gives $7 \cdot 15 = 105$ positions, each yielding a 36x1 vector. In total, $7 \cdot 15 \cdot 36 = 3780$ entries are obtained, which are concatenated into a 3780x1 vector and passed on to a classifier (e.g., a Support Vector Machine (SVM)). Before it is passed to an SVM, this vector may be reduced in dimensionality (to prevent overfitting), for instance with PCA (Principal Component Analysis). However, this is not explained in this article.
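The block normalization and the resulting descriptor length can be checked with a small sketch. It assumes the histograms of all 8x8 blocks are already available as a 16-row by 8-column grid of 9-entry lists. The function names are hypothetical, the histogram values are dummies, and the small `eps` term is an added safeguard against division by zero that the formula above does not contain.

```python
import math


def l2_normalise(vector, eps=1e-12):
    """Divide a vector by its L2 norm (eps avoids division by zero for empty regions)."""
    norm = math.sqrt(sum(v * v for v in vector)) + eps
    return [v / norm for v in vector]


def hog_descriptor(histograms):
    """Slide a 2x2 window over the grid of 9-bin histograms and concatenate the normalised blocks."""
    rows, cols = len(histograms), len(histograms[0])
    descriptor = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            block = (histograms[r][c] + histograms[r][c + 1]
                     + histograms[r + 1][c] + histograms[r + 1][c + 1])   # 36 values
            descriptor.extend(l2_normalise(block))
    return descriptor


# Dummy histograms for a 64x128 image: 16 rows and 8 columns of 8x8 blocks
histograms = [[[1.0] * 9 for _ in range(8)] for _ in range(16)]
descriptor = hog_descriptor(histograms)
descriptor_length = len(descriptor)        # 15 * 7 * 36 = 3780

# Halving all intensities halves the histogram, but the normalised vector stays the same
bright = l2_normalise([3.0, 4.0])
dark = l2_normalise([1.5, 2.0])
```

The last two lines illustrate the motivation for this step: `bright` and `dark` agree up to floating-point error, so a uniformly darkened image produces essentially the same features.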

#### SVM with HOG features

The resulting vector from the HOG algorithm, potentially reduced using PCA, is often fed to an SVM. The SVM tries to find a hyperplane that best separates the datapoints of different classes in a high-dimensional space. On a basic level, the datapoints can be classified into images that contain a face (positive samples) and images that don't contain a face (negative samples). The HOG features extracted from negative and positive samples can then be used to train the SVM so that it learns to distinguish between images that contain faces and ones that don't. Additionally, a trained SVM can be applied in a sliding window that analyses one small, fixed-size part of the image at a time to determine whether this part contains a face or not. This makes it possible not only to classify whether an image contains a face, but also to locate the faces within the image.
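The sliding-window idea can be sketched without any library. The example below is a toy under loud assumptions: the "trained SVM" is a hand-picked weight vector and bias, and the feature extractor is the mean intensity of the window, a stand-in for real HOG features. All names (`linear_svm_score`, `sliding_window_detect`, `mean_intensity`) are hypothetical.

```python
def linear_svm_score(weights, bias, features):
    """Decision function of a linear SVM: positive scores lie on the 'face' side of the hyperplane."""
    return sum(w * f for w, f in zip(weights, features)) + bias


def sliding_window_detect(image, window, stride, extract_features, weights, bias):
    """Score every window position and return the boxes (x, y, w, h) classified as faces."""
    win_h, win_w = window
    detections = []
    for top in range(0, len(image) - win_h + 1, stride):
        for left in range(0, len(image[0]) - win_w + 1, stride):
            patch = [row[left:left + win_w] for row in image[top:top + win_h]]
            if linear_svm_score(weights, bias, extract_features(patch)) > 0:
                detections.append((left, top, win_w, win_h))
    return detections


def mean_intensity(patch):
    """Stand-in for a HOG descriptor: a single feature, the mean pixel value."""
    values = [v for row in patch for v in row]
    return [sum(values) / len(values)]


# Toy image: a bright 2x2 "face" on a dark background
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 255, 255, 0, 0],
    [0, 0, 255, 255, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
detections = sliding_window_detect(image, window=(2, 2), stride=1,
                                   extract_features=mean_intensity,
                                   weights=[1.0], bias=-200.0)
```

Only the window that covers the bright square completely scores above zero. In a real detector, the image is additionally scanned at several scales and overlapping detections are merged.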

#### Practical Application

Our Python implementation of HOG-SVM uses the following code in the backend:

In [None]:
def highlight_face_hog_svm(img: Image):
    img = cv2.cvtColor(numpy.array(img), cv2.COLOR_RGB2BGR)
    gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = hog_svm_detector(gray_image)

    for face in faces:
        x, y, w, h = face.left(), face.top(), face.width(), face.height()
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 255, 0), 6)

    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    return Image.fromarray(img_rgb), len(faces), '?'

This implementation of HOG-SVM face detection processes a given PIL Image object. It converts the image to BGR format, then to grayscale. Using a pre-trained hog_svm_detector, it detects faces in the grayscale image. This pre-trained detector is already provided by the library, so we do not need to train or create our own. Detected faces are outlined with turquoise rectangles, and the modified image is converted back to RGB format before being returned.

Here is an example output of the algorithm:
<div>
<img src="images/detected-faces-examples/detected_face_hog_svm.png" width="400"/>
</div>

### MTCNN

A more recent approach is the MTCNN face detection method. MTCNN stands for Multi-task Cascaded Convolutional Networks and, as the name suggests, is based on Convolutional Neural Networks (CNNs). Because the algorithm, like Viola-Jones, has a cascaded structure and can therefore exclude non-face regions at an early stage, the method is suitable for real-time face detection. MTCNN basically consists of 3 different steps:
1. Face Detection
2. Facial Landmark Detection
3. Face Classification

In the following, we will give a brief overview of the steps in MTCNN.

#### Step 1 - Face Detection

In the first step, the MTCNN recognizes potential candidate faces in the input image. It uses a cascade of convolutional networks to filter out regions that are unlikely to contain a face and focuses on regions with a higher probability of containing a face.
The cascade structure comprises several stages consisting of different CNNs. At each stage, the network limits the number of eligible regions by the result of the CNN.
The end result of this step is a series of bounding boxes that represent the potential face regions in the image.

#### Step 2 - Facial Landmark Detection
Once the potential facial regions are identified, the second step of MTCNN is responsible for locating facial keypoints within each bounding box.
Facial keypoints are specific points on the face, such as the eyes, nose and mouth. These landmarks are critical for tasks such as face alignment and recognition.
The network at this step is designed to regress the coordinates of these facial features for each recognized face.

#### Step 3 - Face Classification

The third step of MTCNN deals with the classification of each bounding box as face or non-face. This step helps to eliminate false positives and improves the accuracy of the overall face detection system.
A classifier is trained to distinguish between faces and non-faces by extracting features from the candidate regions. 
The result of this step is a refined set of bounding boxes, with the corresponding face keypoints, which are more likely to contain actual faces.

#### Practical Application

Our Python implementation of MTCNN uses the following code in the backend:

In [None]:
def highlight_face_mtcnn(img: Image):
    img = numpy.array(img)
    # Disable printing
    with io.StringIO() as dummy_stdout:
        with redirect_stdout(dummy_stdout):
            faces = mtcnn_detector.detect_faces(img)

    confidence = 100

    for face in faces:
        confidence = round(face['confidence'] * 100, 3)
        x, y, w, h = face['box'][0], face['box'][1], face['box'][2], face['box'][3]
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 255), 6)

    return Image.fromarray(img), len(faces), confidence

This implementation of MTCNN face detection processes a given PIL Image object. It converts the image to a numpy array. Using a pre-trained MTCNN detector, it detects faces in the image. This pre-trained detector is already provided by the library, so we do not need to train or create our own. Detected faces are outlined with pink rectangles, and the annotated image is returned together with the number of faces and the confidence of the last detected face.

Here is an example output of the algorithm:
<div>
<img src="images/detected-faces-examples/detected_face_mtcnn.png" width="400"/>
</div>

### SSD

Another approach that has been used more recently is the SSD face detection method. SSD stands for Single Shot MultiBox Detector. It was designed for general object detection, but can also be used for face detection. Just like MTCNN, SSD is based on a CNN that is used for feature extraction. The ability to perform all steps of face detection in a single pass makes this method suitable for real-time face detection. The result of SSD is multiple bounding boxes with potential faces that need to be evaluated against a confidence threshold.

#### Practical Application

Our Python implementation of SSD uses the following code in the backend:

In [None]:
def highlight_face_ssd(img: Image):
    img = cv2.cvtColor(numpy.array(img), cv2.COLOR_RGB2BGR)
    resized_rgb_image = cv2.resize(img, (300, 300))
    imageBlob = cv2.dnn.blobFromImage(image=resized_rgb_image)
    ssd_detector.setInput(imageBlob)
    detections = ssd_detector.forward()

    confidence = 100
    number_of_faces = 0

    # only show detections over 80% certainty
    for row in detections[0][0]:
        if row[2] > 0.80:
            confidence = round(row[2] * 100, 3)
            number_of_faces += 1
            x1, y1, x2, y2 = int(row[3] * img.shape[1]), int(row[4] * img.shape[0]), int(row[5] * img.shape[1]), int(
                row[6] * img.shape[0])
            cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 255), 6)

    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    return Image.fromarray(img_rgb), number_of_faces, confidence

This implementation of SSD face detection processes a given PIL Image object. It converts the image to BGR format and resizes a copy to 300x300 pixels as input for the network. Using a pre-trained SSD detector, it detects faces in the image. This pre-trained detector is already provided by the library, so we do not need to train or create our own. The detector returns a result for every candidate box, so we discard all results with a confidence of 80% or less. Detected faces are outlined with yellow rectangles on the original image, which is then returned.

Here is an example output of the algorithm:
<div>
<img src="images/detected-faces-examples/detected_face_ssd.png" width="400"/>
</div>

## Filter

This part of the documentation introduces the filters and investigates how well each of them prevents HOG and SVM from detecting faces. HOG and SVM were chosen because we currently determine the keypoints of the faces with this approach. The images used in this section stem from the Labeled Faces in the Wild (LFW) dataset, which contains more than 13,000 images of 5,749 people. We focused exclusively on individuals for whom at least 100 facial images were available, which leaves 1140 images. Without any modifications, the algorithm detects 1099/1140 faces.

### Cow Face

This filter applies a cow pattern overlay to the facial area of the image. It calculates the bounding box for the face using facial keypoints and resizes the cow pattern to fit this area. The pattern is then applied over the face with a specified level of transparency (alpha_of_cow_pattern), creating a cow-patterned effect on the facial area.

In [None]:
def apply_cow_pattern(image: Image, keypoints, alpha_of_cow_pattern: int = 85) -> Image:
    foreground = Image.open('../backend/filters/cow_pattern.png').convert("RGBA")
    foreground_parts = Image.new('RGBA', image.size)
    for (box, face_keypoints, face_shape_landmarks, _) in keypoints:
        (minX, maxX), (minY, maxY), (width, height) = calculate_face_shape_landmarks_box_positions(face_shape_landmarks)
        new_foreground_part = foreground.resize((width, height), resample=Image.LANCZOS)
        foreground_parts.paste(new_foreground_part, (minX, minY), new_foreground_part)
    foreground_parts.putalpha(alpha_of_cow_pattern)
    image = apply_filter_on_faces(image, keypoints, foreground_parts)
    return image

The idea behind the Cow Face Filter is that it falsifies the magnitude and direction of the gradient vectors in the face, which HOG uses to extract features. The pattern has strong intensity changes (black and white) that will create gradient vectors with a significantly bigger magnitude than the vectors of the original face. We used different alpha values for the pattern, to see how it affects the detection. An example with an alpha value of 45 can be seen below:
<div>
<img src="images/CowMaskwithAlphaof45.png" width="500"/>
</div>

Using the LFW dataset with different alpha values yields the following result:
<div>
<img src="images/CowMaskModification.png" width="500"/>
</div>

This approach does work for preventing face detection, although it also significantly alters the facial features.
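The effect of the alpha value on the gradients can be made tangible with a small numeric sketch. It assumes that the pattern is combined with the face by standard alpha compositing, which is not confirmed for the helper `apply_filter_on_faces`. The pixel values and the helper names `blend` and `max_horizontal_gradient` are made up for illustration.

```python
def blend(face_value, pattern_value, alpha):
    """Alpha-composite one pattern pixel over one face pixel (alpha ranges from 0 to 255)."""
    a = alpha / 255
    return (1 - a) * face_value + a * pattern_value


def max_horizontal_gradient(row):
    """Largest absolute central difference I(x+1) - I(x-1) along one image row."""
    return max(abs(row[i + 1] - row[i - 1]) for i in range(1, len(row) - 1))


face_row = [120, 122, 125, 127, 130, 131]   # smooth skin tones (made-up values)
pattern_row = [0, 0, 0, 255, 255, 255]      # black-to-white edge of the pattern

results = {}
for alpha in (0, 45, 85):
    blended = [blend(f, p, alpha) for f, p in zip(face_row, pattern_row)]
    results[alpha] = max_horizontal_gradient(blended)
```

With `alpha = 0` the largest gradient in this row is 5, whereas an alpha of 45 already raises it to roughly 49 and an alpha of 85 to roughly 88. Even a fairly transparent pattern therefore dominates the gradient magnitudes that HOG accumulates in its histograms.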

### Salt and Pepper Filter

This filter generates a 'salt and pepper' noise effect and applies it over the facial area. The noise is created by randomly assigning black and white pixels in equal proportions and then resizing this noise pattern to fit the face. The pattern is applied with a specified alpha value (alpha_of_salt_n_pepper), overlaying the face with this distinctive noise effect.

In [None]:
def apply_salt_n_pepper(image: Image, keypoints, alpha_of_salt_n_pepper: int = 90) -> Image:
    foreground_parts = Image.new('RGBA', image.size)
    for (box, face_keypoints, face_shape_landmarks, _) in keypoints:
        (minX, maxX), (minY, maxY), (width, height) = calculate_face_shape_landmarks_box_positions(face_shape_landmarks)
        pixels = np.zeros(width * height, dtype=np.uint8)
        pixels[:width * height // 2] = 255  # Set first half to white (value 255)
        np.random.shuffle(pixels)
        rgb_box = np.stack((pixels, pixels, pixels), axis=-1)
        rgb_box_reshaped = np.reshape(rgb_box, (height, width, 3))
        rgb_box_image = Image.fromarray(rgb_box_reshaped)
        rgb_box_image.putalpha(255)
        foreground_parts.paste(rgb_box_image, (minX, minY), rgb_box_image)
    foreground_parts.putalpha(alpha_of_salt_n_pepper)
    image = apply_filter_on_faces(image, keypoints, foreground_parts)
    return image

The Salt and Pepper Filter, which has an additional alpha value for the transparency of the salt and pepper pattern, was based on the same idea as the Cow Face Filter. An example with an alpha value of 45 can be seen below:
<div>
<img src="images/SaltandPepperwithalphaof45.png" width="500"/>
</div>

Using the LFW dataset with different alpha values yields the following result:
<div>
<img src="images/SaltandPepperModification.png" width="500"/>
</div>

Since the salt-and-pepper pattern is very similar to the cow pattern, the resulting diagram is almost identical.

### Pixelate

This filter creates a pixelation effect by initially reducing the image's resolution and then scaling it back to its original size. The initial downscaling reduces detail, creating larger 'blocks' of color, and the subsequent upscaling maintains this blocky appearance. The degree of pixelation is dictated by the pixel_size parameter, with larger values producing more pronounced pixelation. The effect can be applied to the whole image or focused on a specific area, like the face, as determined by keypoints.

In [None]:
def apply_pixelate(image: Image, keypoints, only_face=True, pixel_size=10) -> Image:
    small = image.resize((image.size[0] // pixel_size, image.size[1] // pixel_size), Image.NEAREST)
    modified_image = small.resize(image.size, Image.NEAREST)
    if only_face:
        return swap_images_at_face_position(image, keypoints, modified_image)
    else:
        return modified_image

The Pixelate filter pixelates the image. On the website, it can be applied only to the face or to the entire image. Since the LFW dataset only contains faces, the filter was applied to the entire image. An example with a pixel size of 2 can be seen below:
<div>
<img src="images/Pixelate.png" width="500"/>
</div>

Using the LFW dataset with different pixel size values yields the following result:
<div>
<img src="images/PixelateModification.png" width="500"/>
</div>

For low pixel size values, the number of detected faces is high and the face is still recognizable. With increasing pixel size values, the face is barely recognizable and the number of detected faces is declining.


### Blur

The function takes a NumPy array representing an image as input. It converts the array to a PIL Image, applies a Box Blur filter with a radius of 10 to the image using the filter() method, and then converts the modified PIL Image back to a NumPy array before returning the result.

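In [10]:
import numpy as np
from PIL import Image, ImageFilter

def apply_blur(image: np.ndarray):
    # Convert to PIL, apply a box blur with a radius of 10, and convert back
    pilImage = Image.fromarray(image)
    pilImage = pilImage.filter(ImageFilter.BoxBlur(10))
    return np.array(pilImage)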


The Blur Filter applies a box blur effect to the image. Similarly to the Pixelate filter, it can be applied only to the face or to the entire image. Since the LFW dataset only contains faces, the filter was applied to the entire image. An example with a radius of 1 can be seen below:
<div>
<img src="images/BoxBlur.png" width="500"/>
</div>

Using the LFW dataset with a radius of 1 yields the following result:
<div>
<img src="images/BoxBlurModification.png" width="500"/>
</div>

The algorithm still detects 1090/1140 faces after the modification.

### Sunglasses

This filter overlays sunglasses onto the face in an image. It identifies the positions of the left and right eyes using facial keypoints and calculates the angle between them to rotate the sunglasses image accordingly. The distance between the eyes is used to dynamically scale the sunglasses' size, ensuring they fit the face proportionally. The sunglasses image is resized and rotated before being superimposed onto the face, creating the appearance of the subject wearing sunglasses.

In [None]:
def apply_sunglasses(image: Image, keypoints, scale_factor: float = 2.5) -> Image:
    foreground = Image.open('filters/sunglasses.png')
    for (box, face_keypoints, face_shape_landmarks, _) in keypoints:
        left_eye = face_keypoints['left_eye']
        right_eye = face_keypoints['right_eye']
        dx = right_eye[0] - left_eye[0]
        dy = right_eye[1] - left_eye[1]
        angle_radians = math.atan2(-dy, dx)
        angle_degrees = math.degrees(angle_radians)
        eye_distance = math.dist((right_eye[0], right_eye[1]), (left_eye[0], left_eye[1]))
        foreground_width_to_height_ratio = foreground.size[0] / foreground.size[1]
        # Resize a copy so the template is not rescaled cumulatively across faces.
        resized = foreground.resize(size=(
            int(scale_factor * eye_distance), int(scale_factor * eye_distance / foreground_width_to_height_ratio)))
        rotated_overlay = resized.rotate(angle_degrees, expand=True)
        left_part = (scale_factor - 1) / 2
        left_upper_sunglasses = (int(left_eye[0] - eye_distance * left_part),
                                 int(left_eye[1] - eye_distance * left_part / foreground_width_to_height_ratio))
        left_upper_paste = (left_upper_sunglasses[0], int(left_upper_sunglasses[1] - math.fabs(
            math.cos(math.radians(90 - angle_degrees)) * scale_factor * eye_distance)))
        image.paste(rotated_overlay, left_upper_paste, rotated_overlay)
    return image

A more artistic approach is the application of sunglasses to the eyes of a given face:
<div>
<img src="images/SunglassesonFace.png" width="500"/>
</div>

Using the LFW dataset with different alpha values yields the following result:
<div>
<img src="images/SunglassesonFaceModification.png" width="500"/>
</div>

Since this approach covers only a small area of the face and is rather artistic, the result is not surprising to us: 1000 of the 1140 faces were still detected after the modification.
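
The rotation and scaling of the overlay depend only on the two eye positions. The sketch below isolates that calculation; the helper name `sunglasses_geometry` is hypothetical, and the default `scale_factor` of 2.5 is taken from the function signature above.

```python
import math


def sunglasses_geometry(left_eye, right_eye, scale_factor=2.5):
    """Return (rotation in degrees, overlay width in pixels) for one face."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    # Image y grows downwards, so dy is negated to get a counter-clockwise angle.
    angle_degrees = math.degrees(math.atan2(-dy, dx))
    eye_distance = math.dist(left_eye, right_eye)
    return angle_degrees, scale_factor * eye_distance


# Level eyes 40 px apart: no rotation, overlay 100 px wide.
angle, width = sunglasses_geometry((100, 120), (140, 120))

# Right eye higher in the image than the left eye: the overlay is tilted by about 45 degrees.
tilted_angle, _ = sunglasses_geometry((0, 0), (30, -30))
```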

### Medicine Mask

This filter overlays a medical-style mask image onto the face. It locates the position of the mouth and nose using facial keypoints and adjusts the size and rotation of the mask image to align with these features. The mask is placed to cover the lower half of the face, resembling the appearance of wearing a medical mask.

In [None]:
def apply_medicine_mask(image: Image, keypoints) -> Image:
    foreground = Image.open('filters/medicine_mask.png').convert("RGBA")
    for (box, face_keypoints, face_shape_landmarks, _) in keypoints:
        left_mouth = face_keypoints['left_mouth']
        right_mouth = face_keypoints['right_mouth']
        dx = right_mouth[0] - left_mouth[0]
        dy = right_mouth[1] - left_mouth[1]
        angle_radians = math.atan2(-dy, dx)
        angle_degrees = math.degrees(angle_radians)
        face_width = box[2]
        foreground_width_to_height_ratio = foreground.size[0] / foreground.size[1]
        # Resize a copy so the template is not rescaled cumulatively across faces.
        resized = foreground.resize(size=(face_width, int(face_width / foreground_width_to_height_ratio)))
        rotated_overlay = resized.rotate(angle_degrees, expand=True)
        left_upper_face_mask = (box[0], face_keypoints['nose'][1])
        left_upper_paste = (left_upper_face_mask[0], int(left_upper_face_mask[1] - math.fabs(
            math.cos(math.radians(90 - angle_degrees)) * face_width)))
        image.paste(rotated_overlay, left_upper_paste, rotated_overlay)
    return image

Another artistic approach is the application of a medicine mask to the face:
<div>
<img src="images/MedicineMaskExample.png" width="500"/>
</div>

Using the LFW dataset with different alpha values yields the following result:
<div>
<img src="images/MedicineMaskModification.png" width="500"/>
</div>

This approach covers a larger area of the face. Compared to the Sunglasses approach, it leads to a considerable drop: only 120 of the 1140 faces were still detected.
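
For easier comparison, the counts reported in this section can be turned into detection rates. The snippet below only restates the numbers given above for Blur, Sunglasses, and Medicine Mask; the variable names are chosen for illustration.

```python
# Detected faces after each modification, as reported in this section.
TOTAL_FACES = 1140
detected = {"Blur": 1090, "Sunglasses": 1000, "Medicine Mask": 120}

detection_rate = {name: count / TOTAL_FACES for name, count in detected.items()}

for name, rate in detection_rate.items():
    print(f"{name}: {rate:.1%} of faces still detected")
```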

### Hiding Faces

This filter randomly places multiple small face mask images over the image. It ensures that these masks do not overlap with the facial areas identified by the keypoints. Each mask is resized according to specified dimensions (`face_mask_width` and `face_mask_height`) and applied with a certain level of transparency (`alpha_of_masks`). This creates a scattered mask effect across the image, avoiding the actual facial areas.

In [None]:
def apply_hide_with_masks(img: Image, keypoints, number_of_masks: int = 40,
                          face_mask_width: int = 75, face_mask_height: int = 75,
                          alpha_of_masks: int = 45) -> Image:
    foreground = Image.open('../backend/filters/whole_face_mask.png').convert('RGBA')
    foreground_alpha = apply_alpha_to_transparent_image(foreground, alpha_of_masks)
    face_and_mask_coordinates = find_face_rectangles_mtcnn(keypoints)
    mask_cords = find_free_coordinates_outside_of_rectangles(img, number_of_masks, face_mask_width, face_mask_height,
                                                             face_and_mask_coordinates)
    for mask_coords in mask_cords:
        resized_foreground = foreground_alpha.resize((face_mask_width, face_mask_height), resample=Image.LANCZOS)
        img.paste(resized_foreground, (mask_coords[0], mask_coords[1]), resized_foreground)

    return img

The goal of this approach is to add additional artifacts to the image that classifiers should erroneously classify as faces. An example is adding face masks to the image and making them barely visible with a low alpha value:
<div>
<img src="images/HideWithMaskExample.png" width="900"/>
</div>

This approach only slightly modifies the face (for low alpha values), and it increases the false positives of the classifiers.
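
The helper `find_free_coordinates_outside_of_rectangles` is not shown in this document. A minimal sketch of how such a placement could work is given below, assuming simple rejection sampling: random positions are drawn and discarded whenever the mask would overlap a face or a previously placed mask. The names `overlaps` and `free_positions` and the `max_tries` limit are hypothetical.

```python
import random


def overlaps(a, b):
    """True if two (x, y, width, height) rectangles intersect."""
    return (a[0] < b[0] + b[2] and b[0] < a[0] + a[2]
            and a[1] < b[1] + b[3] and b[1] < a[1] + a[3])


def free_positions(image_size, count, mask_size, blocked, max_tries=10_000, seed=None):
    """Pick up to `count` top-left corners whose masks avoid all blocked rectangles."""
    rng = random.Random(seed)
    image_w, image_h = image_size
    mask_w, mask_h = mask_size
    taken = list(blocked)  # face boxes plus already placed masks
    positions = []
    for _ in range(max_tries):
        if len(positions) == count:
            break
        x = rng.randint(0, image_w - mask_w)
        y = rng.randint(0, image_h - mask_h)
        candidate = (x, y, mask_w, mask_h)
        if not any(overlaps(candidate, other) for other in taken):
            positions.append((x, y))
            taken.append(candidate)
    return positions


face_box = (200, 150, 120, 120)
spots = free_positions((640, 480), count=5, mask_size=(75, 75), blocked=[face_box], seed=0)
```

Limiting the number of attempts keeps the loop from running forever when the image is too crowded to fit all requested masks.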

## Face Recognition

After completing the face detection part, we turned to face recognition. Face recognition builds on face detection, as a face must first be detected before it can be recognised. But what is face recognition?
Face recognition can mean several things. On the one hand, it can mean assigning faces to known persons. These known persons are stored in a database, and for each picture of a face the system can then say whether it shows, for example, Olaf Scholz or Christian Lindner. On the other hand, it can mean deciding that two faces belong to the same person without knowing who that person is.
<div>
<img src="images/FaceRecognitionDatabase.png" width="900"/>
</div>

As we don't want to create a large database that might have to store sensitive data, we decided that our WebApp should only be able to tell whether two pictures show the same person.
To do this, we used the [face-recognition library](https://github.com/ageitgey/face_recognition). This library offers many useful methods for face recognition. It uses the face recognition model from [dlib](http://dlib.net/files/dlib_face_recognition_resnet_model_v1.dat.bz2) to compare faces. For each face, the model calculates a 128D vector that represents the characteristics of the face, using the keypoints to align the face beforehand. Two of these vectors can then be checked for similarity using the Euclidean distance. If the distance is below a defined threshold value, it is assumed that the person is the same. The model has achieved an accuracy of 99.38% on the standard LFW face recognition benchmark. Dlib also provides a small [demo](http://dlib.net/face_recognition.py.html) if you want to test the model yourself.
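
The comparison step can be illustrated without the library. The sketch below uses toy 3D vectors in place of the real 128D encodings; the function name `same_person` is hypothetical, and the tolerance of 0.55 is the value our backend passes to `compare_faces`.

```python
import math


def same_person(encoding_a, encoding_b, tolerance=0.55):
    """Treat two face encodings as the same person if they are close enough."""
    return math.dist(encoding_a, encoding_b) <= tolerance


# Toy 3-D vectors stand in for the real 128-D encodings.
alice_1 = [0.10, 0.20, 0.30]
alice_2 = [0.15, 0.25, 0.35]
bob = [0.90, 0.10, 0.70]

print(same_person(alice_1, alice_2))  # small distance, counted as a match
print(same_person(alice_1, bob))      # large distance, no match
```

A lower tolerance makes the comparison stricter, so fewer pairs are accepted as the same person.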

The face-recognition library provides two different methods for face recognition. The first method is based on five keypoints, taking only two keypoints from both eyes and one keypoint from the nose. The other method is based on 68 keypoints and uses the same keypoints that we use for the filters. As the difference in speed is very small, but the accuracy of 68 keypoints was better, we decided in favour of the 68 keypoints approach.

The whole process is shown in this diagram:
<div>
<img src="images/FaceRecognitionProcess.png" width="900"/>
</div>


## Practical Application

Our Python implementation of face recognition uses the following code in the backend:

In [2]:
def recognize_faces(orig_img: Image, mod_img: Image, orig_keypoints):
    orig_img_bgr = cv2.cvtColor(np.array(orig_img), cv2.COLOR_RGB2BGR)
    mod_img_bgr = cv2.cvtColor(np.array(mod_img), cv2.COLOR_RGB2BGR)

    gray_image = cv2.cvtColor(np.asarray(mod_img_bgr), cv2.COLOR_BGR2GRAY)
    faces = hog_svm_detector(gray_image)
    face_encodings_unknown = []
    boxes_mod = []
    for face in faces:
        box = [face.left(), face.top(), face.width(), face.height()]
        face_encodings_unknown.append(np.array(calculate_face_encoding(np.asarray(mod_img_bgr), box)))
        boxes_mod.append((face.left(), face.top(), face.width(), face.height()))

    face_encodings_orig = []
    boxes_orig = []
    for box, _, _, face_encoding_orig in orig_keypoints:
        face_encodings_orig.append(np.array(face_encoding_orig))
        boxes_orig.append(box)

    count_of_matches = 0

    for j, face_encoding_unknown in enumerate(face_encodings_unknown):
        matches = face_recognition.compare_faces(face_encodings_orig, face_encoding_unknown, tolerance=0.55)

        for i, match in enumerate(matches):
            if match:
                count_of_matches += 1
                selected_color = palette[(j * 2) % num_colors]
                bgr_color = tuple(int(value * 255) for value in selected_color)
                box_orig = boxes_orig[i]
                box_mod = boxes_mod[j]
                top, right, bottom, left = box_orig[1], box_orig[0] + box_orig[2], box_orig[1] + box_orig[3], box_orig[
                    0]
                cv2.rectangle(orig_img_bgr, (left, top), (right, bottom), bgr_color, 6)

                top, right, bottom, left = box_mod[1], box_mod[0] + box_mod[2], box_mod[1] + box_mod[3], box_mod[0]
                cv2.rectangle(mod_img_bgr, (left, top), (right, bottom), bgr_color, 6)

    orig_img_rgb = cv2.cvtColor(orig_img_bgr, cv2.COLOR_BGR2RGB)
    mod_img_rgb = cv2.cvtColor(mod_img_bgr, cv2.COLOR_BGR2RGB)
    return Image.fromarray(orig_img_rgb), Image.fromarray(mod_img_rgb), count_of_matches

This implementation takes an original image together with its calculated keypoints, which also contain the 128D vector of each face, and a potentially modified image that is compared to the original. The modified image is converted to grayscale and the face bounding boxes are then calculated using the HOG+SVM face detection algorithm. The bounding boxes are then used to compute the 128D vectors of the faces found in the modified image.

Each face found in the modified image is then compared to each face in the original image. If the Euclidean distance is less than the given tolerance of 0.55, a match is registered. For each match, a pair of colored boxes is drawn in both of the images where the matching faces are. The number of pairs is also counted.
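
The backend stores each bounding box as `(left, top, width, height)`, while `cv2.rectangle` expects two opposite corner points. The conversion that the drawing code performs inline can be sketched as a small helper; the name `box_to_corners` is hypothetical and does not appear in our backend.

```python
def box_to_corners(box):
    """Convert (x, y, width, height) into ((left, top), (right, bottom))."""
    x, y, width, height = box
    return (x, y), (x + width, y + height)


top_left, bottom_right = box_to_corners((30, 40, 100, 120))
```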

Here is an example output of the algorithm:
<div>
<img src="images/detected-faces-examples/example_face_recognition.png" width="500"/>
</div>

## Data Modification

This part of the documentation investigates how well our filters can prevent face recognition. The images used in this section stem from the Labeled Faces in the Wild (LFW) dataset, which contains more than 13,000 images of 5,749 people. We only consider pairs of images showing the same person, resulting in 1100 image pairs. Without any modifications, the algorithm recognizes 918/1100 faces.

### Filters

The filters we use are based on those already used for face detection and new filters specifically designed to prevent face recognition. The description of the face detection filters can be found in the face detection section.

### Morphing
We used the following morphing function in the next three filters to modify the original image. The function performs localized morphing on an image using specified keypoints. It takes an image (`image_cv`), a set of keypoints, a `radius` defining the area of effect around each keypoint, and a `morph_strength` value to control the intensity of the morph.

For each keypoint, the function creates a grid of coordinates covering the image. It calculates the distance from each grid point to the keypoint and applies a circular mask based on the specified radius. This mask isolates the effect to a circular area around the keypoint. A displacement field is then computed, diminishing exponentially with distance from the keypoint and scaled by `morph_strength`. This field dictates how much each point in the masked area is moved, creating the morphing effect.

The displacement is applied to the grid points, shifting them in a manner that reflects the direction and magnitude of the displacement field. The OpenCV function cv2.remap is used to remap the image based on the adjusted grid coordinates, effectively warping the image around each keypoint. This process is repeated for all keypoints, with the cumulative effect resulting in a complex morphing of the image. The function finally returns the morphed image.
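
The size of the shift at a given distance from a keypoint can be looked at in isolation. The sketch below mirrors the mask and the exponential falloff described above for a single distance value; the function name `displacement` and the example numbers are chosen for illustration.

```python
import math


def displacement(distance, radius, morph_strength):
    """Shift in pixels applied at a given distance from the keypoint."""
    if distance >= radius:
        return 0.0  # outside the circular mask nothing moves
    return math.exp(-distance / radius) * morph_strength


at_keypoint = displacement(0, radius=50, morph_strength=10.0)
near_edge = displacement(49, radius=50, morph_strength=10.0)
outside = displacement(50, radius=50, morph_strength=10.0)
```

Note that the shift does not fade to zero at the edge of the mask: just inside the radius it is still roughly `morph_strength / e`, and it drops to zero abruptly outside.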

In [None]:
def morph_image(image_cv, keypoints, radius, morph_strength):
    h, w = image_cv.shape[:2]
    for point in keypoints:
        x_grid, y_grid = np.meshgrid(np.arange(w), np.arange(h))
        dx = x_grid - point[0]
        dy = y_grid - point[1]
        distance = np.sqrt(dx ** 2 + dy ** 2)
        mask = np.where(distance < radius, 1, 0)
        displacement = np.exp(-distance / radius) * morph_strength
        displacement *= mask
        map_x = x_grid + displacement * np.sign(dx)
        map_y = y_grid + displacement * np.sign(dy)
        image_cv = cv2.remap(image_cv, map_x.astype(np.float32), map_y.astype(np.float32), cv2.INTER_LINEAR)
    return image_cv

#### Morph Eyes

This filter specifically targets the eyes in an image to apply a morphing effect. It first checks for the presence of facial keypoints and, if found, identifies the positions of the left and right eyes. The filter then applies the morphing function to these eye regions. The radius, which defines the area around each eye that is affected by the morphing, is set dynamically to one third of the width of the first detected face box and overrides the `radius` argument. The `morph_strength` parameter controls the intensity of the morphing effect. The result is a transformation of the eye regions, creating a unique, modified appearance of the eyes in the image.

In [None]:
def apply_morph_eyes(image: Image, keypoints, radius=75, morph_strength=10.0) -> Image:
    if len(keypoints) == 0:
        return image
    image_cv = np.array(image)
    image_cv = cv2.cvtColor(image_cv, cv2.COLOR_RGB2BGR)
    eye_points = [face_keypoints[key] for _, face_keypoints, _, _ in keypoints for key in ['left_eye', 'right_eye']]
    radius = keypoints[0][0][2] / 3
    image_cv = morph_image(image_cv, eye_points, radius, morph_strength)
    return Image.fromarray(cv2.cvtColor(image_cv, cv2.COLOR_BGR2RGB))

<div>
<img src="images/MorphEyeswithmorphstrengthof5.png" width="500"/>
</div>

#### Morph Mouth

This filter is designed to morph the mouth area of an image. After ensuring that facial keypoints are present, it locates the positions of the left and right mouth corners and the nose. These points define the region around the mouth to be morphed. Similar to the Morph Eyes Filter, the radius for the morphing effect is determined based on the face size, specifically set to half the radius used for the eyes. The morph_strength parameter adjusts the level of morphing applied. This filter alters the mouth region, modifying its appearance in a distinctive way.

In [None]:
def apply_morph_mouth(image: Image, keypoints, radius=75, morph_strength=10.0) -> Image:
    if len(keypoints) == 0:
        return image
    image_cv = np.array(image)
    image_cv = cv2.cvtColor(image_cv, cv2.COLOR_RGB2BGR)
    mouth_points = [face_keypoints[key] for _, face_keypoints, _, _ in keypoints for key in
                    ['left_mouth', 'right_mouth', 'nose']]
    radius = keypoints[0][0][2] / 6
    image_cv = morph_image(image_cv, mouth_points, radius, morph_strength)
    return Image.fromarray(cv2.cvtColor(image_cv, cv2.COLOR_BGR2RGB))

<div>
<img src="images/MorphMouthwithmorphstrengthof5.png" width="500"/>
</div>

#### Morph All

This filter applies a comprehensive morphing effect to all key facial features. It utilizes all the detected facial keypoints, including points defining the face outline. The filter calculates a radius that is smaller compared to the previous filters, as it applies the morphing effect more broadly across the face. The morph_strength parameter still controls the intensity of the morphing. This all-encompassing approach results in a more dramatic transformation of the entire face, altering multiple features simultaneously for a significant visual change.

In [3]:
def apply_morph_all(image: Image, keypoints, radius=75, morph_strength=10.0) -> Image:
    if len(keypoints) == 0:
        return image
    image_cv = np.array(image)
    image_cv = cv2.cvtColor(image_cv, cv2.COLOR_RGB2BGR)
    all_points = [point for _, face_keypoints, outline, _ in keypoints for key, point in face_keypoints.items()] + [pt for _, _, outline, _ in keypoints for pt in outline]
    radius = keypoints[0][0][2] / 12
    image_cv = morph_image(image_cv, all_points, radius, morph_strength)
    return Image.fromarray(cv2.cvtColor(image_cv, cv2.COLOR_BGR2RGB))

<div>
<img src="images/MorphAllwithmorphstrengthof5.png" width="500"/>
</div>

### Result

We only tested our most promising filters which should prevent face recognition without altering the image too much. The result is shown in the bar chart below:
<div>
<img src="images/FaceRecognitionAnalysis.png" width="500"/>
</div>

The diagram shows that many of the filters achieve good results.
The Salt&Pepper and Cowface filters probably perform so well because they already prevent face detection. The Medicine Mask filter hides the lower keypoints, which explains its strong result. It is also noticeable that the eye keypoints are very important for face recognition, as Morph Eyes performs significantly better than Morph Mouth.

# The Web Application
We based our project on the provided demo applications. The app is split into a separate frontend and backend component.
The frontend is implemented using Vue and the Vuetify framework.

It operates on a simple yet effective principle: users can upload their custom photos to the platform. Once uploaded, they have the option to apply a selection of developed filters. These filters are designed to test their effect on face detection and recognition algorithms.


### Frontend
The frontend can be run by installing Node.js and the required dependencies with `npm install`. Afterwards, the application is compiled and started with `npm run serve`.

The UI is separated into two tabs, one for FD and one for FR. This allows us to differentiate between the two required user input setups while keeping both on the same loaded page.

**FD-Tab**
<div>
<img src="images/WebApp-FD.png" width="900"/>
</div>

**FR-Tab**
<div>
<img src="images/WebApp-FR.png" width="900"/>
</div>

### Backend
The backend is structured around the FastAPI framework, designed to provide a robust and efficient interface for our web application. The backend is pivotal in handling the core functionalities of our project, including image processing, filter application, and face detection and recognition algorithms. Below, we outline the various API endpoints incorporated into our backend, each playing a crucial role in our application's functionality.

The backend can be run by installing Python and the required dependencies with `pip install -r .\requirements.txt`. Afterwards, the application is started with `uvicorn app.main:app --reload`.

#### Endpoints

**`/`** A simple endpoint to verify that the API is online, returning a basic confirmation message.

**`/convert-image`** This endpoint accepts an image in base64 format and converts it, ensuring it's in the correct RGB mode for processing.

**`/get-filters`** Retrieves the list of available filters with their respective attributes, indicating their applicability to face detection, recognition, and whether they target the face only.

**`/get-algorithms`** Provides a list of available face detection algorithms, including Viola-Jones, HOG-SVM, MTCNN, and SSD, each with a distinct approach to detecting faces in images.

**`/apply-filter`** Applies a specified filter to an image. It processes the image based on the selected filter, which can range from blurring and pixelation to more complex operations like morphing facial features.

**`/run-face-detection`** Runs face detection algorithms on the provided image, employing a multi-threaded approach to process different algorithms concurrently for efficiency.

**`/run-face-recognition`** This endpoint is designed to execute face recognition on the original and modified images, assessing the impact of applied filters on recognition accuracy.

**`/generate-keypoints`** Initiates the generation of keypoints on the provided image, a crucial step in applying certain filters and for the facial recognition process.
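
How `/run-face-detection` might run several algorithms concurrently can be sketched with the standard library. This is a minimal illustration under assumptions, not our backend code: the function `run_detectors` and the placeholder detectors are hypothetical, and each detector is assumed to be a callable that returns a list of bounding boxes.

```python
from concurrent.futures import ThreadPoolExecutor


def run_detectors(image, detectors):
    """Run every detector on the same image in parallel and collect the results."""
    with ThreadPoolExecutor(max_workers=len(detectors)) as pool:
        futures = {name: pool.submit(detect, image) for name, detect in detectors.items()}
        return {name: future.result() for name, future in futures.items()}


# Placeholder detectors that return a list of (x, y, width, height) boxes.
detectors = {
    "viola-jones": lambda image: [(10, 10, 50, 50)],
    "hog-svm": lambda image: [],
}
results = run_detectors(image=None, detectors=detectors)
```

Collecting the results by algorithm name keeps the response independent of the order in which the threads finish.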

#### Modular Filter System
To add new filters, you should first add a new entry to the `FILTERS` array in the JSON structure at the top of the backend code. This entry should specify the name of the filter, the display name, the face detection and recognition flags, and whether it applies to the face only. For example:
```json
{
    "name": "newFilterName",
    "displayName": "New Filter Display Name",
    "faceDetection": true/false,
    "faceRecognition": true/false,
    "faceOnly": true/false
}
```
Then, in the `/apply-filter` method, include the filter's specific processing logic. This method uses a match-case statement to apply the selected filter based on its name. This is where the specific image processing actions for the new filter need to be implemented:
```python

@app.post('/apply-filter')
async def apply_filter(data: ApplyFilterRequestData):
   <...>
        match data.filter:
            <...>
            case 'newFilterName':
               apply_newFilterName(...)
```
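
When adding entries by hand, a small consistency check can catch missing fields or duplicate names early. The sketch below is an optional addition and not part of our backend: `REQUIRED_KEYS` and `invalid_filter_entries` are hypothetical names, and the example entry uses placeholder values.

```python
REQUIRED_KEYS = {"name", "displayName", "faceDetection", "faceRecognition", "faceOnly"}

FILTERS = [
    {
        "name": "newFilterName",
        "displayName": "New Filter Display Name",
        "faceDetection": True,
        "faceRecognition": False,
        "faceOnly": True,
    },
]


def invalid_filter_entries(filters):
    """Return the names of entries that lack a required key or reuse a name."""
    seen, invalid = set(), []
    for entry in filters:
        name = entry.get("name", "<unnamed>")
        if not REQUIRED_KEYS.issubset(entry) or name in seen:
            invalid.append(name)
        seen.add(name)
    return invalid


problems = invalid_filter_entries(FILTERS)
```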

## Videos
**FD-Tab demonstration**
<div>
<iframe width="900" height="506" src="https://youtube.com/embed/5wzBoQGOEZA"></iframe>
</div>

[Youtube-Link](https://www.youtube.com/watch?v=5wzBoQGOEZA)


**FR-Tab demonstration**
<div>
<iframe width="900" height="506" src="https://youtube.com/embed/rygrHl7ffP4"></iframe>
</div>

[Youtube-Link](https://www.youtube.com/watch?v=rygrHl7ffP4)

# Additional Implemented image manipulation ideas

### `apply_dithering`
The dithering process in this filter reduces the color range of the image to a predefined palette, in this case, 16 colors. It achieves this through a quantization process that groups similar colors, followed by a conversion back to an RGB format. The outcome is an image characterized by distinct color blocks and a notable reduction in color gradation. This effect can be applied globally to the image or localized to a region determined by facial keypoints.

In [None]:
def apply_dithering(image: Image, keypoints, only_face=True) -> Image:
    modified_image = image.quantize(colors=16).convert('RGB')
    if only_face:
        return swap_images_at_face_position(image, keypoints, modified_image)
    else:
        return modified_image

### `apply_max_filter`
This filter operates by scanning the image and replacing each pixel with the maximum pixel value in its neighborhood, defined by a specified filter size. As a result, brighter areas within the filter's radius become more prominent, while darker regions are subdued. This enhances the luminance contrast and can accentuate certain features of the image. The filter can be applied to the entire image or restricted to a region identified by facial keypoints.

In [None]:
def apply_max_filter(image: Image, keypoints, only_face=True) -> Image:
    modified_image = image.filter(ImageFilter.MaxFilter(9))
    if only_face:
        return swap_images_at_face_position(image, keypoints, modified_image)
    else:
        return modified_image

### `apply_min_filter`
This filter is the inverse of the Max Filter. It scans the image and replaces each pixel with the minimum pixel value in its neighborhood. This process emphasizes darker areas and reduces the prominence of brighter ones. The Min Filter accentuates shadows and darker regions, providing a contrasting effect to the Max Filter. It can be used across the whole image or targeted to a specific area using facial keypoints.

In [15]:
def apply_min_filter(image: Image, keypoints, only_face=True) -> Image:
    modified_image = image.filter(ImageFilter.MinFilter(9))
    if only_face:
        return swap_images_at_face_position(image, keypoints, modified_image)
    else:
        return modified_image

### `apply_closing`
The Closing Filter combines the Max and Min Filters in sequence. It first applies the Max Filter, which lets bright regions grow and fills in small dark areas, followed by the Min Filter, which shrinks the enlarged bright regions back to roughly their original extent. This sequence effectively 'closes' small gaps and dark spots, resulting in a smoother appearance in bright areas of the image. This filter can be applied universally or selectively based on keypoints.

In [None]:
def apply_closing(image: Image, keypoints, only_face=True) -> Image:
    modified_image = apply_min_filter(apply_max_filter(image, keypoints, False), keypoints, False)
    if only_face:
        return swap_images_at_face_position(image, keypoints, modified_image)
    else:
        return modified_image

### `apply_opening`
The Opening Filter reverses the sequence of the Closing Filter. It starts with the Min Filter, which removes small bright regions, followed by the Max Filter, which restores the remaining bright areas to roughly their original extent. As a result, small bright details vanish into their darker surroundings while larger structures are preserved. The filter can be applied to the entire image or localized to a specific area using keypoints.
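
The order of the two filters in closing and opening can be checked on a one-dimensional signal. The sketch below is a plain-Python illustration with a window of three values; the helper `rank_filter` and the choice to repeat edge values are assumptions and do not reproduce Pillow's `MaxFilter` and `MinFilter` exactly.

```python
def rank_filter(signal, size, pick):
    """Apply min or max over a sliding window, repeating the edge values."""
    half = size // 2
    last = len(signal) - 1
    return [
        pick(signal[min(max(i + k, 0), last)] for k in range(-half, half + 1))
        for i in range(len(signal))
    ]


def closing(signal, size=3):
    return rank_filter(rank_filter(signal, size, max), size, min)  # max, then min


def opening(signal, size=3):
    return rank_filter(rank_filter(signal, size, min), size, max)  # min, then max


dark_gap = [9, 9, 0, 9, 9]
bright_spot = [0, 0, 9, 0, 0]

print(closing(dark_gap))     # the dark gap is filled
print(opening(bright_spot))  # the bright spot is removed
```

Applying the operations the other way round leaves both signals unchanged: opening keeps the dark gap and closing keeps the bright spot.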

In [None]:
def apply_opening(image: Image, keypoints, only_face=True) -> Image:
    modified_image = apply_max_filter(apply_min_filter(image, keypoints, False), keypoints, False)
    if only_face:
        return swap_images_at_face_position(image, keypoints, modified_image)
    else:
        return modified_image

### `apply_color_shift`
This filter introduces a random shift in the color channels of the image. It independently alters the red, green, and blue channels by a random value within a specified range. The effect of this is a shift in the overall color balance of the image, producing a variety of color tones and hues. The degree of color shift is controlled by the max_shift_intensity parameter. The filter can modify the entire image or be confined to a particular region, such as the face, based on keypoints.

In [None]:
def apply_color_shift(image: Image, keypoints, only_face=True, max_shift_intensity=25) -> Image:
    r_shift = random.randint(-max_shift_intensity, max_shift_intensity)
    g_shift = random.randint(-max_shift_intensity, max_shift_intensity)
    b_shift = random.randint(-max_shift_intensity, max_shift_intensity)
    r, g, b = image.split()
    r = r.point(lambda i: i + r_shift)
    g = g.point(lambda i: i + g_shift)
    b = b.point(lambda i: i + b_shift)
    modified_image = Image.merge('RGB', (r, g, b))
    if only_face:
        return swap_images_at_face_position(image, keypoints, modified_image)
    else:
        return modified_image

### `apply_whole_face_mask`
This filter applies a face mask image over the entire face. It uses facial keypoints to determine the position and orientation of the face. The mask is resized to match the width of the face and rotated to align with the angle between the eyes. The mask is then placed over the face, covering it entirely, simulating the effect of wearing a full-face mask.

In [None]:
def apply_whole_face_mask(image: Image, keypoints) -> Image:
    foreground = Image.open('filters/whole_face_mask.png').convert("RGBA")
    for (box, face_keypoints, face_shape_landmarks, _) in keypoints:
        left_eye = face_keypoints['left_eye']
        right_eye = face_keypoints['right_eye']
        dx = right_eye[0] - left_eye[0]
        dy = right_eye[1] - left_eye[1]
        angle_radians = math.atan2(-dy, dx)
        angle_degrees = math.degrees(angle_radians)
        face_width = box[2]
        foreground_width_to_height_ratio = foreground.size[0] / foreground.size[1]
        # Resize a copy so the template is not rescaled cumulatively across faces.
        resized = foreground.resize(size=(face_width, int(face_width / foreground_width_to_height_ratio)))
        rotated_overlay = resized.rotate(angle_degrees, expand=True)
        left_upper_face_mask = (box[0], box[1])
        left_upper_paste = (left_upper_face_mask[0], int(left_upper_face_mask[1] - math.fabs(
            math.cos(math.radians(90 - angle_degrees)) * face_width)))
        image.paste(rotated_overlay, left_upper_paste, rotated_overlay)
    return image

### `apply_highlight_keypoints`
This filter visually highlights the facial keypoints and connections between them on the image. It draws circles around each keypoint and lines connecting them, using different colors for different facial features. This filter serves to visually emphasize the positions and relationships of facial features as identified by the keypoints.

In [None]:
def apply_highlight_keypoints(image: Image, keypoints) -> Image:
    draw = ImageDraw.Draw(image)
    if len(keypoints) > 0:
        for keypoint_set in keypoints:
            for j in range(len(keypoint_set[2])):
                x, y = keypoint_set[2][j]
                if j < len(keypoint_set[2]) - 1:
                    next_x, next_y = keypoint_set[2][j + 1]
                    draw.line((x, y, next_x, next_y), fill='lightgreen', width=3)
                radius = 5
                draw.ellipse((x - radius, y - radius, x + radius, y + radius), fill='green', outline='lightgreen')
            for feature, coords in keypoint_set[1].items():
                x, y = coords
                radius = 10
                draw.ellipse((x - radius, y - radius, x + radius, y + radius), fill='red', outline='red')
    return image

### `apply_distance_transformation`
This filter is named after the distance transformation, although the code below computes something closer to a morphological gradient. It starts by converting the image to grayscale and then to a binary format based on a fixed threshold of 128. Binary dilation and erosion are each performed for five iterations on the binary image: dilation expands the white areas, while erosion shrinks them. The filter then compares the dilated and eroded images and marks every pixel where they differ in white. The result is an image that emphasizes the boundaries between bright and dark regions of the original image.

In [16]:
def apply_distance_transformation(image: Image) -> Image:
    image_np = np.array(image.convert('L'))
    threshold = 128
    binary_image = (image_np > threshold).astype(np.uint8)
    dilated = binary_dilation(binary_image, iterations=5)
    eroded = binary_erosion(binary_image, iterations=5)
    morphed_image = np.where(dilated != eroded, 255, 0).astype(np.uint8)
    return Image.fromarray(morphed_image).convert('RGB')

### `apply_vertical_edge`
This filter emphasizes vertical edges in an image using a specific kernel in a convolution process. The kernel used is designed to respond strongly to vertical lines or edges by subtracting the pixel value on the left from the pixel value on the right. This operation enhances vertical features while diminishing horizontal features. The filter is particularly effective in highlighting vertical structures or details in an image, making them more pronounced against the background.

In [None]:
def apply_vertical_edge(image: Image) -> Image:
    return image.filter(ImageFilter.Kernel((3, 3), (-1, 0, 1, -2, 0, 2, -1, 0, 1), 1, 0))

### `apply_horizontal_edge`
Like `apply_vertical_edge`, this filter uses a 3x3 Sobel kernel, here oriented to accentuate horizontal edges. The kernel contrasts the pixel rows above and below each position, so horizontal lines and edges become more pronounced while uniform regions and purely vertical edges produce no response.

In [None]:
def apply_horizontal_edge(image: Image) -> Image:
    return image.filter(ImageFilter.Kernel((3, 3), (-1, -2, -1, 0, 0, 0, 1, 2, 1), 1, 0))
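
The directional behavior of the two kernels can be checked on a synthetic image. The following is a minimal sketch in plain NumPy; `correlate3x3` is a hypothetical helper written for this illustration and does not reproduce Pillow's internal handling (border treatment, clipping to 0-255).

```python
import numpy as np

# The two 3x3 kernels from the filters above, written as matrices
VERTICAL = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
HORIZONTAL = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])

def correlate3x3(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide a 3x3 kernel over the image (no padding, so the output shrinks by 2)."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2), dtype=np.int64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * image[dy:dy + h - 2, dx:dx + w - 2]
    return out

# Left half dark, right half bright: a single vertical edge
vertical_step = np.zeros((5, 6), dtype=np.int64)
vertical_step[:, 3:] = 100

vertical_response = correlate3x3(vertical_step, VERTICAL)
horizontal_response = correlate3x3(vertical_step, HORIZONTAL)
print(vertical_response)    # non-zero only in the columns next to the edge
print(horizontal_response)  # all zeros: this kernel ignores vertical edges
```

Transposing the test image turns the vertical edge into a horizontal one, and the roles of the two kernels swap accordingly.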

# First image manipulation ideas (not in web-app)

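### `create_rotate_image`
The outer function, `create_rotate_image`, takes an angle in degrees and returns an inner function `rotate_image`. When called with a NumPy array representing an image, the inner function converts it to a PIL Image, rotates it counter-clockwise by the given angle using the rotate() method, and converts the result back to a NumPy array.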

In [13]:
def create_rotate_image(angle: int):
    def rotate_image(image: np.ndarray):
        pilImage = Image.fromarray(image)
        pilImage = pilImage.rotate(angle=angle)
        return np.array(pilImage)
    return rotate_image

### `flip_image_horizontally`
The function takes a NumPy array representing an image, converts it to a PIL Image, flips it horizontally using the transpose() method with the `FLIP_LEFT_RIGHT` transformation, and returns the flipped image as a NumPy array.

In [14]:
# Imports this cell needs
import numpy as np
from PIL import Image

def flip_image_horizontally(image: np.ndarray):
    pilImage = Image.fromarray(image)
    pilImage = pilImage.transpose(Image.FLIP_LEFT_RIGHT)
    return np.array(pilImage)
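
For an array laid out as (height, width, channels), a left-right flip simply reverses the column axis. The sketch below shows this directly in NumPy, without the round trip through PIL; the tiny `image` array is made-up test data.

```python
import numpy as np

# 2x3 RGB test image with distinct values in every position
image = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

# Reverse the column axis (axis 1); rows and color channels stay in place
flipped = image[:, ::-1]

print(image[0, 0], flipped[0, -1])  # the first column ends up as the last one
```

Flipping twice returns the original image, which makes the operation easy to sanity-check.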

### `change_to_grayscale`
The function accepts a NumPy array representing an image, converts it to a PIL Image, converts the color image to grayscale using the convert() method with the 'L' mode, and returns the resulting grayscale image as a NumPy array.

In [None]:
def change_to_grayscale(image: np.ndarray):
    pilImage = Image.fromarray(image)
    pilImage = pilImage.convert('L')
    return np.array(pilImage)
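
Pillow documents the 'L' conversion as the ITU-R 601-2 luma transform, L = R * 299/1000 + G * 587/1000 + B * 114/1000. The following minimal sketch applies that weighting in NumPy; `luma_grayscale` is a hypothetical helper, and its rounding may differ from Pillow's internal integer arithmetic by one gray level.

```python
import numpy as np

def luma_grayscale(rgb: np.ndarray) -> np.ndarray:
    """Weighted sum of the R, G and B channels of an (H, W, 3) uint8 array."""
    weights = np.array([0.299, 0.587, 0.114])
    return np.rint(rgb.astype(np.float64) @ weights).astype(np.uint8)

# One pixel each of pure red, green, blue and white
pixels = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 255, 255]]], dtype=np.uint8)
gray = luma_grayscale(pixels)
print(gray)  # green contributes most to perceived brightness, blue least
```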

### `apply_gauss_noise`
The function generates Gaussian noise with a mean of 50 and a standard deviation of 10 using NumPy's random.normal() function, shaped to match the input image, and casts it to `uint8`. The noise is then added to the original image using the OpenCV add() function, which saturates at 255 instead of wrapping around. Because the noise has a mean of 50 rather than 0, the resulting image is not only noisy but also noticeably brighter than the input.

In [None]:
def apply_gauss_noise(image: np.ndarray):
    gauss = np.random.normal(50, 10, image.shape).astype('uint8')
    return cv2.add(image, gauss)
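
The reason for adding the noise with `cv2.add` rather than `+` is saturation: plain `uint8` addition in NumPy wraps around modulo 256, which turns bright pixels dark. The sketch below contrasts the two behaviors using NumPy only; the clipped sum is a stand-in for what a saturating addition produces, and the sample values are made up.

```python
import numpy as np

pixels = np.array([10, 200, 250], dtype=np.uint8)
noise = np.array([50, 50, 50], dtype=np.uint8)

# Plain uint8 arithmetic wraps modulo 256: 250 + 50 becomes 44
wrapped = pixels + noise

# Saturating addition: compute in a wider type, then clip to 0..255
saturated = np.clip(pixels.astype(np.int16) + noise, 0, 255).astype(np.uint8)

print(wrapped)    # 60, 250, 44
print(saturated)  # 60, 250, 255
```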

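### `blur_edges`
The function detects edges in the grayscale version of the image with a horizontal and a vertical Sobel kernel, combines both responses into a gradient magnitude, and blends 20% of that edge map into the original image. Intermediate results, including a Laplacian of the grayscale image, are plotted for inspection.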

In [None]:
def blur_edges(image: np.ndarray):
    # edge detection
    grey_scale_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    horizontalEdgeImage = convolve2d(grey_scale_image, np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]]), mode='same', boundary='symm')
    verticalEdgeImage = convolve2d(grey_scale_image, np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]), mode='same', boundary='symm')
    plt.matshow(horizontalEdgeImage, cmap='gray')
    plt.show()
    plt.matshow(verticalEdgeImage, cmap='gray')
    plt.show()
    edge_image = pow(pow(horizontalEdgeImage, 2) + pow(verticalEdgeImage, 2), 0.5)
    plt.matshow(edge_image, cmap='gray')
    plt.show()

    plt.matshow(np.abs(edge_image), cmap='gray')
    plt.show()

    print(edge_image.shape)
    # Copy the single-channel edge map into all three color channels.
    # (np.tile followed by reshape would interleave neighboring pixels instead.)
    edge_image_all_colors = np.repeat(edge_image[:, :, np.newaxis], 3, axis=2)

    print(edge_image_all_colors.shape)
    # Blend 20% of the edge map into the image in floating point, then clip,
    # so negative differences and sums above 255 do not wrap around in uint8.
    image_edge_blur = np.clip(image + (edge_image_all_colors - image) * 0.2, 0, 255).astype('uint8')
    plt.matshow(image_edge_blur)
    plt.show()

    laplacian = cv2.Laplacian(grey_scale_image, cv2.CV_64F)
    plt.matshow(laplacian, cmap='gray')
    plt.show()

    plt.matshow(np.abs(laplacian), cmap='gray')
    plt.show()

    return image_edge_blur

### `create_add_black_squares`
The outer function, `create_add_black_squares`, takes two integer parameters, `square_size` and `number_of_squares`, and returns an inner function `add_black_squares`. When this inner function is called with a NumPy array representing an image, it draws `number_of_squares` black squares of side length `square_size` at random positions inside the image by setting the pixel values in each square to 0 (black). The squares are written directly into the input array, so the caller's image is modified in place. The result is finally clipped to the valid 0-255 range and returned as `uint8`.

In [None]:
def create_add_black_squares(square_size: int, number_of_squares: int):
    def add_black_squares(image: np.ndarray):
        for _ in range(number_of_squares):
            x = np.random.randint(0, image.shape[0] - square_size)
            y = np.random.randint(0, image.shape[1] - square_size)
            image[x:x + square_size, y:y + square_size] = 0  # 0 because squares are black

        # Ensure the image values stay within 0-255 range
        return np.clip(image, 0, 255).astype(np.uint8)

    return add_black_squares
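
A possible variant, sketched here under stated assumptions rather than taken from the project, works on a copy so the caller's image stays intact and accepts a `seed` so results are reproducible. The `seed` parameter and the use of `np.random.default_rng` are additions for this illustration.

```python
import numpy as np

def create_add_black_squares(square_size: int, number_of_squares: int, seed=None):
    # `seed` is a hypothetical extra parameter for reproducible positions
    rng = np.random.default_rng(seed)

    def add_black_squares(image: np.ndarray) -> np.ndarray:
        result = image.copy()  # leave the caller's array untouched
        rows, cols = result.shape[:2]
        for _ in range(number_of_squares):
            # The upper bound is exclusive, so +1 allows squares touching the border
            top = rng.integers(0, rows - square_size + 1)
            left = rng.integers(0, cols - square_size + 1)
            result[top:top + square_size, left:left + square_size] = 0
        return result

    return add_black_squares

# Usage: configure the filter once, then apply it like any image -> image function
white = np.full((32, 32, 3), 255, dtype=np.uint8)
occlude = create_add_black_squares(square_size=8, number_of_squares=3, seed=0)
occluded = occlude(white)
print((occluded[:, :, 0] == 0).sum(), "black pixels")
```

Returning a configured inner function gives every manipulation the same one-argument signature, which makes it easy to keep them in a list and apply them in turn.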

### `apply_bilateral_filter`
The function applies a bilateral filter to the input image using the OpenCV `bilateralFilter` function. The bilateral filter is a non-linear, edge-preserving smoothing filter that considers both spatial and intensity differences between pixels. The parameters used for this filter are set to 9 for the diameter of the pixel neighborhood, and 75 for both the color and spatial sigma values. These parameters control the extent of filtering in terms of pixel proximity and color similarity. The resulting filtered image is then returned as the output of the function.

In [None]:
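def apply_bilateral_filter(image):
    # d=9: diameter of the pixel neighborhood
    # sigmaColor=75: how different two colors may be and still be mixed
    # sigmaSpace=75: how far apart two pixels may be and still influence each other
    filtered_image = cv2.bilateralFilter(image, 9, 75, 75)
    return filtered_image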
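
The edge-preserving behavior follows from multiplying a spatial weight by a color-similarity weight. Below is a minimal one-dimensional sketch of that idea in NumPy; `bilateral_1d` and its parameter names are hypothetical and chosen for this illustration, and the code is not OpenCV's implementation.

```python
import numpy as np

def bilateral_1d(signal: np.ndarray, radius: int, sigma_space: float, sigma_color: float) -> np.ndarray:
    """Smooth a 1D signal; neighbors count less if they are far away or very different in value."""
    signal = signal.astype(np.float64)
    out = np.empty_like(signal)
    offsets = np.arange(-radius, radius + 1)
    spatial = np.exp(-(offsets ** 2) / (2 * sigma_space ** 2))
    for i in range(len(signal)):
        idx = np.clip(i + offsets, 0, len(signal) - 1)  # replicate border samples
        neighbors = signal[idx]
        color = np.exp(-((neighbors - signal[i]) ** 2) / (2 * sigma_color ** 2))
        weights = spatial * color
        out[i] = np.sum(weights * neighbors) / np.sum(weights)
    return out

# Slightly noisy step from about 10 to about 200
step = np.array([10, 12, 9, 11, 10, 200, 202, 199, 201, 200])
smoothed = bilateral_1d(step, radius=2, sigma_space=2.0, sigma_color=20.0)
print(np.round(smoothed, 1))
```

The small fluctuations on each side of the step are averaged out, while samples across the step differ so much in value that their weight is practically zero and the edge stays sharp.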