# Homework 4: Adversarial Models (100 points)

### Overview

Finally, we will review adversarial machine learning models, a powerful paradigm that encompasses multiple classes of predictive models. These models can be used for anything from image generation to data augmentation to text translation.

### Model Architecture

An adversarial model consists of two pieces: a generator and a discriminator. The generator produces some output (for example, an artificial image) from a given noise vector. The goal of the discriminator is then to differentiate real images from generated images.

The discriminator can be trained directly using ground-truth data, and the generator can then be trained in a feedback loop with the discriminator. In this way, the generator aims to fool the discriminator, and the discriminator aims to become more and more discerning.
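As a minimal sketch of this feedback loop, the snippet below evaluates the usual GAN losses for a few hypothetical discriminator outputs. The function names and the example scores are assumptions made for illustration; they are not part of the assignment.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy that the discriminator minimizes.

    d_real: outputs D(x) on real samples, each in (0, 1)
    d_fake: outputs D(G(z)) on generated samples, each in (0, 1)
    """
    real_term = sum(-math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(-math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Minimax generator loss: the mean of log(1 - D(G(z)))."""
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)

# Hypothetical scores: a discerning discriminator vs. a fooled one.
sharp_d = discriminator_loss(d_real=[0.9, 0.95], d_fake=[0.1, 0.05])
fooled_d = discriminator_loss(d_real=[0.6, 0.5], d_fake=[0.5, 0.6])
print("discerning D has lower loss:", sharp_d < fooled_d)

# The generator's loss drops as D(G(z)) moves toward 1.
print("fooling D lowers G's loss:",
      generator_loss([0.9, 0.8]) < generator_loss([0.1, 0.2]))
```

In an actual training loop the two losses would be minimized in alternation, each with respect to its own network's parameters.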

### Your Task

Complete the questions below. Note that while most of this assignment refers to GANs (Generative Adversarial Networks), the first question speaks to generative models as a broader model architecture rather than GANs specifically.

Below each question is a cell with the text “Type Markdown and LaTeX.” Double-click the cell and type your response to the question. Save your responses by clicking on the floppy disk icon or choosing File - Save and Checkpoint.

After responding to the questions, download your notebook as a `.html` file by choosing File - Download as - html (.html). You will be submitting this `.html` file to your instructor for grading.

## Homework Questions

### Question 1: Generative Models vs Discriminative Models (20 points)

What are the key differences between a generative model and a discriminative model from a statistical point of view? Explain them. 

**Answer:**

From a statistical point of view, a generative model learns the joint probability P(X,Y) (or P(X) alone when the data is unlabeled). It tries to explain how the data was generated by modeling the distribution of the dataset, and it is often used in unsupervised learning tasks. In a GAN, for example, the generator seeks to minimize the objective function so that D(G(z)) is close to 1, which means that the fake images it generates look real to the discriminator.

On the other hand, a discriminative model learns the conditional probability P(Y|X). It draws decision boundaries in the data space and is generally used in supervised learning tasks. In a GAN, for example, the discriminator tries to maximize the objective function so that D(x) is close to 1 (real images are recognized with a high level of confidence) and D(G(z)) is close to 0 (fake images are recognized with a high level of confidence).
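To make the distinction between P(X,Y) and P(Y|X) concrete, here is a small sketch that estimates both from counts. The toy dataset and the helper names `joint` and `conditional` are hypothetical and exist only for this illustration.

```python
from collections import Counter

# Hypothetical toy dataset of (feature x, label y) pairs.
data = [("sunny", "play"), ("sunny", "play"), ("sunny", "stay"),
        ("rainy", "stay"), ("rainy", "stay"), ("rainy", "play")]

pair_counts = Counter(data)
x_counts = Counter(x for x, _ in data)
n = len(data)

def joint(x, y):
    """Generative view: estimate P(X=x, Y=y) over the whole dataset."""
    return pair_counts[(x, y)] / n

def conditional(y, x):
    """Discriminative view: estimate P(Y=y | X=x) directly."""
    return pair_counts[(x, y)] / x_counts[x]

print("P(sunny, play) =", joint("sunny", "play"))         # 2 of 6 pairs
print("P(play | sunny) =", conditional("play", "sunny"))  # 2 of 3 sunny pairs

# The joint distribution describes all of the data, so it sums to 1.
print("sum of joint =", sum(joint(x, y) for (x, y) in pair_counts))
```

The joint estimate can be turned into the conditional one by dividing by P(X=x), but not the other way around, which is one way to see that the generative model captures more about the data.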

### Question 2: GAN Trade-offs (20 points)

What are the main advantages and disadvantages of GANs over standard machine learning models? Explain them.

**Answer:**

Advantages of GANs over standard machine learning models:

1. GANs can be used to generate realistic-looking data (images, text, video, audio, etc.) which are involved in a wide range of applications.

2. GANs can be used to augment data in innovative and unconventional ways, adding intricate nuances and diversity to homogeneous datasets.

3. Unlike many standard machine learning models, GANs can implicitly learn the distribution and the internal representations of the data, and can be trained using unlabeled data.

Disadvantages of GANs compared with standard machine learning models:

1. GANs are susceptible to mode collapse, in which the generator produces samples with very little variety or keeps generating the same images, which defeats the purpose of using it.

2. GANs are susceptible to vanishing gradients: when the discriminator works so well that the generator's gradient vanishes, the generator learns nothing.

3. GANs are susceptible to the problem of non-convergence of model parameters and are sensitive to hyperparameter selections, making them hard to train.
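The vanishing-gradient problem from item 2 can be illustrated numerically. This sketch assumes the minimax generator loss log(1 - D(G(z))) with D(G(z)) = sigmoid(logit); its derivative with respect to the logit is -sigmoid(logit). The function names and the logit values are chosen only for illustration.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def generator_loss_grad(logit):
    """Derivative of log(1 - sigmoid(logit)) with respect to the logit."""
    return -sigmoid(logit)

# As the discriminator grows more confident that a sample is fake
# (the logit becomes more negative), the gradient shrinks toward zero.
for logit in [0.0, -2.0, -5.0, -10.0]:
    d_out = sigmoid(logit)
    grad = generator_loss_grad(logit)
    print(f"D(G(z)) = {d_out:.5f}   gradient = {grad:.5f}")
```

With almost no gradient flowing back, the generator's parameters barely change, even though its samples are being rejected.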

### Question 3: Adversarial Attacks (20 points)

What is an adversarial attack on a machine learning model? Explain how it works.

**Answer:**

An adversarial attack is a technique of making small modifications to the input data in such a way that the machine learning model misclassifies it with high confidence. The input is modified in a manner that increases the model's loss. Attacks can be carried out and categorized in a variety of ways:

1. Fast Gradient Sign Method: this popular method involves computing the loss for an input, backpropagating to obtain the gradient of the loss with respect to the input (the pixels, if the input is an image), and nudging each pixel by a small step in the direction of the sign of its gradient, which increases the loss.

2. White-box Attack: an effective kind of attack in which the attacker knows the entire trained model, including the input features, the model architecture, and the model parameters.

3. Black-box Attack: generally less effective than a white-box attack and significantly harder to perform, since the internal gradients and the model parameters are unknown. Only the output confidence scores for each class or the predicted labels are available to the attacker.

4. Targeted Attack: in this method, the input data is perturbed specifically towards a defined target class y'.

5. Untargeted Attack: in this method, the pixel intensities are perturbed so that the confidence of the original class is lowered until it is no longer the largest in the prediction vector.
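The Fast Gradient Sign Method from item 1 can be sketched on a tiny logistic regression model, where the gradient of the loss with respect to the input has a closed form. The weights, the input, and the value of `epsilon` are hypothetical, and the sketch does not clip the perturbed input to a valid pixel range.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def predict(w, b, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(w, b, x, y, epsilon):
    """Return x + epsilon * sign(dLoss/dx) for binary cross-entropy loss."""
    p = predict(w, b, x)
    # For logistic regression, dLoss/dx_i = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    sign = [(g > 0) - (g < 0) for g in grad]
    return [xi + epsilon * s for xi, s in zip(x, sign)]

# Hypothetical trained weights and a correctly classified input (label 1).
w, b = [2.0, -1.0, 0.5], 0.0
x, y = [0.3, 0.1, 0.2], 1

x_adv = fgsm(w, b, x, y, epsilon=0.25)
print("P(class 1) before attack:", predict(w, b, x))
print("P(class 1) after attack: ", predict(w, b, x_adv))
```

This is a white-box, untargeted attack in the terms used above: it relies on the model's weights and only tries to push the prediction away from the true label.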

### Question 4: Retraining (20 points)

A company has found that their internal image recognition tool is producing empirically poor output for a set of images submitted by their users. Upon further investigation, it is determined that these poorly performing images are adversarial attacks designed to fool their image recognition software. However, the same set of images is reused by these attackers over and over. Given this set of successful adversarial attacks, how could this be mitigated in the short run? Will this work in the long run? Why or why not?

**Answer:**

In the short run, a denoising ensemble network could be developed to mitigate these attacks. The denoiser would remove any noise added to the input image (e.g. added Gaussian noise). The objective of this implementation would be to reduce the error between the reconstructed image and the original image during the training process, so that the denoised image is as close to the original uncorrupted image as possible. The next step would be to pass these denoised images through a verification ensemble which tries to classify them as "denoised" or "noisy". This provides an extra layer of confidence in the supposed corruption level of the images, ensuring that only uncorrupted or denoised images move forward in the pipeline. However, this only works for a particular type of noise. To remove other kinds of noise from the images, the model would have to be retrained to account for them.

Another short-term solution might be to generate a large number of adversarial examples and train the model to resist them, a technique known as Adversarial Training. However, this is unlikely to work in the long run, since attackers can craft new adversarial examples that the model has not been trained on.

A better solution which would likely work in the long run would be the technique of Defensive Distillation, which involves training a second model whose decision surface is smoothed in the directions an adversarial attack would try to exploit, making it difficult for the attackers to discover adversarial input tweaks that lead to poor model performance. The reported reason this works is that the second model is trained on the “soft” probability outputs of the original model rather than the “hard” (0/1) true labels from the training data.
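The difference between "hard" and "soft" labels can be sketched with a softmax. The sketch assumes that the soft labels come from a softmax with a raised temperature, as is commonly done in distillation. The logits and the temperature value are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature; higher temperatures give softer outputs."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits from the original model for one training image.
teacher_logits = [8.0, 2.0, 0.5]
hard_label = [1, 0, 0]  # the 0/1 true label

soft_t1 = softmax(teacher_logits, temperature=1.0)
soft_t20 = softmax(teacher_logits, temperature=20.0)

print("hard label:       ", hard_label)
print("soft labels, T=1: ", [round(p, 4) for p in soft_t1])
print("soft labels, T=20:", [round(p, 4) for p in soft_t20])
```

The second model would then be trained to match the softened probabilities, which also carry information about how similar the classes are to one another.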

### Question 5: Applications (20 points)

What are potential applications of GANs? Research and briefly present three machine learning innovations that used GANs.

**Answer:**

GANs find varied and extensive use in many interesting domains, many of them involving images, video, audio, and text. Some of them are:

1. Image-to-Image Translation: GANs have found a wide range of applications in the image space, including the translation of paintings and sketches to photographs and vice versa, the translation of satellite photographs to Google Maps, the translation of black and white photographs to color, etc. These tasks are detailed in a 2016 paper by Phillip Isola et al. presenting their "pix2pix" approach, which uses conditional adversarial networks (termed "Conditional GANs" or "cGANs").

2. Text-to-Image Translation: GANs can generate realistic-looking 256x256 photographs from textual descriptions of simple objects like birds and flowers. This was demonstrated in a 2016 paper by Han Zhang et al. detailing the "StackGAN", which stacks a "Stage I GAN" and a "Stage II GAN" on top of each other and adds a Conditioning Augmentation technique that encourages smoothness in the latent conditioning manifold to improve the diversity of the synthesized images and to stabilize training.

3. Video Prediction: GANs have been used to predict video frames from static elements of the scene, as detailed in a 2016 paper by Carl Vondrick et al. The paper describes a GAN with a spatio-temporal convolutional architecture that untangles the scene’s foreground from the background and can predict up to a second of video frames with some success.