# Warning! Advanced Local Set-up Required!

We've had some success by setting up a local environment with Python 3.7, `tensorflow==1.15.0`, and appropriate versions of `tensorflow-hub`, etc., and removing the `compat.v1` and `.disable_v2_behavior()` commands. The 'Module Install' cell block lists many of the libraries and versions needed.

Google Colab doesn't allow you to set up earlier versions of Python, so that is likely not a viable route.

# Image GANs for Generating Dogs 
We've been learning about AI-generated images in our module on Vision, Images, and Art. Many computer-generated images, like deep fake videos, leverage something called Generative Adversarial Networks (GANs). In this homework, we explore using existing GAN libraries for image generation!

Here you will be using a generative adversarial network trained on [ImageNet](http://www.image-net.org/), which we've discussed in lecture previously. Although there are many non-dog classes in ImageNet, we will be focusing our experiments on dog photos. A high quality GAN is tricky to design well and takes dozens of hours or days to train, so to save ourselves computation time, we are using a pre-trained GAN with [TensorFlow](https://www.tensorflow.org/).

This activity is largely meant to familiarize you with some of what GANs are capable of doing, give you experience with existing GAN libraries, and let you practice picking up some new Python modules on your own.

# Part 1: Setup (6% effort)
Your goal for this setup part is simply to successfully run all the code below. This can be frustratingly difficult to set up in a Jupyter notebook, so **use Google Colab** for this assignment. Give yourself some credit if you manage to get the modules up and working!

This assignment requires a bit of funky module installation. Remember: the course discussion forum and Student Help Hours are your friends if you hit installation errors. Don't suffer alone!

We **HIGHLY** recommend that you read the set-up instructions detailed in the GitHub README for this notebook!

### Module Install

This code uses TensorFlow version 2+ (run in version 1 compatibility mode) and Python 3.8+, along with a few other modules. If you're using Google Colab, you probably don't have to install them. The comments in the code below show one possible way to set up a conda environment to run the notebook locally, although Google Colab should hopefully be the simpler route.



In [1]:
import sys
print("\nPYTHON VERSION:", sys.version)

""" # Possible commandline set-up for conda + running notebook locally 
# May not be necessary with the last few lines in this cell!
conda create -n assignment5 python==3.7.15
conda activate assignment5
conda install ipykernel --update-deps --force-reinstall
conda install tensorflow-estimator=1.15
conda install tqdm
conda install -c conda-forge opencv
conda install scikit-image
conda install requests
conda install tensorflow-hub=0.8
"""

# Using TensorFlow v2 that's compatible with v1
import tensorflow.compat.v1 as tf
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR) # only show error-level log messages
tf.disable_v2_behavior()
print("\nTENSORFLOW VERSION:", tf.__version__)


PYTHON VERSION: 3.8.18 (default, Sep 11 2023, 08:28:20) 
[Clang 14.0.6 ]


2024-04-29 15:12:53.128110: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.



TENSORFLOW VERSION: 2.13.1


### Environment Setup 

Now we'll import what we need from the modules we've installed.

_Hint_: If you're getting errors on the cell below, you're missing modules! Take note of which ones you need to install, close this notebook, and view the GitHub README for set-up instructions!
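
If you'd rather see every missing module at once instead of hitting import errors one at a time, here is a minimal sketch that uses only the standard library. The helper name `find_missing` is made up for illustration, and the module list is simply copied from the import cell below. Keep in mind that the import name is not always the install name (for example, `skimage` is installed as `scikit-image`, `cv2` as `opencv-python`, and `PIL` as `pillow`).

```python
import importlib.util

# Top-level import names used by this notebook's import cell.
REQUIRED_MODULES = ["numpy", "scipy", "skimage", "PIL", "requests",
                    "tensorflow", "tensorflow_hub", "tqdm", "cv2"]

def find_missing(module_names):
    """Return the names that cannot be found in the current environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]

missing = find_missing(REQUIRED_MODULES)
print("Missing modules:", missing if missing else "none")
```

This only checks that a module can be located, not that its version is compatible, so the README is still the reference for versions.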

In [None]:
from io import BytesIO
import IPython.display
import numpy as np
import urllib
import PIL.Image
from scipy.stats import truncnorm
from skimage import io, data, transform 
import requests
# Using TensorFlow v 1
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
# End tf old school
from tensorflow.python.framework import ops
import tensorflow_hub as hub #pip install tensorflow_hub
import scipy.misc
from tqdm import tqdm
import random
import cv2 # pip install opencv-python

### BigGAN Model Setup

For this exercise, we will use a pre-trained model called _BigGAN_ generator available on [TensorFlow Hub](https://tfhub.dev/deepmind/biggan-128/2). For more information about this model, check out the authors' paper [_"Large Scale GAN Training for High Fidelity Natural Image Synthesis"_ Brock et al. 2019](https://arxiv.org/abs/1809.11096). Next is the address to download this model from TensorFlow Hub:

In [None]:
# this model will output 128 by 128 pixel images.
module_path ='https://tfhub.dev/deepmind/biggan-128/2'  
print(module_path)

### Helper Code Setup

The code below is adapted from the [BigGANs Tutorial](https://colab.research.google.com/drive/1rqDwIddy0eunhhV8yrznG4SNiB5XWFJJ) from [Machine Learning for Artists](https://ml4a.github.io/) by Gene Kogan. Many of the exercises here are inspired by that tutorial, so check it out if you want to have more fun with GANs later!

In [None]:
''' Better code design would dictate placing this code in 'helpers.py' and then importing the file, 
    but to make our GOOGLE COLAB life easier, I've included the helper code in this cell below.
    This makes it so we can run this notebook without uploading *any* external files!
    There is no need for you to try and understand what this cell block is doing! 
    Just run it so you can use the helpers.py utilities.
'''
"""
Helper code from the BigGAN tutorial: https://colab.research.google.com/drive/1rqDwIddy0eunhhV8yrznG4SNiB5XWFJJ

Adapted for Human-AI Interaction class.

"""

from io import BytesIO
import IPython.display
import numpy as np
import urllib
import PIL.Image
from scipy.stats import truncnorm
from skimage import io, data, transform  # pip install scikit-image
import requests
# Using TensorFlow v 1
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
# End tf old school
from tensorflow.python.framework import ops
import tensorflow_hub as hub
import scipy.misc
from tqdm import tqdm
#from random import random
import cv2  # pip install opencv-python

class GANSession:
    def __init__(self, module_path):
        ops.reset_default_graph()
        print('Loading BigGAN module from:', module_path)
        self.module = hub.Module(module_path) # this line currently throws an error
        self.inputs = {k: tf.placeholder(v.dtype, v.get_shape().as_list(), k)
                       for k, v in self.module.get_input_info_dict().items()}
        self.output = self.module(self.inputs)

        print("\n")
        print('Inputs:\n', '\n'.join('  {}: {}'.format(*kv)
                                     for kv in self.inputs.items()))
        print("\n")
        print('Output:', self.output)

        self.input_z = self.inputs['z']
        self.input_y = self.inputs['y']
        self.input_trunc = self.inputs['truncation']

        self.dim_z = self.input_z.shape.as_list()[1]
        self.vocab_size = self.input_y.shape.as_list()[1]

        # Create a TensorFlow session and initialize variables
        initializer = tf.global_variables_initializer()
        self.sess = tf.Session()
        self.sess.run(initializer)

    def truncated_z_sample(self, batch_size, truncation=1., seed=None):
        state = None if seed is None else np.random.RandomState(seed)
        values = truncnorm.rvs(-2, 2, size=(batch_size,
                                            self.dim_z), random_state=state)
        return truncation * values

    def one_hot(self, index, vocab_size=None):
        if(not vocab_size):
            vocab_size = self.vocab_size
        index = np.asarray(index)
        if len(index.shape) == 0:
            index = np.asarray([index])
        assert len(index.shape) == 1
        num = index.shape[0]
        output = np.zeros((num, vocab_size), dtype=np.float32)
        output[np.arange(num), index] = 1
        return output

    def one_hot_if_needed(self, label, vocab_size=None):
        if(not vocab_size):
            vocab_size = self.vocab_size
        label = np.asarray(label)
        if len(label.shape) <= 1:
            label = self.one_hot(label, vocab_size)
        assert len(label.shape) == 2
        return label

    def sample(self, noise, label, truncation=1., batch_size=8, vocab_size=None):
        sess = self.sess
        if(not vocab_size):
            vocab_size = self.vocab_size
        noise = np.asarray(noise)
        label = np.asarray(label)
        num = noise.shape[0]
        if len(label.shape) == 0:
            label = np.asarray([label] * num)
        if label.shape[0] != num:
            raise ValueError('Got # noise samples ({}) != # label samples ({})'
                             .format(noise.shape[0], label.shape[0]))
        label = self.one_hot_if_needed(label, vocab_size)
        ims = []
        for batch_start in tqdm(range(0, num, batch_size)):
            s = slice(batch_start, min(num, batch_start + batch_size))
            feed_dict = {self.input_z: noise[s],
                         self.input_y: label[s], self.input_trunc: truncation}
            ims.append(sess.run(self.output, feed_dict=feed_dict))
        ims = np.concatenate(ims, axis=0)
        assert ims.shape[0] == num
        ims = np.clip(((ims + 1) / 2.0) * 256, 0, 255)
        ims = np.uint8(ims)
        return ims

    def interpolate(self, A, B, num_interps):
        alphas = np.linspace(0, 1, num_interps)
        if A.shape != B.shape:
            raise ValueError(
                'A and B must have the same shape to interpolate.')
        return np.array([(1-a)*A + a*B for a in alphas])

    def imgrid(self, imarray, cols=5, pad=1):
        if imarray.dtype != np.uint8:
            raise ValueError('imgrid input imarray must be uint8')
        pad = int(pad)
        assert pad >= 0
        cols = int(cols)
        assert cols >= 1
        N, H, W, C = imarray.shape
        rows = int(np.ceil(N / float(cols)))
        batch_pad = rows * cols - N
        assert batch_pad >= 0
        post_pad = [batch_pad, pad, pad, 0]
        pad_arg = [[0, p] for p in post_pad]
        imarray = np.pad(imarray, pad_arg, 'constant', constant_values=255)
        H += pad
        W += pad
        grid = (imarray
                .reshape(rows, cols, H, W, C)
                .transpose(0, 2, 1, 3, 4)
                .reshape(rows*H, cols*W, C))
        if pad:
            grid = grid[:-pad, :-pad]
        return grid

    def interpolate_and_shape(self, A, B, num_samples, num_interps):
        interps = self.interpolate(A, B, num_interps)
        return (interps.transpose(1, 0, *range(2, len(interps.shape)))
                .reshape(num_samples * num_interps, -1))

    def get_interpolated_yz(self, categories_all, num_interps, noise_seed_A, noise_seed_B, truncation):
        nt = len(categories_all)
        num_samples = 1
        z_A, z_B = [self.truncated_z_sample(num_samples, truncation, noise_seed)
                    for noise_seed in [noise_seed_A, noise_seed_B]]
        y_interps = []
        for i in range(nt):
            category_A, category_B = categories_all[i], categories_all[(
                i+1) % nt]
            y_A, y_B = [self.one_hot([category] * num_samples)
                        for category in [category_A, category_B]]
            y_interp = self.interpolate_and_shape(
                np.array(y_A), np.array(y_B), num_samples, num_interps)
            y_interps.append(y_interp)
        y_interp = np.vstack(y_interps)
        z_interp = self.interpolate_and_shape(
            z_A, z_B, num_samples, num_interps * nt)

        return y_interp, z_interp

    def get_transition_yz(self, classes, num_interps, truncation):
        noise_seed_A, noise_seed_B = 10, 20   # fix this!
        return self.get_interpolated_yz(classes, num_interps, noise_seed_A, noise_seed_B, truncation=truncation)

    def get_random_yz(self, num_classes, num_interps, truncation):
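        # Pick random ImageNet class indices in 0-999 (np.random avoids the missing `random()` import)
        random_classes = [np.random.randint(0, 1000) for _ in range(num_classes)]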
        return self.get_transition_yz(random_classes, num_interps, truncation=truncation)

    def get_combination_yz(self, categories, noise_seed, truncation):
        z = np.vstack([self.truncated_z_sample(1, truncation, noise_seed)]
                      * (len(categories)+1))
        y = np.zeros((len(categories)+1, 1000))
        for i, c in enumerate(categories):
            y[i, c] = 1.0
            y[len(categories), c] = 1.0
        return y, z

    def slerp(self, A, B, num_interps):  # see https://en.wikipedia.org/wiki/Slerp
        # each unit step tends to be a 90 degree rotation in high-D space, so this is ~360 degrees
        alphas = np.linspace(-1.5, 2.5, num_interps)
        omega = np.zeros((A.shape[0], 1))
        for i in range(A.shape[0]):
            tmp = np.dot(A[i], B[i]) / \
                (np.linalg.norm(A[i])*np.linalg.norm(B[i]))
            omega[i] = np.arccos(np.clip(tmp, 0.0, 1.0))+1e-9
        return np.array([(np.sin((1-a)*omega)/np.sin(omega))*A + (np.sin(a*omega)/np.sin(omega))*B for a in alphas])

    def slerp_and_shape(self, A, B, num_interps):
        interps = self.slerp(A, B, num_interps)
        return (interps.transpose(1, 0, *range(2, len(interps.shape)))
                .reshape(num_interps, *interps.shape[2:]))

    def imshow(self, a, format='png', jpeg_fallback=True):
        a = np.asarray(a, dtype=np.uint8)
        str_file = BytesIO()
        PIL.Image.fromarray(a).save(str_file, format)
        png_data = str_file.getvalue()
        try:
            disp = IPython.display.display(IPython.display.Image(png_data))
        except IOError:
            if jpeg_fallback and format != 'jpeg':
                # Format the message before printing (print() returns None in Python 3)
                print('Warning: image was too large to display in format "{}"; '
                      'trying jpeg instead.'.format(format))
                return self.imshow(a, format='jpeg')
            else:
                raise
        return disp
    
# You can safely ignore the tensorflow WARNING outputs
gan = GANSession(module_path)
print("(This cell takes 1-5 mins to run on Colab.)")

# Part 2: Experimenting with generating deep fake puppies

### Task 2.A (2% effort) Choose a dog breed from `dog_classes.txt`, replacing the "263...'corgi" below.
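
The cell below turns the category string into a class number by taking everything before the first colon. Here is a small sketch of just that step, assuming the lines in `dog_classes.txt` follow the same `index: 'name'` format as the corgi example (the function name `class_index` is made up for illustration):

```python
def class_index(category):
    """Extract the numeric class index from a string like "263: 'name'"."""
    # Everything before the first colon is the ImageNet class number.
    return int(category.split(':')[0])

category = "263: 'Pembroke, Pembroke Welsh corgi'"
print(class_index(category))  # 263
```

So when you swap in a new breed, the only part the code actually uses is the number at the front of the string.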

In [None]:
## Task 2.A: Modify this code!
truncation = random.uniform(0.02,1) # min:0.02, max:1
noise_seed = random.randint(0,100) # min:0, max:100
category = "263: 'Pembroke, Pembroke Welsh corgi'" # put dog breed here

num_samples = 1 # Number of images to generate with these parameters

# returns an ndarray of shape (num_samples, dim_z); each row is a noise vector used to generate one image
z = gan.truncated_z_sample(num_samples, truncation, noise_seed) 
y = int(category.split(':')[0])

print('truncation: ',truncation, 'noise seed:',noise_seed)

ims = gan.sample(z, y, truncation=truncation) # ims is a numpy array
max_columns = 20
gan.imshow(gan.imgrid(ims, cols=min(num_samples, max_columns)))

### Task 2.B (7% effort) Initial Explorations
Run the above image generator cell several times. Experiment! Find your best looking result and place the code in the cell block below. Find your worst looking result, and place it in the cell block after the best one. Be sure to run the cells again, so the images are generated! 

Describe what you are seeing: what kinds of visual errors do you think this model is creating? 

**ANSWER:** _Double click this text to write your answer to the question here._

In [None]:
# BEST LOOKING RESULT (code):

In [None]:
# WORST LOOKING RESULT (code):

### Task 2.C (5% effort) Generate a row of dogs with different noise_seeds
The code below generates two images with the same `truncation` value, but with two different `noise_seed`s. Adapt this code so that it generates a row of 11 dog images, where each dog has a different value of the parameter `noise_seed`, evenly distributed across the range 0 to 100 inclusive (i.e., 0, 10, 20, ..., 100).

For a Pembroke Welsh Corgi with a fixed `truncation` value of `0.95`, this looks like the file included in the [GitHub repository: `2c_corgi-noise-seeds.png`](https://github.com/UberHowley/haii-a5/blob/main/2c_corgi-noise-seeds.png), where the leftmost Corgi has a `noise_seed` of `0` and the rightmost has a `noise_seed` of `100`.
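
If you're wondering why the reference image can be reproduced at all, it helps to see what a seed does to a random number generator. The sketch below is a simplification: the helper code draws from a truncated normal via `scipy`, while this version uses a plain normal from NumPy, and the names `noise_vector` and `dim` are made up for illustration.

```python
import numpy as np

def noise_vector(seed, dim=4):
    """Draw a small noise vector from a generator seeded with `seed`."""
    state = np.random.RandomState(seed)
    return state.standard_normal(dim)

same_a = noise_vector(0)
same_b = noise_vector(0)
other = noise_vector(100)

# The same seed reproduces the same vector; a different seed gives a new one.
print(np.array_equal(same_a, same_b))
print(np.array_equal(same_a, other))
```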

In [None]:
## Task 2.C: Modify this code!
truncation = 0.95 # min:0.02, max:1
category = "263: 'Pembroke, Pembroke Welsh corgi'" # put dog breed here

num_samples = 1
z_list = []

# Grab the first element returned by truncated_z_sample
z_item = gan.truncated_z_sample(num_samples, truncation, 0)
# Add that first element to our list of np.ndarray
z_list.append(z_item[0])
# Repeat
z_item = gan.truncated_z_sample(num_samples, truncation, 100)
z_list.append(z_item[0])

z = np.asarray(z_list) # convert our list into an np.ndarray
y = int(category.split(':')[0])

ims = gan.sample(z, y, truncation=truncation)
gan.imshow(gan.imgrid(ims,cols=11))

### Task 2.D (12% effort) Generate a grid of dogs, varying noise_seed and truncation
Take the code from 2.C above and copy it below. Now your target output is a _grid_ of dogs. Just as in 2.C, each _row_ holds 11 generated dogs ordered by `noise_seed` from 0 to 100. Moving down the _columns_ of the grid, the `truncation` parameter changes instead, in an ordered range from 0.02 to 1.0. Your finished grid should have 11 rows and 11 columns: the top-left dog has `noise_seed = 0` and `truncation = 0.02`, `noise_seed` increases from left to right, `truncation` increases from top to bottom, and the bottom-right dog has `noise_seed = 100` and `truncation = 1.0`.

If you're unsure whether you've got the right grid, you can see what it should look like for corgis in the [GitHub repository: `2d_corgi-dog-grid.png`](https://github.com/UberHowley/haii-a5/blob/main/2d_corgi-dog-grid.png).
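
When building a grid, it helps to know where each image in your batch ends up. The sketch below mirrors the reshape and transpose steps inside `gan.imgrid` (with padding left out for simplicity) on six tiny made-up "images", each filled with its own index so you can spot it in the output. The sizes here are arbitrary illustration values, not the real image sizes.

```python
import numpy as np

N, H, W, C = 6, 2, 2, 1   # six tiny 2x2 single-channel "images"
cols = 3
rows = int(np.ceil(N / cols))

# Fill image i entirely with the value i so it can be located in the grid.
images = np.stack([np.full((H, W, C), i, dtype=np.uint8) for i in range(N)])

# Same reshape/transpose sequence as imgrid, without the padding step.
grid = (images
        .reshape(rows, cols, H, W, C)
        .transpose(0, 2, 1, 3, 4)
        .reshape(rows * H, cols * W, C))

print(grid[:, :, 0])
```

Images fill the grid row by row: image `i` lands in row `i // cols` and column `i % cols`, so the order of your noise vectors decides the layout.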

In [None]:
## Task 2.D: Add new code here!

### Task 2.E (12% effort) Hypothesize the relationship between noise_seed and truncation. 
Given your experiments above and your grid of dogs, hypothesize:
 - What might be the relationship between `noise_seed` and what a generated dog looks like?
 - What might be the relationship between `truncation` and what a generated dog looks like?

**ANSWER:** _Double click this text to write your answer to the question here._

# Part 3: Experimenting with transforming puppies

In this section, we will experiment with different forms of interpolation to transform and combine different dog images.
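
As a warm-up, here is what plain linear interpolation between two vectors looks like. This is a sketch in the spirit of the `interpolate` helper defined earlier, run on two tiny made-up vectors; the name `lerp` and the values of `A` and `B` are just for illustration.

```python
import numpy as np

def lerp(A, B, num_interps):
    """Return num_interps evenly spaced blends from A (first) to B (last)."""
    alphas = np.linspace(0, 1, num_interps)
    return np.array([(1 - a) * A + a * B for a in alphas])

A = np.array([0.0, 0.0])
B = np.array([1.0, 2.0])
steps = lerp(A, B, 3)
print(steps)  # first row is A, last row is B, middle row is halfway between
```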

### Task 3.A (10% effort) Compare truncation and noise_seed across breeds
Modify and run the code below, putting in your favorite `truncation` and `noise_seed` values from your experiments in Part 2. Run the code with a pair of closely similar breeds (similar to the Pembroke and the Cardigan Welsh corgis shown below) and a pair of very different breeds.
 - What do you hypothesize this code is doing with the two dog image samples? Where is `noise_seed_A`'s impact appearing? What about `noise_seed_B`? What is being `interpolated`?
 - Hypothesize: Does the performance (visual quality of the output) differ based on qualities of the source images such as similar dog breed or similar image background?

**ANSWER:** _Double click this text to write your answer to the question here._

In [None]:
## Task 3.A: Modify this code!
num_interps = 10 # min:2, max:1000
truncation = 0.46 # min:0.02, max:1
noise_seed_A = 2 # min:0, max:100
category_A = "263: 'Pembroke, Pembroke Welsh corgi'" 
noise_seed_B = 99 # min:0, max:100
category_B = "264: 'Cardigan, Cardigan Welsh corgi'" 

y_interp, z_interp = gan.get_interpolated_yz([int(category_A.split(':')[0]), int(category_B.split(':')[0])], num_interps, noise_seed_A, noise_seed_B, truncation=truncation)
imgs = gan.sample(z_interp, y_interp, truncation=truncation)
gan.imshow(gan.imgrid(imgs, cols=num_interps))

### Task 3.B (8% effort) Different truncations for different dalmatians
The code below performs an operation that is similar to, but different from, the code in Task 3.A.

Modify and run the code below, putting in your favorite `truncation` and `noise_seed` values from your experiments in Part 2. Run the code with a pair of closely similar breeds (similar to the Pembroke and the Cardigan Welsh corgis shown below) and a pair of very different breeds.
 - What do you hypothesize this code is doing with the two dog image samples?
 - Hypothesize: Does the performance (visual quality of the output) differ based on qualities of the source images such as similar dog breed or similar image background?
 - Unlike in Task 3.A, both images share the same `noise_seed`. From your observations, what do you think is the effect of having the images share the same parameter value versus having each their own `noise_seed`?
 - What might be the difference between `interpolated` and `combination`?

**ANSWER:** _Double click this text to write your answer to the question here._

In [None]:
## Task 3.B: Modify this code!
truncation = 0.45 # min:0.02, max:1
noise_seed = 22 # min:0, max:100
categoryA = "263: 'Pembroke, Pembroke Welsh corgi'" 
categoryB = "264: 'Cardigan, Cardigan Welsh corgi'" 

categories = [int(categoryA.split(':')[0]), int(categoryB.split(':')[0])]
y, z = gan.get_combination_yz(categories, noise_seed, truncation)
imgs = gan.sample(z, y, truncation=truncation)
gan.imshow(gan.imgrid(imgs, cols=len(categories)+1))

### Task 3.C (5% effort) Generate your best looking result. 
Run the above cell several times. Experiment! Find your best looking result and place the code in the cell block below. Be sure to run the cell again, so the images are generated.

In [None]:
# BEST LOOKING RESULT (code):

### Task 3.D (5% effort) reallyBigGANs
A big advantage of the BigGAN model over other GANs is that it is able to produce much higher resolution images than GANs were previously capable of. To test this out, run the cell below. This will load a version of BigGAN that can generate 512 by 512 pixel images instead of 128 by 128 pixels.

Now re-run the code for Task 3.B in some exploratory experiments, and store the code that generates your best looking result in the code cell below. Be sure to run the cell so that the image is generated. _Note: since this generates a larger image, it will take longer to compute._
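
To get a feel for how much bigger the output is, here is the pixel arithmetic for the two image sizes mentioned above. Treat it as a rough intuition only: pixel count is an assumption-laden proxy, and actual run time also depends on the model and your hardware.

```python
small_side, big_side = 128, 512

small_pixels = small_side * small_side   # pixels in a 128 by 128 image
big_pixels = big_side * big_side         # pixels in a 512 by 512 image
ratio = big_pixels // small_pixels

print(small_pixels, big_pixels, ratio)  # 16384 262144 16
```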

In [None]:
# Create a new GAN Session using our BIG images
module_path_big ='https://tfhub.dev/deepmind/biggan-512/2'  
gan = GANSession(module_path_big)

In [None]:
# BEST LOOKING RESULT (code):

### Task 3.E (5% effort) BigGAN vs not-so-bigGAN
Compare your best results from Task 3.C and 3.D. Hypothesize, what do you think the effect of a higher resolution GAN is on the image quality?

**ANSWER:** _Double click this text to write your answer to the question here._

# Part 4: Beyond Dogs

In this section, we examine our image dataset more closely.

### Task 4.A (1% effort) Compare noise_seed across a new category
ImageNet (where, unfortunately, we get our dog breed images) has many other classes, listed in `imagenet_classes-all.txt`. Choose one of these categories, and explore the impact `noise_seed` has on the output (i.e., see Task 2.C).

In [None]:
# Save some time, let's go back to the smaller image set
module_path ='https://tfhub.dev/deepmind/biggan-128/2'  
gan = GANSession(module_path)

In [None]:
# Task 4.A: Add new code here!

### Task 4.B (1% effort) Compare truncation and noise_seed across categories
Grab your code from Task 3.A and explore how `noise_seed` and `truncation` impact the transition between two different categories.

In [None]:
# Task 4.B: Add new code here!

### Task 4.C (2% effort) Mad Scientist Time!
Convert your code from Task 3.B into a function, `generate_combo`, that lets you easily combine two classes into one. Explore! Experiment! What's happening?!

In [None]:
# Task 4.C: Complete code below:

def generate_combo(categoryA, categoryB, noise_seed, truncation):
    pass # replace this with your code!

# Calls to generate_combo below:
generate_combo("386: 'African elephant, Loxodonta africana'", "881: 'upright, upright piano'", 48, 0.99)  # sample call

### Task 4.E (9% effort) ImageNet and the Unbearable Weight of Massive Methodological Issues

This is not our first time encountering ImageNet and its multitudinous problems! Can you recall the previous times ImageNet was brought up? Answer the following questions below (~3 sentences each):
 - Read the [Excavating AI](https://excavating.ai/) article and summarize its main points. 
 - Did anything in the article surprise you? What do you disagree with? 
 - How do the problems discussed in the [Excavating AI](https://excavating.ai/) article intersect with the ImageNet classes in our `imagenet_classes-all.txt` file? 
 - Is it _wrong_ for the ImageNet authors to censor their dataset? What ethical principles does ImageNet support or contradict?
 - Is it okay to use datasets that conflict with our values: privacy, morals, social good, equity, etc.? Connect this argument with at least 2 sources from across the semester.

**ANSWER (summary):** _Double click this text to write your answer to the question here._

**ANSWER (surprised/disagree/agree):** _Double click this text to write your answer to the question here._

**ANSWER (apply to imagenet_classes-all.txt):** _Double click this text to write your answer to the question here._

**ANSWER (ethical principles):** _Double click this text to write your answer to the question here._

**ANSWER (connect with 2+ sources):** _Double click this text to write your answer to the question here._

# Assignment Submission

Once you've completed all of the above, you're done with this assignment! As always, clean up your code and ensure your entire Python notebook runs before submitting: Iris must be able to run your notebook on her machine.

Once you think everything is set, go to `File > Download > Download as .ipynb` and please change the filename of your notebook to `[yourunixID]_haiiYY[assignmentnumber].ipynb`, e.g., `ikh1_haii17a5.ipynb`, and then submit the file on Glow.