# What makes art, art? (Models of Mind and Brain, Final Project)
### Hannah Paris Cowley

In [17]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as im
import random
import os

## Introduction:

### Question: What _really_ goes into the works of famous artists? 
When looking at a piece of art, it's hard to put a finger on what about the piece we find enjoyable, moving, or disturbing. When analyzing the work of artists, there's more that goes into their work than analyses of brushstroke patterns and color palettes. Using generative models, I've decided to look into the key features of the artwork of famous artists, hoping to uncover how our minds might generalize the works of Monet. This will help answer a rather complicated question, among other things: If I go to a museum and tell you that I saw a great Monet painting today, what mental image do you get?

### DCGAN: Deep Convolutional Generative Adversarial Network
I chose to use a generative model to distill the key features of artwork. By using a generative model, I can look at the "artwork" the computer generates to assess how the network has generalized features of the training artwork.

A DCGAN is a generative model that uses unsupervised learning in training. Its specific architecture has made it relatively successful in learning from unlabeled data (Radford, Metz, & Chintala, 2015). The network consists of multiple convolutional layers and relies on the generative adversarial network (GAN) approach for training the network. 
![DCGAN diagram](./DCGAN.png)
(_DCGAN diagram courtesy of https://github.com/carpedm20/DCGAN-tensorflow_)


Because the generated images have no explicit target other than looking "Monet-ish", one might think the model would be impossible to train properly! However, the GAN approach has a trick that makes it possible. Alongside the generator network that produces images, a discriminator network is trained. The discriminator takes in both training images and the generator's output images and asks, "Was this image generated, or is it a training image?" The error from the _discriminator_ is then backpropagated through the discriminator _and_ the generator: the discriminator is updated to get better at telling the two kinds of image apart, while the generator is updated so that the discriminator is more likely to confuse training and generated images on the next step (https://blog.openai.com/generative-models/). Essentially, we want our generated Monet images to become indistinguishable from our real Monet images. Of course, in practice this won't fully happen. But what's interesting is that our DCGAN should generalize from our training data, thus answering (in part) my question of what makes a Monet a Monet!
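To make the tug-of-war a little more concrete, here is a minimal NumPy sketch of the two loss terms in the commonly used (non-saturating) GAN formulation. It is not taken from the DCGAN implementation used in this project; the function names (`bce`, `discriminator_loss`, `generator_loss`) and the toy discriminator outputs are assumptions made up for illustration.

```python
import numpy as np

def bce(probs, targets, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    probs = np.clip(probs, eps, 1 - eps)  # avoid log(0)
    return float(-np.mean(targets * np.log(probs)
                          + (1 - targets) * np.log(1 - probs)))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants D(training image) -> 1 and D(generated image) -> 0."""
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

def generator_loss(d_fake):
    """The generator wants the discriminator to answer 1 ("real") on generated images."""
    return bce(d_fake, np.ones_like(d_fake))

# Hypothetical discriminator outputs: probability that each image is a training image
d_real = np.array([0.9, 0.8, 0.95])  # scores on training images
d_fake = np.array([0.1, 0.2, 0.05])  # scores on generated images

print("discriminator loss:", discriminator_loss(d_real, d_fake))
print("generator loss:    ", generator_loss(d_fake))
```

With these toy numbers the discriminator is doing well (low loss) and the generator is doing badly (high loss). As the generator improves, the scores in `d_fake` should drift upward, which lowers the generator loss and raises the discriminator loss.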

I specifically chose a DCGAN for a few reasons. First, it's relatively easy to train. There are no labels required, and I can run 300 epochs of training on my personal laptop in just under 2 hours. Second, I found an existing implementation of DCGAN on GitHub, using TensorFlow: https://github.com/carpedm20/DCGAN-tensorflow.

### What you'll see here
I'll include the code at the bottom of this notebook to run everything if you so wish. However, be advised that training takes a _really_ long time, and because the implementation runs from command line, a Jupyter Notebook isn't the best environment to work in.

## Investigation: Preliminary Work

To learn about how generative models work, I'll first train my DCGAN on simple configurations of a checkerboard-like pattern and see if it produces a checkerboard in response. To test how well it produces checkerboards, I will train a binary classifier at the end, trained on the same checkerboards used to train the DCGAN and tested on the checkerboards generated by the DCGAN.

**Work in progress**

### Training data creation:
Using numpy (see the cells below), I created 100 checkerboard images to feed into the DCGAN: 4 square sizes x 5 square counts (100 to 500 squares per board) x 5 random boards for each combination.

In [23]:
# Width and height of the board = 250 (to match the cropping of the real images)
def build_checkerboard(c_size, c_num, board_size=250):
    """Return a board_size x board_size array of zeros with c_num randomly
    placed white (value 1) squares, each c_size pixels on a side."""
    board = np.zeros((board_size, board_size))
    for _ in range(c_num):
        # Random top-left corner of the square (randint includes both endpoints)
        row = random.randint(0, board_size - 1)
        col = random.randint(0, board_size - 1)
        # Slicing clips any square that runs past the edge of the board
        board[row:row + c_size, col:col + c_size] = 1
    return board

In [None]:
out_dir = "./data/checkerboards"
os.makedirs(out_dir, exist_ok=True)  # also creates ./data if it is missing

# 4 square sizes x 5 square counts x 5 random boards = 100 images
for size in [2, 5, 10, 15]:
    for num in [100, 200, 300, 400, 500]:
        for i in range(5):
            name = 's{}n{}iter{}.png'.format(size, num, i)
            b = build_checkerboard(size, num)
            # Write straight to the output folder instead of changing directory
            im.imsave(os.path.join(out_dir, name), b, cmap=plt.cm.gray)

## Investigation: Real Paintings

### Training data collection:
Data was obtained from Google Image search and https://www.wikiart.org/. All images were hand re-sized and cropped centrally. Please note, the data sets for this project are relatively small. Painters only painted so much!

#### Preliminary Monet Data:
I collected images of Monet's works in 4 categories, 25 images per category: people, flowers, landscapes, and seascapes. This data can be found in ./data/monet_cropped. The data chosen here was intentionally over-diverse and investigational. I wasn't sure exactly what I'd get, so I started broad!

2 works from each category are reproduced below:

![Woman Reading](data/monet_cropped/a-woman-reading.jpg)
![Parasol Woman](data/monet_cropped/the-promenade-woman-with-a-parasol.jpg)

![Poppies](data/monet_cropped/poppies-at-giverny.jpg)
![Pine Trees](data/monet_cropped/under-the-pine-trees-at-the-end-of-the-day.jpg)

![Flowers & Fruit](data/monet_cropped/flowers-and-fruit.jpg)
![Chrysanthems](data/monet_cropped/two-vases-with-chrysanthems.jpg)

![Green Wave](data/monet_cropped/the-green-wave.jpg)
![Night Seascape](data/monet_cropped/seascape-night-effect.jpg)

A model was trained on this cropped dataset with the bash command below. **Please note, if you run this, you will overwrite existing models. Only run if you're _absolutely sure_!**

In [None]:
%%bash
python3 main.py --dataset=monet_cropped --input_height=250 --input_width=250 --train_size=97 --epoch=300 --crop --train

#### Preliminary Monet Training:
While training, my model produced test images at each epoch. These test images are displayed in an 8x8 grid, so one image reproduced below is _actually_ 64 different images (all smooshed into one jpg for convenience).
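If you want to look at the individual samples rather than the whole grid, the grid can be cut back into tiles. The sketch below is a hedged example: `split_grid` is a hypothetical helper (not part of the DCGAN code), it assumes the tiles are packed edge to edge with no padding, and the 64-pixel tile size is a placeholder that should match whatever output size the model was trained with.

```python
import numpy as np

def split_grid(grid, rows=8, cols=8):
    """Split an (H, W, C) grid image into rows * cols equally sized tiles.

    Assumes the tiles are packed edge to edge with no padding between them.
    """
    h, w = grid.shape[0] // rows, grid.shape[1] // cols
    tiles = [grid[r * h:(r + 1) * h, c * w:(c + 1) * w]
             for r in range(rows) for c in range(cols)]
    return np.stack(tiles)

# Stand-in for a sample grid; a real one could be loaded with
# matplotlib.image.imread('./samples/monet_cropped/train_01_0000.png')
grid = np.random.rand(8 * 64, 8 * 64, 3)

tiles = split_grid(grid)
print(tiles.shape)  # (64, 64, 64, 3): 64 tiles, each 64 x 64 pixels with 3 channels
```

Tiles come out in reading order (left to right, top to bottom), so `tiles[0]` is the top-left sample and `tiles[63]` is the bottom-right one.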

**Epoch 1**
![Cropped Epoch 1](./samples/monet_cropped/train_01_0000.png)
**Epoch 61**
![Cropped Epoch 61](./samples/monet_cropped/train_61_0000.png)
**Epoch 151**
![Cropped Epoch 151](./samples/monet_cropped/train_151_0000.png)
**Epoch 205**
![Cropped Epoch 205](./samples/monet_cropped/train_205_0000.png)
**Epoch 241**
![Cropped Epoch 241](./samples/monet_cropped/train_241_0000.png)
**Epoch 298**
![Cropped Epoch 298](./samples/monet_cropped/train_298_0000.png)


And below is a gif showing the training progress across 11 sampled epochs (between 1 and 298).

In [6]:
from IPython.display import HTML
HTML('<img src="./monet_cropped_training2.gif">')

#### Revised Monet Data: Open Scenes Only
While I was happily surprised that the model seemed to pick up Monet's hurried, impressionist style, these images didn't seem quite right. I wondered if that was an artifact of my incredibly diverse, tiny training dataset. Because I couldn't produce more Monets from thin air, I instead decided to create a second Monet dataset of only seascapes and landscapes. I noticed that the color palettes and subject matter of open scenes (landscapes and seascapes) were incredibly different from those of closed scenes (people and flowers). I hoped that narrowing down the diversity in my dataset, even though it resulted in a smaller set, would provide enough regularity in the images to produce better end results. The new set consists of the landscapes and seascapes from the previous dataset, plus additional landscapes and seascapes found via wikiart and Google Images.

![Landscape 1](./data/monet_open/cap-martin-2.jpg)
![Landscape 2](./data/monet_open/land2.jpeg)

![Seascape 1](./data/monet_open/boats-at-rest-at-petit-gennevilliers.jpg)
![Seascape 2](./data/monet_open/fishing-boats-calm-sea.jpg)


A model was trained on this open scene dataset with the bash command below. **Please note, if you run this, you will overwrite existing models. Only run if you're _absolutely sure!_**

In [None]:
%%bash
python3 main.py --dataset=monet_open --input_height=250 --input_width=250 --train_size=73 --epoch=300 --crop --train

#### Open Scene Monet Training
I am proud to say that changing my training data allowed my model to pick up on some key features of Monet's paintings a little better. In the following images, you'll see that the model was able to capture some consistent color palettes of the seascapes and landscapes, and was additionally able to place a bright spot in the upper half of the image, which could correspond to the glint of sun on the water in seascapes, or the sun in landscapes.

**Epoch 2**
![Open Epoch 2](./samples/monet_open/train_02_0000.png)
**Epoch 53**
![Open Epoch 53](./samples/monet_open/train_53_0000.png)
**Epoch 131**
![Open Epoch 131](./samples/monet_open/train_131_0000.png)
**Epoch 200**
![Open Epoch 200](./samples/monet_open/train_200_0000.png)
**Epoch 254**
![Open Epoch 254](./samples/monet_open/train_254_0000.png)
**Epoch 299**
![Open Epoch 299](./samples/monet_open/train_299_0000.png)

Once again, below is a gif of some sampled epochs from training.

In [7]:
from IPython.display import HTML
HTML('<img src="./monet_open_training.gif">')

### Other Monet Conditions To Train With:
* Scrambled pixels
* Black and white

Training with these different perturbations of Monet's works will help us work out what is important for making something "Monet-ish". Is it the spatial integrity? If so, training on scrambled pixels should produce significantly worse generated images than training on normal images. Is it the color palette that Monet used? If so, converting the images to black and white for training should produce significantly worse generated images than training on normal images.
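Here is a rough sketch of how the two perturbations could be produced with numpy. Neither has been run on the real data yet: `scramble_pixels` and `to_grayscale` are hypothetical helper names, the random array stands in for a cropped 250 x 250 painting, and the grayscale weights are the standard ITU-R BT.601 luma weights, which is just one reasonable choice.

```python
import numpy as np

def scramble_pixels(img, seed=None):
    """Shuffle pixel positions in an (H, W, C) image.

    Every pixel keeps its own color, so the palette is preserved
    while the spatial structure is destroyed.
    """
    rng = np.random.default_rng(seed)
    h, w, c = img.shape
    flat = img.reshape(h * w, c)
    return flat[rng.permutation(h * w)].reshape(h, w, c)

def to_grayscale(img):
    """Luminance-weighted grayscale (ITU-R BT.601 weights).

    The spatial structure is preserved while the color information is removed.
    """
    return img[..., :3] @ np.array([0.299, 0.587, 0.114])

# Stand-in for one cropped training image with values in [0, 1]
img = np.random.rand(250, 250, 3)

scrambled = scramble_pixels(img, seed=0)  # same colors, no spatial layout
gray = to_grayscale(img)                  # same layout, no color
print(scrambled.shape, gray.shape)
```

The two conditions are deliberately complementary: scrambling keeps the palette but removes the layout, and grayscale keeps the layout but removes the palette, so comparing the generated images from each should hint at which one matters more.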

**Work in progress**

## Still to do: 
* Quantification (generator loss and discriminator loss over training)
* Train a binary image classifier that can say whether a generated image is "Monet" or "not Monet", which allows for quantification
* Talk about how this could have been made better (e.g., more training data)

## Conclusion:
In my conclusion, I plan to:
* Explain how generative models can help us understand what the key features of artwork are
* Discuss whether, when we think of works by famous artists, we are building some sort of generative model in our own heads
* Compare how the model did across the different datasets and compare the end results. Can I tell them apart?
* Future directions and suggestions for improvement

## References:
* https://blog.openai.com/generative-models/
* https://arxiv.org/pdf/1511.06434.pdf
* https://github.com/carpedm20/DCGAN-tensorflow
* https://arxiv.org/pdf/1601.06759.pdf
* https://www.wikiart.org/
* Pointed me to wikiart database: https://arxiv.org/pdf/1505.00855.pdf
* https://docs.floydhub.com/examples/dcgan/
* https://stackoverflow.com/questions/2169478/how-to-make-a-checkerboard-in-numpy


**To be put into a common reference format**