*The first portion of this notebook is devoted to creating the recommendation system, which will later be used as part of the Flask App. The second portion is dedicated to creating brand new images.*

In [None]:
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model, load_model
import matplotlib.pyplot as plt
import numpy as np
import re

from scipy import spatial
from PIL import Image
import os
import cv2
import pickle
from skimage import io
from sklearn.decomposition import PCA
from collections import Counter

*To begin, it's essential to load in the encoder, which is used to condense the images. Once the images are encoded, recommendations can be made by calculating the cosine distance between encodings.*

In [None]:
feature_extractor = load_model('feature.h5')

*Here I connect to my modern art images and get the encodings for each using the neural net*

In [None]:
art = [f for f in os.listdir('moma') if f != '.DS_Store'] #.DS_Store is not an image and was causing issues
ad_features = []
for work in art:
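    #Note: cv2.imread returns BGR channel order, while io.imread (used for the query image below) returns RGB;
    #keep the channel order consistent with whatever the encoder was trained on
    x = cv2.resize(cv2.imread('moma/'+work),(224,224))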
    x = x/255
    ad_features.append(feature_extractor.predict(x.reshape(1,224,224,3)))

*I convert my encodings to a numpy array in order to reshape each to a flat array. This is necessary for calculating cosine distance as well as clustering for image generation*

In [None]:
ad_featured = np.array(ad_features)
ad_featured = ad_featured.reshape(len(ad_featured), np.prod(ad_featured.shape[1:]))

In [None]:
ad_featured.shape
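
*As a rough illustration of the flattening step, the sketch below uses random arrays in place of real encodings. It assumes each prediction has shape (1, 28, 28, 16), which is inferred from the reshape used in the image generation cells further down; the variable names are made up for this example.*

```python
import numpy as np

# Random stand-ins for the per-image encoder outputs (assumed shape: 1x28x28x16).
rng = np.random.default_rng(0)
fake_features = [rng.random((1, 28, 28, 16)) for _ in range(5)]

stacked = np.array(fake_features)         # shape (5, 1, 28, 28, 16)
flat = stacked.reshape(len(stacked), -1)  # one flat row per image: (5, 12544)

print(stacked.shape, flat.shape)
```

*Passing -1 lets numpy work out the row length, which gives the same result as np.prod(shape[1:]).*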

*I save the encodings for use in the recommender app*

In [None]:
with open('art_features.pickle', 'wb') as f: #the with block closes the file once the dump finishes
    pickle.dump(ad_featured, f)

In [None]:
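with open('art_features.pickle', 'rb') as f: #reload the saved encodings to confirm the file is readable
    ad_featured = pickle.load(f)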
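
*A small round-trip check can confirm that the saved encodings come back unchanged. This is only a sketch: it writes a tiny stand-in array to a temporary directory rather than touching the real art_features.pickle.*

```python
import os
import pickle
import tempfile

import numpy as np

# Tiny stand-in for the real encoding matrix.
features = np.arange(12, dtype=np.float32).reshape(3, 4)

# Write to a temporary location so nothing in the project folder is overwritten.
path = os.path.join(tempfile.mkdtemp(), 'art_features.pickle')
with open(path, 'wb') as f:
    pickle.dump(features, f)
with open(path, 'rb') as f:
    restored = pickle.load(f)

print(np.array_equal(features, restored))
```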

*The following block of code takes an image address, converts the image to a 224x224x3 numpy array, creates the encoding of that image, calculates the cosine distance between it and every image in the corpus, and produces the indices of the 5 most closely related images, which can then be used to display the images themselves. This is the crux of how the recommender works within the app.*

In [None]:
image = io.imread('{image_address}')
x = cv2.resize(image,(224,224))
x = x/255
y = feature_extractor.predict(x.reshape(1,224,224,3))
y = y.reshape(1, np.prod(y.shape))
dist = spatial.distance.cdist(y, ad_featured, metric='cosine')[0]
ind = dist.argsort()[:5]
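
*To make the ranking step concrete, here is a numpy-only sketch of the same idea on toy 2-dimensional vectors. The function name top_k_cosine is hypothetical. It computes 1 minus the cosine similarity, which is the quantity cdist reports for metric='cosine'.*

```python
import numpy as np

def top_k_cosine(query, corpus, k=5):
    """Return (indices of the k nearest corpus rows, all cosine distances)."""
    query = np.asarray(query, dtype=float).ravel()
    corpus = np.asarray(corpus, dtype=float)
    # Cosine similarity of the query against every row of the corpus.
    # Assumes no vector is all zeros, which would cause a division by zero.
    sims = corpus @ query / (np.linalg.norm(corpus, axis=1) * np.linalg.norm(query))
    dist = 1.0 - sims
    return np.argsort(dist)[:k], dist

corpus = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
idx, dist = top_k_cosine([2.0, 0.1], corpus, k=2)
print(idx)
```

*Because cosine distance ignores vector length, the query (2.0, 0.1) is still closest to (1.0, 0.0), and the vector pointing the opposite way gets a distance near 2.*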

In [None]:
[art[ind[0]], art[ind[1]], art[ind[2]]]

In [None]:
Image.open('static/moma/'+art[ind[0]])

*The code below prints out the artist's name, the title, and year for visual purposes within the app.*

In [None]:
name, title, year = re.sub('-',' ',art[255][:-5]).split('_')
print(name+'\n'+title+'\n'+year)
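
*The same parsing can be wrapped in a small helper. The function name and the example filename below are hypothetical; the artist_title_year pattern with hyphens between words is inferred from the line above. Using os.path.splitext avoids relying on the extension being exactly five characters long.*

```python
import os

def parse_art_filename(filename):
    """Split 'artist_title_year.ext' into its three parts, turning hyphens into spaces."""
    stem, _ext = os.path.splitext(filename)
    name, title, year = stem.replace('-', ' ').split('_')
    return name, title, year

# Made-up filename following the assumed naming pattern.
print('\n'.join(parse_art_filename('jane-doe_untitled-study_1950.jpeg')))
```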

*This next portion of code is dedicated to creating brand new images by clustering the encodings and aggregating on the clusters. The aggregations are then sent through the decoder half of the autoencoder to produce brand new images.*

In [None]:
image_maker = load_model('imager.h5')

*I use PCA to reduce the dimensionality of the encodings and then cluster the paintings by assigning each to its majority feature, meaning the principal component on which it scores highest. I then collect the groups that consist of at least two paintings.*

In [None]:
pca = PCA(n_components=50)
works = pca.fit_transform(ad_featured)
tops = np.argmax(works, axis=1)
counter = Counter(tops)
groups = [i for i in counter.keys() if counter[i] >= 2]
groups.sort()
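
*The grouping rule can be tried on synthetic data to see what it produces. The sketch below uses random numbers in place of real encodings and far fewer components, so the sizes are assumptions chosen only to keep the example small.*

```python
from collections import Counter

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
encodings = rng.normal(size=(40, 30))  # synthetic stand-in for the flat encodings

pca = PCA(n_components=5)
projected = pca.fit_transform(encodings)  # shape (40, 5)

# Each row is labelled with the component on which it has the largest signed score.
labels = np.argmax(projected, axis=1)
counts = Counter(labels.tolist())
groups = sorted(c for c, n in counts.items() if n >= 2)

print(groups)
print(counts)
```

*Note that argmax looks at signed scores, so an image with a large negative score on a component is not assigned to that component.*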

*Here I aggregate all the images in each cluster, either with max, mean, or min to get different results, and prepare them to print. Most images look outrageous, but a handful end up looking really cool. Those are the ones I save.*

In [None]:
generated = []
for i in groups:
    ind = np.where(tops==i)
    new = np.max(ad_featured[ind], axis=0) #type of aggregation is determined in this line
    new = new.reshape(1,28,28,16)
    new_image = image_maker.predict(new)
    generated.append(new_image.reshape(224,224,3))

plt.figure(figsize=(18,80))
for i in range(len(groups)):
    plt.subplot((len(groups))//2+1, 2, i+1)
    plt.imshow(generated[i])
plt.tight_layout()
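
*Since the type of aggregation is swapped by hand in the loop above, one option is a helper that takes it as an argument. This is a sketch with hypothetical names (AGGREGATORS, aggregate_cluster); it stops short of calling the decoder and is demonstrated on tiny arrays instead of real encodings.*

```python
import numpy as np

AGGREGATORS = {'max': np.max, 'mean': np.mean, 'min': np.min}

def aggregate_cluster(features, labels, cluster_id, how='max', shape=(1, 28, 28, 16)):
    """Combine all rows of one cluster into a single array shaped for the decoder."""
    members = features[labels == cluster_id]
    if len(members) == 0:
        raise ValueError('no rows carry label %r' % (cluster_id,))
    combined = AGGREGATORS[how](members, axis=0)
    return combined.reshape(shape)

# Toy data: two rows in cluster 0, one row in cluster 1.
features = np.array([[1.0, 5.0], [3.0, 1.0], [9.0, 9.0]])
labels = np.array([0, 0, 1])

for how in ('max', 'mean', 'min'):
    print(how, aggregate_cluster(features, labels, 0, how, shape=(1, 2)))
```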

*Because the first few groups contained far too many images (over 100) to produce anything interesting, I go a bit deeper and cluster each of those clusters to extract some good results. Generally, a cluster with three to six images creates the most worthwhile results. I use more-or-less the same code for each of the subgroups below.*

In [None]:
ind = np.where(tops==0)
c0 = ad_featured[ind]
pca0 = PCA(n_components=50)
works0 = pca0.fit_transform(c0)
top0 = np.argmax(works0, axis=1)
count0 = Counter(top0)
group0 = [i for i in count0.keys() if count0[i] >=2]

In [None]:
generated = []
for i in group0:
    ind = np.where(tops==0)[0][np.where(top0==i)]
    new = np.max(ad_featured[ind], axis=0)
    new = new.reshape(1,28,28,16)
    new_image = image_maker.predict(new)
    generated.append(new_image.reshape(224,224,3))

plt.figure(figsize=(18,80))
for i in range(len(group0)):
    plt.subplot((len(group0))//2+1, 2, i+1)
    plt.imshow(generated[i])
plt.tight_layout()
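
*The chained np.where indexing used for the sub-clusters can be hard to read, so here is a toy walk-through with made-up labels. The sub-cluster labels are positions within the parent cluster, and indexing the parent's row numbers with them maps back to rows of the full encoding array.*

```python
import numpy as np

tops = np.array([0, 1, 0, 0, 2, 0])   # top-level label for each of six images
parent_rows = np.where(tops == 0)[0]  # rows of the full array in cluster 0
top0 = np.array([1, 0, 1, 0])         # sub-label for each row in parent_rows

# Positions where the sub-label is 1, translated back to rows of the full array.
child_rows = parent_rows[np.where(top0 == 1)]

print(parent_rows)
print(child_rows)
```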

*Because the first cluster still contained too many images, I went ahead and further clustered the images from that group as well*

In [None]:
ind = np.where(tops==0)[0][np.where(top0==0)]
c00 = ad_featured[ind]
pca00 = PCA(n_components=20)
works00 = pca00.fit_transform(c00)
top00 = np.argmax(works00, axis=1)
count00 = Counter(top00)
group00 = [i for i in count00.keys() if count00[i] >=2]

In [None]:
generated = []
for i in group00:
    ind = np.where(tops==0)[0][np.where(top0==0)][np.where(top00==i)]
    new = np.min(ad_featured[ind], axis=0)
    new = new.reshape(1,28,28,16)
    new_image = image_maker.predict(new)
    generated.append(new_image.reshape(224,224,3))

plt.figure(figsize=(18,80))
for i in range(len(group00)):
    plt.subplot((len(group00))//2+1, 2, i+1) #size the grid from group00, the list being plotted
    plt.imshow(generated[i])
plt.tight_layout()

*I repeat the same sub-clustering for the second group.*

In [None]:
ind = np.where(tops==1)
c1 = ad_featured[ind]
pca1 = PCA(n_components=50)
works1 = pca1.fit_transform(c1)
top1 = np.argmax(works1, axis=1)
count1 = Counter(top1)
group1 = [i for i in count1.keys() if count1[i] >=2]

In [None]:
generated = []
for i in group1:
    ind = np.where(tops==1)[0][np.where(top1==i)]
    new = np.max(ad_featured[ind], axis=0)
    new = new.reshape(1,28,28,16)
    new_image = image_maker.predict(new)
    generated.append(new_image.reshape(224,224,3))

plt.figure(figsize=(18,80))
for i in range(len(group1)):
    plt.subplot((len(group1))//2+1, 2, i+1)
    plt.imshow(generated[i])
plt.tight_layout()

*And once more for the third group.*

In [None]:
ind = np.where(tops==2)
c2 = ad_featured[ind]
pca2 = PCA(n_components=50)
works2 = pca2.fit_transform(c2)
top2 = np.argmax(works2, axis=1)
count2 = Counter(top2)
group2 = [i for i in count2.keys() if count2[i] >=2]
group2.sort()

In [None]:
generated = []
for i in group2:
    ind = np.where(tops==2)[0][np.where(top2==i)]
    new = np.max(ad_featured[ind], axis=0)
    new = new.reshape(1,28,28,16)
    new_image = image_maker.predict(new)
    generated.append(new_image.reshape(224,224,3))
        
plt.figure(figsize=(18,80))
for i in range(len(group2)):
    plt.subplot((len(group2))//2+1, 2, i+1) #size the grid from group2, the list being plotted
    plt.imshow(generated[i])
plt.tight_layout()
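
*Because the sub-clustering cells repeat more-or-less the same steps, they could be folded into one helper. The sketch below is one possible shape for it, not the code used to produce the gallery images: the name sub_cluster and its parameters are hypothetical, and it is run on random data. It also caps n_components, since PCA cannot extract more components than the smaller of the sample count and the feature count.*

```python
from collections import Counter

import numpy as np
from sklearn.decomposition import PCA

def sub_cluster(features, rows, n_components=50, min_size=2):
    """Group the images at `rows` by their dominant PCA component.

    Returns a dict mapping sub-cluster id to row indices into `features`.
    """
    rows = np.asarray(rows)
    subset = features[rows]
    # Cap the component count at min(n_samples, n_features).
    n_components = min(n_components, *subset.shape)
    scores = PCA(n_components=n_components).fit_transform(subset)
    labels = np.argmax(scores, axis=1)
    counts = Counter(labels.tolist())
    return {c: rows[labels == c] for c in sorted(counts) if counts[c] >= min_size}

rng = np.random.default_rng(1)
features = rng.normal(size=(60, 20))  # synthetic stand-in for the flat encodings
rows = np.arange(0, 60, 2)            # pretend these 30 rows form one large cluster
groups = sub_cluster(features, rows, n_components=50)

for cluster_id, members in groups.items():
    print(cluster_id, members)
```

*Each returned index array can be passed straight to the aggregation step, and calling the helper again on one of them gives the next level of sub-clusters.*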

*After perusing the generated images, I pick out the ones I like and adjust this line below to save them. My preferred images are showcased in the virtual gallery within my Flask App.*

In [None]:
ind = np.where(tops=={'desired_image'})
new = np.max(ad_featured[ind], axis=0) #the type of aggregation must also be adjusted if necessary
new = new.reshape(1,28,28,16)
new_image = image_maker.predict(new)
plt.figure(figsize=(8,8))
plt.imshow(new_image.reshape(224,224,3))
plt.xticks(())
plt.yticks(())
plt.tight_layout(pad=-1)
plt.savefig('static/for_use/{image_name}.jpeg')

*The block below demonstrates the contrast between aggregating the raw images of a cluster pixel by pixel and aggregating the encodings before putting that aggregation through the decoder. The difference is vast!*

In [None]:
conglom = []
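for i in np.where(tops=={'desired_image'})[0]: #[0] takes the index array out of the tuple that np.where returns
    img = cv2.resize(cv2.imread('static/moma/'+art[i]),(224,224))
    conglom.append(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)) #cv2 loads BGR, matplotlib expects RGB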
conglom = np.array(conglom)
im = np.max(conglom, axis=0)
plt.figure(figsize=(8,8))
plt.imshow(im.reshape(224,224,3))
plt.xticks(())
plt.yticks(())
plt.tight_layout(pad=-1)
plt.savefig('static/for_use/bad.jpeg')
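
*A toy example hints at why pixel-wise aggregation tends to look poor. A pixel-wise max keeps the brightest value in each channel independently, so stacking images pushes every pixel toward a washed-out blend. The two tiny solid-colour images below are made up for illustration.*

```python
import numpy as np

# Two 2x2 RGB images: one solid red, one solid green.
red = np.zeros((2, 2, 3), dtype=np.uint8)
red[..., 0] = 200
green = np.zeros((2, 2, 3), dtype=np.uint8)
green[..., 1] = 200

# Same aggregation as above, applied directly to pixels.
pixel_max = np.max(np.stack([red, green]), axis=0)

print(pixel_max[0, 0])
```

*Every pixel ends up yellow, a colour that appears in neither input, because the red and green channels are maximised separately.*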