# ART RECOMMENDER SYSTEM

## Recommender System - MET

This notebook takes the data analyzed and structured in [Analyzing Data - MET](03_MET_AnalysisTFIDF.ipynb) and builds a content-based recommender system. For a given work of art, it recommends the works with the highest cosine similarity (equivalently, the smallest cosine distance) to it. The features used to compute this similarity are the scores that measure how strongly each painting belongs to each of the analyzed topics.

Future work will include a cross-museum recommender from MET to Prado works of art and vice versa, as well as more museums and more languages.
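
To make the similarity measure concrete, here is a minimal sketch of cosine similarity between topic-score vectors. The helper name `cosine_sim` and the three toy vectors are made up for illustration; the notebook itself relies on scikit-learn's `cosine_similarity`.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical topic scores for three works (3 topics instead of 37)
work_a = [0.05, 0.00, 0.01]
work_b = [0.10, 0.00, 0.02]   # same direction as work_a, twice the magnitude
work_c = [0.00, 0.04, 0.00]   # loads on a different topic only

print(cosine_sim(work_a, work_b))  # close to 1: same topic mix
print(cosine_sim(work_a, work_c))  # 0: no shared topics
```

Cosine similarity ignores vector length, so two works with the same topic mix score close to 1 even if one description produces larger scores overall. The corresponding cosine distance is 1 minus the similarity.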

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import pickle

from sklearn.metrics.pairwise import cosine_similarity
from sklearn.metrics.pairwise import manhattan_distances
from sklearn.metrics.pairwise import euclidean_distances

In [25]:
with open('../models/Met_NMF37.pkl', 'rb') as f:
    museum_model = pickle.load(f)

In [3]:
museum_model.head(3)

Unnamed: 0,index,link,descripcion,ent,0,1,2,3,4,5,...,27,28,29,30,31,32,33,34,35,36
0,0,http://www.metmuseum.org/art/collection/search...,"Until the mid-1870s, furniture styles in Ameri...",,0.002041,0.0,0.0,0.0,0.0,0.0,...,0.054673,0.0,0.0,0.001729,0.008726,0.004717,0.001115,0.0,0.015141,0.006391
1,1,http://www.metmuseum.org/art/collection/search...,"This cabinet was a gift of its maker, Charles ...",,0.0,0.0,0.001489,0.002516,0.0,0.0,...,0.049165,0.0,0.0,0.006394,0.0,0.0,0.0,0.0,0.0,0.0
2,2,http://www.metmuseum.org/art/collection/search...,With the development of new formulas and techn...,,0.0,0.0,0.0,0.0,0.000554,0.001737,...,0.051425,0.0,0.0,0.0,0.0,0.001661,0.0,0.006275,0.000129,0.026792


In [26]:
# Keep only the topic columns (drop index, link, descripcion, ent)
museum_dist = museum_model.iloc[:, 4:]
museum_dist.head(3)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,27,28,29,30,31,32,33,34,35,36
0,0.0,0.0,0.0,0.000173,0.049208,0.002158,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.004975,0.0,0.006627,0.004285,0.0,0.010103,0.007728
1,0.0,0.0,0.0,0.007561,0.002176,0.003556,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.011159,0.0,0.0,0.0,0.0,0.0,0.0
2,0.0,0.0,0.0,0.0,0.004075,0.001199,0.002278,0.00757,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.003117,0.0,0.01016,0.045389,0.0


In [89]:
len(museum_dist)

41538

**Create the matrix of cosine similarities between all 41,538 paintings**

In [27]:
# Despite its name, `dists` holds similarities: 1.0 means an identical topic mix
dists = cosine_similarity(museum_dist)
dists.shape

(41538, 41538)
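
The full matrix is large: assuming 8-byte floats, 41,538 × 41,538 entries take roughly 13.8 GB. If that does not fit in memory, one alternative is to compute similarities for a single query work on demand. The sketch below is not the notebook's method; `top_k_similar` and the random `toy_topics` array are hypothetical stand-ins for the real topic matrix.

```python
import numpy as np

def top_k_similar(features, query_idx, k):
    """Return indices and cosine similarities of the k rows closest to one row."""
    X = np.asarray(features, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0              # leave all-zero rows as zeros
    X_unit = X / norms                   # scale every row to unit length
    sims = X_unit @ X_unit[query_idx]    # one similarity per row
    sims[query_idx] = -np.inf            # exclude the query itself
    top = np.argsort(sims)[::-1][:k]     # highest similarity first
    return top, sims[top]

# Hypothetical stand-in for museum_dist: 1,000 works x 37 topic scores
rng = np.random.default_rng(0)
toy_topics = rng.random((1000, 37))

idx, scores = top_k_similar(toy_topics, query_idx=0, k=5)
print(idx, scores)

# Size in GB of the full float64 similarity matrix for 41,538 works
print(41538 ** 2 * 8 / 1e9)
```

This trades a one-off precomputation for a matrix-vector product per query, which only needs the 41,538 × 37 feature matrix in memory.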

In [28]:
dists = pd.DataFrame(dists, columns=museum_model['link'])

dists.index = dists.columns
dists.iloc[0:10, 0:10]

link,http://www.metmuseum.org/art/collection/search/1084,http://www.metmuseum.org/art/collection/search/1085,http://www.metmuseum.org/art/collection/search/1091,http://www.metmuseum.org/art/collection/search/1120,http://www.metmuseum.org/art/collection/search/1175,http://www.metmuseum.org/art/collection/search/1180,http://www.metmuseum.org/art/collection/search/1224,http://www.metmuseum.org/art/collection/search/1226,http://www.metmuseum.org/art/collection/search/1230,http://www.metmuseum.org/art/collection/search/1293
http://www.metmuseum.org/art/collection/search/1084,1.0,0.652199,0.561879,0.654749,0.613192,0.50214,0.601204,0.140996,0.443939,0.517103
http://www.metmuseum.org/art/collection/search/1085,0.652199,1.0,0.681268,0.70347,0.69202,0.694307,0.566071,0.080482,0.345632,0.516477
http://www.metmuseum.org/art/collection/search/1091,0.561879,0.681268,1.0,0.856567,0.503435,0.517063,0.440792,0.055954,0.859104,0.699915
http://www.metmuseum.org/art/collection/search/1120,0.654749,0.70347,0.856567,1.0,0.560484,0.748537,0.446758,0.070004,0.678211,0.613875
http://www.metmuseum.org/art/collection/search/1175,0.613192,0.69202,0.503435,0.560484,1.0,0.520935,0.450623,0.131336,0.300979,0.493365
http://www.metmuseum.org/art/collection/search/1180,0.50214,0.694307,0.517063,0.748537,0.520935,1.0,0.365883,0.055929,0.315133,0.389715
http://www.metmuseum.org/art/collection/search/1224,0.601204,0.566071,0.440792,0.446758,0.450623,0.365883,1.0,0.066694,0.289011,0.427374
http://www.metmuseum.org/art/collection/search/1226,0.140996,0.080482,0.055954,0.070004,0.131336,0.055929,0.066694,1.0,0.060927,0.074185
http://www.metmuseum.org/art/collection/search/1230,0.443939,0.345632,0.859104,0.678211,0.300979,0.315133,0.289011,0.060927,1.0,0.606626
http://www.metmuseum.org/art/collection/search/1293,0.517103,0.516477,0.699915,0.613875,0.493365,0.389715,0.427374,0.074185,0.606626,1.0


**Calculate the first k recommended paintings based on a given one**

In [7]:
def art_recommend(work_link, k):
    """Print the k works most similar to work_link, most similar first."""
    print(work_link)
    print('-' * 20)
    # Drop the query work explicitly instead of assuming it sorts first
    ranked_art = dists[work_link].drop(work_link).sort_values(ascending=False)
    for link in ranked_art.index[:k]:
        print(link)

In [30]:
import random
# randrange excludes the upper bound; randint(0, n) could return n, which is out of range
w = random.randrange(len(museum_model))
#art_recommend(museum_model['link'][w], 10)
art_recommend('http://www.metmuseum.org/art/collection/search/486842', 10)

http://www.metmuseum.org/art/collection/search/486842
--------------------
http://www.metmuseum.org/art/collection/search/436586
http://www.metmuseum.org/art/collection/search/33605
http://www.metmuseum.org/art/collection/search/505798
http://www.metmuseum.org/art/collection/search/487603
http://www.metmuseum.org/art/collection/search/13148
http://www.metmuseum.org/art/collection/search/437898
http://www.metmuseum.org/art/collection/search/437122
http://www.metmuseum.org/art/collection/search/486754
http://www.metmuseum.org/art/collection/search/483417
http://www.metmuseum.org/art/collection/search/11940
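
The function above prints links only, which makes it hard to judge how close each recommendation is. A possible variant, sketched here on a toy matrix, returns the similarity scores as well. The name `recommend_with_scores`, the `work/...` labels, and the values are all hypothetical.

```python
import pandas as pd

def recommend_with_scores(sim, work_link, k):
    """Return the k works most similar to work_link as a Series of scores."""
    scores = sim[work_link].drop(labels=work_link)  # remove the work itself
    return scores.nlargest(k)

# Hypothetical 4x4 similarity matrix labelled by made-up links
links = ['work/a', 'work/b', 'work/c', 'work/d']
toy_sim = pd.DataFrame(
    [[1.0, 0.9, 0.2, 0.5],
     [0.9, 1.0, 0.3, 0.4],
     [0.2, 0.3, 1.0, 0.1],
     [0.5, 0.4, 0.1, 1.0]],
    index=links, columns=links)

top = recommend_with_scores(toy_sim, 'work/a', 2)
print(top)
```

Returning a `Series` instead of printing lets the caller filter by a minimum similarity or join the result back to the descriptions.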
