<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 28px; height: 77px"> 

# Anime Recommender System 
### *Recommender System Modeling*
---

### Imports

In [78]:
import numpy as np
import pandas as pd
from scipy import sparse
from sklearn.metrics.pairwise import pairwise_distances
import pickle
from IPython.display import HTML

### Reading in Data

In [79]:
df = pd.read_csv('../datasets/clean.csv')
df.head()

,Unnamed: 0,user_id,anime_id,rating,name
0,1,3,20,8,Naruto
1,2,5,20,6,Naruto
2,5,21,20,8,Naruto
3,6,28,20,9,Naruto
4,7,34,20,9,Naruto


### Drop the `Unnamed: 0` Column

In [80]:
df.drop(columns = 'Unnamed: 0', inplace = True)
df.head()

,user_id,anime_id,rating,name
0,3,20,8,Naruto
1,5,20,6,Naruto
2,21,20,8,Naruto
3,28,20,9,Naruto
4,34,20,9,Naruto


### Recommender System Setup with Cosine Distances
---

#### - Pivot table for dataframe

In [81]:
rating_mx = pd.pivot_table(
    df,
    index='name',
    columns='user_id',
    values='rating'
)

rating_mx.head()

user_id,1,2,3,5,7,8,9,10,11,12,...,73507,73508,73509,73510,73511,73512,73513,73514,73515,73516
name,,,,,,,,,,,...,,,,,,,,,,
"0",,,,,,,,,,,...,,,,,,,,,,
"""Aesop"" no Ohanashi yori: Ushi to Kaeru, Yokubatta Inu",,,,,,,,,,,...,,,,,,,,,,
"Bungaku Shoujo" Kyou no Oyatsu: Hatsukoi,,,,,,,,,,,...,,,,,,,,,,
"Bungaku Shoujo" Memoire,,,,,,,,,,,...,,,,6.0,,,,,,
"Bungaku Shoujo" Movie,,,,,,,,,,,...,,,,,,,,,,
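
To make the shape of the pivoted matrix concrete, here is a minimal sketch on a made-up three-user, two-title frame (the titles and ratings are illustrative assumptions, not rows from `clean.csv`). It shows that each title becomes a row, each user a column, and any title a user has not rated comes out as `NaN`.

```python
import pandas as pd

# Hypothetical toy ratings, not taken from the real dataset
toy = pd.DataFrame({
    "user_id": [1, 1, 2, 3],
    "name": ["Akira", "Trigun", "Akira", "Trigun"],
    "rating": [9, 7, 8, 6],
})

# Same call pattern as the notebook: titles as rows, users as columns
toy_mx = pd.pivot_table(toy, index="name", columns="user_id", values="rating")

# User 3 never rated Akira, so that cell is NaN
missing = toy_mx.isna().sum().sum()
print(toy_mx)
print("missing cells:", missing)
```

Note that `pivot_table` aggregates with the mean by default, so if a user had rated the same title twice the cell would hold the average of those ratings.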


#### - Sparse Matrix

In [82]:
ratings_sparse = sparse.csr_matrix(rating_mx.fillna(0))
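
As a rough illustration of why a sparse format suits this matrix, the sketch below builds a small dense array (values are made up) and converts it with `sparse.csr_matrix`. Only the non-zero entries are stored, which matters when most user/title pairs have no rating.

```python
import numpy as np
from scipy import sparse

# Hypothetical 3-title x 4-user matrix after fillna(0); zeros mean "no rating"
dense = np.array([
    [8.0, 0.0, 0.0, 9.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 6.0, 0.0, 0.0],
])

csr = sparse.csr_matrix(dense)

# nnz counts the explicitly stored (non-zero) entries
density = csr.nnz / (csr.shape[0] * csr.shape[1])
print(f"stored entries: {csr.nnz}, density: {density:.2f}")
```

One consequence of `fillna(0)` worth keeping in mind: an unrated title and a title rated 0 become indistinguishable in the matrix.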

#### - Cosine distances for similarities

In [83]:
dists = pairwise_distances(ratings_sparse, metric='cosine')
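
For intuition about what the cosine metric measures, here is a small NumPy sketch that computes one minus the cosine similarity between rows by hand. The helper name `cosine_distance_matrix` and the toy ratings are assumptions for illustration; this is not how scikit-learn implements `pairwise_distances` internally, and its handling of all-zero rows is a choice made here.

```python
import numpy as np

def cosine_distance_matrix(m):
    """Pairwise cosine distances (1 - cosine similarity) between rows."""
    m = np.asarray(m, dtype=float)
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows as zeros instead of dividing by 0
    unit = m / norms         # scale every row to unit length
    return 1.0 - unit @ unit.T

# Hypothetical ratings: rows are titles, columns are users, 0 = unrated
toy = np.array([
    [8, 0, 9],   # title A
    [8, 0, 9],   # title B: same ratings as A
    [0, 7, 0],   # title C: rated only by a different user
])
toy_dists = cosine_distance_matrix(toy)
print(np.round(toy_dists, 3))
```

Titles rated identically by the same users get a distance of 0, while titles with no raters in common get a distance of 1.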

#### - Create Dataframe

In [84]:
df_rec = pd.DataFrame(dists, columns=rating_mx.index, index=rating_mx.index)

### Example Recommendation
---

In [85]:
1 - df_rec['Ghost in the Shell'].sort_values().head(10)

name
Ghost in the Shell                                               1.000000
Ghost in the Shell 2: Innocence                                  0.631306
Ghost in the Shell: Stand Alone Complex                          0.539704
Akira                                                            0.497596
Ghost in the Shell: Stand Alone Complex 2nd GIG                  0.483078
Cowboy Bebop                                                     0.456209
Neon Genesis Evangelion                                          0.440300
Ghost in the Shell: Stand Alone Complex - Solid State Society    0.432845
Cowboy Bebop: Tengoku no Tobira                                  0.421683
Neon Genesis Evangelion: The End of Evangelion                   0.409675
Name: Ghost in the Shell, dtype: float64
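
The lookup above can be wrapped in a small helper. The function below is a sketch, not part of the original notebook: `top_similar` is a hypothetical name, it assumes a square distance frame with titles on both axes (like `df_rec` before any extra columns are added), and it drops the queried title itself, which always ranks first with similarity 1.

```python
import pandas as pd

def top_similar(dist_df, title, n=10):
    """Return the n titles closest to `title` as similarity scores (1 - distance)."""
    if title not in dist_df.columns:
        raise KeyError(f"unknown title: {title!r}")
    # Take one extra row so n results remain after removing the title itself
    closest = dist_df[title].sort_values().head(n + 1)
    closest = closest.drop(labels=title, errors="ignore").head(n)
    return 1 - closest

# Hypothetical 3x3 distance matrix for demonstration
toy_dists = pd.DataFrame(
    [[0.0, 0.4, 0.9],
     [0.4, 0.0, 0.7],
     [0.9, 0.7, 0.0]],
    index=["A", "B", "C"],
    columns=["A", "B", "C"],
)
result = top_similar(toy_dists, "A", n=2)
print(result)
```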

In [86]:
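# Use a new name so the ratings dataframe `df` is not overwritten
recs = pd.DataFrame(1 - df_rec['Cowboy Bebop'].sort_values().head(10))
# Build a JustWatch search link per title (spaces encoded as %20)
recs['url'] = 'https://www.justwatch.com/us/search?q=' + recs.index.astype(str).str.replace(' ', '%20')

In [87]:
HTML(recs.to_html(render_links=True, escape=False))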

name,Cowboy Bebop,url
Cowboy Bebop,1.0,https://www.justwatch.com/us/search?q=Cowboy%20Bebop
Cowboy Bebop: Tengoku no Tobira,0.607757,https://www.justwatch.com/us/search?q=Cowboy%20Bebop:%20Tengoku%20no%20Tobira
Samurai Champloo,0.550237,https://www.justwatch.com/us/search?q=Samurai%20Champloo
Trigun,0.534823,https://www.justwatch.com/us/search?q=Trigun
FLCL,0.506804,https://www.justwatch.com/us/search?q=FLCL
Neon Genesis Evangelion,0.501355,https://www.justwatch.com/us/search?q=Neon%20Genesis%20Evangelion
Tengen Toppa Gurren Lagann,0.472579,https://www.justwatch.com/us/search?q=Tengen%20Toppa%20Gurren%20Lagann
Akira,0.46573,https://www.justwatch.com/us/search?q=Akira
Ghost in the Shell,0.456209,https://www.justwatch.com/us/search?q=Ghost%20in%20the%20Shell
Black Lagoon,0.456098,https://www.justwatch.com/us/search?q=Black%20Lagoon
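
The links above encode only spaces. As a hedged alternative, the standard library's `urllib.parse.quote` percent-encodes every reserved character, which may matter for titles containing `&`, `#`, or `/`. The helper name `search_url` is an assumption for illustration, and the resulting URLs differ slightly from the ones shown above (for example, `:` becomes `%3A`).

```python
from urllib.parse import quote

BASE = "https://www.justwatch.com/us/search?q="

def search_url(title):
    """Build a search link with the title fully percent-encoded."""
    # safe="" means no character is exempt, so '/', ':' and '&' are all encoded
    return BASE + quote(title, safe="")

print(search_url("Cowboy Bebop: Tengoku no Tobira"))
print(search_url("Fate/Zero"))
```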


### Export for Streamlit

In [15]:
df_rec.insert(0, "name", df_rec.index)

In [16]:
df_rec.to_pickle('../datasets/rec.plk')

In [17]:
# Use a context manager so the file is flushed and closed
with open("dists.pkl", "wb") as f:
    pickle.dump(dists, f)

In [18]:
# Use a context manager so the file is flushed and closed
with open("movie_recom.pkl", "wb") as f:
    pickle.dump(df_rec, f)
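
On the consuming side (for example, in the Streamlit app), the pickles have to be read back. The sketch below shows a save/load round trip using context managers and a temporary directory; the helper names `save_pickle` and `load_pickle` and the payload are hypothetical, and how the actual app loads its files is not shown in this notebook.

```python
import os
import pickle
import tempfile

def save_pickle(obj, path):
    """Write obj to path, closing the file even if pickling fails."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)

def load_pickle(path):
    """Read back an object written by save_pickle."""
    with open(path, "rb") as f:
        return pickle.load(f)

# Hypothetical payload standing in for the distance data
payload = {"Cowboy Bebop": {"Trigun": 0.465}}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.pkl")
    save_pickle(payload, path)
    restored = load_pickle(path)

print(restored == payload)
```

Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.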