# ANIME RECOMMENDER SYSTEM
## Anime Recommender System based on content and collaborative filtering

### Let's build our content-based filtering recommender system

### Objective:
- Build synopsis-, genre-, and studio-based filtering
- Build full content-based filtering (a combination of features)
- Model demonstration

## CONTENT-BASED VS COLLABORATIVE FILTERING

![img](img/3_content_vs_collab.png)

Collaborative filtering is a recommender system that works by finding other users with similar interests.

Content-based filtering, meanwhile, is a recommender system that works by finding similarity in the content of the anime that a user likes (genres, synopsis, etc.).

### With everything set, let's build our model! さぁ、始めよう!! (Come on, let's begin!!)

## TABLE OF CONTENTS
- MODEL BUILDING (CONTENT-BASED FILTERING)
- MODEL DEMONSTRATION SAMPLE

In [1]:
# basic library
import numpy as np 
import pandas as pd 
import seaborn as sns
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

In [2]:
# load data
anime_subset = pd.read_csv('dataset/exported_dataset/anime_subset.csv')

## CONTENT-BASED FILTERING
Content-based filtering works by finding similarity in the content: we compute pairwise similarity scores from a TF-IDF matrix. The content in this case can be the titles, synopsis, genres, etc.


TF-IDF:

![img](img/4_tfidf.png)
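
To make the weighting concrete, here is a minimal sketch of the textbook TF-IDF formula in plain Python. The three synopses are made-up examples and `tfidf` is a hypothetical helper. scikit-learn's `TfidfVectorizer` applies smoothing and L2 normalisation by default, so its numbers will differ from these.

```python
import math
from collections import Counter

# Toy corpus (hypothetical synopses, not taken from the dataset).
docs = [
    "volleyball team trains for the tournament",
    "high school volleyball club rebuilds its team",
    "pirates search for treasure",
]

tokenized = [doc.split() for doc in docs]
n_docs = len(tokenized)

# Document frequency: in how many documents each term occurs.
doc_freq = Counter(term for tokens in tokenized for term in set(tokens))


def tfidf(tokens):
    """Textbook TF-IDF: (term count / doc length) * log(N / document frequency)."""
    counts = Counter(tokens)
    return {
        term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
        for term, count in counts.items()
    }


weights = [tfidf(tokens) for tokens in tokenized]

# "volleyball" occurs in two documents, "tournament" in only one,
# so "tournament" gets the larger weight within the first synopsis.
print(weights[0]["volleyball"], weights[0]["tournament"])
```

Terms that appear in many documents say little about any single one, which is why the rarer term ends up with the higher weight.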

We will build content-based recommenders from these features:
- synopsis-based recommender
- genre-based recommender
- studio-based recommender

Lastly, we will combine all those features into a single content-combination recommender. We do this by making a 'soup', that is, joining the columns into one new feature.


## Essential functions

In [3]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import sigmoid_kernel

In [4]:
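def get_tf_matrix(col):
    # TF-IDF vectorizer that drops common English stop words
    tf = TfidfVectorizer(stop_words='english')
    # replace missing values with empty strings (updates anime_subset in place)
    anime_subset[col] = anime_subset[col].fillna('')

    # rows = anime, columns = vocabulary terms
    tf_matrix = tf.fit_transform(anime_subset[col])
    return tf_matrix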

In [5]:
def get_cosine(tf_matrix):
    # compute the pairwise similarity matrix; despite the function name,
    # this uses a sigmoid kernel rather than cosine similarity
    cosine_sim = sigmoid_kernel(tf_matrix, tf_matrix)
    return cosine_sim
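
Note that `sigmoid_kernel` and cosine similarity are different functions. The sketch below compares them on two toy vectors in plain Python. The helper names are hypothetical, and the sigmoid formula, tanh(gamma * <a, b> + coef0) with gamma = 1 / n_features and coef0 = 1, is an assumption based on scikit-learn's documented defaults.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def sigmoid_kernel_value(a, b, gamma=None, coef0=1.0):
    # Assumed to mirror scikit-learn's documented defaults:
    # gamma = 1 / n_features, coef0 = 1.
    if gamma is None:
        gamma = 1.0 / len(a)
    return math.tanh(gamma * dot(a, b) + coef0)


a = [1.0, 0.0, 0.0]  # unit vector
b = [0.0, 1.0, 0.0]  # shares no terms with a

# Cosine similarity: identical vectors score 1, unrelated vectors score 0.
print(cosine_similarity(a, a), cosine_similarity(a, b))
# Sigmoid kernel: unrelated vectors still score tanh(1), well above 0.
print(sigmoid_kernel_value(a, a), sigmoid_kernel_value(a, b))
```

Because tanh is monotonic, the two measures should rank unit-length rows in the same order, but the sigmoid scores sit in a narrow band starting at tanh(1) ≈ 0.76 instead of spanning 0 to 1.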

In [6]:
# map each anime name to its row index, keeping the first row for duplicated names
indices = pd.Series(anime_subset.index, index=anime_subset['Name'])
indices = indices[~indices.index.duplicated(keep='first')]

In [7]:
# list to string
col_list = ['Genres', 'Studios']
for i in col_list:
    anime_subset[i] = anime_subset[i].astype(str)

### Synopsis-based Recommender
Recommendation based on how similar the synopses of two anime are; we do this by computing pairwise similarity scores. (The dataset column is spelled `sypnopsis`, so the code keeps that name.)

In [8]:
tf_matrix_sypnopsis = get_tf_matrix('sypnopsis')
cosine_sim_sypnopsis = get_cosine(tf_matrix_sypnopsis)

### Genre-based Recommender

In [9]:
tf_matrix_genres = get_tf_matrix('Genres')
cosine_sim_genres = get_cosine(tf_matrix_genres)

### Studio-based Recommender

In [10]:
tf_matrix_studios = get_tf_matrix('Studios')
cosine_sim_studios = get_cosine(tf_matrix_studios)

### Feature-Combination-based Recommender
Create content from a combination of features. We will create a 'metadata soup' that contains the features we want to combine in a single string.

In [11]:
# create soup combining features:
feature = ['Genres', 'Type', 'Studios', 'Source', 'sypnopsis']
# cast only the selected columns to string (not the whole DataFrame)
anime_subset[feature] = anime_subset[feature].astype(str)
anime_subset['soup'] = anime_subset[feature].apply(lambda row: ' '.join(row.values), axis=1)
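
One thing to watch when building a soup is missing values: casting them to `str` yields a literal token such as `'nan'` or `'None'`, which a vectorizer may then treat as a word. The sketch below shows one way to skip them, using plain dictionaries as hypothetical stand-ins for rows of `anime_subset`; `make_soup` is an illustrative helper, not part of the notebook.

```python
# Hypothetical rows standing in for anime_subset; None marks a missing value.
rows = [
    {"Genres": "Sports, Comedy", "Type": "TV", "Studios": "Studio A",
     "Source": "Manga", "sypnopsis": "A short boy joins the volleyball club."},
    {"Genres": "Action", "Type": "Movie", "Studios": None,
     "Source": "Original", "sypnopsis": None},
]
feature = ["Genres", "Type", "Studios", "Source", "sypnopsis"]


def make_soup(row, columns):
    """Join the chosen columns into one string, skipping missing values."""
    parts = [str(row[col]) for col in columns if row[col] is not None]
    return " ".join(parts)


soups = [make_soup(row, feature) for row in rows]
print(soups[1])  # Action Movie Original
```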

In [12]:
tf_matrix_soup = get_tf_matrix('soup')
cosine_sim_soup = get_cosine(tf_matrix_soup)

In [13]:
# get recommendation function for content based filtering
def get_rec_content(title, cosine_sim):
    # get index of title
    idx = indices[title]

    # pairwise similarity scores against every anime, highest first
    sim = list(enumerate(cosine_sim[idx]))
    sim = sorted(sim, key=lambda x: x[1], reverse=True)
    # keep the next 10 entries; this assumes the title itself ranks first,
    # which may not hold when other anime tie with it
    sim = sim[1:11]

    # row positions of the recommended anime
    anime_indices = [i[0] for i in sim]
    return anime_subset['Name'].iloc[anime_indices]
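
The slice `sim[1:11]` relies on the queried title sorting into first place. When several anime share exactly the same score (for example, identical genre lists), another title can take that slot and the queried title can appear among its own recommendations. A minimal sketch of a tie-safe alternative follows; `top_n_similar` and the score row are hypothetical.

```python
def top_n_similar(scores, idx, n=10):
    """Return the positions of the n highest scores, never including idx itself."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [i for i in ranked if i != idx][:n]


# Hypothetical similarity row for item 2; items 0 and 1 tie with it exactly.
row = [0.9, 0.9, 0.9, 0.1, 0.5]

print(top_n_similar(row, idx=2, n=3))  # [0, 1, 4]
# Slicing the sorted positions with [1:4] instead would give [1, 2, 4],
# dropping item 0 and keeping item 2, the queried item itself.
```

Filtering by position rather than by rank keeps the result correct regardless of how ties are ordered.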

# MODEL DEMONSTRATION SAMPLE

## CONTENT BASED FILTERING DEMO

In [14]:
get_rec_content('Haikyuu!!', cosine_sim_sypnopsis)

15199                     Haikyuu!!: To the Top
9895                    Haikyuu!! Second Season
2803                              Attacker You!
3592                           Ashita e Attack!
5654                         Attack No.1 (1970)
13273                   Haikyuu!!: vs. "Akaten"
1407                                Attack No.1
4898     Shoujo Fight: Norainu-tachi no Odekake
16296       2.43: Seiin Koukou Danshi Volley-bu
9315                     Haikyuu!!: Lev Genzan!
Name: Name, dtype: object

In [15]:
get_rec_content('Shingeki no Kyojin', cosine_sim_genres)

9383                           Shingeki no Kyojin Season 2
13252                          Shingeki no Kyojin Season 3
14963                   Shingeki no Kyojin Season 3 Part 2
15926                 Shingeki no Kyojin: The Final Season
16841                        Shingeki no Kyojin: Chronicle
7879                                Shingeki no Kyojin OVA
8052                       Shingeki no Kyojin: Ano Hi Kara
9018           Shingeki no Kyojin Movie 1: Guren no Yumiya
9019          Shingeki no Kyojin Movie 2: Jiyuu no Tsubasa
13804    Shingeki no Kyojin Season 2 Movie: Kakusei no ...
Name: Name, dtype: object

In [16]:
get_rec_content('Koe no Katachi', cosine_sim_studios)

53                      Full Metal Panic! The Second Raid
80                                                    Air
614                                         Air in Summer
766                            Suzumiya Haruhi no Yuuutsu
920     Full Metal Panic! The Second Raid: Wari to Him...
1031                         Munto: Toki no Kabe wo Koete
1069                                                Munto
1388                                         Kanon (2006)
1718                                           Lucky☆Star
1984                                              Clannad
Name: Name, dtype: object

In [17]:
get_rec_content('Himouto! Umaru-chan', cosine_sim_soup)

13017      Himouto! Umaru-chan R
10870       Himouto! Umaru-chanS
5964                Hamster Club
10028          Toko-chan Chokkin
10873           Ganbare-bu Next!
13356             Alice or Alice
10772    Himouto! Umaru-chan OVA
11754     Sansha Sanyou Specials
586            Shichinin no Nana
3240             Koala Boy Kokki
Name: Name, dtype: object

### Comparison of all content-based recommenders above

In [18]:
get_rec_content('Haikyuu!!', cosine_sim_sypnopsis)

15199                     Haikyuu!!: To the Top
9895                    Haikyuu!! Second Season
2803                              Attacker You!
3592                           Ashita e Attack!
5654                         Attack No.1 (1970)
13273                   Haikyuu!!: vs. "Akaten"
1407                                Attack No.1
4898     Shoujo Fight: Norainu-tachi no Odekake
16296       2.43: Seiin Koukou Danshi Volley-bu
9315                     Haikyuu!!: Lev Genzan!
Name: Name, dtype: object

In [19]:
get_rec_content('Haikyuu!!', cosine_sim_genres)

5633                                     Rokudenashi Blues
7454                                         Batsu & Terry
8305                                             Haikyuu!!
9895                               Haikyuu!! Second Season
10139                 Haikyuu!! Movie 1: Owari to Hajimari
10495                 Haikyuu!! Movie 2: Shousha to Haisha
11624    Haikyuu!!: Karasuno Koukou vs. Shiratorizawa G...
12834                   Haikyuu!! Movie 3: Sainou to Sense
12835                Haikyuu!! Movie 4: Concept no Tatakai
14234                                        Ahiru no Sora
Name: Name, dtype: object

In [20]:
get_rec_content('Haikyuu!!', cosine_sim_studios)

93               Sakigake!! Cromartie Koukou
128                                   Blood+
178                            Video Girl Ai
381                  Blood: The Last Vampire
437        One Piece: Taose! Kaizoku Ganzack
438    Koukaku Kidoutai: Stand Alone Complex
439                                Innocence
492                              Otogizoushi
493               Boku no Chikyuu wo Mamotte
535                                  Jin-Rou
Name: Name, dtype: object

In [21]:
get_rec_content('Haikyuu!!', cosine_sim_soup)

15199                     Haikyuu!!: To the Top
9895                    Haikyuu!! Second Season
3592                           Ashita e Attack!
2803                              Attacker You!
13273                   Haikyuu!!: vs. "Akaten"
5654                         Attack No.1 (1970)
1407                                Attack No.1
4898     Shoujo Fight: Norainu-tachi no Odekake
16296       2.43: Seiin Koukou Danshi Volley-bu
9315                     Haikyuu!!: Lev Genzan!
Name: Name, dtype: object