# 21 October 2021

## **Recommendation System Exercise (Content-Based Filtering)**

### **Use the anime dataset**

### **Content-based filtering for one user**

- Drop missing values
- Vectorize the genres to get each genre's value for each anime (item-feature matrix with rating)
- Pick 3 anime the user likes (any titles)
- Build the user feature vector
- Find 10 anime recommendations for the user

## **Import libraries**

In [1]:
import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

import warnings
warnings.filterwarnings('ignore')

## **Load dataset**

In [2]:
df = pd.read_csv('anime.csv')
df

Unnamed: 0,anime_id,name,genre,type,episodes,rating,members
0,32281,Kimi no Na wa.,"Drama, Romance, School, Supernatural",Movie,1,9.37,200630
1,5114,Fullmetal Alchemist: Brotherhood,"Action, Adventure, Drama, Fantasy, Magic, Mili...",TV,64,9.26,793665
2,28977,Gintama°,"Action, Comedy, Historical, Parody, Samurai, S...",TV,51,9.25,114262
3,9253,Steins;Gate,"Sci-Fi, Thriller",TV,24,9.17,673572
4,9969,Gintama&#039;,"Action, Comedy, Historical, Parody, Samurai, S...",TV,51,9.16,151266
...,...,...,...,...,...,...,...
12289,9316,Toushindai My Lover: Minami tai Mecha-Minami,Hentai,OVA,1,4.15,211
12290,5543,Under World,Hentai,OVA,1,4.28,183
12291,5621,Violence Gekiga David no Hoshi,Hentai,OVA,4,4.88,219
12292,6133,Violence Gekiga Shin David no Hoshi: Inma Dens...,Hentai,OVA,1,4.98,175


## **Preprocessing**

In [3]:
df = df.loc[:, ['anime_id', 'name', 'genre', 'rating']]
df

Unnamed: 0,anime_id,name,genre,rating
0,32281,Kimi no Na wa.,"Drama, Romance, School, Supernatural",9.37
1,5114,Fullmetal Alchemist: Brotherhood,"Action, Adventure, Drama, Fantasy, Magic, Mili...",9.26
2,28977,Gintama°,"Action, Comedy, Historical, Parody, Samurai, S...",9.25
3,9253,Steins;Gate,"Sci-Fi, Thriller",9.17
4,9969,Gintama&#039;,"Action, Comedy, Historical, Parody, Samurai, S...",9.16
...,...,...,...,...
12289,9316,Toushindai My Lover: Minami tai Mecha-Minami,Hentai,4.15
12290,5543,Under World,Hentai,4.28
12291,5621,Violence Gekiga David no Hoshi,Hentai,4.88
12292,6133,Violence Gekiga Shin David no Hoshi: Inma Dens...,Hentai,4.98


In [4]:
df.isna().sum()

anime_id      0
name          0
genre        62
rating      230
dtype: int64

In [5]:
df = df.dropna()

In [6]:
df.shape

(12017, 4)

In [7]:
df = df.reset_index(drop=True)

## **Vectorizer (item-feature matrix)**

In [8]:
# Generate item feature matrix
vect = CountVectorizer(tokenizer=lambda x: x.split(', '))
df_genre = vect.fit_transform(df['genre'])
# get_feature_names() was removed in scikit-learn 1.2; get_feature_names_out() replaces it
df_genre = pd.DataFrame(df_genre.toarray(), columns=vect.get_feature_names_out())
df_genre = pd.concat([df[['anime_id', 'name', 'rating']], df_genre], axis=1)
df_genre

Unnamed: 0,anime_id,name,rating,action,adventure,cars,comedy,dementia,demons,drama,...,shounen ai,slice of life,space,sports,super power,supernatural,thriller,vampire,yaoi,yuri
0,32281,Kimi no Na wa.,9.37,0,0,0,0,0,0,1,...,0,0,0,0,0,1,0,0,0,0
1,5114,Fullmetal Alchemist: Brotherhood,9.26,1,1,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
2,28977,Gintama°,9.25,1,0,0,1,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,9253,Steins;Gate,9.17,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0
4,9969,Gintama&#039;,9.16,1,0,0,1,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
12012,9316,Toushindai My Lover: Minami tai Mecha-Minami,4.15,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
12013,5543,Under World,4.28,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
12014,5621,Violence Gekiga David no Hoshi,4.88,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
12015,6133,Violence Gekiga Shin David no Hoshi: Inma Dens...,4.98,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
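
As a cross-check on the vectorizer step, the same kind of 0/1 genre matrix can be sketched with pandas alone through `Series.str.get_dummies`. The three-row frame below is a toy stand-in (hypothetical titles, not the real `anime.csv`). Unlike `CountVectorizer`, which lowercases the text before tokenizing, this variant keeps the original capitalisation of the genre names.

```python
import pandas as pd

# Toy stand-in for the 'name' and 'genre' columns (hypothetical rows)
toy = pd.DataFrame({
    "name": ["A", "B", "C"],
    "genre": ["Action, Sci-Fi", "Slice of Life", "Action, Slice of Life"],
})

# Split each genre string on ', ' and build one 0/1 column per genre
genre_matrix = toy["genre"].str.get_dummies(sep=", ")

# Attach the identifying column, mirroring the pd.concat step of the notebook
item_feature = pd.concat([toy[["name"]], genre_matrix], axis=1)
print(item_feature)
```

Multi-word genres such as "Slice of Life" stay intact because the split happens only on the comma-space separator.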


## **Selecting 3 anime the user likes**

In [9]:
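# .copy() gives an independent frame, so later column assignments do not act on a slice of df_genre
df_choice = df_genre[df_genre['name'].isin(['Naruto', 'One Piece', 'Dragon Ball'])].copy()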
df_choice

Unnamed: 0,anime_id,name,rating,action,adventure,cars,comedy,dementia,demons,drama,...,shounen ai,slice of life,space,sports,super power,supernatural,thriller,vampire,yaoi,yuri
74,21,One Piece,8.58,1,1,0,1,0,0,1,...,0,0,0,0,1,0,0,0,0,0
346,223,Dragon Ball,8.16,0,1,0,1,0,0,0,...,0,0,0,0,1,0,0,0,0,0
841,20,Naruto,7.81,1,0,0,1,0,0,0,...,0,0,0,0,1,0,0,0,0,0


### **Item-feature matrix with rating**

In [10]:
# Weight every genre flag by the anime's rating
for i in vect.get_feature_names_out():
    df_choice[i] = df_choice['rating'] * df_choice[i]

df_choice

Unnamed: 0,anime_id,name,rating,action,adventure,cars,comedy,dementia,demons,drama,...,shounen ai,slice of life,space,sports,super power,supernatural,thriller,vampire,yaoi,yuri
74,21,One Piece,8.58,8.58,8.58,0.0,8.58,0.0,0.0,8.58,...,0.0,0.0,0.0,0.0,8.58,0.0,0.0,0.0,0.0,0.0
346,223,Dragon Ball,8.16,0.0,8.16,0.0,8.16,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,8.16,0.0,0.0,0.0,0.0,0.0
841,20,Naruto,7.81,7.81,0.0,0.0,7.81,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,7.81,0.0,0.0,0.0,0.0,0.0
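
The column-by-column loop above can also be written as a single row-wise multiplication. The sketch below assumes only three of the genre columns, with ratings and flags copied from the three rows shown above; `DataFrame.mul(..., axis=0)` multiplies each row by that row's rating.

```python
import pandas as pd

genres = ["action", "adventure", "comedy"]  # subset of the genre columns, for brevity

# Ratings and 0/1 genre flags copied from the df_choice output
liked = pd.DataFrame({
    "name": ["One Piece", "Dragon Ball", "Naruto"],
    "rating": [8.58, 8.16, 7.81],
    "action": [1, 0, 1],
    "adventure": [1, 1, 0],
    "comedy": [1, 1, 1],
})

# Multiply every genre flag by the rating of its own row
weighted_genres = liked[genres].mul(liked["rating"], axis=0)
weighted = pd.concat([liked[["name", "rating"]], weighted_genres], axis=1)
print(weighted)
```

Summing the weighted columns reproduces the totals of the next step for these three genres (16.39, 16.74 and 24.55).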


## **Calculating user feature vector**

In [11]:
df_choice[vect.get_feature_names_out()].sum()

action           16.39
adventure        16.74
cars              0.00
comedy           24.55
dementia          0.00
demons            0.00
drama             8.58
ecchi             0.00
fantasy          16.74
game              0.00
harem             0.00
hentai            0.00
historical        0.00
horror            0.00
josei             0.00
kids              0.00
magic             0.00
martial arts     15.97
mecha             0.00
military          0.00
music             0.00
mystery           0.00
parody            0.00
police            0.00
psychological     0.00
romance           0.00
samurai           0.00
school            0.00
sci-fi            0.00
seinen            0.00
shoujo            0.00
shoujo ai         0.00
shounen          24.55
shounen ai        0.00
slice of life     0.00
space             0.00
sports            0.00
super power      24.55
supernatural      0.00
thriller          0.00
vampire           0.00
yaoi              0.00
yuri              0.00
dtype: float64

In [12]:
# Normalise the rating-weighted genre totals so that the weights sum to 1
genre_totals = df_choice[vect.get_feature_names_out()].sum()
user_feature_vector = genre_totals / genre_totals.sum()
user_feature_vector

action           0.110691
adventure        0.113055
cars             0.000000
comedy           0.165800
dementia         0.000000
demons           0.000000
drama            0.057946
ecchi            0.000000
fantasy          0.113055
game             0.000000
harem            0.000000
hentai           0.000000
historical       0.000000
horror           0.000000
josei            0.000000
kids             0.000000
magic            0.000000
martial arts     0.107854
mecha            0.000000
military         0.000000
music            0.000000
mystery          0.000000
parody           0.000000
police           0.000000
psychological    0.000000
romance          0.000000
samurai          0.000000
school           0.000000
sci-fi           0.000000
seinen           0.000000
shoujo           0.000000
shoujo ai        0.000000
shounen          0.165800
shounen ai       0.000000
slice of life    0.000000
space            0.000000
sports           0.000000
super power      0.165800
supernatural     0.000000
thriller         0.000000
vampire          0.000000
yaoi             0.000000
yuri             0.000000
dtype: float64
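
The normalisation can be checked by hand. The sketch below takes the eight non-zero genre totals from the output of cell In [11] and divides each by their grand total (148.07), which yields the weights printed above.

```python
import pandas as pd

# Non-zero genre totals copied from the output of cell In [11]
genre_totals = pd.Series({
    "action": 16.39, "adventure": 16.74, "comedy": 24.55, "drama": 8.58,
    "fantasy": 16.74, "martial arts": 15.97, "shounen": 24.55, "super power": 24.55,
})

grand_total = genre_totals.sum()              # 148.07
user_feature_vector = genre_totals / grand_total

print(round(grand_total, 2))
print(user_feature_vector.round(6))
```

Because every weight is a share of the same total, the vector sums to 1, and a genre such as comedy that appears in all three liked titles receives the largest share.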

## **Create recommendation**

Multiply the user feature vector element-wise with the genre columns of every anime, so that each title's genre flags are replaced by the user's weight for that genre (derived from Naruto, One Piece, and Dragon Ball).
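
In matrix terms, this step followed by a row sum is a matrix-vector product. The sketch below uses hypothetical numbers (three genres, three unseen titles, made-up user weights) to show that the element-wise product plus row sum equals `item_feature @ user_vector`.

```python
import numpy as np

# Hypothetical 0/1 item-feature rows for three unseen titles
# columns: action, comedy, drama
item_feature = np.array([
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
])

# Hypothetical user weights for the same three genres (they sum to 1)
user_vector = np.array([0.5, 0.3, 0.2])

# Broadcasting multiplies every row by the user weights, like the column loop
weighted = item_feature * user_vector

# Row sums give one predicted score per title
predicted_score = weighted.sum(axis=1)

# The same scores as a single matrix-vector product
predicted_score_matmul = item_feature @ user_vector
print(predicted_score)
```

A title that carries every genre the user cares about reaches the maximum score of 1, since the weights sum to 1.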

In [13]:
# Dataframe without Naruto, One Piece, Dragon Ball (.copy() so it can be modified independently)
df_recommend = df_genre.loc[~df_genre['name'].isin(['Naruto', 'One Piece', 'Dragon Ball'])].copy()

In [14]:
# Replace each genre flag with the user's weight for that genre
for i in vect.get_feature_names_out():
    df_recommend[i] = df_recommend[i] * user_feature_vector[i]

df_recommend

Unnamed: 0,anime_id,name,rating,action,adventure,cars,comedy,dementia,demons,drama,...,shounen ai,slice of life,space,sports,super power,supernatural,thriller,vampire,yaoi,yuri
0,32281,Kimi no Na wa.,9.37,0.000000,0.000000,0.0,0.0000,0.0,0.0,0.057946,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,5114,Fullmetal Alchemist: Brotherhood,9.26,0.110691,0.113055,0.0,0.0000,0.0,0.0,0.057946,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,28977,Gintama°,9.25,0.110691,0.000000,0.0,0.1658,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,9253,Steins;Gate,9.17,0.000000,0.000000,0.0,0.0000,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,9969,Gintama&#039;,9.16,0.110691,0.000000,0.0,0.1658,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
12012,9316,Toushindai My Lover: Minami tai Mecha-Minami,4.15,0.000000,0.000000,0.0,0.0000,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
12013,5543,Under World,4.28,0.000000,0.000000,0.0,0.0000,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
12014,5621,Violence Gekiga David no Hoshi,4.88,0.000000,0.000000,0.0,0.0000,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
12015,6133,Violence Gekiga Shin David no Hoshi: Inma Dens...,4.98,0.000000,0.000000,0.0,0.0000,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


### **Create predicted score column**

In [15]:
# Sum only the genre columns; 'name' is text and must be excluded from the row sum
df_recommend['predicted_score'] = df_recommend.drop(columns=['rating', 'anime_id', 'name']).sum(axis=1)

In [16]:
df_recommend[['name', 'predicted_score']].sort_values('predicted_score', ascending=False).head(20)

Unnamed: 0,name,predicted_score
5997,Dragon Ball Z Movie 11: Super Senshi Gekiha!! ...,0.942054
1409,Dragon Ball Z Movie 15: Fukkatsu no F,0.942054
1931,Dragon Ball: Episode of Bardock,0.942054
1930,Dragon Ball Super,0.942054
3407,Dragon Ball Z: Zenbu Misemasu Toshi Wasure Dra...,0.942054
3202,Dragon Ball Z: Summer Vacation Special,0.942054
4312,Dragon Ball GT: Goku Gaiden! Yuuki no Akashi w...,0.942054
4273,Dragon Ball Z: Atsumare! Gokuu World,0.942054
588,Dragon Ball Kai,0.942054
206,Dragon Ball Z,0.942054


### **Interpretasi**

Based on the 3 chosen anime (Naruto, One Piece, and Dragon Ball), the top 20 recommendations by predicted score are listed in the dataframe above. Most of them are other variants or episodes of Dragon Ball and One Piece.
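
The tied score of 0.942054 can be explained from the user feature vector: it equals the sum of every non-zero weight except drama. The sketch below copies the weights from the output of cell In [12]. The genre set is inferred from the score rather than read from the dataset, and such a title may also carry genres with zero weight.

```python
# Non-zero user weights copied from the output of cell In [12]
weights = {
    "action": 0.110691, "adventure": 0.113055, "comedy": 0.165800,
    "drama": 0.057946, "fantasy": 0.113055, "martial arts": 0.107854,
    "shounen": 0.165800, "super power": 0.165800,
}

# Inferred genre set of a top-ranked title: everything the user likes except drama
item_genres = {"action", "adventure", "comedy", "fantasy",
               "martial arts", "shounen", "super power"}

# Predicted score = sum of the user's weights over the genres the title has
score = sum(w for genre, w in weights.items() if genre in item_genres)
print(round(score, 5))
```

Since drama is the only weight as small as 0.057946, no other combination of genres produces this exact score, which is why so many titles from the same franchise tie.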

In [17]:
# All the steps above wrapped in one function (uses the imports from the first cell)
def recommend_me():
    # Read the liked titles; surrounding whitespace is stripped from each one
    prompt = 'Enter the anime titles you like, separated by commas: '
    anime = [title.strip() for title in input(prompt).split(',')]

    # Create genre dataframe
    df = pd.read_csv('anime.csv')
    df = df.loc[:, ['anime_id', 'name', 'genre', 'rating']].dropna().reset_index(drop=True)
    vect = CountVectorizer(tokenizer=lambda x: x.split(', '))
    genre_counts = vect.fit_transform(df['genre'])
    # get_feature_names() was removed in scikit-learn 1.2
    genre_cols = list(vect.get_feature_names_out())
    df_genre = pd.DataFrame(genre_counts.toarray(), columns=genre_cols)
    df_genre = pd.concat([df[['anime_id', 'name', 'rating']], df_genre], axis=1)

    # User feature vector: rating-weighted genre totals, normalised to sum to 1
    df_choice = df_genre[df_genre['name'].isin(anime)].copy()
    for i in genre_cols:
        df_choice[i] = df_choice['rating'] * df_choice[i]
    genre_totals = df_choice[genre_cols].sum()
    user_feature_vector = genre_totals / genre_totals.sum()

    # Predict score
    df_recommend = df_genre[~df_genre['name'].isin(anime)].drop(columns='rating')
    for i in genre_cols:
        df_recommend[i] = df_recommend[i] * user_feature_vector[i]
    # Sum only the genre columns ('name' is text)
    df_recommend['predicted_score'] = df_recommend[genre_cols].sum(axis=1)

    return df_recommend[['name', 'predicted_score']].sort_values('predicted_score', ascending=False).head(20)

In [18]:
recommend_me()

Unnamed: 0,name,predicted_score
2183,Doraemon Movie 23: Nobita to Robot Kingdom,1.0
993,Doraemon Movie 31: Shin Nobita to Tetsujin Hei...,1.0
3846,The☆Doraemons: Dokidoki Kikansha Daibakusou!,1.0
4814,Saru Getchu: On Air 2nd,1.0
4638,Doraemon: It&#039;s Spring!,1.0
1443,Doraemon Movie 28: Nobita to Midori no Kyojin Den,1.0
4791,Doraemon: It&#039;s Summer!,1.0
1354,Doraemon Movie 21: Nobita no Taiyou Ou Densetsu,1.0
1657,Doraemon Movie 33: Nobita no Himitsu Dougu Museum,1.0
1702,Doraemon Movie 07: Nobita to Tetsujin Heidan,1.0
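
Because the function reads the titles interactively, the input that produced this output is not recorded in the notebook. A non-interactive variant makes a run reproducible. The sketch below is an assumption rather than part of the original exercise: `recommend` is a hypothetical helper that takes the dataframe and the liked titles as arguments, builds the genre matrix with `Series.str.get_dummies`, and is shown on a four-row toy frame.

```python
import pandas as pd

def recommend(df, liked_titles, top_n=10):
    """Rank unseen titles by the sum of the user's genre weights (hypothetical helper)."""
    genre_matrix = df["genre"].str.get_dummies(sep=", ")
    liked_mask = df["name"].isin(liked_titles)

    # Rating-weighted genre totals over the liked titles, normalised to sum to 1
    weighted = genre_matrix[liked_mask].mul(df.loc[liked_mask, "rating"], axis=0)
    user_vector = weighted.sum() / weighted.sum().sum()

    # Score every unseen title with a matrix-vector product
    scores = genre_matrix[~liked_mask].dot(user_vector)
    out = df.loc[~liked_mask, ["name"]].assign(predicted_score=scores)
    return out.sort_values("predicted_score", ascending=False).head(top_n)

# Toy data (hypothetical titles, genres and ratings)
toy = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "genre": ["Action, Comedy", "Action", "Drama", "Comedy, Drama"],
    "rating": [8.0, 6.0, 7.0, 9.0],
})

result = recommend(toy, ["A", "B"], top_n=2)
print(result)
```

If none of the given titles match a row in the dataframe, the normalisation divides by zero and every score becomes NaN, so a real implementation would probably want to validate the titles first.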


## **Content based filtering for multiple users**

**Recommend 10 anime for each user**

Build the following:
- Item features: anime id, name, genre
- A user-item rating matrix containing 4 users and 4 anime that all four users have rated (the ratings can be made up at random)
- An item-feature matrix for all anime, filtered down to the 4 rated anime
- The user feature matrix and each user's feature vector
- Recommendations for anime that have not been watched yet
- The 10 top-ranked recommendations for each user, after sorting and filtering

In [19]:
search = 'Hunter'
  
result = df['name'].str.startswith(search, na = False) 
df[result] 

Unnamed: 0,anime_id,name,genre,rating
6,11061,Hunter x Hunter (2011),"Action, Adventure, Shounen, Super Power",9.13
112,136,Hunter x Hunter,"Action, Adventure, Shounen, Super Power",8.48
145,137,Hunter x Hunter OVA,"Action, Adventure, Shounen, Super Power",8.41
146,139,Hunter x Hunter: Greed Island Final,"Action, Adventure, Shounen, Super Power",8.41
202,138,Hunter x Hunter: Greed Island,"Action, Adventure, Shounen, Super Power",8.33
1974,13271,Hunter x Hunter Movie: Phantom Rouge,"Action, Adventure, Shounen, Super Power",7.39
2046,10189,Hunter x Hunter Pilot,"Action, Adventure, Shounen, Super Power",7.37
2108,19951,Hunter x Hunter Movie: The Last Mission,"Action, Adventure, Shounen, Super Power",7.35


In [20]:
df[df['anime_id'] == 11061]

Unnamed: 0,anime_id,name,genre,rating
6,11061,Hunter x Hunter (2011),"Action, Adventure, Shounen, Super Power",9.13


In [21]:
search = 'Action'
  
result = df['genre'].str.startswith(search, na = False) 
df[result].head(30)

Unnamed: 0,anime_id,name,genre,rating
1,5114,Fullmetal Alchemist: Brotherhood,"Action, Adventure, Drama, Fantasy, Magic, Mili...",9.26
2,28977,Gintama°,"Action, Comedy, Historical, Parody, Samurai, S...",9.25
4,9969,Gintama&#039;,"Action, Comedy, Historical, Parody, Samurai, S...",9.16
6,11061,Hunter x Hunter (2011),"Action, Adventure, Shounen, Super Power",9.13
8,15335,Gintama Movie: Kanketsu-hen - Yorozuya yo Eien...,"Action, Comedy, Historical, Parody, Samurai, S...",9.1
9,15417,Gintama&#039;: Enchousen,"Action, Comedy, Historical, Parody, Samurai, S...",9.11
12,918,Gintama,"Action, Comedy, Historical, Parody, Samurai, S...",9.04
13,2904,Code Geass: Hangyaku no Lelouch R2,"Action, Drama, Mecha, Military, Sci-Fi, Super ...",8.98
19,1575,Code Geass: Hangyaku no Lelouch,"Action, Mecha, Military, School, Sci-Fi, Super...",8.83
21,44,Rurouni Kenshin: Meiji Kenkaku Romantan - Tsui...,"Action, Drama, Historical, Martial Arts, Roman...",8.83


## **Create a copy of df**

In [22]:
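# .copy() creates an independent dataframe; plain assignment would only add a second name for df
df_multiple = df.copy()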

### **Choose only anime_id, name, genre columns and drop missing values**

In [23]:
df_multiple = df_multiple[['anime_id', 'name', 'genre']].dropna().reset_index(drop=True)
df_multiple

Unnamed: 0,anime_id,name,genre
0,32281,Kimi no Na wa.,"Drama, Romance, School, Supernatural"
1,5114,Fullmetal Alchemist: Brotherhood,"Action, Adventure, Drama, Fantasy, Magic, Mili..."
2,28977,Gintama°,"Action, Comedy, Historical, Parody, Samurai, S..."
3,9253,Steins;Gate,"Sci-Fi, Thriller"
4,9969,Gintama&#039;,"Action, Comedy, Historical, Parody, Samurai, S..."
...,...,...,...
12012,9316,Toushindai My Lover: Minami tai Mecha-Minami,Hentai
12013,5543,Under World,Hentai
12014,5621,Violence Gekiga David no Hoshi,Hentai
12015,6133,Violence Gekiga Shin David no Hoshi: Inma Dens...,Hentai


## **Create user-item rating matrix**

In [24]:
df_user_item = pd.DataFrame({
    'user': ['user 1', 'user 2', 'user 3', 'user 4'],
    26055: [7, 8, 8, 9],
    11061: [9, 7, 8, 7],
    2001: [7, 7, 9, 7],
    31933: [8, 9, 7, 8]
})

df_user_item

Unnamed: 0,user,26055,11061,2001,31933
0,user 1,7,9,7,8
1,user 2,8,7,7,9
2,user 3,8,8,9,7
3,user 4,9,7,7,8
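
The ratings above are typed in by hand. Since the exercise allows random ratings, a seeded generator is one way to produce them reproducibly. The sketch below assumes ratings between 7 and 9 (the range used in the hand-typed matrix) and an arbitrary seed; `df_random` is a hypothetical name.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # arbitrary seed, only for reproducibility

anime_ids = [26055, 11061, 2001, 31933]        # the four rated anime
users = [f"user {i}" for i in range(1, 5)]     # 'user 1' ... 'user 4'

# Integer ratings drawn uniformly from 7, 8, 9 (upper bound 10 is exclusive)
ratings = rng.integers(7, 10, size=(len(users), len(anime_ids)))

df_random = pd.DataFrame(ratings, columns=anime_ids)
df_random.insert(0, "user", users)
print(df_random)
```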


## **Create item feature matrix**

In [25]:
vect = CountVectorizer(tokenizer=lambda x: x.split(', '))
df_genre = vect.fit_transform(df_multiple['genre'])
# get_feature_names() was removed in scikit-learn 1.2; get_feature_names_out() replaces it
df_genre = pd.DataFrame(df_genre.toarray(), columns=vect.get_feature_names_out())
df_genre = pd.concat([df_multiple[['anime_id', 'name']], df_genre], axis=1)
df_genre

Unnamed: 0,anime_id,name,action,adventure,cars,comedy,dementia,demons,drama,ecchi,...,shounen ai,slice of life,space,sports,super power,supernatural,thriller,vampire,yaoi,yuri
0,32281,Kimi no Na wa.,0,0,0,0,0,0,1,0,...,0,0,0,0,0,1,0,0,0,0
1,5114,Fullmetal Alchemist: Brotherhood,1,1,0,0,0,0,1,0,...,0,0,0,0,0,0,0,0,0,0
2,28977,Gintama°,1,0,0,1,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,9253,Steins;Gate,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0
4,9969,Gintama&#039;,1,0,0,1,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
12012,9316,Toushindai My Lover: Minami tai Mecha-Minami,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
12013,5543,Under World,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
12014,5621,Violence Gekiga David no Hoshi,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
12015,6133,Violence Gekiga Shin David no Hoshi: Inma Dens...,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [26]:
df_item_feature = df_genre.loc[df_genre['anime_id'].isin(df_user_item.columns[1:])]
df_item_feature

Unnamed: 0,anime_id,name,action,adventure,cars,comedy,dementia,demons,drama,ecchi,...,shounen ai,slice of life,space,sports,super power,supernatural,thriller,vampire,yaoi,yuri
6,11061,Hunter x Hunter (2011),1,1,0,0,0,0,0,0,...,0,0,0,0,1,0,0,0,0,0
29,2001,Tengen Toppa Gurren Lagann,1,1,0,1,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
64,26055,JoJo no Kimyou na Bouken: Stardust Crusaders 2...,1,1,0,0,0,0,1,0,...,0,0,0,0,0,1,0,0,0,0
76,31933,JoJo no Kimyou na Bouken: Diamond wa Kudakenai,1,1,0,1,0,0,1,0,...,0,0,0,0,0,1,0,0,0,0


## **Create user feature vector**

In [27]:
user_item = np.array(df_user_item.drop(columns='user'))
item_feature = np.array(df_item_feature.drop(columns=['anime_id', 'name']))

n_user = user_item.shape[0]
n_item = user_item.shape[1]
n_feature = item_feature.shape[1]

user_feature = np.empty((n_user, n_feature))

for i in range(n_user):
    user_feature_vector = np.matmul(user_item[i,:], item_feature)
    user_feature_vector = user_feature_vector/user_feature_vector.sum()
    user_feature[i, :] = user_feature_vector
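
The matrix product in the loop assumes that column j of `user_item` and row j of `item_feature` describe the same anime. In the outputs above, `df_item_feature` lists its rows as 11061, 2001, 26055, 31933 (dataset order, as returned by `isin`), while the rating columns are ordered 26055, 11061, 2001, 31933, so the two orders differ and it may be worth aligning them explicitly. The sketch below shows one way to do that on a reduced toy example (two users, two anime, three genre columns) and also replaces the loop with a single matrix product.

```python
import numpy as np
import pandas as pd

# Reduced user-item matrix: two users, two of the rated anime
df_user_item = pd.DataFrame({
    "user": ["user 1", "user 2"],
    26055: [7, 8],
    11061: [9, 7],
})

# Toy item-feature rows, deliberately in a different order from the rating columns
df_item_feature = pd.DataFrame({
    "anime_id": [11061, 26055],
    "action": [1, 1],
    "drama": [0, 1],
    "super power": [1, 0],
})

rated_ids = list(df_user_item.columns[1:])

# Reorder the item rows so that row j matches rating column j
aligned = df_item_feature.set_index("anime_id").loc[rated_ids]

user_item = df_user_item[rated_ids].to_numpy()

# (users x items) @ (items x genres), then normalise each user's row to sum to 1
user_feature = user_item @ aligned.to_numpy()
user_feature = user_feature / user_feature.sum(axis=1, keepdims=True)
print(user_feature)
```

With the rows aligned, user 1's rating of 9 for anime 11061 is applied to that anime's own genres (action and super power) rather than to another title's.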

In [28]:
item_feature

array([[1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
       [1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
       [1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]],
      dtype=int64)

In [29]:
user_feature

array([[0.19871795, 0.19871795, 0.        , 0.10897436, 0.        ,
        0.        , 0.09615385, 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.05769231, 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.05769231, 0.        ,
        0.        , 0.        , 0.14102564, 0.        , 0.        ,
        0.        , 0.        , 0.04487179, 0.09615385, 0.        ,
        0.        , 0.        , 0.        ],
       [0.19871795, 0.19871795, 0.        , 0.1025641 , 0.        ,
        0.        , 0.1025641 , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.04487179, 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.04487179, 0.        ,
   

In [30]:
df_user_vector = pd.DataFrame(user_feature, index=df_user_item['user'], columns=df_item_feature.columns[2:])
df_user_vector

user,action,adventure,cars,comedy,dementia,demons,drama,ecchi,fantasy,game,...,shounen ai,slice of life,space,sports,super power,supernatural,thriller,vampire,yaoi,yuri
user 1,0.198718,0.198718,0.0,0.108974,0.0,0.0,0.096154,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.044872,0.096154,0.0,0.0,0.0,0.0
user 2,0.198718,0.198718,0.0,0.102564,0.0,0.0,0.102564,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.051282,0.102564,0.0,0.0,0.0,0.0
user 3,0.201258,0.201258,0.0,0.09434,0.0,0.0,0.100629,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.050314,0.100629,0.0,0.0,0.0,0.0
user 4,0.201299,0.201299,0.0,0.097403,0.0,0.0,0.097403,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.058442,0.097403,0.0,0.0,0.0,0.0


## **Filter dataframe without 4 anime chosen**

In [31]:
df_item_feature_new = df_genre.loc[~df_genre['anime_id'].isin(df_user_item.columns[1:])]
df_item_feature_new

Unnamed: 0,anime_id,name,action,adventure,cars,comedy,dementia,demons,drama,ecchi,...,shounen ai,slice of life,space,sports,super power,supernatural,thriller,vampire,yaoi,yuri
0,32281,Kimi no Na wa.,0,0,0,0,0,0,1,0,...,0,0,0,0,0,1,0,0,0,0
1,5114,Fullmetal Alchemist: Brotherhood,1,1,0,0,0,0,1,0,...,0,0,0,0,0,0,0,0,0,0
2,28977,Gintama°,1,0,0,1,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,9253,Steins;Gate,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0
4,9969,Gintama&#039;,1,0,0,1,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
12012,9316,Toushindai My Lover: Minami tai Mecha-Minami,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
12013,5543,Under World,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
12014,5621,Violence Gekiga David no Hoshi,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
12015,6133,Violence Gekiga Shin David no Hoshi: Inma Dens...,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


## **Create predicted score for unwatched anime**

In [32]:
item_feature_new = np.array(df_item_feature_new.drop(columns=['anime_id', 'name']))
# item_feature_new
n_item_new = item_feature_new.shape[0]
# n_item_new
user_item_score_new = np.empty((n_user, n_item_new))

for i in range(n_user):
    user_item_score = np.matmul(item_feature_new, user_feature[i,:])
    user_item_score_new[i,:] = user_item_score
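
The per-user loop above is equivalent to one matrix product between the item-feature matrix and the transposed user-feature matrix. The sketch below uses small hypothetical arrays (three unseen items, two users, three genres) and checks the product against the loop.

```python
import numpy as np

# Hypothetical 0/1 genre flags for three unseen items
item_feature_new = np.array([
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 1],
])

# Hypothetical user feature vectors, one row per user (each row sums to 1)
user_feature = np.array([
    [0.5, 0.2, 0.3],
    [0.1, 0.6, 0.3],
])

# One product scores every item for every user: shape (n_items, n_users)
scores = item_feature_new @ user_feature.T

# Loop version from the notebook, for comparison: shape (n_users, n_items)
loop_scores = np.empty((user_feature.shape[0], item_feature_new.shape[0]))
for i in range(user_feature.shape[0]):
    loop_scores[i, :] = np.matmul(item_feature_new, user_feature[i, :])

print(scores)
```

The product already has items as rows and users as columns, which is the layout the notebook reaches by transposing the loop result.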

In [33]:
# Create and transpose new dataframe
df_score = pd.DataFrame(user_item_score_new).T

In [34]:
# Rename the columns
df_score.columns = df_user_item['user']

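In [41]:
# This cell was executed after the merge in cell In [35], so its output already includes anime_id and name
df_score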

Unnamed: 0,anime_id,name,user 1,user 2,user 3,user 4
0,32281,Kimi no Na wa.,0.192308,0.205128,0.201258,0.194805
1,5114,Fullmetal Alchemist: Brotherhood,0.634615,0.653846,0.654088,0.655844
2,28977,Gintama°,0.506410,0.500000,0.496855,0.500000
3,9253,Steins;Gate,0.057692,0.044872,0.050314,0.045455
4,9969,Gintama&#039;,0.506410,0.500000,0.496855,0.500000
...,...,...,...,...,...,...
12008,9316,Toushindai My Lover: Minami tai Mecha-Minami,0.000000,0.000000,0.000000,0.000000
12009,5543,Under World,0.000000,0.000000,0.000000,0.000000
12010,5621,Violence Gekiga David no Hoshi,0.000000,0.000000,0.000000,0.000000
12011,6133,Violence Gekiga Shin David no Hoshi: Inma Dens...,0.000000,0.000000,0.000000,0.000000


In [35]:
# Merge dataframes
df_score = pd.concat([df_item_feature_new[['anime_id', 'name']].reset_index(drop=True), df_score], axis=1)
df_score.head()

Unnamed: 0,anime_id,name,user 1,user 2,user 3,user 4
0,32281,Kimi no Na wa.,0.192308,0.205128,0.201258,0.194805
1,5114,Fullmetal Alchemist: Brotherhood,0.634615,0.653846,0.654088,0.655844
2,28977,Gintama°,0.50641,0.5,0.496855,0.5
3,9253,Steins;Gate,0.057692,0.044872,0.050314,0.045455
4,9969,Gintama&#039;,0.50641,0.5,0.496855,0.5


In [36]:
df_score[df_score['user 1'] > 0.6]['user 1']

1        0.634615
21       0.660256
71       0.788462
91       0.634615
100      0.647436
           ...   
10062    0.634615
10270    0.653846
10584    0.653846
10700    0.608974
11373    0.602564
Name: user 1, Length: 333, dtype: float64

## **Recommendation for users**

In [37]:
# Recommendation for user 1
df_score[['anime_id', 'name', 'user 1']].sort_values('user 1', ascending=False).head(10)

Unnamed: 0,anime_id,name,user 1
7702,1626,Genma Taisen,0.858974
4224,1136,Betterman,0.858974
3208,1400,Macross 7 Movie: Ginga ga Ore wo Yondeiru!,0.858974
2550,1397,Macross 7,0.858974
1758,573,Saber Marionette J,0.858974
799,154,Shaman King,0.839744
5161,3114,Chiisana Kyojin Microman,0.807692
4035,23067,Tenkai Knights,0.807692
3316,1186,Battle Athletess Daiundoukai (TV),0.801282
2903,1022,Generator Gawl,0.801282


In [38]:
# Recommendation for user 2
df_score[['anime_id', 'name', 'user 2']].sort_values('user 2', ascending=False).head(10)

Unnamed: 0,anime_id,name,user 2
799,154,Shaman King,0.858974
4224,1136,Betterman,0.846154
2550,1397,Macross 7,0.846154
1758,573,Saber Marionette J,0.846154
3208,1400,Macross 7 Movie: Ginga ga Ore wo Yondeiru!,0.846154
7702,1626,Genma Taisen,0.846154
237,15323,One Piece: Episode of Nami - Koukaishi no Nami...,0.807692
892,31289,One Piece: Episode of Sabo - 3 Kyoudai no Kizu...,0.807692
227,19123,One Piece: Episode of Merry - Mou Hitori no Na...,0.807692
71,21,One Piece,0.807692


In [39]:
# Recommendation for user 3
df_score[['anime_id', 'name', 'user 3']].sort_values('user 3', ascending=False).head(10)

Unnamed: 0,anime_id,name,user 3
4224,1136,Betterman,0.849057
2550,1397,Macross 7,0.849057
7702,1626,Genma Taisen,0.849057
1758,573,Saber Marionette J,0.849057
3208,1400,Macross 7 Movie: Ginga ga Ore wo Yondeiru!,0.849057
799,154,Shaman King,0.849057
237,15323,One Piece: Episode of Nami - Koukaishi no Nami...,0.798742
1626,2408,Keroro Gunsou Movie 2: Shinkai no Princess de ...,0.798742
227,19123,One Piece: Episode of Merry - Mou Hitori no Na...,0.798742
4035,23067,Tenkai Knights,0.798742


In [40]:
# Recommendation for user 4
df_score[['anime_id', 'name', 'user 4']].sort_values('user 4', ascending=False).head(10)

Unnamed: 0,anime_id,name,user 4
799,154,Shaman King,0.850649
3208,1400,Macross 7 Movie: Ginga ga Ore wo Yondeiru!,0.844156
1758,573,Saber Marionette J,0.844156
7702,1626,Genma Taisen,0.844156
2550,1397,Macross 7,0.844156
4224,1136,Betterman,0.844156
237,15323,One Piece: Episode of Nami - Koukaishi no Nami...,0.811688
892,31289,One Piece: Episode of Sabo - 3 Kyoudai no Kizu...,0.811688
71,21,One Piece,0.811688
227,19123,One Piece: Episode of Merry - Mou Hitori no Na...,0.811688
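
The four recommendation cells differ only in the user column, so they can be collapsed into one loop. The sketch below assumes a small hypothetical `df_score` with two users and uses `DataFrame.nlargest` to pick the top rows per user; `top_n` is a hypothetical dictionary keyed by user name.

```python
import pandas as pd

# Hypothetical score table with the same layout as df_score
df_score = pd.DataFrame({
    "anime_id": [1, 2, 3, 4],
    "name": ["A", "B", "C", "D"],
    "user 1": [0.9, 0.1, 0.5, 0.3],
    "user 2": [0.2, 0.8, 0.4, 0.6],
})

# Every column whose label starts with 'user' holds one user's predicted scores
user_cols = [c for c in df_score.columns if str(c).startswith("user")]

# Top 2 titles per user (use 10 for the exercise)
top_n = {
    user: df_score.nlargest(2, user)[["anime_id", "name", user]]
    for user in user_cols
}

for user, table in top_n.items():
    print(user)
    print(table)
```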
