# Recommendation System

- Top/Popular Based Filtering
- Content Based Filtering
- Collaborative Filtering

<img src="https://miro.medium.com/max/4056/1*yrkvweErbifbPFkBUyZlOw.png" 
     width="500px">

In [1]:
import numpy as np
import pandas as pd

<hr>

### 2. Collaborative Filtering

- Required data: the history of user interactions with products, for example:
    - _E-commerce_: purchase history, wishlists, product ratings
    - _Social media_: view, like, and subscription history
- The core idea is to measure how strongly products are related using a correlation formula (Pearson correlation in the example below).
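
To make the correlation idea concrete, here is a minimal sketch that computes the Pearson correlation by hand for two rating columns. The `pearson` helper is a hypothetical name introduced for illustration. The two lists are the `dapur1` and `dapur2` ratings from the dummy data used in this section.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation: co-deviation of x and y divided by the product of their spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()   # deviation of each rating from the column mean
    dy = y - y.mean()
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# ratings given by Andi, Budi, Caca, Deni, Euis
dapur1 = [5, 4, 4, 1, 2]
dapur2 = [5, 5, 4, 2, 1]

r = pearson(dapur1, dapur2)
print(round(r, 6))  # 0.887783
```

A value close to 1 means users who rate one product highly tend to rate the other highly too. A value close to -1 means the opposite.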

In [4]:
# dummy e-commerce data: users' product rating history

df = pd.DataFrame([
    {'user':'Andi', 'dapur1':5, 'dapur2':5, 'dapur3':5, 'sport1':1, 'sport2':1, 'sport3':1},
    {'user':'Budi', 'dapur1':4, 'dapur2':5, 'dapur3':4, 'sport1':2, 'sport2':1, 'sport3':2},
    {'user':'Caca', 'dapur1':4, 'dapur2':4, 'dapur3':5, 'sport1':1, 'sport2':2, 'sport3':2},
    {'user':'Deni', 'dapur1':1, 'dapur2':2, 'dapur3':1, 'sport1':4, 'sport2':5, 'sport3':4},
    {'user':'Euis', 'dapur1':2, 'dapur2':1, 'dapur3':2, 'sport1':5, 'sport2':5, 'sport3':5},
])
df

,user,dapur1,dapur2,dapur3,sport1,sport2,sport3
0,Andi,5,5,5,1,1,1
1,Budi,4,5,4,2,1,2
2,Caca,4,4,5,1,2,2
3,Deni,1,2,1,4,5,4
4,Euis,2,1,2,5,5,5


In [5]:
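# Pearson correlation between products
# the non-numeric 'user' column is moved to the index so corr() only sees ratings

dfCorr = df.set_index('user').corr()  # same as .corr(method='pearson')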
dfCorr

,dapur1,dapur2,dapur3,sport1,sport2,sport3
dapur1,1.0,0.887783,0.971537,-0.887783,-0.950262,-0.907407
dapur2,0.887783,1.0,0.848485,-0.924242,-0.980418,-0.971537
dapur3,0.971537,0.848485,1.0,-0.924242,-0.913266,-0.887783
sport1,-0.887783,-0.924242,-0.924242,1.0,0.913266,0.971537
sport2,-0.950262,-0.980418,-0.913266,0.913266,1.0,0.950262
sport3,-0.907407,-0.971537,-0.887783,0.971537,0.950262,1.0


In [6]:
# recommendation for Fafa: bought 'dapur1' and rated it 5

Fafa = ['dapur1', 5]

In [9]:
# similarity score based on the correlation matrix

skor = dfCorr[Fafa[0]] * Fafa[1]
skor

dapur1    5.000000
dapur2    4.438917
dapur3    4.857683
sport1   -4.438917
sport2   -4.751311
sport3   -4.537037
Name: dapur1, dtype: float64
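
The scores above still include `dapur1`, which Fafa already bought. Below is a minimal sketch of turning them into a ranked list, assuming already-purchased products should not be recommended again. The score values are copied from the output above. `already_bought` and `recommendation` are names introduced here for illustration.

```python
import pandas as pd

# similarity scores for Fafa, copied from the output above
skor = pd.Series(
    {'dapur1': 5.000000, 'dapur2': 4.438917, 'dapur3': 4.857683,
     'sport1': -4.438917, 'sport2': -4.751311, 'sport3': -4.537037},
    name='dapur1',
)

# assumption: products the user already owns are excluded from the result
already_bought = ['dapur1']
recommendation = skor.drop(already_bought).sort_values(ascending=False)

print(recommendation.head(2))  # the two highest-scoring remaining products
```

With this data the top candidates are `dapur3` and `dapur2`, while all `sport` products receive negative scores.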

In [15]:
# recommendation for Gina: bought 'dapur1', 'sport1', and 'sport3', each rated 3

Gina = [['dapur1', 3], ['sport1', 3], ['sport3', 3]]

In [26]:
# similarity scores based on the correlation matrix

rows = []
for produk, rating in Gina:
    # skor = dfCorr[produk] * rating          # score range: -5 to 5
    skor = dfCorr[produk] * (rating / 5)      # score range: -1 to 1
    rows.append(skor)
# DataFrame.append was removed in pandas 2.0, so build the frame from the collected rows
dfSkor = pd.DataFrame(rows)
dfSkor

,dapur1,dapur2,dapur3,sport1,sport2,sport3
dapur1,0.6,0.53267,0.582922,-0.53267,-0.570157,-0.544444
sport1,-0.53267,-0.554545,-0.554545,0.6,0.54796,0.582922
sport3,-0.544444,-0.582922,-0.53267,0.582922,0.570157,0.6


In [28]:
# recommendation: sum the scores per product and rank them

dfSkor.sum().sort_values(ascending=False)

sport1    0.650252
sport3    0.638477
sport2    0.547960
dapur1   -0.477114
dapur3   -0.504294
dapur2   -0.604797
dtype: float64
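
The score-and-sum steps above can be wrapped in a single function. This is a sketch under stated assumptions, not a definitive implementation: `recommend` and its parameters are hypothetical names, ratings are assumed to be on a 1 to 5 scale, and products the user has already rated are dropped from the result.

```python
import pandas as pd

def recommend(corr, ratings, max_rating=5, top_n=3):
    """Rank unrated products by the sum of rating-weighted correlations."""
    total = pd.Series(0.0, index=corr.index)
    for product, rating in ratings:
        # scale each rating to the 0..1 range before weighting the correlations
        total = total + corr[product] * (rating / max_rating)
    rated = [product for product, _ in ratings]
    # assumption: products the user already rated are not recommended again
    return total.drop(rated).sort_values(ascending=False).head(top_n)

# same dummy rating data as in this section
df = pd.DataFrame([
    {'user': 'Andi', 'dapur1': 5, 'dapur2': 5, 'dapur3': 5, 'sport1': 1, 'sport2': 1, 'sport3': 1},
    {'user': 'Budi', 'dapur1': 4, 'dapur2': 5, 'dapur3': 4, 'sport1': 2, 'sport2': 1, 'sport3': 2},
    {'user': 'Caca', 'dapur1': 4, 'dapur2': 4, 'dapur3': 5, 'sport1': 1, 'sport2': 2, 'sport3': 2},
    {'user': 'Deni', 'dapur1': 1, 'dapur2': 2, 'dapur3': 1, 'sport1': 4, 'sport2': 5, 'sport3': 4},
    {'user': 'Euis', 'dapur1': 2, 'dapur2': 1, 'dapur3': 2, 'sport1': 5, 'sport2': 5, 'sport3': 5},
])
corr = df.set_index('user').corr()

gina = [['dapur1', 3], ['sport1', 3], ['sport3', 3]]
result = recommend(corr, gina)
print(result)
```

For Gina, `sport2` is the only unrated product with a positive total score, so it is the natural recommendation. `dapur3` and `dapur2` follow with negative scores.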