# Recommender Systems

## Types of Recommender Systems
- Matrix Factorization(MF)
- Factorization Machine(FM)
- Deep Learning

### Matrix Factorization(MF)
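- <a href="https://datajobs.com/data-science-repo/Recommender-Systems-[Netflix].pdf" >Paper: Matrix Factorization Techniques for Recommender Systems (2009) </a>
- ALS (Alternating Least Squares) and SGD (Stochastic Gradient Descent) are the most common training methods. ALS is implemented in Spark ML (`pyspark.ml.recommendation.ALS`), so MF has also been widely used in big-data-scale approaches
- It became especially famous after being recognized as the best-performing algorithm in the Netflix Prize recommendation competition
- The input is a matrix with rows: USER, columns: ITEM, cells: RATING
- No data other than the rating can be fed into the algorithm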

<img src = "./image/추천시스템MF_1.png" width="30%">
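
As a rough illustration of the idea above, the sketch below factorizes a tiny rating matrix with plain SGD in NumPy. The toy data, the function name `factorize_sgd`, and the hyperparameters are all made up for this example. It is not the Spark implementation, only a minimal picture of how user and item factors are fitted to the observed (user, item, rating) cells.

```python
import numpy as np

def factorize_sgd(triples, n_users, n_items, k=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Fit user factors P (n_users x k) and item factors Q (n_items x k)
    to observed (user, item, rating) triples with stochastic gradient descent."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, k))
    Q = rng.normal(scale=0.1, size=(n_items, k))
    triples = list(triples)
    for _ in range(epochs):
        # Visit the observed cells in a fresh random order every epoch
        for idx in rng.permutation(len(triples)):
            u, i, r = triples[idx]
            p_u = P[u].copy()              # keep the old user vector for the item update
            err = r - p_u @ Q[i]           # prediction error on this single cell
            P[u] += lr * (err * Q[i] - reg * p_u)
            Q[i] += lr * (err * p_u - reg * Q[i])
    return P, Q

# Hypothetical toy data: (user index, item index, rating); unobserved cells are simply absent
toy_ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = factorize_sgd(toy_ratings, n_users=3, n_items=3)

predicted = P @ Q.T  # dense matrix of predicted ratings, including the unobserved cells
train_rmse = np.sqrt(np.mean([(r - predicted[u, i]) ** 2 for u, i, r in toy_ratings]))
print(predicted.round(2))
print("training RMSE:", round(float(train_rmse), 3))
```

Only the observed cells enter the loss, which is why side information such as user or item attributes has no place in this formulation.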

In [None]:
# Database access (SQLAlchemy) and the Spark ML pieces used for the ALS model
from sqlalchemy import create_engine
from pyspark.ml.evaluation import RegressionEvaluator
from pyspark.ml.recommendation import ALS
from pyspark.sql.session import SparkSession

# Connection settings for the PostgreSQL source database
params = {
    'user': 'postgres',
    'pass': 'dsc123',
    'host': '192.168.1.214',
    'port': '5432',
    'db'  : 'datavoucher',
}
engine = create_engine('postgresql://{user}:{pass}@{host}:{port}/{db}'.format(**params), echo=False)

import pandas as pd
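# Implicit "rating": number of order items per (user, item) since 2019-01-01,
# scaled by the fixed factor 5/8476. The aliases are lowercase to match the
# userCol/itemCol names passed to ALS below.
sql = """
select b.o_user_id as userid, oi_goods_id as itemid, count(*)::float/8476*5 as rating
from   isecure.order_item_tb a, isecure.order_tb b
where  a.oi_order_id = b.o_id
  and  a.oi_created_at >= '2019-01-01'
group by b.o_user_id, oi_goods_id
"""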
df = pd.read_sql(sql, con=engine)

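# Reuse the active SparkSession (or create one) and convert the pandas frame
spark = SparkSession.builder.getOrCreate()
ratings = spark.createDataFrame(df)
training, test = ratings.randomSplit([0.8, 0.2])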

# Build the recommendation model using ALS on the training data 
# Note we set cold start strategy to 'drop' to ensure we don't get NaN evaluation metrics 
als = ALS(maxIter=5, regParam=0.01, userCol="userid", itemCol="itemid", ratingCol="rating", coldStartStrategy="drop") 
model = als.fit(training) 

# Evaluate the model by computing the RMSE on the test data 
predictions = model.transform(test) 
evaluator = RegressionEvaluator(metricName="rmse", labelCol="rating", predictionCol="prediction") 
rmse = evaluator.evaluate(predictions) 
print("Root-mean-square error = " + str(rmse)) 
# Root-mean-square error = 0.012109926564

# Generate top 10 item recommendations for each user
userRecs = model.recommendForAllUsers(10)
# Generate top 10 user recommendations for each item
itemRecs = model.recommendForAllItems(10)

# Generate top 10 item recommendations for a specified set of users
users = ratings.select(als.getUserCol()).distinct().limit(3)
userSubsetRecs = model.recommendForUserSubset(users, 10)
# Generate top 10 user recommendations for a specified set of items
items = ratings.select(als.getItemCol()).distinct().limit(3)
itemSubsetRecs = model.recommendForItemSubset(items, 10)
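
RMSE is expressed in the same unit as the rating, and the rating here is a purchase count multiplied by 5/8476, so a small absolute RMSE does not by itself indicate an accurate model. One way to put the number in context is to compare it against a trivial predictor that always returns the mean rating. The sketch below does this in NumPy; the counts and the "model" predictions are hypothetical values chosen only for illustration.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between two equally sized sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical purchase counts, rescaled the same way as in the SQL (count / 8476 * 5)
toy_counts = np.array([1, 1, 2, 1, 40, 3, 1, 250])
toy_ratings = toy_counts / 8476 * 5

# Hypothetical model output (for illustration only) and a mean-rating baseline
model_pred = toy_ratings * 0.8
baseline_pred = np.full_like(toy_ratings, toy_ratings.mean())

model_rmse = rmse(toy_ratings, model_pred)
baseline_rmse = rmse(toy_ratings, baseline_pred)
print(f"model RMSE    = {model_rmse:.5f}")
print(f"baseline RMSE = {baseline_rmse:.5f}")
print(f"relative improvement over baseline = {1 - model_rmse / baseline_rmse:.1%}")
```

With Spark, the same comparison could be made by evaluating a constant-prediction column with `RegressionEvaluator`.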

### Factorization Machine(FM)
#### Types of Factorization Machines
- Factorization Machine (FM)
  - <a href="https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf" >Paper: Factorization Machines (2010) </a>
  - <a href="https://greeksharifa.github.io/machine_learning/2019/12/21/FM/" > Summary link (TensorFlow code) </a>
- Field-aware Factorization Machines (FFM)
  - <a href="https://www.csie.ntu.edu.tw/~cjlin/papers/ffm.pdf" >Paper: Field-aware Factorization Machines for CTR Prediction (2017.01.15) </a>
  - <a href="https://greeksharifa.github.io/machine_learning/2020/04/05/FFM/" > Summary link (xlearn code) </a>
- DeepFM
  - <a href="https://arxiv.org/pdf/1703.04247v1.pdf" >Paper: DeepFM: A Factorization-Machine based Neural Network for CTR Prediction (2017.03.13) </a>
  - <a href="https://greeksharifa.github.io/machine_learning/2020/04/07/DeepFM/" > Summary link (TensorFlow code) </a>
- Attentional Factorization Machines (AFM)
  - <a href="https://www.ijcai.org/Proceedings/2017/0435.pdf" >Paper: Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks (2017.08.15) </a>
  - <a href="https://greeksharifa.github.io/machine_learning/2020/05/01/AFM/" > Summary link (TensorFlow code) </a>
- Field-weighted Factorization Machines
  - <a href="https://arxiv.org/pdf/1806.03514.pdf" >Paper: Field-weighted Factorization Machines for Click-Through Rate Prediction in Display Advertising (2020.03.08) </a>
- Field-Embedded Factorization Machines
  - <a href="https://arxiv.org/pdf/2009.09931.pdf" >Paper: Field-Embedded Factorization Machines for Click-through rate prediction (2020.09.13) </a>
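
To make the basic FM model in this list concrete, here is a minimal NumPy sketch of the second-order FM score from the 2010 paper: a global bias, a linear term, and pairwise interactions weighted by dot products of latent vectors. The function names, the random parameters, and the one-hot layout are assumptions for this example, not code from any of the linked posts. The first function uses the linear-time reformulation of the pairwise term; the second spells out the double loop as a check.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order FM score for one feature vector x (shape n); V has shape (n, k).
    Pairwise term: 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2]."""
    linear = w0 + w @ x
    xv = V.T @ x                   # shape (k,): sum_i v_if * x_i
    x2v2 = (V ** 2).T @ (x ** 2)   # shape (k,): sum_i v_if^2 * x_i^2
    return float(linear + 0.5 * np.sum(xv ** 2 - x2v2))

def fm_predict_naive(x, w0, w, V):
    """Same score with the explicit sum over all feature pairs i < j."""
    n = len(x)
    pairwise = sum((V[i] @ V[j]) * x[i] * x[j]
                   for i in range(n) for j in range(i + 1, n))
    return float(w0 + w @ x + pairwise)

# Hypothetical parameters: 3 users + 3 items one-hot encoded into 6 features, k = 3
rng = np.random.default_rng(0)
n_features, k = 6, 3
w0 = 0.1
w = rng.normal(size=n_features)
V = rng.normal(size=(n_features, k))

# One sample: user 0 (feature 0) interacting with item 1 (feature 4)
x = np.array([1, 0, 0, 0, 1, 0], dtype=float)
fast_score = fm_predict(x, w0, w, V)
naive_score = fm_predict_naive(x, w0, w, V)
# With only user/item one-hot features, FM reduces to MF with biases
mf_with_bias = float(w0 + w[0] + w[4] + V[0] @ V[4])
print(fast_score, naive_score, mf_with_bias)
```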
---
- Performance comparison excerpted from the Field-Embedded Factorization Machines paper
  - Machine learning
<img src = "./image/추천시스템_2.png" width="80%">
  - Deep learning
<img src = "./image/추천시스템_3.png" width="80%">