<a href="https://colab.research.google.com/github/MatheusRocha0/Recommendation-Engine/blob/main/Recommendation-Engine.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

 
# Recommendation Engine
 
YouTube, Amazon, Facebook, and Instagram are among the companies that rely on recommendation engines, which are one of the most widely deployed applications of data science. A few years ago, building a good system required a team of expert statisticians and mathematicians. Today, open-source libraries let anyone build their own recommendation system.
 
# Recommendation Engine Types
 
There are three main types of recommender systems:
 
## Collaborative Filtering
 
This method collects and analyzes information about users’ behavior, activities, or preferences, and predicts what a user will like based on their similarity to other users. A key advantage of collaborative filtering is that it does not rely on machine-analyzable content, so it can accurately recommend complex items such as movies without requiring an “understanding” of the item itself.
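
As a minimal sketch of the user-based idea, the snippet below predicts a rating as a similarity-weighted average of other users' ratings. The `ratings` dictionary, the user names, and the choice of cosine similarity are illustrative assumptions, not the MovieLens data or the algorithm used later in this project.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {item: rating}. Not the MovieLens data.
ratings = {
    "ana":   {"A": 5.0, "B": 3.0, "C": 4.0},
    "bruno": {"A": 4.0, "B": 3.0, "C": 5.0, "D": 4.0},
    "carla": {"A": 1.0, "B": 5.0, "D": 2.0},
}

def cosine_similarity(u, v):
    """Cosine similarity computed over the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict_rating(user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += sim
    return num / den if den else None

# "ana" has not rated item "D"; estimate it from "bruno" and "carla".
estimate = predict_rating("ana", "D")
```

Because "ana" rates items much like "bruno" does, the estimate lands closer to bruno's rating of item D than to carla's. Note that nothing about the items themselves is used, only the ratings.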
 
## Content-Based Filtering
 
These methods are based on a description of each item and a profile of the user’s preferences. In a content-based recommendation system, keywords describe the items, and a user profile records the types of items the user likes.
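
The keyword-and-profile idea can be sketched roughly as follows. The keyword sets, the `build_profile` and `recommend` helpers, and the use of Jaccard overlap are all assumptions made for illustration; real systems often use weighted representations such as TF-IDF instead.

```python
# Hypothetical item descriptions as keyword sets (illustrative only).
item_keywords = {
    "Toy Story": {"animation", "children", "comedy"},
    "Heat": {"action", "crime", "thriller"},
    "Shrek": {"animation", "comedy", "fantasy"},
    "Se7en": {"crime", "mystery", "thriller"},
}

def build_profile(liked_items):
    """User profile = union of the keywords of the items the user liked."""
    profile = set()
    for item in liked_items:
        profile |= item_keywords[item]
    return profile

def jaccard(a, b):
    """Overlap between two keyword sets, from 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(liked_items, n=2):
    """Rank items the user has not liked yet by overlap with their profile."""
    profile = build_profile(liked_items)
    candidates = [i for i in item_keywords if i not in liked_items]
    candidates.sort(key=lambda i: jaccard(profile, item_keywords[i]), reverse=True)
    return candidates[:n]

top = recommend(["Toy Story"], n=1)
```

Here a user who liked "Toy Story" is matched with "Shrek", the only other item that shares keywords with the profile.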
 
## Hybrid Recommendation Systems
 
Recent research shows that combining collaborative and content-based recommendation can be more effective than using either alone. Hybrid approaches can be implemented in several ways: by making content-based and collaborative predictions separately and then combining them, by adding content-based capabilities to a collaborative approach (or vice versa), or by unifying the approaches into one model.
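
The first option, predicting separately and then combining, could look something like the sketch below. The `hybrid_score` function and its `alpha` weight are hypothetical names chosen for illustration, and a linear blend is only one of many possible ways to combine the two predictions.

```python
def hybrid_score(collab_score, content_score, alpha=0.7):
    """Weighted blend of two predictions on the same rating scale.

    alpha is a hypothetical mixing weight: 1.0 means purely collaborative,
    0.0 means purely content-based.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * collab_score + (1.0 - alpha) * content_score

# Blend a collaborative estimate of 4.0 with a content-based estimate of 3.0.
blended = hybrid_score(collab_score=4.0, content_score=3.0, alpha=0.7)
```

In practice the weight would be tuned on held-out data rather than fixed by hand.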
 
### Scikit Surprise
Surprise (which stands for Simple Python RecommendatIon System Engine) is an easy-to-use Python scikit for recommender systems. It lets anyone build collaborative filtering recommendation engines in a few lines of Python.
 
# About the Project
 
In this project, I am going to build a recommendation engine using the scikit-surprise library.
 
# About the Dataset
 
The dataset I will be using for this project is the MovieLens dataset. You can download it here: https://bit.ly/3qDOziX
 
This dataset provides information such as movie title, user ID, movie ID, and movie genre, which makes it a good example dataset for training recommendation systems.
 
# Importing
 
## Installing Surprise

In [1]:
pip install scikit-surprise -q



## Libraries

In [2]:
import pandas as pd
import numpy as np
import seaborn as sb
import matplotlib.pyplot as plt
from surprise import Reader, Dataset, SVDpp, accuracy
from surprise.model_selection import train_test_split, cross_validate
import pickle
import requests

## Importing The Data

In [3]:
movies = pd.read_csv("https://raw.githubusercontent.com/MatheusRocha0/Recommendation_Engine/main/movies.csv")
ratings = pd.read_csv("https://raw.githubusercontent.com/MatheusRocha0/Recommendation_Engine/main/ratings.csv")
 
movies.drop("genres", axis = 1, inplace = True)
ratings.drop("timestamp", axis = 1, inplace = True)
 
data = pd.merge(ratings, movies, on = "movieId")
data.head()

| | userId | movieId | rating | title |
|---|---|---|---|---|
| 0 | 1 | 1 | 4.0 | Toy Story (1995) |
| 1 | 5 | 1 | 4.0 | Toy Story (1995) |
| 2 | 7 | 1 | 4.5 | Toy Story (1995) |
| 3 | 15 | 1 | 2.5 | Toy Story (1995) |
| 4 | 17 | 1 | 4.5 | Toy Story (1995) |


# Data Cleaning

## Missing Values

In [4]:
data.isnull().sum()

userId     0
movieId    0
rating     0
title      0
dtype: int64

## Drop Duplicates

In [5]:
data.drop_duplicates(inplace = True)

## Preprocessing the data

In [6]:
reader = Reader(rating_scale = (0.5, 5))
dataset = Dataset.load_from_df(data.drop("title", axis = 1), reader)

## Training set and Testing set

In [7]:
train_set, test_set = train_test_split(dataset, test_size = .5)

# Machine Learning Model

In [8]:
engine = SVDpp(
    random_state = 1,
    n_epochs = 30,
    lr_all = .01,
    reg_all = .07
)
 
engine.fit(train_set)

<surprise.prediction_algorithms.matrix_factorization.SVDpp at 0x7fb707446ad0>
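
For reference, the Surprise documentation describes the SVD++ prediction rule along the following lines. Here $\mu$ is the global mean rating, $b_u$ and $b_i$ are the user and item biases, $p_u$ and $q_i$ are the user and item factor vectors, $I_u$ is the set of items rated by user $u$, and $y_j$ are item factors that capture implicit feedback. The `lr_all` and `reg_all` arguments set the learning rate and the regularization term for all of these parameters at once.

```latex
\hat{r}_{ui} = \mu + b_u + b_i + q_i^{T}\left(p_u + |I_u|^{-\frac{1}{2}} \sum_{j \in I_u} y_j\right)
```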

## Evaluating the model

In [9]:
p = engine.test(test_set)
score = accuracy.rmse(p)

RMSE: 0.8777
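
RMSE is the square root of the mean squared difference between true and estimated ratings, so a lower value means predictions are closer to what users actually rated. A minimal sketch of the computation, using made-up (true, estimated) pairs rather than the predictions from the run above:

```python
from math import sqrt

def rmse(pairs):
    """Root mean squared error over (true_rating, estimated_rating) pairs."""
    if not pairs:
        raise ValueError("need at least one prediction")
    return sqrt(sum((true - est) ** 2 for true, est in pairs) / len(pairs))

# Hypothetical (true, estimated) ratings, not taken from the run above.
toy_predictions = [(4.0, 3.5), (2.0, 3.0), (5.0, 4.5)]
toy_rmse = rmse(toy_predictions)
```

The squared errors here are 0.25, 1.0, and 0.25, so the single larger miss dominates the result, which is how RMSE penalizes big errors more than small ones.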


# Saving model

In [10]:
# Serialize the trained model to disk
with open('model.pkl', 'wb') as fileObj:
    pickle.dump(engine, fileObj)

# Loading Model

In [11]:
# Restore the trained model from disk
with open('model.pkl', 'rb') as fileObj:
    engine = pickle.load(fileObj)
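
Once a model is loaded, one common use is ranking candidate movies for a user. The sketch below is an assumption about how that might be done, not part of the original notebook: `top_n` is a hypothetical helper that accepts any callable returning an estimated rating, and `fake_estimate` is a stand-in with made-up scores so the example runs without a trained model.

```python
def top_n(estimate, user_id, candidate_ids, n=3):
    """Return the n (item_id, score) pairs with the highest estimated rating.

    `estimate` is any callable (user_id, item_id) -> float.
    """
    scored = [(item_id, estimate(user_id, item_id)) for item_id in candidate_ids]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]

# Stand-in estimator with made-up scores (hypothetical, for illustration).
fake_scores = {1: 4.2, 2: 3.1, 3: 4.8, 4: 2.5}

def fake_estimate(user_id, item_id):
    return fake_scores[item_id]

best = top_n(fake_estimate, user_id=438, candidate_ids=[1, 2, 3, 4], n=2)
```

With the loaded Surprise model, the estimator could be `lambda u, i: engine.predict(u, i).est`, since `predict` returns a prediction object whose `est` field holds the estimated rating.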

# API Requests

In [12]:
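# Pick one random (userId, movieId) pair and serialize it as a JSON array of records
sample = data.drop(["rating", "title"], axis = 1).sample()
payload = sample.to_json(orient = "records")

In [16]:
url = "https://api-recommendation-engine.herokuapp.com/"
headers = {"Content-type": "application/json"}
 
# Send the pair to the API; the response holds the predicted rating.
# Using a separate name for the payload keeps the `data` DataFrame intact.
r = requests.post(url = url, data = payload, headers = headers)
 
df = pd.DataFrame(r.json())
df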

| | userId | movieId | user_rating |
|---|---|---|---|
| 0 | 438 | 1895 | 3.690591 |
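
To make the exchange concrete, the sketch below shows the shape of the request body that `to_json(orient = "records")` produces and how a response could be read with the standard library alone. The response string is an assumption that simply mirrors the output shown above; the API's actual contract is not documented here.

```python
import json

# Request body: a JSON array of records, one per (userId, movieId) pair.
payload = json.dumps([{"userId": 438, "movieId": 1895}])

# Response body assumed to mirror the output above (one record per pair).
response_body = '[{"userId": 438, "movieId": 1895, "user_rating": 3.690591}]'
records = json.loads(response_body)

# The predicted rating for the requested pair.
predicted = records[0]["user_rating"]
```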
