# Recommendation Engine

A recommendation system is a technique in which an algorithm analyses input data to predict a user's preferences, much as the user might have chosen for themselves. The input data can describe products, services, food, videos, audio, images, news articles, and so on. Recommendation systems are used almost everywhere: suggesting movies to watch, jobs we may be interested in on LinkedIn, who to follow on Twitter, friends to connect with on Facebook, and products to buy from e-commerce stores such as Amazon, Tesco, and Argos.

In [None]:
# RStudio cheat sheets: https://www.rstudio.com/resources/cheatsheets/

## Understanding the need for recommendation

### Types of product recommendation

**Types of recommendation**<br>
- Content-based recommendation method:<br>
This approach links a user's preferences to item attributes. It is based entirely on the ratings the user provides, so it does not depend on what anyone else has rated or recommended.<br><br>
- Collaborative filtering-based recommendation method:<br>
These recommendations are based on the many ratings of a product or service provided by some or all of the individuals in the recommender database.
>- User-based Collaborative Filtering (UBCF):<br>
UBCF finds users with similar tastes and recommends products based on their historic buying patterns.<br><br>
>- Item-based Collaborative Filtering (IBCF):<br>
IBCF uses the similarity between items, rather than between users, to make a recommendation.<br><br>
>- Popular (POPULAR):<br>
The most popular items are recommended. This method is useful for new customers, who have no rating history yet.<br><br>
>- Re-Recommend (RERECOMMEND):<br>
Items that the user has already rated highly are recommended again.<br><br>
>- Random Recommendation (RANDOM):<br>
Items are recommended at random; this is typically used as a baseline for comparison.<br><br>
>- Singular Value Decomposition (SVD):<br>
This method is used when the numbers of users and items are both very large.<br><br>
>- Association rule-based recommendation methods
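
To make the idea behind UBCF concrete, here is a minimal base R sketch. It is not how recommenderlab implements UBCF; the toy matrix `ratings_toy`, the helper `cosine_sim()`, and the choice of a plain similarity-weighted mean are all assumptions made for illustration.

```r
# Toy user-by-item rating matrix; NA marks an unrated item (hypothetical data).
ratings_toy <- matrix(
  c(5, 4, NA, 1,
    4, 5,  2, 1,
    1, 2,  5, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("u1", "u2", "u3"), c("i1", "i2", "i3", "i4"))
)

# Cosine similarity between two rating vectors, using co-rated items only.
cosine_sim <- function(a, b) {
  both <- !is.na(a) & !is.na(b)
  sum(a[both] * b[both]) / (sqrt(sum(a[both]^2)) * sqrt(sum(b[both]^2)))
}

# Similarity of the target user u1 to every other user.
target <- "u1"
others <- setdiff(rownames(ratings_toy), target)
sims <- sapply(others, function(u) cosine_sim(ratings_toy[target, ], ratings_toy[u, ]))
sims

# Predict u1's missing rating for i3 as the similarity-weighted mean
# of the ratings the other users gave to i3.
pred_i3 <- sum(sims * ratings_toy[others, "i3"]) / sum(sims)
pred_i3
```

Applying the same similarity function to the columns instead of the rows gives item-to-item similarities, which is the idea behind IBCF.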

We will focus on collaborative filtering-based recommendation methods in this lecture, because all the required algorithms are available in R. The recommenderlab package is a widely used R extension that provides a robust foundation for building recommender engines. The library focuses on efficient data handling, standard algorithms, and evaluation capabilities.

In [None]:
# Load the R package required for recommendation
library(recommenderlab)

In [None]:
# List all the algorithms implemented in the recommenderlab package, with brief descriptions
recommenderRegistry$get_entries()

In [None]:
# List the methods registered for real-valued rating matrices (realRatingMatrix)
recommender_realRatingMat_models <- recommenderRegistry$get_entries(dataType = "realRatingMatrix")

names(recommender_realRatingMat_models)

In [None]:
recommender_realRatingMat_models$UBCF_realRatingMatrix
# recommender_realRatingMat_models$IBCF_realRatingMatrix
# recommender_realRatingMat_models$ALS_realRatingMatrix
# recommender_realRatingMat_models$RANDOM_realRatingMatrix
# recommender_realRatingMat_models$RERECOMMEND_realRatingMatrix
# recommender_realRatingMat_models$SVD_realRatingMatrix
# recommender_realRatingMat_models$POPULAR_realRatingMatrix
# recommender_realRatingMat_models$SVDF_realRatingMatrix

In [None]:
# List the methods registered for binary rating matrices (binaryRatingMatrix)
recommender_binaryRatingMatrix_models <- recommenderRegistry$get_entries(dataType = "binaryRatingMatrix")

names(recommender_binaryRatingMatrix_models)

In [None]:
recommender_binaryRatingMatrix_models$UBCF_binaryRatingMatrix
# recommender_binaryRatingMatrix_models$IBCF_binaryRatingMatrix
# recommender_binaryRatingMatrix_models$ALS_implicit_binaryRatingMatrix
# recommender_binaryRatingMatrix_models$AR_binaryRatingMatrix
# recommender_binaryRatingMatrix_models$POPULAR_binaryRatingMatrix
# recommender_binaryRatingMatrix_models$RANDOM_binaryRatingMatrix      

In [None]:
# Import dataset
ratingsDf <- read.csv("D:/CFT DataScienceHub/MovieRatings.csv")

In [None]:
# help(binaryRatingMatrix)

In [None]:
# Ensure the dataset is converted to a matrix
ratingsMat <- as.matrix(ratingsDf)
# View(ratingsMat)

In [None]:
ratings <- as(ratingsMat, "realRatingMatrix")
ratings
class(ratings)

In [None]:
# Split the matrix into training (90%) and test (10%) sets
set.seed(101)
train_rows <- sample(1:nrow(ratings), size = floor(0.9 * nrow(ratings)), replace = FALSE)

ratings_train <- ratings[train_rows, ]
ratings_test <- ratings[-train_rows, ]

In [None]:
# Toy data frame used to illustrate how the sampling-based split works
data1 <- data.frame(x=c(300,500,120,900,30),y=c("John","Seiya","Mary","Chiddy","Philip"))
data1

In [None]:
nrow(data1)

In [None]:
set.seed(3)
# floor(0.9 * 5) = 4 rows go to training, leaving 1 row for testing
train_rows2 <- sample(1:nrow(data1), size = floor(0.9 * nrow(data1)), replace = FALSE)

ratings_train2 <- data1[train_rows2, ]
ratings_test2 <- data1[-train_rows2, ]

In [None]:
train_rows2

In [None]:
ratings_test2 

In [None]:
# Build the UBCF model on the training data
rec_model <- Recommender(data = ratings_train, method = "UBCF") 
rec_model

# get the model specifications as a list
# getModel(rec_model) 

In [None]:
# Predict the top n_reco items for each user in the test dataset
n_reco <- 3
recommendations <- predict(object = rec_model, newdata = ratings_test, n = n_reco)
recommendations

recommendations@ratings
recommendations@items
recommendations@itemLabels

reco_out <- as(recommendations, "list")
reco_out

In [None]:
# Keep only the 3 best recommendations per user
top3 <- bestN(recommendations, 3)
top3
as(top3, "list")

In [32]:
# recommenderlab also has built-in functionality to split the data into training and test sets
model <- evaluationScheme(ratings, method = 'split', train=0.9, given = 15, goodRating = 5)
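
The `given` and `goodRating` arguments can be easier to follow with a small base R sketch. This only mimics the idea for a single test user and is not recommenderlab's implementation; `user_ratings`, `known`, `unknown`, and `good_unknown` are hypothetical names, and the ratings are randomly generated.

```r
set.seed(1)

# Hypothetical ratings from one test user for 20 items (values 1 to 5).
user_ratings <- sample(1:5, 20, replace = TRUE)
names(user_ratings) <- paste0("item", 1:20)

given <- 15        # number of ratings revealed to the recommender
good_rating <- 5   # ratings at or above this value count as "good"

# Randomly choose which ratings are revealed; the rest are withheld.
known_idx <- sample(seq_along(user_ratings), given)
known <- user_ratings[known_idx]
unknown <- user_ratings[-known_idx]

length(known)
length(unknown)

# Withheld items that the user actually rated as good.
good_unknown <- names(unknown)[unknown >= good_rating]
good_unknown
```

The revealed part plays the role of `getData(model, "known")` and the withheld part that of `getData(model, "unknown")`. Note that `given = 15` only makes sense if every user has rated more than 15 items.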

In [None]:
ubcf = Recommender(getData(model,"train"), "UBCF")
ibcf = Recommender(getData(model,"train"), "IBCF")
svd = Recommender(getData(model, "train"), "SVD")
popular = Recommender(getData(model, "train"), "POPULAR")
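# NOTE: "PCA" does not appear in the registry listing above; it was available in
# older recommenderlab releases, so this call may fail with the installed version
pca = Recommender(getData(model, "train"), "PCA")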
random = Recommender(getData(model, "train"), "RANDOM") 

In [None]:
user_pred = predict(ubcf, getData(model,"known"),type="ratings")
item_pred = predict(ibcf, getData(model, "known"),type="ratings")
svd_pred = predict(svd, getData(model, "known"),type="ratings")
pop_pred = predict(popular, getData(model, "known"),type="ratings")
pca_pred = predict(pca, getData(model, "known"),type="ratings")
rand_pred = predict(random, getData(model, "known"), type="ratings")

In [None]:
# Examine the error between the predictions and the unknown portion of the test data
# using the calcPredictionAccuracy() function
P1 = calcPredictionAccuracy(user_pred, getData(model,"unknown"))
P2 = calcPredictionAccuracy(item_pred, getData(model,"unknown"))
P3 = calcPredictionAccuracy(svd_pred, getData(model, "unknown"))
P4 = calcPredictionAccuracy(pop_pred, getData(model, "unknown"))
P5 = calcPredictionAccuracy(pca_pred, getData(model,"unknown"))
P6 = calcPredictionAccuracy(rand_pred, getData(model,"unknown"))

In [None]:
error = rbind(P1,P2,P3,P4,P5,P6)
rownames(error) = c("UBCF", "IBCF", "SVD", "Popular", "PCA", "Random")
error 
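
For rating predictions, `calcPredictionAccuracy()` reports RMSE, MSE, and MAE. The base R sketch below shows how these three measures are defined, using made-up `predicted` and `actual` vectors rather than output from the models above.

```r
# Hypothetical predicted and actual ratings; NA means no withheld rating exists.
predicted <- c(4.2, 3.1, 5.0, 2.4, 3.8)
actual    <- c(4,   3,   4,   NA,  5)

# Only pairs with a known actual rating contribute to the error.
err <- (predicted - actual)[!is.na(actual)]

mse  <- mean(err^2)      # mean squared error
rmse <- sqrt(mse)        # root mean squared error
mae  <- mean(abs(err))   # mean absolute error

c(RMSE = rmse, MSE = mse, MAE = mae)
```

Lower values mean the predicted ratings are closer to the withheld ones, so the rows of the `error` table can be compared column by column.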