# Table of Contents
* [1. Introduction](#1.-Introduction)
	* [1.1 What is it?](#1.1-What-is-it?)
	* [1.2 Why recommender systems](#1.2-Why-recommender-systems)
	* [1.3 Big data driving the recommender systems](#1.3-Big-data-driving-the-recommender-systems)
	* [1.4 Types](#1.4-Types)
		* [1.4.1 Collaborative filtering recommender systems](#1.4.1-Collaborative-filtering-recommender-systems)
		* [1.4.2 Content-based recommender systems](#1.4.2-Content-based-recommender-systems)
		* [1.4.3 Hybrid recommender systems](#1.4.3-Hybrid-recommender-systems)
		* [1.4.4 Context-aware recommender systems](#1.4.4-Context-aware-recommender-systems)
* [2. Simple-Recommendation-Engine](#2.-Simple-Recommendation-Engine)
	* [2.1 Building blocks of the recommendation engine](#2.1-Building-blocks-of-the-recommendation-engine)
	* [2.2 Loading & formatting](#2.2-Loading-&-formatting)
* [3. Recommendation Engines](#3.-Recommendation-Engines)
* [4. Data-Mining & Recommendation Engines](#4.-Data-Mining-&-Recommendation-Engines)
* [5. Collaborative Filtering Recommendation Engines](#5.-Collaborative-Filtering-Recommendation-Engines)
* [6. Personalized Recommendation Engines](#6.-Personalized-Recommendation-Engines)
* [7. Real-Time Recommendation Engines](#7.-Real-Time-Recommendation-Engines)
* [8. Scalable Recommendation Engines](#8.-Scalable-Recommendation-Engines)


# 1. Introduction

## 1.1 What is it?

* Recommendation engines are tools and techniques that analyze large volumes of data with data mining approaches in order to provide relevant suggestions. They take into account the user's available digital footprint, such as:

    * User behaviors
    * User demographic information
    * Transaction details
    * Interaction logs

  and information about a product, such as:

    * Specifications
    * Feedback from users
    * Comparison with other products, and so on


* In a nutshell, and in technical terms, the recommendation engine problem is to develop a mathematical model, or objective function, that predicts how much a user will like an item.

    * _If U = {users} and I = {items}, then the objective function F measures the usefulness of an item i ∈ I to a user u ∈ U, and is given by:_<br />
    * _**F: U × I → R**_<br />
    * _Where R = {recommended items}_<br />
    * _For each user u, we want to choose the item i that maximizes the objective function:_<br />
![](imgs/1.png)
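
As a minimal sketch of this maximization, assume the objective function F is already available as a lookup table of predicted scores. The names `scores` and `best_item`, the user and item labels, and the score values are all made up for illustration; none of them come from the original text.

```python
# Hypothetical predicted scores F(u, i); the values are made up for illustration.
scores = {
    ("alice", "item_a"): 3.5,
    ("alice", "item_b"): 4.8,
    ("alice", "item_c"): 2.0,
    ("bob", "item_a"): 4.1,
    ("bob", "item_b"): 1.5,
}


def best_item(user, scores):
    """Return the item i that maximizes F(user, i), or None if the user has no scores."""
    candidates = {item: s for (u, item), s in scores.items() if u == user}
    if not candidates:
        return None
    # max() over the item names, ranked by their score, is the argmax over i.
    return max(candidates, key=candidates.get)


print(best_item("alice", scores))
print(best_item("bob", scores))
```

In a real engine the hard part is estimating the scores themselves; once they exist, picking the recommendation is just this argmax (or a top-N sort).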


* Companies employ different recommendation strategies depending on the kind of data they track. Here is how they use them:

![](https://www.packtpub.com/graphics/9781785884856/graphics/image_01_003.jpg)

## 1.2 Why recommender systems

Building recommendation engines is complex and challenging, and it takes a considerable amount of thought, skill, investment, and technology. Are they worth such an investment? Let us look at some facts:
* Two-thirds of the movies watched by Netflix customers are recommended movies
* 38% of click-throughs on Google News come from recommended links
* 35% of sales at Amazon arise from recommended products
* ChoiceStream claims that 28% of people would buy more music if they found what they liked

## 1.3 Big data driving the recommender systems

Without a doubt, big data is the driving force behind recommender systems. A good recommendation engine should be reliable, scalable, and highly available, and it should provide personalized recommendations in real time to the large user base it serves.

A typical recommendation system cannot do its job without sufficient data. Big data supplies plenty of user data, such as past purchases, browsing history, and feedback, from which recommendation systems can produce relevant and effective recommendations. In a nutshell, even the most advanced recommender systems cannot be effective without big data.

The role of big data and of improvements in technology, on both the software and hardware fronts, goes beyond just supplying massive amounts of data. They also provide meaningful, actionable data fast, along with the setup needed to process that data in real time.

## 1.4 Types

### 1.4.1 Collaborative filtering recommender systems

### 1.4.2 Content-based recommender systems

### 1.4.3 Hybrid recommender systems

### 1.4.4 Context-aware recommender systems

# 2. Simple-Recommendation-Engine

## 2.1 Building blocks of the recommendation engine

* The recommendation engine we are going to build is based on the collaborative filtering approach, which relies on the user's neighbourhood, as explained in the following figure:

![collaborative filtering approach]()

* The building blocks of the engine are as follows:
    * Loading & formatting data
    * Calculating similarities between users
    * Predicting the unknown ratings for users
    * Recommending items to users based on similarities between users

![The steps]()
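
The last three building blocks can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy ratings are made up (they are not the movie dataset), Pearson correlation is assumed as the similarity measure, only positively correlated neighbours are used for prediction, and the function names `pearson`, `predict`, and `recommend` are hypothetical.

```python
from math import sqrt

# Made-up toy ratings (user -> {item: rating}); not the dataset used in this document.
toy = {
    "u1": {"A": 5.0, "B": 3.0, "C": 4.0},
    "u2": {"A": 4.0, "B": 2.0, "C": 5.0, "D": 4.0},
    "u3": {"A": 1.0, "B": 5.0, "C": 2.0, "D": 1.0},
    "u4": {"A": 4.0, "B": 3.0, "C": 4.0, "D": 2.0},
}


def pearson(a, b):
    """Pearson correlation of two users over the items both have rated."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def predict(user, item, ratings):
    """Similarity-weighted average of the neighbours' ratings for `item`.

    Assumption: only positively correlated neighbours contribute.
    """
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = pearson(ratings[user], other_ratings)
        if sim > 0:
            num += sim * other_ratings[item]
            den += sim
    return num / den if den else None


def recommend(user, ratings):
    """Items the user has not rated, highest predicted rating first."""
    unseen = {i for r in ratings.values() for i in r} - set(ratings[user])
    preds = [(i, predict(user, i, ratings)) for i in unseen]
    return sorted((p for p in preds if p[1] is not None), key=lambda p: -p[1])


print(recommend("u1", toy))
```

Here `u1` has not rated item `D`, so its predicted rating is pulled towards the ratings of the users whose tastes correlate with `u1` (`u2` and `u4`), while the negatively correlated `u3` is ignored.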

## 2.2 Loading & formatting

* The dataset we are going to use can be downloaded from [here](https://raw.githubusercontent.com/sureshgorakala/RecommenderSystems_R/master/movie_rating.csv)

* It's a movie rating dataset containing users' ratings from 1 to 5
* The data is structured as comma-separated values (CSV)
* Our **goal** is to build a simple recommendation system 
* Our **objective** is to recommend unknown movies to users based on the ratings of similar users

In [2]:
# Load R (rpy2)
%load_ext rpy2.ipython
# Loading the data from the CSV file 
%R ratings = read.csv("../../datasets/movie_rating.csv")

| | critic | title | rating |
|---|---|---|---|
| 1 | Jack Matthews | Lady in the Water | 3.0 |
| 2 | Jack Matthews | Snakes on a Plane | 4.0 |
| 3 | Jack Matthews | You Me and Dupree | 3.5 |
| 4 | Jack Matthews | Superman Returns | 5.0 |
| 5 | Jack Matthews | The Night Listener | 3.0 |
| 6 | Mick LaSalle | Lady in the Water | 3.0 |
| 7 | Mick LaSalle | Snakes on a Plane | 4.0 |
| 8 | Mick LaSalle | Just My Luck | 2.0 |
| 9 | Mick LaSalle | Superman Returns | 3.0 |
| 10 | Mick LaSalle | You Me and Dupree | 2.0 |


In [3]:
# Ratings sample (first six rows)
%R head(ratings)

| | critic | title | rating |
|---|---|---|---|
| 1 | Jack Matthews | Lady in the Water | 3.0 |
| 2 | Jack Matthews | Snakes on a Plane | 4.0 |
| 3 | Jack Matthews | You Me and Dupree | 3.5 |
| 4 | Jack Matthews | Superman Returns | 5.0 |
| 5 | Jack Matthews | The Night Listener | 3.0 |
| 6 | Mick LaSalle | Lady in the Water | 3.0 |


In [5]:
# The dimensions of the dataset 
%R dim(ratings)

array([31,  3], dtype=int32)

In [7]:
# The structure of the data 
%R str(ratings)

'data.frame':	31 obs. of  3 variables:
 $ critic: Factor w/ 6 levels "Claudia Puig",..: 3 3 3 3 3 5 5 5 5 5 ...
 $ title : Factor w/ 6 levels "Just My Luck",..: 2 3 6 4 5 2 3 1 4 6 ...
 $ rating: num  3 4 3.5 5 3 3 4 2 3 2 ...


* As you can see, the dataset contains 31 observations and three variables. To see the levels of a factor variable, we use levels().

In [10]:
# Levels of the attributes
%R levels(ratings$critic)

array(['Claudia Puig', 'Gene Seymour', 'Jack Matthews', 'Lisa Rose',
       'Mick LaSalle', 'Toby'], 
      dtype='|S13')

In [11]:
%R levels(ratings$title)

array(['Just My Luck', 'Lady in the Water', 'Snakes on a Plane',
       'Superman Returns', 'The Night Listener', 'You Me and Dupree'], 
      dtype='|S18')

In [12]:
# rating is numeric rather than a factor, so levels() returns NULL
%R levels(ratings$rating)

* To build the recommender, we have to create a matrix in which rows contain users, columns contain items, and cells contain the ratings given by users to the items.

* The next step is therefore to arrange the data in a format that is useful for building the recommendation engine. Each row of the current data holds a critic, a title, and a rating. This has to be converted to matrix format, with critics as rows, titles as columns, and ratings as the cell values.

In [14]:
# acast() comes from the reshape2 package, which has to be loaded first
%R library(reshape2)
# Data processed and formatted: critics as rows, titles as columns (rows ~ columns)
%R movie_ratings = as.data.frame(acast(ratings, critic ~ title, value.var="rating"))
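
For readers following along in the notebook's Python kernel, the long-to-wide reshaping described in this section can also be sketched without R. The example below is a minimal illustration, not the notebook's own method: it uses only the ten rows displayed earlier in this section rather than the full 31-row file, and the name `to_matrix` is hypothetical.

```python
# The ten (critic, title, rating) rows displayed earlier in this section.
rows = [
    ("Jack Matthews", "Lady in the Water", 3.0),
    ("Jack Matthews", "Snakes on a Plane", 4.0),
    ("Jack Matthews", "You Me and Dupree", 3.5),
    ("Jack Matthews", "Superman Returns", 5.0),
    ("Jack Matthews", "The Night Listener", 3.0),
    ("Mick LaSalle", "Lady in the Water", 3.0),
    ("Mick LaSalle", "Snakes on a Plane", 4.0),
    ("Mick LaSalle", "Just My Luck", 2.0),
    ("Mick LaSalle", "Superman Returns", 3.0),
    ("Mick LaSalle", "You Me and Dupree", 2.0),
]


def to_matrix(rows):
    """Pivot long-format rows into {critic: {title: rating}}.

    Every critic gets an entry for every title; missing ratings are None.
    """
    critics = sorted({critic for critic, _, _ in rows})
    titles = sorted({title for _, title, _ in rows})
    matrix = {critic: {title: None for title in titles} for critic in critics}
    for critic, title, rating in rows:
        matrix[critic][title] = rating
    return matrix


matrix = to_matrix(rows)
print(matrix["Jack Matthews"]["Superman Returns"])
print(matrix["Mick LaSalle"]["The Night Listener"])
```

Critics end up as rows and titles as columns, and a title that a critic has not rated stays `None`; those empty cells are exactly the unknown ratings the engine later has to predict.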


**... to be continued :)**

# 3. Recommendation Engines

# 4. Data-Mining & Recommendation Engines

# 5. Collaborative Filtering Recommendation Engines

# 6. Personalized Recommendation Engines

# 7. Real-Time Recommendation Engines

# 8. Scalable Recommendation Engines