# **CA06–kNN-based Recommender Engine**

**1. The Application:** At scale, a recommender engine looks like product recommendations on Amazon, article recommendations on Medium, movie recommendations on Netflix, or video recommendations on YouTube. Those services almost certainly use more efficient techniques because of the enormous volume of data they process, but we can replicate the core idea on a smaller scale using what we have learned about k-nearest neighbors (kNN). Let us build the core of a movie recommender system.

In [1]:
# import packages
import pandas as pd
import numpy as np
from sklearn.neighbors import KNeighborsClassifier, NearestNeighbors

**2. Data Source and Contents:** We will use a small subset of the data extracted from UCI’s IMDB data set.

In [2]:
data = pd.read_csv('https://github.com/ArinB/CA05-kNN/raw/master/movies_recommendation_data.csv')

In [None]:
# summary statistics for the numeric columns
data.describe()

In [None]:
# non-null values in each column 
data.count()

In [None]:
# number of null values in each column
data.isnull().sum()

In [None]:
# index range, column names, non-null count and dtype of each column, and memory usage
data.info()

In [None]:
# show tail & head (a notebook cell displays only its last expression, so head() is what appears below)
data.tail()
data.head()

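| | Movie ID | Movie Name | IMDB Rating | Biography | Drama | Thriller | Comedy | Crime | Mystery | History | Label |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 58 | The Imitation Game | 8.0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| 1 | 8 | Ex Machina | 7.7 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 |
| 2 | 46 | A Beautiful Mind | 8.2 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 62 | Good Will Hunting | 8.3 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 97 | Forrest Gump | 8.8 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |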


**3. Building your own Recommender System:** You are building your own movie recommendation website, and your task is to build the Recommendation Engine that powers its back-end. Imagine a user navigating the website encounters a movie named “The Post”. The user is not sure whether to watch it, but its genres are intriguing, and the user is curious about similar movies. The user scrolls down to the “More Like This” section to see what the website will recommend, and the back-end algorithmic gears begin to turn.
Your website sends a request to its back-end for the 5 movies that are most similar to The Post. The back-end has a recommendation data set exactly like ours. It begins by creating the row representation (better known as a feature vector) for The Post, then it runs a program similar to the one below to search for the 5 movies that are most similar to The Post, and finally sends the results back to the user at your website.

**The rating and genre information for the movie is as follows:**
“The Post”: IMDB Rating = 7.2, Biography = Yes, Drama = Yes, Thriller = No, Comedy = No, Crime = No, Mystery = No, History = Yes
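
One way to picture "most similar" is the straight-line (Euclidean) distance between two feature vectors. The sketch below compares The Post with The Imitation Game, using the values from the data preview above. The variable names are illustrative and do not come from the notebook.

```python
import numpy as np

# Feature order: IMDB Rating, Biography, Drama, Thriller, Comedy, Crime, Mystery, History
the_post = np.array([7.2, 1, 1, 0, 0, 0, 0, 1])
imitation_game = np.array([8.0, 1, 1, 1, 0, 0, 0, 0])  # row 0 of the data preview

# Euclidean distance: square the per-feature differences, sum them, take the square root
distance = np.sqrt(np.sum((the_post - imitation_game) ** 2))
print(round(distance, 2))  # roughly 1.62
```

The two movies differ by 0.8 in rating and disagree on two genre flags (Thriller and History), so the squared differences sum to about 0.64 + 1 + 1 = 2.64. A smaller distance means a more similar movie.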

In [4]:
# x: feature matrix (rating + genre flags); y: movie names, used later to label the results
x = data.drop(['Movie Name', 'Movie ID', 'Label'], axis = 1)
y = data['Movie Name']

In [5]:
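# unsupervised nearest-neighbor search; Minkowski distance with p = 2 is Euclidean distance
model_knn = NearestNeighbors(metric = 'minkowski', algorithm = 'brute', n_neighbors = 5, p = 2)
# NearestNeighbors.fit ignores y; it is passed here only for API consistency
model_knn.fit(x,y)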

NearestNeighbors(algorithm='brute', leaf_size=30, metric='minkowski',
                 metric_params=None, n_jobs=None, n_neighbors=5, p=2,
                 radius=1.0)
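
For reference, the `metric = 'minkowski'` and `p = 2` settings correspond to the standard Minkowski distance, which reduces to Euclidean distance when p = 2. Here a and b stand for two feature vectors with n = 8 components (one rating and seven genre flags). The symbols are notation chosen for this note, not names from the data set.

$$
d(\mathbf{a}, \mathbf{b}) = \left( \sum_{i=1}^{n} |a_i - b_i|^{p} \right)^{1/p}
\quad\xrightarrow{\;p = 2\;}\quad
d(\mathbf{a}, \mathbf{b}) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}
$$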

In [6]:
# create a dict with the feature values for 'The Post'
movie_list = {'IMDB Rating' : 7.2,
              'Biography' : 1,
              'Drama' : 1,
              'Thriller' : 0,
              'Comedy' : 0,
              'Crime' : 0,
              'Mystery' : 0,
              'History' : 1}

# build a one-row DataFrame with the same columns, in the same order, as x
df = pd.DataFrame([movie_list])
df

| | IMDB Rating | Biography | Drama | Thriller | Comedy | Crime | Mystery | History |
|---|---|---|---|---|---|---|---|---|
| 0 | 7.2 | 1 | 1 | 0 | 0 | 0 | 0 | 1 |


In [7]:
# find the row indices of the 5 nearest neighbors (one row of indices per query)
prediction = model_knn.kneighbors(df, return_distance = False)
prediction

array([[28, 27, 29, 16,  9]])
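
The indices alone do not show how close each neighbor is. Calling `kneighbors` with `return_distance = True` returns the distances as well. Because the full CSV is not reproduced here, the sketch below fits on only the five rows shown in the data preview and asks for 3 neighbors. The names `sample`, `query`, and `knn` are assumptions made for illustration, and the result is not the notebook's actual output.

```python
import pandas as pd
from sklearn.neighbors import NearestNeighbors

feature_cols = ["IMDB Rating", "Biography", "Drama", "Thriller",
                "Comedy", "Crime", "Mystery", "History"]

# Only the five preview rows, not the full data set
sample = pd.DataFrame(
    [
        ["The Imitation Game", 8.0, 1, 1, 1, 0, 0, 0, 0],
        ["Ex Machina", 7.7, 0, 1, 0, 0, 0, 1, 0],
        ["A Beautiful Mind", 8.2, 1, 1, 0, 0, 0, 0, 0],
        ["Good Will Hunting", 8.3, 0, 1, 0, 0, 0, 0, 0],
        ["Forrest Gump", 8.8, 0, 1, 0, 0, 0, 0, 0],
    ],
    columns=["Movie Name"] + feature_cols,
)

# Feature vector for 'The Post'
query = pd.DataFrame([[7.2, 1, 1, 0, 0, 0, 0, 1]], columns=feature_cols)

knn = NearestNeighbors(metric="minkowski", p=2, algorithm="brute", n_neighbors=3)
knn.fit(sample[feature_cols])

# Both arrays have shape (n_queries, n_neighbors), sorted from nearest to farthest
distances, indices = knn.kneighbors(query, return_distance=True)
for dist, idx in zip(distances[0], indices[0]):
    print(f"{sample.loc[idx, 'Movie Name']}: {dist:.3f}")
```

On this five-row sample the nearest movie is A Beautiful Mind, followed by The Imitation Game and Good Will Hunting. The full data set yields different neighbors because it contains many more candidates.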

**4. What recommendations will he/she see?** Implement this problem using Python scikit-learn and display the answer within the Notebook with proper narrative / comments.

In [8]:
# show movie names from predictions
movie_names = [y[i] for i in prediction]
movie_names

[28    12 Years a Slave
 27       Hacksaw Ridge
 29      Queen of Katwe
 16      The Wind Rises
 9       The Karate Kid
 Name: Movie Name, dtype: object]
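
The lookup above returns a list holding a single Series, because `kneighbors` returns a 2-D array with one row per query. If a plain list of titles is preferred, one option is to take the first row and index by position with `.iloc`. This is a small sketch with made-up indices and only the five preview titles; `names` and `neighbor_idx` are hypothetical stand-ins for `y` and `prediction`.

```python
import numpy as np
import pandas as pd

# Stand-in for y: the movie names from the five preview rows
names = pd.Series(
    ["The Imitation Game", "Ex Machina", "A Beautiful Mind",
     "Good Will Hunting", "Forrest Gump"],
    name="Movie Name",
)

# Stand-in for prediction: shape (n_queries, n_neighbors), illustrative values only
neighbor_idx = np.array([[2, 0, 3]])

# Take the first (and only) query row, then look the names up by position
recommended = names.iloc[neighbor_idx[0]].tolist()
print(recommended)  # ['A Beautiful Mind', 'The Imitation Game', 'Good Will Hunting']
```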

Based on the rating and genre information of "The Post", the model generated five recommendations: 12 Years a Slave, Hacksaw Ridge, Queen of Katwe, The Wind Rises, and The Karate Kid. These are the five movies in the data set whose feature vectors lie closest to that of "The Post" by Euclidean distance.