# Collaborative Filtering Using k-Nearest Neighbors (kNN)  
kNN is a machine learning algorithm that finds clusters of similar users based on their common book ratings and makes predictions using the average rating of the top-k nearest neighbors. For example, we first present the ratings in a matrix with one row for each item (book) and one column for each user. The dataset used in this notebook lists courses rather than books, so each item here is a course.
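As a minimal sketch of the idea described above, the following builds a tiny item-by-user rating matrix and predicts a missing rating as the average of the ratings given by the k most similar users. The ratings, the user and item names, and the helper `predict_rating` are all hypothetical and are not part of the dataset used in this notebook; a rating of 0 is assumed to mean "not rated".

```python
import pandas as pd
from sklearn.neighbors import NearestNeighbors

# Hypothetical toy ratings: one row per item, one column per user, 0 = not rated.
ratings = pd.DataFrame(
    {
        "user_a": [5, 4, 0, 1],
        "user_b": [5, 5, 4, 1],
        "user_c": [4, 5, 5, 2],
        "user_d": [1, 1, 2, 5],
    },
    index=["item_1", "item_2", "item_3", "item_4"],
)


def predict_rating(ratings, user, item, k=2):
    """Average the ratings that the k users most similar to `user` gave to `item`."""
    user_vectors = ratings.T  # one row per user
    # Only other users who actually rated the item can contribute a rating.
    rated = (user_vectors[item] > 0) & (user_vectors.index != user)
    candidates = user_vectors[rated]
    k = min(k, len(candidates))
    knn = NearestNeighbors(n_neighbors=k, metric="cosine", algorithm="brute")
    knn.fit(candidates.values)
    _, idx = knn.kneighbors(user_vectors.loc[[user]].values)
    return float(candidates.iloc[idx.flatten()][item].mean())


predicted = predict_rating(ratings, "user_a", "item_3", k=2)
print(predicted)
```

Here `user_b` and `user_c` rate the other items most like `user_a` does, so their ratings of `item_3` are the ones averaged.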

**Libraries**

In [1]:
import pandas as pd
import numpy as np
#avoid Warnings
import warnings
warnings.filterwarnings('ignore')

In [2]:
df = pd.read_csv('data.csv')

#drop duplicates 
df = df.drop_duplicates()
df.head(2)

Unnamed: 0,course_id,course_title,url,is_paid,price,num_subscribers,num_reviews,num_lectures,level,content_duration,published_timestamp,subject
0,1070968,Ultimate Investment Banking Course,https://www.udemy.com/ultimate-investment-bank...,True,200,2147,23,51,All Levels,1.5,2017-01-18T20:58:58Z,Business Finance
1,1113822,Complete GST Course & Certification - Grow You...,https://www.udemy.com/goods-and-services-tax/,True,75,2792,923,274,All Levels,39.0,2017-03-09T16:34:20Z,Business Finance


In [3]:
df.shape

(3672, 12)

#### Filtering Data

In [4]:
popularity_threshold = 100
rating_popular_book = df.query('num_subscribers >=  @popularity_threshold')
rating_popular_book.head()

Unnamed: 0,course_id,course_title,url,is_paid,price,num_subscribers,num_reviews,num_lectures,level,content_duration,published_timestamp,subject
0,1070968,Ultimate Investment Banking Course,https://www.udemy.com/ultimate-investment-bank...,True,200,2147,23,51,All Levels,1.5,2017-01-18T20:58:58Z,Business Finance
1,1113822,Complete GST Course & Certification - Grow You...,https://www.udemy.com/goods-and-services-tax/,True,75,2792,923,274,All Levels,39.0,2017-03-09T16:34:20Z,Business Finance
2,1006314,Financial Modeling for Business Analysts and C...,https://www.udemy.com/financial-modeling-for-b...,True,45,2174,74,51,Intermediate Level,2.5,2016-12-19T19:26:30Z,Business Finance
3,1210588,Beginner to Pro - Financial Analysis in Excel ...,https://www.udemy.com/complete-excel-finance-c...,True,95,2451,11,36,All Levels,3.0,2017-05-30T20:07:24Z,Business Finance
4,1011058,How To Maximize Your Profits Trading Options,https://www.udemy.com/how-to-maximize-your-pro...,True,200,1276,45,26,Intermediate Level,2.0,2016-12-13T14:57:18Z,Business Finance


In [5]:
rating_popular_book.shape

(2792, 12)
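The filter above relies on `DataFrame.query`, where the `@` prefix refers to a Python variable in the surrounding scope. A small sketch on hypothetical data (the frame `toy_courses` and the name `min_subscribers` are made up for illustration) shows that it selects the same rows as a boolean mask:

```python
import pandas as pd

# Hypothetical stand-in for the course table.
toy_courses = pd.DataFrame(
    {
        "course_title": ["Course A", "Course B", "Course C"],
        "num_subscribers": [50, 100, 2500],
    }
)

min_subscribers = 100

# query() resolves @min_subscribers from the enclosing Python scope.
via_query = toy_courses.query("num_subscribers >= @min_subscribers")

# Equivalent boolean-mask form.
via_mask = toy_courses[toy_courses["num_subscribers"] >= min_subscribers]

print(via_query)
```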

## Implementing KNN

In [6]:
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

In [7]:
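# pivot table: one row per course title, one column per course_id,
# subscriber counts as values; missing combinations are filled with 0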
df_pivot = rating_popular_book.pivot(index = 'course_title', columns = 'course_id', values = 'num_subscribers' ).fillna(0)

In [8]:
# convert the pivot table to a CSR (compressed sparse row) matrix
df_pivot_matrix = csr_matrix(df_pivot.values)
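For readers unfamiliar with the CSR format, here is a small sketch on a hypothetical 3x3 array showing what `csr_matrix` keeps: only the non-zero values, their column indices, and a pointer array marking where each row starts.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical mostly-zero array, similar in spirit to a zero-filled pivot table.
dense = np.array(
    [
        [0, 0, 3],
        [4, 0, 0],
        [0, 0, 0],
    ]
)

sparse = csr_matrix(dense)

print(sparse.nnz)      # number of stored (non-zero) entries
print(sparse.data)     # the non-zero values, row by row
print(sparse.indices)  # column index of each stored value
print(sparse.indptr)   # row i occupies data[indptr[i]:indptr[i + 1]]
```

Because the zeros are not stored, a pivot table that is mostly zeros takes far less memory in this form than as a dense array.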

### Model

In [9]:
model = NearestNeighbors(metric='cosine', algorithm='brute')
model.fit(df_pivot_matrix)
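With `metric='cosine'`, the distance between two rows is one minus the cosine of the angle between them, so rows pointing in the same direction have distance 0 and rows with no overlapping non-zero columns have distance 1. The sketch below checks this on hypothetical vectors by comparing `NearestNeighbors` against a hand-written `cosine_distance` helper (an illustrative name, not part of scikit-learn).

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# Hypothetical vectors chosen so the expected distances are easy to see.
vectors = np.array(
    [
        [1.0, 0.0, 2.0],
        [2.0, 0.0, 4.0],  # same direction as row 0
        [0.0, 3.0, 0.0],  # shares no non-zero column with row 0
    ]
)


def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors."""
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))


toy_model = NearestNeighbors(metric="cosine", algorithm="brute")
toy_model.fit(csr_matrix(vectors))

# Ask for all three rows as neighbours of row 0.
toy_distances, toy_indices = toy_model.kneighbors(vectors[[0]], n_neighbors=3)

# Recompute the same distances by hand, in the order the model returned them.
manual = [cosine_distance(vectors[0], vectors[j]) for j in toy_indices.flatten()]

print(toy_indices.flatten(), toy_distances.flatten())
```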

In [53]:
query_index = np.random.choice(df_pivot.shape[0])
print(query_index)
distances, indices  = model.kneighbors(df_pivot.iloc[query_index, :].values.reshape(1,-1))

287


In [73]:
df_pivot.index[query_index]

'Become a Graphic Designer, and earn a living from it'

In [72]:
print('Recommendations for {0}:\n'.format(df_pivot.index[query_index]))
# position 0 of the result is skipped; it is normally the query course itself (distance 0)
for i, neighbor_index in enumerate(indices.flatten()[1:], start=1):
    print('{0}: {1}.'.format(i, df_pivot.index[neighbor_index]))

Recommendations for Become a Graphic Designer, and earn a living from it:

1: Master Flute Playing: Intermediate Instruction Made Simple!.
2: Master ExpressJS to Build Web Apps with NodeJS&JavaScript.
3: Master Graphic Design Using Photoshop with Rachael.
4: Master JavaScript Programming, 3 Projects Included !!.
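To judge how close the recommended neighbours actually are, it can help to print the distances next to the titles. The sketch below does this on a hypothetical four-course table pivoted the same way as above; `toy`, `toy_pivot`, `toy_knn` and `recommend` are illustrative names. Assuming each `course_id` belongs to exactly one course, every row of such a pivot has its only non-zero value in its own column, so all cosine distances between different courses come out as 1 and the ranking among them carries no information.

```python
import pandas as pd
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# Hypothetical stand-in for the filtered course table.
toy = pd.DataFrame(
    {
        "course_id": [1, 2, 3, 4],
        "course_title": ["Course A", "Course B", "Course C", "Course D"],
        "num_subscribers": [150, 300, 120, 900],
    }
)

# Same pivot layout as in the notebook: titles as rows, course ids as columns.
toy_pivot = toy.pivot(
    index="course_title", columns="course_id", values="num_subscribers"
).fillna(0)

toy_knn = NearestNeighbors(metric="cosine", algorithm="brute")
toy_knn.fit(csr_matrix(toy_pivot.values))


def recommend(pivot, knn, title, n=3):
    """Return (title, distance) pairs for the n nearest rows, excluding the query row."""
    row = pivot.loc[[title]].values
    dist, idx = knn.kneighbors(row, n_neighbors=n + 1)
    pairs = [(pivot.index[i], float(d)) for d, i in zip(dist.flatten(), idx.flatten())]
    return [(t, d) for t, d in pairs if t != title][:n]


recs = recommend(toy_pivot, toy_knn, "Course A", n=3)
for rank, (rec_title, rec_distance) in enumerate(recs, start=1):
    print("{0}: {1} (cosine distance {2:.3f})".format(rank, rec_title, rec_distance))
```

If the printed distances are all equal, the neighbours are effectively tied, which is worth checking before reading much into a recommendation list.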
