## Objectives
Perform KNN-based collaborative filtering on the user-item interaction matrix

### Load and explore the dataset

In [1]:
import pandas as pd


In [2]:
rating_df = pd.read_csv("ratings.csv")
rating_df.head()

Unnamed: 0,user,item,rating
0,1889878,CC0101EN,3.0
1,1342067,CL0101EN,3.0
2,1990814,ML0120ENv3,3.0
3,380098,BD0211EN,3.0
4,779563,DS0101EN,3.0


The dataset contains three columns: user id (learner), item id (course), and rating (enrollment mode).

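This is the long (vertical) format, with one row per user-item interaction. You can convert it to a
sparse user-item matrix (one row per user, one column per item, 0 where there is no interaction) using `pivot`: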

In [3]:
rating_sparse_df = rating_df.pivot(index='user', columns='item', values='rating').fillna(0).reset_index().rename_axis(index=None, columns=None)
rating_sparse_df.head()

Unnamed: 0,user,AI0111EN,BC0101EN,BC0201EN,BC0202EN,BD0101EN,BD0111EN,BD0115EN,BD0121EN,BD0123EN,...,SW0201EN,TA0105,TA0105EN,TA0106EN,TMP0101EN,TMP0105EN,TMP0106,TMP107,WA0101EN,WA0103EN
0,2,0.0,3.0,0.0,0.0,3.0,2.0,0.0,2.0,2.0,...,0.0,2.0,0.0,3.0,0.0,2.0,2.0,0.0,3.0,0.0
1,4,0.0,0.0,0.0,0.0,2.0,2.0,2.0,2.0,2.0,...,0.0,2.0,0.0,0.0,0.0,2.0,2.0,0.0,2.0,2.0
2,5,2.0,2.0,2.0,0.0,2.0,0.0,0.0,0.0,2.0,...,0.0,0.0,2.0,2.0,2.0,2.0,2.0,2.0,0.0,2.0
3,7,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,8,0.0,0.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
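To see what the pivot does on a scale small enough to read in full, here is a minimal sketch on a made-up table. The user ids and ratings in `toy_long` are invented for illustration and are not taken from `ratings.csv`; only the chain of calls mirrors the cell above.

```python
import pandas as pd

# Toy long-format ratings: one row per (user, item) interaction.
# These values are hypothetical and only illustrate the reshaping.
toy_long = pd.DataFrame({
    "user": [1, 1, 2, 3],
    "item": ["CC0101EN", "DS0101EN", "CC0101EN", "ML0120ENv3"],
    "rating": [3.0, 2.0, 2.0, 3.0],
})

# One row per user, one column per item; pairs with no interaction become 0.
toy_wide = (
    toy_long.pivot(index="user", columns="item", values="rating")
    .fillna(0)
    .reset_index()
    .rename_axis(index=None, columns=None)
)
print(toy_wide)
```

Four interactions turn into a 3-user by 3-item grid, so five of the nine cells are filled with 0. On the real data the share of zeros is far higher, which is why this form is called sparse.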


## Perform KNN-based collaborative filtering on the user-item interaction matrix

You must implement the following two options of KNN-based collaborative filtering:

1. Use scikit-surprise, a popular and easy-to-use Python recommender system library.

2. Implement it with standard numpy, pandas, and sklearn. You may need to write a lot of low-level implementation code along the way.
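To give a feel for the low-level code that option 2 involves, the following is a minimal sketch of user-based KNN prediction written with numpy only. It rests on several assumptions that are not in the lab itself: the function name `predict_rating` is hypothetical, the `toy` matrix is made up, 0 is treated as "no interaction", and cosine similarity is just one possible choice of similarity measure.

```python
import numpy as np

def predict_rating(matrix, user_idx, item_idx, k=2):
    """Estimate one rating as the similarity-weighted mean of the ratings
    given to the item by the k users most similar to the target user."""
    # Cosine similarity between the target user and every user.
    norms = np.linalg.norm(matrix, axis=1)
    norms[norms == 0] = 1.0  # avoid division by zero for users with no ratings
    sims = (matrix @ matrix[user_idx]) / (norms * norms[user_idx])

    # Only other users who actually rated the item can act as neighbours.
    rated = np.flatnonzero(matrix[:, item_idx] > 0)
    rated = rated[rated != user_idx]
    if rated.size == 0:
        return None  # no neighbour information for this item

    # Keep the k candidates with the highest similarity.
    top = rated[np.argsort(sims[rated])[::-1][:k]]
    weights = sims[top]
    if weights.sum() <= 0:
        # Fall back to a plain mean when all similarities are zero.
        return float(matrix[top, item_idx].mean())
    return float(weights @ matrix[top, item_idx] / weights.sum())

# Hypothetical 3-user x 3-item matrix; 0 means "no interaction".
toy = np.array([
    [3.0, 2.0, 0.0],
    [3.0, 2.0, 3.0],
    [0.0, 3.0, 2.0],
])
print(predict_rating(toy, user_idx=0, item_idx=2))
```

User 0 has not interacted with the third item, so the estimate is pulled from users 1 and 2, with user 1 weighted more heavily because its row is closer to user 0's. A full solution would still need neighbour search at scale and a train/test evaluation, which is the bookkeeping a library handles for you.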

### Implementation Option 1: Use Surprise library
Surprise is a Python scikit (SciPy toolkit) for recommender systems. It makes building and testing
different recommendation algorithms simple.

In [4]:
from surprise import KNNBasic
from surprise import Dataset, Reader
from surprise.model_selection import train_test_split
from surprise import accuracy

In [5]:
# Load the movielens-100k dataset (download it if needed).
data = Dataset.load_builtin('ml-100k', prompt=False)


In [6]:
# sample random trainset and testset
# test set is made of 25% of the ratings.
trainset, testset = train_test_split(data, test_size=.25)
# We'll use the famous KNNBasic algorithm.
algo = KNNBasic()
# Train the algorithm on the trainset, and predict ratings for the testset
algo.fit(trainset)
predictions = algo.test(testset)
# Then compute RMSE
accuracy.rmse(predictions)

Computing the msd similarity matrix...
Done computing similarity matrix.
RMSE: 0.9762


0.9762171896497851

As you can see, a few lines are enough to apply KNN collaborative filtering to the sample
MovieLens dataset. The main evaluation metric is Root Mean Square Error (RMSE), a very
popular rating-estimation error metric used in recommender systems and in the evaluation of
many regression models.
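For reference, RMSE is the square root of the mean squared difference between the true and the estimated ratings. The sketch below computes it by hand with numpy; the helper name `rmse` and the sample ratings are made up for illustration.

```python
import numpy as np

def rmse(true_ratings, estimated_ratings):
    """Root Mean Square Error between two equally long rating sequences."""
    true_ratings = np.asarray(true_ratings, dtype=float)
    estimated_ratings = np.asarray(estimated_ratings, dtype=float)
    # Square the errors, average them, then take the square root.
    return float(np.sqrt(np.mean((true_ratings - estimated_ratings) ** 2)))

# Hypothetical true ratings and model estimates.
print(rmse([3.0, 2.0, 3.0], [2.5, 2.0, 3.5]))
```

With Surprise, the same quantity should be recoverable from the `r_ui` (true rating) and `est` (estimated rating) fields of each prediction, which is a convenient cross-check against `accuracy.rmse`.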

In [7]:
rating_df.to_csv("course_ratings.csv", index=False)
# Read the course rating dataset with columns user item rating
reader = Reader(line_format='user item rating', sep=',', skip_lines=1, rating_scale=(2, 3))

course_dataset = Dataset.load_from_file("course_ratings.csv", reader=reader)


In [8]:
trainset, testset = train_test_split(course_dataset, test_size=.3)

# Check how many users and items we can use to fit a KNN model:
print(f"Total {trainset.n_users} users and {trainset.n_items} items in the training set")

Total 31307 users and 124 items in the training set


### TASK: Perform KNN-based collaborative filtering on the user-item interaction matrix
TODO: Fit the KNN-based collaborative filtering model using the trainset and evaluate the results
using the testset: