**Deliberate Practice - Recommender Systems**
In this exercise, we'll use the [MovieLens 100K Dataset](https://grouplens.org/datasets/movielens/100k/), which contains 100,000 movie ratings from around 1,000 users on about 1,700 movies. This deliberate practice walks through recommending movies to a particular user based on the movies they have already rated. We'll use the [LightFM](https://github.com/lyst/lightfm) library, which implements a number of popular recommendation algorithms.

Recommender algorithms help us make decisions by learning our preferences. Popular services use them to suggest products to their customers. Netflix, for example, uses recommender systems to suggest movies and TV shows based on what a user has liked in the past or what other similar users have liked. Similarly, Amazon uses recommender systems to suggest products to its customers based on the same principle.




**Recommender systems can be classified under 2 major categories:**

**Collaborative Systems:**<br>
Collaborative systems provide suggestions based on what other similar users liked in the past. By recording the preferences of users, a collaborative system groups similar users together and recommends items to each user based on the activity of the other users in the same group.
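
As a rough illustration of the collaborative idea, the sketch below finds the user most similar to a target user and suggests movies that neighbour rated but the target has not seen. The ratings, the user names, and the helper functions `cosine_similarity` and `recommend_from_neighbour` are hypothetical and exist only for this example; this is not how LightFM works internally.

```python
from math import sqrt

# Toy ratings (hypothetical data): user -> {movie: rating}
ratings = {
    "alice": {"Toy Story": 5, "Fargo": 4, "Alien": 1},
    "bob": {"Toy Story": 4, "Fargo": 5, "Heat": 5},
    "carol": {"Alien": 5, "Heat": 2, "Blade Runner": 5},
}


def cosine_similarity(a, b):
    """Cosine similarity between two rating dicts; unrated movies count as 0."""
    shared = set(a) & set(b)
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_from_neighbour(user):
    """Suggest movies rated by the most similar other user but unseen by `user`."""
    others = [u for u in ratings if u != user]
    nearest = max(others, key=lambda u: cosine_similarity(ratings[user], ratings[u]))
    unseen = {m: r for m, r in ratings[nearest].items() if m not in ratings[user]}
    # Highest-rated unseen movies first
    return nearest, sorted(unseen, key=unseen.get, reverse=True)


nearest, suggestions = recommend_from_neighbour("alice")
print(nearest, suggestions)
```

Here the neighbour is chosen by cosine similarity over raw ratings, which is only one of several reasonable similarity measures.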

**Content-based Systems:**<br>
Content-based systems provide recommendations based on what the user liked in the past. That history can take the form of movie ratings, likes, and clicks. Using this recorded activity, the algorithm suggests products whose features are similar to those of the products the user liked before.
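
A minimal sketch of the content-based idea, assuming each movie is described by a few genre flags: the user's liked movies are summed into a profile vector, and the remaining movies are ranked by how closely their features match it. The movies, the genre flags, and the `cosine` helper are made up for illustration.

```python
from math import sqrt

# Hypothetical genre flags per movie: (action, comedy, sci-fi)
movie_features = {
    "Alien": (1, 0, 1),
    "Blade Runner": (1, 0, 1),
    "Toy Story": (0, 1, 0),
    "Men in Black": (1, 1, 1),
}


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Movies the user liked in the past (hypothetical)
liked = ["Alien"]

# User profile: element-wise sum of the feature vectors of liked movies
profile = [sum(column) for column in zip(*(movie_features[m] for m in liked))]

# Rank the movies the user has not liked yet by similarity to the profile
candidates = [m for m in movie_features if m not in liked]
ranked = sorted(candidates, key=lambda m: cosine(profile, movie_features[m]), reverse=True)
print(ranked)
```

Unlike the collaborative approach, this ranking needs no information about other users, only item features and one user's history.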

We'll start by installing the dependencies. Below, we install the LightFM library, which provides our model.

In [None]:
!pip install lightfm

We will now import our dependencies. We'll import numpy for calculations and for handling arrays, and the LightFM class, which is our model. We'll also import the fetch_movielens function, which fetches the MovieLens 100K dataset in a suitable format.

In [None]:
import numpy as np
from lightfm import LightFM
from lightfm.datasets import fetch_movielens

The dataset contains 100,000 interactions from around 1,000 users on about 1,700 movies, and is exhaustively described in its [README](http://files.grouplens.org/datasets/movielens/ml-100k-README.txt).

In [None]:
# Fetch the MovieLens data with the imported fetch_movielens function,
# keeping only ratings of 3.0 or higher, and store it in movielens_data
movielens_data = fetch_movielens(min_rating=3.0)



The return value is a dictionary containing the following keys:

* train (sp.coo_matrix of shape [n_users, n_items]) – Contains training set interactions.
* test (sp.coo_matrix of shape [n_users, n_items]) – Contains testing set interactions.
* item_features (sp.csr_matrix of shape [n_items, n_item_features]) – Contains item features.
* item_feature_labels (np.array of strings of shape [n_item_features,]) – Labels of item features.
* item_labels (np.array of strings of shape [n_items,]) – Items’ titles.
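
To make the `min_rating` threshold and the `[n_users, n_items]` shape more concrete, here is a small sketch under stated assumptions. The rating triples are invented, the matrix is dense for readability (LightFM returns sparse matrices), and storing the rating itself as the matrix entry is an assumption of this example rather than a description of LightFM's internals.

```python
# Hypothetical (user_id, item_id, rating) triples in the spirit of MovieLens
raw_ratings = [(0, 0, 5), (0, 1, 2), (1, 0, 3), (1, 2, 1), (2, 1, 4)]
toy_min_rating = 3.0

# Keep only the ratings at or above the threshold
kept = [(u, i, r) for u, i, r in raw_ratings if r >= toy_min_rating]

# Matrix dimensions come from the largest user and item ids seen
toy_n_users = 1 + max(u for u, _, _ in raw_ratings)
toy_n_items = 1 + max(i for _, i, _ in raw_ratings)

# Dense [n_users, n_items] matrix; 0 means "no interaction recorded"
interactions = [[0] * toy_n_items for _ in range(toy_n_users)]
for u, i, r in kept:
    interactions[u][i] = r

for row in interactions:
    print(row)
```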

In [None]:
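# Create a model that uses the WARP (Weighted Approximate-Rank Pairwise) loss
model = LightFM(loss='warp')
# Train on the training interactions for 30 epochs, using 2 threads
model.fit(movielens_data['train'], epochs=30, num_threads=2)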

In [None]:
#number of users and movies in training data
n_users, n_items = movielens_data['train'].shape
print("n_users: " + str(n_users))
print("n_items: " + str(n_items))

In [None]:
user_id = 18

# Movies the user already likes (positive interactions in the training set)
known_positives = movielens_data['item_labels'][movielens_data['train'].tocsr()[user_id].indices]

# Score every movie for this user with the trained model
item_ids = np.arange(n_items)
scores = model.predict(user_id, item_ids)

# Sort item indices from highest to lowest predicted score
sorted_ids = np.argsort(-scores)

# Movie titles ranked from most to least recommended
top_items = movielens_data['item_labels'][sorted_ids]
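
The ranking step above relies on `np.argsort(-scores)`: `np.argsort` returns the indices that would sort an array in ascending order, so negating the scores puts the highest score first. The toy labels and scores below are invented, and the `toy_` prefix keeps them from overwriting the notebook's own variables.

```python
import numpy as np

# Hypothetical titles and predicted scores, one per item
toy_labels = np.array(["Movie A", "Movie B", "Movie C", "Movie D"])
toy_scores = np.array([0.2, 1.5, -0.3, 0.9])

# Indices of the items, ordered from highest to lowest score
toy_order = np.argsort(-toy_scores)

# Fancy indexing reorders the titles to match
toy_ranked = toy_labels[toy_order]
print(toy_order)
print(toy_ranked)
```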

In [None]:
print("Top 5 movies to be recommended to user " + str(user_id) + " are:\n")

for movie in range(5):
    print(str(movie+1) + ". " + top_items[movie])
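
Because the model scores every movie, the top of the list can include movies the user has already rated. One possible way to leave those out, sketched here on invented scores, is to set the scores of already-rated items to negative infinity before sorting. The arrays and the `toy_` and `fresh_` names are hypothetical.

```python
import numpy as np

# Hypothetical predicted scores, one per item
toy_scores = np.array([0.2, 1.5, -0.3, 0.9])

# Hypothetical indices of items the user has already rated
already_rated = np.array([1])

# Push already-rated items to the bottom of the ranking
masked_scores = toy_scores.copy()
masked_scores[already_rated] = -np.inf

# Sort from highest to lowest score, then drop the masked tail
n_fresh = len(masked_scores) - len(already_rated)
fresh_order = np.argsort(-masked_scores)[:n_fresh]
print(fresh_order)
```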