Human beings are faced with lots dilemma when it comes to choosing their needs
among many presented choices.

And in business customers may be frustrated when selecting products to buy, Take for
example Netflix MOVIES. Theere are lots of movies in which the customer may choose to
watch. But the user may not know which movies he/she likes beforehand.

The above problem can be solved by building a recommendation engine which srecommends user
movies based on what he/she liked in the past or what other users liked. This saves user
time and increases the sales.

I am going to build a system that will recommend movies to the user based on user 
ratings using the LightFM library

In [3]:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from lightfm import LightFM
from lightfm.datasets import fetch_movielens

In [4]:
# load the data set with a minimum rating of 4
data = fetch_movielens(min_rating=4.0)

In [5]:
# inspect the dataset
data['train'], data['test'], data['item_labels']

(<943x1682 sparse matrix of type '<class 'numpy.int32'>'
 	with 49906 stored elements in COOrdinate format>,
 <943x1682 sparse matrix of type '<class 'numpy.int32'>'
 	with 5469 stored elements in COOrdinate format>,
 array(['Toy Story (1995)', 'GoldenEye (1995)', 'Four Rooms (1995)', ...,
        'Sliding Doors (1998)', 'You So Crazy (1994)',
        'Scream of Stone (Schrei aus Stein) (1991)'], dtype=object))

In [6]:
# Create the model
model = LightFM(loss = 'warp') # warp stands for Weighted Approximate- Rank pairwise - Helps create recommendation for each user by looking at existing user rating pairwise predicting ranking for each

In [7]:
# Train the model
model.fit(data['train'], epochs=30, num_threads=4)

<lightfm.lightfm.LightFM at 0x7f2182a95b00>

In [8]:
# writing a function that generates the recommendation
def movie_recommender(model, data, user_ids):
    
    # number of users and movie in training data
    n_users, n_items = data['train'].shape
    
    # Generating recommendation for each user we input
    for user_id in user_ids:
        
        # movie the user already like
        known_positives = data['item_labels'][data['train'].tocsr()[user_id].indices]
        
        # movie the model predicts they will like
        scores =  model.predict(user_id, np.arange(n_items))
        
        # rank them in order of liked to least
        top_items = data['item_labels'][np.argsort(-scores)]

        
        # print out the results
        print("user %s" % user_id)
       
        print("         known positives")
        for x in known_positives[:3]:
            print("     %s" % x)

        print("         recommended")
        for x in scores[:3]:
            print("     %s" % x)      

In [9]:
# testing the model with three users
movie_recommender(model, data, [2,50,230])

user 2
         known positives
     Return of the Jedi (1983)
     Event Horizon (1997)
     Schindler's List (1993)
         recommended
     -1.0352228879928589
     -2.284733772277832
     -1.176231026649475
user 50
         known positives
     Star Wars (1977)
     Mr. Smith Goes to Washington (1939)
     Die Hard (1988)
         recommended
     1.0686746835708618
     -0.2144317775964737
     -1.09445321559906
user 230
         known positives
     Mr. Holland's Opus (1995)
     Star Wars (1977)
     Evita (1996)
         recommended
     1.520068883895874
     -1.2173616886138916
     -0.5804194808006287
