# Movie Recommender using Turi Create

In [1]:
import turicreate as tc

## Reading the data

This notebook uses the [ml-latest-small](http://grouplens.org/datasets/movielens/latest/) dataset from MovieLens.

In [None]:
ratings = tc.SFrame.read_csv('./movie_lens/ratings.csv')

In [3]:
ratings

userId,movieId,rating,timestamp
1,31,2.5,1260759144
1,1029,3.0,1260759179
1,1061,3.0,1260759182
1,1129,2.0,1260759185
1,1172,4.0,1260759205
1,1263,2.0,1260759151
1,1287,2.0,1260759187
1,1293,2.0,1260759148
1,1339,3.5,1260759125
1,1343,2.0,1260759131


In [None]:
movies = tc.SFrame.read_csv('./movie_lens/movies.csv')

In [5]:
movies

movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy ...
2,Jumanji (1995),Adventure|Children|Fantasy ...
3,Grumpier Old Men (1995),Comedy|Romance
4,Waiting to Exhale (1995),Comedy|Drama|Romance
5,Father of the Bride Part II (1995) ...,Comedy
6,Heat (1995),Action|Crime|Thriller
7,Sabrina (1995),Comedy|Romance
8,Tom and Huck (1995),Adventure|Children
9,Sudden Death (1995),Action
10,GoldenEye (1995),Action|Adventure|Thriller


## Creating a model

There are a variety of machine learning techniques that can be used to build a recommender model. 

Turi Create provides a method, `turicreate.recommender.create`, that will automatically choose an appropriate model for our dataset.

First we create a random split of the data to produce a validation set that can be used to evaluate the model.

In [6]:
training_data, validation_data = tc.recommender.util.random_split_by_user(ratings, 'userId', 'movieId')
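
To make the idea of a per-user split concrete, here is a minimal pure-Python sketch. It is not Turi Create's implementation: the function name `split_by_user` and the parameter `holdout_fraction` are hypothetical, and the toy rows are made up for illustration. The point is only that a share of each user's ratings is held out, so every user still has some ratings left for training.

```python
import random
from collections import defaultdict

def split_by_user(rows, user_key, holdout_fraction=0.2, seed=0):
    """Hold out a random share of each user's rows (illustrative sketch only)."""
    rng = random.Random(seed)

    # Group the rating rows by user.
    by_user = defaultdict(list)
    for row in rows:
        by_user[row[user_key]].append(row)

    train, validation = [], []
    for user_rows in by_user.values():
        shuffled = user_rows[:]
        rng.shuffle(shuffled)
        n_held_out = int(len(shuffled) * holdout_fraction)
        validation.extend(shuffled[:n_held_out])
        train.extend(shuffled[n_held_out:])
    return train, validation

# Toy data (made up): 3 users with 10 ratings each.
rows = [{'userId': u, 'movieId': m, 'rating': 3.0}
        for u in (1, 2, 3) for m in range(10)]
train, validation = split_by_user(rows, 'userId')
print(len(train), len(validation))
```

With this toy input, two of each user's ten ratings go to the validation list and the remaining eight stay in the training list.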

In [None]:
model = tc.recommender.create(training_data, 'userId', 'movieId','rating')

In [13]:
print('RMSE for training : %s' % model.training_rmse)

RMSE for training : 0.8196613268679588
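
For readers unfamiliar with the metric, RMSE (root mean squared error) is the square root of the average squared difference between actual and predicted ratings. The sketch below computes it in plain Python; the helper name `rmse` and the sample ratings are illustrative assumptions, not values from this dataset.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length rating lists."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError('inputs must be non-empty and of equal length')
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up ratings, for illustration only.
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 3.0]
print(rmse(actual, predicted))  # squared errors 0.25, 0, 1, 1 -> sqrt(0.5625)
```

On a 0.5 to 5 star scale, a lower value means the predicted ratings are closer to the ones users actually gave.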


## Validating the model

Now that we have a model, we can evaluate it on the held-out validation set.

In [None]:
predictions = model.evaluate(validation_data)

In [22]:
print('RMSE after validation : %s' % predictions.get('rmse_overall'))

RMSE after validation : 1.0179042968912715


## Making Recommendations

In [38]:
user_lowest_rmse = predictions.get('rmse_by_user').sort('rmse')[0]['userId']

In [40]:
user_highest_rmse = predictions.get('rmse_by_user').sort('rmse', ascending=False)[0]['userId']

In [41]:
user_highest_rmse

364
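
The two cells above pick the users at either end of the per-user RMSE table. As a rough illustration of what such a table contains, the following plain-Python sketch groups squared errors by user and sorts the result. The helper `rmse_by_user` and the sample rows are hypothetical; this is not how Turi Create computes `rmse_by_user` internally.

```python
import math
from collections import defaultdict

def rmse_by_user(rows):
    """rows: iterable of (user_id, actual_rating, predicted_rating) tuples."""
    squared = defaultdict(list)
    for user_id, actual, predicted in rows:
        squared[user_id].append((actual - predicted) ** 2)
    return {user: math.sqrt(sum(errs) / len(errs))
            for user, errs in squared.items()}

# Made-up validation rows, for illustration only.
rows = [
    ('a', 4.0, 4.0), ('a', 3.0, 3.0),   # perfect predictions
    ('b', 5.0, 2.0), ('b', 1.0, 4.0),   # both off by 3 stars
    ('c', 3.0, 2.0),                    # off by 1 star
]
per_user = rmse_by_user(rows)
ranked = sorted(per_user, key=per_user.get)  # ascending RMSE
best_user, worst_user = ranked[0], ranked[-1]
print(best_user, worst_user)
```

Note that a user with very few validation ratings can land at either extreme by chance, so these two users are examples rather than a measure of overall model quality.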

## Recommendations for the user with the lowest (best) RMSE

In [42]:
model.recommend(users=[user_lowest_rmse]).join(movies['movieId','title'], 'movieId').sort(['rank'])

userId,movieId,score,rank,title
302,318,4.739254269969267,1,"Shawshank Redemption, The (1994) ..."
302,858,4.678682018291754,2,"Godfather, The (1972)"
302,2959,4.547243584167761,3,Fight Club (1999)
302,2858,4.506648440134329,4,American Beauty (1999)
302,296,4.498555705171627,5,Pulp Fiction (1994)
302,2571,4.495468873750967,6,"Matrix, The (1999)"
302,50,4.454199578833384,7,"Usual Suspects, The (1995) ..."
302,356,4.434019822847647,8,Forrest Gump (1994)
302,2762,4.3986799607394165,9,"Sixth Sense, The (1999)"
302,4226,4.389431048404974,10,Memento (2000)


### Movies this user has rated in the past

In [43]:
ratings[ratings['userId'] == user_lowest_rmse].join(
    movies['movieId','title'], 'movieId').sort(['rating'], ascending=False)

userId,movieId,rating,timestamp,title
302,509,5.0,843793109,"Piano, The (1993)"
302,593,5.0,843720636,"Silence of the Lambs, The (1991) ..."
302,480,4.0,843720680,Jurassic Park (1993)
302,457,4.0,843720636,"Fugitive, The (1993)"
302,596,4.0,849241587,Pinocchio (1940)
302,595,4.0,843720636,Beauty and the Beast (1991) ...
302,236,4.0,849241461,French Kiss (1995)
302,539,4.0,843720853,Sleepless in Seattle (1993) ...
302,586,4.0,843720824,Home Alone (1990)
302,590,4.0,843720537,Dances with Wolves (1990)


## Recommendations for the user with the highest (worst) RMSE

In [44]:
model.recommend(users=[user_highest_rmse]).join(movies['movieId','title'], 'movieId').sort(['rank'])

userId,movieId,score,rank,title
364,318,4.832003746402068,1,"Shawshank Redemption, The (1994) ..."
364,527,4.603790316951079,2,Schindler's List (1993)
364,1148,4.497950528037352,3,Wallace & Gromit: The Wrong Trousers (1993) ...
364,608,4.488048900139136,4,Fargo (1996)
364,50,4.456878517043394,5,"Usual Suspects, The (1995) ..."
364,2858,4.4242724517701095,6,American Beauty (1999)
364,356,4.412297342192931,7,Forrest Gump (1994)
364,1225,4.406916547786993,8,Amadeus (1984)
364,1136,4.405053440701765,9,Monty Python and the Holy Grail (1975) ...
364,912,4.40325090683679,10,Casablanca (1942)


### Movies this user has rated in the past

In [46]:
ratings[ratings['userId'] == user_highest_rmse].join(
    movies['movieId','title'], 'movieId').sort(['rating'], ascending=False)

userId,movieId,rating,timestamp,title
364,134853,5.0,1444530607,Inside Out (2015)
364,93040,5.0,1444530856,"Civil War, The (1990)"
364,34321,5.0,1444531397,Bad News Bears (2005)
364,2424,5.0,1444530507,You've Got Mail (1998)
364,1265,5.0,1444530183,Groundhog Day (1993)
364,109249,5.0,1444529595,"Journey, The (El viaje) (1992) ..."
364,109374,5.0,1444530662,"Grand Budapest Hotel, The (2014) ..."
364,115617,4.5,1444530384,Big Hero 6 (2014)
364,1732,4.5,1444535154,"Big Lebowski, The (1998)"
364,318,4.5,1444529786,"Shawshank Redemption, The (1994) ..."
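
In the two tables for user 364, movie 318 (Shawshank Redemption) appears both at rank 1 of the recommendations and in the rating history. A small check like the one below makes such overlaps easy to spot. The ID lists are copied from the displayed rows only (the history shown is just the first ten rows). One plausible explanation, which the notebook does not confirm, is that this rating fell into the validation split, so the model never saw it during training.

```python
# Movie IDs copied from the recommendation table for user 364.
recommended_ids = [318, 527, 1148, 608, 50, 2858, 356, 1225, 1136, 912]

# Movie IDs copied from the displayed rows of this user's rating history.
rated_ids_shown = [134853, 93040, 34321, 2424, 1265,
                   109249, 109374, 115617, 1732, 318]

# Recommended movies that the user has already rated (order preserved).
rated = set(rated_ids_shown)
already_rated = [movie_id for movie_id in recommended_ids if movie_id in rated]
print(already_rated)
```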
