### Cold Start Problem

In this notebook, you will get more practice with the evaluation methods from the previous notebook. You will also meet the cold start problem face to face, and we will consider some methods for handling it.

With that, run the cell below to read in the ratings data, and let's get started!

In [2]:
# run this cell to read in the libraries and data needed
import numpy as np
import pandas as pd
import turicreate as tc

ratings_dat = pd.read_csv('../../../data/ratings.dat', sep='::', engine='python',
                          header=None, names=['user_id', 'movie_id', 'rating', 'time'])

ratings_dat2 = ratings_dat.copy(deep=True)
ratings_dat2.columns = ['user_id', 'item_id', 'rating', 'time']
ratings_sframe = tc.SFrame(ratings_dat2[['user_id', 'item_id', 'rating']])

ratings_sframe.head()

user_id,item_id,rating
1,8722346,8
2,1502397,7
3,10526632,8
3,3513548,8
3,4082596,8
3,4658808,8
3,5073642,7
3,7876510,9
3,8075192,9
3,8652728,8


You've had the opportunity to fit a few different recommendation systems. Now is your chance to put what you learned into practice. Use the prompts below (and code from earlier in this course) to split your data and fit a few different models to consider when making a recommendation.

**First,** create train and test datasets from the original data. Set `max_num_users` to `None` so that every user is eligible for the test set, which maximizes its size.

In [3]:
train, test = tc.recommender.util.random_split_by_user(ratings_sframe, 
                                                       user_id = 'user_id',
                                                       item_id = 'item_id',
                                                       max_num_users=None)
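To get a feel for what a per-user split does, here is a minimal pure-Python sketch. It is not Turi Create's implementation: `split_by_user`, its `item_test_proportion` argument, and the toy data are assumptions made for illustration only. The idea is that each user's ratings are divided between train and test, so a user's held-out ratings can be compared against predictions made from the rest.

```python
import random
from collections import defaultdict


def split_by_user(rows, item_test_proportion=0.2, seed=0):
    """Hypothetical helper: hold out a share of each user's ratings.

    rows is a list of (user_id, item_id, rating) tuples.
    """
    rng = random.Random(seed)

    # Group the ratings by user so the split is made user by user.
    by_user = defaultdict(list)
    for row in rows:
        by_user[row[0]].append(row)

    train, test = [], []
    for user_rows in by_user.values():
        for row in user_rows:
            # Each rating lands in the test set with the given probability.
            if rng.random() < item_test_proportion:
                test.append(row)
            else:
                train.append(row)
    return train, test


# Toy data (made up): 5 users, each rating the same 10 items.
toy_ratings = [(u, i, 5 + (u + i) % 5) for u in range(1, 6) for i in range(100, 110)]
toy_train, toy_test = split_by_user(toy_ratings)
```

Every rating ends up in exactly one of the two sets, which is the property you want from any train/test split.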

**Now**, using `target='rating'`, `create` two recommendation systems: a `factorization_recommender` and a `popularity_recommender`. Fit both recommenders using only the `train` data.

In [4]:
model_factorization = tc.factorization_recommender.create(train, target='rating')
model_popular = tc.popularity_recommender.create(train, target='rating')
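As a rough mental model of a popularity recommender fit with a rating target, you can think of it as scoring each item by its mean rating and recommending the highest-scoring items to everyone. The sketch below assumes exactly that behavior; the helper names and toy data are hypothetical, and the library's own scoring and tie-breaking may differ.

```python
from collections import defaultdict


def mean_rating_by_item(rows):
    """Hypothetical helper: average rating per item from (user, item, rating) rows."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _user, item, rating in rows:
        totals[item] += rating
        counts[item] += 1
    return {item: totals[item] / counts[item] for item in totals}


def top_k_items(scores, k=3):
    """Return the k (item, score) pairs with the highest scores."""
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)[:k]


# Toy data (made up): item 'A' has three ratings, 'B' and 'C' have one each.
toy_rows = [(1, 'A', 8), (2, 'A', 6), (2, 'B', 10), (3, 'C', 9), (3, 'A', 7)]
toy_scores = mean_rating_by_item(toy_rows)
toy_top = top_k_items(toy_scores, k=2)
```

Notice that under this scoring an item with a single rating of 10 outranks a widely rated item with a slightly lower average. Nothing here depends on who is asking, so the same list can be served to any user.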

Using each of your two recommendation systems, make **3** recommendations with `recommend` for each of the users in `users_test`, the `SFrame` created below.

In [21]:
users_test = tc.SFrame({'user_id': [1, 2, 8]})

# use recommend to make recommendations for users_test with each of your recommenders
factorization_results = model_factorization.recommend(users_test, k=3)
popular_results = model_popular.recommend(users_test, k=3)

In [23]:
factorization_results

user_id,item_id,score,rank
1,96895,10.558148840015914,1
1,119174,10.493620851582076,2
1,327597,10.346538999622847,3
2,96895,10.212757380252864,1
2,119174,10.204077275043511,2
2,327597,10.036493094211604,3
8,96895,11.74122369433024,1
8,119174,11.685085417276408,2
8,327597,11.534000040536906,3


What happens when you make recommendations for the `new_user` below with the popularity recommender (`model_popular`)? What about with the factorization recommender (`model_factorization`)?

In [18]:
new_user = tc.SFrame({'user_id': [0]})

# use recommend to make recommendations for new_user with each of your recommenders
fact_new_results = model_factorization.recommend(new_user, k=3)
pop_new_results = model_popular.recommend(new_user, k=3)

In [19]:
fact_new_results

user_id,item_id,score,rank
0,96895,10.617140890604045,1
0,119174,10.560908676630046,2
0,327597,10.409880997186686,3


In [20]:
pop_new_results

user_id,item_id,score,rank
0,2910904,10.0,1
0,1638355,10.0,2
0,91129,10.0,3
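One quick way to compare the two sets of recommendations for the new user is to check which items they share. The sketch below uses the item IDs printed in the two outputs above; `top_k_overlap` is a hypothetical helper, not part of any library.

```python
def top_k_overlap(items_a, items_b):
    """Hypothetical helper: items that appear in both recommendation lists."""
    return set(items_a) & set(items_b)


# Item IDs copied from the outputs above for user 0.
fact_items = [96895, 119174, 327597]
pop_items = [2910904, 1638355, 91129]

shared = top_k_overlap(fact_items, pop_items)
```

With the actual results, you could likely pass `fact_new_results['item_id']` and `pop_new_results['item_id']` directly, assuming the columns can be iterated like ordinary Python sequences.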


In [None]:
import solution_part2 as sp

a = "each of the recommenders has nan recommendations for the new user"
b = "both recommenders recommend the same, most popular recommendations"
c = "both recommenders give recommendations, but they don't match one another"
d = "None of the above"

your_answer = c

sp.answer_one(your_answer)
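One simple way to handle a cold start is to decide explicitly which recommender answers for which user: personalized recommendations for users seen during training, and popular items for everyone else. The sketch below is only an illustration of that idea; `recommend_with_fallback`, `toy_personalized`, and the toy inputs are hypothetical and stand in for the fitted models.

```python
def recommend_with_fallback(user_id, known_users, personalized, popular, k=3):
    """Hypothetical helper: personalized picks for known users, popular items otherwise.

    known_users  -- set of user IDs seen during training
    personalized -- function mapping a user ID to a ranked list of items
    popular      -- ranked list of globally popular items
    """
    if user_id in known_users:
        return personalized(user_id)[:k], 'personalized'
    # Cold start: nothing is known about this user, so fall back to popularity.
    return popular[:k], 'popular'


# Toy stand-ins (made up) for the trained models.
known = {1, 2, 8}
popular_items = ['P1', 'P2', 'P3', 'P4']


def toy_personalized(user_id):
    return [f'item_{user_id}_{n}' for n in range(5)]


known_recs = recommend_with_fallback(1, known, toy_personalized, popular_items)
new_recs = recommend_with_fallback(0, known, toy_personalized, popular_items)
```

Returning the source label alongside the items makes it easy to track how often the fallback is used when you evaluate the combined system.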

In [None]:
# run this cell to get some final thoughts before the next section
sp.final_thoughts()