# CS 237: Homework 4 Programming Exercises

In [1]:
# add your imports here

# here are some examples of imports
import matplotlib.pyplot as plt   # for plotting
import numpy as np                # for simulating random choices
from collections import Counter   # for aggregating the results

In the following exercises, we will run simulations of a recommender system that recommends movies to users, as we discussed in class on 9/23, and in this Piazza post: https://piazza.com/class/ksg5aj427qney?cid=126

## Step 1: Baseline Movie Ratings

We will start by generating baseline star ratings for the movies.  We assume there are $k$ movies in the system, and their ratings are distributed uniformly between 1.0 stars (terrible) and 5.0 stars (awesome).

In [2]:
k = 10000
movies = np.random.uniform(1.0, 5.0, k)
print(movies)

[1.98975949 2.44588454 1.61562674 ... 1.74720425 4.30222897 3.88257302]


## Exercise 1: Simulating User Populations

In this exercise, we ask you to simulate ratings from two different user populations.

User population 1 consists of random movie-watchers.  They pick movies completely at random, without regard to the underlying rating. After watching movie $i$ with rating $r_i$, they then generate a user rating

$u = r_i+ \delta$

where $\delta$ is uniform on [-1.0, 1.0]

and constrain that rating to be in the valid range [1.0, 5.0]

$$u = \max (1.0, u) \textit{   # round up to 1.0 if needed} $$
$$u = \min (5.0, u) \textit{   # round down to 5.0 if needed} $$

A rating is recorded as the tuple $(i, u)$.
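
The rating rule above can be sketched for a single viewing as follows. This is only an illustration: the helper name `user_rating`, its `half_width` parameter, and the example movie index 42 are assumptions, not part of the assignment. `np.clip` applies the max and the min in one call.

```python
import numpy as np

def user_rating(r_i, half_width=1.0, rng=None):
    """Return r_i + delta clipped to [1.0, 5.0], with delta ~ Uniform[-half_width, half_width]."""
    rng = np.random.default_rng() if rng is None else rng
    delta = rng.uniform(-half_width, half_width)
    return float(np.clip(r_i + delta, 1.0, 5.0))

rng = np.random.default_rng(0)
example = (42, user_rating(4.7, rng=rng))  # (movie index i, user rating u)
print(example)
```

Passing `half_width=0.5` would give the narrower noise used by the second population.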

Produce a list of 50,000 ratings produced by user population 1 and report the average rating the users generated as average1. 

====================

User population 2 consists of more discriminating users.  They choose movies proportionally to the underlying rating using the following selection probabilities:

$$\Pr[\text{Select movie } i]  = \frac{r_i} {\sum_j {r_j}}$$ 

They then generate a user rating using the same method as user population 1, except their $\delta$ is uniform on [-0.5, 0.5]. 

Produce a list of 50,000 ratings produced by user population 2 and report the average rating the users generated as average2. 

[Hint:  Implementation time-saver:  consider using the numpy function random.choice() to implement selections for User population 2]
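
As a rough sketch of what the hint suggests, the snippet below draws movie indices with probability proportional to rating and checks the empirical frequencies against the target probabilities. The three-movie array `toy_ratings` is a made-up example, and the newer `np.random.default_rng` generator is used here only so the result is reproducible; `np.random.choice` accepts the same `p` argument.

```python
import numpy as np

rng = np.random.default_rng(1)
toy_ratings = np.array([1.0, 2.0, 5.0])      # hypothetical baseline ratings
probs = toy_ratings / toy_ratings.sum()      # Pr[select i] = r_i / sum_j r_j

# Draw movie indices (not ratings) so each result can be recorded as (i, u).
picks = rng.choice(len(toy_ratings), size=10_000, p=probs)
freq = np.bincount(picks, minlength=len(toy_ratings)) / picks.size

print(probs)   # target selection probabilities
print(freq)    # empirical frequencies, which should be close to probs
```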


In [3]:
#  Your code here to compute average1 and average2
average1 = 0
average2 = 0
ratings = 50000
population1 = []
population2 = []

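## Population 1: movies chosen uniformly at random
for _ in range(ratings):
    movie_id = np.random.randint(len(movies))     # movie index i, uniform over all movies
    delta = np.random.uniform(-1.0, 1.0)          # rating noise
    user_rating = movies[movie_id] + delta
    user_rating = max(1.0, user_rating)           # clamp to the valid range [1.0, 5.0]
    user_rating = min(5.0, user_rating)
    population1.append((movie_id, user_rating))   # record (i, u), not (r_i, u)
    average1 += user_rating / ratings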
        
# print(population1)

## Population 2: movies chosen with probability proportional to their baseline rating
movie_prob = movies / movies.sum()   # Pr[select i] = r_i / sum_j r_j

for _ in range(ratings):
    movie_id = np.random.choice(len(movies), p=movie_prob)   # movie index i
    delta = np.random.uniform(-0.5, 0.5)          # narrower rating noise
    user_rating = movies[movie_id] + delta
    user_rating = max(1.0, user_rating)           # clamp to the valid range [1.0, 5.0]
    user_rating = min(5.0, user_rating)
    population2.append((movie_id, user_rating))   # record (i, u), not (r_i, u)
    average2 += user_rating / ratings

# print(population2)

In [4]:
#  Please execute this next line by un-commenting it:
print(average1, average2)

2.9944452988742007 3.4336412408304513


## Exercise 2: Basic Predictions

In this exercise, we ask you to now assess prediction error from different models.  When a user watches a movie $i$ (with rating $r_i$), each model will make a prediction $p$ and the user will provide a rating $u$.

Model A only knows about users:  it makes a conservative prediction of $p = 3.0$ for any movie watched by user population 1 and $p = 4.0$ for any movie watched by user population 2.  

Model B only knows about movies:  it predicts a value of $p = r_i$ whenever movie $i$ is watched (by any user). 

Simulate the two models on 50,000 trials of the following form.

  - choose a user type uniformly at random between Population 1 and Population 2 (50/50)
  - choose a movie at random following that user population's selection method
  - compute a predicted rating $p_A$ made by model A
  - compute a predicted rating $p_B$ made by model B
  - compute a rating $u$ the user generates for that movie
  
Store the outcomes of all the trials in a list, where the outcome of a trial is a 4-tuple:  (movieID, $u, p_A, p_B$).
  

Finally, assess the models on how much error they made over all predictions.  For this assignment we'll consider the mean squared error (MSE).
The model error for one trial (for model A) is defined to be $\epsilon = p_A - u$ and the squared error for one trial is $\epsilon^2$.  The mean squared error over all trials (for model A) is the average of all 50,000 squared errors $\epsilon^2$. 

Compute and report the MSE for Model A and similarly, for Model B.
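
To make the definition concrete, here is a small worked example on four made-up trials. The arrays `u` and `p_A` are hypothetical values chosen so the arithmetic is easy to follow by hand.

```python
import numpy as np

u   = np.array([3.0, 4.5, 2.0, 5.0])   # hypothetical user ratings
p_A = np.array([3.0, 4.0, 3.0, 4.0])   # hypothetical Model A predictions

errors = p_A - u              # epsilon per trial: [0.0, -0.5, 1.0, -1.0]
mse_A = np.mean(errors ** 2)  # (0 + 0.25 + 1 + 1) / 4
print(mse_A)                  # 0.5625
```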




In [10]:
#  Your code here to run the 50,000 trials and compute MSE_A and MSE_B

trials = 50000
question2 = []       # one (movieID, u, p_A, p_B) tuple per trial
sq_errors_A = []     # squared error of Model A per trial
sq_errors_B = []     # squared error of Model B per trial

for _ in range(trials):
    ## selecting user population (50/50)
    user = np.random.choice((0, 1))

    if user == 0:  ## POPULATION 1: uniform movie choice, delta in [-1.0, 1.0]
        prediction_A = 3.0
        movie_id = np.random.randint(len(movies))
        delta = np.random.uniform(-1.0, 1.0)
    else:          ## POPULATION 2: rating-weighted movie choice, delta in [-0.5, 0.5]
        prediction_A = 4.0
        movie_id = np.random.choice(len(movies), p=movie_prob)
        delta = np.random.uniform(-0.5, 0.5)

    ## MODEL B predicts the movie's baseline rating
    prediction_B = movies[movie_id]

    ## USER RATING, bounded to [1.0, 5.0]
    user_rating = min(5.0, max(1.0, movies[movie_id] + delta))

    question2.append((movie_id, user_rating, prediction_A, prediction_B))
    sq_errors_A.append((prediction_A - user_rating) ** 2)
    sq_errors_B.append((prediction_B - user_rating) ** 2)

MSE_A = sum(sq_errors_A) / trials
MSE_B = sum(sq_errors_B) / trials
# print(question2)

In [11]:
#  Please execute this next line by un-commenting it:
print(MSE_A, MSE_B)

1.4826318245626846 0.18601917532418996


## Exercise 3: Can you do better?

Based on your experiments in parts 1 and 2, can you make simple changes to improve upon the existing models?

Improve model A just by changing the hard-wired prediction values $p_1 = 3.0$ and/or $p_2 = 4.0$.  Print the new settings for $p_1$ and $p_2$ and print the MSE for this new Model C.
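
One way to think about choosing the constants: for a fixed prediction $c$, the MSE $\frac{1}{n}\sum (c - u)^2$ is smallest when $c$ is the mean of the ratings, which is why the averages from Exercise 1 are natural candidates. The sketch below checks this numerically on simulated ratings; the sample size, seed, and helper name `mse_const` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
# Simulated population-1 style ratings: uniform baseline plus noise, clipped to [1, 5].
u = np.clip(rng.uniform(1.0, 5.0, n) + rng.uniform(-1.0, 1.0, n), 1.0, 5.0)

def mse_const(c, ratings):
    """MSE of always predicting the constant c."""
    return float(np.mean((c - ratings) ** 2))

candidates = np.linspace(1.0, 5.0, 401)   # grid of constants with step 0.01
best = candidates[np.argmin([mse_const(c, u) for c in candidates])]
print(best, u.mean())   # the best grid value sits next to the sample mean
```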

It's harder to improve upon model B.  Tell us why in a couple of sentences.
Alternatively, impress us by showing that it is possible to do a little better than just predicting $r_i$.




In [7]:
#  Your code here to run the 50,000 trials and compute MSE_C

trials = 50000
sq_errors_C = []     # squared error of Model C per trial

# Model C: Model A's constants moved close to the per-population average ratings from Exercise 1
p_1 = 3.0120812054168455
p_2 = 3.449748787093291

for _ in range(trials):
    ## selecting user population (50/50)
    user = np.random.choice((0, 1))

    if user == 0:  ## POPULATION 1
        prediction_C = p_1
        movie_id = np.random.randint(len(movies))
        delta = np.random.uniform(-1.0, 1.0)
    else:          ## POPULATION 2
        prediction_C = p_2
        movie_id = np.random.choice(len(movies), p=movie_prob)
        delta = np.random.uniform(-0.5, 0.5)

    ## USER RATING, bounded to [1.0, 5.0]
    user_rating = min(5.0, max(1.0, movies[movie_id] + delta))
    sq_errors_C.append((prediction_C - user_rating) ** 2)

MSE_C = sum(sq_errors_C) / trials

#  Please execute this next line by un-commenting it:
print(p_1, p_2, MSE_C)

3.0120812054168455 3.449748787093291 1.325960745537057


In [8]:
#  Your 1-2 sentence answer below for why it's hard to improve much on Model B, or, code for how you did it.

#  Model B is hard to improve because its prediction r_i is already almost the expected user rating:
#  u = r_i + delta, where delta is zero-mean noise that is independent of the movie. No predictor can
#  anticipate delta, so an error of roughly Var(delta) (1/3 for population 1, 1/12 for population 2)
#  remains no matter what. The only systematic gap is near 1.0 and 5.0, where clamping pulls the
#  expected rating slightly toward the middle of the range.
#
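
For the "impress us" option, one possible direction is to predict the expected clamped rating instead of $r_i$ itself. The sketch below is restricted to population 1 and assumes the noise half-width `w = 1.0` is known to the predictor, which goes beyond what Model B is told; the function name `expected_clipped_rating`, the seed, and the sample size are likewise illustrative.

```python
import numpy as np

def expected_clipped_rating(r, w):
    """E[clip(r + delta, 1, 5)] for delta ~ Uniform[-w, w], assuming 1 <= r <= 5 and w > 0."""
    over = np.maximum(r + w - 5.0, 0.0)      # how far r + w overshoots 5.0
    under = np.maximum(1.0 - (r - w), 0.0)   # how far r - w undershoots 1.0
    return r - over ** 2 / (4 * w) + under ** 2 / (4 * w)

rng = np.random.default_rng(3)
n, w = 200_000, 1.0                          # w = 1.0 matches population 1's noise
r = rng.uniform(1.0, 5.0, n)                 # baseline ratings of the movies watched
u = np.clip(r + rng.uniform(-w, w, n), 1.0, 5.0)

mse_plain = np.mean((r - u) ** 2)                                  # predict r_i
mse_adjusted = np.mean((expected_clipped_rating(r, w) - u) ** 2)   # predict E[u | r_i]
print(mse_plain, mse_adjusted)
```

The adjustment only changes predictions within `w` of either end of the scale, so any gain is small, consistent with Model B being hard to beat.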
