#### Singular Value Decomposition
In this notebook, you will get some hands-on practice with SVD.

In [17]:
import os
os.chdir("../")
%pwd

'c:\\Users\\abhis\\Desktop\\MLProjects\\Movie Recommender'

In [18]:
import numpy as np
import pandas as pd
from pandas.api.types import CategoricalDtype
from scipy.sparse import csr_matrix
from surprise import SVD, Reader, Dataset 
from surprise.model_selection import cross_validate

In [19]:
ratings_df = pd.read_csv('artifacts/data_preparation/final_data/ratings.csv')
movies_df = pd.read_csv('artifacts/data_preparation/final_data/movies.csv')

In [20]:
ratings_df.head()

Unnamed: 0,userId,movieId,rating
0,1,307,3.5
1,1,481,3.5
2,1,1091,1.5
3,1,1257,4.5
4,1,1449,4.5


In [21]:
more_than_200 = ratings_df['userId'].value_counts() > 200
# getting the index of these users
ind = more_than_200[more_than_200].index
ratings_df = ratings_df[ratings_df['userId'].isin(ind)]

In [22]:
# merge movies with ratings

rating_with_movies = ratings_df.merge(movies_df, on = "movieId")

In [23]:
# count how many ratings each movie received

num_rating = rating_with_movies.groupby('title')['rating'].count().reset_index()
num_rating.rename(columns={"rating":"num_of_rating"},inplace=True)

In [24]:
final_rating = rating_with_movies.merge(num_rating, on = 'title')
# keep only movies with at least 50 ratings

final_rating =final_rating[final_rating['num_of_rating'] >= 50]

In [25]:
final_rating[['userId','title']].nunique()

userId    35004
title     12509
dtype: int64

In [26]:
final_rating.drop_duplicates(['userId','title'], inplace=True)

In [27]:
final_rating[['userId','title']].nunique()

userId    35004
title     12509
dtype: int64

In [28]:
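# Build a sparse user x movie ratings matrix without materializing a dense pivot table.
rcLabel, vLabel = ('userId', 'movieId'), 'rating'
# Ordered categorical dtypes map each userId / movieId to a consecutive integer code.
rcCat = [CategoricalDtype(sorted(ratings_df[col].unique()), ordered=True) for col in rcLabel]
# Row (user) and column (movie) codes for every rating.
rc = [ratings_df[column].astype(aType).cat.codes for column, aType in zip(rcLabel, rcCat)]
mat = csr_matrix((ratings_df[vLabel], rc), shape=tuple(cat.categories.size for cat in rcCat))
# Wrap the sparse matrix in a DataFrame labeled with the original ids; unrated entries are stored as 0.
movie_pivot = ( pd.DataFrame.sparse.from_spmatrix(
    mat, index=rcCat[0].categories, columns=rcCat[1].categories) )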

In [13]:
movie_pivot

Unnamed: 0,1,2,3,4,5,6,7,8,9,10,...,193861,193863,193864,193866,193868,193876,193878,193880,193882,193886
4,4.0,4.0,0.0,0.0,2.0,4.5,0.0,0.0,0.0,4.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
19,0.0,0.0,4.0,0.0,0.0,4.0,0.0,0.0,4.0,4.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
42,4.0,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
43,5.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
51,4.0,3.0,4.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
283184,4.0,2.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
283185,0.0,0.0,0.0,4.0,0.0,0.0,3.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
283195,5.0,4.0,4.5,0.0,0.0,4.5,0.0,0.0,0.0,3.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
283204,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [14]:
ratings_df['userId'].nunique()

35004

In [15]:
# keep only the users who rated all three movies (0 marks a missing rating in movie_pivot)
user_movie_subset = movie_pivot[[13,  155, 855]][np.all(movie_pivot[[13,  155, 855]] != 0, axis = 1)]

In [16]:
# average rating given by each user
print(user_movie_subset.mean(axis=1))

# average rating received by each movie
print(user_movie_subset.mean(axis=0))

# list of movie names
for movie_id in [13,  155, 855]:
    print(movies_df[movies_df['movieId'] == movie_id]['title'])
    
# users by movies
user_movie_subset.shape

52772     3.666667
63810     2.333333
101928    1.000000
126870    2.666667
128244    3.000000
183233    2.333333
201555    3.000000
216371    3.333333
dtype: float64
13     2.375
155    3.000
855    2.625
dtype: float64
12    Balto (1995)
Name: title, dtype: object
150    Beyond Rangoon (1995)
Name: title, dtype: object
824    Every Other Weekend (Un week-end sur deux) (1990)
Name: title, dtype: object


(8, 3)

Now that you have a little more context about the matrix we will be performing Singular Value Decomposition on, we're going to do just that. To get started, let's remind ourselves about the dimensions of each of the matrices we are going to get back. Essentially, we are going to split the **user_movie_subset** matrix into three matrices:
$$
U \Sigma V^T
$$

4. Below you can find the code used to perform SVD in numpy. You can see more about this functionality in the [documentation](https://numpy.org/doc/stable/reference/generated/numpy.linalg.svd.html). What do you notice about the shapes of your matrices? If you try to take the dot product of the three objects you get back, can you directly do this to get back the user-movie matrix?

In [17]:
u, s, vt = np.linalg.svd(user_movie_subset)
s.shape, u.shape, vt.shape

((3,), (8, 8), (3, 3))

Looking at the dimensions of the three returned objects, we can see the following:

 1. The u matrix is a square matrix with the number of rows and columns equaling the number of users. 

 2. The v transpose matrix is also a square matrix with the number of rows and columns equaling the number of items.

 3. The sigma matrix is actually returned as just an array with 3 values.  

 In order to set up the matrices in a way that they can be multiplied together, we have a few steps to perform: 

 1. Turn sigma into a square matrix with the number of latent features we would like to keep. 

 2. Change the columns of u and the rows of v transpose to match this number of dimensions. 

 If we would like to exactly re-create the user-movie matrix, we could choose to keep all of the latent features.
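As an aside, `np.linalg.svd` also accepts `full_matrices=False`, which returns the reduced factorization so that `u` already has one column per singular value. The sketch below is only an illustration: it assumes a dense NumPy copy of the 8-by-3 ratings printed later in this notebook, and the names `ratings`, `u_r`, `s_r`, and `vt_r` are introduced here for the example.

```python
import numpy as np

# Dense copy of the 8 users x 3 movies subset (values taken from the notebook output).
ratings = np.array([
    [4, 4, 3],
    [1, 3, 3],
    [1, 1, 1],
    [3, 2, 3],
    [3, 3, 3],
    [2, 3, 2],
    [3, 3, 3],
    [2, 5, 3],
], dtype=float)

# Reduced SVD: u_r keeps only as many columns as there are singular values.
u_r, s_r, vt_r = np.linalg.svd(ratings, full_matrices=False)
print(u_r.shape, s_r.shape, vt_r.shape)

# Scaling the columns of u_r by s_r is equivalent to multiplying by diag(s_r).
rebuilt = (u_r * s_r) @ vt_r
print(np.allclose(rebuilt, ratings))
```

With the reduced form, no manual slicing of `u` is needed before multiplying the three pieces back together.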

 5. Use the thoughts from the above question to create u, s, and vt with three latent features. When you have all three matrices created correctly, the dot product of the three matrices should reproduce the original user-movie matrix. The matrices should have the following dimensions:

$$
U_{n \times k} \quad \Sigma_{k \times k} \quad V^T_{k \times m}
$$

where:

- n is the number of users
- k is the number of latent features to keep (3 for this case)
- m is the number of movies

In [18]:
# Change the dimensions of u, s, and vt as necessary to use three latent features
# update the shape of u and store in u_new
u_new = u[:, :len(s)]

# update the shape of s and store in s_new
s_new = np.zeros((len(s), len(s)))
s_new[:len(s), :len(s)] = np.diag(s) 

# Because we are using 3 latent features and there are only 3 movies, 
# vt and vt_new are the same
vt_new = vt
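A quick way to confirm that the reshaped matrices multiply back to the original ratings is `np.allclose`. This is a minimal sketch rather than the notebook's own test; it assumes a dense copy of the same 8-by-3 subset, here called `ratings`.

```python
import numpy as np

# Dense copy of the 8 users x 3 movies subset (values taken from the notebook output).
ratings = np.array([
    [4, 4, 3],
    [1, 3, 3],
    [1, 1, 1],
    [3, 2, 3],
    [3, 3, 3],
    [2, 3, 2],
    [3, 3, 3],
    [2, 5, 3],
], dtype=float)

u, s, vt = np.linalg.svd(ratings)

# Keep all three latent features, mirroring the cell above.
k = len(s)
u_new = u[:, :k]      # (8, 3)
s_new = np.diag(s)    # (3, 3)
vt_new = vt           # (3, 3)

reconstructed = u_new @ s_new @ vt_new
print(np.allclose(reconstructed, ratings))
```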

It turns out that the sigma matrix can tell us how much of the original variability in the user-movie matrix is captured by each latent feature. The total amount of variability to be explained is the sum of the squared diagonal elements. The amount of variability explained by the first component is the square of the first value on the diagonal, and the amount explained by the second component is the square of the second value on the diagonal.

6. Using the above information, can you determine the amount of variability in the original user-movie matrix that can be explained by only using the first two components? Use the cell below for your work, and then test your answer against the solution with the following cell.

In [19]:
total_var = np.sum(s**2)
var_exp_comp1_and_comp2 = s[0]**2 + s[1]**2
perc_exp = round(var_exp_comp1_and_comp2/total_var*100, 2)
print("The total variance in the original matrix is {}.".format(total_var))
print("The percentage of variability captured by the first two components is {}%.".format(perc_exp))

The total variance in the original matrix is 193.99999999999997.
The percentage of variability captured by the first two components is 99.1%.
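The same calculation can be extended to every possible k at once with a cumulative sum. The sketch below assumes a dense copy of the 8-by-3 subset (named `ratings` here for illustration) and asks NumPy for the singular values only.

```python
import numpy as np

# Dense copy of the 8 users x 3 movies subset (values taken from the notebook output).
ratings = np.array([
    [4, 4, 3],
    [1, 3, 3],
    [1, 1, 1],
    [3, 2, 3],
    [3, 3, 3],
    [2, 3, 2],
    [3, 3, 3],
    [2, 5, 3],
], dtype=float)

# compute_uv=False returns just the singular values, largest first.
s = np.linalg.svd(ratings, compute_uv=False)

# Share of the total squared "variability" carried by each component.
explained = s**2 / np.sum(s**2)

# Entry k-1 is the share captured when keeping the first k components.
cumulative = np.cumsum(explained)
print(cumulative)
```

Reading off `cumulative` is one way to pick the smallest k that captures a target share of the variability.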


7. As in the previous question, change the shapes of your u, sigma, and v transpose matrices. However, this time consider only using the first 2 components to reproduce the user-movie matrix instead of all 3. After you have your matrices set up, check your matrices against the solution by running the tests. The matrices should have the following dimensions:

$$
U_{n \times k} \quad \Sigma_{k \times k} \quad V^T_{k \times m}
$$

where:

- n is the number of users
- k is the number of latent features to keep (2 for this case)
- m is the number of movies

In [20]:
# Change the dimensions of u, s, and vt as necessary to use two latent features
# update the shape of u and store in u_2
k = 2
u_2 = u[:, :k]

# update the shape of s and store in s_2
s_2 = np.zeros((k, k))
s_2[:k, :k] = np.diag(s[:k]) 

# Because we are using 2 latent features, we need to update vt this time
vt_2 = vt[:k, :]

The question is now that we don't have all of the latent features, how well can we really re-create the original user-movie matrix?

8. When using all 3 latent features, we saw that we could exactly reproduce the user-movie matrix. Now that we only have 2 latent features, we might measure how well we are able to reproduce the original matrix by looking at the sum of squared errors from each rating produced by taking the dot product as compared to the actual rating. Find the sum of squared error based on only the two latent features, and use the following cell to test against the solution.

In [23]:
s_2.shape, u_2.shape, vt_2.shape

((2, 2), (8, 2), (2, 3))

In [21]:
# Compute the dot product
pred_ratings = np.dot(np.dot(u_2, s_2), vt_2)

# Compute the squared error for each predicted vs. actual rating
sum_square_errs = np.sum(np.sum((user_movie_subset - pred_ratings)**2))
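For a truncated SVD, the sum of squared errors has a closed form: it equals the sum of the squared singular values that were dropped. The sketch below checks this numerically on a dense copy of the 8-by-3 subset; the names `ratings`, `approx`, `sse`, and `dropped` are introduced only for this example.

```python
import numpy as np

# Dense copy of the 8 users x 3 movies subset (values taken from the notebook output).
ratings = np.array([
    [4, 4, 3],
    [1, 3, 3],
    [1, 1, 1],
    [3, 2, 3],
    [3, 3, 3],
    [2, 3, 2],
    [3, 3, 3],
    [2, 5, 3],
], dtype=float)

u, s, vt = np.linalg.svd(ratings, full_matrices=False)

# Rank-2 approximation: keep the two largest singular values.
k = 2
approx = u[:, :k] @ np.diag(s[:k]) @ vt[:k, :]

# Sum of squared errors between actual and approximated ratings.
sse = np.sum((ratings - approx) ** 2)

# Squared singular values that were discarded.
dropped = np.sum(s[k:] ** 2)
print(sse, dropped)
```

This ties the reconstruction error back to the explained-variability calculation: whatever share of the squared singular values is not kept shows up as squared error.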

In [22]:
pred_ratings, user_movie_subset

(array([[3.74988951, 3.7278407 , 3.54998859],
        [1.33228046, 3.36157307, 2.26932108],
        [1.01578809, 1.01717991, 0.96528226],
        [3.20238095, 2.2202221 , 2.55496784],
        [3.04736427, 3.05153973, 2.89584679],
        [1.87655951, 2.86567746, 2.27144347],
        [3.04736427, 3.05153973, 2.89584679],
        [1.87978903, 4.86919167, 3.26434183]]),
         13   155  855
 52772   4.0  4.0  3.0
 63810   1.0  3.0  3.0
 101928  1.0  1.0  1.0
 126870  3.0  2.0  3.0
 128244  3.0  3.0  3.0
 183233  2.0  3.0  2.0
 201555  3.0  3.0  3.0
 216371  2.0  5.0  3.0)

At this point, you may be thinking: why would we want to choose a k that doesn't just give us back the full user-movie matrix with all the original ratings? This is a good question. One reason is computational, since a smaller k reduces the dimensionality of the data you are keeping, but this isn't the main reason to reduce k below the minimum of the number of movies and the number of users.

Let's take a step back for a second. In this example we just went through, your matrix was very clean. That is, for every user-movie combination, we had a rating. There were no missing values. But what we know from the previous lesson is that the user-movie matrix is full of missing values.

Therefore, if we keep all k latent features, it is likely that the latent features with smaller values in the sigma matrix will explain variability that is due to noise rather than signal. Furthermore, if we use these "noisy" latent features to help reconstruct the original user-movie matrix, we will potentially (and likely) get worse predicted ratings than if we only keep the latent features associated with signal.

9. Let's try introducing just a little of the real world into this example by performing SVD on a matrix with missing values. Below, the first rating in our matrix is replaced with a missing value, as if that user had not rated the first movie. Try performing SVD on the new matrix. What happens?

In [40]:
# This line adds one nan value as the very first entry in our matrix
user_movie_subset.iloc[0, 0] = np.nan

# Try svd with this new matrix
u, s, vt = np.linalg.svd(user_movie_subset)

TypeError: SparseArray does not support item assignment via setitem
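The traceback above comes from the sparse-backed DataFrame rejecting the assignment, so the SVD call is never reached. A minimal sketch of the intended experiment, assuming a dense NumPy copy of the same ratings (named `ratings` here), is shown below. Depending on the NumPy and LAPACK versions, the call is expected either to raise `LinAlgError` or to return non-finite values, so the sketch handles both.

```python
import numpy as np

# Dense copy of the 8 users x 3 movies subset (values taken from the notebook output).
ratings = np.array([
    [4, 4, 3],
    [1, 3, 3],
    [1, 1, 1],
    [3, 2, 3],
    [3, 3, 3],
    [2, 3, 2],
    [3, 3, 3],
    [2, 5, 3],
], dtype=float)

# A dense float array accepts a missing value without complaint.
ratings[0, 0] = np.nan

try:
    u, s, vt = np.linalg.svd(ratings)
    # If no exception is raised, check whether the result is usable at all.
    svd_usable = bool(np.all(np.isfinite(s)))
except np.linalg.LinAlgError as err:
    svd_usable = False
    print("SVD failed:", err)

print("Usable decomposition:", svd_usable)
```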

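The traceback above is raised by the assignment itself: user_movie_subset is backed by pandas sparse arrays, which do not support setting individual items. The underlying point still stands, though. Standard SVD is only defined for a fully populated matrix, so np.linalg.svd cannot factor a matrix that contains nan values, and our real dataset has missing values everywhere. This is where FunkSVD comes in to help.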

### Implementing FunkSVD

1. You will use the user_movie_subset matrix to show that your FunkSVD algorithm will converge. In the cell below, use the comments and docstring to assist you as you write your own function to perform FunkSVD. You may also want to try to complete the function on your own without the assistance of the comments. Feel free to remove from and add to the function in any way that gets you a working solution!

Notice: There isn't a sigma matrix in this version of matrix factorization.

In [41]:
ratings_mat = np.matrix(user_movie_subset)

In [44]:
def FunkSVD(ratings_mat, latent_features=4, learning_rate=0.0001, iters=100):
    '''
    This function performs matrix factorization using a basic form of FunkSVD with no regularization
    
    INPUT:
    ratings_mat - (numpy array) a matrix with users as rows, movies as columns, and ratings as values
    latent_features - (int) the number of latent features used
    learning_rate - (float) the learning rate 
    iters - (int) the number of iterations
    
    OUTPUT:
    user_mat - (numpy array) a user by latent feature matrix
    movie_mat - (numpy array) a latent feature by movie matrix
    '''
    
    # Set up useful values to be used through the rest of the function
    n_users = ratings_mat.shape[0]
    n_movies = ratings_mat.shape[1]
    num_ratings = np.count_nonzero(~np.isnan(ratings_mat))
    
    # initialize the user and movie matrices with random values
    user_mat = np.random.rand(n_users, latent_features)
    movie_mat = np.random.rand(latent_features, n_movies)
    
    # initialize sse at 0 for first iteration
    sse_accum = 0
    
    # keep track of iteration and MSE
    print("Optimizaiton Statistics")
    print("Iterations | Mean Squared Error ")
    
    # for each iteration
    for iteration in range(iters):

        # update our sse
        old_sse = sse_accum
        sse_accum = 0
        
        # For each user-movie pair
        for i in range(n_users):
            for j in range(n_movies):
                
                # if the rating exists
                if ratings_mat[i, j] > 0:
                    
                    # compute the error as the actual minus the dot product of the user and movie latent features
                    diff = ratings_mat[i, j] - np.dot(user_mat[i, :], movie_mat[:, j])
                    
                    # Keep track of the sum of squared errors for the matrix
                    sse_accum += diff**2
                    
                    # update the values in each matrix by stepping against the gradient of the squared error
                    for k in range(latent_features):
                        user_mat[i, k] += learning_rate * (2*diff*movie_mat[k, j])
                        movie_mat[k, j] += learning_rate * (2*diff*user_mat[i, k])

        # print results
        print("%d \t\t %f" % (iteration+1, sse_accum / num_ratings))
        
    return user_mat, movie_mat 
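The update inside the innermost loop follows from the squared error for one rating. With error e = r - u·v, the gradient of e² with respect to u is -2·e·v, and with respect to v it is -2·e·u, so stepping against the gradient adds `learning_rate * 2 * e * v` to u and `learning_rate * 2 * e * u` to v. The sketch below isolates one such step in a hypothetical helper, `sgd_step`. Unlike the loop above, it updates both vectors from their old values at the same time, which is a slightly different (simultaneous) variant.

```python
import numpy as np

def sgd_step(user_vec, movie_vec, rating, learning_rate):
    """One gradient-descent update for a single observed rating (hypothetical helper)."""
    # Error between the actual rating and the current prediction.
    diff = rating - np.dot(user_vec, movie_vec)
    # Step both vectors against the gradient of diff**2, using the old values of each.
    new_user_vec = user_vec + learning_rate * 2 * diff * movie_vec
    new_movie_vec = movie_vec + learning_rate * 2 * diff * user_vec
    return new_user_vec, new_movie_vec, diff

# Three latent features, all starting at 0.5, and one observed rating of 4.
user_vec = np.full(3, 0.5)
movie_vec = np.full(3, 0.5)

new_user_vec, new_movie_vec, error_before = sgd_step(
    user_vec, movie_vec, rating=4.0, learning_rate=0.01
)
error_after = 4.0 - np.dot(new_user_vec, new_movie_vec)
print(error_before, error_after)
```

A single small step should shrink the error for that rating; repeating it over every observed rating is what drives the mean squared error down from one iteration to the next.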

In [47]:
user_mat, movie_mat = FunkSVD(ratings_mat, 
                              latent_features=3, 
                              learning_rate=0.05, 
                              iters=10)

Optimizaiton Statistics
Iterations | Mean Squared Error 
1 		 2.596023
2 		 0.606464
3 		 0.495485
4 		 0.397009
5 		 0.300170
6 		 0.231268
7 		 0.193369
8 		 0.174199
9 		 0.162853
10 		 0.154003


In [48]:
print(np.dot(user_mat, movie_mat))
print(ratings_mat)

[[3.75970223 3.53204469 3.37950272]
 [1.47838469 3.60052284 2.5170406 ]
 [1.00899746 1.05385648 0.9636682 ]
 [3.30721258 2.474125   2.65998433]
 [3.0003963  2.93900709 2.95777222]
 [1.78147728 2.81359643 2.0772378 ]
 [3.03125546 3.0867048  2.9090706 ]
 [1.88539646 4.81076812 3.0571695 ]]
[[4. 4. 3.]
 [1. 3. 3.]
 [1. 1. 1.]
 [3. 2. 3.]
 [3. 3. 3.]
 [2. 3. 2.]
 [3. 3. 3.]
 [2. 5. 3.]]


**The predicted ratings from the dot product are already starting to look a lot like the original data values even after only 10 iterations. Clearly the model is not done learning, but things are looking good.**

3. Let's try out the function again on the user_movie_subset dataset. This time we will again use 3 latent features and a learning rate of 0.005. However, let's bump up the number of iterations to 300. When you take the dot product of the resulting U and V matrices, how does the resulting user_movie matrix compare to the original subset of the data? What do you notice about the mean squared error at the end of each training iteration?

In [49]:
user_mat, movie_mat = FunkSVD(ratings_mat, 
                              latent_features=3, 
                              learning_rate=0.005, 
                              iters=300)

Optimizaiton Statistics
Iterations | Mean Squared Error 
1 		 4.432049
2 		 3.804166
3 		 3.189154
4 		 2.612832
5 		 2.097802
6 		 1.659277
7 		 1.303033
8 		 1.025953
9 		 0.818530
10 		 0.668078
11 		 0.561527
12 		 0.487239
13 		 0.435811
14 		 0.400156
15 		 0.375192
16 		 0.357407
17 		 0.344427
18 		 0.334673
19 		 0.327102
20 		 0.321023
21 		 0.315977
22 		 0.311654
23 		 0.307845
24 		 0.304403
25 		 0.301225
26 		 0.298238
27 		 0.295387
28 		 0.292633
29 		 0.289945
30 		 0.287300
31 		 0.284680
32 		 0.282073
33 		 0.279466
34 		 0.276853
35 		 0.274225
36 		 0.271577
37 		 0.268906
38 		 0.266208
39 		 0.263480
40 		 0.260721
41 		 0.257928
42 		 0.255101
43 		 0.252240
44 		 0.249343
45 		 0.246412
46 		 0.243445
47 		 0.240444
48 		 0.237410
49 		 0.234343
50 		 0.231244
51 		 0.228117
52 		 0.224960
53 		 0.221778
54 		 0.218572
55 		 0.215343
56 		 0.212095
57 		 0.208830
58 		 0.205550
59 		 0.202259
60 		 0.198959
61 		 0.195654
62 		 0.192346
63 		 0.189039
64 		 0

In [50]:
print(np.dot(user_mat, movie_mat))
print(ratings_mat)

[[3.89025207 3.86684729 3.24419733]
 [0.98879293 2.98619465 3.0259511 ]
 [0.96191579 0.95822486 1.08019867]
 [3.02263668 2.02688292 2.95079038]
 [3.05188087 3.06023541 2.88775409]
 [2.00030295 2.99813973 2.00202972]
 [3.05395481 3.06909669 2.87583806]
 [2.01611003 5.02026574 2.96339174]]
[[4. 4. 3.]
 [1. 3. 3.]
 [1. 1. 1.]
 [3. 2. 3.]
 [3. 3. 3.]
 [2. 3. 2.]
 [3. 3. 3.]
 [2. 5. 3.]]


**In this case, we were able to reconstruct the user-movie matrix almost exactly, with the mean squared error approaching 0 by the end of the 300 iterations.**

The last time we tried to place an np.nan value into this matrix, we could not run NumPy's SVD at all. Let's see if that is still the case using your FunkSVD function. In the cell below, I have placed a nan into the first cell of your numpy array.

4. Use 3 latent features, a learning rate of 0.005, and 450 iterations. Are you able to run your SVD without it breaking (something that was not possible with NumPy's built-in SVD)? Do you get a prediction for the nan value? What is your prediction for the missing value? Use the cells below to answer these questions.

In [51]:
ratings_mat[0, 0] = np.nan
ratings_mat

matrix([[nan,  4.,  3.],
        [ 1.,  3.,  3.],
        [ 1.,  1.,  1.],
        [ 3.,  2.,  3.],
        [ 3.,  3.,  3.],
        [ 2.,  3.,  2.],
        [ 3.,  3.,  3.],
        [ 2.,  5.,  3.]])

In [52]:
# run SVD on the matrix with the missing value
user_mat, movie_mat = FunkSVD(ratings_mat, 
                              latent_features=3, 
                              learning_rate=0.005, 
                              iters=450)

Optimizaiton Statistics
Iterations | Mean Squared Error 
1 		 3.551159
2 		 3.033426
3 		 2.547944
4 		 2.109363
5 		 1.727487
6 		 1.406337
7 		 1.144544
8 		 0.936695
9 		 0.775076
10 		 0.651262
11 		 0.557254
12 		 0.486121
13 		 0.432218
14 		 0.391145
15 		 0.359574
16 		 0.335037
17 		 0.315721
18 		 0.300302
19 		 0.287809
20 		 0.277531
21 		 0.268939
22 		 0.261639
23 		 0.255333
24 		 0.249796
25 		 0.244854
26 		 0.240375
27 		 0.236253
28 		 0.232407
29 		 0.228774
30 		 0.225303
31 		 0.221955
32 		 0.218699
33 		 0.215508
34 		 0.212364
35 		 0.209251
36 		 0.206157
37 		 0.203072
38 		 0.199989
39 		 0.196903
40 		 0.193809
41 		 0.190704
42 		 0.187587
43 		 0.184456
44 		 0.181312
45 		 0.178153
46 		 0.174982
47 		 0.171799
48 		 0.168605
49 		 0.165402
50 		 0.162193
51 		 0.158979
52 		 0.155763
53 		 0.152547
54 		 0.149335
55 		 0.146128
56 		 0.142930
57 		 0.139744
58 		 0.136573
59 		 0.133420
60 		 0.130287
61 		 0.127177
62 		 0.124094
63 		 0.121041
64 		 0

In [54]:
preds = np.dot(user_mat, movie_mat)
print("The predicted value for the missing rating is: {}".format(preds[0,0]))


The predicted value for the missing rating is: 2.177346902360894
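When the matrix contains missing values, the fit should be judged only on the ratings that exist, which is what `num_ratings` does inside FunkSVD. The sketch below shows that idea on a tiny made-up example; `masked_mse` and the two small arrays are hypothetical and exist only for illustration.

```python
import numpy as np

def masked_mse(actual, predicted):
    """Mean squared error over observed (non-nan) ratings only (hypothetical helper)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # True where a rating was actually observed.
    mask = ~np.isnan(actual)
    return float(np.mean((actual[mask] - predicted[mask]) ** 2))

# Toy data: one missing rating, and one prediction that is off by 1.
actual = np.array([[np.nan, 4.0],
                   [1.0, 3.0]])
predicted = np.array([[2.0, 4.0],
                      [1.0, 2.0]])

mse = masked_mse(actual, predicted)
print(mse)
```

The prediction for the missing entry is ignored by the error, yet it is still available in `predicted` as the model's estimate for that user-movie pair.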


Now let's extend this to a more realistic example. Unfortunately, running this function on your entire user-movie matrix is still not something you likely want to do on your local machine. However, we can see how well this example extends to 1000 users. In the above portion, you were using a very small subset of data with no missing values.

5. Given the size of this matrix, this will take quite a bit of time. Consider the following hyperparameters: 3 latent features, 0.005 learning rate, and 500 iterations. Grab a snack, take a walk, and this should be done running in a bit.

In [11]:
# Setting up a matrix of the first 1000 users with movie ratings
first_1000_users = np.matrix(movie_pivot.head(1000))
first_1000_users[first_1000_users == 0] = np.nan

# # perform funkSVD on the matrix of the top 1000 users
# user_mat, movie_mat = FunkSVD(first_1000_users, 
#                               latent_features=3, 
#                               learning_rate=0.005, 
#                               iters=500)


In [12]:
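# Use Surprise's SVD, the matrix factorization algorithm popularized by Simon Funk.
svd = SVD()
# Reader() defaults to rating_scale=(1, 5); pass rating_scale explicitly if the ratings use a different range.
reader = Reader()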

In [13]:
data = Dataset.load_from_df(final_rating[['userId', 'title', 'rating']], reader)

In [15]:
# Alternative: build the dataset from movieId instead of title
# data = Dataset.load_from_df(ratings_df[['userId', 'movieId', 'rating']], reader)

In [14]:
cross_validate(svd, data, measures=['RMSE', 'MAE'], cv=5, verbose=True)

Evaluating RMSE, MAE of algorithm SVD on 5 split(s).

                  Fold 1  Fold 2  Fold 3  Fold 4  Fold 5  Mean    Std     
RMSE (testset)    0.7394  0.7395  0.7391  0.7401  0.7399  0.7396  0.0004  
MAE (testset)     0.5604  0.5606  0.5602  0.5607  0.5606  0.5605  0.0002  
Fit time          215.55  219.23  215.89  218.94  217.25  217.37  1.51    
Test time         59.05   49.73   44.20   60.62   40.59   50.84   7.92    


{'test_rmse': array([0.73940993, 0.73953583, 0.73905324, 0.74005728, 0.73988416]),
 'test_mae': array([0.56038511, 0.5606157 , 0.56015645, 0.5607372 , 0.56057395]),
 'fit_time': (215.55395007133484,
  219.2347650527954,
  215.89060854911804,
  218.93954825401306,
  217.250962972641),
 'test_time': (59.054768085479736,
  49.72729754447937,
  44.19604015350342,
  60.61952543258667,
  40.588096618652344)}

In [15]:
# build a trainset from the full dataset (no held-out test set)
trainset = data.build_full_trainset()

In [16]:
# Train the algorithm on the trainset
svd.fit(trainset)

<surprise.prediction_algorithms.matrix_factorization.SVD at 0x1a5ae3d4f70>

In [18]:
ratings_df[ratings_df['userId'] == 1]

Unnamed: 0,userId,movieId,rating


In [19]:
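# Predict one rating. The model was trained with titles as item ids and user 1 was
# filtered out earlier, so both ids are probably unknown to the model here and the
# estimate falls back on a default rather than a personalized prediction.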
svd.predict(uid=1, iid=302, r_ui=None)

Prediction(uid=1, iid=302, r_ui=None, est=3.437940601992745, details={'was_impossible': False})

In [20]:
# grab just the estimated rating from the Prediction object
svd.predict(uid=1, iid=302, r_ui=None).est

3.437940601992745

In [21]:
ratings_df[ratings_df['movieId'] == 302].rating.mean()

3.6588103254769924

In [22]:
# predict one rating for user 100 and item id 302
svd.predict(uid=100, iid=302, r_ui=None)

Prediction(uid=100, iid=302, r_ui=None, est=3.9398385352737106, details={'was_impossible': False})

In [23]:
# grab just the estimated rating from the Prediction object
svd.predict(uid=100, iid=302, r_ui=None).est

3.9398385352737106

In [2]:
import torch
import numpy as np
from surprise import Dataset
from surprise import SVD, Reader
from scipy.sparse import csr_matrix

reader = Reader()
# Load the Surprise dataset
data = Dataset.load_builtin('ml-100k')
# data = Dataset.load_from_df(final_rating[['userId', 'title', 'rating']], reader)
# Build the trainset
trainset = data.build_full_trainset()

# Get the ratings data
ratings_data = list(trainset.all_ratings())

if len(ratings_data) == 0:
    # Handle the case when there is no rating data
    print("No rating data available.")
else:
    # Extract user IDs, item IDs, and ratings from the ratings data
    user_ids = [r[0] for r in ratings_data]
    item_ids = [r[1] for r in ratings_data]
    ratings = [r[2] for r in ratings_data]

    # Create a sparse ratings matrix
    ratings_matrix = csr_matrix((ratings, (user_ids, item_ids)))

    # Convert the sparse matrix to a PyTorch tensor
    ratings_tensor = torch.FloatTensor(ratings_matrix.toarray())

    # Move the ratings tensor to the GPU if one is available, otherwise keep it on the CPU
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    ratings_tensor = ratings_tensor.to(device)
    # Print if using CPU or GPU
    print('Device:', device)

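    # Perform SVD on the selected device (torch.svd is deprecated;
    # torch.linalg.svd is the recommended replacement and returns V transposed)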
    U, S, V = torch.svd(ratings_tensor)

    # Move the resulting matrices back to the CPU if needed
    U = U.cpu()
    S = S.cpu()
    V = V.cpu()

    # Print the shapes of U, S, V
    print('U shape:', U.shape)
    print('S shape:', S.shape)
    print('V shape:', V.shape)


Device: cpu
U shape: torch.Size([943, 943])
S shape: torch.Size([943])
V shape: torch.Size([1682, 943])


In [8]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

In [9]:
device

device(type='cpu')

In [7]:
import torch
torch.cuda.is_available()

False