<table><tbody><tr><th><p><img alt="Emblem" src="https://cdn6.aptoide.com/imgs/6/f/4/6f4821daa840da8fe971445350759fe5_icon.png" style="width:150px;"></p></th><th><p><strong>Artificial Intelligence</strong></p><p><strong>Degree in Computer Engineering in Information Systems – Academic Year 2024/2025</strong></p><p><strong>PRACTICAL AND DEVELOPMENT TEACHING</strong></p><h1>EPD 6: Machine Learning – Recommender Systems</h1></th></tr></tbody></table>

____

## Objectives
- Implement a recommender system algorithm in Python.

___

## Basic Bibliography
- Recommender Systems: The Textbook. Charu C. Aggarwal. Springer, 2016. Available online: http://pzs.dstu.dp.ua/DataMining/recom/bibl/1aggarwal_c_c_recommender_systems_the_textbook.pdf

- Recommender Systems Handbook. Francesco Ricci, Lior Rokach, Bracha Shapira, Paul B. Kantor. Springer, 2011. Available online: https://www.cse.iitk.ac.in/users/nsrivast/HCC/Recommender_systems_handbook.pdf

___

In [1]:
import numpy as np
import pandas as pd
import scipy.io as sio
import scipy.optimize as opt

## Exercises
Implement an algorithm that recommends movies to users. Use the file “ex8_movies.mat”, which contains movie ratings given by users on a scale from 1 to 5: 943 users have rated 1682 movies. Each movie is described by 10 features related to its content. The goal of the algorithm is to predict the rating a user would give to a movie they have not yet seen, and to recommend to that user the movies with the highest predicted ratings.

#### EJ01. 

Load the dataset and prepare it for the algorithm using two matrices. Matrix Y stores the movie ratings, and matrix R contains only binary values, where R(i,j) = 1 means that user j rated movie i and R(i,j) = 0 means that they did not. Both matrices have dimensions number of movies x number of users. The mean rating of the first movie (Toy Story) should be approximately 3.878319. Store in the parameter matrices X and Theta the pre-trained values available in the file “ex8_movieParams.mat”. X must have dimensions number of movies x number of features, and Theta number of features x number of users. Check the dimensions and fix them if they do not match.

##### Solution:

In [2]:
# =============== EJ1: Load data ================
print('Loading movie ratings dataset.')
movies = sio.loadmat("ex8_movies.mat")
Y = movies['Y'] # [n_items, n_users] ratings from 1 to 5
R = movies['R'] # [n_items, n_users] R(i,j)=1 if user j rated movie i
print("Shape de Y: ", Y.shape)  # [n_items, n_users]
print("Shape de R: ", R.shape)  # [n_items, n_users]

print('\tAverage rating for the first movie (Toy Story): ', Y[0, np.where(R[0, :] == 1)[0]].mean(), "/5\n")

#  Load pre-trained parameters (X, Theta, num_users, num_movies, num_features)
    
params_data = sio.loadmat('ex8_movieParams.mat')
X = params_data['X']
Theta = params_data['Theta']
Theta = Theta.T # The file stores Theta as [n_users, features]; transpose it right away so the linear regression formulas can be used
print("Shape de X: ", X.shape)  # [n_items, features]
print("Shape de Theta: ", Theta.shape)  # [features, n_users]


Loading movie ratings dataset.
Shape de Y:  (1682, 943)
Shape de R:  (1682, 943)
	Average rating for the first movie (Toy Story):  3.8783185840707963 /5

Shape de X:  (1682, 10)
Shape de Theta:  (10, 943)
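
The role of R as a mask can be illustrated on a toy example. The following is only a minimal sketch with made-up ratings: the matrix `Y_toy` and the helper name `mean_rating_per_movie` are assumptions introduced here for illustration and are not part of the dataset or of the exercise code.

```python
import numpy as np

# Hypothetical toy data: 3 movies x 4 users, 0 means "not rated"
Y_toy = np.array([[5, 4, 0, 0],
                  [0, 0, 3, 1],
                  [2, 0, 0, 0]])
R_toy = (Y_toy != 0).astype(int)  # R(i, j) = 1 if user j rated movie i

def mean_rating_per_movie(Y, R):
    """Average each row of Y over the entries flagged in R (unrated entries are ignored)."""
    counts = R.sum(axis=1)
    # Guard against movies without any rating to avoid dividing by zero
    return (Y * R).sum(axis=1) / np.maximum(counts, 1)

toy_means = mean_rating_per_movie(Y_toy, R_toy)
print(toy_means)  # the means are 4.5, 2.0 and 2.0
```

Averaging over the whole row instead (including the zeros) would underestimate the ratings, which is why the Toy Story mean above is computed only over the columns where R equals 1.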


#### EJ02.
Implement the cost function without regularization for a collaborative filtering recommender system in cofiCostFuncSinReg, following the formula given in EB. The cost is accumulated for user j and movie i only if R(i,j) = 1. Using the parameter matrices X and Theta stored in the file, restricted to the first 4 users, first 5 movies and first 3 features, the cost should be approximately 22.22.

##### Solution:

In [3]:
def cofiCostFuncSinReg(params, Y, R, num_features):
    # Not needed: Y = np.matrix(Y)
    # Not needed: R = np.matrix(R)
    num_movies = Y.shape[0]
    num_users = Y.shape[1]

    # Unfold the X and Theta matrices from params by reshaping
    X = np.reshape(params[:num_movies * num_features], (num_movies, num_features), 'F')  # [n_items, features] # Previously: X = np.matrix(np.reshape(params[:num_movies * num_features], (num_movies, num_features),'F'))
    # Theta was transposed when loading, so its reshape target changes
    Theta = np.reshape(params[num_movies * num_features:], (num_features, num_users), 'F') # [features, n_users] # Previously: Theta = np.matrix(np.reshape(params[num_movies * num_features:], (num_users, num_features),'F'))  # (943, 10)

    # Initializations
    J = 0

    ## ====================== YOUR CODE HERE ======================
    # Instructions: Compute the cost function for collaborative
    #               filtering. Concretely, you should first implement the cost
    #               function (without regularization) and make sure it
    #               matches our costs.
    #
    #  Notes: X:  num_movies  x num_features matrix of movie features
    #        Theta:  num_features  x num_users matrix of user features
    #        Y: num_movies x num_users matrix of user ratings of movies
    #        R: num_movies x num_users matrix, where R(i, j) = 1 if the
    #           i-th movie was rated by the j-th user

    error = np.multiply(np.dot(X, Theta) - Y, R) # [n_items, n_users] # Previously: error = np.multiply((X * Theta.T) - Y, R)  # (1682, 943)
    # Multiplying by R ensures that only the ratings that were actually given contribute to the cost
    
    squared_error = np.power(error, 2)  # [n_items, n_users]
    J = (1. / 2) * np.sum(squared_error)

    return J

# ANOTHER WAY TO DO IT (this variant assumes Theta is not transposed, i.e. [n_users, features])
    #num_movies = Y.shape[0]
    #num_users = Y.shape[1]
    # X = np.reshape(params[:num_movies * num_features], (num_movies, num_features), 'F')  # (1682, 10)
    # Theta = np.reshape(params[num_movies * num_features:], (num_users, num_features), 'F')  # (943, 10)
    #C = np.subtract((X @ Theta.T),Y)
    #indices = np.where(R[:, :] == 1)
    #error = C[indices]
    #squared_error = np.power(error, 2)  # (1682, 943)
    #J = (1. / 2) * np.sum(squared_error)
    #return J
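
The cost function receives X and Theta packed into a single flat vector and recovers them with `np.reshape(..., 'F')`. The sketch below, using small made-up matrices (`X_demo` and `Theta_demo` are illustrative names, not part of the exercise), shows that the round trip only works when packing and unpacking use the same order.

```python
import numpy as np

# Hypothetical small matrices standing in for X [n_items, features] and Theta [features, n_users]
X_demo = np.arange(6, dtype=float).reshape(3, 2)
Theta_demo = np.arange(8, dtype=float).reshape(2, 4)

# Pack: flatten each matrix column by column (order='F') and join the two vectors
packed = np.hstack((np.ravel(X_demo, order='F'), np.ravel(Theta_demo, order='F')))

# Unpack: the split point is the number of entries of X
n_x = X_demo.size
X_back = np.reshape(packed[:n_x], X_demo.shape, order='F')
Theta_back = np.reshape(packed[n_x:], Theta_demo.shape, order='F')
roundtrip_ok = np.array_equal(X_back, X_demo) and np.array_equal(Theta_back, Theta_demo)

# Mixing orders silently scrambles the entries instead of raising an error
X_scrambled = np.reshape(packed[:n_x], X_demo.shape, order='C')
print(roundtrip_ok, np.array_equal(X_scrambled, X_demo))
```

Because a mismatch in order or in target shape raises no exception, printing the shapes after unpacking is a cheap sanity check.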
    

In [4]:
#  Collaborative filtering for recommender systems
### Subset of the data so that it runs faster
users = 4
movies = 5
features = 3

X_sub = X[:movies, :features] # [n_items, features]
Theta_sub = Theta[:features, :users] # [features, n_users] # Without Theta.T it was: Theta_sub = Theta[:users, :features]
Y_sub = Y[:movies, :users] # [n_items, n_users]
R_sub = R[:movies, :users] # [n_items, n_users]

params = np.hstack((np.ravel(X_sub, order='F'), np.ravel(Theta_sub, order='F'))) # Previously done with concatenate: params = np.concatenate((np.ravel(X_sub,order='F'), np.ravel(Theta_sub,order='F')))

J = cofiCostFuncSinReg(params, Y_sub, R_sub, features)
print("Cost without regularization at loaded parameters: ", J, "(this value should be about 22.22)")


Cost without regularization at loaded parameters:  22.224603725685675 (this value should be about 22.22)
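
To see what the masked, vectorized expression computes, it can be compared against an explicit double loop over movies and users. This is a sketch on random toy data; `cost_loops`, `cost_vectorized` and the `*_toy` arrays are hypothetical names introduced here, using the same [n_items, features] and [features, n_users] layout as above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_movies, n_users, n_features = 5, 4, 3

# Hypothetical random data with the same layout as the exercise
X_toy = rng.normal(size=(n_movies, n_features))      # [n_items, features]
Theta_toy = rng.normal(size=(n_features, n_users))   # [features, n_users]
R_toy = rng.integers(0, 2, size=(n_movies, n_users))
Y_toy = rng.integers(1, 6, size=(n_movies, n_users)) * R_toy  # unrated entries set to 0

def cost_loops(X, Theta, Y, R):
    """Sum the squared error rating by rating, skipping unrated pairs."""
    J = 0.0
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            if R[i, j] == 1:
                pred = X[i, :] @ Theta[:, j]
                J += 0.5 * (pred - Y[i, j]) ** 2
    return J

def cost_vectorized(X, Theta, Y, R):
    """Same quantity, using the R mask instead of loops."""
    return 0.5 * np.sum(((X @ Theta - Y) * R) ** 2)

J_loops = cost_loops(X_toy, Theta_toy, Y_toy, R_toy)
J_vec = cost_vectorized(X_toy, Theta_toy, Y_toy, R_toy)
print(J_loops, J_vec)
```

Both values should agree up to floating-point rounding. On the full 1682 x 943 matrix the loop version is far slower, which is the main reason to prefer the masked product.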


#### EJ03.
Implement the gradient function without regularization in cofiGradientFuncSinReg. Use the helper function in checkNNGradientsSinReg.py to verify that the gradients are computed correctly.

##### Solution:

In [5]:
def cofiGradientFuncSinReg(params, Y, R, num_features):
    # Not needed: Y = np.matrix(Y)
    # Not needed: R = np.matrix(R)
    num_movies = Y.shape[0]
    num_users = Y.shape[1]

    # Unfold the X and Theta matrices from params by reshaping
    X = np.reshape(params[:num_movies * num_features], (num_movies, num_features), 'F')  # [n_items, features] # Previously: X = np.matrix(np.reshape(params[:num_movies * num_features], (num_movies, num_features),'F'))  # (1682, 10)
    Theta = np.reshape(params[num_movies * num_features:], (num_features, num_users), 'F') # [features, n_users] # Previously: Theta = np.matrix(np.reshape(params[num_movies * num_features:], (num_users, num_features),'F'))  # (943, 10)

    # Initializations
    X_grad = np.zeros(X.shape)  # [n_items, features]
    Theta_grad = np.zeros(Theta.shape)  # [features, n_users]

    ## ====================== YOUR CODE HERE ======================
    # Instructions: Compute the gradient function for collaborative
    #               filtering. You should implement the gradient without regularization
    #
    #  Notes: X:  num_movies  x num_features matrix of movie features
    #        Theta:  num_features  x num_users matrix of user features
    #        Y: num_movies x num_users matrix of user ratings of movies
    #        R: num_movies x num_users matrix, where R(i, j) = 1 if the
    #           i-th movie was rated by the j-th user
    #
    error = np.multiply(np.dot(X, Theta) - Y, R)  # [n_items, n_users] # Previously: error = np.multiply((X * Theta.T) - Y, R)  # (1682, 943)

    # calculate the gradients
    Theta_grad = np.dot(X.T, error) # [features, n_users]=[features,n_items]*[n_items,n_users] # Previously: Theta_grad = error.T * X
    X_grad = np.dot(error, Theta.T) # [n_items,features]=[n_items,n_users]*[n_users, features] # Equivalent to: X_grad2 = np.dot(Theta, error.T).T -- Previously: X_grad = error * Theta


    # Unroll the gradient matrices into a single array
    grad = np.hstack((np.ravel(X_grad, order='F'), np.ravel(Theta_grad, order='F'))) # Previously: grad = np.concatenate((np.ravel(X_grad,order='F'), np.ravel(Theta_grad,order='F')))

    return grad
    

In [6]:
def computeNumericalGradientSinReg(X,Theta, Y, R, num_features):
    mygrad = np.zeros(Theta.size + X.size)
    perturb = np.zeros(Theta.size + X.size)
    myeps = 0.0001
    params = np.concatenate((np.ravel(X, order='F'), np.ravel(Theta, order='F')))

    for i in range(np.size(Theta)+np.size(X)):
        # Set perturbation vector
        perturb[i] = myeps
        params_plus = params + perturb
        params_minus = params - perturb
        cost_high = cofiCostFuncSinReg(params_plus, Y, R, num_features)
        cost_low = cofiCostFuncSinReg(params_minus, Y, R, num_features)

        # Compute Numerical Gradient
        mygrad[i] = (cost_high - cost_low) / float(2 * myeps)
        perturb[i] = 0

    return mygrad
    
def checkNNGradientsSinReg():
    #Create small problem
    X_t = np.random.rand(4, 3)
    Theta_t = np.random.rand(5, 3)

    #Zap out most entries
    Y = X_t @ Theta_t.T
    dim = Y.shape
    aux = np.random.rand(*dim)
    Y[aux > 0.5] = 0
    R = np.zeros((Y.shape))
    R[Y != 0] = 1

    #Run Gradient Checking
    dim_X_t = X_t.shape
    dim_Theta_t = Theta_t.shape
    X = np.random.randn(*dim_X_t)
    Theta = np.random.randn(*dim_Theta_t)
    num_users = Y.shape[1]
    num_movies = Y.shape[0]
    num_features = Theta_t.shape[1]

    # Note: Theta is generated here as [n_users, features], while the cost and gradient
    # functions reshape the flat vector as [features, n_users]. Both gradients are evaluated
    # on the same flat vector of random values, so the comparison remains valid.
    params = np.concatenate((np.ravel(X,order='F'), np.ravel(Theta,order='F')))

    # Compute the gradient by numerical approximation
    mygrad = computeNumericalGradientSinReg(X, Theta, Y, R, num_features)

    # Compute the analytical gradient
    grad = cofiGradientFuncSinReg(params, Y, R, num_features)

    # Visually examine the two gradient computations. grad is passed as the DataFrame
    # index and mygrad as column 0, so each printed row shows the analytical value (left)
    # next to the numerical one (right). The two columns should be very similar.
    df = pd.DataFrame(mygrad,grad)
    print(df)

    # Evaluate the norm of the difference between two solutions.
    # If you have a correct implementation, and assuming you used EPSILON = 0.0001
    # in computeNumericalGradientSinReg, then diff below should be less than 1e-9
    diff = np.linalg.norm((mygrad-grad))/np.linalg.norm((mygrad+grad))

    print('If your gradient implementation is correct, then the differences will be small (less than 1e-9):' , diff)
    

In [7]:
grad = cofiGradientFuncSinReg(params, Y_sub, R_sub, features)
print("Gradient without regularization at loaded parameters: \n", grad)

checkNNGradientsSinReg()


Gradient without regularization at loaded parameters: 
 [ -2.52899165  -0.56819597  -0.83240713  -0.38358278  -0.80378006
   7.57570308   3.35265031   4.91163297   2.26333698   4.74271842
  -1.89979026  -0.52339845  -0.76677878  -0.35334048  -0.74040871
 -10.5680202    4.62776019  -7.16004443  -3.05099006   1.16441367
  -3.47410789   0.           0.           0.           0.
   0.           0.        ]
                    0
-4.996603   -4.996603
-2.115634   -2.115634
 0.347809    0.347809
-7.243245   -7.243245
 5.218865    5.218865
 1.230161    1.230161
 9.828190    9.828190
 16.302149  16.302149
 4.844864    4.844864
 2.011649    2.011649
 0.027489    0.027489
 18.144868  18.144868
-3.485974   -3.485974
-8.310673   -8.310673
-6.142032   -6.142032
-6.934408   -6.934408
 3.206941    3.206941
 1.622969    1.622969
-2.890199   -2.890199
-6.291380   -6.291380
-3.083133   -3.083133
 0.178006    0.178006
 0.424371    0.424371
 0.313633    0.313633
 0.383903    0.383903
 5.471823    5.471823
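
As a cross-check of the hand-written helper, SciPy ships `scipy.optimize.check_grad`, which compares an analytical gradient against a finite-difference approximation. The sketch below applies it to a small random problem; `toy_cost`, `toy_grad` and `unpack` are hypothetical stand-ins written for this example, not the course functions. Note that `check_grad` uses forward differences with a much smaller step than the central-difference helper above, so the error values are not directly comparable.

```python
import numpy as np
from scipy.optimize import check_grad

n_movies, n_users, n_features = 4, 5, 3
rng = np.random.default_rng(1)

# Hypothetical toy ratings; roughly half of the entries are marked as rated
Y_toy = rng.random((n_movies, n_users)) * 5
R_toy = (rng.random((n_movies, n_users)) > 0.5).astype(float)

def unpack(params):
    """Split the flat vector into X [n_items, features] and Theta [features, n_users]."""
    X = np.reshape(params[:n_movies * n_features], (n_movies, n_features), order='F')
    Theta = np.reshape(params[n_movies * n_features:], (n_features, n_users), order='F')
    return X, Theta

def toy_cost(params):
    X, Theta = unpack(params)
    return 0.5 * np.sum(((X @ Theta - Y_toy) * R_toy) ** 2)

def toy_grad(params):
    X, Theta = unpack(params)
    error = (X @ Theta - Y_toy) * R_toy
    return np.hstack((np.ravel(error @ Theta.T, order='F'),
                      np.ravel(X.T @ error, order='F')))

x0 = rng.normal(size=n_movies * n_features + n_features * n_users)
# check_grad returns the 2-norm of (analytical gradient - finite-difference gradient)
grad_error = check_grad(toy_cost, toy_grad, x0)
print(grad_error)
```

A value several orders of magnitude below the size of the gradient entries suggests the analytical gradient is consistent with the cost. A value of order 1 or larger usually points to a transposition or masking mistake.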


#### EJ04.
Implement the cost function and the gradient function with regularization in cofiCostFuncReg and cofiGradientFuncReg, respectively. Both must take the parameter lambda, set to 1.5. The cost function should return a cost of approximately 31.34 when using the parameter matrices X and Theta stored in the file “ex8_movieParams.mat”, restricted to the first 4 users, first 5 movies and first 3 features. Use the helper function checkNNGradientsReg with lambda set to 1.5 to verify that the gradients are computed correctly.

##### Solution:

In [8]:
def cofiCostFuncReg(params, Y, R, num_features, lambda_param):
    # Not needed: Y = np.matrix(Y)
    # Not needed: R = np.matrix(R)
    num_movies = Y.shape[0]
    num_users = Y.shape[1]

    # Unfold the X and Theta matrices from params by reshaping
    X = np.reshape(params[:num_movies * num_features], (num_movies, num_features), 'F')  # [n_items, features] # Previously: X = np.matrix(np.reshape(params[:num_movies * num_features], (num_movies, num_features),'F'))
    Theta = np.reshape(params[num_movies * num_features:], (num_features, num_users), 'F') # [features, n_users] # Previously: Theta = np.matrix(np.reshape(params[num_movies * num_features:], (num_users, num_features),'F'))

    # Initializations
    J = 0

    ## ====================== YOUR CODE HERE ======================
    # Instructions: Compute the cost function for collaborative
    #               filtering. Concretely, you should implement the cost
    #               function (with regularization) and make sure it
    #               matches our costs.
    #
    #  Notes: X:  num_movies  x num_features matrix of movie features
    #        Theta:  num_features  x num_users matrix of user features
    #        Y: num_movies x num_users matrix of user ratings of movies
    #        R: num_movies x num_users matrix, where R(i, j) = 1 if the
    #           i-th movie was rated by the j-th user


    error = np.multiply(np.dot(X, Theta) - Y, R) # [n_items, n_users]  # Previously: error = np.multiply((X * Theta.T) - Y, R)

    squared_error = np.power(error, 2)  # [n_items, n_users]
    J = (1. / 2) * np.sum(squared_error)

    # add the cost regularization
    J = J + ((lambda_param / 2) * np.sum(np.power(Theta, 2))) # Applied to every parameter: there is no bias column to skip here (no column of ones was added)
    J = J + ((lambda_param / 2) * np.sum(np.power(X, 2)))

    return J
    

In [9]:
def cofiGradientFuncReg(params, Y, R, num_features, lambda_param):
    # Not needed: Y = np.matrix(Y)
    # Not needed: R = np.matrix(R)
    num_movies = Y.shape[0]
    num_users = Y.shape[1]

    # Unfold the X and Theta matrices from params by reshaping
    X = np.reshape(params[:num_movies * num_features], (num_movies, num_features), 'F')  # [n_items, features] # Previously: X = np.matrix(np.reshape(params[:num_movies * num_features], (num_movies, num_features),'F'))  # (1682, 10)
    Theta = np.reshape(params[num_movies * num_features:], (num_features, num_users), 'F')  # [features, n_users] # Previously: Theta = np.matrix(np.reshape(params[num_movies * num_features:], (num_users, num_features),'F'))  # (943, 10)

    # Initializations
    X_grad = np.zeros(X.shape)  # [n_items, features]
    Theta_grad = np.zeros(Theta.shape)  # [features, n_users]

    ## ====================== YOUR CODE HERE ======================
    # Instructions: Compute the gradient function for collaborative
    #               filtering with regularization.
    #
    #  Notes: X:  num_movies  x num_features matrix of movie features
    #        Theta:  num_features  x num_users matrix of user features
    #        Y: num_movies x num_users matrix of user ratings of movies
    #        R: num_movies x num_users matrix, where R(i, j) = 1 if the
    #           i-th movie was rated by the j-th user
    #
    error = np.multiply(np.dot(X, Theta) - Y, R)  # [n_items, n_users] # Previously: error = np.multiply((X * Theta.T) - Y, R)  # (1682, 943)

    # calculate the gradients with regularization
    X_grad = np.dot(error, Theta.T) + (lambda_param*X) # Previously: X_grad = (error * Theta) + (lambda_param * X)
    Theta_grad = np.dot(X.T, error) + (lambda_param*Theta) # Previously: Theta_grad = (error.T * X) + (lambda_param * Theta)

    # unravel the gradient matrices into a single array
    grad = np.hstack((np.ravel(X_grad,order='F'), np.ravel(Theta_grad,order='F'))) # Previously: grad = np.concatenate((np.ravel(X_grad,order='F'), np.ravel(Theta_grad,order='F')))

    return grad
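
For reference, the quantities computed by `cofiCostFuncReg` and `cofiGradientFuncReg` can be written as follows. The notation is an assumption chosen here to match the matrix shapes used in the code ($x^{(i)}$ is row $i$ of X, $\theta^{(j)}$ is column $j$ of Theta, $\circ$ is the element-wise product); the EB material may use different symbols.

```latex
% Regularized collaborative filtering cost
J(X,\Theta) = \frac{1}{2} \sum_{(i,j)\,:\,R(i,j)=1}
      \left( (x^{(i)})^{T}\theta^{(j)} - Y(i,j) \right)^{2}
    + \frac{\lambda}{2} \sum_{j}\sum_{k} \left(\theta^{(j)}_{k}\right)^{2}
    + \frac{\lambda}{2} \sum_{i}\sum_{k} \left(x^{(i)}_{k}\right)^{2}

% Partial derivatives
\frac{\partial J}{\partial x^{(i)}_{k}} =
    \sum_{j\,:\,R(i,j)=1} \left( (x^{(i)})^{T}\theta^{(j)} - Y(i,j) \right)\theta^{(j)}_{k}
    + \lambda\, x^{(i)}_{k}
\qquad
\frac{\partial J}{\partial \theta^{(j)}_{k}} =
    \sum_{i\,:\,R(i,j)=1} \left( (x^{(i)})^{T}\theta^{(j)} - Y(i,j) \right) x^{(i)}_{k}
    + \lambda\, \theta^{(j)}_{k}

% Matrix form, with E = (X\Theta - Y) \circ R
\nabla_{X} J = E\,\Theta^{T} + \lambda X
\qquad
\nabla_{\Theta} J = X^{T} E + \lambda\,\Theta
```

Setting $\lambda = 0$ recovers the unregularized cost and gradients of the previous two exercises.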

In [10]:
def computeNumericalGradientReg(X,Theta, Y, R, num_features, lambda_param):
    mygrad = np.zeros(Theta.size + X.size)
    perturb = np.zeros(Theta.size + X.size)
    myeps = 0.0001
    params = np.concatenate((np.ravel(X, order='F'), np.ravel(Theta, order='F')))

    for i in range(np.size(Theta)+np.size(X)):
        # Set perturbation vector
        perturb[i] = myeps
        params_plus = params + perturb
        params_minus = params - perturb
        cost_high = cofiCostFuncReg(params_plus, Y, R, num_features, lambda_param)
        cost_low = cofiCostFuncReg(params_minus, Y, R, num_features, lambda_param)

        # Compute Numerical Gradient
        mygrad[i] = (cost_high - cost_low) / float(2 * myeps)
        perturb[i] = 0

    return mygrad
    
def checkNNGradientsReg(lambda_param):
    #Create small problem
    X_t = np.random.rand(4, 3)
    Theta_t = np.random.rand(5, 3)

    #Zap out most entries
    Y = X_t @ Theta_t.T
    dim = Y.shape
    aux = np.random.rand(*dim)
    Y[aux > 0.5] = 0
    R = np.zeros((Y.shape))
    R[Y != 0] = 1

    #Run Gradient Checking
    dim_X_t = X_t.shape
    dim_Theta_t = Theta_t.shape
    X = np.random.randn(*dim_X_t)
    Theta = np.random.randn(*dim_Theta_t)
    num_users = Y.shape[1]
    num_movies = Y.shape[0]
    num_features = Theta_t.shape[1]

    params = np.concatenate((np.ravel(X,order='F'), np.ravel(Theta,order='F')))

    # Compute the gradient by numerical approximation
    mygrad = computeNumericalGradientReg(X, Theta, Y, R, num_features, lambda_param)

    # Compute the analytical gradient
    grad = cofiGradientFuncReg(params, Y, R, num_features, lambda_param)

    # Visually examine the two gradient computations. grad is passed as the DataFrame
    # index and mygrad as column 0, so each printed row shows the analytical value (left)
    # next to the numerical one (right). The two columns should be very similar.
    df = pd.DataFrame(mygrad,grad)
    print(df)

    # Evaluate the norm of the difference between two solutions.
    # If you have a correct implementation, and assuming you used EPSILON = 0.0001
    # in computeNumericalGradientReg, then diff below should be less than 1e-9
    diff = np.linalg.norm((mygrad-grad))/np.linalg.norm((mygrad+grad))

    print('If your gradient implementation is correct, then the differences will be small (less than 1e-9):' , diff)
    

In [11]:
# Evaluate cost function and gradient function, both with regularization
lambda_param = 1.5
J = cofiCostFuncReg(params, Y_sub, R_sub, features, lambda_param)
print("\n\nCost with regularization at loaded parameters: ", J, "(this value should be about 31.34)")

grad = cofiGradientFuncReg(params, Y_sub, R_sub, features, lambda_param)
print("Gradient with regularization at loaded parameters: \n", grad)
checkNNGradientsReg(lambda_param)




Cost with regularization at loaded parameters:  31.34405624427422 (this value should be about 31.34)
Gradient with regularization at loaded parameters: 
 [ -0.95596339   0.60308088   0.12985616   0.29684395   0.60252677
   6.97535514   2.77421145   4.0898522    1.06300933   4.90185327
  -0.10861109   0.25839822  -0.89247334   0.66738144  -0.19747928
 -10.13985478   2.10136256  -6.76563628  -2.29347024   0.48244098
  -2.99791422  -0.64787484  -0.71820673   1.27006666   1.09289758
  -0.40784086   0.49026541]
                    0
-4.374483   -4.374483
-4.616106   -4.616106
-13.697149 -13.697149
-2.596977   -2.596977
 7.684187    7.684187
-0.794055   -0.794055
 7.514065    7.514065
 3.974738    3.974738
-10.019073 -10.019073
-0.737630   -0.737630
-3.691873   -3.691873
-5.696226   -5.696226
-0.593458   -0.593458
-3.027686   -3.027686
 5.053231    5.053231
 2.629300    2.629300
-5.540864   -5.540864
 10.178213  10.178213
 11.996270  11.996270
-6.552486   -6.552486
-0.417659   -0.417659
 0

#### EJ05.
Randomly initialize both matrix X and matrix Theta with small values for the whole dataset, using np.random.rand() with the dimensions as input arguments. Then train with regularization to obtain the optimal parameters X and Theta using the function fmin_cg from scipy.optimize, with 200 iterations and lambda set to 1.5.

##### Solution:

In [12]:
# Useful Values
movies = Y.shape[0]  # 1682
users = Y.shape[1]  # 943
features = 10
lambda_param = 1.5
maxiter = 200

# Set Initial Parameters (Theta, X)
X = np.random.rand(movies, features) * (2*0.12) # Scales the values to the range [0, 0.24)
Theta = np.random.rand(features, users) * (2*0.12) # Previously: Theta = np.random.random(size=(users, features))
params = np.hstack((np.ravel(X, order='F'), np.ravel(Theta, order='F'))) # Previously: params = np.concatenate((np.ravel(X,order='F'), np.ravel(Theta,order='F')))

# OPTIMIZATION
fmin_1 = opt.fmin_cg(maxiter=maxiter, f=cofiCostFuncReg, x0=params, fprime=cofiGradientFuncReg,
                  args=(Y, R, features, lambda_param))

# Unfold the returned result into X and Theta: the trained parameters
X_fmin = np.reshape(fmin_1[:movies * features], (movies, features), 'F') # Previously: X = np.matrix(np.reshape(fmin[:movies * features], (movies, features),'F'))
Theta_fmin = np.reshape(fmin_1[movies * features:], (features, users), 'F') # Previously: Theta = np.matrix(np.reshape(fmin[movies * features:], (users, features),'F'))


         Current function value: 32990.506782
         Iterations: 200
         Function evaluations: 301
         Gradient evaluations: 301
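
The exercise asks for `fmin_cg`. As an alternative, and not the method required here, the same conjugate gradient optimizer is reachable through `scipy.optimize.minimize(method='CG')`, which accepts a single function returning both the cost and the gradient when `jac=True`. The sketch below runs it on a small made-up problem; `cost_and_grad` and the toy arrays are hypothetical names introduced for this example.

```python
import numpy as np
from scipy.optimize import minimize

n_movies, n_users, n_features, lam = 6, 5, 2, 1.5
rng = np.random.default_rng(2)

# Hypothetical toy ratings in 1..5 with a random rated/unrated mask
R_toy = (rng.random((n_movies, n_users)) > 0.4).astype(float)
Y_toy = rng.integers(1, 6, size=(n_movies, n_users)) * R_toy

def cost_and_grad(params):
    """Return the regularized cost and its gradient in a single pass."""
    X = np.reshape(params[:n_movies * n_features], (n_movies, n_features), order='F')
    Theta = np.reshape(params[n_movies * n_features:], (n_features, n_users), order='F')
    error = (X @ Theta - Y_toy) * R_toy
    J = 0.5 * np.sum(error ** 2) + (lam / 2) * (np.sum(X ** 2) + np.sum(Theta ** 2))
    X_grad = error @ Theta.T + lam * X
    Theta_grad = X.T @ error + lam * Theta
    return J, np.hstack((np.ravel(X_grad, order='F'), np.ravel(Theta_grad, order='F')))

# Small random starting point, scaled as in the exercise
x0 = rng.random(n_movies * n_features + n_features * n_users) * 0.24
# jac=True tells minimize that the objective returns (cost, gradient)
result = minimize(cost_and_grad, x0, method='CG', jac=True, options={'maxiter': 200})

initial_cost = cost_and_grad(x0)[0]
print(initial_cost, result.fun, result.nit)
```

Sharing the error matrix between cost and gradient avoids computing the product of X and Theta twice per iteration, which may matter on the full dataset.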




#### EJ06.
After training, compute the prediction matrix. Then print the recommendation of the 10 movies with the highest predicted ratings for user 2. They must be movies that this user had not previously rated; use np.where() with the appropriate condition for this.

##### Solution:

In [13]:
predictions = np.dot(X_fmin, Theta_fmin)  # [n_items, n_users] # Previously: predictions = X * Theta.T
# For user j only: keep the predicted rating for movies the user has not rated yet and set 0 for movies already rated
j = 2  # user 2, used directly as the column index of R and predictions
res_user = np.where(R[:, j] == 0, predictions[:, j], 0)  # 1-D array with one entry per movie
idx = np.argsort(res_user)[::-1]  # argsort returns indices in ascending order; [::-1] reverses them to descending

# Read the file with the title of each movie
movie_idx = {}
with open('movie_ids.txt', encoding='ISO-8859-1') as f:
    for line in f:
        tokens = line.rstrip('\n').split(' ')
        movie_idx[int(tokens[0]) - 1] = ' '.join(tokens[1:])

print("Top 10 movie predictions:")
for i in range(10):
    m = int(idx[i])  # idx is 1-D, so idx[i] is a scalar and no array-to-scalar conversion is needed
    print('Predicted rating of {0} for movie {1}.'.format(str(float(res_user[m])), movie_idx[m]))


Top 10 movie predictions:
Predicted rating of 5.731326570026947 for movie Big Lebowski, The (1998).
Predicted rating of 5.708573491758893 for movie She's So Lovely (1997).
Predicted rating of 5.46243147557749 for movie Clerks (1994).
Predicted rating of 5.259416729791748 for movie Last Supper, The (1995).
Predicted rating of 5.224206542374071 for movie Pillow Book, The (1995).
Predicted rating of 5.223051006196031 for movie Flirting With Disaster (1996).
Predicted rating of 5.127633460994417 for movie Heavy Metal (1981).
Predicted rating of 5.069148600494864 for movie Spawn (1997).
Predicted rating of 5.013033352809765 for movie Mallrats (1995).
Predicted rating of 4.945568310277267 for movie Koyaanisqatsi (1983).
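
Filling already-rated movies with 0 works for a top-10 list as long as the best predictions are positive. A slightly more defensive variant, sketched here under the assumption that predictions may also be negative, masks rated movies with `-np.inf` so they can never outrank an unrated one. The function `top_k_unrated` and the toy arrays are hypothetical and only illustrate the idea.

```python
import numpy as np

# Hypothetical predictions for one user over 6 movies, and that user's rated flags
pred_user = np.array([4.2, -0.3, 5.1, 3.3, 0.8, 4.9])
rated = np.array([0, 0, 1, 0, 0, 1])  # 1 = already rated by the user

def top_k_unrated(pred, rated_flags, k):
    """Indices of the k highest predictions among movies not yet rated."""
    # -inf pushes rated movies to the end regardless of the sign of the predictions
    masked = np.where(rated_flags == 0, pred, -np.inf)
    order = np.argsort(masked)[::-1]
    n_candidates = int(np.sum(rated_flags == 0))
    return order[:min(k, n_candidates)]

top3 = top_k_unrated(pred_user, rated, 3)
print(top3)  # indices 0, 3 and 4
```

Movies 2 and 5 have the highest raw predictions but are excluded because the user already rated them.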


