#### Deep Learning
#### Lab 6: Recommender Systems
##### Recommender system based on collaborative filtering
##### Authors:
- Roberto Rios 20979
- Javier Mombiela 20067

#### Importing libraries

In [40]:
# import libraries
import numpy as np
import pandas as pd
import tensorflow as tf
from keras.models import Model
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from keras.layers import Input, Embedding, Flatten, Dense, Concatenate

#### Loading the data

In [41]:
books = pd.read_csv('datasets/Books.csv')
users = pd.read_csv('datasets/Users.csv')
ratings = pd.read_csv('datasets/Ratings.csv')


In [42]:
books = books.drop(['Image-URL-S', 'Image-URL-M', 'Image-URL-L'], axis=1)

#### Data preprocessing

In [43]:
books['ISBN'] = books['ISBN'].astype(str)
ratings['ISBN'] = ratings['ISBN'].astype(str)

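user_enc = LabelEncoder()
# Fit on the IDs from both tables so that ratings use the same codes as users
user_enc.fit(pd.concat([users['User-ID'], ratings['User-ID']]))
users['User-ID'] = user_enc.transform(users['User-ID'])
ratings['User-ID'] = user_enc.transform(ratings['User-ID'])
n_users = len(user_enc.classes_)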

item_enc = LabelEncoder()
item_enc.fit(pd.concat([books['ISBN'], ratings['ISBN']]))
books['ISBN'] = item_enc.transform(books['ISBN'])
ratings['ISBN'] = item_enc.transform(ratings['ISBN'])
n_books = len(item_enc.classes_)
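
The encoding step matters because an `Embedding(n, d)` layer only accepts integer indices in the range `[0, n)`. As a minimal sketch of what label encoding does to raw identifiers, the toy example below uses `np.unique` on made-up ISBN strings (the values are illustrative, not taken from the dataset). `LabelEncoder` assigns its codes the same way: each value is replaced by its position in the sorted array of distinct values.

```python
import numpy as np

# Hypothetical raw identifiers (illustrative only, not from the dataset).
raw_isbns = np.array(["0440234743", "0195153448", "0440234743", "3404921038"])

# classes: sorted distinct values; codes: position of each raw value in classes.
classes, codes = np.unique(raw_isbns, return_inverse=True)

n_items = len(classes)      # number of rows an embedding table would need
decoded = classes[codes]    # inverse mapping, analogous to inverse_transform

print(codes)    # [1 0 1 2]
print(n_items)  # 3
```

Because the codes are contiguous and start at 0, `n_items` is exactly the `input_dim` that an embedding table needs.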

In [44]:
print("User-ID encodings:")
print(users['User-ID'].unique())

print("ISBN encodings:")
print(books['ISBN'].unique())

User-ID encodings:
[     0      1      2 ... 278855 278856 278857]
ISBN encodings:
[ 32170    231  10531 ...   5601  30800 192622]


We select the features and the target

In [45]:
X = ratings[['User-ID', 'ISBN']].values
y = ratings['Book-Rating'].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)

#### Building the model

In [46]:
user_input = Input(shape=[1])
user_embedding = Embedding(n_users, 5)(user_input)
user_vec = Flatten()(user_embedding)

book_input = Input(shape=[1])
book_embedding = Embedding(n_books, 5)(book_input)
book_vec = Flatten()(book_embedding)

concat = Concatenate()([user_vec, book_vec])
dense1 = Dense(128, activation='relu')(concat)
dense2 = Dense(32, activation='relu')(dense1)
output = Dense(1)(dense2)
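
To make the architecture concrete, here is a rough NumPy sketch of the forward pass these layers describe: look up one 5-dimensional vector per user and per book, concatenate them, and pass the result through two ReLU layers and a linear output. The table sizes are toy values and the weights are random stand-ins for trained parameters, so the scores themselves mean nothing.

```python
import numpy as np

rng = np.random.default_rng(0)

n_users, n_books, dim = 6, 8, 5   # toy sizes; the notebook uses the dataset sizes

# Randomly initialised parameters stand in for trained weights.
user_table = rng.normal(size=(n_users, dim))
book_table = rng.normal(size=(n_books, dim))
W1, b1 = rng.normal(size=(2 * dim, 128)), np.zeros(128)
W2, b2 = rng.normal(size=(128, 32)), np.zeros(32)
W3, b3 = rng.normal(size=(32, 1)), np.zeros(1)

def relu(x):
    return np.maximum(x, 0.0)

def forward(user_ids, book_ids):
    """Return one predicted score per (user, book) pair."""
    u = user_table[user_ids]              # embedding lookup, shape (batch, dim)
    b = book_table[book_ids]              # embedding lookup, shape (batch, dim)
    x = np.concatenate([u, b], axis=1)    # shape (batch, 2 * dim)
    h1 = relu(x @ W1 + b1)                # Dense(128, relu)
    h2 = relu(h1 @ W2 + b2)               # Dense(32, relu)
    return h2 @ W3 + b3                   # Dense(1), linear output

scores = forward(np.array([0, 3]), np.array([7, 2]))
print(scores.shape)  # (2, 1)
```

The output layer has no activation, so nothing constrains predictions to the rating scale.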

In [47]:
model = Model([user_input, book_input], output)
model.summary()

Model: "model_2"
__________________________________________________________________________________________________
 Layer (type)                Output Shape                 Param #   Connected to                  
 input_6 (InputLayer)        [(None, 1)]                  0         []                            
                                                                                                  
 input_7 (InputLayer)        [(None, 1)]                  0         []                            
                                                                                                  
 embedding_4 (Embedding)     (None, 1, 5)                 1394290   ['input_6[0][0]']             
                                                                                                  
 embedding_5 (Embedding)     (None, 1, 5)                 1708825   ['input_7[0][0]']             
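
The parameter counts in the summary can be checked by hand: an embedding table stores one 5-dimensional vector per index, so it holds `n × 5` parameters. The sketch below works through the arithmetic. Note that `n_books = 341765` is not printed anywhere in the notebook and is inferred here from the summary (1708825 / 5); the dense-layer counts assume the Keras default of one bias per unit.

```python
embedding_dim = 5
n_users = 278858   # User-ID codes run from 0 to 278857
n_books = 341765   # assumption: inferred from the summary, 1708825 / 5

user_emb_params = n_users * embedding_dim
book_emb_params = n_books * embedding_dim

# Dense layers: inputs * units weights plus one bias per unit.
dense1_params = (2 * embedding_dim) * 128 + 128
dense2_params = 128 * 32 + 32
output_params = 32 * 1 + 1

total_params = (user_emb_params + book_emb_params
                + dense1_params + dense2_params + output_params)

print(user_emb_params)  # 1394290
print(book_emb_params)  # 1708825
print(total_params)     # 3108684
```

Almost all of the parameters sit in the two embedding tables; the dense layers contribute only a few thousand.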
                                                                                            

We compile the model

In [48]:
model.compile(loss='mean_squared_error', optimizer='adam')

We train the model

In [49]:
model.fit([X_train[:,0], X_train[:,1]], y_train, batch_size=64, epochs=3, verbose=1, validation_data=([X_test[:,0], X_test[:,1]], y_test))

Epoch 1/3
Epoch 2/3
Epoch 3/3

<keras.src.callbacks.History at 0x1962831f490>
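
One hedged way to interpret a mean squared error is to compare it with a constant baseline that always predicts the mean training rating. The sketch below does this on toy arrays. All values are made up, and in the notebook `y_pred_toy` would instead come from `model.predict` on the test inputs.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(diff ** 2))

# Toy data (illustrative only).
y_train_toy = np.array([0, 5, 8, 10, 7])
y_test_toy = np.array([6, 9, 0])
y_pred_toy = np.array([5.5, 8.0, 2.0])   # stand-in for model predictions

# Baseline: always predict the mean training rating (6.0 here).
baseline_pred = np.full(len(y_test_toy), y_train_toy.mean())

model_mse = mse(y_test_toy, y_pred_toy)
baseline_mse = mse(y_test_toy, baseline_pred)

print(model_mse)     # 1.75
print(baseline_mse)  # 15.0
```

A model is only useful as a rating predictor if its test error is clearly below that of the baseline.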

#### Model predictions

In [56]:
def recommend_books(user_id, num_recommendations):
    # Get the books the user has not rated yet
    user_ratings = ratings[ratings['User-ID'] == user_id]
    # .copy() avoids pandas' SettingWithCopyWarning when the new column is added below
    unrated_books = books[~books['ISBN'].isin(user_ratings['ISBN'])].copy()

    # Build the input arrays for the model
    user_array = np.full(len(unrated_books), user_id)
    book_array = unrated_books['ISBN'].to_numpy()

    # Use the model to predict the ratings
    predictions = model.predict([user_array, book_array])

    # Add the predictions (flattened from shape (n, 1)) to the dataframe of unrated books
    unrated_books['Predicted-Rating'] = predictions.ravel()

    # Sort the books by predicted rating
    recommended_books = unrated_books.sort_values(by='Predicted-Rating', ascending=False)

    # Return only the title, author and predicted rating of the highest-rated books
    return recommended_books[['Book-Title', 'Book-Author', 'Predicted-Rating']][:num_recommendations]
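
The ranking step in the function above can be illustrated without pandas or a trained model. The following sketch is a simplified stand-in rather than the notebook's implementation: the helper name `top_k` and the toy values are hypothetical. It takes a column of predicted scores shaped `(n, 1)`, as `model.predict` returns, and keeps the `k` highest-scoring items.

```python
import numpy as np

def top_k(item_ids, scores, k):
    """Return the k (item_id, score) pairs with the highest scores."""
    scores = np.asarray(scores).ravel()   # flatten (n, 1) predictions to (n,)
    order = np.argsort(-scores)[:k]       # indices sorted by descending score
    return [(int(item_ids[i]), float(scores[i])) for i in order]

# Toy candidates and predictions (illustrative only).
candidate_ids = np.array([11, 42, 7, 93])
predicted = np.array([[6.1], [9.4], [3.0], [8.2]])

print(top_k(candidate_ids, predicted, 2))  # [(42, 9.4), (93, 8.2)]
```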

Recommendations for a specific user

In [58]:
print(recommend_books(9, 10))

                                               Book-Title  \
78867                           The Shrinking of Treehorn   
184411  Michelin THE GREEN GUIDE Quebec, 4e (THE GREEN...   
79431   The Blue Day Book: A Lesson in Cheering Yourse...   
31331                              A Kiss for Little Bear   
38292                                           The Lorax   
3028                                                 Free   
16190                                          Falling Up   
238677                          Fiction Writer's Handbook   
66613                                  M.Y.T.H. Inc. Link   
53754             A Baby...Maybe -- How To Hunt a Husband   

                         Book-Author  Predicted-Rating  
78867           Florence Parry Heide          9.678238  
184411  Michelin Travel Publications          9.446127  
79431          Bradley Trevor Greive          9.397828  
31331         Else Holmelund Minarik          9.312489  
38292                      Dr. Seuss       

