<a href="https://colab.research.google.com/github/Bhandari007/recommendation_system/blob/main/rating_prediction_using_nn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Course Rating Prediction Using Neural Network

Neural Networks can be used to extract the latent user and item features. They are very good at learning patterns from data. When training nerual networks, it gradually captures and stores the features within its hidden layers as weight matrices and can be extracted to represent the original data.

### Objectives:

* Use `tensorflow` to train neural networks to extract the user and item latent features from the hidden layers

* Predict course ratings with trained neural networks

# Packages

In [1]:
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt

from tensorflow import keras
from tensorflow.keras import layers

In [2]:
# Set random state
rs = 123

# Loading and preprocessing dataset

In [3]:
rating_url = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-ML321EN-SkillsNetwork/labs/datasets/ratings.csv"
rating_df = pd.read_csv(rating_url)
rating_df.head()

Unnamed: 0,user,item,rating
0,1889878,CC0101EN,3.0
1,1342067,CL0101EN,3.0
2,1990814,ML0120ENv3,3.0
3,380098,BD0211EN,3.0
4,779563,DS0101EN,3.0


We have three columns: `user`, `item` and `rating`

In [4]:
num_users = len(rating_df['user'].unique())
num_items = len(rating_df['item'].unique())
print(f"There are total `{num_users}` of users and `{num_items}` items")

There are total `33901` of users and `126` items


It means our each user can be represented as `33901x1` one-hot vector and each item can be represented as `126x1` one-hot vector

# Implementing the recommender neural network using tensorflow

In [26]:
class RecommenderNet(keras.Model):


  
  def __init__(self, num_user, num_items, embedding_size = 16, **kwargs):
    """
    Constructor
      :param int num_users: number of users
      :param int num_items: number of items
      :param int embedding_size: the size of embedding_vector

    """
    super(RecommenderNet, self).__init__(**kwargs)
    self.num_users = num_users
    self.num_items = num_items
    self.embedding_size = embedding_size

    self.user_embedding_layer = layers.Embedding(
        input_dim = num_users,
        output_dim = embedding_size,
        name = 'user_embedding_layer',
        embeddings_initializer = "he_normal",
        embeddings_regularizer = keras.regularizers.l2(1e-6)
    )

    self.user_bias = layers.Embedding(
        input_dim = num_users,
        output_dim = 1,
        name = 'user_bias'
    )
    # Define an item_embedding vector
        # Input dimension is the num_items
        # Output dimension is the embedding size
    self.item_embedding_layer = layers.Embedding(
            input_dim=num_items,
            output_dim=embedding_size,
            name='item_embedding_layer',
            embeddings_initializer="he_normal",
            embeddings_regularizer=keras.regularizers.l2(1e-6),
        )
        # Define an item bias layer
    self.item_bias = layers.Embedding(
            input_dim=num_items,
            output_dim=1,
            name="item_bias")
  def call(self, inputs):
    """
    method to be called during model fitting        
    :param inputs: user and item one-hot vectors
    """
    # Compute the user embedding vector
    user_vector = self.user_embedding_layer(inputs[:, 0])
    user_bias = self.user_bias(inputs[:, 0])
    item_vector = self.item_embedding_layer(inputs[:, 1])
    item_bias = self.item_bias(inputs[:, 1])
    dot_user_item = tf.tensordot(user_vector, item_vector, 2)
    # Add all the components (including bias)
    x = dot_user_item + user_bias + item_bias
    # Sigmoid output layer to output the probability
    return tf.nn.relu(x)
        



## Train and evaluate the RecommendeNet()

### Creating one-hot encoding vectors

In [6]:
def process_dataset(raw_data):

  encoded_data = raw_data.copy()

  # Map use ids to indices
  user_list = encoded_data['user'].unique().tolist()
  user_id2idx_dict = {x: i for i, x in enumerate(user_list)}
  user_idx2id_dict = {i: x for i, x in enumerate(user_list)}

  # Map course ids to indices
  course_list = encoded_data['item'].unique().tolist()
  course_id2idx_dict = {x: i for i, x in enumerate(course_list)}
  course_idx2id_dict = {i: x for i, x in enumerate(course_list)}

  # Convert original user ids to idx
  encoded_data["user"] = encoded_data["user"].map(user_id2idx_dict)
  # Conert original course ids to idx
  encoded_data["item"] = encoded_data["item"].map(course_id2idx_dict)
  # Convert rating into in
  encoded_data["rating"] = encoded_data["rating"].values.astype("int")

  return encoded_data, user_idx2id_dict, course_idx2id_dict



In [7]:
encoded_data, user_idx2id_dict, course_idx2id_dict = process_dataset(rating_df)

In [8]:
encoded_data.head()

Unnamed: 0,user,item,rating
0,0,0,3
1,1,1,3
2,2,2,3
3,3,3,3
4,4,4,3


### Split into train and test dataset

In [9]:
def generate_train_test_datasets(dataset, scale = True):
  
  min_rating = min(dataset["rating"])
  max_rating = max(dataset["rating"])

  dataset = dataset.sample(frac = 1, random_state = 42)
  x = dataset[["user", "item"]].values
  if scale:
    y = dataset["rating"].apply(lambda x: (x- min_rating)/ (max_rating - min_rating))
  else:
    y = dataset["rating"].values

  # Assume training on 80% of the data and validation on 10% and testing 10%

  train_indices = int(0.8 * dataset.shape[0])

  test_indices = int(0.9 * dataset.shape[0])

  x_train, x_val, x_test, y_train, y_val, y_test = (
       x[:train_indices],
        x[train_indices:test_indices],
        x[test_indices:],
        y[:train_indices],
        y[train_indices:test_indices],
        y[test_indices:],
  )

  return x_train, x_val, x_test, y_train, y_val, y_test
    

In [10]:
x_train, x_val, x_test, y_train, y_val, y_test = generate_train_test_datasets(encoded_data)

In [11]:
user_indices = x_train[:, 0]
user_indices

array([ 8376,  7659, 10717, ...,  3409, 28761,  4973])

In [12]:
item_indices = x_train[:, 1]
item_indices

array([12, 29,  3, ..., 18, 19, 17])

In [13]:
y_train

173735    1.0
20280     1.0
201508    1.0
170108    1.0
187957    1.0
         ... 
107376    1.0
116707    1.0
162675    1.0
105408    0.0
41819     1.0
Name: rating, Length: 186644, dtype: float64

In [28]:
embedding_size = 16
model = RecommenderNet(num_users, num_items,embedding_size)

### Train the model

In [29]:
model.compile(
    metrics=[tf.keras.metrics.RootMeanSquaredError()], 
    optimizer = "adam", 
    loss = tf.keras.losses.MeanSquaredError())

In [33]:
history = model.fit(
    x_train,
    y_train,
    epochs = 5,
    validation_data = (x_val, y_val)
)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [35]:
model.evaluate(x_test,y_test)



[0.01575668901205063, 0.12186067551374435]

# Extract the user and item embedding vectors as latent feature vectors

Now that we have trained `RecommenderNet()`, we can extract the latent feature vectors.

In [36]:
model.summary()

Model: "recommender_net_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 user_embedding_layer (Embed  multiple                 542416    
 ding)                                                           
                                                                 
 user_bias (Embedding)       multiple                  33901     
                                                                 
 item_embedding_layer (Embed  multiple                 2016      
 ding)                                                           
                                                                 
 item_bias (Embedding)       multiple                  126       
                                                                 
Total params: 578,459
Trainable params: 578,459
Non-trainable params: 0
_________________________________________________________________


In [38]:
# User features
user_latent_features = model.get_layer('user_embedding_layer').get_weights()[0]
user_latent_features.shape

(33901, 16)

In [39]:
user_latent_features[0]

array([-0.05660417, -0.00197861,  0.15610763, -0.05782869,  0.02075379,
       -0.02110894,  0.12818082,  0.01095002,  0.00592644,  0.05365551,
        0.14405552, -0.07637911, -0.06294359, -0.16191413, -0.08112145,
        0.0003381 ], dtype=float32)

In [40]:
item_latent_features = model.get_layer('item_embedding_layer').get_weights()[0]
item_latent_features.shape

(126, 16)

In [41]:
item_latent_features[0]

array([ 3.2357245e-03, -3.3233676e-03,  8.2252305e-03, -8.5670175e-03,
       -7.2578909e-03, -6.0052699e-03, -1.8494405e-02, -7.9738414e-03,
        7.8390762e-03,  3.3829689e-02,  8.1926296e-03, -1.3557538e-02,
       -3.4701516e-05,  2.4580786e-02,  2.0499341e-03, -3.3527457e-03],
      dtype=float32)

As we can see each user of the total 33901 users has been transformed into 16x1 latent feature and each item of the total 126 has been transformed into a 16x1 latent feature vector