## Introduction
This project demonstrates the implementation of a Neural Collaborative Filtering (NCF) model for a movie recommendation system using the MovieLens dataset. Collaborative filtering is a popular method in recommendation systems that makes predictions about user preferences based on the preferences of other users. By integrating neural network architectures, NCF models can capture the complex non-linear relationships between users and items (movies, in this case), leading to more accurate and personalized recommendations.

The purpose of this project is to:

- Introduce the concept and implementation of Neural Collaborative Filtering.
- Demonstrate how to preprocess data for a recommendation system.
- Showcase the building, training, and evaluation of an NCF model using TensorFlow's Keras API.

In [34]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import to_categorical
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Embedding, Flatten, Dense, Concatenate, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping
from sklearn.metrics import mean_squared_error

In [21]:
# Load the dataset
# MovieLens "latest small" ratings file (roughly 100,000 ratings)

df = pd.read_csv('ml-latest-small/ratings.csv')

# Encode user and movie IDs to create dense indices
user_encoder = LabelEncoder()
df['userId'] = user_encoder.fit_transform(df['userId'])
movie_encoder = LabelEncoder()
df['movieId'] = movie_encoder.fit_transform(df['movieId'])

# Define the number of unique users and movies
num_users = df['userId'].nunique()
num_movies = df['movieId'].nunique()

# Prepare training and testing datasets
X = df[['userId', 'movieId']].values

# Target scaling: the output layer uses a sigmoid, which produces values in (0, 1).
# Dividing the ratings by the maximum rating (5.0) puts the targets on that same scale,
# which also tends to make training more stable. Multiply predictions by 5.0 to get ratings back.
y = df['rating'].values / 5.0  # Scale ratings to (0, 1]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
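
To make the preprocessing concrete, here is a minimal sketch of the dense re-indexing and the rating scaling on a handful of made-up rows. The IDs and ratings are illustrative assumptions, not values from the dataset. `np.unique(..., return_inverse=True)` builds the same sorted-unique mapping that `LabelEncoder` produces for integer IDs.

```python
import numpy as np

# Hypothetical raw movie IDs with gaps, as is typical for MovieLens-style data
raw_movie_ids = np.array([1, 50, 1, 3000, 50])
ratings = np.array([4.0, 0.5, 5.0, 3.5, 2.0])

# Sorted unique IDs become positions 0..n-1; dense_ids is the encoded column
classes, dense_ids = np.unique(raw_movie_ids, return_inverse=True)
num_movies = len(classes)  # the value to pass as Embedding(input_dim=...)

# Scale ratings into (0, 1] so they fall inside the range of a sigmoid output
scaled = ratings / 5.0

# Both steps are invertible, which is needed to report predictions as ratings
recovered_ids = classes[dense_ids]
recovered_ratings = scaled * 5.0

print(dense_ids)   # [0 1 0 2 1]
print(num_movies)  # 3
```

Without the re-indexing, the embedding table would need one row for every ID up to the largest raw ID, most of which would never be used.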


## Defining the Neural Collaborative Filtering Model

In this step, we will define our Neural Collaborative Filtering model using Keras. The model will have the following components:
- **Embedding Layers**: To convert user and movie IDs into dense vectors of fixed size.
- **Flatten Layers**: To convert the embeddings into a format suitable for input to the dense layers.
- **Concatenate Layer**: To merge the user and movie embeddings.
- **Dense Layers**: A series of dense layers that learn to predict user ratings from the concatenated embeddings.
- **Output Layer**: A single neuron with a sigmoid activation function to predict the scaled rating.
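
The components listed above can be sketched as a plain NumPy forward pass. This is only an illustration with assumed, tiny sizes and random weights; the names `user_table`, `movie_table`, `W1`, `W2`, and `forward` are hypothetical, and this is not the Keras model itself. Indexing with 1-D ID arrays already yields 2-D vectors, so no separate flatten step is needed here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen small for illustration
num_users, num_movies, embedding_size, hidden_units = 4, 6, 3, 5

# Random parameters standing in for trained weights
user_table = rng.normal(size=(num_users, embedding_size))
movie_table = rng.normal(size=(num_movies, embedding_size))
W1 = rng.normal(size=(2 * embedding_size, hidden_units))
b1 = np.zeros(hidden_units)
W2 = rng.normal(size=(hidden_units, 1))
b2 = np.zeros(1)

def forward(user_ids, movie_ids):
    """Embedding lookup -> concatenate -> ReLU dense layer -> sigmoid output."""
    user_vec = user_table[user_ids]              # (batch, embedding_size)
    movie_vec = movie_table[movie_ids]           # (batch, embedding_size)
    concat = np.concatenate([user_vec, movie_vec], axis=1)
    hidden = np.maximum(0.0, concat @ W1 + b1)   # ReLU
    logits = hidden @ W2 + b2
    return 1.0 / (1.0 + np.exp(-logits))         # sigmoid keeps outputs in (0, 1)

preds = forward(np.array([0, 3]), np.array([5, 2]))
print(preds.shape)  # (2, 1)
```

Multiplying `preds` by 5.0 would map the outputs back to the original rating scale.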


In [32]:
# Model architecture
embedding_size = 50

# User and Movie Input layers
user_input = Input(shape=(1,), name='user_input')
movie_input = Input(shape=(1,), name='movie_input')

# Embedding layers
user_embedding = Embedding(input_dim=num_users, output_dim=embedding_size, name='user_embedding')(user_input)
movie_embedding = Embedding(input_dim=num_movies, output_dim=embedding_size, name='movie_embedding')(movie_input)

# Flatten the embeddings
user_vec = Flatten(name='flatten_users')(user_embedding)
movie_vec = Flatten(name='flatten_movies')(movie_embedding)

# Concatenate the flattened embeddings
concat = Concatenate()([user_vec, movie_vec])

# Dense layers
dense_1 = Dense(128, activation='relu')(concat)
output = Dense(1, activation='sigmoid')(dense_1)  # Predicts the scaled rating

# Compile the model
model = Model(inputs=[user_input, movie_input], outputs=output)
model.compile(optimizer='adam', loss='mean_squared_error')


## Training the Model
We use mean squared error as the loss function and Adam as the optimizer. The model will be trained for a predefined number of epochs, and we will also use a validation split to monitor its performance on unseen data during training.
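
The effect of the two splits on the amount of training data can be worked out with a little arithmetic. The sketch below assumes that `ratings.csv` contains 100,836 rows (the row count is an assumption, not something printed in this notebook) and that the splits round the way scikit-learn and Keras are understood to: the test share is rounded up, and `validation_split` holds out the last fraction of the training arrays before shuffling.

```python
import math

n_ratings = 100_836        # assumption: number of rows in ratings.csv
test_size = 0.2
validation_split = 0.2

n_test = math.ceil(n_ratings * test_size)                    # held out by train_test_split
n_train_full = n_ratings - n_test                            # passed to model.fit
n_fit = math.floor(n_train_full * (1 - validation_split))    # used for gradient updates
n_val = n_train_full - n_fit                                 # used only for val_loss

def steps(n_samples, batch_size):
    """Number of batches needed to cover n_samples once."""
    return math.ceil(n_samples / batch_size)

print(n_test, n_train_full, n_fit, n_val)  # 20168 80668 64534 16134
print(steps(n_fit, 64))                    # 1009
print(steps(n_fit, 32))                    # 2017
print(steps(n_test, 32))                   # 631
```

Under these assumptions the step counts agree with the progress bars in the training and evaluation output (1009, 2017, and 631), and only about 64% of all ratings are used to update the weights.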


In [23]:
history = model.fit([X_train[:, 0], X_train[:, 1]], y_train, batch_size=64, epochs=5, validation_split=0.2)

Epoch 1/5
1009/1009 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - loss: 0.0396 - val_loss: 0.0311
Epoch 2/5
1009/1009 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - loss: 0.0272 - val_loss: 0.0308
Epoch 3/5
1009/1009 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - loss: 0.0248 - val_loss: 0.0309
Epoch 4/5
1009/1009 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - loss: 0.0227 - val_loss: 0.0316
Epoch 5/5
1009/1009 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - loss: 0.0210 - val_loss: 0.0321


## Observations
Decreasing Training Loss: The training loss falls with each epoch, from 0.0396 in the first epoch to 0.0210 in the fifth. This indicates that the model is fitting the training data progressively better.

Validation Loss Behavior: The validation loss reaches its minimum at the second epoch (0.0308) and then rises slightly in epochs 3, 4, and 5, ending at 0.0321. A validation loss that climbs while the training loss keeps falling is an early sign of overfitting: the model improves on the training data but gets slightly worse on unseen data (the validation set).
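
The same observations can be read off programmatically. In a live session the per-epoch values are available as `history.history`; the sketch below uses a plain dictionary (named `history_dict` here for illustration) filled with the numbers from the log above.

```python
# Loss values copied from the training log above
history_dict = {
    "loss": [0.0396, 0.0272, 0.0248, 0.0227, 0.0210],
    "val_loss": [0.0311, 0.0308, 0.0309, 0.0316, 0.0321],
}

val_loss = history_dict["val_loss"]

# Epoch (1-based) with the lowest validation loss
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

# Gap between validation and training loss; a widening gap suggests overfitting
gaps = [v - t for t, v in zip(history_dict["loss"], val_loss)]
gap_is_widening = all(later > earlier for earlier, later in zip(gaps, gaps[1:]))

print(best_epoch)       # 2
print(gap_is_widening)  # True
```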

In [24]:
early_stopping = EarlyStopping(monitor='val_loss', patience=2, restore_best_weights=True)

history = model.fit(
    [X_train[:, 0], X_train[:, 1]], y_train,
    batch_size=64,
    epochs=5,
    validation_split=0.2,
    callbacks=[early_stopping]
)


Epoch 1/5
1009/1009 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - loss: 0.0188 - val_loss: 0.0326
Epoch 2/5
1009/1009 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - loss: 0.0168 - val_loss: 0.0343
Epoch 3/5
1009/1009 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - loss: 0.0148 - val_loss: 0.0352


## Early Stopping Effectiveness
The early stopping callback functioned as intended. Because the model was not re-created before this cell, training resumed from the weights of the previous run, and the validation loss rose in every epoch (0.0326, 0.0343, 0.0352). With `patience=2`, the callback halted training after the third epoch, once two consecutive epochs had failed to improve on the first, and `restore_best_weights=True` rolled the model back to the weights from the first epoch of this run. This prevented the further overfitting that additional passes over the training data would likely have caused.
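
The stopping rule can be illustrated with a few lines of plain Python. This is a simplified re-implementation of patience-based stopping for illustration only, not the Keras source; the function name `simulate_early_stopping` is hypothetical.

```python
def simulate_early_stopping(val_losses, patience):
    """Return (stopped_after_epoch, best_epoch), both 1-based.

    Training stops once `patience` consecutive epochs fail to improve
    on the best validation loss seen so far.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch

# Validation losses copied from the early-stopping run above
stopped, best = simulate_early_stopping([0.0326, 0.0343, 0.0352], patience=2)
print(stopped, best)  # 3 1
```

Applied to the validation losses of the first training run (0.0311, 0.0308, 0.0309, 0.0316, 0.0321), the same rule would stop after epoch 4 and keep the weights from epoch 2.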

In [25]:
# Recompile with an explicit Adam optimizer. Note that 0.001 is Adam's default
# learning rate, so try other values to see an effect. Recompiling does not reset
# the model weights, so training continues from the previous run.
optimizer = Adam(learning_rate=0.001)
model.compile(optimizer=optimizer, loss='mean_squared_error')

# Train with a smaller batch size (32 instead of 64)
history = model.fit([X_train[:, 0], X_train[:, 1]], y_train, batch_size=32, epochs=5, validation_split=0.2)


Epoch 1/5
2017/2017 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - loss: 0.0173 - val_loss: 0.0339
Epoch 2/5
2017/2017 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - loss: 0.0154 - val_loss: 0.0352
Epoch 3/5
2017/2017 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - loss: 0.0135 - val_loss: 0.0370
Epoch 4/5
2017/2017 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - loss: 0.0116 - val_loss: 0.0370
Epoch 5/5
2017/2017 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - loss: 0.0105 - val_loss: 0.0382


In [19]:

# Rebuild the model from scratch (fresh weights) with the same embedding size and input layers
embedding_size = 50
user_input = Input(shape=(1,), name='user_input')
movie_input = Input(shape=(1,), name='movie_input')

# Embedding layers
user_embedding = Embedding(input_dim=num_users, output_dim=embedding_size, name='user_embedding')(user_input)
movie_embedding = Embedding(input_dim=num_movies, output_dim=embedding_size, name='movie_embedding')(movie_input)

# Flatten the embeddings
user_vec = Flatten(name='flatten_users')(user_embedding)
movie_vec = Flatten(name='flatten_movies')(movie_embedding)

# Concatenate the flattened embeddings
concat = Concatenate()([user_vec, movie_vec])

# Adding dropout after concatenation
concat_dropout = Dropout(0.5)(concat)  # Dropout layer with a 50% dropout rate

# Dense layer after dropout
dense_1 = Dense(64, activation='relu')(concat_dropout)  # 64 units, down from 128 in the first model

# Second dropout layer (40% rate) before the output
dense_dropout = Dropout(0.4)(dense_1)

# Output layer
output = Dense(1, activation='sigmoid')(dense_dropout)

# Compile the model
model = Model(inputs=[user_input, movie_input], outputs=output)
optimizer = Adam(learning_rate=0.001)  # Adjust learning rate as needed
model.compile(optimizer=optimizer, loss='mean_squared_error')




early_stopping = EarlyStopping(monitor='val_loss', patience=2, restore_best_weights=True)

# Train with early stopping enabled
history = model.fit(
    [X_train[:, 0], X_train[:, 1]], y_train,
    batch_size=32,
    epochs=5,  # Same epoch count as the earlier runs; raise it to give early stopping room to act
    validation_split=0.2,
    callbacks=[early_stopping]
)


Epoch 1/5
2017/2017 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - loss: 0.0405 - val_loss: 0.0317
Epoch 2/5
2017/2017 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - loss: 0.0295 - val_loss: 0.0309
Epoch 3/5
2017/2017 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - loss: 0.0279 - val_loss: 0.0307
Epoch 4/5
2017/2017 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - loss: 0.0267 - val_loss: 0.0307
Epoch 5/5
2017/2017 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - loss: 0.0261 - val_loss: 0.0308


## Evaluating the Model Again
After retraining the model with these adjustments, we evaluate it on the test set to check whether the changes reduced overfitting and improved generalization.
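
One detail matters when moving from training to evaluation: `Dropout` layers are only active during training. Below is a minimal NumPy sketch of the usual "inverted dropout" scheme, which matches how Keras documents its `Dropout` layer (kept units are scaled by 1 / (1 - rate)). The function name `dropout` and the array sizes are illustrative assumptions.

```python
import numpy as np

def dropout(x, rate, training, rng):
    """Inverted dropout: while training, zero a random fraction `rate` of units
    and scale the survivors by 1 / (1 - rate); at inference, return x unchanged."""
    if not training or rate == 0.0:
        return x
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
activations = np.ones((1000, 64))

train_out = dropout(activations, rate=0.5, training=True, rng=rng)
eval_out = dropout(activations, rate=0.5, training=False, rng=rng)

print(np.array_equal(eval_out, activations))  # True
print(sorted(set(train_out.ravel().tolist())))  # [0.0, 2.0]
```

The rescaling keeps the expected activation the same in both modes, which is why the training loss of a model with dropout can sit above its validation loss without indicating a problem.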

In [26]:
test_loss = model.evaluate([X_test[:, 0], X_test[:, 1]], y_test)
print(f'Test Loss: {test_loss}')


631/631 ━━━━━━━━━━━━━━━━━━━━ 0s 236us/step - loss: 0.0375
Test Loss: 0.037041060626506805


## Model Performance

In [28]:
predictions = model.predict([X_test[:, 0], X_test[:, 1]])
# Calculate RMSE
rmse = np.sqrt(mean_squared_error(y_test, predictions))
print(f"Test RMSE: {rmse}")


631/631 ━━━━━━━━━━━━━━━━━━━━ 0s 398us/step
Test RMSE: 0.19242336325658457


## Interpretation
An RMSE of 0.1924 is measured on the normalized scale, since the ratings were divided by 5 before training. Multiplying by 5 converts it back to the original scale: a typical prediction error of about 0.96 rating points (RMSE weights large errors more heavily than a plain average would). MovieLens ratings in this dataset run from 0.5 to 5 in half-star steps, so the error is close to one full star. Whether that is acceptable depends on the precision your application requires.

RMSE is particularly useful for understanding the size of the error in the same units as the predicted quantity, making it easier to interpret in the context of the problem domain.
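
The conversion between scales is simple arithmetic, sketched below with the numbers reported above. Because the targets were scaled by a constant, the RMSE scales by the same constant. The variable names are illustrative.

```python
import math

test_mse_scaled = 0.037041060626506805  # "Test Loss" reported by model.evaluate
rmse_scaled = 0.19242336325658457       # "Test RMSE" reported above
rating_scale = 5.0                      # ratings were divided by 5.0 before training

# RMSE on the original rating scale
rmse_original = rmse_scaled * rating_scale

# The RMSE is, up to small numerical differences, the square root of the MSE loss
consistent = math.isclose(rmse_scaled, math.sqrt(test_mse_scaled), abs_tol=1e-3)

print(round(rmse_original, 2))  # 0.96
print(consistent)               # True
```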

## Conclusion
This project demonstrated the construction and evaluation of a Neural Collaborative Filtering (NCF) model on the MovieLens dataset, reaching a test RMSE of approximately 0.1924 on the normalized scale (about 0.96 on the original rating scale). This suggests that the model captures user preferences to a useful degree, although an error of about one rating point leaves room for improvement. In the training logs, the model with dropout kept its validation loss stable across epochs (near 0.031), whereas the model without dropout began to overfit after the second epoch, and early stopping halted training once the validation loss stopped improving.

To further enhance the model, several strategies can be considered. First, tuning the hyperparameters, including the learning rate, embedding size, and dropout rates, could improve accuracy. Exploring different architectures or adding layers could also help capture more complex user-item interactions. Moreover, incorporating additional data, such as movie genres or user demographics, might enrich the model's understanding and lead to more personalized recommendations.
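
As a starting point for hyperparameter tuning, the candidate settings can be enumerated as a small grid. The values below are hypothetical examples and were not tested in this project. Each resulting configuration would be used to build and train a fresh model, with the best validation loss deciding between them.

```python
from itertools import product

# Hypothetical candidate values; none of these were evaluated in this project
search_space = {
    "learning_rate": [1e-4, 1e-3],
    "embedding_size": [16, 50],
    "dropout_rate": [0.2, 0.5],
}

# One dictionary per combination of values
names = list(search_space)
configs = [dict(zip(names, values)) for values in product(*search_space.values())]

print(len(configs))  # 8
print(configs[0])    # {'learning_rate': 0.0001, 'embedding_size': 16, 'dropout_rate': 0.2}
```

A full grid grows multiplicatively with each added hyperparameter, so random sampling from the same space is a common alternative when the grid becomes large.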

Overall, while the model has shown promising results, optimizing a recommendation system is an iterative process that requires continuous experimentation and adaptation. The insights gained from this project lay a solid foundation for future enhancements toward a more accurate and user-centric recommendation system.