# **Collaborative Filtering-based Recommendation System**

This notebook demonstrates a collaborative filtering recommendation system implemented with the SVD (Singular Value Decomposition) algorithm from the Surprise library. The model relies only on user-item interactions (ratings) to learn latent factors that represent user preferences and item characteristics.
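
To make the idea of latent factors concrete, here is a minimal sketch of biased matrix factorization, the form that Surprise's documentation describes for `SVD`: a rating is estimated as `mu + b_u + b_i + dot(q_i, p_u)` and the parameters are fitted with stochastic gradient descent. This is not Surprise's implementation. The toy ratings, the helper names `train_mf` and `predict`, and the hyperparameter values are all assumptions chosen for illustration.

```python
import math
import random

def train_mf(ratings, n_factors=2, n_epochs=200, lr=0.01, reg=0.02, seed=0):
    """Fit a biased matrix-factorization model with plain SGD (illustrative only)."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    bu = {u: 0.0 for u in users}                       # user biases
    bi = {i: 0.0 for i in items}                       # item biases
    # Latent factor vectors, initialised with small random values
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(n_epochs):
        for u, i, r in ratings:
            est = mu + bu[u] + bi[i] + sum(a * b for a, b in zip(p[u], q[i]))
            err = r - est
            # Gradient steps with L2 regularisation
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                puf, qif = p[u][f], q[i][f]
                p[u][f] += lr * (err * qif - reg * puf)
                q[i][f] += lr * (err * puf - reg * qif)
    return mu, bu, bi, p, q

def predict(model, u, i):
    """Estimate a rating; unknown users or items fall back to the known terms."""
    mu, bu, bi, p, q = model
    dot = sum(a * b for a, b in zip(p.get(u, []), q.get(i, [])))
    return mu + bu.get(u, 0.0) + bi.get(i, 0.0) + dot

def rmse_on(ratings, estimate):
    """Root-mean-square error of estimate(u, i) against the true ratings."""
    return math.sqrt(sum((r - estimate(u, i)) ** 2 for u, i, r in ratings) / len(ratings))

# Hypothetical toy data: (user, item, rating)
toy_ratings = [
    ("alice", "m1", 5), ("alice", "m2", 4),
    ("bob", "m1", 4), ("bob", "m3", 2),
    ("carol", "m2", 5), ("carol", "m3", 1),
    ("dave", "m1", 5), ("dave", "m3", 2),
]

model = train_mf(toy_ratings)
global_mean = model[0]
baseline_rmse = rmse_on(toy_ratings, lambda u, i: global_mean)  # always predict the mean
train_rmse = rmse_on(toy_ratings, lambda u, i: predict(model, u, i))
print(f"baseline RMSE: {baseline_rmse:.4f}, fitted RMSE: {train_rmse:.4f}")
```

The fitted model should beat the always-predict-the-mean baseline on its own training data, which is the least one would expect; the held-out evaluation further down in the notebook is the meaningful test.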


In [4]:
!pip install numpy==1.23.5

Collecting numpy==1.23.5
  Downloading numpy-1.23.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.3 kB)
Downloading numpy-1.23.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
Installing collected packages: numpy
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
jaxlib 0.5.1 requires numpy>=1.25, but you have numpy 1.23.5 which is incompatible.
jax 0.5.2 requires numpy>=1.25, but you have numpy 1.23.5 which is incompatible.
xarray 2025.3.1 requires numpy>=1.24, but you have numpy 1.23.5 which is incompatible.
tensorflow 2.18.0 requires numpy<2.1.0,>=1.26.0, but you have numpy 1.23.5 which is incompatible.
bigframes 2.4.0 requires numpy>=1.24.0, but you have numpy 1.23.5 which is incompatib

In [1]:
!pip install scikit-surprise

Collecting scikit-surprise
  Using cached scikit_surprise-1.1.4-cp311-cp311-linux_x86_64.whl
Installing collected packages: scikit-surprise
Successfully installed scikit-surprise-1.1.4


In [2]:
from surprise import SVD, Dataset, Reader
from surprise.model_selection import cross_validate, train_test_split
import pandas as pd

# Load built-in MovieLens dataset (100k)
data = Dataset.load_builtin('ml-100k')

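# Hold out 25% of the ratings for testing.
# No random_state is set, so the split (and the metrics below) vary between runs.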
trainset, testset = train_test_split(data, test_size=0.25)

# Build and train the SVD model with Surprise's default hyperparameters
algo = SVD()
algo.fit(trainset)

# Predict on test set
predictions = algo.test(testset)

# Evaluate with RMSE and MAE
from surprise.accuracy import rmse, mae
print("🔍 Evaluation Metrics:")
rmse(predictions)
mae(predictions)

# Rank each user's test-set predictions and keep the top N.
# Note: these items come from the test set, so the user has already rated them.
from collections import defaultdict

def get_top_n(predictions, n=5):
    '''Return the top-N recommendations for each user from a set of predictions.'''
    top_n = defaultdict(list)
    for uid, iid, true_r, est, _ in predictions:
        top_n[uid].append((iid, est))
    for uid, user_ratings in top_n.items():
        user_ratings.sort(key=lambda x: x[1], reverse=True)
        top_n[uid] = user_ratings[:n]
    return top_n

top_n = get_top_n(predictions, n=5)

print("\n🎬 Top-5 Recommendations for Sample Users:")
for uid, user_ratings in list(top_n.items())[:3]:
    print(f"User {uid}: {[iid for (iid, _) in user_ratings]}")

# Convert predictions to DataFrame for better viewing
def predictions_to_df(predictions):
    return pd.DataFrame([(pred.uid, pred.iid, pred.est) for pred in predictions],
                        columns=['UserID', 'ItemID', 'EstimatedRating'])

pred_df = predictions_to_df(predictions)
display(pred_df.head())


Dataset ml-100k could not be found. Do you want to download it? [Y/n] y
Trying to download dataset from https://files.grouplens.org/datasets/movielens/ml-100k.zip...
Done! Dataset ml-100k has been saved to /root/.surprise_data/ml-100k
🔍 Evaluation Metrics:
RMSE: 0.9449
MAE:  0.7449

🎬 Top-5 Recommendations for Sample Users:
User 566: ['56', '166', '23', '651', '467']
User 583: ['265', '209', '655', '524', '663']
User 248: ['168', '185', '153', '96', '187']
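
The lists above are ranked from test-set predictions, so every listed item is one the user has already rated. A recommender would normally exclude seen items before ranking. The sketch below shows that filtering step in plain Python; the helper name `top_n_unseen` and the toy scores are hypothetical and are not taken from this notebook's output.

```python
from collections import defaultdict

def top_n_unseen(scores, seen, n=5):
    """Return the n highest-scoring items per user, skipping already-rated pairs.

    scores: iterable of (uid, iid, est) tuples
    seen:   set of (uid, iid) pairs the user has already rated
    """
    per_user = defaultdict(list)
    for uid, iid, est in scores:
        if (uid, iid) not in seen:
            per_user[uid].append((iid, est))
    return {
        uid: sorted(items, key=lambda x: x[1], reverse=True)[:n]
        for uid, items in per_user.items()
    }

# Hypothetical estimated ratings and already-rated pairs
toy_scores = [("u1", "a", 4.5), ("u1", "b", 3.0), ("u1", "c", 4.8),
              ("u2", "a", 2.0), ("u2", "b", 4.1)]
toy_seen = {("u1", "c"), ("u2", "a")}

toy_top = top_n_unseen(toy_scores, toy_seen, n=2)
print(toy_top)
```

With Surprise itself, one common way to obtain scores for unseen items is to predict on `trainset.build_anti_testset()`, which enumerates the user-item pairs absent from the training set. On a full dataset this produces a large number of pairs, so it is best treated as an option to evaluate rather than a drop-in step.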


|   | UserID | ItemID | EstimatedRating |
|---|--------|--------|-----------------|
| 0 | 566 | 7 | 3.913126 |
| 1 | 583 | 524 | 4.3932 |
| 2 | 248 | 121 | 3.194654 |
| 3 | 495 | 184 | 3.871482 |
| 4 | 934 | 805 | 3.56376 |
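
The RMSE and MAE reported in the evaluation output summarize how far the estimated ratings fall from the true test-set ratings. As a rough illustration of what the two numbers measure, the sketch below computes both from (true rating, estimated rating) pairs. The sample pairs and the helper names `rmse_from_pairs` and `mae_from_pairs` are hypothetical; they are not Surprise functions and the values are not from this notebook's run.

```python
import math

def rmse_from_pairs(pairs):
    """Root-mean-square error: penalises large misses more heavily."""
    return math.sqrt(sum((true - est) ** 2 for true, est in pairs) / len(pairs))

def mae_from_pairs(pairs):
    """Mean absolute error: the average size of a miss, in rating units."""
    return sum(abs(true - est) for true, est in pairs) / len(pairs)

# Hypothetical (true rating, estimated rating) pairs
sample_pairs = [(4.0, 3.5), (3.0, 3.5), (5.0, 4.0), (2.0, 3.0)]

sample_rmse = rmse_from_pairs(sample_pairs)
sample_mae = mae_from_pairs(sample_pairs)
print(f"RMSE: {sample_rmse:.4f}")
print(f"MAE:  {sample_mae:.4f}")
```

For Surprise predictions, the corresponding pairs would be `(pred.r_ui, pred.est)` for each prediction. RMSE is never smaller than MAE on the same data, which is consistent with the 0.9449 and 0.7449 figures reported for this run.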
