# **Movie Recommendation System**

The Movie Recommendation System implements Collaborative Filtering, a popular technique for predicting user preferences in movies. It suggests movies to users based on the preferences of other users who have similar tastes. The system utilizes the MovieLens 100k dataset and employs Singular Value Decomposition (SVD) as the underlying model.

The code starts by installing the required libraries and loading the dataset. It then creates the SVD model and performs cross-validation to evaluate its performance. After training the model on the entire dataset, it generates predictions for user-movie pairs that are not in the training set.

To provide personalized movie recommendations, the code defines a function called `get_top_n` that identifies the top movies with the highest predicted ratings for each user. The top three recommendations are then displayed for every user.

In summary, the Movie Recommendation System uses Collaborative Filtering and SVD to offer personalized movie suggestions to users, enhancing their movie-watching experience by introducing them to potentially interesting films they might not have discovered otherwise.







### **Installing Libraries:**
The code begins by installing the scikit-surprise library, which is used for building recommendation systems, and then imports the classes and functions it needs.

In [15]:
%pip install scikit-surprise



In [16]:
from surprise import Reader, Dataset, SVD
from surprise.accuracy import rmse, mae
from surprise.model_selection import cross_validate

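### **Loading the Dataset:**
The MovieLens 100k dataset is loaded using `Dataset.load_builtin('ml-100k')`, Surprise's built-in loader, which offers to download the data if it is not already available locally.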

In [17]:
data=Dataset.load_builtin('ml-100k')

In [18]:
data

<surprise.dataset.DatasetAutoFolds at 0x7d04a0281ff0>

### **Creating the Model:**
The model is initialized using Singular Value Decomposition (SVD), a matrix factorization technique commonly used in recommendation systems.
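
For reference, the Surprise documentation describes its SVD algorithm as predicting a rating from a global mean, a user bias, an item bias, and the dot product of two latent factor vectors. The symbols below follow that documentation and do not appear elsewhere in this notebook: $\mu$ is the mean of all training ratings, $b_u$ and $b_i$ are the user and item biases, and $p_u$ and $q_i$ are the user and item factor vectors.

$$
\hat{r}_{ui} = \mu + b_u + b_i + q_i^{\top} p_u
$$

The parameters are fitted by minimizing a regularized squared error over the known ratings, where $\lambda$ is the regularization strength:

$$
\sum_{r_{ui} \in R_{\text{train}}} \left(r_{ui} - \hat{r}_{ui}\right)^2 + \lambda \left(b_i^2 + b_u^2 + \lVert q_i \rVert^2 + \lVert p_u \rVert^2\right)
$$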

### **Cross-validation:**
Five-fold cross-validation is performed with `cross_validate` to evaluate the model, reporting Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) on each held-out fold.

In [19]:
# Initialize the SVD model with Surprise's default hyperparameters
model=SVD()
# cv=5 splits the data into 5 folds; each fold serves once as the test set
cross_validate(model, data, measures=['RMSE','MAE'], cv=5, verbose=True)

Evaluating RMSE, MAE of algorithm SVD on 5 split(s).

                  Fold 1  Fold 2  Fold 3  Fold 4  Fold 5  Mean    Std     
RMSE (testset)    0.9442  0.9372  0.9362  0.9282  0.9370  0.9366  0.0051  
MAE (testset)     0.7427  0.7399  0.7361  0.7327  0.7381  0.7379  0.0034  
Fit time          1.47    1.38    2.18    1.41    2.08    1.70    0.35    
Test time         0.22    0.13    0.38    0.15    0.25    0.23    0.09    


{'test_rmse': array([0.94417327, 0.93722977, 0.93624651, 0.92823393, 0.93701003]),
 'test_mae': array([0.7427454 , 0.73993878, 0.73608589, 0.73270861, 0.73810779]),
 'fit_time': (1.471924066543579,
  1.3773095607757568,
  2.1813876628875732,
  1.4090654850006104,
  2.0756123065948486),
 'test_time': (0.21621155738830566,
  0.1331017017364502,
  0.38264012336730957,
  0.15277338027954102,
  0.2533442974090576)}
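
As a rough illustration of what these numbers mean, the sketch below computes RMSE and MAE by hand on a few made-up rating pairs, then recomputes the Mean and Std columns from the per-fold RMSE values printed above. The helper names `rmse` and `mae` and the toy pairs are hypothetical and are not Surprise's own functions. Using a population standard deviation is an assumption that happens to agree with the table.

```python
import math
import statistics

def rmse(pairs):
    """Root mean squared error over (true, estimated) rating pairs."""
    return math.sqrt(sum((t - e) ** 2 for t, e in pairs) / len(pairs))

def mae(pairs):
    """Mean absolute error over (true, estimated) rating pairs."""
    return sum(abs(t - e) for t, e in pairs) / len(pairs)

# Hypothetical (true rating, predicted rating) pairs, for illustration only.
toy_pairs = [(4.0, 3.5), (2.0, 3.0), (5.0, 4.5), (3.0, 3.0)]
toy_rmse = rmse(toy_pairs)
toy_mae = mae(toy_pairs)

# Per-fold RMSE values copied from the cross_validate output above.
fold_rmse = [0.94417327, 0.93722977, 0.93624651, 0.92823393, 0.93701003]
mean_rmse = statistics.mean(fold_rmse)
# Population standard deviation (an assumption about how the table was computed).
std_rmse = statistics.pstdev(fold_rmse)

print(f"toy RMSE={toy_rmse:.4f}  toy MAE={toy_mae:.4f}")
print(f"mean RMSE={mean_rmse:.4f}  std={std_rmse:.4f}")
```

RMSE squares each error before averaging, so it penalizes large misses more heavily than MAE. This is why the RMSE in the table is consistently higher than the MAE.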

### **Training the Model:**
The model is trained on the entire dataset using `build_full_trainset()` and `model.fit(trainset)`.

In [20]:
trainset=data.build_full_trainset()
model.fit(trainset)

<surprise.prediction_algorithms.matrix_factorization.SVD at 0x7d04a0282350>

### **Generating Predictions:**
Predictions are generated for every user-movie pair that has no rating in the training set. `build_anti_testset()` builds that list of pairs, and `model.test()` estimates a rating for each one.

In [21]:
testset=trainset.build_anti_testset()
predictions=model.test(testset)
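
The idea behind the anti-testset can be sketched in plain Python. This is a minimal illustration on hypothetical toy ratings, not Surprise's implementation: it enumerates every user-item combination and keeps the ones with no known rating. As in Surprise's default behaviour, the global mean is used as a placeholder for the unknown true rating.

```python
from itertools import product

# Hypothetical toy ratings as (user id, item id, rating) triples.
ratings = [("u1", "i1", 4.0), ("u1", "i2", 3.0), ("u2", "i2", 5.0), ("u3", "i3", 2.0)]

users = sorted({u for u, _, _ in ratings})
items = sorted({i for _, i, _ in ratings})
rated = {(u, i) for u, i, _ in ratings}
global_mean = sum(r for _, _, r in ratings) / len(ratings)

# Every (user, item) pair with no known rating; the global mean stands in
# for the unknown true rating.
anti_testset = [
    (u, i, global_mean)
    for u, i in product(users, items)
    if (u, i) not in rated
]

print(len(anti_testset), "unrated pairs out of", len(users) * len(items))
```

The same construction grows quickly on real data. MovieLens 100k has 943 users, 1,682 movies and 100,000 ratings, so the anti-testset holds on the order of 1.5 million pairs, and predicting all of them takes noticeably longer than fitting the model.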

### **Recommendation Function:**
The `get_top_n` function is defined to find, for each user, the `n` movies with the highest predicted ratings among those the user has not yet rated.

In [22]:
from collections import defaultdict

In [23]:
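def get_top_n(predictions, n):
  """Return a dict mapping each user id to its n best (item id, estimate) pairs."""
  # First map the predictions to each user
  top_n=defaultdict(list)
  for uid, iid, true_r, est, _ in predictions:
    top_n[uid].append((iid, est))
  # Then sort each user's predictions by estimated rating and keep the top n
  for uid, user_ratings in top_n.items():
    user_ratings.sort(key=lambda x:x[1], reverse=True)
    top_n[uid]=user_ratings[:n]

  return top_n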
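
To see the shape of the result, the function can be exercised on a handful of hand-written predictions. The tuples below are hypothetical and only mimic the five fields that the loop unpacks (user id, item id, true rating, estimate, details). The function is repeated so that the snippet runs on its own.

```python
from collections import defaultdict

def get_top_n(predictions, n):
    """Return a dict mapping each user id to its n best (item id, estimate) pairs."""
    top_n = defaultdict(list)
    for uid, iid, true_r, est, _ in predictions:
        top_n[uid].append((iid, est))
    for uid, user_ratings in top_n.items():
        user_ratings.sort(key=lambda x: x[1], reverse=True)
        top_n[uid] = user_ratings[:n]
    return top_n

# Hypothetical predictions: (user id, item id, true rating, estimate, details).
toy_predictions = [
    ("A", "m1", None, 4.2, {}),
    ("A", "m2", None, 3.1, {}),
    ("A", "m3", None, 4.8, {}),
    ("B", "m1", None, 2.5, {}),
    ("B", "m4", None, 3.9, {}),
]

toy_top = get_top_n(toy_predictions, n=2)
for uid, user_ratings in toy_top.items():
    print(uid, user_ratings)
```

Each user ends up with at most `n` entries, ordered from highest to lowest estimate. A user with fewer than `n` predictions simply gets a shorter list.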

### **Recommendation Generation:**
The `get_top_n` function is applied to the predictions to recommend the top three movies with the highest predicted ratings for each user.

In [24]:
top_n=get_top_n(predictions, n=3)

### **Displaying Recommendations:**
The recommended movies are printed for each user.


In [25]:
for uid, user_ratings in top_n.items():
  print(uid, [iid for (iid, _) in user_ratings])

196 ['64', '169', '408']
186 ['515', '318', '170']
22 ['408', '169', '178']
244 ['483', '127', '98']
166 ['210', '169', '64']
298 ['64', '480', '12']
115 ['179', '135', '195']
253 ['242', '603', '963']
305 ['514', '604', '615']
6 ['654', '603', '646']
62 ['408', '169', '515']
286 ['603', '496', '197']
200 ['963', '272', '251']
210 ['511', '408', '963']
224 ['64', '316', '190']
303 ['114', '14', '178']
122 ['134', '169', '408']
194 ['480', '114', '285']
291 ['603', '127', '318']
234 ['272', '408', '919']
119 ['513', '114', '408']
167 ['208', '482', '98']
299 ['357', '528', '64']
308 ['647', '190', '474']
95 ['12', '318', '313']
38 ['483', '205', '496']
102 ['318', '64', '178']
63 ['496', '484', '127']
160 ['318', '98', '480']
50 ['169', '50', '318']
301 ['408', '272', '190']
225 ['408', '318', '169']
290 ['313', '96', '963']
97 ['318', '22', '176']
157 ['98', '174', '318']
181 ['64', '195', '22']
278 ['318', '483', '408']
276 ['483', '114', '480']
7 ['313', '493', '963']
10 ['318', '408