# Movie Recommendation System with Collaborative Filtering
In this notebook, we first install the necessary libraries and then:
1. Load and inspect the dataset.
2. Preprocess the data.
3. Split the data into training and testing sets.
4. Build and train a collaborative filtering model.
5. Evaluate the model's performance.

In [None]:
# Install necessary libraries
!pip install numpy pandas scikit-surprise

In [None]:
# Importing necessary libraries
import pandas as pd
import numpy as np
from surprise import Dataset, Reader, SVD
from surprise.model_selection import train_test_split
from surprise import accuracy

## 1. Load and inspect the dataset
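We'll use the MovieLens 100K dataset (`ml-100k`) for this example. The `u.data` file is tab-separated and has no header row, so we supply the column names ourselves.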

In [None]:
# Load the MovieLens dataset
url = 'http://files.grouplens.org/datasets/movielens/ml-100k/u.data'
column_names = ['user_id', 'item_id', 'rating', 'timestamp']
df = pd.read_csv(url, sep='\t', names=column_names)

# Display the first 5 rows of the dataframe
df.head()

## 2. Preprocess the data
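We'll preprocess the data by wrapping the DataFrame in a `surprise` dataset. `Dataset.load_from_df` expects the columns in the order user, item, rating, and the `Reader` declares the rating scale (1 to 5 for this dataset). The `timestamp` column is not used.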

In [None]:
# Preprocess the data
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(df[['user_id', 'item_id', 'rating']], reader)

# Display the first 5 rows of the columns passed to Surprise
df[['user_id', 'item_id', 'rating']].head()

## 3. Split the data into training and testing sets
We'll split the data into 80% training and 20% testing sets.

In [None]:
# Split the data into training and testing sets
trainset, testset = train_test_split(data, test_size=0.2)

# Display the number of training and testing ratings
# (all_ratings() returns a generator, so len() on it fails; use n_ratings)
trainset.n_ratings, len(testset)
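
To make the split concrete, here is a minimal sketch of a random hold-out split in plain Python. It is not Surprise's implementation; the helper `split_ratings` and the `toy_ratings` data are hypothetical names introduced only for illustration.

```python
import random

def split_ratings(ratings, test_size=0.2, seed=0):
    """Shuffle (user, item, rating) tuples and hold out a fraction for testing."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(ratings)           # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_size))
    # The first n_test shuffled ratings form the test set, the rest the training set
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical toy data: 10 users x 10 items, ratings between 1 and 5
toy_ratings = [(u, i, 1 + (u + i) % 5) for u in range(10) for i in range(10)]
train, test = split_ratings(toy_ratings)
print(len(train), len(test))
```

Every rating ends up in exactly one of the two sets, which is what lets the test set act as unseen data during evaluation.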

## 4. Build and train a collaborative filtering model
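We'll use Surprise's `SVD` algorithm, a matrix factorization model, with its default hyperparameters. After fitting on the training set, we predict a rating for every (user, item) pair in the testing set.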

In [None]:
# Build and train the model
model = SVD()
model.fit(trainset)

# Make predictions on the testing set
predictions = model.test(testset)

# Display the first 5 predictions
predictions[:5]
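
Each entry in `predictions` holds the true rating (`r_ui`) and the model's estimate (`est`) for one (user, item) pair. As a rough sketch of where an estimate comes from, a biased matrix factorization model of this kind predicts the global mean plus a user bias, an item bias, and the dot product of the user and item factor vectors, clipped to the rating scale. The function `predict_rating` and all numbers below are hypothetical, chosen only to illustrate the formula; they are not values learned by the model above.

```python
def predict_rating(global_mean, user_bias, item_bias,
                   user_factors, item_factors, rating_scale=(1, 5)):
    """Estimate a rating as mean + biases + factor dot product, clipped to the scale."""
    dot = sum(p * q for p, q in zip(user_factors, item_factors))
    estimate = global_mean + user_bias + item_bias + dot
    low, high = rating_scale
    return min(high, max(low, estimate))

# Hypothetical parameters for one user and one item (2 latent factors)
est = predict_rating(3.5, 0.2, -0.1, [0.5, -0.3], [0.4, 0.1])
print(round(est, 2))

# An estimate above the scale is clipped to the maximum rating
clipped = predict_rating(4.8, 0.5, 0.3, [1.0], [1.0])
print(clipped)
```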

## 5. Evaluate the model's performance
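We'll evaluate the model's performance on the testing set using root mean squared error (RMSE) and mean absolute error (MAE). Both measure how far the predicted ratings are from the true ratings, so lower values are better.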

In [None]:
# Evaluate the model's performance
rmse = accuracy.rmse(predictions)
mae = accuracy.mae(predictions)

# Display the evaluation results
rmse, mae
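
For reference, both metrics can be written in a few lines of plain Python. This is a minimal sketch using hypothetical (true rating, estimate) pairs in `toy_pairs`, not the output of the model above; the helper names `rmse_of` and `mae_of` are likewise made up for this example.

```python
import math

def rmse_of(pairs):
    """Root mean squared error over (true_rating, estimate) pairs."""
    return math.sqrt(sum((t - e) ** 2 for t, e in pairs) / len(pairs))

def mae_of(pairs):
    """Mean absolute error over (true_rating, estimate) pairs."""
    return sum(abs(t - e) for t, e in pairs) / len(pairs)

# Hypothetical (true rating, estimate) pairs
toy_pairs = [(4.0, 3.5), (2.0, 3.0), (5.0, 4.5)]
print(rmse_of(toy_pairs), mae_of(toy_pairs))
```

Because RMSE squares each error before averaging, the single error of 1.0 weighs more heavily in it than in MAE, so RMSE comes out larger here.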