# Class Exercise 8

Recommender systems are algorithms that suggest items to users based on their preferences or past behavior. This exercise guides you through building several recommender systems in Python, covering both collaborative filtering and content-based filtering.

In this exercise, you will build:

1. Collaborative Filtering

   - Questions 1-4: Load the ratings into a Surprise dataset, train an SVD model, predict ratings for a sample, and evaluate the model with RMSE.
   - Questions 5-8: Create a user-item matrix, compute cosine similarity between users, find the users most similar to a target user, and recommend items from their ratings.
2. Content-Based Filtering

   - Questions 9-11: Build a TF-IDF matrix from movie genres, recommend items by content similarity, and print recommendations for a specific movie.
   
## Dataset
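We'll use the MovieLens dataset, a popular choice for recommender system research. The code in this notebook expects the small version (`ml-latest-small`), unpacked next to the notebook. MovieLens itself lives at https://movielens.org/; the downloadable datasets are published by GroupLens at https://grouplens.org/datasets/movielens/.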



In [1]:
pip install pandas numpy scikit-surprise scikit-learn


Note: you may need to restart the kernel to use updated packages.


In [2]:
import pandas as pd
import numpy as np
from surprise import Reader, Dataset, SVD, accuracy
from sklearn.metrics.pairwise import cosine_similarity

In [3]:
# Load the ratings data
ratings_df = pd.read_csv('ml-latest-small/ratings.csv')

# Explore the dataset
print("Number of users:", ratings_df['userId'].nunique())
print("Number of items:", ratings_df['movieId'].nunique())
print("Average rating:", ratings_df['rating'].mean())

# Rating distribution
print(ratings_df['rating'].value_counts())

Number of users: 610
Number of items: 9724
Average rating: 3.501556983616962
rating
4.0    26818
3.0    20047
5.0    13211
3.5    13136
4.5     8551
2.0     7551
2.5     5550
1.0     2811
1.5     1791
0.5     1370
Name: count, dtype: int64


## Collaborative Filtering: User-Based

- Create a Surprise Dataset object.
- Train a collaborative filtering model with Surprise's SVD (a matrix-factorization algorithm rather than a neighborhood-based one).
- Predict ratings for a sample of users and items.
- Evaluate the model using RMSE.

## Question 1: Create a Surprise Dataset with ['userId', 'movieId', 'rating']


## Question 2: Train an SVD model


## Question 3: Predict ratings for a sample

## Question 4: Evaluate the model


## Collaborative Filtering: Item-Based

- Train an item-based collaborative filtering model.
- Predict ratings for the same sample.
- Evaluate the model.

## Question 5: Create user-item matrix using pivot method
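
A minimal sketch of the pivot step, using a small hand-made frame (`toy_ratings`, hypothetical values) instead of `ratings_df`. It assumes each (userId, movieId) pair appears at most once; if duplicates were possible, `pivot_table` with an aggregation would be the safer choice.

```python
import pandas as pd

# Hypothetical stand-in for ratings_df
toy_ratings = pd.DataFrame({
    "userId":  [1, 1, 2, 2, 3, 3],
    "movieId": [10, 20, 10, 30, 20, 30],
    "rating":  [4.0, 3.5, 5.0, 2.0, 4.5, 1.0],
})

# Rows are users, columns are movies, cells are ratings (NaN where unrated)
user_item = toy_ratings.pivot(index="userId", columns="movieId", values="rating")
print(user_item)

# Similarity functions cannot handle NaN, so a zero-filled copy is often kept alongside
user_item_filled = user_item.fillna(0)
```

Keeping both versions is a deliberate choice: the NaN version still distinguishes "unrated" from a low rating, which matters later when picking movies the user has not seen.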


## Question 6: Calculate cosine similarity between users
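
The user-to-user similarity step can be sketched as follows, on hypothetical toy ratings. Filling missing ratings with zero before calling `cosine_similarity` is a simplifying assumption: it treats "unrated" as a rating of 0, which biases the scores for users with few ratings in common.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical stand-in for ratings_df
toy_ratings = pd.DataFrame({
    "userId":  [1, 1, 1, 2, 2, 3, 3],
    "movieId": [10, 20, 30, 10, 20, 20, 30],
    "rating":  [5.0, 4.0, 1.0, 5.0, 4.0, 1.0, 5.0],
})
user_item = toy_ratings.pivot(index="userId", columns="movieId", values="rating").fillna(0)

# Each row of user_item is one user's rating vector; compare every pair of rows
user_similarity = pd.DataFrame(
    cosine_similarity(user_item),
    index=user_item.index,
    columns=user_item.index,
)
print(user_similarity.round(3))
```

Wrapping the result in a DataFrame keeps the user IDs as labels, so later lookups can use `user_similarity[some_user_id]` instead of positional indices.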


## Question 7: Find the users most similar to user 101 by selecting the highest cosine similarity values.
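
Here is one way to look up the nearest neighbors of a target user. The ratings are hypothetical toy values that happen to include a user 101, and the choice of two neighbors is arbitrary.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical ratings as (userId, movieId, rating) rows
toy_ratings = pd.DataFrame(
    [
        (101, 1, 5.0), (101, 2, 4.0), (101, 3, 1.0),
        (102, 1, 5.0), (102, 2, 5.0), (102, 3, 1.0),
        (103, 1, 1.0), (103, 2, 1.0), (103, 3, 5.0),
        (104, 1, 4.0), (104, 2, 4.0),
    ],
    columns=["userId", "movieId", "rating"],
)
user_item = toy_ratings.pivot(index="userId", columns="movieId", values="rating").fillna(0)
user_similarity = pd.DataFrame(
    cosine_similarity(user_item), index=user_item.index, columns=user_item.index
)

target_user = 101
# Drop the target itself (similarity 1.0), then keep the highest-scoring users
similar_users = (
    user_similarity[target_user]
    .drop(target_user)
    .sort_values(ascending=False)
    .head(2)
)
print(similar_users)
```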



## Question 8: Recommend items to the target user based on the ratings of similar users. Show the top 10 recommendations.
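
A sketch of turning neighbor ratings into recommendations is given here. The helper `recommend_for_user` and its parameters are hypothetical names introduced for illustration, and the scoring rule (a similarity-weighted mean over the nearest users) is one common choice among several. The data is a toy example.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity


def recommend_for_user(user_item, target_user, n_neighbors=10, top_n=10):
    """Score unseen movies by the similarity-weighted mean rating of the nearest users."""
    filled = user_item.fillna(0)
    sims = pd.Series(
        cosine_similarity(filled.loc[[target_user]], filled)[0],
        index=filled.index,
    ).drop(target_user)
    neighbors = sims.nlargest(n_neighbors)

    neighbor_ratings = user_item.loc[neighbors.index]
    # Weighted sum of neighbor ratings; unrated cells (NaN) are skipped by sum()
    weighted = neighbor_ratings.mul(neighbors, axis=0).sum(axis=0)
    # Normalize by the similarity of only those neighbors who rated each movie
    weight_total = neighbor_ratings.notna().astype(float).mul(neighbors, axis=0).sum(axis=0)
    scores = (weighted / weight_total).dropna()

    # Keep only movies the target user has not rated yet
    unseen = user_item.loc[target_user].isna()
    return scores[unseen[scores.index]].sort_values(ascending=False).head(top_n)


# Hypothetical ratings as (userId, movieId, rating) rows
toy_ratings = pd.DataFrame(
    [
        (101, 1, 5.0), (101, 2, 4.0), (101, 3, 1.0),
        (102, 1, 5.0), (102, 2, 5.0), (102, 3, 1.0), (102, 4, 5.0), (102, 5, 2.0),
        (103, 1, 1.0), (103, 2, 1.0), (103, 3, 5.0), (103, 4, 1.0), (103, 5, 5.0),
        (104, 1, 4.0), (104, 2, 4.0), (104, 4, 4.0), (104, 6, 3.0),
    ],
    columns=["userId", "movieId", "rating"],
)
user_item = toy_ratings.pivot(index="userId", columns="movieId", values="rating")

recommendations = recommend_for_user(user_item, target_user=101, n_neighbors=2, top_n=10)
print(recommendations)
```

On the full dataset, the index of the result could be joined against the movies table to show titles instead of IDs.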



## Content-Based Filtering
- Load additional data (e.g., movie metadata) if necessary.
- Extract relevant features from the data.


In [4]:
# Load movie metadata
movies = pd.read_csv("ml-latest-small/movies.csv")

# Extract features (e.g., genres)
movie_features = movies[['movieId', 'genres']]

## Question 9: Create a TF-IDF matrix based on movie genres
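
A possible sketch of the TF-IDF step follows. It assumes genres are pipe-delimited strings, as in `movies.csv`, and uses a custom tokenizer (`split_genres`, a hypothetical helper) so that hyphenated genres such as `Sci-Fi` stay in one piece. The movie titles are placeholders.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in for the movies table
toy_movies = pd.DataFrame({
    "movieId": [1, 2, 3, 4],
    "title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "genres": ["Adventure|Children|Fantasy", "Comedy|Romance", "Action|Sci-Fi", "Comedy"],
})


def split_genres(text):
    """Split a pipe-delimited genre string into one token per genre."""
    return text.split("|")


# token_pattern=None because the custom tokenizer replaces the default regex
vectorizer = TfidfVectorizer(tokenizer=split_genres, token_pattern=None, lowercase=False)
tfidf_matrix = vectorizer.fit_transform(toy_movies["genres"])

# One row per movie, one column per distinct genre
print(tfidf_matrix.shape)
print(sorted(vectorizer.vocabulary_))
```

TF-IDF down-weights genres that appear in many movies, so a rare genre contributes more to similarity than a very common one.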


## Question 10: Recommend top 5 items based on content similarity.
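
The content-based recommendation step could look like the sketch below. The function `recommend_similar` and the helper `split_genres` are hypothetical names, and the movie table is toy data that includes an ID of 122 purely for illustration; its genres are not taken from MovieLens.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def split_genres(text):
    """Split a pipe-delimited genre string into one token per genre."""
    return text.split("|")


def recommend_similar(movies, movie_id, top_n=5):
    """Return the top_n movies whose genre TF-IDF vectors are closest to movie_id."""
    movies = movies.reset_index(drop=True)
    vectorizer = TfidfVectorizer(tokenizer=split_genres, token_pattern=None, lowercase=False)
    tfidf = vectorizer.fit_transform(movies["genres"])

    # Row position of the query movie in the TF-IDF matrix
    pos = int(movies.index[movies["movieId"] == movie_id][0])
    sims = cosine_similarity(tfidf[pos], tfidf).ravel()

    # Attach the scores, drop the query movie itself, and keep the best matches
    scored = movies.assign(similarity=sims).drop(index=pos)
    return scored.nlargest(top_n, "similarity")[["movieId", "title", "similarity"]]


# Hypothetical stand-in for the movies table
toy_movies = pd.DataFrame({
    "movieId": [1, 2, 3, 4, 5, 6, 122],
    "title": ["Movie A", "Movie B", "Movie C", "Movie D", "Movie E", "Movie F", "Movie G"],
    "genres": [
        "Comedy|Romance", "Comedy", "Drama|Romance", "Action|Sci-Fi",
        "Comedy|Drama|Romance", "Horror", "Comedy|Romance",
    ],
})

recommendations = recommend_similar(toy_movies, movie_id=122, top_n=5)
print(recommendations)
```

On the full movies table, the same call with `movie_id=122` would produce the output Question 11 asks for.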


## Question 11: Print recommendations for movie 122.
