# Personalized Recommendation System

This notebook builds the two components of a simple hybrid recommender: collaborative filtering (SVD on MovieLens 100k ratings) and content-based filtering (TF-IDF over movie genres).

## 1. Install Required Libraries

In [None]:
!pip install pandas numpy scikit-learn matplotlib scikit-surprise

## 2. Import Libraries

In [None]:

import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from surprise import SVD, Dataset, Reader
from surprise.model_selection import train_test_split
import matplotlib.pyplot as plt
    

## 3. Load Sample Dataset (MovieLens 100k)

In [None]:

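# Load the built-in MovieLens 100k dataset (Surprise offers to download it on first use)
data = Dataset.load_builtin('ml-100k')
# Hold out 20% of the ratings for evaluation; fix the seed so the split is reproducible
trainset, testset = train_test_split(data, test_size=0.2, random_state=42)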
    

## 4. Train Collaborative Filtering Model (SVD)

In [None]:

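# Matrix-factorization model (SVD) with default hyperparameters; seed fixed for reproducibility
model = SVD(random_state=42)
model.fit(trainset)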
    

## 5. Predict Ratings for Sample User

In [None]:

# Predict the rating for one user-item pair; raw ids in ml-100k are strings
prediction = model.predict(uid='196', iid='302')
print(f"Predicted Rating: {prediction.est:.2f}")
    

## 6. Load Movies Metadata for Content-Based Filtering

In [None]:

# Load sample movie metadata file (you can replace this with your own)
# This file should contain at least: movieId, title, genres
# Note: MovieTweetings uses IMDb ids, so its movieId values do not match MovieLens 100k item ids
movies = pd.read_csv('https://raw.githubusercontent.com/sidooms/MovieTweetings/master/latest/movies.dat', sep='::', 
                     engine='python', header=None, names=['movieId', 'title', 'genres'])
movies['genres'] = movies['genres'].fillna('')
movies.head()
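
If you swap in your own metadata file, it helps to see the layout the loader expects: one movie per line, with fields separated by `::` and genres joined by `|`. The sketch below parses two made-up rows from an in-memory string with the same `read_csv` arguments; the sample rows and the `sample_movies` name are assumptions for illustration.

```python
import io
import pandas as pd

# Two made-up rows in the `id::title::genres` layout; the second has no genres
sample = io.StringIO(
    "1::Example Movie (2001)::Comedy|Romance\n"
    "2::Another Film (1999)::\n"
)

# A multi-character separator is treated as a regex, which needs the python engine
sample_movies = pd.read_csv(
    sample, sep="::", engine="python", header=None,
    names=["movieId", "title", "genres"],
)

# Replace missing genres with an empty string so TF-IDF can process every row
sample_movies["genres"] = sample_movies["genres"].fillna("")
print(sample_movies)
```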
    

## 7. Content-Based Filtering (TF-IDF + Cosine Similarity)

In [None]:

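# Genres are pipe-separated; the default tokenizer splits on the pipes, so each genre word becomes a term
tfidf = TfidfVectorizer(stop_words='english')
tfidf_matrix = tfidf.fit_transform(movies['genres'])
# Dense n_movies x n_movies similarity matrix; memory grows quadratically with catalogue size
cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)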
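
To see what the genre vectors look like, here is a minimal sketch on three made-up movies. It shows how pipe-separated genre strings are tokenized (note that `Sci-Fi` becomes two terms) and what the resulting similarity matrix contains. The `toy_*` names and the data are assumptions for illustration.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical mini catalogue
toy_movies = pd.DataFrame({
    "title": ["A", "B", "C"],
    "genres": ["Animation|Comedy", "Animation|Comedy", "Horror|Sci-Fi"],
})

toy_tfidf = TfidfVectorizer()
toy_matrix = toy_tfidf.fit_transform(toy_movies["genres"])

# Lowercased terms learned from the genre strings
vocab = sorted(toy_tfidf.vocabulary_)
print(vocab)

# Pairwise cosine similarity between the movies' genre vectors
toy_sim = cosine_similarity(toy_matrix)
print(toy_sim.round(2))
```

Movies A and B share identical genres, so their similarity is 1; movie C shares no terms with either, so its similarity to both is 0. With genres as the only feature, many movies tie at exactly the same score.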
    

## 8. Content-Based Recommendation Function

In [None]:

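# Return the titles of the top_n movies most similar to `title` by genre
def get_recommendations(title, cosine_sim=cosine_sim, top_n=5):
    # Map each title to its row position; keep the first row when a title repeats
    indices = pd.Series(movies.index, index=movies['title'])
    indices = indices[~indices.index.duplicated(keep='first')]
    if title not in indices.index:
        raise KeyError(f"Title not found in movies: {title!r}")
    idx = indices[title]
    # Rank every movie by similarity to the query, highest first
    sim_scores = sorted(enumerate(cosine_sim[idx]), key=lambda x: x[1], reverse=True)
    # Drop the query movie itself (with tied scores it is not always ranked first)
    movie_indices = [i for i, _ in sim_scores if i != idx][:top_n]
    return movies['title'].iloc[movie_indices]

# Example usage
get_recommendations('Toy Story (1995)')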
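
For a large catalogue, the full movie-by-movie similarity matrix may not fit in memory. One alternative, sketched below, computes only the query movie's row on demand. It relies on TF-IDF rows being L2-normalized by default, so a plain dot product (`linear_kernel`) equals cosine similarity. The function name `similar_titles` and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

def similar_titles(title, movies_df, matrix, top_n=5):
    """Top-n titles by genre similarity, computing only the query movie's row."""
    matches = np.flatnonzero(movies_df["title"].to_numpy() == title)
    if matches.size == 0:
        raise KeyError(f"Title not found: {title!r}")
    idx = int(matches[0])  # first match if the title repeats
    # Slice keeps the row 2-D; result is a 1 x n_movies array of similarities
    scores = linear_kernel(matrix[idx:idx + 1], matrix).ravel()
    scores[idx] = -1.0  # exclude the query movie itself
    top = np.argsort(-scores, kind="stable")[:top_n]
    return movies_df["title"].iloc[top].tolist()

# Hypothetical mini catalogue
toy_movies = pd.DataFrame({
    "title": ["Toy A", "Toy B", "Scary C", "Toy D"],
    "genres": ["Animation|Comedy", "Animation|Comedy|Family",
               "Horror|Thriller", "Animation"],
})
toy_matrix = TfidfVectorizer().fit_transform(toy_movies["genres"])

result = similar_titles("Toy A", toy_movies, toy_matrix, top_n=2)
print(result)
```

This trades a little extra work per query for memory that grows linearly with the number of movies.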
    

## 9. Conclusion


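- We built the two building blocks of a hybrid recommendation system.
- Content-based filtering used TF-IDF over genres and cosine similarity to find similar movies.
- Collaborative filtering used SVD to predict the rating a user would give a movie.
- To make the system truly hybrid, combine the two scores for each candidate movie and rank by the result. This requires both datasets to share the same movie ids.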
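
One simple way to combine the two signals is a weighted blend: rescale the predicted rating to the 0 to 1 range and average it with the content similarity. The sketch below assumes a table of candidate movies that already carries both scores, which in turn assumes the ratings and the metadata share movie ids. The function name `hybrid_rank`, the column names, the weight `alpha`, and the numbers are all hypothetical.

```python
import pandas as pd

def hybrid_rank(candidates, alpha=0.5, rating_scale=(1.0, 5.0)):
    """Blend content similarity with predicted rating and sort best-first.

    `candidates` needs the columns title, content_sim (0..1) and
    predicted_rating. `alpha` is the weight given to the content score.
    """
    low, high = rating_scale
    ranked = candidates.copy()
    # Rescale the predicted rating to 0..1 so both signals are comparable
    ranked["cf_score"] = (ranked["predicted_rating"] - low) / (high - low)
    ranked["hybrid_score"] = (
        alpha * ranked["content_sim"] + (1 - alpha) * ranked["cf_score"]
    )
    return ranked.sort_values("hybrid_score", ascending=False).reset_index(drop=True)

# Made-up candidates with both scores already attached
toy_candidates = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C"],
    "content_sim": [0.9, 0.4, 0.7],
    "predicted_rating": [2.0, 5.0, 4.0],
})

ranked = hybrid_rank(toy_candidates, alpha=0.5)
print(ranked[["title", "hybrid_score"]])
```

Here Movie C ranks first because it is reasonably strong on both signals, while Movie A's high similarity is offset by a low predicted rating. The weight `alpha` would normally be tuned on held-out data.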
    
