# Content Recommendation System
Module E: AI Applications – Individual Open Project


## 1. Problem Definition & Objective

### Project Track
Recommendation System (AI / Machine Learning)

### Problem Statement
The objective of this project is to build a content-based recommendation system
that suggests relevant items to users based on content similarity.

### Real-World Relevance
Content recommendation systems are widely used on platforms such as Netflix,
YouTube, and Amazon to personalize the user experience and improve engagement.


## 2. Data Understanding & Preparation

### Dataset Source
This project uses a small, manually created dataset for demonstration purposes.

### Data Description
Each content item has a title and a short description. The description is used
to compute similarity for recommendations; the title identifies the item.

### Data Preprocessing
Text descriptions are lowercased, stripped of English stop words, and transformed
into numerical vectors with TF-IDF (scikit-learn's `TfidfVectorizer`) for
similarity computation.
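
To make the TF-IDF step concrete, here is a minimal pure-Python sketch of the weighting. It assumes the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1` followed by L2 normalization, which matches the defaults of scikit-learn's `TfidfVectorizer`. The whitespace tokenizer, the `tfidf_vectors` helper, and the toy documents are illustrative assumptions, not part of the project code.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Hypothetical helper: return (vocabulary, L2-normalized TF-IDF vectors)."""
    # Simplified tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(docs)

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocabulary}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        raw = [counts[t] * idf[t] for t in vocabulary]
        norm = math.sqrt(sum(w * w for w in raw)) or 1.0  # guard against empty docs
        vectors.append([w / norm for w in raw])
    return vocabulary, vectors


# Toy documents (assumed for illustration only).
docs = ["space journey", "space battle", "dream thief"]
vocab, vecs = tfidf_vectors(docs)

for doc, vec in zip(docs, vecs):
    print(doc, [round(w, 3) for w in vec])
```

In this toy example "space" occurs in two documents, so it receives a lower weight than "journey" within the first document: terms that are rare across the collection count for more.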


## 3. Model / System Design

### AI Technique Used
This project implements a content-based recommendation system using
Natural Language Processing (NLP) techniques.

### System Architecture / Pipeline
1. Input content metadata
2. TF-IDF vectorization
3. Cosine similarity computation
4. Recommendation generation
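
Steps 3 and 4 of the pipeline can be sketched without any libraries. Cosine similarity between two vectors u and v is their dot product divided by the product of their lengths, and recommendations are the highest-scoring items other than the query. The `cosine` and `top_k_similar` helpers and the toy vectors below are assumptions for illustration. The project itself relies on scikit-learn's `cosine_similarity`.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_similar(query_index, vectors, k=3):
    """Return up to k (index, score) pairs most similar to the query item."""
    scores = [
        (i, cosine(vectors[query_index], vec))
        for i, vec in enumerate(vectors)
        if i != query_index  # never recommend the query item itself
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Toy item vectors (assumed for illustration only).
vectors = [
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]
ranked = top_k_similar(0, vectors, k=2)
print(ranked)
```

Item 1 shares one dimension with the query and ranks first. Item 2 shares none and scores 0.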


## 5. Evaluation & Analysis

### Evaluation Method
Evaluation is qualitative: the recommendations returned for a given title are
inspected manually and judged on whether they are topically related to it.
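
Manual inspection is easier when the similarity scores are shown next to the recommended titles. The sketch below rebuilds the project's five-item dataset and returns scores alongside titles. The `scored_recommendations` helper is a hypothetical addition, not part of the original implementation.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# The five items used in this project.
items = pd.DataFrame({
    "title": ["The Matrix", "Inception", "Interstellar", "The Dark Knight", "Avengers"],
    "description": [
        "A computer hacker learns about the true nature of reality",
        "A skilled thief enters dreams to steal secrets",
        "A journey through space and time to save humanity",
        "Batman fights crime and chaos in Gotham City",
        "A team of superheroes saves the world from threats",
    ],
})

tfidf_matrix = TfidfVectorizer(stop_words="english").fit_transform(items["description"])
similarity = cosine_similarity(tfidf_matrix)


def scored_recommendations(title, df, similarity_matrix, top_n=3):
    """Hypothetical helper: top_n most similar titles with their cosine scores."""
    position = df["title"].tolist().index(title)  # raises ValueError if title is absent
    scores = pd.Series(similarity_matrix[position], index=df["title"]).drop(title)
    return scores.sort_values(ascending=False, kind="stable").head(top_n)


result = scored_recommendations("Inception", items, similarity)
print(result)
```

On these five descriptions every score against "Inception" is 0, because no two descriptions share a term once English stop words are removed. Any ranking on this dataset therefore reflects row order rather than relevance, and printing the scores makes that visible during review.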

### Limitations
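- Small dataset (five items, each with a one-sentence description)
- No user personalization: recommendations depend only on item content, not on user history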


## 6. Ethical Considerations & Responsible AI

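- Possible bias due to limited data: a small, manually created catalog cannot represent diverse content
- Risk of filter bubbles, since content-based methods keep recommending items similar to those already chosen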


## 7. Conclusion & Future Scope

### Conclusion
A content-based recommendation system was implemented using TF-IDF vectorization
and cosine similarity over item descriptions.

### Future Scope
- Incorporate user behavior data to personalize recommendations
- Replace TF-IDF vectors with deep learning embeddings


import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Small, manually created dataset: one title and one short description per item.
data = {
    'title': [
        'The Matrix',
        'Inception',
        'Interstellar',
        'The Dark Knight',
        'Avengers'
    ],
    'description': [
        'A computer hacker learns about the true nature of reality',
        'A skilled thief enters dreams to steal secrets',
        'A journey through space and time to save humanity',
        'Batman fights crime and chaos in Gotham City',
        'A team of superheroes saves the world from threats'
    ]
}

df = pd.DataFrame(data)

# Vectorize the descriptions with TF-IDF, dropping English stop words.
tfidf = TfidfVectorizer(stop_words='english')
tfidf_matrix = tfidf.fit_transform(df['description'])

# Pairwise cosine similarity between all item vectors.
cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)


def recommend_content(title, df, similarity_matrix, top_n=3):
    """Return the titles of the top_n items most similar to `title`."""
    matches = df.index[df['title'] == title]
    if len(matches) == 0:
        raise ValueError(f"Title not found: {title!r}")
    index = matches[0]
    # Score every other item, excluding the query item itself.
    similarity_scores = [
        (i, score) for i, score in enumerate(similarity_matrix[index]) if i != index
    ]
    # Sort by similarity, highest first; ties keep their original order.
    similarity_scores.sort(key=lambda x: x[1], reverse=True)
    top_items = similarity_scores[:top_n]
    return df['title'].iloc[[i for i, _ in top_items]]


recommend_content('Inception', df, cosine_sim)


0         The Matrix
2       Interstellar
3    The Dark Knight
Name: title, dtype: object