# **Movie Recommendation System**

-------------

## **Objective**

To build a content-based movie recommendation system that suggests similar movies based on genres and descriptions.

## **Data Source**

This notebook uses a small sample dataset of five movies, defined inline for demonstration. Each movie has a title, a genre string, and a short description.

## **Import Libraries**

In [None]:
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

## **Import Dataset**

In [None]:
# Sample dataset for demonstration
data = {
    'movie_id': [1, 2, 3, 4, 5],
    'title': ['The Matrix', 'Inception', 'Interstellar', 'The Dark Knight', 'Pulp Fiction'],
    'genres': ['Action Sci-Fi', 'Action Adventure Sci-Fi', 'Adventure Drama Sci-Fi', 'Action Crime Drama', 'Crime Drama'],
    'description': [
        'A computer hacker learns about the true nature of his reality.',
        'A thief who steals secrets through dream-sharing technology.',
        'A group of explorers travel through a wormhole in space.',
        'A vigilante battles crime in Gotham City.',
        'The lives of two mob hitmen, a boxer, and others intertwine.'
    ]
}

movies_df = pd.DataFrame(data)
movies_df

## **Data Preprocessing**

In [None]:
# Combine features for similarity calculation
movies_df['combined_features'] = movies_df['genres'] + " " + movies_df['description']

# Convert text to feature vectors
count_vectorizer = CountVectorizer(stop_words='english')
count_matrix = count_vectorizer.fit_transform(movies_df['combined_features'])

# Check the shape of the count matrix: (number of movies, vocabulary size)
count_matrix.shape
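
To see what the vectorizer actually counts, the learned vocabulary can be inspected. The sketch below fits a `CountVectorizer` on the combined text of a single movie from the sample; the names `sample_text` and `vocabulary` are illustrative and not part of the cells above.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Combined genres + description for 'The Matrix', as in the sample dataset
sample_text = ["Action Sci-Fi A computer hacker learns about the true nature of his reality."]

vectorizer = CountVectorizer(stop_words='english')
vectorizer.fit(sample_text)

# vocabulary_ maps each kept token to its column index in the count matrix
vocabulary = sorted(vectorizer.vocabulary_)
print(vocabulary)
```

With the default settings, text is lowercased and punctuation acts as a separator, so `Sci-Fi` becomes the two tokens `sci` and `fi`, and English stop words such as `the` are dropped.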

## **Model Building**

In [None]:
# Calculate cosine similarity
cosine_sim = cosine_similarity(count_matrix)

def get_recommendations(title, cosine_sim=cosine_sim, top_n=5):
    # Look up the row index of the movie that matches the title
    matches = movies_df.index[movies_df['title'] == title]
    if len(matches) == 0:
        raise ValueError(f"Movie '{title}' not found in the dataset.")
    idx = matches[0]

    # Pair every other movie's index with its similarity to the chosen movie
    # (the chosen movie itself is excluded explicitly, so ties cannot hide it)
    sim_scores = [(i, score) for i, score in enumerate(cosine_sim[idx]) if i != idx]

    # Sort the movies by similarity score, highest first
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)

    # Keep the indices of up to top_n most similar movies
    # (the five-movie sample can yield at most 4)
    movie_indices = [i for i, _ in sim_scores[:top_n]]

    # Return the titles of the most similar movies
    return movies_df['title'].iloc[movie_indices]
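
As a small worked example of the similarity measure used here, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below computes it by hand for two made-up count vectors (`a` and `b` are illustrative, not rows of `count_matrix`) and compares the result with scikit-learn's `cosine_similarity`.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Two hypothetical word-count vectors over a 4-term vocabulary
a = np.array([1, 1, 0, 1])
b = np.array([1, 0, 1, 1])

# Manual computation: dot product divided by the product of the norms
manual = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Same computation through scikit-learn (expects 2D inputs)
library = cosine_similarity([a], [b])[0, 0]

print(manual, library)
```

The two vectors share two terms and each has length √3, so both values come out to 2/3. A score of 1 means identical direction and 0 means no terms in common.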

## **Evaluation**

In [None]:
# Test the recommendation system
movie_to_search = 'The Matrix'
recommended_movies = get_recommendations(movie_to_search)

print(f"Movies recommended for '{movie_to_search}':")
print(recommended_movies)
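
To judge a ranking, it helps to see the similarity scores next to the titles. The following is a minimal sketch under a simplifying assumption: it uses only the genre strings from the sample dataset, not the descriptions, so its scores differ from those of `cosine_sim` above. The names `genre_matrix`, `sim`, and `ranked` are illustrative.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

titles = ['The Matrix', 'Inception', 'Interstellar', 'The Dark Knight', 'Pulp Fiction']
genres = ['Action Sci-Fi', 'Action Adventure Sci-Fi', 'Adventure Drama Sci-Fi',
          'Action Crime Drama', 'Crime Drama']

# Assumption: genres only, to keep the example small
genre_matrix = CountVectorizer().fit_transform(genres)

# Label rows and columns with titles so scores can be looked up by name
sim = pd.DataFrame(cosine_similarity(genre_matrix), index=titles, columns=titles)

# Scores for 'The Matrix' against every other movie, highest first
ranked = sim['The Matrix'].drop('The Matrix').sort_values(ascending=False)
print(ranked)
```

On genres alone, `Inception` ranks first because it shares every genre token with `The Matrix`, while `Pulp Fiction` shares none and scores 0.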

## **Conclusion**

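On the five-movie sample, the system returns the other movies ranked by the cosine similarity of their combined genre and description word counts. The test above is a single spot check, not a quantitative evaluation, so recommendation quality remains unmeasured. The system can be enhanced with a larger dataset or more advanced models.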