# 🎬 Movie Recommendation System

---

## 📖 Introduction

Choosing a movie can be hard when there are thousands of options. This project builds a **Content-Based Recommendation System** that suggests similar movies by comparing each movie's overview, genres, keywords, top-billed cast, and director.

## 🛠️ Project Workflow
1. Import necessary libraries
2. Load the datasets
3. Perform Exploratory Data Analysis (EDA)
4. Data Preprocessing and Feature Engineering
5. Building the Recommendation System
6. Creating a Recommendation Function
7. (Bonus) Deployment Strategy

# 1. 📚 Import Libraries

In [None]:

# Importing essential libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import nltk
import ast
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import warnings
warnings.filterwarnings("ignore")


# 2. 📂 Load the Data

In [None]:

# Loading movie details dataset
movies = pd.read_csv("tmdb_5000_movies.csv")

# Loading credits dataset
credits = pd.read_csv("tmdb_5000_credits.csv")


# 3. 👀 Initial Data Exploration

In [None]:

# Checking the structure of datasets
print(movies.shape)
print(credits.shape)

movies.head(2)


# 4. 🔗 Merge Datasets

In [None]:

# Merging on 'title' column (titles are not guaranteed to be unique, so repeated titles can add extra rows)
movies = movies.merge(credits, on='title')

movies.shape
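
Why the merge key matters can be seen on a toy example. The sketch below uses made-up rows, not real TMDB data, and assumes the movies file carries an `id` column that corresponds to `movie_id` in the credits file. The names `movies_toy`, `credits_toy`, `by_title`, and `by_id` are hypothetical.

```python
import pandas as pd

# Toy stand-ins for the two CSV files (hypothetical rows, not real TMDB data)
movies_toy = pd.DataFrame({
    "id": [1, 2, 3],
    "title": ["Batman", "Batman", "Avatar"],
    "overview": ["1989 film", "1966 film", "Pandora"],
})
credits_toy = pd.DataFrame({
    "movie_id": [1, 2, 3],
    "title": ["Batman", "Batman", "Avatar"],
    "cast": ["cast A", "cast B", "cast C"],
})

# Joining on the title pairs every "Batman" with every other "Batman"
by_title = movies_toy.merge(credits_toy, on="title")

# Joining on the numeric key keeps exactly one row per movie
by_id = movies_toy.merge(
    credits_toy.drop(columns="title"), left_on="id", right_on="movie_id"
)

print(len(by_title))  # 5 rows: 2 x 2 for "Batman" plus 1 for "Avatar"
print(len(by_id))     # 3 rows
```

If the real files do share such a key, merging on it would avoid the extra rows; the rest of this notebook keeps the title-based merge as written.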


# 5. 🧹 Data Preprocessing

In [None]:

# Keeping useful columns
movies = movies[['movie_id','title','overview','genres','keywords','cast','crew']]

# Checking for missing values
movies.isnull().sum()


In [None]:

# Drop rows with null values and reset the index so that row labels match row positions
movies = movies.dropna().reset_index(drop=True)
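
Dropping rows leaves gaps in the index labels, which matters later because the similarity matrix is addressed by row position rather than by label. A minimal sketch on a made-up four-row frame (the names `df`, `dropped`, and `clean` are hypothetical) shows the mismatch and one way to avoid it:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one missing overview
df = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "overview": ["x", None, "y", "z"],
})

dropped = df.dropna()
print(list(dropped.index))  # [0, 2, 3]: label 1 is gone, so labels no longer match positions

# A matrix built from `dropped` has 3 rows, so label 3 is out of range as a row number
label = dropped[dropped["title"] == "D"].index[0]
position = np.flatnonzero(dropped["title"].to_numpy() == "D")[0]
print(label, position)  # 3 2

# Resetting the index makes labels and positions agree again
clean = df.dropna().reset_index(drop=True)
print(clean[clean["title"] == "D"].index[0])  # 2
```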


# 6. 🛠️ Feature Engineering

In [None]:

# Function to extract names from genres, keywords, cast, crew
def extract_names(obj):
    L = []
    for i in ast.literal_eval(obj):
        L.append(i['name'])
    return L

movies['genres'] = movies['genres'].apply(extract_names)
movies['keywords'] = movies['keywords'].apply(extract_names)

# Extract top 3 cast members
def extract_cast(obj):
    L = []
    counter = 0
    for i in ast.literal_eval(obj):
        if counter < 3:
            L.append(i['name'])
            counter += 1
        else:
            break
    return L

movies['cast'] = movies['cast'].apply(extract_cast)

# Extract director's name
def extract_director(obj):
    L = []
    for i in ast.literal_eval(obj):
        if i['job'] == 'Director':
            L.append(i['name'])
            break
    return L

movies['crew'] = movies['crew'].apply(extract_director)
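
To make the parsing step concrete, here is a small sketch of what `ast.literal_eval` does to a single cell. The string `raw` is a shortened, hypothetical value written in the same shape that the `genres` column is assumed to have (a list of dictionaries stored as text).

```python
import ast

# A shortened, hypothetical cell value in the same shape as the 'genres' column
raw = '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]'

# literal_eval parses the text into Python objects (a list of dicts here)
# without executing arbitrary code
parsed = ast.literal_eval(raw)
names = [item["name"] for item in parsed]

print(type(parsed).__name__)  # list
print(names)                  # ['Action', 'Adventure']
```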


In [None]:

# Overview is a string, so split it into a list of words
movies['overview'] = movies['overview'].apply(lambda x: x.split())

# Removing spaces in multi-word names
movies['genres'] = movies['genres'].apply(lambda x: [i.replace(" ","") for i in x])
movies['keywords'] = movies['keywords'].apply(lambda x: [i.replace(" ","") for i in x])
movies['cast'] = movies['cast'].apply(lambda x: [i.replace(" ","") for i in x])
movies['crew'] = movies['crew'].apply(lambda x: [i.replace(" ","") for i in x])


# 7. 🏷️ Creating Tags

In [None]:

# Combining all features into a single 'tags' column
movies['tags'] = movies['overview'] + movies['genres'] + movies['keywords'] + movies['cast'] + movies['crew']

# Creating a new DataFrame with useful columns (.copy() avoids writing to a slice of `movies`)
new = movies[['movie_id','title','tags']].copy()

# Convert list of words into space separated string
new['tags'] = new['tags'].apply(lambda x: " ".join(x))

new.head()


# 8. 🔡 Text Preprocessing

In [None]:

# Convert tags to lowercase for uniformity
new['tags'] = new['tags'].apply(lambda x: x.lower())


# 9. ✨ Vectorization

In [None]:

# Initializing TF-IDF Vectorizer
tfidf = TfidfVectorizer(max_features=5000, stop_words='english')

# Fitting and transforming the 'tags'
vectors = tfidf.fit_transform(new['tags']).toarray()

vectors.shape
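
A toy corpus can help show what TF-IDF weighting does. The three strings below are made up and only stand in for rows of `new['tags']`. The sketch relies on the `TfidfVectorizer` defaults and checks that a term appearing in fewer documents gets a larger weight than a term shared across documents.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Three made-up tag strings standing in for rows of new['tags']
docs = [
    "space marine alien planet",
    "alien invasion earth space",
    "romantic comedy wedding",
]

vec = TfidfVectorizer(stop_words="english")
matrix = vec.fit_transform(docs)  # sparse matrix, one row per document

print(matrix.shape[0])            # 3 documents
print(sorted(vec.vocabulary_))    # learned terms in alphabetical order

# Compare two weights inside the first document
row0 = matrix[0].toarray().ravel()
alien_w = row0[vec.vocabulary_["alien"]]    # "alien" occurs in two documents
marine_w = row0[vec.vocabulary_["marine"]]  # "marine" occurs in one document
print(marine_w > alien_w)  # True: the rarer term weighs more
```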


# 10. 📏 Calculate Similarity

In [None]:

# Calculating cosine similarity between all movies
similarity = cosine_similarity(vectors)
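
Cosine similarity measures the angle between two vectors: the dot product divided by the product of their lengths. The helper `cosine` below is a hypothetical hand-written version for illustration only; the notebook itself uses scikit-learn's `cosine_similarity`.

```python
import numpy as np

def cosine(a, b):
    """Cosine of the angle between two vectors: dot product over the product of norms."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 1.0, 0.0])
b = np.array([1.0, 0.0, 0.0])
c = np.array([0.0, 0.0, 2.0])

print(round(cosine(a, a), 4))  # 1.0     same direction
print(round(cosine(a, b), 4))  # 0.7071  partial overlap
print(round(cosine(a, c), 4))  # 0.0     no shared components
```

Because TF-IDF weights are never negative, scores between movies fall between 0 (no shared terms) and 1 (same direction).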


# 11. 🎬 Recommendation Function

In [None]:

# Recommender function
def recommend(movie):
    movie = movie.lower()
    titles = new['title'].str.lower().values
    if movie not in titles:
        print("Movie not found. Please check the spelling.")
        return

    # Use the row position (not the index label) so it lines up with the similarity matrix
    index = np.flatnonzero(titles == movie)[0]
    distances = similarity[index]
    # Sort by similarity, highest first; position 0 is normally the movie itself, so skip it
    movies_list = sorted(list(enumerate(distances)),reverse=True,key=lambda x:x[1])[1:6]

    for i in movies_list:
        print(new.iloc[i[0]].title)


In [None]:

# Example usage
recommend('Avatar')
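
Since an exact match is required, a misspelled title returns nothing useful. One possible extension, sketched here with the standard library's `difflib`, is to suggest the closest known title. The function `suggest_title` and the short `titles` list are hypothetical; in the notebook the list would come from `new['title']`.

```python
import difflib

def suggest_title(query, titles, cutoff=0.6):
    """Return the closest known title to `query`, or None if nothing is close enough."""
    # Map lowercase titles back to their original spelling
    lowered = {t.lower(): t for t in titles}
    matches = difflib.get_close_matches(query.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None

# Hypothetical title list; in the notebook this would be new['title']
titles = ["Avatar", "The Dark Knight", "Inception"]
print(suggest_title("Avatr", titles))   # Avatar
print(suggest_title("zzzzzz", titles))  # None
```

The `cutoff` value of 0.6 is `difflib`'s default and may need tuning for a real title list.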


# 🚀 Deployment (Streamlit App Guide)


## How to Deploy this Project using Streamlit:

1. Install Streamlit:
```bash
pip install streamlit
```

2. Create a file called `app.py`:
```python
import streamlit as st
import pickle

# Load saved data (movie table and similarity matrix)
with open('movies.pkl','rb') as f:
    new = pickle.load(f)
with open('similarity.pkl','rb') as f:
    similarity = pickle.load(f)

def recommend(movie):
    movie = movie.lower()
    matches = (new['title'].str.lower() == movie).values
    if not matches.any():
        return ["Movie not found"]
    # Row position (not index label) of the first match, so it lines up with the similarity matrix
    index = matches.argmax()
    distances = similarity[index]
    movies_list = sorted(list(enumerate(distances)),reverse=True,key=lambda x:x[1])[1:6]
    return [new.iloc[i[0]].title for i in movies_list]

st.title('Movie Recommendation System')

movie = st.text_input('Enter Movie Name')
if st.button('Recommend'):
    recommendations = recommend(movie)
    for i in recommendations:
        st.write(i)
```

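3. Save the movie table and similarity matrix from the notebook (do this before running the app, because `app.py` loads both files at startup):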
```python
import pickle
# Context managers make sure each file is flushed and closed
with open('movies.pkl','wb') as f:
    pickle.dump(new, f)
with open('similarity.pkl','wb') as f:
    pickle.dump(similarity, f)
```

4. Run the Streamlit app:
```bash
streamlit run app.py
```

---

🎉 Congratulations! You now have a working Movie Recommendation System and a Streamlit app to serve it!
