
# Movie Recommendation

A movie recommendation system suggests movies to users by applying algorithms and data analysis to information about movies and viewers. Such systems can significantly enhance the user experience by providing personalized suggestions that align with individual preferences and viewing habits.

# Workflow Overview

**1. Data Collection:**
- User Data: Demographics, viewing history, ratings.
- Movie Data: Genres, actors, directors, release dates, ratings.
- Interaction Data: Views, searches, ratings, reviews.

**2. Data Preprocessing:**
- Cleaning: Removing duplicates, handling missing values.
- Normalization: Scaling data to a common range.
- Feature Extraction: Identifying relevant data features.

**3. Recommendation Algorithms:**
- Collaborative Filtering: User-based and item-based.
- Content-Based Filtering: Similarity between movie features and user preferences.
- Hybrid Methods: Combining collaborative and content-based filtering.

**4. Model Training:**
- Machine Learning: Training models on historical data.
- Matrix Factorization: Techniques like SVD.
- Neural Networks: Deep learning models for complex patterns.

**5. Recommendation Generation:**
- Producing personalized movie suggestions.
- Real-time recommendations based on user interactions.

**6. Evaluation and Feedback:**
- Offline Evaluation: Metrics like MAE and RMSE.
- Online Evaluation: A/B testing with real users.
- Feedback Loop: Updating models with new data.
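
The offline metrics named in step 6 can be sketched in a few lines. This is a minimal illustration only: the rating lists are hypothetical, and the dataset used in this notebook contains no user ratings, so the metrics are not computed for it.

```python
import math

def mae(actual, predicted):
    """Mean absolute error between two equal-length lists of ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error; penalizes large misses more than MAE does."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Hypothetical ratings, for illustration only.
actual_ratings = [4.0, 3.0, 5.0, 2.0]
predicted_ratings = [3.5, 3.0, 4.0, 3.0]

print(mae(actual_ratings, predicted_ratings))
print(rmse(actual_ratings, predicted_ratings))
```

RMSE is never smaller than MAE on the same data, so a large gap between the two hints at a few badly mispredicted ratings.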


# Import Library

In [1]:
import pandas as pd


In [2]:
import numpy as np

# Import Datasets

In [3]:
df = pd.read_csv('amazon_prime_titles.csv')

In [4]:
df.head()


,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
0,s1,Movie,The Grand Seduction,Don McKellar,"Brendan Gleeson, Taylor Kitsch, Gordon Pinsent",Canada,"March 30, 2021",2014,,113 min,"Comedy, Drama",A small fishing village must procure a local d...
1,s2,Movie,Take Care Good Night,Girish Joshi,"Mahesh Manjrekar, Abhay Mahajan, Sachin Khedekar",India,"March 30, 2021",2018,13+,110 min,"Drama, International",A Metro Family decides to fight a Cyber Crimin...
2,s3,Movie,Secrets of Deception,Josh Webber,"Tom Sizemore, Lorenzo Lamas, Robert LaSardo, R...",United States,"March 30, 2021",2017,,74 min,"Action, Drama, Suspense",After a man discovers his wife is cheating on ...
3,s4,Movie,Pink: Staying True,Sonia Anderson,"Interviews with: Pink, Adele, Beyoncé, Britney...",United States,"March 30, 2021",2014,,69 min,Documentary,"Pink breaks the mold once again, bringing her ..."
4,s5,Movie,Monster Maker,Giles Foster,"Harry Dean Stanton, Kieran O'Brien, George Cos...",United Kingdom,"March 30, 2021",1989,,45 min,"Drama, Fantasy",Teenage Matt Banting wants to work with a famo...


In [5]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9668 entries, 0 to 9667
Data columns (total 12 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   show_id       9668 non-null   object
 1   type          9668 non-null   object
 2   title         9668 non-null   object
 3   director      7585 non-null   object
 4   cast          8435 non-null   object
 5   country       672 non-null    object
 6   date_added    155 non-null    object
 7   release_year  9668 non-null   int64 
 8   rating        9331 non-null   object
 9   duration      9668 non-null   object
 10  listed_in     9668 non-null   object
 11  description   9668 non-null   object
dtypes: int64(1), object(11)
memory usage: 906.5+ KB


In [6]:
df.shape

(9668, 12)

In [7]:
df.columns

Index(['show_id', 'type', 'title', 'director', 'cast', 'country', 'date_added',
       'release_year', 'rating', 'duration', 'listed_in', 'description'],
      dtype='object')

# **Feature Selection**

In [8]:
df_features = df[['type','title','cast','country','rating','duration']].fillna('')
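
The `fillna('')` call matters because the selected columns are later joined with `+`. A minimal sketch of what happens without it, using a hypothetical two-row frame (the titles and country are made up for illustration):

```python
import pandas as pd

# Hypothetical two-row frame, for illustration only.
toy = pd.DataFrame({
    "title": ["Title A", "Title B"],
    "country": [None, "Canada"],
})

# A missing value on either side of + makes the whole result missing.
without_fill = toy["title"] + " " + toy["country"]

# Replacing missing values with an empty string keeps the row usable.
with_fill = toy["title"] + " " + toy["country"].fillna("")

print(without_fill.isna().tolist())
print(with_fill.tolist())
```

In this dataset `country` is missing for most rows, so skipping the fill would leave most of the combined strings missing before vectorization.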

# **Inspect and Combine Features**

In [9]:
df_features.shape

(9668, 6)

In [10]:
df_features

,type,title,cast,country,rating,duration
0,Movie,The Grand Seduction,"Brendan Gleeson, Taylor Kitsch, Gordon Pinsent",Canada,,113 min
1,Movie,Take Care Good Night,"Mahesh Manjrekar, Abhay Mahajan, Sachin Khedekar",India,13+,110 min
2,Movie,Secrets of Deception,"Tom Sizemore, Lorenzo Lamas, Robert LaSardo, R...",United States,,74 min
3,Movie,Pink: Staying True,"Interviews with: Pink, Adele, Beyoncé, Britney...",United States,,69 min
4,Movie,Monster Maker,"Harry Dean Stanton, Kieran O'Brien, George Cos...",United Kingdom,,45 min
...,...,...,...,...,...,...
9663,Movie,Pride Of The Bowery,"Leo Gorcey, Bobby Jordan",,7+,60 min
9664,TV Show,Planet Patrol,"DICK VOSBURGH, RONNIE STEVENS, LIBBY MORRIS, M...",,13+,4 Seasons
9665,Movie,Outpost,"Ray Stevenson, Julian Wadham, Richard Brake, M...",,R,90 min
9666,TV Show,Maradona: Blessed Dream,"Esteban Recagno, Ezequiel Stremiz, Luciano Vit...",,TV-MA,1 Season


In [11]:
X = df_features['type'] +' '+ df_features['title'] +' '+ df_features['cast'] +' '+ df_features['country'] +' '+ df_features['rating'] +' '+ df_features['duration']


In [12]:
X

0       Movie The Grand Seduction Brendan Gleeson, Tay...
1       Movie Take Care Good Night Mahesh Manjrekar, A...
2       Movie Secrets of Deception Tom Sizemore, Loren...
3       Movie Pink: Staying True Interviews with: Pink...
4       Movie Monster Maker Harry Dean Stanton, Kieran...
                              ...                        
9663    Movie Pride Of The Bowery Leo Gorcey, Bobby Jo...
9664    TV Show Planet Patrol DICK VOSBURGH, RONNIE ST...
9665    Movie Outpost Ray Stevenson, Julian Wadham, Ri...
9666    TV Show Maradona: Blessed Dream Esteban Recagn...
9667    Movie Harry Brown Michael Caine, Emily Mortime...
Length: 9668, dtype: object

In [13]:
X.shape

(9668,)

# **Convert Feature Text into TF-IDF Vectors**

In [14]:
from sklearn.feature_extraction.text import TfidfVectorizer

In [15]:
tfidf = TfidfVectorizer()

In [16]:
X = tfidf.fit_transform(X)

In [17]:
X.shape

(9668, 32412)
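
Each of the 32412 columns corresponds to one distinct token in the combined text, and each row holds that title's token weights. The sketch below reimplements the weighting in plain Python to show where the numbers come from. It assumes the scikit-learn defaults (lowercasing, tokens of two or more word characters, smoothed IDF, L2-normalized rows); the helper names and the toy strings are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase, then keep tokens of two or more word characters.
    return re.findall(r"\b\w\w+\b", text.lower())

def tfidf_rows(documents):
    """Return (vocabulary, rows) where each row maps token -> TF-IDF weight."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(documents)
    # Number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency: rarer terms get larger weights.
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocabulary}
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights = {t: counts[t] * idf[t] for t in counts}
        # L2-normalize so every non-empty row has unit length.
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        rows.append({t: w / norm for t, w in weights.items()})
    return vocabulary, rows

# Hypothetical combined feature strings, for illustration only.
toy_docs = [
    "Movie Monster Maker 45 min",
    "Movie Mister Maker 60 min",
    "TV Show Planet Patrol 4 Seasons",
]
vocabulary, rows = tfidf_rows(toy_docs)
print(len(vocabulary))
print(rows[0])
```

Tokens shared by many titles, such as `movie` or `min`, receive lower weights than tokens that appear in a single title, which is why cast and title words dominate the similarity scores.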

In [18]:
print(X)

  (0, 19386)	0.050913457569742304
  (0, 21)	0.2583525659117182
  (0, 4893)	0.2724842231586582
  (0, 22553)	0.3521622662921497
  (0, 11290)	0.2501981152620488
  (0, 15600)	0.3812740838843488
  (0, 28756)	0.2277669313148968
  (0, 11104)	0.3351329446728788
  (0, 4251)	0.2757585669605611
  (0, 25883)	0.3983034055036198
  (0, 11410)	0.3351329446728788
  (0, 29015)	0.10314125698163805
  (0, 19915)	0.05093494617336866
  (1, 18)	0.23370611393787347
  (1, 41)	0.09141676444641544
  (1, 13403)	0.1814074458298866
  (1, 15376)	0.31148148279952587
  (1, 25028)	0.2915411847572299
  (1, 17761)	0.3543662862427581
  (1, 420)	0.32129164965314905
  (1, 18072)	0.32129164965314905
  (1, 17783)	0.27485572370066136
  (1, 20792)	0.22718508585629768
  (1, 11247)	0.2619645973439958
  (1, 5005)	0.32129164965314905
  :	:
  (9666, 16503)	0.12716202774103505
  (9666, 16067)	0.17854081404177735
  (9666, 6533)	0.1824230021181038
  (9666, 21648)	0.1551479711845116
  (9666, 24538)	0.1824230021181038
  (9666, 17530)	0.12

# **Get Similarity**

In [19]:
from sklearn.metrics.pairwise import cosine_similarity

In [20]:
Similarity_score = cosine_similarity(X)

In [21]:
Similarity_score

array([[1.        , 0.00482052, 0.00427903, ..., 0.00487692, 0.        ,
        0.00568795],
       [0.00482052, 1.        , 0.00397705, ..., 0.00453274, 0.        ,
        0.00528653],
       [0.00427903, 0.00397705, 1.        , ..., 0.03308455, 0.        ,
        0.0046927 ],
       ...,
       [0.00487692, 0.00453274, 0.03308455, ..., 1.        , 0.        ,
        0.03356218],
       [0.        , 0.        , 0.        , ..., 0.        , 1.        ,
        0.        ],
       [0.00568795, 0.00528653, 0.0046927 , ..., 0.03356218, 0.        ,
        1.        ]])

In [22]:
Similarity_score.shape

(9668, 9668)
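
Entry `[i][j]` of this matrix is the cosine of the angle between the TF-IDF vectors of titles `i` and `j`. A minimal plain-Python sketch of the formula, with hypothetical toy vectors:

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical token-weight vectors, for illustration only.
a = [1.0, 1.0, 0.0]
b = [1.0, 0.0, 0.0]
c = [0.0, 0.0, 2.0]

print(cosine(a, a))  # a vector compared with itself
print(cosine(a, b))  # one shared token
print(cosine(a, c))  # no shared tokens
```

This explains the pattern in the output above: the diagonal is 1 because every title is compared with itself, the matrix is symmetric, and an entry of 0 means the two titles share no tokens.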

# **Get Recommendations**

In [23]:
Favourite_movie_name = input('Enter your favourite movie name:')

Enter your favourite movie name:monster maker


In [24]:
All_movie_titles_list = df['title'].tolist()

In [25]:
import difflib

In [26]:
Movie_Recommendation = difflib.get_close_matches(Favourite_movie_name, All_movie_titles_list)

In [27]:
print(Movie_Recommendation)

['Monster Maker', 'Mister Maker', 'Monster']
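
`difflib.get_close_matches` returns at most `n` candidates (default 3) whose similarity ratio is at least `cutoff` (default 0.6), best match first, and it returns an empty list when nothing qualifies. The comparison is case-sensitive. The sketch below shows the defaults explicitly and adds a hypothetical helper, `closest_title`, that lowercases both sides and handles the no-match case; the short title list is for illustration only.

```python
import difflib

# Short list for illustration; the notebook uses every title in the dataset.
titles = ["Monster Maker", "Mister Maker", "Monster", "Outpost"]

# n and cutoff are written out at their default values.
matches = difflib.get_close_matches("monster maker", titles, n=3, cutoff=0.6)
print(matches)

def closest_title(query, candidates):
    """Hypothetical helper: case-insensitive best match, or None if no match."""
    lowered = {title.lower(): title for title in candidates}
    hits = difflib.get_close_matches(query.lower(), list(lowered), n=1)
    return lowered[hits[0]] if hits else None

print(closest_title("monster maker", titles))
print(closest_title("zzzz", titles))
```

Checking for an empty result before indexing with `[0]` avoids an `IndexError` when the user types a title that resembles nothing in the dataset.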


In [28]:
close_match = Movie_Recommendation[0]

In [29]:
print(close_match)

Monster Maker


In [30]:
Index_of_close_match_movie = df[df.title == close_match].index[0]

In [31]:
print(f"Value of Index_of_close_match_movie: {Index_of_close_match_movie}")
print(f"Length of Similarity_score: {len(Similarity_score)}")

Value of Index_of_close_match_movie: 4
Length of Similarity_score: 9668


In [32]:
len(Similarity_score)

9668

In [33]:
# Row index of the matched title in df (also its row in Similarity_score).
Index_of_close_match_movie = df[df.title == close_match].index[0]
# Pair every row index with its similarity to the matched movie, highest score first.
Sorted_similar_movies = sorted(enumerate(Similarity_score[Index_of_close_match_movie]), key = lambda x:x[1], reverse = True)

In [44]:
print('Top 10 movies suggested for you :\n')
i = 1
# The matched movie itself normally ranks first, since its self-similarity is 1.
for movie in Sorted_similar_movies:
    index = movie[0]  # row index of the candidate movie in df
    title_from_index = df[df.index == index]['title'].values
    if len(title_from_index) > 0:
        title_from_index = title_from_index[0]
        if i <= 10:
            print(i, '.', title_from_index)
            i += 1
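
The lookup and ranking steps can also be wrapped in one reusable function. The following is a minimal sketch under stated assumptions, not the notebook's own implementation: `recommend` is a hypothetical name, it takes a plain list of titles and any row-indexable similarity matrix, and it leaves the queried movie out of its own results. The toy titles and scores are made up for illustration.

```python
import difflib

def recommend(query, titles, similarity, top_n=10):
    """Return up to top_n (title, score) pairs most similar to the best match for query."""
    matches = difflib.get_close_matches(query, titles, n=1)
    if not matches:
        return []  # nothing in the catalogue resembles the query
    movie_index = titles.index(matches[0])
    # Pair each row index with its similarity to the matched movie.
    scored = list(enumerate(similarity[movie_index]))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    # Skip the matched movie itself, then keep the top_n best scores.
    ranked = [(titles[i], score) for i, score in scored if i != movie_index]
    return ranked[:top_n]

# Hypothetical titles and similarity scores, for illustration only.
toy_titles = ["Monster Maker", "Mister Maker", "Outpost"]
toy_similarity = [
    [1.0, 0.4, 0.1],
    [0.4, 1.0, 0.0],
    [0.1, 0.0, 1.0],
]

print(recommend("Monster Maker", toy_titles, toy_similarity, top_n=2))
```

With the notebook's variables, the call would look like `recommend(Favourite_movie_name, All_movie_titles_list, Similarity_score)`.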
