# Content-Based Recommender System using Cosine Similarity

A recommender system that ranks items by the cosine similarity of their feature vectors is a common form of **Content-Based Recommender System**. Content-based recommenders use information about items, and optionally about users' past preferences, to make recommendations. With text data such as movie descriptions, product descriptions, or user profiles, cosine similarity measures how similar two items are, or how closely an item matches a user's preferences.

## Overview

Here's a basic overview of how a content-based recommender system using cosine similarity might work:

### Item Representation

- Each item (movie, product, etc.) is represented as a vector in a high-dimensional space.
- The elements of the vector correspond to features or attributes of the item. For example, in the case of movies, features might include keywords, genres, or other relevant information.
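
To make the vector representation concrete, here is a minimal pure-Python sketch of the bag-of-words idea: each distinct word becomes one vector position, and the value is how often that word occurs in the item's text. The movie names and descriptions are invented for illustration, and `to_vector` is a hypothetical helper. The notebook itself relies on scikit-learn's `CountVectorizer`, which additionally handles tokenization and stop words.

```python
from collections import Counter

# Hypothetical item descriptions, invented for illustration only.
descriptions = {
    "Movie A": "space alien adventure space",
    "Movie B": "street racing car heist",
    "Movie C": "car chase racing revenge",
}

# The vocabulary is the sorted set of all words; each word owns one vector position.
vocabulary = sorted({word for text in descriptions.values() for word in text.split()})

def to_vector(text):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.split())
    return [counts[word] for word in vocabulary]

item_vectors = {title: to_vector(text) for title, text in descriptions.items()}

print(vocabulary)
for title, vector in item_vectors.items():
    print(title, vector)
```

Items that share words ("car", "racing") end up with non-zero values in the same positions, which is what the similarity step later picks up on.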

### User Profile Representation

- The user's preferences are also represented as a vector in the same space.
- The vector is constructed based on the user's interactions, such as items they have liked or rated in the past.
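
One simple way to build such a profile, sketched below, is to average the vectors of the items the user liked. This is only one possible choice (a rating-weighted average is another), and the vectors, titles, and the `build_profile` helper are assumptions made up for the example.

```python
# Hypothetical item vectors over a shared 4-feature space (values are made up).
item_vectors = {
    "Movie A": [1, 0, 2, 0],
    "Movie B": [0, 1, 1, 0],
    "Movie C": [0, 0, 0, 3],
}
liked = ["Movie A", "Movie B"]  # items the user rated positively (assumed)

def build_profile(liked_titles, vectors):
    """Average the liked items' vectors component by component."""
    liked_vectors = [vectors[title] for title in liked_titles]
    return [sum(column) / len(liked_vectors) for column in zip(*liked_vectors)]

user_profile = build_profile(liked, item_vectors)
print(user_profile)  # [0.5, 0.5, 1.5, 0.0]
```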

### Cosine Similarity Calculation

- Cosine similarity measures how closely the user's profile vector points in the same direction as each item vector, regardless of vector length.
- Values closer to 1 indicate greater similarity; for non-negative vectors such as word counts, the value lies between 0 and 1.
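
For two vectors a and b, cosine similarity is the dot product divided by the product of the vector lengths: cos(θ) = (a · b) / (‖a‖ ‖b‖). The sketch below implements that formula with the standard library only. The function name `cosine` is hypothetical, and returning 0.0 for an all-zero vector is a convention chosen for this example.

```python
import math

def cosine(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)

same_direction = cosine([1, 0, 1], [2, 0, 2])  # parallel vectors, close to 1
orthogonal = cosine([1, 0, 0], [0, 1, 0])      # no shared features, exactly 0
print(same_direction, orthogonal)
```

Because the result depends only on direction, a short description and a long one that use the same words in the same proportions score as highly similar.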

### Recommendation Generation

- Items with the highest cosine similarity to the user's profile are recommended.
- The system can recommend items that the user has not yet interacted with but that are similar to ones they have liked in the past.
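
Put together, recommendation generation amounts to scoring every unseen item against the profile and keeping the top few. The following is a minimal sketch under assumed data: the item vectors, the profile, the `already_seen` set, and the helper names `cosine` and `recommend` are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0

# Hypothetical item vectors and user profile in the same 4-feature space.
item_vectors = {
    "Movie A": [1, 0, 2, 0],
    "Movie B": [0, 1, 1, 0],
    "Movie C": [1, 1, 2, 0],
    "Movie D": [0, 0, 0, 3],
}
user_profile = [0.5, 0.5, 1.5, 0.0]
already_seen = {"Movie A", "Movie B"}

def recommend(profile, vectors, seen, top_n=2):
    """Rank unseen items by similarity to the profile, most similar first."""
    scored = [
        (title, cosine(profile, vector))
        for title, vector in vectors.items()
        if title not in seen
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

recommendations = recommend(user_profile, item_vectors, already_seen)
for title, score in recommendations:
    print(f"{title}: {score:.3f}")
```

Here "Movie C" ranks first because it shares features with the liked items, while "Movie D" shares none and scores 0.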


In [17]:
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import random

In [18]:
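# Load the TMDB movie metadata. The 'soup' column is assumed to hold each
# movie's text features combined into a single string.
df2 = pd.read_csv('./model/tmdb.csv')

# Turn each 'soup' string into a vector of word counts, ignoring English stop words.
count = CountVectorizer(stop_words='english')
count_matrix = count.fit_transform(df2['soup'])

# Pairwise cosine similarity between all movies, computed once.
cosine_sim2 = cosine_similarity(count_matrix, count_matrix)

# Map each title to its row position in df2.
df2 = df2.reset_index()
indices = pd.Series(df2.index, index=df2['title'])
all_titles = df2['title'].tolist()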

def get_recommendations(title):
    # Row position of the requested title (assumes the title occurs exactly once).
    idx = indices[title]
    # Reuse the precomputed similarity matrix instead of rebuilding it on every call.
    sim_scores = list(enumerate(cosine_sim2[idx]))
    # Sort movies from most to least similar.
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    # Drop the top entry (normally the movie itself) and keep the next ten.
    sim_scores = sim_scores[1:11]
    movie_indices = [i[0] for i in sim_scores]

    # Select the detail columns for the recommended movies and rename them for display.
    columns = {
        'title': 'Title',
        'release_date': 'Year',  # holds the full release date, not just the year
        'vote_average': 'Ratings',
        'overview': 'Overview',
        'keywords': 'Types',
        'id': 'ID',
    }
    return df2.iloc[movie_indices][list(columns)].rename(columns=columns)

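def get_suggestions():
    # Titles for a suggestion list. Note that str.capitalize() lowercases every
    # character after the first, so these strings differ from the keys of `indices`.
    data = pd.read_csv('tmdb.csv')
    return list(data['title'].str.capitalize())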
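
Because `get_suggestions` capitalizes only the first character, a suggestion such as 'The dark knight rises' no longer matches the original-case titles used as keys in `indices`. One possible way to bridge that gap is a case-insensitive lookup. The sketch below uses a plain list of made-up titles standing in for `df2['title']`, and `find_index` is a hypothetical helper, not part of the notebook.

```python
# Hypothetical titles in their original casing, standing in for df2['title'].
titles = ["Avatar", "The Dark Knight Rises", "Furious 7"]

# Map each case-folded title to its row position.
# If two titles fold to the same string, the later one wins.
index_by_folded_title = {
    title.casefold(): position for position, title in enumerate(titles)
}

def find_index(query):
    """Return the row position for a title, ignoring case; None if unknown."""
    return index_by_folded_title.get(query.strip().casefold())

print(find_index("The dark knight rises"))  # 1
print(find_index("Unknown title"))          # None
```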

In [19]:
df2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4803 entries, 0 to 4802
Data columns (total 27 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   index                 4803 non-null   int64  
 1   Unnamed: 0            4803 non-null   int64  
 2   budget                4803 non-null   int64  
 3   genres                4803 non-null   object 
 4   homepage              1712 non-null   object 
 5   id                    4803 non-null   int64  
 6   keywords              4803 non-null   object 
 7   original_language     4803 non-null   object 
 8   original_title        4803 non-null   object 
 9   overview              4800 non-null   object 
 10  popularity            4803 non-null   float64
 11  production_companies  4803 non-null   object 
 12  production_countries  4803 non-null   object 
 13  release_date          4802 non-null   object 
 14  revenue               4803 non-null   int64  
 15  runtime              

In [20]:
suggestion = get_suggestions()
suggestion

['Avatar',
 "Pirates of the caribbean: at world's end",
 'Spectre',
 'The dark knight rises',
 'John carter',
 'Spider-man 3',
 'Tangled',
 'Avengers: age of ultron',
 'Harry potter and the half-blood prince',
 'Batman v superman: dawn of justice',
 'Superman returns',
 'Quantum of solace',
 "Pirates of the caribbean: dead man's chest",
 'The lone ranger',
 'Man of steel',
 'The chronicles of narnia: prince caspian',
 'The avengers',
 'Pirates of the caribbean: on stranger tides',
 'Men in black 3',
 'The hobbit: the battle of the five armies',
 'The amazing spider-man',
 'Robin hood',
 'The hobbit: the desolation of smaug',
 'The golden compass',
 'King kong',
 'Titanic',
 'Captain america: civil war',
 'Battleship',
 'Jurassic world',
 'Skyfall',
 'Spider-man 2',
 'Iron man 3',
 'Alice in wonderland',
 'X-men: the last stand',
 'Monsters university',
 'Transformers: revenge of the fallen',
 'Transformers: age of extinction',
 'Oz: the great and powerful',
 'The amazing spider-man 2',

In [21]:
movie_name = 'Furious 7'
recommendation=get_recommendations(movie_name)
recommendation

|  | Title | Year | Ratings | Overview | Types | ID |
| --- | --- | --- | --- | --- | --- | --- |
| 99 | The Fast and the Furious | 2001-06-22 | 6.6 | Domenic Toretto is a Los Angeles street racer ... | ['streetgang', 'carrace', 'undercover'] | 9799 |
| 204 | Fast Five | 2011-04-20 | 7.1 | Former cop Brian O'Conner partners with ex-con... | ['brazil', 'fbi', 'freedom'] | 51497 |
| 500 | 2 Fast 2 Furious | 2003-06-05 | 6.2 | It's a major double-cross when former police o... | ['miami', 'carrace', 'sportscar'] | 584 |
| 1319 | Riddick | 2013-09-02 | 6.2 | Betrayed by his own kind and left for dead on ... | ['dystopia', 'revenge', 'alien'] | 87421 |
| 2218 | Death Sentence | 2007-08-31 | 6.5 | Nick Hume is a mild-mannered executive with a ... | ['lossofson', 'repayment', 'revenge'] | 11835 |
| 1986 | Faster | 2010-11-23 | 6.1 | Driver (Dwayne Johnson) has spent the last 10 ... | [] | 41283 |
| 223 | The Chronicles of Riddick | 2004-06-11 | 6.3 | After years of outrunning ruthless bounty hunt... | ['prison', 'dystopia', 'matteroflifeanddeath'] | 2789 |
| 304 | Hercules | 2014-07-23 | 5.6 | Fourteen hundred years ago, a tormented soul w... | ['mercenary', 'battle', 'ancientgreece'] | 184315 |
| 715 | The Scorpion King | 2002-04-16 | 5.3 | In ancient Egypt, peasant Mathayus is hired to... | ['egypt', 'temple'] | 9334 |
| 914 | Central Intelligence | 2016-06-15 | 6.2 | After he reunites with an old pal through Face... | ['spy', 'cia', 'espionage'] | 302699 |
