# Hybrid Movie Recommendation System Demo

<i>The following recommendation system uses a hybrid approach, which combines collaborative filtering with content-based filtering. The combined scores generally give better recommendations than either method used alone.</i>

![img_reco](https://www.researchgate.net/profile/Marwa_Mohamed49/publication/331063850/figure/fig3/AS:729493727621125@1550936266704/Content-based-filtering-and-Collaborative-filtering-recommendation.ppm)

In [5]:
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

# Load Feature Vectors

In [12]:
latent_matrix_1_df = pd.read_csv('LFeature1.csv',index_col=0)
latent_matrix_2_df = pd.read_csv('LFeature2.csv',index_col=0)
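
The hybrid score computed later adds the two similarity arrays element by element, so both feature matrices need the same movie titles in the same row order. The following is a minimal sketch of a check for that, using small made-up frames in place of `LFeature1.csv` and `LFeature2.csv`. The helper name `align_to` and the toy titles are hypothetical and not part of the original notebook.

```python
import pandas as pd

def align_to(reference_df: pd.DataFrame, other_df: pd.DataFrame) -> pd.DataFrame:
    """Return other_df with its rows reordered to match reference_df's index.

    Raises ValueError if the two frames do not cover the same titles.
    """
    mismatch = reference_df.index.symmetric_difference(other_df.index)
    if len(mismatch) > 0:
        raise ValueError(f"{len(mismatch)} titles appear in only one matrix")
    return other_df.loc[reference_df.index]

# Hypothetical toy data: same titles, different row order
first_df = pd.DataFrame([[1.0, 0.0], [0.0, 1.0]], index=["Movie A", "Movie B"])
second_df = pd.DataFrame([[0.2, 0.8], [0.9, 0.1]], index=["Movie B", "Movie A"])

second_aligned = align_to(first_df, second_df)
print(list(second_aligned.index))
```

With the real data, the same call would be `align_to(latent_matrix_1_df, latent_matrix_2_df)`, assuming every title is unique in both files.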

# Movies in the dataset

#### Sample of movies in the dataset

In [17]:
latent_matrix_1_df.index[:50]

Index(['Toy Story (1995)', 'Jumanji (1995)', 'Grumpier Old Men (1995)',
       'Waiting to Exhale (1995)', 'Father of the Bride Part II (1995)',
       'Heat (1995)', 'Sabrina (1995)', 'Tom and Huck (1995)',
       'Sudden Death (1995)', 'GoldenEye (1995)',
       'American President, The (1995)', 'Dracula: Dead and Loving It (1995)',
       'Balto (1995)', 'Nixon (1995)', 'Cutthroat Island (1995)',
       'Casino (1995)', 'Sense and Sensibility (1995)', 'Four Rooms (1995)',
       'Ace Ventura: When Nature Calls (1995)', 'Money Train (1995)',
       'Get Shorty (1995)', 'Copycat (1995)', 'Assassins (1995)',
       'Powder (1995)', 'Leaving Las Vegas (1995)', 'Othello (1995)',
       'Now and Then (1995)', 'Persuasion (1995)',
       'City of Lost Children, The (Cité des enfants perdus, La) (1995)',
       'Shanghai Triad (Yao a yao yao dao waipo qiao) (1995)',
       'Dangerous Minds (1995)', 'Twelve Monkeys (a.k.a. 12 Monkeys) (1995)',
       'Wings of Courage (1995)', 'Babe (1995)',

![hybrid](https://miro.medium.com/max/635/1*XH3CT3gwQtwtOLvL-n48pg.jpeg)

In [13]:
def getSimilarity(movie_name,index):
    a_1 = np.array(latent_matrix_1_df.loc[movie_name]).reshape(1, -1)
    a_2 = np.array(latent_matrix_2_df.loc[movie_name]).reshape(1, -1)

    # calculate the similarity of this movie with every movie in the dataset
    score_1 = cosine_similarity(latent_matrix_1_df, a_1).reshape(-1)
    score_2 = cosine_similarity(latent_matrix_2_df, a_2).reshape(-1)

    # hybrid score: unweighted average of the content and collaborative scores
    hybrid = (score_1 + score_2) / 2.0

    # one-row data frame of scores, with one column per movie title
    similarity = pd.DataFrame(columns=latent_matrix_1_df.index)
    similarity.loc[index] = hybrid

    return similarity
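
To make the averaging step concrete, here is a small worked sketch on made-up data. The three titles and the two 2-dimensional latent matrices are hypothetical, and the cosine similarity is written out with NumPy rather than scikit-learn so each step is visible. For non-zero vectors it should match `cosine_similarity`.

```python
import numpy as np
import pandas as pd

def cosine_to_all(matrix: np.ndarray, query: np.ndarray) -> np.ndarray:
    """Cosine similarity between every row of `matrix` and the vector `query`."""
    norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(query)
    return (matrix @ query) / norms

# Hypothetical latent features for three movies in two feature spaces
titles = ["Movie A", "Movie B", "Movie C"]
latent_1 = pd.DataFrame([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]], index=titles)
latent_2 = pd.DataFrame([[0.0, 2.0], [0.0, 1.0], [3.0, 0.0]], index=titles)

query = "Movie A"
score_1 = cosine_to_all(latent_1.to_numpy(), latent_1.loc[query].to_numpy())
score_2 = cosine_to_all(latent_2.to_numpy(), latent_2.loc[query].to_numpy())

# Unweighted average of the two scores, as in getSimilarity
hybrid = pd.Series((score_1 + score_2) / 2.0, index=titles)
print(hybrid.round(3))
```

In this toy case "Movie A" scores 1.0 against itself, "Movie B" scores about 0.854 because it is a partial match in the first space and a perfect match in the second, and "Movie C" scores 0.0 because it is orthogonal to the query in both spaces.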


## Append movies to the list to get recommendations

In [18]:
list_of_movies = ['Toy Story (1995)', 'Jumanji (1995)','Alice in Wonderland (2010)',
                  'Ace Ventura: When Nature Calls (1995)',
                  'Shanghai Triad (Yao a yao yao dao waipo qiao) (1995)']
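# DataFrame.append was removed in pandas 2.0, so build one row per movie and concatenate
Rec_Df = pd.concat([getSimilarity(movie, index)
                    for index, movie in enumerate(list_of_movies)])

Rec_Df.head(10)
# sum the scores per movie, then skip the top len(list_of_movies) entries,
# which are expected to be the input movies themselves
Rec_Df.sum().sort_values(ascending=False).head(20)[len(list_of_movies):]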

Dumb & Dumber (Dumb and Dumber) (1994)      1.757729
Liar Liar (1997)                            1.725731
Cable Guy, The (1996)                       1.674836
Ice Age (2002)                              1.626564
Bruce Almighty (2003)                       1.621566
Aladdin (1992)                              1.613701
Mrs. Doubtfire (1993)                       1.605806
Batman Forever (1995)                       1.605254
Alice in Wonderland (2010)                  1.597173
Charlie and the Chocolate Factory (2005)    1.559272
Finding Nemo (2003)                         1.556854
Antz (1998)                                 1.546802
Toy Story 2 (1999)                          1.539201
James and the Giant Peach (1996)            1.538228
Lion King, The (1994)                       1.537001
dtype: float64
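
In the output above, 'Alice in Wonderland (2010)' is one of the input movies yet still appears among the recommendations. Skipping the first `len(list_of_movies)` rows only hides the inputs when they occupy the top positions. A possible alternative, sketched here on made-up scores, is to drop the input titles by label before ranking. The helper name `top_recommendations` and the toy titles are hypothetical.

```python
import pandas as pd

def top_recommendations(scores: pd.Series, seen: list, n: int = 15) -> pd.Series:
    """Rank titles by summed score after dropping the titles the user supplied."""
    unseen = scores.drop(labels=seen, errors="ignore")
    return unseen.sort_values(ascending=False).head(n)

# Hypothetical summed scores; "Movie B" is an input that does not rank near the top
summed = pd.Series({"Movie A": 2.9, "Movie B": 1.2, "Movie C": 2.5, "Movie D": 1.9})

recs = top_recommendations(summed, seen=["Movie A", "Movie B"], n=2)
print(list(recs.index))
```

With the notebook's data, the equivalent call would be `top_recommendations(Rec_Df.sum(), list_of_movies)`.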

## Conclusion

Compared to simple collaborative filtering or simple content-based filtering, the hybrid approach gives better results because it addresses problems such as cold start and data sparsity. Content-based filtering supports the model in cases where ratings are not a strong indicator for a particular item.

*This notebook is for demo purposes. To create your own recommendation system, please refer to the Python script in the repo.*