In [1]:
import pymongo
import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

In [2]:
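# Connect to the local MongoDB instance (default host and port)
myclient = pymongo.MongoClient()
mydb = myclient["mflix"]
mycol = mydb["movies"]

# Cursor over every document in the movies collection
mydoc = mycol.find()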

In [3]:
df = pd.DataFrame(list(mydoc))
df.head()

,_id,plot,genres,runtime,cast,num_mflix_comments,title,fullplot,countries,released,...,awards,lastupdated,year,imdb,type,tomatoes,poster,languages,writers,metacritic
0,573a1390f29313caabcd4135,Three men hammer on an anvil and pass a bottle...,[Short],1.0,"[Charles Kayser, John Ott]",1.0,Blacksmith Scene,A stationary camera looks at a large anvil wit...,[USA],1893-05-09,...,"{'wins': 1, 'nominations': 0, 'text': '1 win.'}",2015-08-26 00:03:50.133000000,1893,"{'rating': 6.2, 'votes': 1189, 'id': 5}",movie,"{'viewer': {'rating': 3, 'numReviews': 184, 'm...",,,,
1,573a1390f29313caabcd42e8,A group of bandits stage a brazen train hold-u...,"[Short, Western]",11.0,"[A.C. Abadie, Gilbert M. 'Broncho Billy' Ander...",,The Great Train Robbery,Among the earliest existing films in American ...,[USA],1903-12-01,...,"{'wins': 1, 'nominations': 0, 'text': '1 win.'}",2015-08-13 00:27:59.177000000,1903,"{'rating': 7.4, 'votes': 9847, 'id': 439}",movie,"{'viewer': {'rating': 3.7, 'numReviews': 2559,...",https://m.media-amazon.com/images/M/MV5BMTU3Nj...,[English],,
2,573a1390f29313caabcd4323,"A young boy, opressed by his mother, goes on a...","[Short, Drama, Fantasy]",14.0,"[Martin Fuller, Mrs. William Bechtel, Walter E...",2.0,The Land Beyond the Sunset,"Thanks to the Fresh Air Fund, a slum child esc...",[USA],1912-10-28,...,"{'wins': 1, 'nominations': 0, 'text': '1 win.'}",2015-08-29 00:27:45.437000000,1912,"{'rating': 7.1, 'votes': 448, 'id': 488}",movie,"{'viewer': {'rating': 3.7, 'numReviews': 53, '...",https://m.media-amazon.com/images/M/MV5BMTMzMD...,[English],[Dorothy G. Shore],
3,573a1390f29313caabcd446f,"A greedy tycoon decides, on a whim, to corner ...","[Short, Drama]",14.0,"[Frank Powell, Grace Henderson, James Kirkwood...",1.0,A Corner in Wheat,"A greedy tycoon decides, on a whim, to corner ...",[USA],1909-12-13,...,"{'wins': 1, 'nominations': 0, 'text': '1 win.'}",2015-08-13 00:46:30.660000000,1909,"{'rating': 6.6, 'votes': 1375, 'id': 832}",movie,"{'viewer': {'rating': 3.6, 'numReviews': 109, ...",,[English],,
4,573a1390f29313caabcd5501,"A venal, spoiled stockbroker's wife impulsivel...",[Drama],59.0,"[Fannie Ward, Sessue Hayakawa, Jack Dean, Jame...",,The Cheat,Edith Hardy uses charity funds for Wall Street...,[USA],1915-12-13,...,"{'wins': 1, 'nominations': 0, 'text': '1 win.'}",2015-08-31 00:41:20.670000000,1915,"{'rating': 6.5, 'votes': 1660, 'id': 5078}",movie,"{'viewer': {'rating': 3.2, 'numReviews': 423, ...",https://m.media-amazon.com/images/M/MV5BMjEzMj...,[English],"[Hector Turnbull (scenario), Jeanie Macpherson...",


In [4]:
# Films without a full plot get an empty string so the vectorizer never sees NaN
df['fullplot'] = df['fullplot'].fillna('')

In [5]:
# min_df=1 keeps every term; an integer min_df of 0 is rejected by recent scikit-learn releases
tf = TfidfVectorizer(analyzer='word', ngram_range=(1, 2), min_df=1, stop_words='english')
tfidf_matrix = tf.fit_transform(df['fullplot'])

In [6]:
cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)
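
`linear_kernel` computes plain dot products, yet the result is named `cosine_sim`. That works because `TfidfVectorizer` L2-normalizes each row by default (`norm='l2'`), so the dot product of two rows is already their cosine similarity. A minimal check on a toy corpus (the plots below are invented for illustration):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# Hypothetical toy corpus, standing in for df['fullplot'].
toy_plots = [
    "a masked hero protects the city from a villain",
    "a villain plots to destroy the city",
    "two friends search for a lost college friend",
]
toy_matrix = TfidfVectorizer(stop_words="english").fit_transform(toy_plots)

# Every non-empty TF-IDF row has unit L2 norm ...
row_norms = np.sqrt(np.asarray(toy_matrix.multiply(toy_matrix).sum(axis=1))).ravel()

# ... so the dot products match the cosine similarities.
dot_products = linear_kernel(toy_matrix, toy_matrix)
cosines = cosine_similarity(toy_matrix, toy_matrix)
print(np.allclose(dot_products, cosines))
```

`linear_kernel` skips the normalization step that `cosine_similarity` would repeat, which is why it is the cheaper call here.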

In [7]:
titles = df['title']
indices = pd.Series(df.index, index=df['title'].str.lower())
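
Because the lookup is keyed by title, a title shared by several films (such as 'The Avengers') returns several row positions instead of one. The sketch below uses a hypothetical three-row catalogue to show both cases, and one way to treat them uniformly with `np.atleast_1d`:

```python
import numpy as np
import pandas as pd

# Hypothetical mini catalogue with one duplicated title.
toy = pd.DataFrame({"title": ["The Avengers", "Kung Fu Panda", "The Avengers"]})
toy_indices = pd.Series(toy.index, index=toy["title"].str.lower())

unique_hit = toy_indices["kung fu panda"]    # a single NumPy integer
repeated_hit = toy_indices["the avengers"]   # a Series, one entry per match

# np.atleast_1d turns both shapes into a flat array of row positions.
unique_rows = np.atleast_1d(unique_hit)
repeated_rows = np.atleast_1d(repeated_hit)
print(unique_rows, repeated_rows)
```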

In [8]:
def get_recommendations(title):
    """Print the 20 films whose plots are most similar to each film named `title`."""
    rule = '---------------------------------------------------------------------'
    title = title.lower()
    if title not in indices.index:
        print("Movie not found")
        return
    # A title shared by several films yields several row positions;
    # np.atleast_1d makes the single-match case iterable as well.
    for idx in np.atleast_1d(indices[title]):
        # Rank every film by similarity to this one, best first
        sim_scores = sorted(enumerate(cosine_sim[idx]), key=lambda x: x[1], reverse=True)
        # Exclude the film itself by position rather than assuming it ranks first
        movie_indices = [j for j, _ in sim_scores if j != idx][:20]
        movie = df.iloc[idx]
        print(movie['title'], '(' + str(movie['year']) + ')', movie['countries'])
        print(rule)
        print(movie['fullplot'])
        print(rule)
        print()
        print('Recommendations:')
        for rec_title in titles.iloc[movie_indices]:
            print('• ', rec_title)
        print()
        print()
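
The `cosine_sim` matrix from cell 6 stores a score for every pair of films, so its size grows quadratically with the collection. If memory becomes a concern, one alternative is to score a single row on demand. The helper below is a sketch under that assumption; `top_k_similar` and the toy plots are hypothetical names and data, not part of the notebook.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

def top_k_similar(matrix, row, k):
    """Return the k row positions most similar to `row`, best first."""
    # Score one row against all rows instead of building the full matrix.
    scores = linear_kernel(matrix[[row]], matrix).ravel()
    scores[row] = -np.inf                      # never recommend the query itself
    k = min(k, len(scores) - 1)
    top = np.argpartition(-scores, k - 1)[:k]  # the k best, in no particular order
    return top[np.argsort(-scores[top])]       # order those k by score

# Hypothetical toy corpus, standing in for df['fullplot'].
toy_plots = [
    "batman protects gotham city from the joker",
    "the joker returns to terrorize gotham city",
    "a panda learns kung fu in a noodle shop",
    "three friends search for a lost college friend",
]
toy_matrix = TfidfVectorizer(stop_words="english").fit_transform(toy_plots)
nearest = top_k_similar(toy_matrix, 0, 2)
print(nearest)
```

With the notebook's data, a call such as `top_k_similar(tfidf_matrix, idx, 20)` would stand in for the row lookup in `cosine_sim`, at the cost of one sparse product per query.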

In [9]:
get_recommendations('3 Idiots')

3 Idiots (2009) ['India']
---------------------------------------------------------------------
Farhan Qureshi and Raju Rastogi want to re-unite with their fellow collegian, Rancho, after faking a stroke aboard an Air India plane, and excusing himself from his wife - trouser less - respectively. Enroute, they encounter another student, Chatur Ramalingam, now a successful businessman, who reminds them of a bet they had undertaken 10 years ago. The trio, while recollecting hilarious antics, including their run-ins with the Dean of Delhi's Imperial College of Engineering, Viru Sahastrabudhe, race to locate Rancho, at his last known address - little knowing the secret that was kept from them all this time.
---------------------------------------------------------------------

Recommendations:
•  Dil
•  Awaara
•  Hide Away
•  Enemy of the State
•  Khaleja
•  The Chumscrubber
•  The Chumscrubber
•  Polytechnique
•  Halloween 4: The Return of Michael Myers
•  Tian guo ni zi
•  Dirty Rotten Sc

In [10]:
get_recommendations('The Dark Knight Rises')

The Dark Knight Rises (2012) ['USA', 'UK']
---------------------------------------------------------------------
Despite his tarnished reputation after the events of The Dark Knight, in which he took the rap for Dent's crimes, Batman feels compelled to intervene to assist the city and its police force which is struggling to cope with Bane's plans to destroy the city.
---------------------------------------------------------------------

Recommendations:
•  Batman: The Dark Knight Returns, Part 2
•  Batman Forever
•  Batman
•  Courageous
•  The Dark Knight
•  Batman Beyond: Return of the Joker
•  Batman: Gotham Knight
•  Torrente 2: Mission in Marbella
•  The Sniper
•  Batman Returns
•  Batman: Mask of the Phantasm
•  The Seventh Seal
•  The War at Home
•  Project A 2
•  Something from Nothing: The Art of Rap
•  Get Rich or Die Tryin'
•  Europe '51
•  CHiPs '99
•  The Forgotten
•  Crime Busters




In [11]:
get_recommendations('The Avengers')

The Avengers (1998) ['USA']
---------------------------------------------------------------------
British Ministry agent John Steed, under direction from "Mother", investigates a diabolical plot by arch-villain Sir August de Wynter to rule the world with his weather control machine. Steed investigates the beautiful Doctor Mrs. Emma Peel, the only suspect, but simultaneously falls for her and joins forces with her to combat Sir August.
---------------------------------------------------------------------

Recommendations:
•  Monty Python and the Holy Grail
•  Murder by Decree
•  Our Man Flint
•  That Hamilton Woman
•  The Ultimate Christmas Present
•  Quo Vadis, Baby?
•  Water for Elephants
•  It's Me, It's Me
•  The Adventures of Robin Hood
•  The Adventures of Robin Hood
•  Witness for the Prosecution
•  An Ideal Husband
•  Emma
•  Princess
•  Watchmen
•  The Wolf Man
•  The Case of the Whitechapel Vampire
•  Dr. No
•  She Monkeys
•  You're So Cupid!


The Avengers (2012) ['USA']
----

In [12]:
get_recommendations('Kung Fu Panda')

Kung Fu Panda (2008) ['USA']
---------------------------------------------------------------------
It's the story about a lazy, irreverent slacker panda, named Po, who is the biggest fan of Kung Fu around...which doesn't exactly come in handy while working every day in his family's noodle shop. Unexpectedly chosen to fulfill an ancient prophecy, Po's dreams become reality when he joins the world of Kung Fu and studies alongside his idols, the legendary Furious Five -- Tigress, Crane, Mantis, Viper and Monkey -- under the leadership of their guru, Master Shifu. But before they know it, the vengeful and treacherous snow leopard Tai Lung is headed their way, and it's up to Po to defend everyone from the oncoming threat. Can he turn his dreams of becoming a Kung Fu master into reality? Po puts his heart - and his girth - into the task, and the unlikely hero ultimately finds that his greatest weaknesses turn out to be his greatest strengths.
-------------------------------------------------