# Date Night Movie

#### Grading:


- Code: 90 pts
- Markdown Documentation: 10 pts


In this assignment we use pandas to answer one question: what's the best **date-night movie**?

The assignment relies on three pandas techniques:
- Joining
- Groupby
- Sorting


In [398]:
import os
import pandas as pd

##### Read in the movie data: `pd.read_table`

In [399]:
def get_movie_data():
    """Load the users, ratings and movies tables from the '::'-separated .dat files."""
    # engine='python' is stated explicitly because the separator is longer than
    # one character; without it pandas warns and falls back to this engine anyway
    read_opts = dict(encoding="unicode_escape", sep='::', header=None, engine='python')

    unames = ['user_id', 'gender', 'age', 'occupation', 'zip']
    users = pd.read_table(r'../data/users.dat', names=unames, **read_opts)

    rnames = ['user_id', 'movie_id', 'rating', 'timestamp']
    ratings = pd.read_table(r'../data/ratings.dat', names=rnames, **read_opts)

    mnames = ['movie_id', 'title', 'genres']
    movies = pd.read_table(r'../data/movies.dat', names=mnames, **read_opts)

    return users, ratings, movies
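
Because `'::'` is longer than one character, pandas interprets it as a regular expression, which only the Python parsing engine supports; leaving the engine unspecified produces a `ParserWarning` and a fallback to that engine. The sketch below shows the same read on an in-memory sample. The two rows are stand-ins typed in the layout of `users.dat`, `sample_users` is a hypothetical name, and the `dtype` argument is an optional extra (not part of the assignment code) that keeps zip codes as strings.

```python
import io
import pandas as pd

# Stand-in rows in the same '::'-separated layout as users.dat
sample = io.StringIO("1::F::1::10::48067\n2::M::56::16::70072\n")

unames = ['user_id', 'gender', 'age', 'occupation', 'zip']
# engine='python' handles the multi-character separator without a warning;
# dtype={'zip': str} keeps zip codes as text so leading zeros would survive
sample_users = pd.read_csv(sample, sep='::', header=None, names=unames,
                           engine='python', dtype={'zip': str})
print(sample_users)
```

`pd.read_csv` is used here only because it accepts the same arguments as `pd.read_table`; either function works with these options.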

In [400]:
users, ratings, movies = get_movie_data()



In [401]:
users.head()

Unnamed: 0,user_id,gender,age,occupation,zip
0,1,F,1,10,48067
1,2,M,56,16,70072
2,3,M,25,15,55117
3,4,M,45,7,2460
4,5,M,25,20,55455


In [402]:
ratings.head()

Unnamed: 0,user_id,movie_id,rating,timestamp
0,1,1193,5,978300760
1,1,661,3,978302109
2,1,914,3,978301968
3,1,3408,4,978300275
4,1,2355,5,978824291


In [403]:
movies.head()

Unnamed: 0,movie_id,title,genres
0,1,Toy Story (1995),Animation|Children's|Comedy
1,2,Jumanji (1995),Adventure|Children's|Fantasy
2,3,Grumpier Old Men (1995),Comedy|Romance
3,4,Waiting to Exhale (1995),Comedy|Drama
4,5,Father of the Bride Part II (1995),Comedy


##### Clean up the `movies`

- Get the `year`
- Shorten the `title`


In [404]:
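# Split "Title (year)" into the title (column 0) and the year (column 1);
# the raw string keeps the backslashes in the pattern from being read as escapes
tmp = movies.title.str.extract(r'(.*) \(([0-9]+)\)')
# Preview only: the first entry of each extracted column,
# the second call also truncated to 40 characters
tmp.apply(lambda x:x[0] if len(x) > 0 else None)
tmp.apply(lambda x: x[0][:40] if len(x) > 0 else None)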

0    Toy Story
1         1995
dtype: object

In [405]:
movies['year'] = tmp[1]
movies['short_title'] = tmp[0]

In [406]:
movies.head()

Unnamed: 0,movie_id,title,genres,year,short_title
0,1,Toy Story (1995),Animation|Children's|Comedy,1995,Toy Story
1,2,Jumanji (1995),Adventure|Children's|Fantasy,1995,Jumanji
2,3,Grumpier Old Men (1995),Comedy|Romance,1995,Grumpier Old Men
3,4,Waiting to Exhale (1995),Comedy|Drama,1995,Waiting to Exhale
4,5,Father of the Bride Part II (1995),Comedy,1995,Father of the Bride Part II
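
The clean-up step can also be written with named capture groups, which label the extracted columns directly, and with `.str[:40]`, which truncates every title rather than previewing one. This is a sketch on a toy frame, not the assignment's code: `toy` and `parts` are hypothetical names, and the 40-character cap is borrowed from the preview lambda above.

```python
import pandas as pd

# Toy titles in the same "Title (year)" layout as the movies table
toy = pd.DataFrame({'title': [
    'Toy Story (1995)',
    'Jumanji (1995)',
    'Brother Minister: The Assassination of Malcolm X (1994)',
]})

# Named groups become the column names of the extracted frame
parts = toy['title'].str.extract(r'^(?P<short_title>.*) \((?P<year>[0-9]{4})\)$')

toy['year'] = parts['year'].astype(int)             # numeric year instead of text
toy['short_title'] = parts['short_title'].str[:40]  # cap every title at 40 characters
print(toy)
```

Converting `year` to an integer is optional; the notebook keeps it as an `object` column, which is enough for display but not for numeric comparisons.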


##### Join the tables with `pd.merge` (20 pts)

In [407]:
#merging all the three datasets (ratings, users and movies) into one dataframe.
rating_users= pd.merge(ratings,users, on='user_id')
rating_users_movies=pd.merge(rating_users,movies,on='movie_id')
rating_users_movies

Unnamed: 0,user_id,movie_id,rating,timestamp,gender,age,occupation,zip,title,genres,year,short_title
0,1,1193,5,978300760,F,1,10,48067,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
1,2,1193,5,978298413,M,56,16,70072,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
2,12,1193,4,978220179,M,25,12,32793,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
3,15,1193,4,978199279,M,25,7,22903,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
4,17,1193,5,978158471,M,50,1,95350,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
...,...,...,...,...,...,...,...,...,...,...,...,...
1000204,5949,2198,5,958846401,M,18,17,47901,Modulations (1998),Documentary,1998,Modulations
1000205,5675,2703,3,976029116,M,35,14,30030,Broken Vessels (1998),Drama,1998,Broken Vessels
1000206,5780,2845,1,958153068,M,18,17,92886,White Boys (1999),Drama,1999,White Boys
1000207,5851,3607,5,957756608,F,18,20,55410,One Little Indian (1973),Comedy|Drama|Western,1973,One Little Indian
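
Both joins are many-to-one: many ratings point at one user and at one movie. The sketch below, on toy tables with hypothetical names, chains the two merges and uses the `validate` argument of `merge` so that pandas raises an error if the right-hand keys are not unique. It illustrates one way to guard the join and is not required by the assignment.

```python
import pandas as pd

# Toy versions of the three tables
toy_ratings = pd.DataFrame({'user_id': [1, 1, 2],
                            'movie_id': [10, 20, 10],
                            'rating': [5, 3, 4]})
toy_users = pd.DataFrame({'user_id': [1, 2], 'gender': ['F', 'M']})
toy_movies = pd.DataFrame({'movie_id': [10, 20], 'title': ['A (1990)', 'B (1991)']})

# validate='many_to_one' checks that each key appears once in the right table
merged = (toy_ratings
          .merge(toy_users, on='user_id', how='inner', validate='many_to_one')
          .merge(toy_movies, on='movie_id', how='inner', validate='many_to_one'))

# An inner many-to-one join with complete keys keeps one row per rating
print(len(merged) == len(toy_ratings))
print(merged)
```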


##### What's the highest rated movie? (20 pts)

In [408]:
#selecting every rating row whose score equals the maximum rating
list_max=rating_users_movies.loc[rating_users_movies['rating'] == rating_users_movies['rating'].max()]
list_max

Unnamed: 0,user_id,movie_id,rating,timestamp,gender,age,occupation,zip,title,genres,year,short_title
0,1,1193,5,978300760,F,1,10,48067,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
1,2,1193,5,978298413,M,56,16,70072,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
4,17,1193,5,978158471,M,50,1,95350,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
6,19,1193,5,982730936,M,1,10,48073,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
7,24,1193,5,978136709,F,25,7,10023,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
...,...,...,...,...,...,...,...,...,...,...,...,...
1000189,5532,404,5,959619841,M,25,17,27408,Brother Minister: The Assassination of Malcolm...,Documentary,1994,Brother Minister: The Assassination of Malcolm X
1000195,5313,3656,5,960920392,M,56,0,55406,Lured (1947),Crime,1947,Lured
1000199,5334,3382,5,960796159,F,56,13,46140,Song of Freedom (1936),Drama,1936,Song of Freedom
1000204,5949,2198,5,958846401,M,18,17,47901,Modulations (1998),Documentary,1998,Modulations


In [409]:
list_max.head()

Unnamed: 0,user_id,movie_id,rating,timestamp,gender,age,occupation,zip,title,genres,year,short_title
0,1,1193,5,978300760,F,1,10,48067,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
1,2,1193,5,978298413,M,56,16,70072,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
4,17,1193,5,978158471,M,50,1,95350,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
6,19,1193,5,982730936,M,1,10,48073,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
7,24,1193,5,978136709,F,25,7,10023,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest


In [410]:
list_max.dtypes

user_id         int64
movie_id        int64
rating          int64
timestamp       int64
gender         object
age             int64
occupation      int64
zip            object
title          object
genres         object
year           object
short_title    object
dtype: object

### Finding the movie with the highest average rating, without any conditions

In [411]:
# Track the best average rating seen so far and the movie_id that achieved it
avg_rating = 0.0
highest_rating = 0

for x in range(0, rating_users_movies.movie_id.max()+1):
    # All ratings for movie_id x (empty if that id was never rated)
    List1 = rating_users_movies.loc[rating_users_movies['movie_id'] == x]

    # Strict '>' keeps the first (lowest) movie_id when averages tie;
    # the mean of an empty selection is NaN, which never passes this test
    if List1.rating.mean() > avg_rating:
        avg_rating = List1.rating.mean()
        highest_rating = x

# Show the ratings of the movie with the highest average
rating_users_movies.loc[rating_users_movies['movie_id'] == highest_rating].head()



Unnamed: 0,user_id,movie_id,rating,timestamp,gender,age,occupation,zip,title,genres,year,short_title
965717,149,787,5,977325719,M,25,1,29205,"Gate of Heavenly Peace, The (1995)",Documentary,1995,"Gate of Heavenly Peace, The"
965718,2825,787,5,972610193,F,25,20,94014,"Gate of Heavenly Peace, The (1995)",Documentary,1995,"Gate of Heavenly Peace, The"
965719,2872,787,5,972423586,M,25,20,94014,"Gate of Heavenly Peace, The (1995)",Documentary,1995,"Gate of Heavenly Peace, The"


In [412]:
print ("movie_id with highest rating is ")
highest_rating

movie_id with highest rating is 


787
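
The `head()` output above lists only three ratings for movie 787, so its perfect average rests on very few votes. A `groupby` can compute the average and the number of ratings for every movie in one pass, which makes it easy to add a minimum-count condition. The sketch below uses toy data; `stats`, `popular` and the threshold of 2 are illustrative assumptions.

```python
import pandas as pd

# Toy ratings: movie 2 has a single perfect score, movie 3 has two good ones
toy = pd.DataFrame({
    'movie_id': [1, 1, 1, 2, 3, 3],
    'rating':   [4, 5, 3, 5, 5, 4],
})

# Mean rating and number of ratings per movie
stats = toy.groupby('movie_id')['rating'].agg(['mean', 'count'])

# idxmax returns the first index label holding the maximum mean
best_overall = stats['mean'].idxmax()

# Same question, restricted to movies with at least 2 ratings (toy threshold)
popular = stats[stats['count'] >= 2]
best_popular = popular['mean'].idxmax()

print(stats)
print(best_overall, best_popular)
```

Because `groupby` sorts by `movie_id`, ties resolve to the lowest id, which matches the strict `>` comparison in the loop.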

##### What is a highly rated movie for date night? (60 pts)

- Hint - highly rated movie by 
    - both partners (might be the same gender or not),
    - based on genre preferences,
    - age group can also be combined

In [413]:
#extracting the Unique genres in the data
rating_users_movies.genres.unique()

array(['Drama', "Animation|Children's|Musical", 'Musical|Romance',
       "Animation|Children's|Comedy", 'Action|Adventure|Comedy|Romance',
       'Action|Adventure|Drama', 'Comedy|Drama',
       "Adventure|Children's|Drama|Musical", 'Musical', 'Comedy',
       "Animation|Children's", 'Comedy|Fantasy', 'Animation',
       'Comedy|Sci-Fi', 'Drama|War', 'Romance',
       "Animation|Children's|Musical|Romance",
       "Children's|Drama|Fantasy|Sci-Fi", 'Drama|Romance',
       'Animation|Comedy|Thriller',
       "Adventure|Animation|Children's|Comedy|Musical",
       "Animation|Children's|Comedy|Musical", 'Thriller',
       'Action|Crime|Romance', 'Action|Adventure|Fantasy|Sci-Fi',
       "Children's|Comedy|Musical", 'Action|Drama|War',
       "Children's|Drama", 'Crime|Drama|Thriller', 'Action|Crime|Drama',
       'Action|Adventure|Mystery', 'Crime|Drama',
       'Action|Adventure|Sci-Fi|Thriller',
       'Action|Adventure|Romance|Sci-Fi|War', 'Action|Thriller',
       'Action|Drama', 'Co
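
Each entry of `genres` is a `|`-separated combination, so `unique()` returns combinations rather than individual genres. The sketch below shows two ways to get at the single genres on a toy column: splitting and exploding, or building one indicator column per genre. The variable names are hypothetical.

```python
import pandas as pd

toy = pd.DataFrame({'genres': ["Animation|Children's|Comedy",
                               'Comedy|Romance',
                               'Drama']})

# One row per (movie, genre) pair, then the distinct genre names
single_genres = toy['genres'].str.split('|').explode().unique()

# One 0/1 column per genre, convenient for filtering on several genres
flags = toy['genres'].str.get_dummies(sep='|')

print(sorted(single_genres))
print(flags)
```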

## Filtering the ratings by age group (18-30)

In [414]:
#the strict bounds drop users whose age is recorded as 18, so only rows with age 25 remain (see the output below)
rating_users_movies=rating_users_movies.loc[(rating_users_movies['age']>18) & (rating_users_movies['age']<30)]

In [415]:
rating_users_movies

Unnamed: 0,user_id,movie_id,rating,timestamp,gender,age,occupation,zip,title,genres,year,short_title
2,12,1193,4,978220179,M,25,12,32793,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
3,15,1193,4,978199279,M,25,7,22903,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
7,24,1193,5,978136709,F,25,7,10023,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
8,28,1193,3,978125194,F,25,1,14607,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
11,42,1193,3,978038981,M,25,8,24502,One Flew Over the Cuckoo's Nest (1975),Drama,1975,One Flew Over the Cuckoo's Nest
...,...,...,...,...,...,...,...,...,...,...,...,...
1000190,5543,404,3,960127592,M,25,17,97401,Brother Minister: The Assassination of Malcolm...,Documentary,1994,Brother Minister: The Assassination of Malcolm X
1000191,5220,2543,3,961546137,M,25,7,91436,Six Ways to Sunday (1997),Comedy,1997,Six Ways to Sunday
1000194,5795,591,1,958145253,M,25,1,92688,Tough and Deadly (1995),Action|Drama|Thriller,1995,Tough and Deadly
1000196,5328,2438,4,960838075,F,25,4,91740,Outside Ozona (1998),Drama|Thriller,1998,Outside Ozona
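
The `age` column takes only a handful of values (1, 18, 25, 35, 45, 50 and 56 appear in the outputs above), so whether the bounds are strict or inclusive decides which groups survive the filter. The toy comparison below contrasts the strict filter with `Series.between`, which includes both endpoints by default.

```python
import pandas as pd

toy = pd.DataFrame({'user_id': [1, 2, 3, 4], 'age': [1, 18, 25, 35]})

# Strict bounds: 18 itself is excluded, only 25 remains
strict = toy[(toy['age'] > 18) & (toy['age'] < 30)]

# between() includes both endpoints by default, so 18 is kept as well
inclusive = toy[toy['age'].between(18, 30)]

print(strict['age'].tolist())
print(inclusive['age'].tolist())
```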


In [416]:
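#choosing the genre of choice
#note: 'Comedy' and 'Drama' and 'Musical' and 'Romance' evaluates to just 'Romance' in Python,
#so that expression only ever selected Romance movies; the pattern is written out explicitly here
genres = rating_users_movies.loc[rating_users_movies['genres'].str.contains('Romance')]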

#choice of males
male_choices = genres.loc[genres['gender'] == 'M']

#choice of females
female_choices = genres.loc[genres['gender'] == 'F']
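
A chain of non-empty strings joined with Python's `and` evaluates to its last operand, so a pattern built that way matches only the final genre. The sketch below, on a toy column, contrasts that behavior with a regex alternation (any of the genres) and with combined boolean masks (all of the genres). The variable names are hypothetical.

```python
import pandas as pd

toy = pd.DataFrame({'genres': ['Comedy|Romance', 'Drama', 'Musical|Romance',
                               'Comedy', 'Action']})

# 'and' returns its last operand when every operand is truthy
pattern = 'Comedy' and 'Drama' and 'Musical' and 'Romance'
print(pattern)

# Matches Romance only
last_only = toy[toy['genres'].str.contains(pattern)]

# Any of the four genres: '|' is regex alternation
any_of = toy[toy['genres'].str.contains('Comedy|Drama|Musical|Romance')]

# Both Comedy and Romance: combine two masks with '&'
all_of = toy[toy['genres'].str.contains('Comedy') & toy['genres'].str.contains('Romance')]

print(last_only.index.tolist(), any_of.index.tolist(), all_of.index.tolist())
```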

In [419]:
genres

Unnamed: 0,user_id,movie_id,rating,timestamp,gender,age,occupation,zip,title,genres,year,short_title
2256,48,914,3,978059754,M,25,4,92107,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
2257,53,914,5,977979589,M,25,0,96931,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
2262,114,914,5,980373930,F,25,2,83712,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
2263,117,914,5,977507452,M,25,17,33314,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
2269,169,914,4,977198553,M,25,7,55439,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
...,...,...,...,...,...,...,...,...,...,...,...,...
1000017,2453,2833,4,974189925,M,25,7,55429,Lucie Aubrac (1997),Romance|War,1997,Lucie Aubrac
1000019,3301,2833,4,968215777,F,25,2,85719,Lucie Aubrac (1997),Romance|War,1997,Lucie Aubrac
1000023,2507,1714,2,975382922,M,25,4,94107,Never Met Picasso (1996),Romance,1996,Never Met Picasso
1000039,2796,1851,4,997320494,M,25,14,92104,Leather Jacket Love Story (1997),Drama|Romance,1997,Leather Jacket Love Story


In [420]:
#saving genres into csv file
genres.to_csv('../data/genres.csv',index=False)

In [421]:
genres.head()

Unnamed: 0,user_id,movie_id,rating,timestamp,gender,age,occupation,zip,title,genres,year,short_title
2256,48,914,3,978059754,M,25,4,92107,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
2257,53,914,5,977979589,M,25,0,96931,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
2262,114,914,5,980373930,F,25,2,83712,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
2263,117,914,5,977507452,M,25,17,33314,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
2269,169,914,4,977198553,M,25,7,55439,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady


In [422]:
male_choices

Unnamed: 0,user_id,movie_id,rating,timestamp,gender,age,occupation,zip,title,genres,year,short_title
2256,48,914,3,978059754,M,25,4,92107,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
2257,53,914,5,977979589,M,25,0,96931,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
2263,117,914,5,977507452,M,25,17,33314,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
2269,169,914,4,977198553,M,25,7,55439,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
2270,173,914,5,1009652726,M,25,0,45237,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
...,...,...,...,...,...,...,...,...,...,...,...,...
999921,4682,1770,4,964567063,M,25,7,05346,B. Monkey (1998),Romance|Thriller,1998,B. Monkey
999922,4732,1770,3,963637239,M,25,14,24450,B. Monkey (1998),Romance|Thriller,1998,B. Monkey
1000017,2453,2833,4,974189925,M,25,7,55429,Lucie Aubrac (1997),Romance|War,1997,Lucie Aubrac
1000023,2507,1714,2,975382922,M,25,4,94107,Never Met Picasso (1996),Romance,1996,Never Met Picasso


In [423]:
female_choices

Unnamed: 0,user_id,movie_id,rating,timestamp,gender,age,occupation,zip,title,genres,year,short_title
2262,114,914,5,980373930,F,25,2,83712,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
2271,175,914,5,977114271,F,25,2,95123,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
2291,340,914,4,976341003,F,25,3,28001,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
2293,346,914,5,976333830,F,25,0,55110,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
2308,499,914,5,976212779,F,25,1,55108,My Fair Lady (1964),Musical|Romance,1964,My Fair Lady
...,...,...,...,...,...,...,...,...,...,...,...,...
999834,4345,1520,2,966435789,F,25,14,44304,Commandments (1997),Romance,1997,Commandments
999925,5333,1770,3,1016121418,F,25,7,02332,B. Monkey (1998),Romance|Thriller,1998,B. Monkey
999926,5387,1770,3,987644662,F,25,1,45056,B. Monkey (1998),Romance|Thriller,1998,B. Monkey
1000019,3301,2833,4,968215777,F,25,2,85719,Lucie Aubrac (1997),Romance|War,1997,Lucie Aubrac


In [424]:
#printing the number of ratings in the data for a specific movie with respect to movie_id
len(genres[genres['movie_id'] == 914].rating)

183

In [431]:
#printing the number of ratings in the data for a specific movie with respect to movie_id
len(genres[genres['movie_id'] == 1197].rating)

980

In [426]:
#printing the number of ratings in the data for a specific movie with respect to movie_id
len(genres[genres['movie_id'] == 1721].rating)

581
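
The three cells above count ratings one movie at a time. `value_counts` returns the count for every `movie_id` at once, which also makes it simple to list the movies that reach a given threshold. A sketch on toy data, with a toy threshold of 2:

```python
import pandas as pd

toy = pd.DataFrame({'movie_id': [914, 914, 1197, 1197, 1197, 1721],
                    'rating':   [3, 5, 5, 4, 5, 4]})

# Number of ratings per movie, most-rated first
counts = toy['movie_id'].value_counts()

# movie_ids with at least 2 ratings (toy threshold)
frequent = sorted(counts[counts >= 2].index)

print(counts)
print(frequent)
```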

### Finding the highest-rated movie among the chosen genre

In [427]:
# Track the best combined average so far and the movie_id that achieved it
avg_both = 0.0
highest_rating_by_both = 0

for i in range(0, genres.movie_id.max()+1):
    # Ratings of movie i from male users and from female users
    male = male_choices.loc[male_choices['movie_id'] == i]
    female = female_choices.loc[female_choices['movie_id'] == i]
    # Only consider movies with at least 200 ratings
    if len(genres[genres['movie_id'] == i].rating) >= 200:
        # Average the male and female ratings separately, then add the two averages
        if male.rating.mean() + female.rating.mean() > avg_both:
            avg_both = male.rating.mean() + female.rating.mean()
            highest_rating_by_both = i

print(highest_rating_by_both)  # movie_id of the highest-rated movie, both genders combined
print(rating_users_movies.loc[rating_users_movies['movie_id'] == highest_rating_by_both].head())

1197
      user_id  movie_id  rating  timestamp gender  age  occupation    zip  \
5905        3      1197       5  978297570      M   25          15  55117   
5907       11      1197       5  978903297      F   25           1  04093   
5912       24      1197       4  978132232      F   25           7  10023   
5913       28      1197       5  978125233      F   25           1  14607   
5917       36      1197       4  978210557      M   25           3  94123   

                           title                           genres  year  \
5905  Princess Bride, The (1987)  Action|Adventure|Comedy|Romance  1987   
5907  Princess Bride, The (1987)  Action|Adventure|Comedy|Romance  1987   
5912  Princess Bride, The (1987)  Action|Adventure|Comedy|Romance  1987   
5913  Princess Bride, The (1987)  Action|Adventure|Comedy|Romance  1987   
5917  Princess Bride, The (1987)  Action|Adventure|Comedy|Romance  1987   

              short_title  
5905  Princess Bride, The  
5907  Princess Bride, The

In [428]:
#max value of movie_id in genres
genres.movie_id.max()

3909

In [429]:
highest_rating_by_both  # highest-rated movie, both genders combined

1197

In [430]:
rating_users_movies.loc[rating_users_movies['movie_id'] == highest_rating_by_both].head()

Unnamed: 0,user_id,movie_id,rating,timestamp,gender,age,occupation,zip,title,genres,year,short_title
5905,3,1197,5,978297570,M,25,15,55117,"Princess Bride, The (1987)",Action|Adventure|Comedy|Romance,1987,"Princess Bride, The"
5907,11,1197,5,978903297,F,25,1,4093,"Princess Bride, The (1987)",Action|Adventure|Comedy|Romance,1987,"Princess Bride, The"
5912,24,1197,4,978132232,F,25,7,10023,"Princess Bride, The (1987)",Action|Adventure|Comedy|Romance,1987,"Princess Bride, The"
5913,28,1197,5,978125233,F,25,1,14607,"Princess Bride, The (1987)",Action|Adventure|Comedy|Romance,1987,"Princess Bride, The"
5917,36,1197,4,978210557,M,25,3,94123,"Princess Bride, The (1987)",Action|Adventure|Comedy|Romance,1987,"Princess Bride, The"
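
The same selection can be sketched without a loop: group by movie and gender, put the two gender averages side by side with `unstack`, keep movies above a minimum rating count, and add the averages. The data, the names and the threshold of 3 below are toy assumptions (the loop uses 200).

```python
import pandas as pd

toy = pd.DataFrame({
    'movie_id': [1, 1, 1, 1, 2, 2, 2, 3, 3],
    'gender':   ['M', 'F', 'M', 'F', 'M', 'F', 'F', 'M', 'M'],
    'rating':   [4, 5, 4, 3, 5, 5, 4, 5, 5],
})
min_ratings = 3  # toy threshold

# Total number of ratings per movie
counts = toy.groupby('movie_id')['rating'].count()

# Average rating per movie and gender, one column per gender
by_gender = toy.groupby(['movie_id', 'gender'])['rating'].mean().unstack('gender')

# Keep movies with enough ratings; drop those missing a rating from either gender
eligible = by_gender.loc[counts[counts >= min_ratings].index].dropna()

# Sum of the male and female averages, as in the loop
combined = eligible['M'] + eligible['F']
best = combined.idxmax()

print(by_gender)
print(best, combined.loc[best])
```

Dropping rows with a missing gender mirrors the loop, where a `NaN` average never passes the `>` comparison.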


Among Romance movies with at least 200 ratings, the one rated highest by both genders combined in the 18 to 30 age filter is **Princess Bride, The (1987)**.