# Date Night Movie

#### Grading:


- Code: 90 pts
- Markdown Documentation: 10 pts


In this assignment we use pandas to answer one question: what's the best **date-night movie**?

This assignment uses:
- Joining
- Groupby
- Sorting


In [1]:
import os
import pandas as pd

##### Read in the movie data: `pd.read_table`

In [4]:
def get_movie_data():
    # The .dat files are '::'-delimited. A multi-character separator is parsed
    # as a regular expression, which only the Python parser supports;
    # engine='python' selects it explicitly and avoids a ParserWarning.
    read_opts = dict(sep='::', header=None, engine='python', encoding='ISO-8859-1')

    unames = ['user_id', 'gender', 'age', 'occupation', 'zip']
    users = pd.read_table(os.path.join('../data', 'users.dat'), names=unames, **read_opts)

    rnames = ['user_id', 'movie_id', 'rating', 'timestamp']
    ratings = pd.read_table(os.path.join('../data', 'ratings.dat'), names=rnames, **read_opts)

    mnames = ['movie_id', 'title', 'genres']
    movies = pd.read_table(os.path.join('../data', 'movies.dat'), names=mnames, **read_opts)

    return users, ratings, movies
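Because `'::'` is longer than one character, pandas treats it as a regular expression, and only the Python parsing engine handles regex separators. The sketch below illustrates this on a small in-memory sample; the three rows are taken from the first rows of `users.dat` as displayed in this notebook, and `sample` and `sample_users` are names introduced here for illustration.

```python
import io

import pandas as pd

# Three sample rows in the same '::'-delimited layout as users.dat.
sample = io.StringIO(
    "1::F::1::10::48067\n"
    "2::M::56::16::70072\n"
    "3::M::25::15::55117\n"
)

unames = ['user_id', 'gender', 'age', 'occupation', 'zip']

# engine='python' is needed for multi-character (regex) separators.
sample_users = pd.read_table(sample, sep='::', header=None,
                             names=unames, engine='python')
print(sample_users)
```

Passing the engine explicitly keeps the cell output free of parser warnings.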

In [5]:
users, ratings, movies = get_movie_data()



In [6]:
users.head()

Unnamed: 0,user_id,gender,age,occupation,zip
0,1,F,1,10,48067
1,2,M,56,16,70072
2,3,M,25,15,55117
3,4,M,45,7,2460
4,5,M,25,20,55455


In [7]:
ratings.head()

Unnamed: 0,user_id,movie_id,rating,timestamp
0,1,1193,5,978300760
1,1,661,3,978302109
2,1,914,3,978301968
3,1,3408,4,978300275
4,1,2355,5,978824291


In [8]:
movies.head()

Unnamed: 0,movie_id,title,genres
0,1,Toy Story (1995),Animation|Children's|Comedy
1,2,Jumanji (1995),Adventure|Children's|Fantasy
2,3,Grumpier Old Men (1995),Comedy|Romance
3,4,Waiting to Exhale (1995),Comedy|Drama
4,5,Father of the Bride Part II (1995),Comedy


##### Clean up the `movies` table

- Get the `year`
- Shorten the `title`


In [9]:
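# Capture the title text and the release year from strings such as 'Toy Story (1995)'.
tmp = movies.title.str.extract(r'(.*) \(([0-9]+)\)')
# DataFrame.apply runs once per column, so x[0] is the first row of each column.
# These two lines only preview the first title and year; tmp itself is unchanged.
tmp.apply(lambda x: x[0] if len(x) > 0 else None)
tmp.apply(lambda x: x[0][:40] if len(x) > 0 else None)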

0    Toy Story
1         1995
dtype: object

In [10]:
movies['year'] = tmp[1]
movies['short_title'] = tmp[0]
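`Series.str.extract` returns one column per capture group, so the title and the year can be pulled out in one pass, and `.str[:40]` truncates each string element-wise. Here is a minimal sketch on three titles from the output above; the named groups and the variable names `titles` and `parts` are illustrative choices, not part of the original cell.

```python
import pandas as pd

titles = pd.Series([
    'Toy Story (1995)',
    'Grumpier Old Men (1995)',
    'Father of the Bride Part II (1995)',
])

# Named groups become the column names of the returned DataFrame.
parts = titles.str.extract(r'(?P<short_title>.*) \((?P<year>[0-9]+)\)')

# .str[:40] slices every string in the column, one element at a time.
parts['short_title'] = parts['short_title'].str[:40]

# Every sample title matched the pattern, so the year converts cleanly.
parts['year'] = parts['year'].astype(int)
print(parts)
```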

In [12]:
movies.head()

Unnamed: 0,movie_id,title,genres,year,short_title
0,1,Toy Story (1995),Animation|Children's|Comedy,1995,Toy Story
1,2,Jumanji (1995),Adventure|Children's|Fantasy,1995,Jumanji
2,3,Grumpier Old Men (1995),Comedy|Romance,1995,Grumpier Old Men
3,4,Waiting to Exhale (1995),Comedy|Drama,1995,Waiting to Exhale
4,5,Father of the Bride Part II (1995),Comedy,1995,Father of the Bride Part II


##### Join the tables with `pd.merge` (20 pts)

In [14]:
movies_ratings = pd.merge(movies,ratings,on='movie_id')

In [16]:
user_movie_rating = pd.merge(users,movies_ratings,on='user_id')

In [17]:
user_movie_rating.head()

Unnamed: 0,user_id,gender,age,occupation,zip,movie_id,title,genres,year,short_title,rating,timestamp
0,1,F,1,10,48067,1,Toy Story (1995),Animation|Children's|Comedy,1995,Toy Story,5,978824268
1,1,F,1,10,48067,48,Pocahontas (1995),Animation|Children's|Musical|Romance,1995,Pocahontas,5,978824351
2,1,F,1,10,48067,150,Apollo 13 (1995),Drama,1995,Apollo 13,5,978301777
3,1,F,1,10,48067,260,Star Wars: Episode IV - A New Hope (1977),Action|Adventure|Fantasy|Sci-Fi,1977,Star Wars: Episode IV - A New Hope,4,978300760
4,1,F,1,10,48067,527,Schindler's List (1993),Drama|War,1993,Schindler's List,5,978824195
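`pd.merge` performs an inner join by default, so a rating whose key has no match in the other table would be dropped silently. As a hedged sketch on toy frames (all values below are invented for illustration), the `validate` argument can assert the expected key relationship, and a row count comparison shows whether any rating was lost.

```python
import pandas as pd

# Toy stand-ins for the real tables; the values are invented for illustration.
toy_movies = pd.DataFrame({'movie_id': [1, 2],
                           'title': ['Toy Story (1995)', 'Jumanji (1995)']})
toy_ratings = pd.DataFrame({'user_id': [1, 1, 2],
                            'movie_id': [1, 2, 1],
                            'rating': [5, 3, 4]})
toy_users = pd.DataFrame({'user_id': [1, 2], 'gender': ['F', 'M']})

# Each movie_id appears once in toy_movies and many times in toy_ratings.
toy_movies_ratings = pd.merge(toy_movies, toy_ratings, on='movie_id',
                              validate='one_to_many')

# Each user_id appears once in toy_users and many times in the ratings.
toy_joined = pd.merge(toy_users, toy_movies_ratings, on='user_id',
                      validate='one_to_many')

# With complete keys, the join keeps exactly one row per rating.
print(len(toy_joined) == len(toy_ratings))
```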


##### What's the highest rated movie? (20 pts)

Let's group the data by title, rating, and genres, and count the number of times each movie has been given each rating.

In [51]:
no_of_ratings = (user_movie_rating.groupby(["short_title", "rating", "genres"])
                 .size().reset_index(name="no_of_ratings"))
no_of_ratings

Unnamed: 0,short_title,rating,genres,no_of_ratings
0,"$1,000,000 Duck",1,Children's|Comedy,3
1,"$1,000,000 Duck",2,Children's|Comedy,8
2,"$1,000,000 Duck",3,Children's|Comedy,15
3,"$1,000,000 Duck",4,Children's|Comedy,7
4,"$1,000,000 Duck",5,Children's|Comedy,4
...,...,...,...,...
16807,eXistenZ,1,Action|Sci-Fi|Thriller,43
16808,eXistenZ,2,Action|Sci-Fi|Thriller,61
16809,eXistenZ,3,Action|Sci-Fi|Thriller,109
16810,eXistenZ,4,Action|Sci-Fi|Thriller,142


Let us now sort these counts in descending order to find the movie that received a high rating the largest number of times.

In [52]:
no_of_ratings.sort_values(by='no_of_ratings', ascending=False).head()

Unnamed: 0,short_title,rating,genres,no_of_ratings
602,American Beauty,5,Comedy|Drama,1963
14332,Star Wars: Episode IV - A New Hope,5,Action|Adventure|Fantasy|Sci-Fi,1826
12350,Raiders of the Lost Ark,5,Action|Adventure,1500
14337,Star Wars: Episode V - The Empire Strikes Back,5,Action|Adventure|Drama|Sci-Fi|War,1483
13213,Schindler's List,5,Drama|War,1475


#### Hence American Beauty is the highest rated movie: it received the most 5-star ratings (1,963)
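Counting 5-star ratings favours movies that many people have seen. Another common reading of "highest rated" is the highest average rating among movies with enough ratings. The sketch below illustrates that alternative on invented numbers; `toy`, `stats`, `min_ratings`, and the threshold value are assumptions for illustration, not part of the assignment.

```python
import pandas as pd

# Invented ratings for three hypothetical movies.
toy = pd.DataFrame({
    'short_title': ['Movie A', 'Movie A', 'Movie A',
                    'Movie B', 'Movie B', 'Movie B', 'Movie C'],
    'rating': [5, 4, 5, 3, 4, 3, 5],
})

# Mean rating and number of ratings per title.
stats = (toy.groupby('short_title')['rating']
            .agg(mean_rating='mean', no_of_ratings='count')
            .reset_index())

# Ignore titles with too few ratings, then rank by average rating.
min_ratings = 2  # assumed threshold for this toy example
ranked = (stats[stats['no_of_ratings'] >= min_ratings]
          .sort_values('mean_rating', ascending=False))
print(ranked)
```

In this toy data "Movie C" has a perfect average from a single rating, which is why a minimum count is applied before ranking.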

##### What is a well-rated movie for date night? (60 pts)

- Hint: look for a movie that is highly rated
    - by both partners (who might be the same gender or not),
    - within the genres they prefer,
    - optionally within their age group as well.

Let us look at the genres

In [49]:
user_movie_rating.genres.unique()

array(["Animation|Children's|Comedy",
       "Animation|Children's|Musical|Romance", 'Drama',
       'Action|Adventure|Fantasy|Sci-Fi', 'Drama|War', "Children's|Drama",
       "Animation|Children's|Comedy|Musical",
       "Animation|Children's|Musical", 'Crime|Drama|Thriller',
       'Animation', 'Animation|Comedy|Thriller', 'Musical|Romance',
       "Adventure|Children's|Drama|Musical", 'Musical',
       "Children's|Comedy|Musical", "Children's|Drama|Fantasy|Sci-Fi",
       'Action|Adventure|Comedy|Romance', 'Comedy|Sci-Fi',
       'Action|Adventure|Drama',
       "Adventure|Animation|Children's|Comedy|Musical", 'Drama|Romance',
       "Animation|Children's", 'Action|Drama|War', 'Comedy', 'Romance',
       'Action|Crime|Romance', 'Thriller', 'Comedy|Fantasy',
       'Comedy|Drama', 'Action|Comedy|Drama', 'Action|Thriller',
       'Action|Romance|Thriller', 'Action|Drama|Thriller',
       'Action|Adventure|Thriller', 'Comedy|Romance|War',
       'Action|Comedy|Western', 'Action|Adventu
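Each value in the array above is a combination of genres joined by `|`. To list the individual genres, the strings can be split and flattened. This is a minimal sketch on a few of the combinations shown above; `genre_strings`, `individual`, and `has_both` are names introduced here for illustration.

```python
import pandas as pd

genre_strings = pd.Series([
    "Animation|Children's|Comedy",
    'Action|Adventure|Fantasy|Sci-Fi',
    'Drama|War',
    'Comedy|Sci-Fi',
])

# Split each combination into a list, then give every genre its own row.
individual = genre_strings.str.split('|').explode()
print(sorted(individual.unique()))

# The same split gives a way to test for particular genres,
# for example combinations that list both Adventure and Sci-Fi.
has_both = genre_strings.str.split('|').apply(
    lambda gs: 'Adventure' in gs and 'Sci-Fi' in gs)
print(genre_strings[has_both].tolist())
```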

#### My preferences and my partner's preferences
- Mine: genre Adventure, male
- Hers: genre Sci-Fi, female
- Age group: 22-27

In [59]:
# Exact match on genres: only movies tagged 'Action|Adventure|Fantasy|Sci-Fi' are kept.
My_Movies = user_movie_rating[(user_movie_rating.gender == 'M')
                              & (user_movie_rating.age >= 22) & (user_movie_rating.age <= 27)
                              & (user_movie_rating.genres == 'Action|Adventure|Fantasy|Sci-Fi')]
My_Movies

Unnamed: 0,user_id,gender,age,occupation,zip,movie_id,title,genres,year,short_title,rating,timestamp
183,3,M,25,15,55117,260,Star Wars: Episode IV - A New Hope (1977),Action|Adventure|Fantasy|Sci-Fi,1977,Star Wars: Episode IV - A New Hope,5,978297512
1506,15,M,25,7,22903,260,Star Wars: Episode IV - A New Hope (1977),Action|Adventure|Fantasy|Sci-Fi,1977,Star Wars: Episode IV - A New Hope,4,978212645
3405,26,M,25,7,23112,260,Star Wars: Episode IV - A New Hope (1977),Action|Adventure|Fantasy|Sci-Fi,1977,Star Wars: Episode IV - A New Hope,3,978271884
3664,26,M,25,7,23112,2628,Star Wars: Episode I - The Phantom Menace (1999),Action|Adventure|Fantasy|Sci-Fi,1999,Star Wars: Episode I - The Phantom Menace,2,978143313
3769,27,M,25,11,19130,260,Star Wars: Episode IV - A New Hope (1977),Action|Adventure|Fantasy|Sci-Fi,1977,Star Wars: Episode IV - A New Hope,5,978129642
...,...,...,...,...,...,...,...,...,...,...,...,...
997507,6024,M,25,12,53705,2628,Star Wars: Episode I - The Phantom Menace (1999),Action|Adventure|Fantasy|Sci-Fi,1999,Star Wars: Episode I - The Phantom Menace,3,956749699
998010,6030,M,25,17,32618,260,Star Wars: Episode IV - A New Hope (1977),Action|Adventure|Fantasy|Sci-Fi,1977,Star Wars: Episode IV - A New Hope,5,956718342
998076,6030,M,25,17,32618,2105,Tron (1982),Action|Adventure|Fantasy|Sci-Fi,1982,Tron,5,956718610
998096,6030,M,25,17,32618,2628,Star Wars: Episode I - The Phantom Menace (1999),Action|Adventure|Fantasy|Sci-Fi,1999,Star Wars: Episode I - The Phantom Menace,3,956718105


In [60]:
# Same filter for women: exact match on the 'Action|Adventure|Fantasy|Sci-Fi' genre string.
her_Movies = user_movie_rating[(user_movie_rating.gender == 'F')
                               & (user_movie_rating.age >= 22) & (user_movie_rating.age <= 27)
                               & (user_movie_rating.genres == 'Action|Adventure|Fantasy|Sci-Fi')]
her_Movies

Unnamed: 0,user_id,gender,age,occupation,zip,movie_id,title,genres,year,short_title,rating,timestamp
3152,24,F,25,7,10023,260,Star Wars: Episode IV - A New Hope (1977),Action|Adventure|Fantasy|Sci-Fi,1977,Star Wars: Episode IV - A New Hope,5,978135156
3244,24,F,25,7,10023,2628,Star Wars: Episode I - The Phantom Menace (1999),Action|Adventure|Fantasy|Sci-Fi,1999,Star Wars: Episode I - The Phantom Menace,3,978135358
3845,28,F,25,1,14607,260,Star Wars: Episode IV - A New Hope (1977),Action|Adventure|Fantasy|Sci-Fi,1977,Star Wars: Episode IV - A New Hope,4,978125450
10934,81,F,25,0,60640,260,Star Wars: Episode IV - A New Hope (1977),Action|Adventure|Fantasy|Sci-Fi,1977,Star Wars: Episode IV - A New Hope,5,977786694
11138,83,F,25,2,94609,260,Star Wars: Episode IV - A New Hope (1977),Action|Adventure|Fantasy|Sci-Fi,1977,Star Wars: Episode IV - A New Hope,5,977720825
...,...,...,...,...,...,...,...,...,...,...,...,...
998567,6035,F,25,1,78734,2105,Tron (1982),Action|Adventure|Fantasy|Sci-Fi,1982,Tron,5,956711420
998598,6035,F,25,1,78734,2628,Star Wars: Episode I - The Phantom Menace (1999),Action|Adventure|Fantasy|Sci-Fi,1999,Star Wars: Episode I - The Phantom Menace,5,956711024
998685,6036,F,25,15,32603,260,Star Wars: Episode IV - A New Hope (1977),Action|Adventure|Fantasy|Sci-Fi,1977,Star Wars: Episode IV - A New Hope,5,956710703
999134,6036,F,25,15,32603,2105,Tron (1982),Action|Adventure|Fantasy|Sci-Fi,1982,Tron,3,956754653


#### "Star Wars: Episode IV - A New Hope" is a favourite among people similar to us and is also one of the top-rated movies. Let us now check its number of ratings.

In [65]:
top_rated = no_of_ratings.sort_values(by='no_of_ratings', ascending=False).head()

In [66]:
top_rated

Unnamed: 0,short_title,rating,genres,no_of_ratings
602,American Beauty,5,Comedy|Drama,1963
14332,Star Wars: Episode IV - A New Hope,5,Action|Adventure|Fantasy|Sci-Fi,1826
12350,Raiders of the Lost Ark,5,Action|Adventure,1500
14337,Star Wars: Episode V - The Empire Strikes Back,5,Action|Adventure|Drama|Sci-Fi|War,1483
13213,Schindler's List,5,Drama|War,1475


#### As we can see, it received the most 5-star ratings (1,826) among movies in the Adventure and Sci-Fi genres we are interested in, so I believe Star Wars: Episode IV - A New Hope is a perfect date-night movie!