# More Practice with Movies

This assignment introduces a third table in the movie database, the release_date table. The PostgreSQL database now has three tables:

* moviecast
* title
* release_date

The tables have the following schema:

```
 Table "public.title"
 Column |  Type  | Modifiers 
--------+--------+-----------
 index  | bigint | 
 title  | text   | 
 year   | bigint | 


  Table "public.moviecast"
  Column   |       Type       | Modifiers 
-----------+------------------+-----------
 index     | bigint           | 
 title     | text             | 
 year      | bigint           | 
 name      | text             | 
 type      | text             | 
 character | text             | 
 n         | double precision | 



 Table "public.release_date"
 Column  |  Type  | Modifiers 
---------+--------+-----------
 index   | bigint | 
 title   | text   | 
 year    | bigint | 
 country | text   | 
 date    | date   | 
 month   | bigint |
 day     | bigint |
 dow     | bigint |
```


### Hints

* Warning: some of these queries can create very large Cartesian products! Use ``query`` to filter both relations before combining them, so that the product stays small.
* The ``njoin`` operator uses less memory than a Cartesian product. Prefer ``njoin`` where you can.
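
The first hint can be illustrated with a small sketch in plain Python. It does not use reframe; the rows, the `keep` helper, and the `matches` helper are hypothetical stand-ins chosen only to show how filtering both inputs first shrinks the number of pairs without changing the final answer.

```python
from itertools import product

# Toy stand-in for a relation: 3 titles x 3 countries = 9 rows (hypothetical data).
left = [{"title": t, "country": c}
        for t in ("A", "B", "C")
        for c in ("USA", "Germany", "France")]
right = list(left)

def keep(rows, countries):
    """Keep only rows whose country is in the given set."""
    return [r for r in rows if r["country"] in countries]

def matches(rows_x, rows_y):
    """Pair every row of rows_x with every row of rows_y, then filter the pairs."""
    return sorted(
        (x["title"], x["country"], y["country"])
        for x, y in product(rows_x, rows_y)
        if x["title"] == y["title"]
        and x["country"] == "Germany"
        and y["country"] == "USA"
    )

wanted = {"USA", "Germany"}

# Number of pairs the product has to build in each case.
unfiltered_pairs = len(left) * len(right)
filtered_pairs = len(keep(left, wanted)) * len(keep(right, wanted))

late_filter = matches(left, right)
early_filter = matches(keep(left, wanted), keep(right, wanted))

print(unfiltered_pairs, filtered_pairs, late_filter == early_filter)
```

The number of pairs grows with the product of the two input sizes, so trimming each side before pairing pays off twice.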


In [2]:
from reframe import Relation
import warnings
warnings.filterwarnings('ignore')
from sols5 import *

moviecast = Relation('/home/faculty/millbr02/pub/cast.csv',sep=',')
title = Relation('/home/faculty/millbr02/pub/titles.csv',sep=',')
release_date = Relation('/home/faculty/millbr02/pub/release_dates.csv',sep=',')

In [3]:
%load_ext sql



In [4]:
%sql postgresql://aljamo01:@localhost/movies

'Connected: aljamo01@movies'

## 1.  How many movies were released in each country in the year 2014?

Your result should be a table that contains the columns country and releases.

In [4]:
release_date.query('year == 2014').groupby(['country']).count('year').rename('count_year', 'release').project(['country', 'release'])

Unnamed: 0,country,release
0,Afghanistan,3
1,Albania,30
2,Algeria,3
3,Andorra,1
4,Angola,4
5,Antigua and Barbuda,1
6,Argentina,216
7,Armenia,16
8,Aruba,26
9,Australia,230


In [8]:
sol_cp1r(release_date,moviecast,_)

AttributeError: 'ResultSet' object has no attribute 'columns'

In [5]:
%%sql
SELECT country, COUNT(year) AS release
FROM release_date
WHERE year = 2014
GROUP BY country;

152 rows affected.


country,release
Costa Rica,19
Cambodia,79
Turkey,316
Cyprus,57
Samoa,1
Slovenia,113
Vietnam,143
Kuwait,346
Jamaica,24
Antigua and Barbuda,1


In [7]:
sol_cp1s(_)

Error! Did you name the columns correctly?
Columns' names expected: country, count.


AssertionError: 
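
The checker message above is about the names of the result columns. In SQL those names come from the `AS` alias in the select list. The sketch below shows this with Python's built-in `sqlite3` module rather than the course PostgreSQL database; the toy rows are hypothetical, and the alias `releases` is taken from the question text, so it may not be the name the checker wants.

```python
import sqlite3

# In-memory database with a few hypothetical rows (not the real data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE release_date (title TEXT, year INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO release_date VALUES (?, ?, ?)",
    [("Film A", 2014, "USA"), ("Film B", 2014, "USA"),
     ("Film A", 2014, "Germany"), ("Film C", 2013, "USA")],
)

cursor = conn.execute(
    "SELECT country, COUNT(year) AS releases "
    "FROM release_date WHERE year = 2014 "
    "GROUP BY country ORDER BY country"
)

# cursor.description holds one entry per result column; item 0 is its name.
column_names = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(column_names)
print(rows)
```

Changing only the alias changes the reported column name; the counts stay the same.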

## 2.  Show all of the countries in alphabetical order that released a movie starring Brad Pitt before 2000.

Your answer should be a single column with the names of the countries. The country names should not be duplicated.

In [6]:
moviecast.njoin(release_date).query('name == "Brad Pitt" & n == 1 & year < 2000').project(['country']).sort(['country'])

Unnamed: 0,country
3891495,Argentina
3891522,Australia
3891502,Austria
3891492,Belgium
3891489,Brazil
10442831,Bulgaria
9565644,Canada
9565663,Czech Republic
3891519,Denmark
3891514,Estonia


In [None]:
sol_cp2r(release_date,moviecast,_)

In [7]:
%%sql
SELECT country
FROM moviecast NATURAL JOIN release_date
WHERE n = 1 and name = 'Brad Pitt' and year < 2000
GROUP BY country
ORDER BY country;

41 rows affected.


country
Argentina
Australia
Austria
Belgium
Brazil
Bulgaria
Canada
Czech Republic
Denmark
Estonia


In [None]:
sol_cp2s(_)

## 3.  Show the name of the lead actor/actress and title of the movie where the movie was released on a Sunday in the USA during 2014. Hint: Assume Sunday is day 0.

In [24]:
release_date.njoin(moviecast)\
.query('country == "USA" & year == 2014 & dow == 0 & n == 1')\
.project(['name', 'title']).sort(['name'])

Unnamed: 0,name,title
9767199,April Hollingsworth,Prosper
9989298,Ashley (VII) James,Rebound (III)
8356616,Barry Corbin,Mountain Top
13219994,Brandon Jacobs,The Grievance Group
13220045,Cabrina Collesides,The Grievance Group
4192379,Cameron Bender,Find Me
10150990,Christopher (VI) Hunt,Right to Believe
4707169,Daniel (II) Gilchrist,Ghost in da hood
5443969,Dean Cain,Holiday Miracle
16456454,Donnie Yen,Xi you ji: Da nao tian gong


In [None]:
sol_cp3r(release_date,moviecast,_)

In [6]:
%%sql
SELECT DISTINCT name, title
FROM moviecast NATURAL JOIN release_date
WHERE country = 'USA' and year = 2014 and dow = 0 and n = 1
ORDER BY name;

42 rows affected.


name,title
April Hollingsworth,Prosper
Ashley (VII) James,Rebound (III)
Barry Corbin,Mountain Top
Brandon Jacobs,The Grievance Group
Cabrina Collesides,The Grievance Group
Cameron Bender,Find Me
Christopher (VI) Hunt,Right to Believe
Daniel (II) Gilchrist,Ghost in da hood
Dean Cain,Holiday Miracle
Donnie Yen,Xi you ji: Da nao tian gong
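
One detail of the query above: `DISTINCT` removes duplicate rows, so it applies to the whole (name, title) pair and not to name alone, and putting parentheses around a column does not change that. Here is a minimal sketch using Python's built-in `sqlite3` module; the `leads` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leads (name TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO leads VALUES (?, ?)",
    [("Ann", "Film A"), ("Ann", "Film A"), ("Ann", "Film B"), ("Bob", "Film A")],
)

# The parentheses around name are just a parenthesized expression;
# DISTINCT still compares complete (name, title) rows.
distinct_rows = conn.execute(
    "SELECT DISTINCT (name), title FROM leads ORDER BY name, title"
).fetchall()

print(distinct_rows)
```

Ann still appears twice because her two rows differ in title; only the exact duplicate row is dropped.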


In [None]:
sol_cp3s(_)

## 4. Show the top 10 actors/actresses that have starred in the most movies released in Germany in the Summer season since (and not including) the year 2000.

Let us define the Summer season as the months of June, July and August.

In [5]:
release_date.njoin(moviecast)\
.query('country == "Germany" & month== (6,7,8) & year > 2000 & n==1')\
.groupby(['name'])\
.count('title')\
.sort('count_title' ,ascending = False)\
.head(10)

Unnamed: 0,name,count_title
184,Dan Castellaneta,21
399,John Cleese,17
206,Denis Lavant,11
579,Michael Herbig,9
109,Brad Pitt,6
7,Adam Sandler,6
540,Mark Wahlberg,6
735,Sandra Bullock,5
169,Colin Farrell,5
723,Ryan Reynolds,5


In [None]:
sol_cp4r(release_date,moviecast,_)

In [12]:
%%sql
SELECT name, count(title)
FROM release_date NATURAL JOIN moviecast
WHERE country = 'Germany' and (month = 6 or month= 7 or month = 8) and year > 2000 and n = 1
GROUP BY name
ORDER BY count(title) DESC
LIMIT(10);

10 rows affected.


name,count
Dan Castellaneta,21
John Cleese,17
Denis Lavant,11
Michael Herbig,9
Adam Sandler,6
Mark Wahlberg,6
Brad Pitt,6
Will Smith,5
Eddie Murphy,5
Ryan Reynolds,5
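
The reframe and SQL top-10 lists above agree on the leading counts but differ in the order of the actors tied at 6 and in which actors with 5 appear. A likely explanation is that neither query says how to order ties, so any cutoff at 10 rows is arbitrary among equal counts. The sketch below shows one way to make such a ranking repeatable by adding the name as a second sort key. The `top_n` helper is hypothetical, and the counts are copied from the output above.

```python
# A few (name, count) pairs copied from the results above.
counts = {
    "Brad Pitt": 6,
    "Adam Sandler": 6,
    "Mark Wahlberg": 6,
    "Dan Castellaneta": 21,
}

def top_n(counts, n):
    """Sort by count descending, then by name ascending to break ties."""
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]

top3 = top_n(counts, 3)
print(top3)
```

In SQL the same idea would be a second ordering column, for example `ORDER BY count(title) DESC, name`.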


In [8]:
sol_cp4s(_)

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

## 5.  Show the title of all of the movies that were released in Germany before they were released in the USA in the year 2014. Hint: Use a Cartesian Product.

In [32]:
release_date.query('(country == "Germany" | country == "USA") & year == 2014')\
.cartesian_product(release_date.query('(country == "Germany" | country == "USA") & year== 2014'))\
.query('date_x < date_y & country_x == "Germany" & country_y == "USA" & title_x == title_y')\
.project(['title_x']).rename('title_x', 'title')

Unnamed: 0,title
42726,300: Rise of an Empire
83742,A Little Chaos
99123,A Million Ways to Die in the West
357182,Big Game
370854,Billy Elliot the Musical Live
396489,Blended
430669,Boyhood
488775,Captain America: The Winter Soldier
540045,Clouds of Sils Maria
685310,Der 7bte Zwerg


In [None]:
sol_cp5r(release_date,moviecast,_)


In [19]:
%%sql
SELECT A.title
FROM release_date A, release_date B
WHERE A.country = 'Germany' and B.country = 'USA' and A.date < B.date and A.title = B.title and A.year = 2014 and B.year = 2014;

59 rows affected.


title
Mr. Peabody & Sherman
Neighbors
Night at the Museum: Secret of the Tomb
Northmen - A Viking Saga
Paddington
Paranormal Activity: The Marked Ones
Phoenix (II)
Praia do Futuro
Rio 2
RoboCop
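
For comparison with the Cartesian product approach, the same question can be answered by indexing one country's release dates by title and probing that index with the other country's rows. This is a different technique from the one the hint asks for. It is sketched in plain Python, with hypothetical rows and a hypothetical `released_earlier` helper, and it leaves out the year filter because every toy row is from 2014.

```python
from datetime import date

# Hypothetical release rows, all from 2014.
rows = [
    {"title": "Film A", "country": "Germany", "date": date(2014, 3, 6)},
    {"title": "Film A", "country": "USA", "date": date(2014, 3, 7)},
    {"title": "Film B", "country": "Germany", "date": date(2014, 5, 1)},
    {"title": "Film B", "country": "USA", "date": date(2014, 4, 1)},
    {"title": "Film C", "country": "Germany", "date": date(2014, 6, 1)},
]

def released_earlier(rows, first, second):
    """Titles with a release in `first` dated before some release in `second`."""
    # Index the second country's dates by title.
    second_dates = {}
    for r in rows:
        if r["country"] == second:
            second_dates.setdefault(r["title"], []).append(r["date"])

    # Probe the index with each row from the first country.
    titles = set()
    for r in rows:
        if r["country"] == first:
            if any(r["date"] < d for d in second_dates.get(r["title"], [])):
                titles.add(r["title"])
    return sorted(titles)

earlier = released_earlier(rows, "Germany", "USA")
print(earlier)
```

Film C has no USA release, so it is not reported, which matches what the title equality condition does in the product version.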


In [None]:
sol_cp5s(_)

In [13]:
%%sql
SELECT DISTINCT name, title
FROM moviecast NATURAL JOIN release_date
WHERE country = 'Germany' and year = 2015 and n = 1 and name!= 'Adam Sandler'
ORDER BY name
LIMIT(10);

10 rows affected.


name,title
Alan Cumming,Strange Magic
Alex Ozerov,Coconut Hero
Amy Poehler,Inside Out
Andy Herzog,Wintergast
Ange Dargent,Microbe et Gasoil
Angelababy,Hitman: Agent 47
Anna Kendrick,Pitch Perfect 2
Anne-Marie Duff,Suffragette
Antonio Banderas,The SpongeBob Movie: Sponge Out of Water
Anton Petzold,"Rico, Oskar und das Herzgebreche"


In [14]:
%%sql
SELECT title,name
FROM moviecast
WHERE year = 2011 and n = 1 and name = character;

10 rows affected.


title,name
Hair is Falling: A Serious Comedy Film,Rajesh Bhardwaj
Dar Emtedad Shahr,Mohammad Reza Golzar
Box Office 3D: Il film dei film,Ezio Greggio
Pater,Vincent Lindon
Den Sidste Rejse,Finn Nørbygaard
Madea's Big Happy Family,Tyler Perry
We Will Be Strong in Our Weakness,Slawomir Sierakowski
How to Eat Eggnog,Dulvlu Spa
Armin Only: Mirage,Armin van Buuren
Che bella giornata,Checco Zalone


In [25]:
%%sql
SELECT '1900' decade, COUNT(*) FROM moviecast NATURAL JOIN release_date WHERE year >= '1900' and year <= '1910' and type ='actress' and country = 'USA'
UNION
SELECT '1910' decade, COUNT(*) FROM moviecast NATURAL JOIN release_date WHERE year >= '1910' and year <= '1920' and type ='actress' and country = 'USA'
UNION
SELECT '1920' decade, COUNT(*) FROM moviecast NATURAL JOIN release_date WHERE year >= '1920' and year <= '1930' and type ='actress' and country = 'USA'
UNION
SELECT '1930' decade, COUNT(*) FROM moviecast NATURAL JOIN release_date WHERE year >= '1930' and year <= '1940' and type ='actress' and country = 'USA'
UNION
SELECT '1940' decade, COUNT(*) FROM moviecast NATURAL JOIN release_date WHERE year >= '1940' and year <= '1950' and type ='actress' and country = 'USA'
UNION 
SELECT '1950' decade, COUNT(*) FROM moviecast NATURAL JOIN release_date WHERE year >= '1950' and year <= '1960' and type ='actress' and country = 'USA'
UNION
SELECT '1960' decade, COUNT(*) FROM moviecast NATURAL JOIN release_date WHERE year >= '1960' and year <= '1970' and type ='actress' and country = 'USA'
UNION
SELECT '1970' decade, COUNT(*) FROM moviecast NATURAL JOIN release_date WHERE year >= '1970' and year <= '1980' and type ='actress' and country = 'USA'
UNION
SELECT '1980' decade, COUNT(*) FROM moviecast NATURAL JOIN release_date WHERE year >= '1980' and year <= '1990' and type ='actress' and country = 'USA'
UNION
SELECT '1990' decade, COUNT(*) FROM moviecast NATURAL JOIN release_date WHERE year >= '1990' and year <= '2000' and type ='actress' and country = 'USA'
UNION
SELECT '2000' decade, COUNT(*) FROM moviecast NATURAL JOIN release_date WHERE year >= '2000' and year <= '2010' and type ='actress' and country = 'USA'
ORDER BY decade





11 rows affected.


decade,count
1900,17
1910,12943
1920,17411
1930,39642
1940,32574
1950,18344
1960,12393
1970,11779
1980,30476
1990,45632
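
Each branch of the query above uses bounds that are inclusive at both ends (`year >= 1910 and year <= 1920`, and so on), so a boundary year such as 1910 is counted in two decades. The sketch below contrasts that with half-open buckets computed by integer division. The helpers `decade_of` and `inclusive_count` and the sample years are hypothetical.

```python
from collections import Counter

def decade_of(year):
    """Map a year to the first year of its decade, e.g. 1919 -> 1910."""
    return (year // 10) * 10

def inclusive_count(years, start):
    """Count years in [start, start + 10], mirroring the bounds in the query."""
    return sum(1 for y in years if start <= y <= start + 10)

years = [1909, 1910, 1910, 1919, 1920]

# Half-open buckets: every year lands in exactly one decade.
per_decade = Counter(decade_of(y) for y in years)

# Inclusive bounds: boundary years are counted twice.
inclusive_total = sum(inclusive_count(years, s) for s in (1900, 1910, 1920))

print(dict(per_decade), inclusive_total, len(years))
```

With half-open buckets the per-decade counts sum to the number of rows; with inclusive bounds they sum to more.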


In [16]:
%%sql
SELECT *
FROM moviecast
LIMIT(5)

5 rows affected.


index,title,year,name,type,character,n
1072563,Enemy of the State,1998,Eric Keung,actor,Mambo Kitchen Worker #1,62.0
1072564,Chung Nam Hoi bo biu,1994,Hak-Shing Keung,actor,Assassin in Shopping Center,
1072565,Dai si gin,2004,Hak-Shing Keung,actor,Xin,
1072566,Duo shuai,2008,Hak-Shing Keung,actor,Kwok Wing Ching,20.0
1072567,Mit moon,2010,Hak-Shing Keung,actor,Policeman,27.0


In [17]:
%%sql
SELECT *
FROM release_date
LIMIT(5)

5 rows affected.


title,year,country,date,month,day,dow
w Delta z,2007,UK,2008-02-22 00:00:00,2,22,4
w Delta z,2007,Spain,2008-02-29 00:00:00,2,29,4
w Delta z,2007,Sweden,2008-03-28 00:00:00,3,28,4
w Delta z,2007,Turkey,2008-05-02 00:00:00,5,2,4
w Delta z,2007,Russia,2008-06-26 00:00:00,6,26,3
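
The sample rows above can be used to check which weekday numbering the dow column follows. The sketch compares the five dates and dow values shown with Python's `date.weekday()`, which numbers Monday as 0 and Sunday as 6, and with a Sunday-as-0 numbering derived from it. It draws only on these five rows, so it suggests a convention rather than proving one for the whole table.

```python
from datetime import date

# (date, dow) pairs copied from the five sample rows above.
sample = [
    (date(2008, 2, 22), 4),
    (date(2008, 2, 29), 4),
    (date(2008, 3, 28), 4),
    (date(2008, 5, 2), 4),
    (date(2008, 6, 26), 3),
]

# date.weekday(): Monday is 0 ... Sunday is 6.
matches_monday_zero = all(d.weekday() == dow for d, dow in sample)

# Sunday-as-0 numbering: shift weekday() by one and wrap around.
matches_sunday_zero = all((d.weekday() + 1) % 7 == dow for d, dow in sample)

print(matches_monday_zero, matches_sunday_zero)
```

If the Monday-as-0 numbering holds for the whole table, Sunday releases would have dow equal to 6.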
