# Joins Exercise


Please remember to run `EXPLAIN` on a query before you execute it. This helps avoid unnecessary load on the DBMS and an indefinite wait for results.

For each question, we therefore provide one cell for the `EXPLAIN` and one for the final SQL.
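
As a minimal sketch of the pattern, assuming the `actor` and `film_actor` tables from the practice database, prefixing a query with `EXPLAIN` asks PostgreSQL for the planner's estimated plan without running the query itself. In a notebook cell, the statement would follow the `%%sql` magic.

```sql
-- Plan only: PostgreSQL estimates costs and row counts
-- but does not execute the SELECT.
EXPLAIN
SELECT actor.last_name, film_actor.film_id
FROM actor
INNER JOIN film_actor ON (film_actor.actor_id = actor.actor_id);
```

In the resulting plan, `cost=` shows the estimated startup and total cost in the planner's arbitrary units, and `rows=` shows the estimated row count. Note that `EXPLAIN ANALYZE` does execute the query, so plain `EXPLAIN` is the safer first check.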


## Our practice schema:

We will use the same database as in the related [practice](../practices/JoinsPt1.ipynb).

A PDF of the _Entity-Relationship Diagram_ (ERD) is available [here](https://web.dsa.missouri.edu/static/PDF/DVD_Rental_ERD2.pdf).   
Printing it out is recommended.


In [1]:
%load_ext sql
%sql postgres://dsa_ro_user:readonly@pgsql.dsa.lan/dvdrental

'Connected: dsa_ro_user@dvdrental'

# 1

### List each category name with the actors who have been in films in that category, in order of category name and actor last name.

In [44]:
%%sql
EXPLAIN 
SELECT name, first_name, last_name
FROM category
INNER JOIN film_category ON (film_category.category_id = category.category_id)
INNER JOIN film_actor ON (film_actor.film_id = film_category.film_id)
INNER JOIN actor ON (actor.actor_id = film_actor.actor_id)
GROUP BY category.name, actor.last_name, actor.first_name
ORDER BY category.name, actor.last_name;

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
17 rows affected.


QUERY PLAN
Sort (cost=388.12..393.24 rows=2048 width=162)
"Sort Key: category.name, actor.last_name"
-> HashAggregate (cost=255.00..275.48 rows=2048 width=162)
"Group Key: category.name, actor.last_name, actor.first_name"
-> Hash Join (cost=39.67..214.04 rows=5462 width=81)
Hash Cond: (film_actor.actor_id = actor.actor_id)
-> Hash Join (cost=33.17..192.90 rows=5462 width=70)
Hash Cond: (film_actor.film_id = film_category.film_id)
-> Seq Scan on film_actor (cost=0.00..84.62 rows=5462 width=4)
-> Hash (cost=20.67..20.67 rows=1000 width=70)


In [43]:
%%sql
SELECT name, first_name, last_name
FROM category
INNER JOIN film_category ON (film_category.category_id = category.category_id)
INNER JOIN film_actor ON (film_actor.film_id = film_category.film_id)
INNER JOIN actor ON (actor.actor_id = film_actor.actor_id)
GROUP BY category.name, actor.last_name, actor.first_name
ORDER BY category.name, actor.last_name;

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
2596 rows affected.


name,first_name,last_name
Action,Christian,Akroyd
Action,Kirsten,Akroyd
Action,Kim,Allen
Action,Meryl,Allen
Action,Angelina,Astaire
Action,Russell,Bacall
Action,Jessica,Bailey
Action,Audrey,Bailey
Action,Renee,Ball
Action,Julia,Barrymore


# 2

### List the category name, language name, and actors who have been in films in that category and language, in order of category name and actor last name.

In [51]:
%%sql
EXPLAIN 
SELECT category.name, language.name, first_name, last_name
FROM category
INNER JOIN film_category ON (film_category.category_id = category.category_id)
INNER JOIN film ON (film.film_id = film_category.film_id)
INNER JOIN language ON (language.language_id = film.language_id)
INNER JOIN film_actor ON (film_actor.film_id = film_category.film_id)
INNER JOIN actor ON (actor.actor_id = film_actor.actor_id)
GROUP BY category.name, actor.last_name, language.name, actor.first_name
ORDER BY category.name, actor.last_name;

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
25 rows affected.


QUERY PLAN
Group (cost=637.85..706.12 rows=5462 width=178)
"Group Key: category.name, actor.last_name, language.name, actor.first_name"
-> Sort (cost=637.85..651.50 rows=5462 width=165)
"Sort Key: category.name, actor.last_name, language.name, actor.first_name"
-> Hash Join (cost=124.43..298.79 rows=5462 width=165)
Hash Cond: (film_actor.actor_id = actor.actor_id)
-> Hash Join (cost=117.93..277.65 rows=5462 width=154)
Hash Cond: (film_actor.film_id = film.film_id)
-> Seq Scan on film_actor (cost=0.00..84.62 rows=5462 width=4)
-> Hash (cost=105.43..105.43 rows=1000 width=158)


In [50]:
%%sql
SELECT category.name, language.name, first_name, last_name
FROM category
INNER JOIN film_category ON (film_category.category_id = category.category_id)
INNER JOIN film ON (film.film_id = film_category.film_id)
INNER JOIN language ON (language.language_id = film.language_id)
INNER JOIN film_actor ON (film_actor.film_id = film_category.film_id)
INNER JOIN actor ON (actor.actor_id = film_actor.actor_id)
GROUP BY category.name, actor.last_name, language.name, actor.first_name
ORDER BY category.name, actor.last_name;

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
2596 rows affected.


name,name_1,first_name,last_name
Action,English,Christian,Akroyd
Action,English,Kirsten,Akroyd
Action,English,Kim,Allen
Action,English,Meryl,Allen
Action,English,Angelina,Astaire
Action,English,Russell,Bacall
Action,English,Audrey,Bailey
Action,English,Jessica,Bailey
Action,English,Renee,Ball
Action,English,Julia,Barrymore


[Helpful Hints](https://youtu.be/Kvj04-g4yRs)  
 

--- 

# 3

### List the customer name, customer address, film rented, rental date, and duration of rental, in order of longest rental to shortest.

**HINT**: PostgreSQL can do math on _timestamps_ natively.
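
To illustrate the hint, here is a small sketch using literal timestamps (illustrative values, not a query against the `rental` table): subtracting one `timestamp` from another yields an `interval`.

```sql
-- Subtracting two timestamps returns an interval
-- (days plus hours:minutes:seconds).
SELECT TIMESTAMP '2005-06-27 22:57:58'
     - TIMESTAMP '2005-06-18 16:58:58' AS time_rented;
-- time_rented: 9 days 05:59:00
```

The same subtraction works on timestamp columns, and the resulting expression can be used in both the `SELECT` list and an `ORDER BY` clause.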

In [75]:
%%sql
EXPLAIN 
SELECT first_name, last_name, address, title, rental_duration, return_date, rental_date
FROM customer
INNER JOIN rental ON (rental.customer_id = customer.customer_id)
INNER JOIN address ON (customer.address_id = address.address_id)
INNER JOIN city ON (address.city_id = city.city_id)
INNER JOIN inventory ON (rental.inventory_id = inventory.inventory_id)
INNER JOIN film ON (inventory.film_id = film.film_id)
WHERE return_date IS NOT NULL
ORDER BY ((rental.return_date - rental.rental_date)) DESC, film.rental_duration DESC;

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
24 rows affected.


QUERY PLAN
Sort (cost=1933.02..1972.68 rows=15861 width=84)
"Sort Key: ((rental.return_date - rental.rental_date)) DESC, film.rental_duration DESC"
-> Hash Join (cost=267.12..826.46 rows=15861 width=84)
Hash Cond: (inventory.film_id = film.film_id)
-> Hash Join (cost=190.62..668.50 rows=15861 width=51)
Hash Cond: (rental.inventory_id = inventory.inventory_id)
-> Hash Join (cost=62.55..498.76 rows=15861 width=53)
Hash Cond: (address.city_id = city.city_id)
-> Hash Join (cost=44.05..438.33 rows=15861 width=55)
Hash Cond: (customer.address_id = address.address_id)


In [90]:
%%sql
SELECT first_name, last_name, address, title, rental_duration, (return_date - rental_date) AS time_rented, rental_date
FROM customer
INNER JOIN rental ON (rental.customer_id = customer.customer_id)
INNER JOIN address ON (customer.address_id = address.address_id)
INNER JOIN city ON (address.city_id = city.city_id)
INNER JOIN inventory ON (rental.inventory_id = inventory.inventory_id)
INNER JOIN film ON (inventory.film_id = film.film_id)
WHERE return_date IS NOT NULL
ORDER BY ((rental.return_date - rental.rental_date)) DESC, film.rental_duration DESC;

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
15861 rows affected.


first_name,last_name,address,title,rental_duration,time_rented,rental_date
Elaine,Stevens,801 Hagonoy Drive,Holocaust Highball,6,"9 days, 5:59:00",2005-06-18 16:58:58
Martin,Bales,368 Hunuco Boulevard,Highball Potter,6,"9 days, 5:59:00",2005-08-21 20:12:43
James,Gannon,1635 Kuwana Boulevard,Coneheads Smoochy,7,"9 days, 5:58:00",2005-08-17 04:27:24
Vera,Mccoy,1168 Najafabad Parkway,Mask Peach,6,"9 days, 5:58:00",2005-08-20 17:46:06
Pearl,Garza,60 Poos de Caldas Street,Tramp Others,4,"9 days, 5:58:00",2005-08-21 13:07:10
Jacqueline,Long,870 Ashqelon Loop,Panic Club,3,"9 days, 5:58:00",2005-07-28 10:21:52
Ashley,Richardson,1214 Hanoi Way,Notorious Reunion,7,"9 days, 5:56:00",2005-07-31 15:28:47
Brittany,Riley,140 Chiayi Parkway,Attacks Hate,5,"9 days, 5:56:00",2005-07-10 11:50:51
Connie,Wallace,1867 San Juan Bautista Tuxtepec Avenue,Jersey Sassy,6,"9 days, 5:55:00",2005-08-18 00:36:09
Chris,Brothers,331 Bydgoszcz Parkway,Tracy Cider,3,"9 days, 5:55:00",2005-07-06 18:03:16


# 4

### List the staff name, customer name, film rented, rental date, and duration of rental, showing only the 20 longest rentals.

In [81]:
%%sql
EXPLAIN 
SELECT staff.first_name, staff.last_name, customer.first_name, customer.last_name, title, rental_date, return_date, rental_duration
FROM staff
INNER JOIN rental ON (rental.staff_id = staff.staff_id)
INNER JOIN inventory ON (inventory.inventory_id = rental.inventory_id)
INNER JOIN film ON (inventory.film_id = film.film_id)
INNER JOIN customer ON (rental.customer_id = customer.customer_id)
WHERE return_date IS NOT NULL
ORDER BY ((rental.return_date - rental.rental_date)) DESC, film.rental_duration DESC
LIMIT 20;

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
21 rows affected.


QUERY PLAN
Limit (cost=1255.51..1255.56 rows=20 width=280)
-> Sort (cost=1255.51..1295.16 rows=15861 width=280)
"Sort Key: ((rental.return_date - rental.rental_date)) DESC, film.rental_duration DESC"
-> Hash Join (cost=228.10..833.45 rows=15861 width=280)
Hash Cond: (rental.customer_id = customer.customer_id)
-> Hash Join (cost=205.62..729.40 rows=15861 width=251)
Hash Cond: (inventory.film_id = film.film_id)
-> Hash Join (cost=129.12..611.08 rows=15861 width=236)
Hash Cond: (rental.inventory_id = inventory.inventory_id)
-> Hash Join (cost=1.04..441.34 rows=15861 width=238)


In [89]:
%%sql
SELECT staff.first_name, staff.last_name, customer.first_name, customer.last_name, title, rental_date, (return_date - rental_date) AS time_rented, rental_duration
FROM staff
INNER JOIN rental ON (rental.staff_id = staff.staff_id)
INNER JOIN inventory ON (inventory.inventory_id = rental.inventory_id)
INNER JOIN film ON (inventory.film_id = film.film_id)
INNER JOIN customer ON (rental.customer_id = customer.customer_id)
WHERE return_date IS NOT NULL
ORDER BY ((rental.return_date - rental.rental_date)) DESC, film.rental_duration DESC
LIMIT 20;

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
20 rows affected.


first_name,last_name,first_name_1,last_name_1,title,rental_date,time_rented,rental_duration
Jon,Stephens,Martin,Bales,Highball Potter,2005-08-21 20:12:43,"9 days, 5:59:00",6
Mike,Hillyer,Elaine,Stevens,Holocaust Highball,2005-06-18 16:58:58,"9 days, 5:59:00",6
Jon,Stephens,James,Gannon,Coneheads Smoochy,2005-08-17 04:27:24,"9 days, 5:58:00",7
Jon,Stephens,Vera,Mccoy,Mask Peach,2005-08-20 17:46:06,"9 days, 5:58:00",6
Jon,Stephens,Pearl,Garza,Tramp Others,2005-08-21 13:07:10,"9 days, 5:58:00",4
Mike,Hillyer,Jacqueline,Long,Panic Club,2005-07-28 10:21:52,"9 days, 5:58:00",3
Jon,Stephens,Ashley,Richardson,Notorious Reunion,2005-07-31 15:28:47,"9 days, 5:56:00",7
Mike,Hillyer,Brittany,Riley,Attacks Hate,2005-07-10 11:50:51,"9 days, 5:56:00",5
Mike,Hillyer,Connie,Wallace,Jersey Sassy,2005-08-18 00:36:09,"9 days, 5:55:00",6
Jon,Stephens,Chris,Brothers,Tracy Cider,2005-07-06 18:03:16,"9 days, 5:55:00",3


# Save your Notebook, then `File > Close and Halt`

---