# Joins Exercise


Remember to run `EXPLAIN` on each query before you execute it. Checking the plan first helps avoid unnecessary load on the DBMS and keeps you from waiting indefinitely for results.

For that reason, each question has one cell for the `EXPLAIN` and one for the final SQL.
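
As a rough illustration of the explain-first habit, the sketch below asks for a query plan without running the query itself. It is a stand-in under stated assumptions: it uses Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN` on two empty toy tables, not the course's PostgreSQL server, so the plan text will look different from what `EXPLAIN` prints in the cells of this exercise.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL
# so that this sketch runs without a database server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE film_category (film_id INTEGER, category_id INTEGER);
""")

query = """
    SELECT C.name, FC.film_id
    FROM category AS C
    JOIN film_category AS FC
    ON C.category_id = FC.category_id
"""

# Ask only for the plan; the join itself is not executed.
plan_rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for row in plan_rows:
    print(row[-1])  # the last column holds the readable plan step
```

In PostgreSQL, plain `EXPLAIN` likewise reports the planner's estimate without executing the statement.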


## Our practice schema:

We will use the same database as in [Joins practice part 2](../practices/JoinsPt2.ipynb).

A PDF of the _Entity-Relationship Diagram_ (ERD) is available [here](https://web.dsa.missouri.edu/static/PDF/DVD_Rental_ERD2.pdf).   
Printing it out is recommended.


In [1]:
%load_ext sql
%sql postgres://dsa_ro_user:readonly@pgsql.dsa.lan/dvdrental

'Connected: dsa_ro_user@dvdrental'

# 1

### List the category name with actors who have been in films in that category, in order of category name and actor last name.

In [6]:
%%sql
EXPLAIN
SELECT name, first_name, last_name
FROM category AS C
JOIN film_category AS FC
ON C.category_id = FC.category_id
JOIN film AS F
ON F.film_id = FC.film_id
JOIN film_actor AS FA
ON F.film_id = FA.film_id
JOIN actor AS A
ON FA.actor_id = A.actor_id
GROUP BY name, first_name, last_name
ORDER BY name, last_name

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
21 rows affected.


QUERY PLAN
Sort (cost=467.26..472.38 rows=2048 width=81)
Sort Key: c.name, a.last_name
-> HashAggregate (cost=334.14..354.62 rows=2048 width=81)
Group Key: c.name, a.last_name, a.first_name
-> Hash Join (cost=118.81..293.17 rows=5462 width=81)
Hash Cond: (fa.actor_id = a.actor_id)
-> Hash Join (cost=112.31..272.03 rows=5462 width=70)
Hash Cond: (fa.film_id = f.film_id)
-> Seq Scan on film_actor fa (cost=0.00..84.62 rows=5462 width=4)
-> Hash (cost=99.81..99.81 rows=1000 width=74)


In [5]:
%%sql
SELECT name, first_name, last_name
FROM category AS C
JOIN film_category AS FC
ON C.category_id = FC.category_id
JOIN film AS F
ON F.film_id = FC.film_id
JOIN film_actor AS FA
ON F.film_id = FA.film_id
JOIN actor AS A
ON FA.actor_id = A.actor_id
-- GROUP BY without aggregates collapses duplicates:
-- an actor can appear in several films of one category.
GROUP BY name, first_name, last_name
ORDER BY name, last_name

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
2596 rows affected.


name,first_name,last_name
Action,Christian,Akroyd
Action,Kirsten,Akroyd
Action,Kim,Allen
Action,Meryl,Allen
Action,Angelina,Astaire
Action,Russell,Bacall
Action,Jessica,Bailey
Action,Audrey,Bailey
Action,Renee,Ball
Action,Julia,Barrymore
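
The query above relies on `GROUP BY` with no aggregate functions to remove repeated category/actor pairs. A minimal sketch of that idea follows, assuming a hypothetical toy table `casting` in an in-memory SQLite database (again a stand-in for PostgreSQL, and not part of the dvdrental schema). It suggests that `GROUP BY` over all selected columns returns the same rows as `SELECT DISTINCT`.

```python
import sqlite3

# Assumption: "casting" is a made-up table that mimics the joined rows
# (one row per category, actor, and film).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE casting (category TEXT, last_name TEXT, film TEXT);
    INSERT INTO casting VALUES
        ('Action', 'Akroyd', 'Film A'),
        ('Action', 'Akroyd', 'Film B'),
        ('Action', 'Allen',  'Film C');
""")

# Without grouping, the actor in two Action films is listed twice.
joined = conn.execute(
    "SELECT category, last_name FROM casting ORDER BY category, last_name"
).fetchall()

# GROUP BY over every selected column keeps one row per distinct pair.
grouped = conn.execute(
    "SELECT category, last_name FROM casting "
    "GROUP BY category, last_name ORDER BY category, last_name"
).fetchall()

# SELECT DISTINCT expresses the same intent more directly.
distinct = conn.execute(
    "SELECT DISTINCT category, last_name FROM casting "
    "ORDER BY category, last_name"
).fetchall()

print(len(joined), len(grouped), len(distinct))  # 3 2 2
```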


# 2

### List the category name, language name, and actors who have been in films in that category, in order of category name and actor last name.

In [16]:
%%sql
EXPLAIN
SELECT C.name, L.name, first_name, last_name
FROM category AS C
JOIN film_category AS FC
ON C.category_id = FC.category_id
JOIN film AS F
ON F.film_id = FC.film_id
JOIN film_actor AS FA
ON F.film_id = FA.film_id
JOIN language AS L
ON F.language_id = L.language_id
JOIN actor AS A
ON FA.actor_id = A.actor_id
GROUP BY C.name, L.name, first_name, last_name
ORDER BY C.name, last_name

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
25 rows affected.


QUERY PLAN
Group (cost=637.85..706.12 rows=5462 width=165)
Group Key: c.name, a.last_name, l.name, a.first_name
-> Sort (cost=637.85..651.50 rows=5462 width=165)
Sort Key: c.name, a.last_name, l.name, a.first_name
-> Hash Join (cost=124.43..298.79 rows=5462 width=165)
Hash Cond: (fa.actor_id = a.actor_id)
-> Hash Join (cost=117.93..277.65 rows=5462 width=154)
Hash Cond: (fa.film_id = f.film_id)
-> Seq Scan on film_actor fa (cost=0.00..84.62 rows=5462 width=4)
-> Hash (cost=105.43..105.43 rows=1000 width=158)


In [19]:
%%sql
SELECT C.name, L.name, first_name, last_name
FROM category AS C
JOIN film_category AS FC
ON C.category_id = FC.category_id
JOIN film AS F
ON F.film_id = FC.film_id
JOIN film_actor AS FA
ON F.film_id = FA.film_id
JOIN language AS L
ON F.language_id = L.language_id
JOIN actor AS A
ON FA.actor_id = A.actor_id
GROUP BY C.name, L.name, first_name, last_name
ORDER BY C.name, last_name

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
2596 rows affected.


name,name_1,first_name,last_name
Action,English,Christian,Akroyd
Action,English,Kirsten,Akroyd
Action,English,Kim,Allen
Action,English,Meryl,Allen
Action,English,Angelina,Astaire
Action,English,Russell,Bacall
Action,English,Audrey,Bailey
Action,English,Jessica,Bailey
Action,English,Renee,Ball
Action,English,Julia,Barrymore


[Helpful Hints](https://youtu.be/Kvj04-g4yRs)  
 

--- 

# 3

### List the customer name, customer address, film rented, rental date, and duration of rental, in order of longest rental to shortest.

**HINT**: PostgreSQL can do math on _timestamps_ natively.
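
To make the hint concrete, here is a small Python analogue of timestamp arithmetic. It is only a sketch: the rental timestamp is taken from the first result row of this question, while the return timestamp is a hypothetical value back-computed from the displayed duration, not read from the database. Python's `datetime` subtraction is used as a stand-in for PostgreSQL's `timestamp - timestamp`, which yields an interval; the notebook prints such intervals in the same `9 days, 5:59:00` style.

```python
from datetime import datetime

rental_date = datetime(2005, 8, 21, 20, 12, 43)
# Hypothetical return timestamp, back-computed from the displayed duration.
return_date = datetime(2005, 8, 31, 2, 11, 43)

# Subtracting full timestamps keeps the time-of-day part.
rental_duration = return_date - rental_date
print(rental_duration)  # 9 days, 5:59:00

# Subtracting only the calendar dates (compare the ::date casts)
# gives a whole number of days instead.
whole_days = (return_date.date() - rental_date.date()).days
print(whole_days)  # 10
```

The two results differ because the date-only version ignores the time of day, which is why the choice between `::date` and `::timestamp` casts changes what the duration column shows.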

In [46]:
%%sql
EXPLAIN
-- Subtracting two dates yields a whole number of days.
SELECT first_name, last_name, title, address, rental_date,
       (return_date::date - rental_date::date) AS rental_duration
FROM film AS F
JOIN inventory AS I
ON F.film_id = I.film_id
JOIN rental AS R
ON I.inventory_id = R.inventory_id
JOIN customer AS C
ON R.customer_id = C.customer_id
JOIN address AS A
ON C.address_id = A.address_id
WHERE (rental_date, return_date) IS NOT NULL
-- In GROUP BY, rental_duration resolves to the column film.rental_duration,
-- not to the SELECT alias; ORDER BY uses the alias.
GROUP BY first_name, last_name, title, address, rental_date, rental_duration, return_date
ORDER BY rental_duration DESC;

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
22 rows affected.


QUERY PLAN
Sort (cost=2388.08..2427.73 rows=15861 width=70)
Sort Key: (((r.return_date)::date - (r.rental_date)::date)) DESC
-> HashAggregate (cost=1003.96..1281.52 rows=15861 width=70)
Group Key: c.first_name, c.last_name, f.title, a.address, r.rental_date, f.rental_duration, r.return_date
-> Hash Join (cost=248.62..726.39 rows=15861 width=66)
Hash Cond: (c.address_id = a.address_id)
-> Hash Join (cost=227.05..662.90 rows=15861 width=48)
Hash Cond: (r.customer_id = c.customer_id)
-> Hash Join (cost=204.57..598.49 rows=15861 width=35)
Hash Cond: (i.film_id = f.film_id)


In [7]:
%%sql
-- Subtracting two timestamps yields an interval (days plus hours, minutes, seconds).
SELECT first_name, last_name, title, address, rental_date,
       (return_date::timestamp - rental_date::timestamp) AS rental_duration
FROM film AS F
JOIN inventory AS I
ON F.film_id = I.film_id
JOIN rental AS R
ON I.inventory_id = R.inventory_id
JOIN customer AS C
ON R.customer_id = C.customer_id
JOIN address AS A
ON C.address_id = A.address_id
WHERE (rental_date, return_date) IS NOT NULL
GROUP BY first_name, last_name, title, address, rental_date, rental_duration, return_date
ORDER BY rental_duration DESC;

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
15861 rows affected.


first_name,last_name,title,address,rental_date,rental_duration
Martin,Bales,Highball Potter,368 Hunuco Boulevard,2005-08-21 20:12:43,"9 days, 5:59:00"
Elaine,Stevens,Holocaust Highball,801 Hagonoy Drive,2005-06-18 16:58:58,"9 days, 5:59:00"
Pearl,Garza,Tramp Others,60 Poos de Caldas Street,2005-08-21 13:07:10,"9 days, 5:58:00"
James,Gannon,Coneheads Smoochy,1635 Kuwana Boulevard,2005-08-17 04:27:24,"9 days, 5:58:00"
Jacqueline,Long,Panic Club,870 Ashqelon Loop,2005-07-28 10:21:52,"9 days, 5:58:00"
Vera,Mccoy,Mask Peach,1168 Najafabad Parkway,2005-08-20 17:46:06,"9 days, 5:58:00"
Brittany,Riley,Attacks Hate,140 Chiayi Parkway,2005-07-10 11:50:51,"9 days, 5:56:00"
Ashley,Richardson,Notorious Reunion,1214 Hanoi Way,2005-07-31 15:28:47,"9 days, 5:56:00"
Chris,Brothers,Tracy Cider,331 Bydgoszcz Parkway,2005-07-06 18:03:16,"9 days, 5:55:00"
Connie,Wallace,Jersey Sassy,1867 San Juan Bautista Tuxtepec Avenue,2005-08-18 00:36:09,"9 days, 5:55:00"


# 4

### List the staff name, customer name, film rented, rental date, and duration of rental, showing only the 20 longest rentals.

In [6]:
%%sql
EXPLAIN
SELECT title,
       (return_date::timestamp - rental_date::timestamp) AS rental_duration,
       rental_date,
       S.first_name AS staff_first_name, S.last_name AS staff_last_name,
       C.first_name, C.last_name
FROM film AS F
JOIN inventory AS I
ON F.film_id = I.film_id
JOIN rental AS R
ON I.inventory_id = R.inventory_id
JOIN payment AS P
ON R.rental_id = P.rental_id
JOIN staff AS S
ON P.staff_id = S.staff_id
JOIN customer AS C
ON R.customer_id = C.customer_id
JOIN address AS A
ON C.address_id = A.address_id
WHERE (rental_date, return_date) IS NOT NULL
GROUP BY staff_first_name, staff_last_name, title, C.first_name, C.last_name, rental_date, rental_duration, return_date
ORDER BY rental_duration DESC
LIMIT 20;

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
33 rows affected.


QUERY PLAN
Limit (cost=3062.64..3062.69 rows=20 width=278)
-> Sort (cost=3062.64..3098.72 rows=14430 width=278)
Sort Key: ((r.return_date - r.rental_date)) DESC
-> Group (cost=2317.92..2678.67 rows=14430 width=278)
Group Key: s.first_name, s.last_name, f.title, c.first_name, c.last_name, r.rental_date, f.rental_duration, r.return_date
-> Sort (cost=2317.92..2353.99 rows=14430 width=262)
Sort Key: s.first_name, s.last_name, f.title, c.first_name, c.last_name, r.rental_date, f.rental_duration, r.return_date
-> Hash Join (cost=758.37..1321.03 rows=14430 width=262)
Hash Cond: (c.address_id = a.address_id)
-> Hash Join (cost=736.80..1261.32 rows=14430 width=264)


In [5]:
%%sql
SELECT title,
       (return_date::timestamp - rental_date::timestamp) AS rental_duration,
       rental_date,
       S.first_name AS staff_first_name, S.last_name AS staff_last_name,
       C.first_name, C.last_name
FROM film AS F
JOIN inventory AS I
ON F.film_id = I.film_id
JOIN rental AS R
ON I.inventory_id = R.inventory_id
-- Staff is reached through payment, so a rental with no payment row is dropped by this inner join.
JOIN payment AS P
ON R.rental_id = P.rental_id
JOIN staff AS S
ON P.staff_id = S.staff_id
JOIN customer AS C
ON R.customer_id = C.customer_id
JOIN address AS A
ON C.address_id = A.address_id
WHERE (rental_date, return_date) IS NOT NULL
GROUP BY staff_first_name, staff_last_name, title, C.first_name, C.last_name, rental_date, rental_duration, return_date
ORDER BY rental_duration DESC
LIMIT 20;

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
20 rows affected.


title,rental_duration,rental_date,staff_first_name,staff_last_name,first_name,last_name
Highball Potter,"9 days, 5:59:00",2005-08-21 20:12:43,Jon,Stephens,Martin,Bales
Holocaust Highball,"9 days, 5:59:00",2005-06-18 16:58:58,Jon,Stephens,Elaine,Stevens
Coneheads Smoochy,"9 days, 5:58:00",2005-08-17 04:27:24,Mike,Hillyer,James,Gannon
Tramp Others,"9 days, 5:58:00",2005-08-21 13:07:10,Jon,Stephens,Pearl,Garza
Mask Peach,"9 days, 5:58:00",2005-08-20 17:46:06,Jon,Stephens,Vera,Mccoy
Panic Club,"9 days, 5:58:00",2005-07-28 10:21:52,Mike,Hillyer,Jacqueline,Long
Attacks Hate,"9 days, 5:56:00",2005-07-10 11:50:51,Mike,Hillyer,Brittany,Riley
Notorious Reunion,"9 days, 5:56:00",2005-07-31 15:28:47,Jon,Stephens,Ashley,Richardson
Tracy Cider,"9 days, 5:55:00",2005-07-06 18:03:16,Mike,Hillyer,Chris,Brothers
Jersey Sassy,"9 days, 5:55:00",2005-08-18 00:36:09,Mike,Hillyer,Connie,Wallace


# Save your Notebook, then `File > Close and Halt`

---