# Nested Query Exercise

Remember to run `EXPLAIN` on a query before you execute it. This helps avoid unnecessary load on the DBMS and indefinite waits for results.

For each question, we therefore provide one cell for the `EXPLAIN` and one for the final SQL.
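
The plan-first habit can also be tried offline. The following is a minimal sketch, assuming Python's built-in `sqlite3` module and a hypothetical one-table stand-in for `rental` (not the real dvdrental schema). SQLite's `EXPLAIN QUERY PLAN` plays the role of PostgreSQL's `EXPLAIN` here, and its output format differs from the plans shown in this notebook.

```python
import sqlite3

# Hypothetical stand-in table; not the real dvdrental schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, inventory_id INTEGER)"
)
conn.executemany("INSERT INTO rental VALUES (?, ?)", [(1, 10), (2, 11), (3, 10)])

query = (
    "SELECT inventory_id, COUNT(*) FROM rental "
    "GROUP BY inventory_id ORDER BY inventory_id"
)

# Step 1: ask for the plan only; the query itself is not executed.
plan_rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for row in plan_rows:
    print(row)

# Step 2: once the plan looks reasonable, run the query.
result = conn.execute(query).fetchall()
print(result)
```

On PostgreSQL the same two steps are the `EXPLAIN` cell and the final SQL cell provided for each question.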


## Our practice schema:

We will be using the DVD rental schema for this exercise.

The ERD is available [here](../images/ERD-Rental.pdf).  
Printing is recommended.


<span style="font-weight:900; background:yellow">Each query should be implemented with at least one nested query.</span>

In [66]:
%load_ext sql
%sql postgres://dsa_ro_user:readonly@pgsql.dsa.lan/dvdrental

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


'Connected: dsa_ro_user@dvdrental'

# 1

### Which films have no rentals on 2005-05-31?

**HINT:** PostgreSQL can cast a _timestamp_ to a _date_ like so: `rental.rental_date::date`.
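
The cast matters because PostgreSQL reads a bare date literal compared against a timestamp as midnight of that day. Here is a small sketch of the difference, assuming Python's built-in `sqlite3` module and a hypothetical three-row `rental` table. SQLite has no `::date` cast, so its `date()` function stands in for it.

```python
import sqlite3

# Hypothetical toy data; timestamps are stored as ISO-8601 text in SQLite.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, rental_date TEXT)")
conn.executemany(
    "INSERT INTO rental VALUES (?, ?)",
    [
        (1, "2005-05-31 00:00:00"),
        (2, "2005-05-31 14:25:10"),
        (3, "2005-06-01 09:00:00"),
    ],
)

# Comparing the full timestamp matches only the row stamped exactly at midnight.
exact = conn.execute(
    "SELECT rental_id FROM rental "
    "WHERE rental_date = '2005-05-31 00:00:00' ORDER BY rental_id"
).fetchall()

# Reducing the timestamp to its date first matches every rental made that day.
same_day = conn.execute(
    "SELECT rental_id FROM rental "
    "WHERE date(rental_date) = '2005-05-31' ORDER BY rental_id"
).fetchall()

print(exact)
print(same_day)
```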

In [14]:
%%sql
EXPLAIN 
SELECT DISTINCT film_id FROM inventory 
WHERE NOT EXISTS (SELECT rental.rental_date::date FROM rental WHERE inventory.inventory_id = rental.inventory_id
  AND rental_date = '2005-05-31'
                 )       
ORDER BY film_id ;

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
10 rows affected.


QUERY PLAN
Sort (cost=205.42..207.82 rows=958 width=2)
Sort Key: inventory.film_id
-> HashAggregate (cost=148.40..157.98 rows=958 width=2)
Group Key: inventory.film_id
-> Hash Anti Join (cost=8.31..136.95 rows=4580 width=2)
Hash Cond: (inventory.inventory_id = rental.inventory_id)
-> Seq Scan on inventory (cost=0.00..70.81 rows=4581 width=6)
-> Hash (cost=8.30..8.30 rows=1 width=4)
-> Index Only Scan using idx_unq_rental_rental_date_inventory_id_customer_id on rental (cost=0.29..8.30 rows=1 width=4)
Index Cond: (rental_date = '2005-05-31 00:00:00'::timestamp without time zone)


In [15]:
%%sql
SELECT DISTINCT film_id FROM inventory 
WHERE NOT EXISTS (SELECT * FROM rental WHERE inventory.inventory_id = rental.inventory_id
  AND rental_date::date = '2005-05-31'
                 )        
ORDER BY film_id ;




 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
957 rows affected.


film_id
1
2
3
4
5
6
7
8
9
10


[Helpful Hints](https://youtu.be/MWpp2ioeAb8)  
 

--- 

# 2

### Which customers (name, phone number) have outstanding rentals (film name, rental_date)?
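
One way to phrase "outstanding" with a nested query is to select the rental rows whose `return_date` is still NULL and match the outer query on `rental_id`. The sketch below assumes Python's built-in `sqlite3` module and a hypothetical two-table toy schema, not the real dvdrental tables. Matching on the rental row itself keeps earlier, already-returned rentals of the same copy out of the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE rental (
    rental_id INTEGER PRIMARY KEY,
    inventory_id INTEGER,
    customer_id INTEGER,
    return_date TEXT            -- NULL while the copy is still out
);
INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO rental VALUES
    (1, 10, 1, '2005-06-01'),   -- copy 10, returned by Ann
    (2, 10, 2, NULL),           -- copy 10, still out with Bob
    (3, 11, 1, '2005-06-03');   -- copy 11, returned by Ann
""")

# The nested query picks out the outstanding rental rows by rental_id.
outstanding = conn.execute("""
    SELECT c.first_name, r.rental_id
    FROM customer c
    JOIN rental r ON r.customer_id = c.customer_id
    WHERE r.rental_id IN (SELECT rental_id FROM rental WHERE return_date IS NULL)
    ORDER BY r.rental_id
""").fetchall()
print(outstanding)
```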

In [48]:
%%sql
EXPLAIN 
SELECT title, rental_date, first_name, last_name, phone
FROM film 
JOIN inventory ON film.film_id = inventory.film_id
JOIN rental ON inventory.inventory_id = rental.inventory_id
JOIN customer ON rental.customer_id = customer.customer_id
JOIN address ON customer.address_id = address.address_id 
WHERE EXISTS (SELECT 'x' FROM rental WHERE inventory.inventory_id = rental.inventory_id
                 AND return_date IS NULL )









 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
20 rows affected.


QUERY PLAN
Hash Join (cost=357.33..612.51 rows=641 width=48)
Hash Cond: (customer.address_id = address.address_id)
-> Hash Join (cost=335.77..589.24 rows=641 width=38)
Hash Cond: (rental.customer_id = customer.customer_id)
-> Nested Loop (cost=313.29..565.07 rows=641 width=25)
-> Nested Loop (cost=313.00..461.47 rows=183 width=23)
-> Hash Semi Join (cost=312.73..398.51 rows=183 width=10)
Hash Cond: (inventory.inventory_id = rental_1.inventory_id)
-> Seq Scan on inventory (cost=0.00..70.81 rows=4581 width=6)
-> Hash (cost=310.44..310.44 rows=183 width=4)


In [53]:
%%sql
SELECT title, rental_date, first_name, last_name, phone
FROM film 
JOIN inventory ON film.film_id = inventory.film_id
JOIN rental ON inventory.inventory_id = rental.inventory_id
JOIN customer ON rental.customer_id = customer.customer_id
JOIN address ON customer.address_id = address.address_id 
WHERE EXISTS (SELECT 'x' FROM rental WHERE inventory.inventory_id = rental.inventory_id
                 AND return_date IS NULL )









 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
625 rows affected.


| title | rental_date | first_name | last_name | phone |
| --- | --- | --- | --- | --- |
| Academy Dinosaur | 2005-05-27 07:03:28 | Sergio | Stanfield | 387448063440 |
| Academy Dinosaur | 2005-06-21 00:30:26 | Freddie | Duggan | 644021380889 |
| Academy Dinosaur | 2005-07-07 20:59:06 | Marie | Turner | 177727722820 |
| Academy Dinosaur | 2005-07-27 07:51:11 | Mattie | Hoffman | 246810237916 |
| Academy Dinosaur | 2005-08-21 00:30:32 | Dwayne | Olvera | 62127829280 |
| Ace Goldfinger | 2005-08-01 04:24:47 | Penny | Neal | 271149517630 |
| Ace Goldfinger | 2006-02-14 15:16:03 | Brandon | Huey | 99883471275 |
| Affair Prejudice | 2005-07-11 20:21:18 | Janice | Ward | 663449333709 |
| Affair Prejudice | 2005-07-30 18:39:28 | Ruben | Geary | 52709222667 |
| Affair Prejudice | 2006-02-14 15:16:03 | Carmen | Owens | 272234298332 |


# 3

### List the movies that are not categorized as children's movies.

In [56]:
%%sql
EXPLAIN 
SELECT title FROM film as f JOIN film_category as fc ON f.film_id = fc.film_id
WHERE NOT EXISTS (SELECT * FROM category WHERE fc.category_id = category.category_id
                 AND name = 'Children')






 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
10 rows affected.


QUERY PLAN
Hash Join (cost=77.71..108.26 rows=938 width=15)
Hash Cond: (fc.film_id = f.film_id)
-> Hash Anti Join (cost=1.21..29.29 rows=938 width=2)
Hash Cond: (fc.category_id = category.category_id)
-> Seq Scan on film_category fc (cost=0.00..16.00 rows=1000 width=4)
-> Hash (cost=1.20..1.20 rows=1 width=4)
-> Seq Scan on category (cost=0.00..1.20 rows=1 width=4)
Filter: ((name)::text = 'Children'::text)
-> Hash (cost=64.00..64.00 rows=1000 width=19)
-> Seq Scan on film f (cost=0.00..64.00 rows=1000 width=19)


In [63]:
%%sql
SELECT title FROM film as f JOIN film_category as fc ON f.film_id = fc.film_id
WHERE NOT EXISTS (SELECT * FROM category WHERE fc.category_id = category.category_id
                 AND name = 'Children') ;








 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
940 rows affected.


title
Academy Dinosaur
Ace Goldfinger
Adaptation Holes
Affair Prejudice
African Egg
Agent Truman
Airplane Sierra
Airport Pollock
Alabama Devil
Aladdin Calendar


[Helpful Hints](https://youtu.be/9WR0ByMn__E)  
 

--- 

# 4

### List the names of the customers who have rented the 5 least popular movies.

**The five least popular movies are the movies with the fewest rentals.**

(Do not include movies that have never been rented. Do not worry about ties: use just 5 movies, even if other movies were rented the same number of times as some of those 5.)
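
The rules above can be sketched on toy data. This sketch assumes Python's built-in `sqlite3` module, a hypothetical cut-down `inventory` and `rental` schema, and a cutoff `n` of 1 instead of 5 so the tiny data set stays readable. Rentals are counted per film (through `inventory.film_id`) rather than per copy, and the inner join drops films that were never rented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, film_id INTEGER);
CREATE TABLE rental (
    rental_id INTEGER PRIMARY KEY,
    inventory_id INTEGER,
    customer_id INTEGER
);
-- Film 1 has two copies, film 2 has one, film 3 has one that is never rented.
INSERT INTO inventory VALUES (1, 1), (2, 1), (3, 2), (4, 3);
INSERT INTO rental VALUES
    (1, 1, 100), (2, 1, 101), (3, 2, 100),   -- film 1: three rentals in total
    (4, 3, 102);                             -- film 2: one rental
""")

n = 1  # the exercise uses 5; 1 keeps this toy example small

customers = conn.execute("""
    SELECT DISTINCT r.customer_id
    FROM rental r
    JOIN inventory i ON i.inventory_id = r.inventory_id
    WHERE i.film_id IN (
        -- least-rented films; never-rented films have no rental rows to join
        SELECT i2.film_id
        FROM rental r2
        JOIN inventory i2 ON i2.inventory_id = r2.inventory_id
        GROUP BY i2.film_id
        ORDER BY COUNT(*) ASC, i2.film_id
        LIMIT ?
    )
    ORDER BY r.customer_id
""", (n,)).fetchall()
print(customers)
```

Counting per copy instead would tie copy 2 of film 1 (one rental) with the only copy of film 2, even though film 1 has three rentals overall.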

In [167]:
%%sql
EXPLAIN
select first_name, last_name from customer WHERE customer_id in  (
select customer_id from rental WHERE inventory_id in (
select inventory_id from rental GROUP BY inventory_id 
                                HAVING COUNT(inventory_id) >= 1 
                                ORDER BY COUNT(inventory_id) ASC LIMIT 5 ))



 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
17 rows affected.


QUERY PLAN
Nested Loop (cost=645.26..651.89 rows=18 width=13)
-> HashAggregate (cost=644.98..645.16 rows=18 width=2)
Group Key: rental.customer_id
-> Nested Loop (cost=556.96..644.94 rows=18 width=2)
-> Limit (cost=552.64..552.65 rows=5 width=12)
-> Sort (cost=552.64..564.09 rows=4580 width=12)
Sort Key: (count(rental_1.inventory_id))
-> HashAggregate (cost=430.77..476.57 rows=4580 width=12)
Group Key: rental_1.inventory_id
Filter: (count(rental_1.inventory_id) >= 1)


In [169]:
%%sql
select first_name, last_name from customer WHERE customer_id in  (
select customer_id from rental WHERE inventory_id in (
select inventory_id from rental GROUP BY inventory_id 
                                HAVING COUNT(inventory_id) >= 1 
                                ORDER BY COUNT(inventory_id) ASC LIMIT 5 ))



 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
6 rows affected.


| first_name | last_name |
| --- | --- |
| Judy | Gray |
| Holly | Fox |
| Maria | Miller |
| Myrtle | Fleming |
| Jason | Morrissey |
| Jonathan | Scarborough |


# 5

### List the movies that have been rented by the top ten renters.

In [171]:
%%sql
EXPLAIN 
SELECT title from film WHERE film_id in (
SELECT film_id from inventory WHERE inventory_id in (
SELECT inventory_id from rental WHERE customer_id in ( 
SELECT customer_id from rental GROUP BY customer_id 
                               ORDER BY count(customer_id) DESC LIMIT 10)))





 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
19 rows affected.


QUERY PLAN
Hash Semi Join (cost=855.22..924.82 rows=268 width=15)
Hash Cond: (film.film_id = inventory.film_id)
-> Seq Scan on film (cost=0.00..64.00 rows=1000 width=19)
-> Hash (cost=851.87..851.87 rows=268 width=2)
-> Hash Semi Join (cost=766.05..851.87 rows=268 width=2)
Hash Cond: (inventory.inventory_id = rental.inventory_id)
-> Seq Scan on inventory (cost=0.00..70.81 rows=4581 width=6)
-> Hash (cost=762.70..762.70 rows=268 width=4)
-> Hash Join (cost=409.84..762.70 rows=268 width=4)
Hash Cond: (rental.customer_id = "ANY_subquery".customer_id)


In [172]:
%%sql
SELECT title from film WHERE film_id in (
SELECT film_id from inventory WHERE inventory_id in (
SELECT inventory_id from rental WHERE customer_id in ( 
SELECT customer_id from rental GROUP BY customer_id 
                               ORDER BY count(customer_id) DESC LIMIT 10)))






 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
324 rows affected.


title
Chamber Italian
Airport Pollock
Adaptation Holes
Affair Prejudice
Alabama Devil
Date Speed
Ali Forever
Alone Trip
Alter Victory
American Circus


# 6

### Consider the previous question and the answer SQL.  Now add a column to the result that is the total number of movie rentals for the _top-ten renters_ per film.
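
On one reading of this question, the new column counts only the rentals made by the top renters, not every rental of the film. Below is a minimal sketch of that reading, assuming Python's built-in `sqlite3` module, a hypothetical cut-down schema, and a "top 1" cutoff in place of the top ten. The filter on `customer_id` sits on the same `rental` rows that are being counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, film_id INTEGER);
CREATE TABLE rental (
    rental_id INTEGER PRIMARY KEY,
    inventory_id INTEGER,
    customer_id INTEGER
);
INSERT INTO inventory VALUES (1, 1), (2, 2), (3, 3);
INSERT INTO rental VALUES
    (1, 1, 100), (2, 1, 100), (3, 2, 100),   -- customer 100: three rentals
    (4, 1, 101),                             -- customer 101: one rental of film 1
    (5, 3, 102);                             -- customer 102: one rental of film 3
""")

top_n = 1  # the exercise uses 10

per_film = conn.execute("""
    SELECT i.film_id, COUNT(*) AS rentals_by_top_renters
    FROM rental r
    JOIN inventory i ON i.inventory_id = r.inventory_id
    WHERE r.customer_id IN (
        SELECT customer_id
        FROM rental
        GROUP BY customer_id
        ORDER BY COUNT(*) DESC, customer_id
        LIMIT ?
    )
    GROUP BY i.film_id
    ORDER BY i.film_id
""", (top_n,)).fetchall()
print(per_film)
```

Film 1 has three rentals overall in this toy data, but only two of them were made by the top renter, so the counts differ depending on where the filter is applied.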

In [186]:
%%sql
EXPLAIN 
SELECT title, count(*) from film JOIN inventory on film.film_id = inventory.film_id
JOIN rental ON inventory.inventory_id = rental.inventory_id WHERE film.film_id in (
SELECT film_id from inventory WHERE inventory_id in (
SELECT inventory_id from rental WHERE customer_id in ( 
SELECT customer_id from rental GROUP BY customer_id 
                               ORDER BY count(customer_id) DESC LIMIT 10)))
GROUP BY title




 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
29 rows affected.


QUERY PLAN
HashAggregate (cost=1478.91..1488.91 rows=1000 width=23)
Group Key: film.title
-> Hash Join (cost=1043.79..1457.41 rows=4301 width=15)
Hash Cond: (rental.inventory_id = inventory.inventory_id)
-> Seq Scan on rental (cost=0.00..310.44 rows=16044 width=4)
-> Hash (cost=1028.44..1028.44 rows=1228 width=19)
-> Hash Join (cost=928.17..1028.44 rows=1228 width=19)
Hash Cond: (inventory.film_id = film.film_id)
-> Seq Scan on inventory (cost=0.00..70.81 rows=4581 width=6)
-> Hash (cost=924.82..924.82 rows=268 width=21)


In [187]:
%%sql
SELECT title, count(*) from film JOIN inventory on film.film_id = inventory.film_id
JOIN rental ON inventory.inventory_id = rental.inventory_id WHERE film.film_id in (
SELECT film_id from inventory WHERE inventory_id in (
SELECT inventory_id from rental WHERE customer_id in ( 
SELECT customer_id from rental GROUP BY customer_id 
                               ORDER BY count(customer_id) DESC LIMIT 10)))
GROUP BY title







 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
324 rows affected.


| title | count |
| --- | --- |
| Graceland Dynamite | 6 |
| Wonderful Drop | 9 |
| Purple Movie | 13 |
| Fantasy Troopers | 26 |
| Sunset Racer | 8 |
| Carol Texas | 18 |
| Stepmom Dream | 20 |
| Arabia Dogma | 13 |
| Minds Truman | 20 |
| Baked Cleopatra | 16 |


# 7

### For each rental store, list its city, its `store_id`, and the movies that have not been rented from that store.

**Note:** A video walkthrough for this challenging SQL is provided below.

In [188]:
%%sql
EXPLAIN 
SELECT c.city, s.store_id, f.title, f.film_id
FROM store s 
JOIN address a using (address_id)
JOIN city c using (city_id)
, film f
WHERE NOT EXISTS(
    SELECT 'Z'
    FROM film f2 JOIN inventory i USING (film_id)
    JOIN rental r USING (inventory_id)
    JOIN payment p USING (rental_id, staff_id)
    JOIN staff ss USING (staff_id)
    WHERE f2.film_id = f.film_id 
    AND s.store_id = ss.store_id
)

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
31 rows affected.


QUERY PLAN
Hash Anti Join (cost=1296.26..1427.01 rows=1 width=32)
Hash Cond: ((s.store_id = ss.store_id) AND (f.film_id = f2.film_id))
-> Nested Loop (cost=1.32..107.07 rows=2000 width=32)
-> Seq Scan on film f (cost=0.00..64.00 rows=1000 width=19)
-> Materialize (cost=1.32..18.07 rows=2 width=13)
-> Nested Loop (cost=1.32..18.06 rows=2 width=13)
-> Hash Join (cost=1.04..17.36 rows=2 width=6)
Hash Cond: (a.address_id = s.address_id)
-> Seq Scan on address a (cost=0.00..14.03 rows=603 width=6)
-> Hash (cost=1.02..1.02 rows=2 width=6)


In [189]:
%%sql
SELECT c.city, s.store_id, f.title, f.film_id
FROM store s 
JOIN address a using (address_id)
JOIN city c using (city_id)
, film f
WHERE NOT EXISTS(
    SELECT 'Z'
    FROM film f2 JOIN inventory i USING (film_id)
    JOIN rental r USING (inventory_id)
    JOIN payment p USING (rental_id, staff_id)
    JOIN staff ss USING (staff_id)
    WHERE f2.film_id = f.film_id 
    AND s.store_id = ss.store_id
)






 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
145 rows affected.


| city | store_id | title | film_id |
| --- | --- | --- | --- |
| Lethbridge | 1 | Grosse Wonderful | 384 |
| Lethbridge | 1 | Alice Fantasia | 14 |
| Woodridge | 2 | Alice Fantasia | 14 |
| Lethbridge | 1 | Apollo Teen | 33 |
| Woodridge | 2 | Apollo Teen | 33 |
| Lethbridge | 1 | Argonauts Town | 36 |
| Woodridge | 2 | Argonauts Town | 36 |
| Lethbridge | 1 | Ark Ridgemont | 38 |
| Woodridge | 2 | Ark Ridgemont | 38 |
| Lethbridge | 1 | Arsenic Independence | 41 |


#### Helpful Hints
  1. For the first hint, watch only the first 5:57 of the video, where the conceptual aspects of the task are discussed.
  1. Then attempt to construct the SQL based on the video's explanation of the concept.
  1. If you get stuck again, the remainder of the video looks directly at the SQL construction.
  
[Helpful Hints](https://youtu.be/GyMODTEDfu4)  
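
The pattern behind question 7, pairing every store with every film and keeping only the pairs that have no matching rental, can be tried on toy data. This sketch assumes Python's built-in `sqlite3` module and a hypothetical four-table schema in which `inventory` carries a `store_id` column. It illustrates the cross-product-plus-`NOT EXISTS` idea only and is not the query from the video.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store (store_id INTEGER PRIMARY KEY);
CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE inventory (
    inventory_id INTEGER PRIMARY KEY,
    film_id INTEGER,
    store_id INTEGER
);
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, inventory_id INTEGER);

INSERT INTO store VALUES (1), (2);
INSERT INTO film VALUES (1, 'A'), (2, 'B');
-- Store 1 stocks both films; store 2 stocks film A only.
INSERT INTO inventory VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
-- Only store 1's copies have ever been rented.
INSERT INTO rental VALUES (1, 1), (2, 3);
""")

not_rented = conn.execute("""
    SELECT s.store_id, f.title
    FROM store s
    CROSS JOIN film f                 -- every (store, film) pair
    WHERE NOT EXISTS (                -- keep pairs with no rental at that store
        SELECT 1
        FROM inventory i
        JOIN rental r ON r.inventory_id = i.inventory_id
        WHERE i.film_id = f.film_id
          AND i.store_id = s.store_id
    )
    ORDER BY s.store_id, f.title
""").fetchall()
print(not_rented)
```

Film B is reported for store 2 even though that store holds no copy of it, because the cross join starts from every pair rather than from the stock on hand.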


# Save your Notebook, then `File > Close and Halt`