# Nested Query Exercise

Remember to run `EXPLAIN` on a query before you execute it. Checking the plan first helps you avoid putting unnecessary load on the DBMS and waiting indefinitely for results.

For each question, we therefore provide one cell for the `EXPLAIN` and one cell for the final SQL.
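The plan-first habit can be tried outside the course database. The sketch below is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module with a made-up in-memory `rental` table, and SQLite's `EXPLAIN QUERY PLAN` as a stand-in for PostgreSQL's `EXPLAIN`. In PostgreSQL, plain `EXPLAIN` only plans the query, whereas `EXPLAIN ANALYZE` actually executes it.

```python
import sqlite3

# Toy stand-in for the course database: SQLite in memory, not PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, customer_id INTEGER)"
)
conn.executemany("INSERT INTO rental (customer_id) VALUES (?)", [(1,), (2,), (1,)])

query = "SELECT customer_id, COUNT(*) FROM rental GROUP BY customer_id"

# Step 1: ask the planner what it would do; the query itself is not run.
plan_rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for row in plan_rows:
    print(row)

# Step 2: run the real query only once the plan looks reasonable.
result = conn.execute(query + " ORDER BY customer_id").fetchall()
print(result)
```

The exact plan text differs between database engines and versions, so read it for its shape (scans, joins, sorts) rather than for specific wording.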


## Our practice schema:

We will use the same database as in the Day 1 practice.

A PDF of the _Entity-Relationship Diagram_ (ERD) is available [here](https://indigo.sgn.missouri.edu/static/PDF/DVD_Rental_ERD2.pdf).  
Printing is recommended.


<span style="font-weight:900; background:yellow">Each query should be implemented with at least one nested query.</span>

In [2]:
%load_ext sql
%sql postgres://dsa_ro_user:readonly@pgsql.dsa.lan/dvdrental

'Connected: dsa_ro_user@dvdrental'

# 1

### Which films have no rentals on 2005-05-31?

**HINT:** PostgreSQL can cast a _timestamp_ to a _date_ like so: `rental.rental_date::date`.
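To see why the cast matters, here is a small sketch on made-up data. It assumes SQLite (via Python's `sqlite3`) instead of PostgreSQL, so the SQLite function `date(...)` plays the role of `rental_date::date`. The table contents are invented for illustration.

```python
import sqlite3

# Toy stand-in: SQLite in memory; timestamps are stored as text here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, rental_date TEXT)")
conn.executemany(
    "INSERT INTO rental (rental_id, rental_date) VALUES (?, ?)",
    [
        (1, "2005-05-31 00:46:31"),
        (2, "2005-05-31 23:59:59"),
        (3, "2005-06-01 00:00:00"),
    ],
)

# date(rental_date) drops the time of day, like rental_date::date in PostgreSQL,
# so every rental made at any time on that day matches.
same_day = conn.execute(
    "SELECT rental_id FROM rental "
    "WHERE date(rental_date) = '2005-05-31' ORDER BY rental_id"
).fetchall()

# Without the conversion, this SQLite sketch compares the full timestamp text
# to the bare date and finds no match.
raw_compare = conn.execute(
    "SELECT rental_id FROM rental WHERE rental_date = '2005-05-31'"
).fetchall()

print(same_day)
print(raw_compare)
```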

In [3]:
%%sql
EXPLAIN 
SELECT DISTINCT film_id
FROM inventory
WHERE film_id NOT IN (
    SELECT film_id
    FROM inventory JOIN rental USING (inventory_id)
    WHERE rental_date::date = '2005-05-31')
ORDER BY film_id

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
12 rows affected.


QUERY PLAN
Sort (cost=717.54..719.84 rows=923 width=2)
Sort Key: inventory.film_id
-> HashAggregate (cost=662.85..672.08 rows=923 width=2)
Group Key: inventory.film_id
-> Seq Scan on inventory (cost=574.86..657.12 rows=2290 width=2)
Filter: (NOT (hashed SubPlan 1))
SubPlan 1
-> Nested Loop (cost=0.28..574.66 rows=80 width=2)
-> Seq Scan on rental (cost=0.00..390.66 rows=80 width=4)
Filter: ((rental_date)::date = '2005-05-31'::date)


In [4]:
%%sql 
SELECT DISTINCT film_id
FROM inventory
WHERE film_id NOT IN (
    SELECT film_id
    FROM inventory JOIN rental USING (inventory_id)
    WHERE rental_date::date = '2005-05-31')
ORDER BY film_id

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
809 rows affected.


film_id
1
2
5
6
7
9
10
11
12
13


[Helpful Hints](https://youtu.be/MWpp2ioeAb8)  
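One thing to watch with the `NOT IN` pattern used above: if the subquery can return a `NULL`, `NOT IN` yields no rows at all, because a comparison against `NULL` is never true. The sketch below demonstrates this on invented tables, using SQLite through Python's `sqlite3` as an assumption for runnability. Whether this can happen in the course schema depends on whether the subquery column is nullable, which you can check in the ERD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE film (film_id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE rented (film_id INTEGER)")  # hypothetical, nullable
conn.executemany("INSERT INTO film VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO rented VALUES (?)", [(1,), (None,)])

# NOT IN against a list containing NULL: films 2 and 3 evaluate to NULL,
# not TRUE, so nothing is returned.
not_in = conn.execute(
    "SELECT film_id FROM film "
    "WHERE film_id NOT IN (SELECT film_id FROM rented) ORDER BY film_id"
).fetchall()

# Filtering NULLs out of the subquery restores the expected answer.
not_in_filtered = conn.execute(
    "SELECT film_id FROM film "
    "WHERE film_id NOT IN (SELECT film_id FROM rented WHERE film_id IS NOT NULL) "
    "ORDER BY film_id"
).fetchall()

# A correlated NOT EXISTS is not affected by the NULL row.
not_exists = conn.execute(
    "SELECT f.film_id FROM film f "
    "WHERE NOT EXISTS (SELECT 1 FROM rented r WHERE r.film_id = f.film_id) "
    "ORDER BY f.film_id"
).fetchall()

print(not_in, not_in_filtered, not_exists)
```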
 

--- 

# 2

### Which customers (name, phone number) have outstanding rentals (film name, rental_date)?

In [5]:
%%sql
EXPLAIN 
SELECT DISTINCT first_name, last_name, phone
FROM address
JOIN customer USING (address_id)
JOIN rental USING (customer_id)
WHERE return_date IS NULL


 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
12 rows affected.


QUERY PLAN
HashAggregate (cost=356.82..358.65 rows=183 width=25)
"Group Key: customer.first_name, customer.last_name, address.phone"
-> Hash Join (cost=44.05..355.45 rows=183 width=25)
Hash Cond: (customer.address_id = address.address_id)
-> Hash Join (cost=22.48..333.40 rows=183 width=15)
Hash Cond: (rental.customer_id = customer.customer_id)
-> Seq Scan on rental (cost=0.00..310.44 rows=183 width=2)
Filter: (return_date IS NULL)
-> Hash (cost=14.99..14.99 rows=599 width=19)
-> Seq Scan on customer (cost=0.00..14.99 rows=599 width=19)


In [6]:
%%sql
SELECT DISTINCT first_name, last_name, phone
FROM address
JOIN customer USING (address_id)
JOIN rental USING (customer_id)
WHERE return_date IS NULL







 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
159 rows affected.


first_name,last_name,phone
Yolanda,Weaver,352469351088
Dwayne,Olvera,62127829280
Christian,Jung,122981120653
Charlie,Bess,962020153680
Freddie,Duggan,644021380889
Felix,Gaffney,107092893983
Tyler,Wren,211256301880
Louise,Jenkins,800716535041
Gilbert,Sledge,959467760895
Lawrence,Lawton,845378657301
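Since each query in this exercise should use at least one nested query, it may help to see how the outstanding-rentals test can be phrased with a subquery. This is a sketch on invented data, run in SQLite through Python's `sqlite3` as an assumption; it is not the course's reference answer. It also shows that `return_date = NULL` is not a substitute for `IS NULL`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT)")
conn.execute(
    "CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER, return_date TEXT)"
)
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cy")])
conn.executemany(
    "INSERT INTO rental VALUES (?, ?, ?)",
    [(1, 1, None), (2, 1, None), (3, 2, "2005-06-01")],
)

# Nested form: the subquery finds customers with an unreturned rental.
# Ann has two outstanding rentals but appears once, with no DISTINCT needed.
outstanding = conn.execute(
    "SELECT first_name FROM customer "
    "WHERE customer_id IN (SELECT customer_id FROM rental WHERE return_date IS NULL) "
    "ORDER BY first_name"
).fetchall()

# '= NULL' never evaluates to TRUE, so this variant returns nothing.
wrong = conn.execute(
    "SELECT first_name FROM customer "
    "WHERE customer_id IN (SELECT customer_id FROM rental WHERE return_date = NULL)"
).fetchall()

print(outstanding, wrong)
```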


# 3

### List the movies that are not categorized as children's movies.

In [7]:
%%sql
EXPLAIN 
SELECT film_id, title
FROM category
JOIN film_category USING (category_id)
JOIN film USING (film_id)
WHERE name NOT IN (
    SELECT name
    FROM category
    WHERE name = 'Children')



 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
13 rows affected.


QUERY PLAN
Hash Join (cost=79.00..99.63 rows=500 width=19)
Hash Cond: (film_category.film_id = film.film_id)
-> Hash Join (cost=2.50..21.82 rows=500 width=2)
Hash Cond: (film_category.category_id = category.category_id)
-> Seq Scan on film_category (cost=0.00..16.00 rows=1000 width=4)
-> Hash (cost=2.40..2.40 rows=8 width=4)
-> Seq Scan on category (cost=1.20..2.40 rows=8 width=4)
Filter: (NOT (hashed SubPlan 1))
SubPlan 1
-> Seq Scan on category category_1 (cost=0.00..1.20 rows=1 width=68)


In [8]:
%%sql
SELECT film_id, title
FROM category
JOIN film_category USING (category_id)
JOIN film USING (film_id)
WHERE name NOT IN (
    SELECT name
    FROM category
    WHERE name = 'Children')





 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
940 rows affected.


film_id,title
1,Academy Dinosaur
2,Ace Goldfinger
3,Adaptation Holes
4,Affair Prejudice
5,African Egg
6,Agent Truman
7,Airplane Sierra
8,Airport Pollock
9,Alabama Devil
10,Aladdin Calendar


[Helpful Hints](https://youtu.be/9WR0ByMn__E)  
 

--- 

# 4

### List the names of the customers who have rented the 5 least popular movies.

**The five least popular movies are the movies with the fewest rentals.**

(Do not include movies that have never been rented. Do not worry about ties: take 5 movies even if other movies have been rented the same number of times as some of those 5.)
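The "least popular N" idea can be sketched as a subquery that groups, orders by the count, and limits. The example below uses invented data and a bottom 2 instead of a bottom 5, and it assumes SQLite through Python's `sqlite3` in place of PostgreSQL. Because the subquery counts rows of `rental`, films that were never rented have no rows to count and drop out on their own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table: one row per rental, with the film and customer.
conn.execute(
    "CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, film_id INTEGER, customer TEXT)"
)
conn.executemany(
    "INSERT INTO rental (film_id, customer) VALUES (?, ?)",
    [
        (10, "A"),                          # film 10: 1 rental
        (20, "B"), (20, "C"),               # film 20: 2 rentals
        (30, "A"), (30, "B"), (30, "D"),    # film 30: 3 rentals
    ],
)

# Inner query: the 2 films with the fewest rentals (10 and 20).
# Outer query: everyone who rented one of them.
least_popular_renters = conn.execute(
    "SELECT DISTINCT customer FROM rental "
    "WHERE film_id IN ("
    "    SELECT film_id FROM rental "
    "    GROUP BY film_id "
    "    ORDER BY COUNT(*) ASC "
    "    LIMIT 2) "
    "ORDER BY customer"
).fetchall()

print(least_popular_renters)
```

Customer D rented only film 30, so D does not appear. The rental counts here are deliberately distinct; with ties, which films fall inside the limit is not determined by the count alone.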

In [9]:
%%sql
EXPLAIN 
SELECT first_name, last_name
FROM customer
JOIN rental USING (customer_id)
JOIN inventory USING (inventory_id)
WHERE film_id IN (
    SELECT film_id
    FROM rental JOIN inventory USING (inventory_id)
    GROUP BY film_id
    ORDER BY COUNT(rental_id)
    LIMIT 5)



 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
21 rows affected.


QUERY PLAN
Nested Loop (cost=587.07..707.83 rows=84 width=13)
-> Nested Loop (cost=586.79..682.99 rows=84 width=2)
-> Hash Join (cost=586.51..669.40 rows=24 width=4)
"Hash Cond: (inventory.film_id = ""ANY_subquery"".film_id)"
-> Seq Scan on inventory (cost=0.00..70.81 rows=4581 width=6)
-> Hash (cost=586.45..586.45 rows=5 width=2)
"-> Subquery Scan on ""ANY_subquery"" (cost=586.38..586.45 rows=5 width=2)"
-> Limit (cost=586.38..586.40 rows=5 width=10)
-> Sort (cost=586.38..588.78 rows=958 width=10)
Sort Key: (count(rental_1.rental_id))


In [10]:
%%sql
SELECT first_name, last_name
FROM customer
JOIN rental USING (customer_id)
JOIN inventory USING (inventory_id)
WHERE film_id IN (
    SELECT film_id
    FROM rental JOIN inventory USING (inventory_id)
    GROUP BY film_id
    ORDER BY COUNT(rental_id)
    LIMIT 5)


 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
22 rows affected.


first_name,last_name
Ramon,Choate
William,Satterfield
Raul,Fortier
Hector,Poindexter
Craig,Morrell
Marsha,Douglas
April,Burns
Bob,Pfeiffer
Julia,Flores
Pauline,Henry


# 5

### List the movies that have been rented by the top ten renters.

In [11]:
%%sql
EXPLAIN 
SELECT title
FROM film
JOIN inventory USING (film_id)
JOIN rental USING (inventory_id)
WHERE customer_id IN (
    SELECT customer_id
    FROM rental
    GROUP BY customer_id
    ORDER BY COUNT(rental_id) DESC
    LIMIT 10)

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
18 rows affected.


QUERY PLAN
Hash Join (cost=486.63..922.98 rows=268 width=15)
Hash Cond: (inventory.film_id = film.film_id)
-> Nested Loop (cost=410.13..845.77 rows=268 width=2)
-> Hash Join (cost=409.84..762.70 rows=268 width=4)
"Hash Cond: (rental.customer_id = ""ANY_subquery"".customer_id)"
-> Seq Scan on rental (cost=0.00..310.44 rows=16044 width=6)
-> Hash (cost=409.72..409.72 rows=10 width=2)
"-> Subquery Scan on ""ANY_subquery"" (cost=409.59..409.72 rows=10 width=2)"
-> Limit (cost=409.59..409.62 rows=10 width=10)
-> Sort (cost=409.59..411.09 rows=599 width=10)


In [12]:
%%sql
SELECT title
FROM film
JOIN inventory USING (film_id)
JOIN rental USING (inventory_id)
WHERE customer_id IN (
    SELECT customer_id
    FROM rental
    GROUP BY customer_id
    ORDER BY COUNT(rental_id) DESC
    LIMIT 10)

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
413 rows affected.


title
Goldfinger Sensibility
Champion Flatliners
Moon Bunch
Movie Shakespeare
Half Outfield
Christmas Moonshine
Uprising Uptown
Graffiti Love
Stone Fire
Lebowski Soldiers


# 6

### Consider the previous question and the answer SQL.  Now add two columns to the result: a) number of rentals per film and b) number of rentals by the _top-ten renters_ per film.
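One way to picture this task is as two derived tables, one counting all rentals per film and one counting only rentals by the top renters, joined on the film. The sketch below shows that shape on invented data with a "top 1 renter" instead of the top ten. It assumes SQLite through Python's `sqlite3`, and the flattened `rental` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, title TEXT, customer TEXT)"
)
conn.executemany(
    "INSERT INTO rental (title, customer) VALUES (?, ?)",
    [("X", "A"), ("Y", "A"), ("X", "A"), ("X", "B"), ("Z", "B"), ("Y", "C")],
)

rows = conn.execute(
    "SELECT a.title, b.cnt AS total_count, a.tt_cnt AS top_count "
    "FROM ("
    # Derived table a: rentals per title made by the top renter only.
    "    SELECT title, COUNT(*) AS tt_cnt FROM rental "
    "    WHERE customer IN ("
    "        SELECT customer FROM rental "
    "        GROUP BY customer ORDER BY COUNT(*) DESC LIMIT 1) "
    "    GROUP BY title) AS a "
    "JOIN ("
    # Derived table b: all rentals per title.
    "    SELECT title, COUNT(*) AS cnt FROM rental GROUP BY title) AS b "
    "ON a.title = b.title "
    "ORDER BY a.title"
).fetchall()

print(rows)
```

Customer A is the top renter with 3 rentals. Title Z was never rented by A, so the inner join leaves it out; that is why a result built this way can have fewer rows than there are films.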

In [13]:
%%sql
EXPLAIN 
SELECT a.title, b.cnt AS total_count, a.tt_cnt AS top_ten_count    
    
FROM (    
    SELECT title, COUNT(rental_id) AS tt_cnt
    FROM film
    JOIN inventory USING (film_id)
    JOIN rental USING (inventory_id)
    WHERE customer_id IN (
        SELECT customer_id
        FROM rental
        GROUP BY customer_id
        ORDER BY COUNT(rental_id) DESC
        LIMIT 10)
    GROUP BY title
) AS a

JOIN (
    SELECT title, COUNT(rental_id) AS cnt
    FROM film
    JOIN inventory USING (film_id)
    JOIN rental USING (inventory_id)
    GROUP BY title
) AS b
USING (title)


 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
34 rows affected.


QUERY PLAN
Hash Join (cost=1612.71..1635.39 rows=1340 width=31)
Hash Cond: ((film.title)::text = (film_1.title)::text)
-> HashAggregate (cost=679.69..689.69 rows=1000 width=23)
Group Key: film.title
-> Hash Join (cost=204.57..599.47 rows=16044 width=19)
Hash Cond: (inventory.film_id = film.film_id)
-> Hash Join (cost=128.07..480.67 rows=16044 width=6)
Hash Cond: (rental.inventory_id = inventory.inventory_id)
-> Seq Scan on rental (cost=0.00..310.44 rows=16044 width=8)
-> Hash (cost=70.81..70.81 rows=4581 width=6)


In [14]:
%%sql
SELECT a.title, b.cnt AS total_count, a.tt_cnt AS top_ten_count    
    
FROM (    
    SELECT title, COUNT(rental_id) AS tt_cnt
    FROM film
    JOIN inventory USING (film_id)
    JOIN rental USING (inventory_id)
    WHERE customer_id IN (
        SELECT customer_id
        FROM rental
        GROUP BY customer_id
        ORDER BY COUNT(rental_id) DESC
        LIMIT 10)
    GROUP BY title
) AS a

JOIN (
    SELECT title, COUNT(rental_id) AS cnt
    FROM film
    JOIN inventory USING (film_id)
    JOIN rental USING (inventory_id)
    GROUP BY title
) AS b
USING (title)

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
324 rows affected.


title,total_count,top_ten_count
Graceland Dynamite,6,1
Wonderful Drop,9,1
Purple Movie,13,2
Fantasy Troopers,26,1
Sunset Racer,8,1
Carol Texas,18,1
Stepmom Dream,20,1
Arabia Dogma,13,1
Minds Truman,20,1
Baked Cleopatra,16,1


# 7

### List the city of rental stores, `store_id` and the movies that have not been rented from that store.

**Note:** A video walkthrough for this challenging SQL is provided below.

In [15]:
%%sql
EXPLAIN 
SELECT c.city, s.store_id, f.title, f.film_id
FROM store s
JOIN address a USING (address_id)
JOIN city c USING (city_id)
-- Cross Product against film
, film f

-- Remove rows that match store and film

WHERE NOT EXISTS (
    SELECT 'Z'
    FROM film f2 JOIN inventory i USING (film_id)
    JOIN rental r USING (inventory_id)
    JOIN payment p USING (rental_id, staff_id)
    JOIN staff ss USING (staff_id)
    WHERE f2.film_id = f.film_id
    AND s.store_id = ss.store_id
)

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
31 rows affected.


QUERY PLAN
Hash Anti Join (cost=1296.26..1427.01 rows=1 width=32)
Hash Cond: ((s.store_id = ss.store_id) AND (f.film_id = f2.film_id))
-> Nested Loop (cost=1.32..107.07 rows=2000 width=32)
-> Seq Scan on film f (cost=0.00..64.00 rows=1000 width=19)
-> Materialize (cost=1.32..18.07 rows=2 width=13)
-> Nested Loop (cost=1.32..18.06 rows=2 width=13)
-> Hash Join (cost=1.04..17.36 rows=2 width=6)
Hash Cond: (a.address_id = s.address_id)
-> Seq Scan on address a (cost=0.00..14.03 rows=603 width=6)
-> Hash (cost=1.02..1.02 rows=2 width=6)


In [17]:
%%sql
 
SELECT c.city, s.store_id, f.title, f.film_id
FROM store s
JOIN address a USING (address_id)
JOIN city c USING (city_id)
-- Cross Product against film
, film f

-- Remove rows that match store and film

WHERE NOT EXISTS (
    SELECT 'Z'
    FROM film f2 JOIN inventory i USING (film_id)
    JOIN rental r USING (inventory_id)
    JOIN payment p USING (rental_id, staff_id)
    JOIN staff ss USING (staff_id)
    WHERE f2.film_id = f.film_id
    AND s.store_id = ss.store_id
)

 * postgres://dsa_ro_user:***@pgsql.dsa.lan/dvdrental
145 rows affected.


city,store_id,title,film_id
Lethbridge,1,Grosse Wonderful,384
Lethbridge,1,Alice Fantasia,14
Woodridge,2,Alice Fantasia,14
Lethbridge,1,Apollo Teen,33
Woodridge,2,Apollo Teen,33
Lethbridge,1,Argonauts Town,36
Woodridge,2,Argonauts Town,36
Lethbridge,1,Ark Ridgemont,38
Woodridge,2,Ark Ridgemont,38
Lethbridge,1,Arsenic Independence,41


#### Helpful Hints
  1. For the first hint, watch only the first 5:57 of the video, where the conceptual aspects of the task are discussed.
  1. Then attempt to construct the SQL based on the video's explanation of the concept.
  1. If you get stuck again, the remainder of the video looks directly at the SQL construction.
  
[Helpful Hints](https://youtu.be/GyMODTEDfu4)  
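The concept behind this question, pairing every store with every film and then removing the pairs that have a rental, can be sketched on a toy schema. The example assumes SQLite through Python's `sqlite3`, and the `rented` table is a hypothetical shortcut for the chain of joins needed in the real database. The film titles are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE store (store_id INTEGER PRIMARY KEY, city TEXT)")
conn.execute("CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT)")
# Hypothetical: one row per (store, film) pair that has at least one rental.
conn.execute("CREATE TABLE rented (store_id INTEGER, film_id INTEGER)")

conn.executemany("INSERT INTO store VALUES (?, ?)", [(1, "Lethbridge"), (2, "Woodridge")])
conn.executemany("INSERT INTO film VALUES (?, ?)", [(1, "Film A"), (2, "Film B")])
conn.executemany("INSERT INTO rented VALUES (?, ?)", [(1, 1), (2, 1), (2, 2)])

# CROSS JOIN builds all 4 store/film pairs; the correlated NOT EXISTS
# keeps only the pairs with no matching rental.
never_rented = conn.execute(
    "SELECT s.city, s.store_id, f.title "
    "FROM store s CROSS JOIN film f "
    "WHERE NOT EXISTS ("
    "    SELECT 1 FROM rented r "
    "    WHERE r.store_id = s.store_id AND r.film_id = f.film_id) "
    "ORDER BY s.store_id, f.film_id"
).fetchall()

print(never_rented)
```

The subquery refers to both `s` and `f` from the outer query, which is what makes it correlated: it is evaluated per store/film pair rather than once for the whole statement.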



# SAVE YOUR NOTEBOOK