<h1 align = center> Subqueries </h1>
<h4 align = center> Facilitator: Kelvin Oyanna</h4>
<h4 align = center> Email: dotkelplus@gmail.com</h4>

Subqueries (also known as inner queries or nested queries) are a tool for performing operations in multiple steps. For example, if you wanted to take the sums of several columns, then average all of those values, you'd need to do each aggregation in a distinct step.

In this lesson, you will learn how to write common subqueries in SQL.
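
The multi-step aggregation described above can be sketched in plain Python with the standard-library `sqlite3` module. This is only an illustration under stated assumptions: the `sales` table, its columns `q1` and `q2`, and its rows are hypothetical and not part of the dvdrental database, and it runs on an in-memory SQLite database rather than PostgreSQL.

```python
import sqlite3

# Hypothetical toy table (an assumption for illustration, not from dvdrental).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (q1 REAL, q2 REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(10, 20), (30, 40), (5, 15)],
)

# Step 1 (inner query): sum each column.
# Step 2 (outer query): average the two sums.
query = """
SELECT (sum_q1 + sum_q2) / 2.0 AS avg_of_sums
FROM (
    SELECT SUM(q1) AS sum_q1, SUM(q2) AS sum_q2
    FROM sales
) AS totals
"""
avg_of_sums = conn.execute(query).fetchone()[0]
print(avg_of_sums)
```

The inner query runs first and its one-row result acts as the table that the outer query reads from.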

### Load IPython SQL

In [2]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


### Connect to the dvdrental database from the last lesson

In [3]:
# We're connecting to the PostgreSQL database using the PostgreSQL connection string
# The PostgreSQL string is in this format: %sql dialect+driver://username:password@host:port/databaseName

%sql postgresql://chris:admin1234@localhost/dvdrental
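
To see how the parts of the connection string map onto the `dialect+driver://username:password@host:port/databaseName` format, the string can be split with Python's standard `urllib.parse`. This is only a sketch for reading the format; it is not how the `%sql` magic itself processes the string.

```python
from urllib.parse import urlsplit

# Same connection string as in the cell above; the optional "+driver" and
# ":port" parts are omitted there, so they are omitted here too.
parts = urlsplit("postgresql://chris:admin1234@localhost/dvdrental")

print(parts.scheme)            # dialect (optionally "dialect+driver")
print(parts.username)          # username
print(parts.hostname)          # host
print(parts.port)              # None, because no port was given
print(parts.path.lstrip("/"))  # database name
```

When the port is left out, PostgreSQL clients typically fall back to the default port 5432.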

## Basic Subquery
Suppose we want to find the films whose rental rate is higher than the average rental rate. We can do it in two steps:

- Find the average rental rate by using the SELECT statement and the average function (AVG).
- Use the result of the first query in a second SELECT statement to find the films that we want.

Let's first preview a few rows of the film table, then compute the average rental rate:

In [4]:
%sql SELECT * FROM film LIMIT 5

 * postgresql://chris:***@localhost/dvdrental
5 rows affected.


film_id,title,description,release_year,language_id,rental_duration,rental_rate,length,replacement_cost,rating,last_update,special_features,fulltext
133,Chamber Italian,A Fateful Reflection of a Moose And a Husband who must Overcome a Monkey in Nigeria,2006,1,7,4.99,117,14.99,NC-17,2013-05-26 14:50:58.951000,['Trailers'],'chamber':1 'fate':4 'husband':11 'italian':2 'monkey':16 'moos':8 'must':13 'nigeria':18 'overcom':14 'reflect':5
384,Grosse Wonderful,A Epic Drama of a Cat And a Explorer who must Redeem a Moose in Australia,2006,1,5,4.99,49,19.99,R,2013-05-26 14:50:58.951000,['Behind the Scenes'],'australia':18 'cat':8 'drama':5 'epic':4 'explor':11 'gross':1 'moos':16 'must':13 'redeem':14 'wonder':2
8,Airport Pollock,A Epic Tale of a Moose And a Girl who must Confront a Monkey in Ancient India,2006,1,6,4.99,54,15.99,R,2013-05-26 14:50:58.951000,['Trailers'],'airport':1 'ancient':18 'confront':14 'epic':4 'girl':11 'india':19 'monkey':16 'moos':8 'must':13 'pollock':2 'tale':5
98,Bright Encounters,A Fateful Yarn of a Lumberjack And a Feminist who must Conquer a Student in A Jet Boat,2006,1,4,4.99,73,12.99,PG-13,2013-05-26 14:50:58.951000,['Trailers'],'boat':20 'bright':1 'conquer':14 'encount':2 'fate':4 'feminist':11 'jet':19 'lumberjack':8 'must':13 'student':16 'yarn':5
1,Academy Dinosaur,A Epic Drama of a Feminist And a Mad Scientist who must Battle a Teacher in The Canadian Rockies,2006,1,6,0.99,86,20.99,PG,2013-05-26 14:50:58.951000,"['Deleted Scenes', 'Behind the Scenes']",'academi':1 'battl':15 'canadian':20 'dinosaur':2 'drama':5 'epic':4 'feminist':8 'mad':11 'must':14 'rocki':21 'scientist':12 'teacher':17


In [5]:
%%sql
SELECT AVG(rental_rate)
FROM film

 * postgresql://chris:***@localhost/dvdrental
1 rows affected.


avg
2.98


Now, we can get films whose rental rate is higher than the average rental rate:

In [6]:
%%sql
SELECT film_id, title, rental_rate
FROM film
WHERE rental_rate > 2.98
LIMIT 10

 * postgresql://chris:***@localhost/dvdrental
10 rows affected.


film_id,title,rental_rate
133,Chamber Italian,4.99
384,Grosse Wonderful,4.99
8,Airport Pollock,4.99
98,Bright Encounters,4.99
2,Ace Goldfinger,4.99
3,Adaptation Holes,2.99
4,Affair Prejudice,2.99
5,African Egg,2.99
6,Agent Truman,2.99
7,Airplane Sierra,4.99


The approach above is not very elegant: it takes two steps, and the average has to be copied by hand into the second query. We want to pass the result of the first query to the second within a single query. The solution is to use a subquery.

Let's get this done using a subquery:

In [7]:
%%sql
SELECT film_id, title, rental_rate
FROM film
WHERE rental_rate >(
                    SELECT AVG(rental_rate)
                    FROM film )
LIMIT 10

 * postgresql://chris:***@localhost/dvdrental
10 rows affected.


film_id,title,rental_rate
133,Chamber Italian,4.99
384,Grosse Wonderful,4.99
8,Airport Pollock,4.99
98,Bright Encounters,4.99
2,Ace Goldfinger,4.99
3,Adaptation Holes,2.99
4,Affair Prejudice,2.99
5,African Egg,2.99
6,Agent Truman,2.99
7,Airplane Sierra,4.99
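
As a cross-check, the two-step approach and the subquery version can be compared side by side. The sketch below uses an in-memory SQLite database with a four-row toy `film` table (an assumption for illustration; the real table lives in PostgreSQL and has many more rows and columns).

```python
import sqlite3

# Toy stand-in for the film table (assumption: only three of its columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT, rental_rate REAL)"
)
conn.executemany(
    "INSERT INTO film VALUES (?, ?, ?)",
    [
        (1, "Academy Dinosaur", 0.99),
        (2, "Ace Goldfinger", 4.99),
        (3, "Adaptation Holes", 2.99),
        (8, "Airport Pollock", 4.99),
    ],
)

# Two steps: compute the average, then pass it to a second query.
avg_rate = conn.execute("SELECT AVG(rental_rate) FROM film").fetchone()[0]
two_step = conn.execute(
    "SELECT film_id FROM film WHERE rental_rate > ? ORDER BY film_id",
    (avg_rate,),
).fetchall()

# One step: the scalar subquery supplies the average inline.
one_step = conn.execute(
    """
    SELECT film_id
    FROM film
    WHERE rental_rate > (SELECT AVG(rental_rate) FROM film)
    ORDER BY film_id
    """
).fetchall()

print(two_step == one_step)
```

Because the subquery is evaluated each time the statement runs, the one-step version keeps working when rental rates change, whereas a hard-coded average would go stale.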


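## Using the IN operator with a subquery
Let's get the films that were returned between 2005-05-29 and 2005-05-30. Because return_date is a timestamp, the upper bound '2005-05-30' is read as midnight at the start of that day, so returns later on May 30 are not included.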

In [8]:
%%sql
SELECT film_id, title
FROM film
WHERE film_id IN (
                SELECT film_id
                FROM inventory inv
                JOIN rental ren
                USING(inventory_id)
                WHERE return_date BETWEEN '2005-05-29' AND '2005-05-30'
                )
LIMIT 10

 * postgresql://chris:***@localhost/dvdrental
10 rows affected.


film_id,title
307,Fellowship Autumn
255,Driving Polish
388,Gunfight Moon
130,Celebrity Horn
563,Massacre Usual
397,Hanky October
898,Tourist Pelican
228,Detective Vision
347,Games Bowfinger
1000,Zorro Ark
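
The behaviour of `IN` with a subquery can be tried on a small scale. The sketch below rebuilds three tiny tables in an in-memory SQLite database; the rows are made up for illustration, and the dates are stored as text (an assumption of this sketch, since SQLite has no native timestamp type), so the comparison is lexicographic rather than a true timestamp comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, film_id INTEGER);
    CREATE TABLE rental (
        rental_id INTEGER PRIMARY KEY,
        inventory_id INTEGER,
        return_date TEXT
    );
    -- Made-up rows: Film A has two copies, both returned on May 29.
    INSERT INTO film VALUES (1, 'Film A'), (2, 'Film B'), (3, 'Film C');
    INSERT INTO inventory VALUES (10, 1), (11, 1), (20, 2), (30, 3);
    INSERT INTO rental VALUES
        (100, 10, '2005-05-29 09:00:00'),
        (101, 11, '2005-05-29 18:30:00'),
        (102, 20, '2005-06-02 12:00:00'),
        (103, 30, '2005-05-30 10:00:00');
    """
)

rows = conn.execute(
    """
    SELECT film_id, title
    FROM film
    WHERE film_id IN (
        SELECT film_id
        FROM inventory inv
        JOIN rental ren USING (inventory_id)
        WHERE return_date BETWEEN '2005-05-29' AND '2005-05-30'
    )
    ORDER BY film_id
    """
).fetchall()
print(rows)
```

Two things are worth noticing in this toy data: Film A matches twice in the subquery but is listed once, because `IN` only tests membership, and Film C, returned at 10:00 on May 30, falls outside the range because the upper bound carries no time of day.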


## Referencing a subquery with an Alias
A subquery in the FROM clause can be given an alias, which lets the rest of the query refer to it and its columns by name.

Let's get the names and districts of customers living in California or Texas using a subquery:

In [9]:
%%sql
SELECT first_name, last_name, district
FROM (
    SELECT address_id, address, district
    FROM address
    WHERE district IN ('California', 'Texas')
) subq

JOIN customer c
USING(address_id)

LIMIT 10

 * postgresql://chris:***@localhost/dvdrental
10 rows affected.


first_name,last_name,district
Patricia,Johnson,California
Jennifer,Davis,Texas
Betty,White,California
Alice,Stewart,California
Rosa,Reynolds,California
Kim,Cruz,Texas
Renee,Lane,California
Kristin,Johnston,California
Cassandra,Walters,California
Richard,Mccrary,Texas
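
The alias becomes more visible when the outer query qualifies its columns with it. Here is a minimal sketch on an in-memory SQLite database; the tables are cut down to a few columns and three rows, which is an assumption for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE address (address_id INTEGER PRIMARY KEY, district TEXT);
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name TEXT,
        address_id INTEGER
    );
    INSERT INTO address VALUES (1, 'California'), (2, 'Texas'), (3, 'Nagasaki');
    INSERT INTO customer VALUES
        (1, 'Patricia', 'Johnson', 1),
        (2, 'Jennifer', 'Davis', 2),
        (3, 'Mary', 'Smith', 3);
    """
)

rows = conn.execute(
    """
    SELECT c.first_name, c.last_name, subq.district
    FROM (
        SELECT address_id, district
        FROM address
        WHERE district IN ('California', 'Texas')
    ) AS subq
    JOIN customer AS c ON c.address_id = subq.address_id
    ORDER BY c.customer_id
    """
).fetchall()
print(rows)
```

Writing `subq.district` and `c.first_name` makes it explicit which side of the join each column comes from.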


## Creating subqueries with the WITH Keyword
The SQL WITH clause lets you give a subquery block a name, which can then be referenced in several places within the main SQL query. A subquery named this way is called a common table expression (CTE); the technique is also known as subquery factoring.

The clause defines a temporary relation whose output is available only to the query that the WITH clause is attached to.
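
To illustrate referencing one named subquery in more than one place, the sketch below defines `q1` once and uses it twice: as the main table and inside a scalar subquery that counts its rows. It runs on an in-memory SQLite database with made-up customers, so the names and the `active` values are assumptions, not dvdrental data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name TEXT,
        active INTEGER
    );
    -- Made-up rows: two active customers and one inactive.
    INSERT INTO customer VALUES
        (1, 'Ann', 'Lee', 1),
        (2, 'Bob', 'Ray', 0),
        (3, 'Cy', 'Fox', 1);
    """
)

rows = conn.execute(
    """
    WITH q1 AS (
        SELECT customer_id, first_name || ' ' || last_name AS full_name
        FROM customer
        WHERE active = 1
    )
    SELECT full_name,
           (SELECT COUNT(*) FROM q1) AS active_total  -- second reference to q1
    FROM q1                                           -- first reference to q1
    ORDER BY customer_id
    """
).fetchall()
print(rows)
```

Without the WITH clause, the filter on `active` would have to be written out twice, once for each place the subquery is used.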

<b>Let's get the full name, district and phone numbers of all active customers</b>

In [10]:
%%sql
WITH q1 AS (
    SELECT address_id, first_name || ' ' || last_name AS full_name, active
    FROM customer
    WHERE active = 1
)

SELECT full_name, district, phone, active
FROM q1
JOIN address
USING(address_id)
LIMIT 10

 * postgresql://chris:***@localhost/dvdrental
10 rows affected.


full_name,district,phone,active
Mary Smith,Nagasaki,28303384290,1
Patricia Johnson,California,838635286649,1
Linda Williams,Attika,448477190408,1
Barbara Jones,Mandalay,705814003527,1
Elizabeth Brown,Nantou,10655648674,1
Jennifer Davis,Texas,860452626434,1
Maria Miller,Central Serbia,716571220373,1
Susan Wilson,Hamilton,657282285970,1
Margaret Moore,Masqat,380657522649,1
Dorothy Taylor,Esfahan,648856936185,1
