# Nested Queries, Type I Subquery

Nested queries are **subqueries** that exist within a larger (aka _outer_) query.

**Conceptual Type I / II Subquery**
![Subquery](../images/subquery-syntax.gif)



# Use-Case

Imagine you are asked to report the city and country of the rows in the `cities` table with the lowest and highest populations.

How would you do this? We could first find the `MIN()` and `MAX()` of the populations, then construct a second query that uses those values to select the cities.

In [None]:
%load_ext sql
%sql postgres://dsa_ro_user:readonly@pgsql.dsa.lan/dsa_ro

In [None]:
%sql SELECT * FROM cities LIMIT 5;

In [None]:
%sql SELECT MIN(population) FROM cities;

In [None]:
%sql SELECT MAX(population) FROM cities;

We should find the following values:
 * Minimum is 1001600
 * Maximum is 22315500


In [None]:
%%sql 
SELECT city, country, population 
FROM cities
WHERE population in (1001600,22315500)
ORDER BY population

Notice that to get our answer, we constructed a set of values, `(1001600, 22315500)`, and tested whether each row's population is one of those two values.

This query could also have been written as:

```SQL
SELECT city, country, population 
FROM cities
WHERE population = 1001600
  OR  population = 22315500
ORDER BY population
```

A nested query lets us place a query inside the parentheses to generate those values, rather than looking them up first and typing them in by hand.

In [None]:
%%sql 
SELECT city, country, population 
FROM cities
WHERE population = (SELECT MIN(population) FROM cities)
  OR  population = (SELECT MAX(population) FROM cities)
ORDER BY population

 --   Alternatively  --

In [None]:
%%sql 
SELECT city, country, population 
FROM cities
WHERE population IN ( 
    (SELECT MIN(population) FROM cities), (SELECT MAX(population) FROM cities) 
    )
ORDER BY population
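
To try the comparison without the course database, here is a minimal sketch using Python's built-in `sqlite3` module. The table contents are hypothetical sample rows, not the real `cities` data, and SQLite stands in for PostgreSQL as an assumption. It checks that the two-step approach and the nested query select the same rows.

```python
import sqlite3

# Hypothetical sample rows: (city, country, population). Not the course data.
rows = [
    ("Alpha", "Aland", 1_200_000),
    ("Beta", "Bland", 5_400_000),
    ("Gamma", "Cland", 9_800_000),
    ("Delta", "Dland", 3_100_000),
]

# In-memory SQLite database standing in for the PostgreSQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (city TEXT, country TEXT, population INTEGER)")
conn.executemany("INSERT INTO cities VALUES (?, ?, ?)", rows)

# Two-step approach: look up the extremes first, then pass them in as parameters.
lo, hi = conn.execute(
    "SELECT MIN(population), MAX(population) FROM cities"
).fetchone()
two_step = conn.execute(
    "SELECT city, country, population FROM cities "
    "WHERE population IN (?, ?) ORDER BY population",
    (lo, hi),
).fetchall()

# Nested approach: the subqueries compute the extremes inside one statement.
nested = conn.execute(
    """
    SELECT city, country, population
    FROM cities
    WHERE population IN (
        (SELECT MIN(population) FROM cities),
        (SELECT MAX(population) FROM cities)
    )
    ORDER BY population
    """
).fetchall()

print(two_step)
print(nested)
conn.close()
```

The nested form stays correct if rows are added or removed later, whereas hard-coded values have to be looked up again.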

<img src="../images/subquery-type1.png" width="500" >

## Type I Subqueries

When a subquery can be computed **one time** and its result reused for every row of the _outer_ query, we have a Type I (one) subquery. In contrast, a Type II subquery must be run for each row of the outer query.
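
The distinction can be sketched side by side. The example below is an illustration only: it uses Python's built-in `sqlite3` module with hypothetical sample rows, and the aliases `outer_c` and `inner_c` are names chosen here for readability. The first query has an inner query that never mentions the outer row (Type I); the second has an inner query that refers to the outer row's `country`, so its result conceptually depends on which row is being tested (Type II).

```python
import sqlite3

# Hypothetical sample rows: (city, country, population). Not the course data.
rows = [
    ("Alpha", "Aland", 1_200_000),
    ("Beta", "Aland", 5_400_000),
    ("Gamma", "Bland", 9_800_000),
    ("Delta", "Bland", 3_100_000),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (city TEXT, country TEXT, population INTEGER)")
conn.executemany("INSERT INTO cities VALUES (?, ?, ?)", rows)

# Type I (uncorrelated): the inner query mentions no column of the outer row,
# so a single evaluation of it can serve every row the outer query tests.
type_1 = conn.execute(
    """
    SELECT city FROM cities
    WHERE population = (SELECT MAX(population) FROM cities)
    """
).fetchall()

# Type II (correlated): the inner query refers to outer_c.country, so its
# result depends on which outer row is currently being tested.
type_2 = conn.execute(
    """
    SELECT city FROM cities AS outer_c
    WHERE population = (
        SELECT MAX(population) FROM cities AS inner_c
        WHERE inner_c.country = outer_c.country
    )
    ORDER BY city
    """
).fetchall()

print(type_1)  # the single most populous city overall
print(type_2)  # the most populous city within each country
conn.close()
```

The min/max query from the use-case is of the first kind: neither of its subqueries refers to the row being tested.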

Looking at the plan the database develops for the min/max query from the use-case, we see two `InitPlan` nodes, one per subquery.

The subqueries are _uncorrelated_ with the rows of the outer query.

In [None]:
%%sql 
EXPLAIN
SELECT city, country, population 
FROM cities
WHERE population in ( 
    (SELECT MIN(population) FROM cities), (SELECT MAX(population) FROM cities) 
    )
ORDER BY population

You can see that each `InitPlan` stores its result in a parameter, `$0` and `$1`, respectively.

These values are then used in the sequential table scan to test `population IN ($0,$1)`, written in the plan as
```
Filter: (population = ANY (ARRAY[$0, $1]))
```


**Now run the SQL command!**

In [None]:
%%sql 
SELECT city, country, population 
FROM cities
WHERE population in ( 
    (SELECT MIN(population) FROM cities), (SELECT MAX(population) FROM cities) 
    )
ORDER BY population

# Save your Notebook, then `File > Close and Halt`