# SQL SubQueries in Where

### Introduction

One topic that we should become familiar with is the use of subqueries.  As we may know, subqueries allow us to break our SQL queries into steps.  

Now there are multiple use cases for subqueries, but let's start with a single one: using a subquery in a WHERE clause.

### Loading Our Data

Once you have copied the SQL statements above, then run the following to create the database, and execute the code. 

Then we can connect to our database with the following.

In [1]:
import sqlite3
conn = sqlite3.connect('./moes_bar.db')
cursor = conn.cursor()

In [2]:
cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")
cursor.fetchall()

[('bartenders',),
 ('customers',),
 ('drinks',),
 ('orders',),
 ('ingredients',),
 ('ingredients_drinks',)]

Next we load our data.

In [5]:
import pandas as pd
root_url = "https://raw.githubusercontent.com/data-eng-10-21/mod-1-sql-curriculum/master/2-sql-relations/5-has-many-through-bar/"
names = ['bartenders', 'customers', 'drinks', 'orders', 'ingredients', 'ingredients_drinks']
loaded_dfs = [pd.read_csv(f'{root_url}{name}.csv') for name in names]

In [26]:
for name, df in zip(names, loaded_dfs):
    df.to_sql(name, conn, index=False)

### Exploring Our Data

Now we can list all of the tables with the following.

In [3]:
cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")
cursor.fetchall()

[('bartenders',),
 ('customers',),
 ('drinks',),
 ('orders',),
 ('ingredients',),
 ('ingredients_drinks',)]

And then we can see the details of a particular table with the following.

In [4]:
cursor.execute("pragma table_info(drinks)")
cursor.fetchall()

[(0, 'id', 'INTEGER', 0, None, 1),
 (1, 'name', 'TEXT', 0, None, 0),
 (2, 'calories', 'INTEGER', 0, None, 0),
 (3, 'price', 'INTEGER', 0, None, 0),
 (4, 'alcoholic', 'INTEGER', 0, None, 0)]
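Each tuple returned by `pragma table_info` has six fields: the column id, name, declared type, a not-null flag, the default value, and a primary-key flag. As a minimal sketch, the snippet below labels those fields on a throwaway in-memory table. The `demo_conn` connection and the `describe_table` helper are hypothetical names used for illustration, not part of the lesson's database.

```python
import sqlite3

# Hypothetical in-memory database so the sketch runs on its own.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE drinks (id INTEGER PRIMARY KEY, name TEXT, calories INTEGER, "
    "price INTEGER, alcoholic INTEGER)"
)

# Field order of each row returned by PRAGMA table_info.
FIELDS = ("cid", "name", "type", "notnull", "dflt_value", "pk")

def describe_table(conn, table):
    """Return one dict per column of `table` (hypothetical helper)."""
    # The table name is interpolated into the statement, so only pass trusted names.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [dict(zip(FIELDS, row)) for row in rows]

columns = describe_table(demo_conn, "drinks")
for column in columns:
    print(column["name"], column["type"], "primary key" if column["pk"] else "")
```

Reading the fields by name makes it easier to spot, for instance, that `id` is the primary key and `price` is stored as an integer.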

### Subqueries as an entry

Let's begin by using a subquery to find the most expensive drink.

In [7]:
cursor.execute("select * from drinks where price = (select max(price) from drinks)")
cursor.fetchall()

[(4, 'ice cream float', 250, 8, 0)]

So breaking down the above, we can see that we first find the maximum price, which returns 8, and then from there we find all of the drinks that have that price.
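To make the two steps explicit, here is a minimal sketch that runs the inner query on its own and then feeds its result to the outer query. It uses a hypothetical in-memory copy of the three `drinks` rows shown in this lesson plus one invented row (`water`), so `demo_conn` and the extra row are assumptions rather than the real `moes_bar.db`.

```python
import sqlite3

# Hypothetical in-memory stand-in for the drinks table.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE drinks (id INTEGER PRIMARY KEY, name TEXT, calories INTEGER, "
    "price INTEGER, alcoholic INTEGER)"
)
demo_conn.executemany(
    "INSERT INTO drinks VALUES (?, ?, ?, ?, ?)",
    [
        (1, "water", 0, 1, 0),  # invented row for illustration
        (4, "ice cream float", 250, 8, 0),
        (5, "duff beer", 200, 7, 1),
        (6, "gin and tonic", 200, 7, 1),
    ],
)

# Step 1: the inner query on its own returns a single value.
max_price = demo_conn.execute("SELECT max(price) FROM drinks").fetchone()[0]

# Step 2: the outer query filters on that value.
two_step = demo_conn.execute(
    "SELECT * FROM drinks WHERE price = ?", (max_price,)
).fetchall()

# The subquery version does both steps in one statement.
one_step = demo_conn.execute(
    "SELECT * FROM drinks WHERE price = (SELECT max(price) FROM drinks)"
).fetchall()

print(max_price, two_step == one_step)
```

The subquery saves a round trip to Python and keeps the whole question in a single SQL statement.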

> Yes, in SQLite another way to find this is with something like the following:

In [10]:
cursor.execute("select *, max(price) from drinks")
cursor.fetchall()

[(4, 'ice cream float', 250, 8, 0, 8)]

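But this last query will only return one drink, whereas the first query will return multiple drinks if there are ties. It also relies on SQLite allowing bare columns next to an aggregate function; stricter databases such as PostgreSQL reject this query.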
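The difference only shows up when there is a tie at the top. The sketch below uses an invented table in which two drinks share the highest price; the rows and the `tie_conn` name are hypothetical.

```python
import sqlite3

# Hypothetical data: two drinks tied at the maximum price.
tie_conn = sqlite3.connect(":memory:")
tie_conn.execute("CREATE TABLE drinks (id INTEGER PRIMARY KEY, name TEXT, price INTEGER)")
tie_conn.executemany(
    "INSERT INTO drinks VALUES (?, ?, ?)",
    [(1, "float", 8), (2, "milkshake", 8), (3, "beer", 7)],  # invented rows
)

# Subquery in WHERE: every drink priced at the maximum is returned.
with_subquery = tie_conn.execute(
    "SELECT * FROM drinks WHERE price = (SELECT max(price) FROM drinks) ORDER BY id"
).fetchall()

# Aggregate without GROUP BY: SQLite collapses the result to a single row.
with_aggregate = tie_conn.execute("SELECT *, max(price) FROM drinks").fetchall()

print(len(with_subquery), len(with_aggregate))
```

Here the subquery version returns both tied drinks, while the aggregate version returns a single row.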

### Subqueries as a vector

In the above query, we used a subquery in the where clause that returned only a single value, the price of 8.  But we can also use subqueries that return a list of values (which we refer to as a vector).

For example, let's say that we want to find all of the drinks that have the top two prices including any ties.  We can do so with something like the following.

In [12]:
cursor.execute("""select * from drinks where price IN
               (select price from drinks order by price desc limit 2)""")
cursor.fetchall()

[(4, 'ice cream float', 250, 8, 0),
 (5, 'duff beer', 200, 7, 1),
 (6, 'gin and tonic', 200, 7, 1)]

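So this time, our subquery returned a vector of values, `(8, 7)`, and from there we found all of the drinks whose price matches one of them.  Note that the subquery takes the prices of the two most expensive rows, so if two drinks tied at the highest price it would return that price twice; `select distinct price` would guard against that.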

> Notice that if we tried this without a subquery we would not have seen the tie.

In [14]:
cursor.execute("""select * from drinks order by price desc limit 2""")
cursor.fetchall()

[(4, 'ice cream float', 250, 8, 0), (5, 'duff beer', 200, 7, 1)]
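One caveat about the `IN` query in this section: its subquery takes the prices of the two most expensive rows, and those are two different prices only when there is no tie for first place. The sketch below, on invented data with a tie at the top, compares the subquery with and without `DISTINCT`. The table contents and the `caveat_conn` name are hypothetical.

```python
import sqlite3

# Hypothetical data: two drinks tied at the highest price.
caveat_conn = sqlite3.connect(":memory:")
caveat_conn.execute("CREATE TABLE drinks (id INTEGER PRIMARY KEY, name TEXT, price INTEGER)")
caveat_conn.executemany(
    "INSERT INTO drinks VALUES (?, ?, ?)",
    [(1, "float", 8), (2, "milkshake", 8), (3, "beer", 7), (4, "soda", 2)],
)

# Prices of the two most expensive rows: the tie fills both slots.
top_rows = caveat_conn.execute(
    "SELECT price FROM drinks ORDER BY price DESC LIMIT 2"
).fetchall()

# Two highest distinct prices.
top_prices = caveat_conn.execute(
    "SELECT DISTINCT price FROM drinks ORDER BY price DESC LIMIT 2"
).fetchall()

# Using the DISTINCT subquery inside IN picks up the second price as well.
names = caveat_conn.execute(
    """SELECT name FROM drinks WHERE price IN
       (SELECT DISTINCT price FROM drinks ORDER BY price DESC LIMIT 2)
       ORDER BY id"""
).fetchall()

print(top_rows, top_prices, names)
```

With the lesson's data the two forms agree, because only one drink has the highest price.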

### Joining on a vector

Now that we've seen how to return a vector from a subquery, we can also see how we can join a table on a returned vector.

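For example, let's say that we want to find all of the orders involving one of the two most expensive drinks.  For that we can do something like the following.  Note that `limit 2` keeps only two drink ids, so only one of the two drinks tied at 7 dollars is included.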

In [18]:
cursor.execute("""select orders.* from orders
join (select drinks.id as id from drinks order by price desc limit 2) top_drinks
            on orders.drink_id = top_drinks.id order by orders.id""")
cursor.fetchall()

[(3, 2, 5, 2), (4, 2, 5, 1), (5, 2, 5, 1), (9, 3, 4, 3)]

Another way to do this is with a WHERE IN clause.

In [20]:
cursor.execute("""select orders.* from orders
WHERE orders.drink_id IN (select drinks.id as id from drinks
order by price desc limit 2) order by orders.id""")
cursor.fetchall()

[(3, 2, 5, 2), (4, 2, 5, 1), (5, 2, 5, 1), (9, 3, 4, 3)]
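As a rough check that the two forms agree, the sketch below runs both against a small invented database. The two-column `orders` table, the `water` row, and the `shop` connection are hypothetical, and a tie-breaker on `id` is added to the `ORDER BY` (an assumption not in the lesson's queries) so that the same two drinks are picked every time.

```python
import sqlite3

# Hypothetical miniature of the bar database.
shop = sqlite3.connect(":memory:")
shop.executescript("""
    CREATE TABLE drinks (id INTEGER PRIMARY KEY, name TEXT, price INTEGER);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, drink_id INTEGER);
    INSERT INTO drinks VALUES (4, 'ice cream float', 8), (5, 'duff beer', 7),
                              (6, 'gin and tonic', 7), (7, 'water', 1);
    INSERT INTO orders VALUES (1, 7), (2, 5), (3, 6), (4, 4), (5, 5);
""")

# Ids of the two most expensive drinks; ties are broken by id.
top_two = "SELECT id FROM drinks ORDER BY price DESC, id LIMIT 2"

# Form 1: join the orders to the subquery result.
joined = shop.execute(f"""
    SELECT orders.* FROM orders
    JOIN ({top_two}) top_drinks ON orders.drink_id = top_drinks.id
    ORDER BY orders.id
""").fetchall()

# Form 2: filter the orders with WHERE ... IN.
filtered = shop.execute(f"""
    SELECT orders.* FROM orders
    WHERE orders.drink_id IN ({top_two})
    ORDER BY orders.id
""").fetchall()

print(joined == filtered, joined)
```

Both forms return the same orders here because the subquery yields each drink id once. Order 3 is left out even though its drink also costs 7, since `LIMIT 2` keeps only two ids.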

### Summary

In this lesson, we saw how to use subqueries in the where clause.  We saw how our subqueries can return a single entry or a vector of entries.  When our subquery returns a single entry, we can then see where our records match that value.

In [21]:
cursor.execute("select * from drinks where price = (select max(price) from drinks)")
cursor.fetchall()

[(4, 'ice cream float', 250, 8, 0)]

And when our subquery returns a vector, we can then pair this with a WHERE IN clause.

In [None]:
cursor.execute("""select * from drinks where price IN
               (select price from drinks order by price desc limit 2)""")
cursor.fetchall()

Or we can also use our vector in a join clause.

In [22]:
cursor.execute("""select orders.* from orders
join (select drinks.id as id from drinks order by price desc limit 2) top_drinks
            on orders.drink_id = top_drinks.id order by orders.id""")
cursor.fetchall()

[(3, 2, 5, 2), (4, 2, 5, 1), (5, 2, 5, 1), (9, 3, 4, 3)]