# Subquery as Table

### Introduction

In the previous lessons, we used subqueries in the WHERE clause and saw that a subquery can return either a single value or a list of values. In this lesson, we'll see how a subquery can return an entire table.

### Loading Our Data

Once you have copied the SQL statements, run them to create the database.

Then we can connect to our database with the following.

In [3]:
import sqlite3
conn = sqlite3.connect('./moes_bar.db')
cursor = conn.cursor()

In [6]:
import pandas as pd
root_url = "https://raw.githubusercontent.com/data-eng-10-21/sql-interview-questions/main/2-subqueries-in-where/data/"
names = ['bartenders', 'customers', 'drinks', 'orders', 'ingredients', 'ingredients_drinks']
loaded_dfs = [pd.read_csv(f'{root_url}{name}.csv') for name in names]

Next we load our data.

In [8]:
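# Uncomment to write each DataFrame to a table of the same name.
# By default, to_sql raises an error if the table already exists.
# for name, df in zip(names, loaded_dfs):
#     df.to_sql(name, conn, index=False)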

### Inspecting Our Data

Now we can list all of the tables with the following.

In [9]:
cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")
cursor.fetchall()

[('bartenders',),
 ('customers',),
 ('drinks',),
 ('orders',),
 ('ingredients',),
 ('ingredients_drinks',)]

And then we can see the details of a particular table with the following.

In [10]:
cursor.execute("pragma table_info(drinks)")
cursor.fetchall()

[(0, 'id', 'INTEGER', 0, None, 1),
 (1, 'name', 'TEXT', 0, None, 0),
 (2, 'calories', 'INTEGER', 0, None, 0),
 (3, 'price', 'INTEGER', 0, None, 0),
 (4, 'alcoholic', 'INTEGER', 0, None, 0)]
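
Each tuple returned by `pragma table_info` has six fields: the column id, the column name, the declared type, a not-null flag, the default value, and a primary-key flag. Here is a minimal sketch that labels those fields. It uses an in-memory database with a hypothetical, shortened `drinks` schema (an assumption for illustration, not the lesson's actual table definition).

```python
import sqlite3

# In-memory database with a hypothetical drinks table, for illustration only.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE drinks (id INTEGER PRIMARY KEY, name TEXT, calories INTEGER)"
)

# Names of the six fields in each row returned by PRAGMA table_info.
fields = ["cid", "name", "type", "notnull", "dflt_value", "pk"]

rows = demo_conn.execute("PRAGMA table_info(drinks)").fetchall()

# Pair each field name with its value so the tuples are easier to read.
columns = [dict(zip(fields, row)) for row in rows]

for column in columns:
    print(column)
```

Reading the output this way shows, for example, that `pk` is 1 only for the `id` column.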

### Subqueries for Aggregates

One use case for subqueries is to change the granularity of our data, for example going from one row per order to one row per customer. Let's say that we would like to return all customers who placed more than two orders.

In [18]:
pd.read_sql("""
SELECT *
FROM customers
JOIN (
    SELECT COUNT(*) AS amount, orders.customer_id
    FROM orders
    GROUP BY orders.customer_id
    HAVING amount > 2
) AS buying_customers
  ON buying_customers.customer_id = customers.id
""", conn)


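|   | id | name | hometown | birthyear | amount | customer_id |
|---|----|------|----------|-----------|--------|-------------|
| 0 | 1 | bart simpson | springfield | 2008 | 3 | 1 |
| 1 | 2 | maggie simpson | milwaukee | 2016 | 4 | 2 |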


Above, we can see that Bart and Maggie are both returned, since each of them placed more than two orders.

Study the SQL query above. Notice that it follows this pattern:

```sql
SELECT * FROM some_table 
JOIN (subquery here) subquery_alias 
ON subquery_alias.some_id = some_table.some_id
```
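
To see why this works, it can help to run the subquery on its own first and then join to its result. The sketch below does both against a small in-memory database; the tables, names, and rows are hypothetical toy data, not the lesson's `moes_bar.db`.

```python
import sqlite3

# Hypothetical toy data, not the lesson's database.
demo_conn = sqlite3.connect(":memory:")
demo_conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO orders (customer_id)
    VALUES (1), (1), (1), (2), (3), (3), (3), (3);
""")

# Step 1: on its own, the subquery returns one row per customer.
subquery = """
    SELECT customer_id, COUNT(*) AS amount
    FROM orders
    GROUP BY customer_id
"""
per_customer = demo_conn.execute(subquery + " ORDER BY customer_id").fetchall()
print(per_customer)

# Step 2: join to that result as if it were a table named order_counts.
joined = demo_conn.execute(f"""
    SELECT customers.name, order_counts.amount
    FROM customers
    JOIN ({subquery}) AS order_counts
      ON order_counts.customer_id = customers.id
    WHERE order_counts.amount >= 3
    ORDER BY customers.name
""").fetchall()
print(joined)
```

Because the subquery has an alias, the outer query can refer to its columns (here `order_counts.amount`) just like the columns of any other table.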

### Your turn

Now, without looking at the query above, write a SQL query that returns all of the customers who made exactly two orders.

In [20]:
pd.read_sql("""

""", conn)

# 	id	name	hometown	birthyear	amount	customer_id
# 0	3	lisa simpson	philly	2006	2	3

|   | id | name | hometown | birthyear | amount | customer_id |
|---|----|------|----------|-----------|--------|-------------|
| 0 | 3 | lisa simpson | philly | 2006 | 2 | 3 |


### Count Distinct

Finally, although this is not a subquery, suppose that we only had the orders table and wanted to find the number of distinct customers who made orders. We could do so with the following:

In [22]:
pd.read_sql("""
SELECT COUNT(DISTINCT orders.customer_id) AS unique_customers
FROM orders
""", conn)

|   | unique_customers |
|---|------------------|
| 0 | 3 |
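
To make the difference from a plain count concrete, here is a small sketch comparing `COUNT(*)` with `COUNT(DISTINCT ...)`. It runs against a hypothetical in-memory `orders` table, not the lesson's data.

```python
import sqlite3

# Hypothetical toy orders table: two customers plus one order with no customer.
demo_conn = sqlite3.connect(":memory:")
demo_conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO orders (customer_id)
    VALUES (1), (1), (2), (2), (2), (NULL);
""")

# COUNT(*) counts every row; COUNT(DISTINCT col) counts each
# non-NULL value of col only once.
total_rows, unique_customers = demo_conn.execute("""
    SELECT COUNT(*), COUNT(DISTINCT customer_id)
    FROM orders
""").fetchone()

print(total_rows, unique_customers)
```

Here the six rows collapse to two distinct customers, and the row with a NULL `customer_id` is not counted as a customer.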


Now write a query that returns two columns: `unique_customers`, the number of unique customers who made orders, and `unique_drinks`, the number of unique drinks across all orders.

> Do not simply copy and paste the code above.  Try to write it without referencing the above so you make sure you know it.

In [24]:
pd.read_sql("""
SELECT COUNT(DISTINCT orders.customer_id) AS unique_customers,
       COUNT(DISTINCT orders.drink_id) AS unique_drinks
FROM orders
""", conn)

# 	unique_customers	unique_drinks
# 0	3	6

|   | unique_customers | unique_drinks |
|---|------------------|---------------|
| 0 | 3 | 6 |


### Resources


[LeetCode: Customer Placing the Largest Number of Orders](https://leetcode.com/problems/customer-placing-the-largest-number-of-orders/)