<div align="right" style=" font-size: 80%; text-align: center; margin: 0 auto">
<img src="https://raw.githubusercontent.com/Explore-AI/Pictures/master/ExploreAI_logos/Logo blue_dark.png"  style="width:25px" align="right";/>
</div>

# Subqueries and CTEs
© ExploreAI Academy

In this exercise, we will apply subqueries and CTEs in different parts of a query and for different use cases. Ensure that you have downloaded the database file, Northwind.db.

## Learning objectives

By the end of this train, you should:
- Know how to use CTEs to simplify subqueries.
- Understand when to use subqueries and when to use CTEs by comparing their performance and readability.

First, let's load our sample database:

In [2]:
# Load and activate the SQL extension to allow us to execute SQL in a Jupyter notebook.
%load_ext sql


In [3]:
# Load the Northwind database stored in your local machine. 
# Make sure the file is saved in the same folder as this notebook.
%sql sqlite:///Northwind.db
    

Here is a view of all of our tables in the database:

<div align="center" style=" font-size: 90%; text-align: center; margin: 0 auto">
<img src="https://raw.githubusercontent.com/Explore-AI/Pictures/master/Northwind_ERD.png"  style="width:900px";/>
<br>
<br>
    <em>Figure 1: Northwind ERD</em>
</div>

## Exercise

Write and run queries that return the following information. Compare your queries with the solutions at the end of this notebook.

### Exercise 1

Retrieve the details of products that have been ordered by customers from the UK.

In [7]:
%%sql
SELECT
    p.productid,
    p.productname,
    p.unitprice, 
    o.orderid, 
    c.customerid, 
    c.country
FROM 
    products p
JOIN 
    orderdetails od 
ON 
    p.productid = od.productid
JOIN 
    orders o 
ON 
    od.orderid = o.orderid
JOIN 
    customers c 
ON 
    o.customerid = c.customerid
WHERE c.country = 'UK'
LIMIT 10;

 * sqlite:///Northwind.db
Done.


ProductID,ProductName,UnitPrice,OrderID,CustomerID,Country
3,Aniseed Syrup,10.0,10289,BSBEV,UK
64,Wimmers gute Semmelknödel,33.25,10289,BSBEV,UK
34,Sasquatch Ale,14.0,10315,ISLAT,UK
70,Outback Lager,15.0,10315,ISLAT,UK
41,Jack's New England Clam Chowder,9.65,10318,ISLAT,UK
76,Lakkalikööri,18.0,10318,ISLAT,UK
35,Steeleye Stout,18.0,10321,ISLAT,UK
24,Guaraná Fantástica,4.5,10355,AROUT,UK
57,Ravioli Angelo,19.5,10355,AROUT,UK
16,Pavlova,17.45,10359,SEVES,UK


### Exercise 2


Find the names of customers who have ordered products worth more than the average order value, where the value of an order line is its unit price multiplied by its quantity.

In [21]:
%%sql
WITH avg_order AS (
    SELECT AVG(orderdetails.unitprice * orderdetails.quantity) AS avg_value
    FROM orderdetails
)
SELECT DISTINCT c.companyname
FROM customers AS c
JOIN orders AS o ON c.customerid = o.customerid
JOIN orderdetails ON o.orderid = orderdetails.orderid
WHERE (orderdetails.UnitPrice * orderdetails.Quantity) > (SELECT avg_value FROM avg_order)
LIMIT 10;


 * sqlite:///Northwind.db
Done.


CompanyName
Toms Spezialitäten
Hanari Carnes
Suprêmes délices
Richter Supermarkt
HILARION-Abastos
Ernst Handel
Ottilies Käseladen
Blondesddsl père et fils
Frankenversand
GROSELLA-Restaurante


### Exercise 3


Write a CTE to find the product that each customer has ordered most often.

In [26]:
%%sql
WITH customer_total_quantity AS (
    SELECT c.contactname, SUM(od.quantity) AS total_quantity
    FROM customers AS c
    JOIN orders AS o ON c.customerid = o.customerid
    JOIN orderdetails AS od ON o.orderid = od.orderid
    GROUP BY c.contactname
)
SELECT contactname, total_quantity
FROM customer_total_quantity
ORDER BY total_quantity DESC
LIMIT 10;


 * sqlite:///Northwind.db
Done.


contactname,total_quantity
Jose Pavarotti,4958
Roland Mendel,4543
Horst Kloss,3961
Patricia McKenna,1684
Peter Franken,1525
Paula Wilson,1383
Maria Larsson,1234
Carlos Hernández,1096
Pascale Cartrain,1072
Karl Jablonski,1063


### Exercise 4

Using a CTE, list the employees who have more than the average number of direct reports (that is, other employees whose `ReportsTo` value points to them).

In [30]:
%%sql

WITH avg_reports AS (
    SELECT AVG(report_count) AS average_count
    FROM (
        SELECT COUNT(*) AS report_count
        FROM employees
        JOIN employees AS reports ON employees.EmployeeID = reports.ReportsTo
        GROUP BY employees.EmployeeID
    ) AS report_counts
)
SELECT employees.*
FROM employees
JOIN employees AS reports ON employees.EmployeeID = reports.ReportsTo
GROUP BY employees.EmployeeID
HAVING COUNT(*) > (SELECT average_count FROM avg_reports);

 * sqlite:///Northwind.db
Done.


EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,Address,City,Region,PostalCode,Country,HomePhone,Extension,Notes,ReportsTo,PhotoPath,Salary
2,Fuller,Andrew,"Vice President, Sales",Dr.,1952-02-19 00:00:00,1992-08-14 00:00:00,908 W. Capital Way,Tacoma,WA,98401,USA,(206) 555-9482,3457,"Andrew received his BTS commercial in 1974 and a Ph.D. in international marketing from the University of Dallas in 1981. He is fluent in French and Italian and reads German. He joined the company as a sales representative, was promoted to sales manager in January 1992 and to vice president of sales in March 1993. Andrew is a member of the Sales Management Roundtable, the Seattle Chamber of Commerce, and the Pacific Rim Importers Association.",,http://accweb/emmployees/fuller.bmp,2254.49


## Solutions

### Exercise 1

SQL solution with a subquery:

In [None]:
%%sql

SELECT customers.*
FROM customers
WHERE customers.CustomerID IN (
    SELECT orders.CustomerID
    FROM orders
    GROUP BY orders.CustomerID
    ORDER BY COUNT(*) DESC
    LIMIT 1
);

SQL solution with a CTE:

In [5]:
%%sql

WITH most_orders AS (
    SELECT orders.CustomerID
    FROM orders
    GROUP BY orders.CustomerID
    ORDER BY COUNT(*) DESC
    LIMIT 1
)
SELECT customers.*
FROM customers
JOIN most_orders
ON customers.CustomerID = most_orders.CustomerID;

 * sqlite:///Northwind.db
Done.


CustomerID,CompanyName,ContactName,ContactTitle,Address,City,Region,PostalCode,Country,Phone,Fax
SAVEA,Save-a-lot Markets,Jose Pavarotti,Sales Representative,187 Suffolk Ln.,Boise,ID,83720,USA,(208) 555-8097,


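Both solutions return the customer with the most orders. Note that this is a different question from the Exercise 1 prompt, which asks about products ordered by UK customers; the comparison between the two forms is what matters here. The subquery solution nests the ranking logic inside the main query's `WHERE` clause, which becomes harder to read as queries grow. The CTE solution gives that logic a name (`most_orders`) and defines it before the main query, so the main query reads as a simple join.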
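
The same contrast can be tried on the Exercise 1 prompt itself (products ordered by UK customers). The sketch below is only an illustration, written with Python's built-in `sqlite3` module. It assumes a tiny in-memory toy database rather than Northwind.db: the table contents (`'AAA'`, `'Tea'`, and so on) are invented, and the CTE names `uk_customers` and `uk_products` are hypothetical.

```python
import sqlite3

# Hypothetical toy data: a tiny in-memory stand-in for a few Northwind tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (CustomerID TEXT PRIMARY KEY, Country TEXT);
CREATE TABLE orders (OrderID INTEGER PRIMARY KEY, CustomerID TEXT);
CREATE TABLE orderdetails (OrderID INTEGER, ProductID INTEGER);
CREATE TABLE products (ProductID INTEGER PRIMARY KEY, ProductName TEXT);

INSERT INTO customers VALUES ('AAA', 'UK'), ('BBB', 'USA');
INSERT INTO orders VALUES (1, 'AAA'), (2, 'BBB'), (3, 'AAA');
INSERT INTO orderdetails VALUES (1, 10), (2, 20), (3, 10), (3, 30);
INSERT INTO products VALUES (10, 'Tea'), (20, 'Coffee'), (30, 'Jam');
""")

# Subquery form: the filtering logic is nested two levels deep.
subquery_sql = """
SELECT ProductID, ProductName
FROM products
WHERE ProductID IN (
    SELECT od.ProductID
    FROM orderdetails AS od
    JOIN orders AS o ON od.OrderID = o.OrderID
    WHERE o.CustomerID IN (
        SELECT CustomerID FROM customers WHERE Country = 'UK'
    )
)
ORDER BY ProductID;
"""

# CTE form: each step is named and defined before the main query.
cte_sql = """
WITH uk_customers AS (
    SELECT CustomerID FROM customers WHERE Country = 'UK'
),
uk_products AS (
    SELECT DISTINCT od.ProductID
    FROM orderdetails AS od
    JOIN orders AS o ON od.OrderID = o.OrderID
    JOIN uk_customers AS uk ON o.CustomerID = uk.CustomerID
)
SELECT p.ProductID, p.ProductName
FROM products AS p
JOIN uk_products AS up ON p.ProductID = up.ProductID
ORDER BY p.ProductID;
"""

subquery_rows = conn.execute(subquery_sql).fetchall()
cte_rows = conn.execute(cte_sql).fetchall()
print(subquery_rows)  # [(10, 'Tea'), (30, 'Jam')]
print(cte_rows)       # [(10, 'Tea'), (30, 'Jam')]
```

Both forms return each product once, unlike a plain chain of joins, which returns one row per order line.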

### Exercise 2

In [None]:
%%sql

WITH avg_order_value AS (
    SELECT AVG(OrderDetails.UnitPrice * OrderDetails.Quantity) AS average_value
    FROM OrderDetails
)
SELECT DISTINCT customers.CompanyName
FROM customers
JOIN orders ON customers.CustomerID = orders.CustomerID
JOIN OrderDetails ON orders.OrderID = OrderDetails.OrderID
WHERE (OrderDetails.UnitPrice * OrderDetails.Quantity) > (SELECT average_value FROM avg_order_value);
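
The query above compares individual order lines against the average line value. If "order value" is instead read as the total of all lines in an order, a second CTE can sum each order first. The sketch below contrasts the two readings using Python's built-in `sqlite3` module. It is a hedged illustration on invented in-memory toy data, not Northwind.db, and the CTE name `order_totals` is an assumption.

```python
import sqlite3

# Hypothetical toy data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (CustomerID TEXT PRIMARY KEY, CompanyName TEXT);
CREATE TABLE orders (OrderID INTEGER PRIMARY KEY, CustomerID TEXT);
CREATE TABLE orderdetails (OrderID INTEGER, UnitPrice REAL, Quantity INTEGER);

INSERT INTO customers VALUES
    ('AAA', 'Alpha Ltd'), ('BBB', 'Beta GmbH'), ('CCC', 'Gamma SA');
INSERT INTO orders VALUES (1, 'AAA'), (2, 'BBB'), (3, 'CCC');
-- Order 1 has four small lines (total 40); orders 2 and 3 have one line each.
INSERT INTO orderdetails VALUES
    (1, 10.0, 1), (1, 10.0, 1), (1, 10.0, 1), (1, 10.0, 1),
    (2, 25.0, 1),
    (3, 5.0, 1);
""")

# Reading 1: compare each order line with the average line value.
per_line_sql = """
WITH avg_order_value AS (
    SELECT AVG(UnitPrice * Quantity) AS average_value
    FROM orderdetails
)
SELECT DISTINCT c.CompanyName
FROM customers AS c
JOIN orders AS o ON c.CustomerID = o.CustomerID
JOIN orderdetails AS od ON o.OrderID = od.OrderID
WHERE (od.UnitPrice * od.Quantity) > (SELECT average_value FROM avg_order_value)
ORDER BY c.CompanyName;
"""

# Reading 2: total each order first, then compare with the average order total.
per_order_sql = """
WITH order_totals AS (
    SELECT OrderID, SUM(UnitPrice * Quantity) AS order_value
    FROM orderdetails
    GROUP BY OrderID
),
avg_order_value AS (
    SELECT AVG(order_value) AS average_value FROM order_totals
)
SELECT DISTINCT c.CompanyName
FROM customers AS c
JOIN orders AS o ON c.CustomerID = o.CustomerID
JOIN order_totals AS ot ON o.OrderID = ot.OrderID
WHERE ot.order_value > (SELECT average_value FROM avg_order_value)
ORDER BY c.CompanyName;
"""

per_line = [row[0] for row in conn.execute(per_line_sql)]
per_order = [row[0] for row in conn.execute(per_order_sql)]
print(per_line)   # ['Beta GmbH']
print(per_order)  # ['Alpha Ltd', 'Beta GmbH']
```

On this toy data the two readings disagree: Alpha Ltd has no single line above the line average, but its order total is above the average order total.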


### Exercise 3

In [None]:
%%sql

-- Count how many order lines each customer has for each product.
WITH customer_product_counts AS (
    SELECT customers.CustomerID, OrderDetails.ProductID, COUNT(*) AS order_count
    FROM customers
    JOIN orders ON customers.CustomerID = orders.CustomerID
    JOIN OrderDetails ON orders.OrderID = OrderDetails.OrderID
    GROUP BY customers.CustomerID, OrderDetails.ProductID
)
-- Keep the product(s) whose count equals the customer's maximum; ties return several rows.
SELECT customers.CompanyName, products.ProductName, customer_max.max_order_count
FROM (
    SELECT CustomerID, MAX(order_count) AS max_order_count
    FROM customer_product_counts
    GROUP BY CustomerID
) AS customer_max
JOIN customer_product_counts
    ON customer_max.CustomerID = customer_product_counts.CustomerID
    AND customer_max.max_order_count = customer_product_counts.order_count
JOIN customers ON customer_product_counts.CustomerID = customers.CustomerID
JOIN products ON customer_product_counts.ProductID = products.ProductID;
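
An alternative to joining the counts back to their per-customer maximum is to rank the counts with a window function. The sketch below is one possible variant, not the notebook's official solution. It assumes SQLite 3.25 or later (for window functions), runs through Python's built-in `sqlite3` module, and uses invented in-memory toy data; the CTE names `product_counts` and `ranked` are hypothetical.

```python
import sqlite3

# Hypothetical toy data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (OrderID INTEGER PRIMARY KEY, CustomerID TEXT);
CREATE TABLE orderdetails (OrderID INTEGER, ProductID INTEGER);

INSERT INTO orders VALUES (1, 'AAA'), (2, 'AAA'), (3, 'BBB');
-- AAA orders product 10 twice and product 20 once.
-- BBB orders products 20 and 30 once each (a tie).
INSERT INTO orderdetails VALUES (1, 10), (2, 10), (2, 20), (3, 20), (3, 30);
""")

ranked_sql = """
WITH product_counts AS (
    SELECT o.CustomerID, od.ProductID, COUNT(*) AS order_count
    FROM orders AS o
    JOIN orderdetails AS od ON o.OrderID = od.OrderID
    GROUP BY o.CustomerID, od.ProductID
),
ranked AS (
    SELECT CustomerID, ProductID, order_count,
           RANK() OVER (
               PARTITION BY CustomerID
               ORDER BY order_count DESC
           ) AS rnk
    FROM product_counts
)
SELECT CustomerID, ProductID, order_count
FROM ranked
WHERE rnk = 1
ORDER BY CustomerID, ProductID;
"""

top_products = conn.execute(ranked_sql).fetchall()
print(top_products)  # [('AAA', 10, 2), ('BBB', 20, 1), ('BBB', 30, 1)]
```

`RANK()` keeps ties, so a customer with two equally popular products gets two rows, which matches the behaviour of joining on the maximum count. Swapping in `ROW_NUMBER()` would return exactly one row per customer, with the choice between tied products left arbitrary.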


### Exercise 4

In [31]:
%%sql

WITH avg_reports AS (
    SELECT AVG(report_count) AS average_count
    FROM (
        SELECT COUNT(*) AS report_count
        FROM employees
        JOIN employees AS reports ON employees.EmployeeID = reports.ReportsTo
        GROUP BY employees.EmployeeID
    ) AS report_counts
)
SELECT employees.*
FROM employees
JOIN employees AS reports ON employees.EmployeeID = reports.ReportsTo
GROUP BY employees.EmployeeID
HAVING COUNT(*) > (SELECT average_count FROM avg_reports);

 * sqlite:///Northwind.db
Done.


EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,Address,City,Region,PostalCode,Country,HomePhone,Extension,Notes,ReportsTo,PhotoPath,Salary
2,Fuller,Andrew,"Vice President, Sales",Dr.,1952-02-19 00:00:00,1992-08-14 00:00:00,908 W. Capital Way,Tacoma,WA,98401,USA,(206) 555-9482,3457,"Andrew received his BTS commercial in 1974 and a Ph.D. in international marketing from the University of Dallas in 1981. He is fluent in French and Italian and reads German. He joined the company as a sales representative, was promoted to sales manager in January 1992 and to vice president of sales in March 1993. Andrew is a member of the Sales Management Roundtable, the Seattle Chamber of Commerce, and the Pacific Rim Importers Association.",,http://accweb/emmployees/fuller.bmp,2254.49
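
The solution above repeats the self-join twice and, because it uses an inner join, averages only over employees who have at least one report. The sketch below shows how a single CTE can hold the counts once, and how a `LEFT JOIN` changes the average by including employees with zero reports. It is a hedged illustration using Python's built-in `sqlite3` module on invented in-memory toy data; the CTE name `report_counts` and the aliases `m` and `r` are assumptions.

```python
import sqlite3

# Hypothetical toy data: Boss has three reports, Lead has one, the rest have none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
    EmployeeID INTEGER PRIMARY KEY, LastName TEXT, ReportsTo INTEGER
);
INSERT INTO employees VALUES
    (1, 'Boss', NULL), (2, 'Lead', 1), (3, 'A', 1), (4, 'B', 1), (5, 'C', 2);
""")

# Inner join: only employees with at least one report get a row in the CTE.
managers_only_sql = """
WITH report_counts AS (
    SELECT m.EmployeeID, m.LastName, COUNT(*) AS report_count
    FROM employees AS m
    JOIN employees AS r ON m.EmployeeID = r.ReportsTo
    GROUP BY m.EmployeeID, m.LastName
)
SELECT LastName, report_count
FROM report_counts
WHERE report_count > (SELECT AVG(report_count) FROM report_counts)
ORDER BY EmployeeID;
"""

# Left join: every employee gets a row; COUNT(r.EmployeeID) is 0 when nobody reports to them.
all_employees_sql = """
WITH report_counts AS (
    SELECT m.EmployeeID, m.LastName, COUNT(r.EmployeeID) AS report_count
    FROM employees AS m
    LEFT JOIN employees AS r ON m.EmployeeID = r.ReportsTo
    GROUP BY m.EmployeeID, m.LastName
)
SELECT LastName, report_count
FROM report_counts
WHERE report_count > (SELECT AVG(report_count) FROM report_counts)
ORDER BY EmployeeID;
"""

managers_only = conn.execute(managers_only_sql).fetchall()
all_employees = conn.execute(all_employees_sql).fetchall()
print(managers_only)  # [('Boss', 3)]
print(all_employees)  # [('Boss', 3), ('Lead', 1)]
```

With the inner join the average is 2 (over the two managers), so only Boss qualifies. With the left join the average drops to 0.8 (over all five employees), so Lead qualifies as well. Which reading is intended depends on how "average number of reports" is defined.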


<div align="center" style=" font-size: 80%; text-align: center; margin: 0 auto">
<img src="https://raw.githubusercontent.com/Explore-AI/Pictures/master/ExploreAI_logos/EAI_Blue_Dark.png"  style="width:200px";/>
</div>