# Assignment 5.  
## Queries combining multiple tables

In this assignment, we will fetch data from a relational database server directly into Python variables. Using these new skills, we will complete the exercises from the "Problems for You to Solve" sections in Chapters 8, 9, and 11 of *SQL Queries for Mere Mortals*.

### Increasing complexity of queries

When working with queries that address multiple tables, rely on the structure of the foreign keys and primary keys to guide the design of your queries. 

We will work with two basic patterns to solve most problems in the book:

1. Using a subquery in the `WHERE` clause
2. Using an inner natural join

The database schemas in the book are designed so that each foreign key column has the same name as the primary key column it references in the parent table. Because a natural join matches rows on every column name the two tables share, this convention makes natural joins along foreign keys logically meaningful.
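To make the two patterns concrete, here is a minimal sketch that runs without the course server. It assumes an in-memory SQLite database (Python's standard `sqlite3` module) as a stand-in for MySQL, and the rows, including the `Example Helmet` product, are hypothetical. Both queries should return the same products because `CategoryID` is the only column name the two tables share.

```python
import sqlite3

# Hypothetical miniature of the categories/products tables, held in an
# in-memory SQLite database so the sketch runs without the course server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (
        CategoryID INTEGER PRIMARY KEY,
        CategoryDescription TEXT
    );
    CREATE TABLE products (
        ProductNumber INTEGER PRIMARY KEY,
        ProductName TEXT,
        CategoryID INTEGER REFERENCES categories (CategoryID)
    );
    INSERT INTO categories VALUES (1, 'Accessories'), (2, 'Bikes');
    INSERT INTO products VALUES
        (1, 'Trek 9000 Mountain Bike', 2),
        (3, 'Example Helmet', 1),
        (6, 'Viscount Mountain Bike', 2);
""")

# Pattern 1: subquery in the WHERE clause.
subquery_rows = conn.execute("""
    SELECT ProductName
    FROM products
    WHERE CategoryID IN (
        SELECT CategoryID
        FROM categories
        WHERE CategoryDescription = 'Bikes')
    ORDER BY ProductNumber
""").fetchall()

# Pattern 2: natural join, which matches rows on the shared CategoryID column.
join_rows = conn.execute("""
    SELECT ProductName
    FROM products NATURAL JOIN categories
    WHERE CategoryDescription = 'Bikes'
    ORDER BY ProductNumber
""").fetchall()

print(subquery_rows == join_rows)
```

Note that the sketch uses single-quoted string literals, which are standard SQL; the MySQL cells in this assignment use double quotes, which MySQL accepts by default.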

First, let's connect to the database and activate the SQL magic for Jupyter.

In [1]:
import json
import pymysql 

pymysql.install_as_MySQLdb()

with open('cred.json') as f:
    creds = json.load(f)

connection_string = "mysql://{user}:{password}@{host}".format(**creds)

In [2]:
%load_ext sql
%config SqlMagic.autocommit=True
%sql $connection_string

'Connected: dimitri@None'

Let's answer the following question using the `shared_sales` database:
> Show all the customers who have bought a bicycle, i.e., a product in the category "Bikes".

Review the schema diagram to follow how we build the query step-by-step.

In [3]:
%%sql

USE shared_sales

 * mysql://dimitri:***@db.data-science-ust.net
0 rows affected.


[]

In [4]:
%%sql

show tables


 * mysql://dimitri:***@db.data-science-ust.net
14 rows affected.


Tables_in_shared_sales
~log
categories
customers
employees
order_details
orders
product_vendors
products
regulars
vendors


In [5]:
%%sql

-- Step 1
-- The category for bikes

SELECT * 
    FROM categories 
    WHERE CategoryDescription = "Bikes"

 * mysql://dimitri:***@db.data-science-ust.net
1 rows affected.


CategoryID,CategoryDescription
2,Bikes


In [6]:
%%sql

-- Step 2
-- Products from that category

SELECT * 
FROM products
WHERE CategoryID in (
    SELECT CategoryID 
    FROM categories
    WHERE CategoryDescription = "Bikes"
)

 * mysql://dimitri:***@db.data-science-ust.net
4 rows affected.


ProductNumber,ProductName,ProductDescription,RetailPrice,QuantityOnHand,CategoryID
1,Trek 9000 Mountain Bike,,1200.0,6,2
2,Eagle FS-3 Mountain Bike,,1800.0,8,2
6,Viscount Mountain Bike,,635.0,5,2
11,GT RTS-2 Mountain Bike,,1650.0,5,2


In [7]:
%%sql

-- Step 2
-- Products from that category, USING JOIN

SELECT * 
    FROM products NATURAL JOIN categories
    WHERE CategoryDescription = "Bikes"

 * mysql://dimitri:***@db.data-science-ust.net
4 rows affected.


CategoryID,ProductNumber,ProductName,ProductDescription,RetailPrice,QuantityOnHand,CategoryDescription
2,1,Trek 9000 Mountain Bike,,1200.0,6,Bikes
2,2,Eagle FS-3 Mountain Bike,,1800.0,8,Bikes
2,6,Viscount Mountain Bike,,635.0,5,Bikes
2,11,GT RTS-2 Mountain Bike,,1650.0,5,Bikes


In [8]:
%%sql

-- Step 3
-- Order details that contain such products

SELECT * 
FROM order_details
WHERE ProductNumber in (
    SELECT ProductNumber 
    FROM products NATURAL JOIN categories
    WHERE CategoryDescription = "Bikes")

 * mysql://dimitri:***@db.data-science-ust.net
909 rows affected.


OrderNumber,ProductNumber,QuotedPrice,QuantityOrdered
1,1,1200.0,2
3,1,1164.0,5
4,1,1200.0,4
5,1,1200.0,4
10,1,1200.0,2
11,1,1200.0,1
13,1,1200.0,2
14,1,1164.0,5
15,1,1200.0,2
17,1,1200.0,2


In [9]:
%%sql

-- Step 3
-- Order details that contain such products using JOINS

SELECT * 
FROM order_details NATURAL JOIN products NATURAL JOIN categories
WHERE CategoryDescription = "Bikes"

 * mysql://dimitri:***@db.data-science-ust.net
909 rows affected.


CategoryID,ProductNumber,OrderNumber,QuotedPrice,QuantityOrdered,ProductName,ProductDescription,RetailPrice,QuantityOnHand,CategoryDescription
2,1,1,1200.0,2,Trek 9000 Mountain Bike,,1200.0,6,Bikes
2,1,3,1164.0,5,Trek 9000 Mountain Bike,,1200.0,6,Bikes
2,1,4,1200.0,4,Trek 9000 Mountain Bike,,1200.0,6,Bikes
2,1,5,1200.0,4,Trek 9000 Mountain Bike,,1200.0,6,Bikes
2,1,10,1200.0,2,Trek 9000 Mountain Bike,,1200.0,6,Bikes
2,1,11,1200.0,1,Trek 9000 Mountain Bike,,1200.0,6,Bikes
2,1,13,1200.0,2,Trek 9000 Mountain Bike,,1200.0,6,Bikes
2,1,14,1164.0,5,Trek 9000 Mountain Bike,,1200.0,6,Bikes
2,1,15,1200.0,2,Trek 9000 Mountain Bike,,1200.0,6,Bikes
2,1,17,1200.0,2,Trek 9000 Mountain Bike,,1200.0,6,Bikes


In [10]:
%%sql

-- Step 4
-- Orders that contain such products 

SELECT * 
FROM orders 
WHERE OrderNumber IN (
    SELECT OrderNumber
    FROM order_details NATURAL JOIN products NATURAL JOIN categories
    WHERE CategoryDescription = "Bikes")

 * mysql://dimitri:***@db.data-science-ust.net
586 rows affected.


OrderNumber,OrderDate,ShipDate,CustomerID,EmployeeID
1,2017-09-02,2017-09-05,1018,707
3,2017-09-02,2017-09-05,1002,707
4,2017-09-02,2017-09-04,1009,703
5,2017-09-02,2017-09-02,1024,708
6,2017-09-02,2017-09-06,1014,702
10,2017-09-02,2017-09-05,1012,701
11,2017-09-03,2017-09-05,1020,706
13,2017-09-03,2017-09-03,1024,704
14,2017-09-03,2017-09-04,1013,704
15,2017-09-03,2017-09-07,1004,701


Why can we not convert this into a pure join query?
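One way to explore this question is to compare row counts: Step 3 matched 909 order details, while Step 4 returned 586 orders, so some orders contain more than one bike. The sketch below illustrates the effect on a hypothetical three-row example, again assuming an in-memory SQLite database as a stand-in for the MySQL server. The `IN` subquery returns each order once, while the join repeats an order for every matching detail row.

```python
import sqlite3

# Hypothetical miniature of orders/order_details in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (OrderNumber INTEGER PRIMARY KEY, CustomerID INTEGER);
    CREATE TABLE order_details (
        OrderNumber INTEGER,
        ProductNumber INTEGER,
        PRIMARY KEY (OrderNumber, ProductNumber)
    );
    INSERT INTO orders VALUES (1, 1018), (2, 1002);
    -- Order 1 contains two different products; order 2 contains one.
    INSERT INTO order_details VALUES (1, 1), (1, 2), (2, 1);
""")

# Subquery pattern: one row per order, however many detail rows match.
in_rows = conn.execute("""
    SELECT OrderNumber
    FROM orders
    WHERE OrderNumber IN (SELECT OrderNumber FROM order_details)
    ORDER BY OrderNumber
""").fetchall()

# Join pattern: one row per (order, detail) pair, so order 1 appears twice.
join_rows = conn.execute("""
    SELECT OrderNumber
    FROM orders NATURAL JOIN order_details
    ORDER BY OrderNumber
""").fetchall()

print(len(in_rows), len(join_rows))
```

A join can still produce the same list of orders if the duplicates are removed afterwards, for example with `SELECT DISTINCT` over the columns of `orders` only, but that is no longer a join alone.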

In [11]:
%%sql

-- Step 5
-- Customers on these orders

SELECT * 
FROM customers
WHERE CustomerID in (
    SELECT CustomerID    
    FROM orders 
    WHERE OrderNumber IN (
        SELECT OrderNumber
        FROM order_details NATURAL JOIN products NATURAL JOIN categories
        WHERE CategoryDescription = "Bikes"))

 * mysql://dimitri:***@db.data-science-ust.net
23 rows affected.


CustomerID,CustFirstName,CustLastName,CustStreetAddress,CustCity,CustState,CustZipCode,CustAreaCode,CustPhoneNumber
1018,David,Smith,311 20th Ave. N.E.,Fremont,CA,94538,510,555-2646
1002,William,Thompson,122 Spring River Drive,Duvall,WA,98019,425,555-2681
1009,Andrew,Cencini,507 - 20th Ave. E. Apt. 2A,Seattle,WA,98105,206,555-2601
1024,Mark,Rosales,323 Advocate Lane,El Paso,TX,79915,915,555-2286
1012,Liz,Keyser,13920 S.E. 40th Street,Bellevue,WA,98006,425,555-2556
1020,Joyce,Bonnicksen,2424 Thames Drive,Bellevue,WA,98006,425,555-2726
1013,Rachel,Patterson,2114 Longview Lane,San Diego,CA,92199,619,555-2546
1004,Robert,Brown,672 Lamont Ave,Houston,TX,77201,713,555-2491
1014,Sam,Abolrous,611 Alpine Drive,Palm Springs,CA,92263,760,555-2611
1027,Luke,Patterson,877 145th Ave SE,Portland,OR,97208,503,555-2316


These two types of queries, equi-joins (including natural joins) and subqueries in the `WHERE` clause, can be used to solve the majority of the problems in the assignment.
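As a closing illustration, the sketch below combines both patterns in the Step 5 query and stores the result in ordinary Python variables. It is only a sketch under stated assumptions: the database is an in-memory SQLite stand-in for the course server, the tables are reduced to a few columns, and the rows (including customer `9999`) are hypothetical.

```python
import sqlite3

# Hypothetical, reduced version of the shared_sales tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (CategoryID INTEGER PRIMARY KEY, CategoryDescription TEXT);
    CREATE TABLE products (ProductNumber INTEGER PRIMARY KEY, CategoryID INTEGER);
    CREATE TABLE customers (CustomerID INTEGER PRIMARY KEY, CustLastName TEXT);
    CREATE TABLE orders (OrderNumber INTEGER PRIMARY KEY, CustomerID INTEGER);
    CREATE TABLE order_details (OrderNumber INTEGER, ProductNumber INTEGER);
    INSERT INTO categories VALUES (1, 'Accessories'), (2, 'Bikes');
    INSERT INTO products VALUES (1, 2), (2, 2), (3, 1);
    INSERT INTO customers VALUES (1002, 'Thompson'), (1018, 'Smith'), (9999, 'Example');
    INSERT INTO orders VALUES (1, 1018), (2, 1002), (3, 9999), (4, 1018);
    INSERT INTO order_details VALUES (1, 1), (1, 2), (2, 1), (3, 3), (4, 2);
""")

# Natural joins in the innermost query, WHERE subqueries around it.
query = """
    SELECT CustomerID, CustLastName
    FROM customers
    WHERE CustomerID IN (
        SELECT CustomerID
        FROM orders
        WHERE OrderNumber IN (
            SELECT OrderNumber
            FROM order_details NATURAL JOIN products NATURAL JOIN categories
            WHERE CategoryDescription = ?))
    ORDER BY CustomerID
"""

# The category name is passed as a parameter rather than pasted into the SQL.
bike_customers = conn.execute(query, ("Bikes",)).fetchall()

# The rows are plain Python tuples and can be processed like any other data.
last_names = [last_name for _, last_name in bike_customers]
print(last_names)
```

Customer `9999` ordered only an accessory and is therefore excluded, while customer `1018` appears once even though two of their orders contain bikes.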