# Relationships Between Tables

## **Getting Started**

This week we will be learning how to pull data from multiple tables.

**Before beginning:** It is recommended that you open or download the _guitar store ERD_ that is available in GitHub so you can review it during the assignment.

## **Joins**

Joins are how we relate tables during query execution to obtain a specific result. Joins differ from relationships in that relationships exist to maintain data integrity between tables. The relationship that exists in the _guitar store_ database between _Categories_ and _Products_ ensures that no product lacks a corresponding record in the _Categories_ table. To return related data from these tables, we can use the JOIN clause in our SQL.

During this exercise, we will be using the _Products_ and _Categories_ tables. Run the below code cells (2) to see the structure and data in each table.

In [None]:
--Select all records from gs_products
SELECT *
FROM gs_products

In [None]:
--Select all records from gs_categories
SELECT *
FROM gs_categories

We can tell from the results of our queries and the _guitar store_ ERD that instead of storing the category name inside the products table, we store a category ID and use a relationship/join to look up that category ID in the categories table. There are two main reasons we do this:

- It takes more space to store the word "Basses" than the integer 2. Compare the raw bit patterns below. Instead of storing the word "Basses" every time that we need it, we can store it one time in the category table and use the ID it was assigned to pull it into our result.
    - 2 in binary: 10
    - "Basses" in binary: 01000010 01100001 01110011 01110011 01100101 01110011
    - So, the word "Basses" needs 48 bits while the number 2 needs only 2, a 24-fold difference in raw bits. In practice an integer column has a fixed width (commonly 4 bytes), so the real saving per row is smaller, but it adds up across many rows and grows with longer names.
- The other reason is that it helps us preserve data integrity. Due to referential integrity, anything we put in Products.CategoryID must have a matching value in Categories.CategoryID. This prevents erroneous values such as "Bassses" or "Bases" from sneaking in.
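To make the storage comparison above concrete, here is a minimal Python sketch that counts the bits in the text "Basses" and in the number 2. It assumes UTF-8 encoding for the text, and the 4-byte integer width is also an assumption (a common size for an integer column); real database engines choose their own storage formats.

```python
# Compare the raw size of the text "Basses" with the integer 2.
word = "Basses"
word_bytes = word.encode("utf-8")  # assumption: UTF-8, one byte per letter here

# Bit pattern of each character, eight bits per byte
word_bits = " ".join(f"{b:08b}" for b in word_bytes)
print(word_bits)

text_bit_count = len(word_bytes) * 8   # bits needed for the text
number_bit_count = (2).bit_length()    # minimum bits needed for the number 2
print(text_bit_count, number_bit_count)

# Assumption: a typical fixed-width integer column uses 4 bytes.
INT_COLUMN_BYTES = 4
ratio_vs_int_column = len(word_bytes) / INT_COLUMN_BYTES
print(ratio_vs_int_column)
```

The raw bit counts give the 24-fold figure, while the comparison against a fixed-width integer column gives a smaller ratio for a short name like this one.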

### **Inner Join**

Using the JOIN statement in SQL to pull data from multiple tables results in an _inner join._ An inner join will only return records/rows of the selected fields from tables where the joined fields are equal. For example, _Categories_ and _Products_ can be joined on _CategoryID_ since that field is a FK in the _Products_ table. In the below code cell we are selecting all records from both tables where the join fields are equal.

In [None]:
--Return all records from products where there is a match in categories
SELECT p.*, c.*
FROM gs_products as p 
JOIN gs_categories as c ON p.category_id = c.category_id

When running the above code cell we can see that the result returns all fields from both tables in the order that we entered them into the select statement. The below code cell contains a query that pulls in just a few fields from each table so that we can evaluate our join.

In [None]:
--Return specific columns from products where there is a match in categories
SELECT p.product_id, p.product_code, p.category_id AS products_category_id, c.category_id AS categories_category_id, c.category_name
FROM gs_products AS p
JOIN gs_categories AS c ON p.category_id = c.category_id

The above result shows us that we are bringing in data from the categories table where Categories.CategoryID matches Products.CategoryID. We know from our earlier example that Category ID 2 is "Basses", so "Basses" is returned as the CategoryName for any product with a 2 in the CategoryID. We can also use a WHERE clause to limit the data returned from the products table when joining on the categories table.

In [None]:
--Return all records from products where the product is a Bass
SELECT p.*
FROM gs_products AS p
JOIN gs_categories AS c ON p.category_id = c.category_id
WHERE c.category_name = 'Basses'

The above code cell returns all products where the category name is "Basses". Even though we aren't bringing any of the data from the categories table into our select statement, we can still reference it in our from clause through a join and filter on it with criteria in our where clause.
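The inner join behaviour described in this section can be tried on a tiny, self-contained dataset. The sketch below uses Python's built-in `sqlite3` module with hypothetical `demo_` tables and made-up rows (they are not the real `gs_` tables), so the results are small enough to check by eye.

```python
import sqlite3

# Hypothetical miniature tables and rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo_categories (
        category_id INTEGER PRIMARY KEY,
        category_name TEXT
    );
    CREATE TABLE demo_products (
        product_id INTEGER PRIMARY KEY,
        product_name TEXT,
        category_id INTEGER REFERENCES demo_categories (category_id)
    );
    INSERT INTO demo_categories VALUES (1, 'Guitars'), (2, 'Basses'), (3, 'Drums');
    INSERT INTO demo_products VALUES
        (1, 'Demo guitar', 1),
        (2, 'Demo bass A', 2),
        (3, 'Demo bass B', 2);
""")

# Inner join: only rows whose category_id exists in both tables are returned.
inner_rows = conn.execute("""
    SELECT p.product_name, c.category_name
    FROM demo_products AS p
    JOIN demo_categories AS c ON p.category_id = c.category_id
    ORDER BY p.product_id
""").fetchall()
print(inner_rows)

# Filter on the joined table without selecting any of its columns.
bass_rows = conn.execute("""
    SELECT p.product_name
    FROM demo_products AS p
    JOIN demo_categories AS c ON p.category_id = c.category_id
    WHERE c.category_name = 'Basses'
    ORDER BY p.product_id
""").fetchall()
print(bass_rows)
```

In this toy data the 'Drums' category has no products, so it never appears in the inner join result.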

✏️ **Practice**

Edit the below code cell so that the query returns all data from the products table for products in the _Guitars_ category by using a join.

In [None]:
SELECT *
FROM gs_products AS p

**Expected Result**



### **Left/Right Joins**

While inner joins return all records from both tables where there is a match in the joining fields, a left or right join will return all records from the specified table, and only matching records from the joined table. Using the next few code cells, we will build a query that shows all customers who have never placed an order using a left join. Before we do so, run the code cells (2) below to view the data and structure of the customers and orders tables.

In [None]:
--Return all records from customers
SELECT *
FROM gs_customers

In [None]:
--Return all records from orders
SELECT *
FROM gs_orders

We can see from the above queries that there are 486 customers and 41 orders. Let's start working on our "Customers without orders" query by creating a query that joins the two tables on customer\_id to see which customers have placed an order.

In [None]:
--Return all customers who appear in the orders table
SELECT c.*, o.*
FROM gs_customers as c
JOIN gs_orders as o ON c.customer_id = o.customer_id

The above code cell only returns 41 records, which makes sense since that is how many orders were placed, but we definitely aren't seeing all of our customers, since our first set of queries returned far more customers than that. We can use a left join to pull those customers back in. A left join tells the query to return all records from the table on the "left" side of the join statement, so we will leave customers on the left side for now.

In [None]:
--Use left join to show all customers as well as order information for customers that have ordered
SELECT c.*, o.*
FROM gs_customers as c
LEFT JOIN gs_orders as o ON c.customer_id = o.customer_id

We can see that the query is now showing information for all customers regardless of whether they placed an order, and there are a few duplicates for customers who placed multiple orders. Please scroll down to customer 37, and then scroll over to view the order data. Notice that all of their order data is NULL because they never placed an order. Since no orders were placed by these customers, there is nothing for the orders table to return here.

What would happen if we made this a right join instead?

In [None]:
--Use a right join to show all orders as well as customer information if there is a match
SELECT c.*, o.*
FROM gs_customers as c
RIGHT JOIN gs_orders as o ON c.customer_id = o.customer_id

We are back to seeing 41 records and we don't see as many Null values. This is because we are now showing all records on the right side of the join statement (orders) and only those on the left side that have a related record.
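One way to think about a right join is as a left join with the two tables swapped. The sketch below illustrates this with Python's built-in `sqlite3` module and hypothetical `demo_` tables holding made-up rows; it runs the mirrored left join (orders on the left) rather than `RIGHT JOIN` itself, because older SQLite builds do not support that keyword.

```python
import sqlite3

# Hypothetical miniature tables and rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo_customers (customer_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE demo_orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO demo_customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO demo_orders VALUES (10, 1), (11, 1), (12, 2);
""")

# Same rows as "demo_customers RIGHT JOIN demo_orders":
# every order is kept, plus customer data where there is a match.
mirrored_rows = conn.execute("""
    SELECT c.first_name, o.order_id
    FROM demo_orders AS o
    LEFT JOIN demo_customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
print(mirrored_rows)

# One row per order, so the row count equals the number of orders.
order_count = conn.execute("SELECT COUNT(*) FROM demo_orders").fetchone()[0]
print(len(mirrored_rows) == order_count)
```

The customer without any orders ('Cy' in this toy data) drops out, just as customers without orders drop out of the right join in the cell above.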

Putting together everything that we have learned, we now know enough to create a query that shows which customers haven't placed an order.

- Show all customers: use a left join between customers and orders
- Only show customers who don't have an order: set the criteria to NULL for order data

Let's incorporate this into our next query.

In [None]:
--Use left join to show only customers who have never placed an order
SELECT c.*, o.*
FROM gs_customers as c
LEFT JOIN gs_orders as o ON c.customer_id = o.customer_id
WHERE o.customer_id is NULL

By adding a where clause to only show Nulls from Orders.CustomerID we were able to filter out any customers that had related data in the orders table, or in other words, had placed an order. Now we know that there are 450 customers in our database that have yet to place an order.
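The left join and the NULL filter can be seen side by side on a tiny dataset. The following is a minimal sketch using Python's built-in `sqlite3` module; the `demo_` tables and their rows are hypothetical and only stand in for the customers and orders tables.

```python
import sqlite3

# Hypothetical miniature tables and rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo_customers (customer_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE demo_orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO demo_customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO demo_orders VALUES (10, 1), (11, 1), (12, 2);
""")

# Left join: every customer appears; order columns are NULL (None) without a match.
left_rows = conn.execute("""
    SELECT c.first_name, o.order_id
    FROM demo_customers AS c
    LEFT JOIN demo_orders AS o ON c.customer_id = o.customer_id
    ORDER BY c.customer_id, o.order_id
""").fetchall()
print(left_rows)

# Keep only the customers whose order columns came back NULL.
no_order_rows = conn.execute("""
    SELECT c.first_name
    FROM demo_customers AS c
    LEFT JOIN demo_orders AS o ON c.customer_id = o.customer_id
    WHERE o.customer_id IS NULL
""").fetchall()
print(no_order_rows)
```

'Ana' appears twice in the first result because she has two orders in this toy data, which mirrors the duplicates seen for customers with multiple orders.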

✏️ **Practice**

Edit the below query to show all customers who do not have an address in the addresses table.

In [None]:
SELECT c.*
FROM gs_customers AS c

## **Junction/Join Tables**

Junction, or join, tables are a special type of table that helps us resolve a many-to-many relationship. In the _guitar store_ example we have a many-to-many relationship between products and orders, because an individual product can be part of many orders and an order can be made up of many products. For example, Order #31 may contain products 1 and 6. At first this looks like a one-to-many relationship; however, Order #41 contains products 6 and 8. Product 6 appears in more than one order, and each order contains more than one product, which makes this a many-to-many relationship. To resolve this, we use a junction table. This junction table, _OrderItems_, lists the OrderID and each associated ProductID in their own row so that we can query them. If we run the two code cells below, we can see that there are no fields that we can use to join Products and Orders directly. They have no common data.

In [None]:
SELECT *
FROM gs_products AS p

In [None]:
SELECT *
FROM gs_orders as o

Now, if we look at the _OrderItems_ table, we can see that it contains information from both tables. It shows which products (product\_id) were part of which orders (order\_id).

In [None]:
SELECT *
FROM gs_order_items AS oi

This table also includes useful information such as what the discount on the product was at the time of the order, the price that was paid, and the quantity purchased. We can think of this as a transaction table that is storing transaction level data for each product in each order. It is the most granular stage of our data.

If we want to connect products to orders, then we will need to join each of our products and orders tables to the OrderItems table. This can be done by using multiple join statements like in the cell below. At the end of one join statement, we just start the next one. The below code cell shows the productID and orderID for each product/order pairing. If you scroll up, you will notice that these are the same combinations as in the OrderItems table, but we are now pulling them from the Products and Orders tables instead!

In [None]:
SELECT p.product_id, o.order_id
FROM gs_products as p
JOIN gs_order_items as oi ON p.product_id = oi.product_id
JOIN gs_orders as o ON oi.order_id = o.order_id
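The chained joins through a junction table can be sketched on a small dataset as well. This example uses Python's built-in `sqlite3` module; the `demo_` tables are hypothetical, the order and product IDs reuse the Order #31 and Order #41 example from the text, and the product names and dates are made up.

```python
import sqlite3

# Hypothetical miniature tables and rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo_products (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE demo_orders (order_id INTEGER PRIMARY KEY, order_date TEXT);
    CREATE TABLE demo_order_items (
        item_id INTEGER PRIMARY KEY,
        order_id INTEGER REFERENCES demo_orders (order_id),
        product_id INTEGER REFERENCES demo_products (product_id)
    );
    INSERT INTO demo_products VALUES (1, 'Demo guitar'), (6, 'Demo bass'), (8, 'Demo drums');
    INSERT INTO demo_orders VALUES (31, '2016-04-01'), (41, '2016-05-01');
    INSERT INTO demo_order_items VALUES (1, 31, 1), (2, 31, 6), (3, 41, 6), (4, 41, 8);
""")

# Products and orders share no column, so both are joined to the junction table.
pairs = conn.execute("""
    SELECT o.order_id, p.product_name
    FROM demo_products AS p
    JOIN demo_order_items AS oi ON p.product_id = oi.product_id
    JOIN demo_orders AS o ON oi.order_id = o.order_id
    ORDER BY o.order_id, p.product_id
""").fetchall()
print(pairs)
```

Each row of the junction table produces exactly one product/order pairing, which is why 'Demo bass' shows up under both orders.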

**✏️ Practice**

Edit the below code cell so that you are returning the name of each product, as well as the customerID of the customer who placed the order, and the OrderDate.

In [None]:
SELECT p.product_id
FROM gs_products as p
JOIN gs_order_items as oi ON p.product_id = oi.product_id

## **Exercises**

If you have not already done so, I would strongly recommend having the ERD linked in Canvas open while you work through the exercises.

1\. Using the below code cell, return the order date, ship amount, and ship date for every order, as well as the name of the customer that placed the order.

**First 5 rows of correct solution**
|order_date|ship_amount|ship_date|first_name|last_name|
|---|---|---|---|---|
|2016-03-28 09:40:28.000|5.00|2016-03-31 09:41:11.000|Allan|Sherwood|
|2016-03-28 11:23:20.000|5.00|2016-03-31 11:24:03.000|Barry|Zimmer|
|2016-03-29 09:44:58.000|10.00|2016-04-01 09:45:41.000|Allan|Sherwood|
|2016-03-30 15:22:31.000|10.00|2016-04-02 15:23:14.000|Christine|Brown|
|2016-03-31 05:43:11.000|5.00|2016-04-03 05:43:54.000|David|Goldstein|

In [3]:
--Insert your code below this line. You can make your own comments by using two hyphens
SELECT 
    gs_orders.order_date,
    gs_orders.ship_amount,
    gs_orders.ship_date,
    gs_customers.first_name,
    gs_customers.last_name
FROM gs_orders
INNER JOIN gs_customers
    ON gs_orders.customer_id = gs_customers.customer_id
ORDER BY gs_orders.order_date;






2\. Using the below code cell, return the first name, last name, and email address for each customer who has placed an order and has a Yahoo email address.

**First 5 rows of correct solution**
|first_name|last_name|email_address|
|---|---|---|
|Allan|Sherwood|allan.sherwood@yahoo.com|
|Allan|Sherwood|allan.sherwood@yahoo.com|
|Gary|Hernandez|gary_hernandez@yahoo.com|
|Mitsue|Tollner|mitsue_tollner@yahoo.com|
|Minna|Amigon|minna_amigon@yahoo.com|

In [4]:
--Insert your code below this line. You can make your own comments by using two hyphens
SELECT 
    c.first_name,
    c.last_name,
    c.email_address AS email_address
FROM gs_customers AS c
INNER JOIN gs_orders AS o
    ON c.customer_id = o.customer_id
WHERE c.email_address LIKE '%@yahoo.com'
ORDER BY c.first_name, c.last_name



3\. Using the below code cell, return the order\_id and order date for items ordered by Kris Marrier.

**First 5 rows of correct solution**

| order\_id | order\_date |
| --- | --- |
| 20 | 2016-04-10 09:33:23.000 |
| 32 | 2016-05-01 01:23:23.000 |

In [None]:
--Return the order_id and order date for orders placed by Kris Marrier
SELECT 
    o.order_id,
    o.order_date
FROM gs_customers AS c
INNER JOIN gs_orders AS o
    ON c.customer_id = o.customer_id
WHERE c.first_name = 'Kris'
  AND c.last_name = 'Marrier'
ORDER BY o.order_date;


4\. Expand the query from exercise 3 to also include the product name, price, and quantity. You will have to join a total of four tables.

**First 5 rows of correct solution**
|order_id|order_date|product_name|item_price|quantity|
|---|---|---|---|---|
|20|2016-04-10 09:33:23.000|Yamaha FG700S|799.99|1|
|32|2016-05-01 01:23:23.000|Fender Stratocaster|2517.00|1|


In [None]:
--Insert your code below this line. You can make your own comments by using two hyphens
SELECT 
    o.order_id,
    o.order_date,
    p.product_name,
    oi.item_price,
    oi.quantity
FROM gs_customers AS c
INNER JOIN gs_orders AS o
    ON c.customer_id = o.customer_id
INNER JOIN gs_order_items AS oi
    ON o.order_id = oi.order_id
INNER JOIN gs_products AS p
    ON oi.product_id = p.product_id
WHERE c.first_name = 'Kris'
  AND c.last_name = 'Marrier'
ORDER BY o.order_date;


## **Scenario**

The owner of Guitar Store wants you to create a query that shows all of the information needed for creating a batch of invoices. Your query should contain the Category Name, Product Name, Description, List Price, Item Price, Discount Amount, Quantity, OrderDate, Shipping and Tax costs, Customer name, Customer Shipping address, and their email.

Create your query in the code cell below:

**First 5 rows of correct solution**
|category_name|product_name|product_description|list_price|item_price|discount_amount|quantity|order_date|ship_amount|tax_amount|first_name|last_name|Line1|line2|city|state_code|zip_code|email_address|
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|Guitars|Gibson Les Paul|This Les Paul guitar offers a carved top and humbucking pickups. It has a simple yet elegant design. Cutting-yet-rich tone?the hallmark of the Les Paul?pours out of the 490R and 498T Alnico II magnet humbucker pickups, which are mounted on a carved maple top with a mahogany back. The faded finish models are equipped with BurstBucker Pro pickups and a mahogany top. This guitar includes a Gibson hardshell case (Faded and satin finish models come with a gig bag) and a limited lifetime warranty.\r\n\r\nFeatures:\r\n\r\n* Carved maple top and mahogany back (Mahogany top on faded finish models)\r\n* Mahogany neck, &#39;59 Rounded Les Paul\r\n* Rosewood fingerboard (Ebony on Alpine white)\r\n* Tune-O-Matic bridge with stopbar\r\n* Chrome or gold hardware\r\n* 490R and 498T Alnico 2 magnet humbucker pickups (BurstBucker Pro on faded finish models)\r\n* 2 volume and 2 tone knobs, 3-way switch|1199.00|1199.00|359.70|1|2016-03-28 09:40:28.000|5.00|58.75|Allan|Sherwood|100 East Ridgewood Ave.||Paramus|NJ|07652|allan.sherwood@yahoo.com|
|Basses|Hofner Icon|With authentic details inspired by the original, the Hofner Icon makes the legendary violin bass available to the rest of us. Don&#39;t get the idea that this a just a &quot;nowhere man&quot; look-alike. This quality instrument features a real spruce top and beautiful flamed maple back and sides. The semi-hollow body and set neck will give you the warm, round tone you expect from the violin bass.\r\n\r\nFeatures:\r\n\r\n* Authentic details inspired by the original\r\n* Spruce top\r\n* Flamed maple back and sides\r\n* Set neck\r\n* Rosewood fretboard\r\n* 30&quot; scale\r\n* 22 frets\r\n* Dot inlay|499.99|489.99|186.20|1|2016-03-28 11:23:20.000|5.00|21.27|Barry|Zimmer|16285 Wendell St.||Omaha|NE|68135|barryz@gmail.com|
|Guitars|Fender Stratocaster|The Fender Stratocaster is the electric guitar design that changed the world. New features include a tinted neck, parchment pickguard and control knobs, and a &#39;70s-style logo. Includes select alder body, 21-fret maple neck with your choice of a rosewood or maple fretboard, 3 single-coil pickups, vintage-style tremolo, and die-cast tuning keys. This guitar features a thicker bridge block for increased sustain and a more stable point of contact with the strings. At this low price, why play anything but the real thing?\r\n\r\nFeatures:\r\n\r\n* New features:\r\n* Thicker bridge block\r\n* 3-ply parchment pick guard\r\n* Tinted neck|699.00|2517.00|1308.84|1|2016-03-29 09:44:58.000|10.00|102.29|Allan|Sherwood|100 East Ridgewood Ave.||Paramus|NJ|07652|allan.sherwood@yahoo.com|
|Drums|Ludwig 5-piece Drum Set with Cymbals|This product includes a Ludwig 5-piece drum set and a Zildjian starter cymbal pack.\r\n\r\nWith the Ludwig drum set, you get famous Ludwig quality. This set features a bass drum, two toms, a floor tom, and a snare?each with a wrapped finish. Drum hardware includes LA214FP bass pedal, snare stand, cymbal stand, hi-hat stand, and a throne.\r\n\r\nWith the Zildjian cymbal pack, you get a 14&quot; crash, 18&quot; crash/ride, and a pair of 13&quot; hi-hats. Sound grooves and round hammer strikes in a simple circular pattern on the top surface of these cymbals magnify the basic sound of the distinctive alloy.\r\n\r\nFeatures:\r\n\r\n* Famous Ludwig quality\r\n* Wrapped finishes\r\n* 22&quot; x 16&quot; kick drum\r\n* 12&quot; x 10&quot; and 13&quot; x 11&quot; toms\r\n* 16&quot; x 16&quot; floor tom\r\n* 14&quot; x 6-1/2&quot; snare drum kick pedal\r\n* Snare stand\r\n* Straight cymbal stand hi-hat stand\r\n* FREE throne|699.99|415.00|161.85|1|2016-03-29 09:44:58.000|10.00|102.29|Allan|Sherwood|100 East Ridgewood Ave.||Paramus|NJ|07652|allan.sherwood@yahoo.com|
|Guitars|Gibson Les Paul|This Les Paul guitar offers a carved top and humbucking pickups. It has a simple yet elegant design. Cutting-yet-rich tone?the hallmark of the Les Paul?pours out of the 490R and 498T Alnico II magnet humbucker pickups, which are mounted on a carved maple top with a mahogany back. The faded finish models are equipped with BurstBucker Pro pickups and a mahogany top. This guitar includes a Gibson hardshell case (Faded and satin finish models come with a gig bag) and a limited lifetime warranty.\r\n\r\nFeatures:\r\n\r\n* Carved maple top and mahogany back (Mahogany top on faded finish models)\r\n* Mahogany neck, &#39;59 Rounded Les Paul\r\n* Rosewood fingerboard (Ebony on Alpine white)\r\n* Tune-O-Matic bridge with stopbar\r\n* Chrome or gold hardware\r\n* 490R and 498T Alnico 2 magnet humbucker pickups (BurstBucker Pro on faded finish models)\r\n* 2 volume and 2 tone knobs, 3-way switch|1199.00|1199.00|359.70|2|2016-03-30 15:22:31.000|10.00|117.50|Christine|Brown|19270 NW Cornell Rd.||Beaverton|OR|97006|christineb@solarone.com|


In [None]:
--Insert your code below this line. You can make your own comments by using two hyphens
SELECT 
    cat.category_name,
    p.product_name,
    p.product_description AS product_description,
    p.list_price,
    oi.item_price,
    oi.discount_amount,
    oi.quantity,
    o.order_date,
    o.ship_amount,
    o.tax_amount,
    c.first_name,
    c.last_name,
    a.line1,
    a.line2,
    a.city,
    a.state_code,
    a.zip_code,
    c.email_address AS email_address
FROM gs_orders AS o
INNER JOIN gs_customers AS c
    ON o.customer_id = c.customer_id
INNER JOIN gs_order_items AS oi
    ON o.order_id = oi.order_id
INNER JOIN gs_products AS p
    ON oi.product_id = p.product_id
INNER JOIN gs_categories AS cat
    ON p.category_id = cat.category_id
INNER JOIN gs_addresses AS a
    ON o.ship_address_id = a.address_id
ORDER BY o.order_date, p.product_name;
