# GRIT: JOINs & Relationships - Day 4

**Learning Objectives**
- Understand table relationships and foreign keys
- Master INNER JOIN for matching data
- Use LEFT JOIN to include all records
- Combine multiple tables in single queries
- Avoid common JOIN pitfalls

**Why this matters**  
Real databases don't store everything in one table - information is split across multiple related tables. JOINs are the "glue" that connects this data together, letting you answer complex questions like "Which customers bought which products and when?"

Today you'll learn to combine data from multiple tables like a database architect!

## Setup: Connect to Our Database

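Let's connect to our e-commerce database. The SQL cells that follow assume they run through this extension; with ipython-sql, that means starting each SQL cell with the `%%sql` magic: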

In [None]:
# Load the SQL extension
%load_ext sql

# Connect to our sample database
%sql sqlite:///ecommerce.db

print("✅ Connected to database!")

## Theory: Understanding Table Relationships

### Why Multiple Tables?
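Imagine a library. Keeping each kind of fact in its own table means a borrower's address is stored once, not repeated on every loan: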
- **Books table**: Title, author, ISBN
- **Borrowers table**: Name, address, phone
- **Loans table**: Book ID, Borrower ID, due date

### Types of Relationships:
- **One-to-One**: One customer has one profile
- **One-to-Many**: One customer has many orders
- **Many-to-Many**: One order can contain many products, and one product can appear in many orders (linked through `order_items`)
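
To make the many-to-many case concrete, here is a minimal sketch using Python's built-in `sqlite3` module and an in-memory database. The tiny tables and sample rows are made up for illustration; only the table and column names are borrowed from this notebook, and the real `ecommerce.db` schema may differ.

```python
import sqlite3

# Hypothetical miniature schema (illustration only, not the real ecommerce.db)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders   (order_id INTEGER PRIMARY KEY);
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
    -- Junction table: one row per (order, product) pair
    CREATE TABLE order_items (
        order_id   INTEGER REFERENCES orders(order_id),
        product_id INTEGER REFERENCES products(product_id),
        quantity   INTEGER
    );
    INSERT INTO orders VALUES (1), (2);
    INSERT INTO products VALUES (10, 'Mouse'), (11, 'Keyboard');
    INSERT INTO order_items VALUES (1, 10, 1), (1, 11, 2), (2, 10, 3);
""")

# One order can hold many products...
products_in_order_1 = [row[0] for row in conn.execute("""
    SELECT p.product_name
    FROM order_items oi
    INNER JOIN products p ON oi.product_id = p.product_id
    WHERE oi.order_id = 1
    ORDER BY p.product_name
""")]

# ...and one product can appear in many orders.
orders_with_mouse = [row[0] for row in conn.execute("""
    SELECT oi.order_id
    FROM order_items oi
    WHERE oi.product_id = 10
    ORDER BY oi.order_id
""")]

print(products_in_order_1)
print(orders_with_mouse)
```

The junction table turns one many-to-many relationship into two one-to-many relationships, which is why the three-table queries later in this notebook always pass through `order_items`.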

### Foreign Keys:
Foreign keys link tables together:
- `orders.customer_id` → `customers.customer_id`
- `order_items.product_id` → `products.product_id`
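
As a rough sketch of what a foreign key does, the snippet below declares one in an in-memory SQLite database and tries to insert an order for a customer that does not exist. The two-column tables are simplified stand-ins, not the real schema. Note that SQLite only enforces foreign keys when `PRAGMA foreign_keys = ON` is set on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is enabled per connection
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
    );
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (100, 1);  -- valid: customer 1 exists
""")

try:
    # Customer 999 does not exist, so this row would be an orphan
    conn.execute("INSERT INTO orders VALUES (101, 999)")
    rejected = False
except sqlite3.IntegrityError as exc:
    rejected = True
    print("Rejected:", exc)

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print("Orders stored:", order_count)
```

Because every stored `orders.customer_id` is guaranteed to point at a real customer, an INNER JOIN between the two tables cannot silently drop orders.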

### JOIN Types:
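- **INNER JOIN**: Only rows that have a match in both tables
- **LEFT JOIN**: All rows from the left table, plus matches from the right (NULL where there is no match)
- **RIGHT JOIN**: All rows from the right table, plus matches from the left (NULL where there is no match)
- **FULL JOIN**: All rows from both tables, matched where possible (SQLite supports RIGHT and FULL JOIN from version 3.39.0)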
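
The difference between INNER and LEFT JOIN is easiest to see on a handful of rows. The sketch below uses made-up data in an in-memory database (three customers, one of whom has no orders) and runs the same join both ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Illustrative sample data: Cy has no orders
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total_amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (100, 1, 50.0), (101, 1, 20.0), (102, 2, 75.0);
""")

inner_rows = conn.execute("""
    SELECT c.first_name, o.order_id
    FROM customers c
    INNER JOIN orders o ON c.customer_id = o.customer_id
    ORDER BY c.customer_id, o.order_id
""").fetchall()

left_rows = conn.execute("""
    SELECT c.first_name, o.order_id
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    ORDER BY c.customer_id, o.order_id
""").fetchall()

print("INNER:", inner_rows)  # Cy is missing
print("LEFT: ", left_rows)   # Cy appears with order_id None (SQL NULL)
```

Swapping the two table names in the LEFT JOIN gives the same rows a RIGHT JOIN would, which is a common workaround on SQLite versions that lack RIGHT JOIN.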

## Examples: Basic INNER JOIN

INNER JOIN returns only rows that have matches in both tables:

In [None]:
-- Example 1: Basic customer-order relationship
SELECT c.first_name, c.last_name, o.order_id, o.order_date, o.total_amount
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
LIMIT 5;

In [None]:
-- Example 2: Products in orders
SELECT p.product_name, oi.quantity, oi.unit_price, oi.total_price
FROM products p
INNER JOIN order_items oi ON p.product_id = oi.product_id
LIMIT 5;

In [None]:
-- Example 3: Order details with customer info
SELECT o.order_id, o.order_date,
       c.first_name, c.last_name, c.city,
       o.total_amount, o.order_status
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
ORDER BY o.order_date DESC
LIMIT 10;

## Examples: LEFT JOIN (Include All Records)

LEFT JOIN keeps every row from the left table, even when there is no match; the right table's columns are NULL for those rows:

In [None]:
-- Example 4: All customers, with their orders (if any)
SELECT c.first_name, c.last_name, c.customer_status,
       o.order_id, o.order_date, o.total_amount
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
ORDER BY c.last_name
LIMIT 10;

In [None]:
-- Example 5: Customers who haven't ordered (their order columns come back as NULL)
SELECT c.first_name, c.last_name, c.registration_date,
       o.order_id
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;
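
The `LEFT JOIN ... WHERE ... IS NULL` pattern in Example 5 is often called an anti-join. As a sanity check, the sketch below runs it on made-up rows next to an equivalent `NOT EXISTS` query (subqueries are a Day 5 topic, so treat that half as a preview). Both should return the customers with no orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Illustrative sample data: only Ada has orders
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (100, 1), (101, 1);
""")

# Anti-join: keep only the left rows whose right side came back empty
via_left_join = conn.execute("""
    SELECT c.first_name
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    WHERE o.order_id IS NULL
    ORDER BY c.customer_id
""").fetchall()

# Same question asked with a correlated subquery
via_not_exists = conn.execute("""
    SELECT c.first_name
    FROM customers c
    WHERE NOT EXISTS (
        SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id
    )
    ORDER BY c.customer_id
""").fetchall()

print(via_left_join)
print(via_not_exists)
```

The NULL test should target a column that is never NULL in a real order row, such as the primary key `order_id`; testing a nullable column could match real orders by accident.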

In [None]:
-- Example 6: Product stock vs sales
SELECT p.product_name, p.stock_quantity,
       COALESCE(SUM(oi.quantity), 0) as total_sold
FROM products p
LEFT JOIN order_items oi ON p.product_id = oi.product_id
GROUP BY p.product_id, p.product_name, p.stock_quantity
ORDER BY total_sold DESC
LIMIT 10;

## Examples: Multiple Table JOINs

Combine three or more tables for complete information, or join and aggregate for per-customer and per-product summaries:

In [None]:
-- Example 7: Complete order details (3-table join)
SELECT c.first_name, c.last_name,
       o.order_id, o.order_date, o.total_amount,
       p.product_name, oi.quantity, oi.unit_price
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
INNER JOIN order_items oi ON o.order_id = oi.order_id
INNER JOIN products p ON oi.product_id = p.product_id
ORDER BY o.order_date DESC, o.order_id
LIMIT 15;

In [None]:
-- Example 8: Customer order summary
SELECT c.first_name, c.last_name, c.state,
       COUNT(o.order_id) as total_orders,
       SUM(o.total_amount) as total_spent,
       AVG(o.total_amount) as avg_order_value
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name, c.state
ORDER BY total_spent DESC NULLS LAST;

In [None]:
-- Example 9: Product sales performance
SELECT p.product_name, p.category, p.price,
       COUNT(oi.order_item_id) as times_ordered,
       SUM(oi.quantity) as total_quantity_sold,
       SUM(oi.total_price) as total_revenue
FROM products p
LEFT JOIN order_items oi ON p.product_id = oi.product_id
GROUP BY p.product_id, p.product_name, p.category, p.price
ORDER BY total_revenue DESC NULLS LAST;

## Examples: JOIN with WHERE Conditions

Combine JOINs with filtering for specific insights:

In [None]:
-- Example 10: High-value orders from California
SELECT c.first_name, c.last_name, c.city,
       o.order_id, o.order_date, o.total_amount
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
WHERE c.state = 'CA' AND o.total_amount > 100
ORDER BY o.total_amount DESC;

In [None]:
-- Example 11: Electronics sales by customer
SELECT c.first_name, c.last_name,
       p.product_name, p.category,
       oi.quantity, oi.total_price,
       o.order_date
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
INNER JOIN order_items oi ON o.order_id = oi.order_id
INNER JOIN products p ON oi.product_id = p.product_id
WHERE p.category = 'Electronics'
ORDER BY o.order_date DESC
LIMIT 10;
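
The examples above filter INNER JOINs, where a condition behaves the same in `ON` or `WHERE`. With a LEFT JOIN the placement matters: a `WHERE` condition on the right table discards the NULL rows and quietly turns the query into an INNER JOIN. The sketch below shows the difference on made-up data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Illustrative sample data: only Ada has an order above 100
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total_amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (100, 1, 150.0), (101, 1, 20.0), (102, 2, 75.0);
""")

# Filter in WHERE: rows where o.total_amount is NULL fail the test and vanish
where_filter = conn.execute("""
    SELECT c.first_name, o.order_id
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    WHERE o.total_amount > 100
    ORDER BY c.customer_id
""").fetchall()

# Filter in ON: every customer is kept; only the matching orders are attached
on_filter = conn.execute("""
    SELECT c.first_name, o.order_id
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
                      AND o.total_amount > 100
    ORDER BY c.customer_id
""").fetchall()

print("WHERE:", where_filter)
print("ON:   ", on_filter)
```

Conditions on the left table (such as `c.state = 'CA'`) are safe in `WHERE`, since the left table's columns are never NULL-filled by the join.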

## Examples: Advanced JOIN Patterns

More complex relationship queries:

In [None]:
-- Example 12: Customer lifetime value analysis
SELECT c.customer_id, c.first_name, c.last_name,
       c.registration_date,
       COUNT(DISTINCT o.order_id) as order_count,
       SUM(o.total_amount) as lifetime_value,
       AVG(o.total_amount) as avg_order_value,
       MAX(o.order_date) as last_order_date
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name, c.registration_date
ORDER BY lifetime_value DESC NULLS LAST;

In [None]:
-- Example 13: Product performance by category
SELECT p.category,
       COUNT(DISTINCT p.product_id) as products_offered,
       COUNT(oi.order_item_id) as total_sales,
       SUM(oi.total_price) as category_revenue,
       AVG(oi.total_price) as avg_sale_price
FROM products p
LEFT JOIN order_items oi ON p.product_id = oi.product_id
GROUP BY p.category
ORDER BY category_revenue DESC NULLS LAST;

In [None]:
-- Example 14: Monthly sales trend (single table, no JOIN needed; strftime is SQLite-specific)
SELECT strftime('%Y-%m', o.order_date) as month,
       COUNT(o.order_id) as orders_count,
       COUNT(DISTINCT o.customer_id) as unique_customers,
       SUM(o.total_amount) as monthly_revenue,
       AVG(o.total_amount) as avg_order_value
FROM orders o
GROUP BY strftime('%Y-%m', o.order_date)
ORDER BY month DESC;
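
Example 14 depends on `order_date` being stored as ISO-8601 text (`YYYY-MM-DD`), which is an assumption about `ecommerce.db`. The sketch below uses made-up rows to show the monthly grouping and what happens to a date in another format: `strftime` returns NULL for it, so the row lands in a NULL month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Illustrative sample data; order 103 has a date SQLite cannot parse
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                         order_date TEXT, total_amount REAL);
    INSERT INTO orders VALUES
        (100, 1, '2024-01-05', 50.0),
        (101, 1, '2024-01-20', 20.0),
        (102, 2, '2024-02-03', 75.0),
        (103, 2, '03/02/2024', 10.0);
""")

monthly = conn.execute("""
    SELECT strftime('%Y-%m', order_date) AS month,
           COUNT(order_id) AS orders_count,
           SUM(total_amount) AS monthly_revenue
    FROM orders
    GROUP BY month
    ORDER BY month DESC
""").fetchall()

for row in monthly:
    print(row)
```

A NULL month in the output is a hint that some dates need cleaning before the trend can be trusted.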

## Examples: Common JOIN Pitfalls & Solutions

Avoid these common mistakes:

In [None]:
-- Example 15: A customer repeats once per matching order item; DISTINCT collapses the repeats
SELECT DISTINCT c.first_name, c.last_name, c.city
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
INNER JOIN order_items oi ON o.order_id = oi.order_id
WHERE oi.quantity > 1
LIMIT 10;
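
Duplicate rows are more than a display problem: they also inflate aggregates. The sketch below, on made-up data, joins `orders` to `order_items` and shows that `SUM(o.total_amount)` counts an order's total once per item. DISTINCT on the output rows would not repair the sum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Illustrative sample data: order 100 has two items, order 101 has one
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total_amount REAL);
    CREATE TABLE order_items (order_item_id INTEGER PRIMARY KEY, order_id INTEGER, quantity INTEGER);
    INSERT INTO orders VALUES (100, 1, 50.0), (101, 1, 20.0);
    INSERT INTO order_items VALUES (1, 100, 1), (2, 100, 2), (3, 101, 1);
""")

# Two orders become three rows after the join (one per item)
joined_row_count = conn.execute("""
    SELECT COUNT(*)
    FROM orders o
    INNER JOIN order_items oi ON o.order_id = oi.order_id
""").fetchone()[0]

# Order 100's total is added twice: 50 + 50 + 20
inflated_total = conn.execute("""
    SELECT SUM(o.total_amount)
    FROM orders o
    INNER JOIN order_items oi ON o.order_id = oi.order_id
""").fetchone()[0]

# Without the extra join, each order is counted once: 50 + 20
correct_total = conn.execute("SELECT SUM(total_amount) FROM orders").fetchone()[0]

print(joined_row_count, inflated_total, correct_total)
```

When an order-level total is needed, the simplest fix is to leave item-level tables out of the join.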

In [None]:
-- Example 16: Proper NULL handling in LEFT JOIN
SELECT c.first_name, c.last_name,
       COALESCE(SUM(o.total_amount), 0) as total_spent,
       -- COUNT(o.order_id) is 0 only when the customer has no order rows at all
       CASE WHEN COUNT(o.order_id) = 0 THEN 'No orders' ELSE 'Has orders' END as has_orders
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name
ORDER BY total_spent DESC;
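
A related NULL trap with LEFT JOIN is the difference between `COUNT(*)` and `COUNT(column)`. The sketch below uses made-up data with one customer who has no orders: `COUNT(*)` counts the NULL-filled row and reports 1, while `COUNT(o.order_id)` skips NULLs and reports 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Illustrative sample data: Cy has no orders
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total_amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Cy');
    INSERT INTO orders VALUES (100, 1, 50.0), (101, 1, 20.0);
""")

rows = conn.execute("""
    SELECT c.first_name,
           COUNT(*)                         AS count_star,    -- counts the NULL-filled row too
           COUNT(o.order_id)                AS count_orders,  -- ignores NULLs
           SUM(o.total_amount)              AS raw_sum,       -- NULL when there are no orders
           COALESCE(SUM(o.total_amount), 0) AS total_spent    -- NULL replaced by 0
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    GROUP BY c.customer_id, c.first_name
    ORDER BY c.customer_id
""").fetchall()

for row in rows:
    print(row)
```

This is why the summary queries in this notebook count `o.order_id` or `oi.order_item_id` and avoid `COUNT(*)`.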

## Exercises

### Exercise 1: Basic INNER JOIN
Show all orders with customer names and order details

In [None]:
-- Your code here
SELECT c.first_name, c.last_name,
       o.order_id, o.order_date, o.total_amount, o.order_status
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
LIMIT 10;

### Exercise 2: LEFT JOIN
Show all customers and their total spending (use 0 for customers with no orders)

In [None]:
-- Your code here
SELECT c.first_name, c.last_name,
       COALESCE(SUM(o.total_amount), 0) as total_spent
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name
ORDER BY total_spent DESC;

### Exercise 3: Multiple Table JOIN
Show product names, quantities, and customer names for all order items

In [None]:
-- Your code here
SELECT p.product_name,
       oi.quantity,
       c.first_name, c.last_name
FROM products p
INNER JOIN order_items oi ON p.product_id = oi.product_id
INNER JOIN orders o ON oi.order_id = o.order_id
INNER JOIN customers c ON o.customer_id = c.customer_id
LIMIT 15;

### Exercise 4: JOIN with Filtering
Find all products that have been ordered, with their order details (the INNER JOIN itself filters out products that were never ordered)

In [None]:
-- Your code here
SELECT DISTINCT p.product_name, p.category, p.price,
       oi.quantity, oi.unit_price
FROM products p
INNER JOIN order_items oi ON p.product_id = oi.product_id
ORDER BY p.product_name;

### Exercise 5: Customer Analysis
Create a report showing customer order frequency and spending

In [None]:
-- Your code here
SELECT c.first_name, c.last_name, c.state,
       COUNT(o.order_id) as order_count,
       SUM(o.total_amount) as total_spent,
       AVG(o.total_amount) as avg_order_value
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name, c.state
ORDER BY total_spent DESC NULLS LAST;

### Exercise 6: Sales Performance
Show product sales performance across all categories

In [None]:
-- Your code here
SELECT p.category,
       COUNT(DISTINCT p.product_id) as products_count,
       COUNT(oi.order_item_id) as sales_count,
       SUM(oi.total_price) as category_revenue
FROM products p
LEFT JOIN order_items oi ON p.product_id = oi.product_id
GROUP BY p.category
ORDER BY category_revenue DESC NULLS LAST;

## Debug-Me Cell

This query should show customer order history but has a problem. Can you fix it?

In [None]:
-- Debug this query - it should show one row per customer order, but each order repeats once per item
SELECT c.first_name, c.last_name,
       o.order_id, o.order_date, o.total_amount
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
INNER JOIN order_items oi ON o.order_id = oi.order_id  -- Unnecessary join causing duplicates!
ORDER BY c.last_name, o.order_date DESC
LIMIT 10;

-- Hint: Remove the unnecessary JOIN to avoid duplicate rows!

## Takeaways & Further Reading

### JOIN Types Mastered:
✅ **INNER JOIN**: Only matching rows from both tables  
✅ **LEFT JOIN**: All rows from left table + matching rows from right  
✅ **Multiple JOINs**: Combine 3+ tables for complete relationships  
✅ **Foreign Keys**: Links that connect table relationships  

### Key Concepts:
- **Relationships**: One-to-one, one-to-many, many-to-many
- **NULL handling**: LEFT JOIN + COALESCE for missing data
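- **Duplicate avoidance**: Drop joins you don't need; use DISTINCT when a one-to-many join repeats rows
- **Performance**: Choose the JOIN type for correctness first; INNER JOIN can be faster than LEFT JOIN because the query planner has more freedom to reorder tables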

### Common Patterns:
- **Customer + Orders**: LEFT JOIN to include customers without orders
- **Orders + Items + Products**: INNER JOINs for complete order details
- **Sales Analysis**: LEFT JOIN products with sales for performance metrics

### SQL Best Practices:
- Use table aliases (c, o, p) for readability
- Choose INNER vs LEFT JOIN based on your needs
- Use COALESCE() for NULL handling
- Be careful with many-to-many relationships: joining through a junction table multiplies rows, which can inflate SUM and COUNT

### Tomorrow Preview:
Day 5: **Subqueries & CTEs** - Learn advanced querying techniques with nested queries and Common Table Expressions!

### Practice Resources:
- [SQL JOINs Tutorial](https://www.w3schools.com/sql/sql_join.asp)
- [JOIN Types Explained](https://www.sqlshack.com/sql-join-types-inner-join-left-join-right-join-full-join/)
- [Database Relationships](https://www.lucidchart.com/pages/database-diagram/database-design)

**Amazing! You can now connect data across multiple tables like a database expert! 🔗**