# Notebook 05: Advanced Joins

## Learning Objectives

- Use FULL OUTER JOIN to keep every row from both tables, matched or not
- Use SELF JOIN to compare rows within a table
- Use CROSS JOIN for Cartesian products
- Find unmatched records between tables
- Use non-equi joins (joins on conditions other than equality, such as <, >, or BETWEEN)

In [None]:
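import os
import sys
from pathlib import Path

import duckdb

# Resolve the project root whether the notebook runs from notebooks/ or from the root.
project_root = Path.cwd().parent if Path.cwd().name == 'notebooks' else Path.cwd()
# Make src/ importable before loading the exercise checker.
sys.path.insert(0, str(project_root / 'src'))
from sql_exercises import check

# Set the notebook name in the environment for sql_exercises.
os.environ['SQL_NOTEBOOK_NAME'] = '05_joins_advanced'
# Open the practice database read-only so exercise queries cannot modify it.
conn = duckdb.connect(str(project_root / 'data' / 'databases' / 'practice.duckdb'), read_only=True)
print("Setup complete!")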

## Quick Reference

```sql
-- FULL OUTER JOIN: All rows from both tables, NULLs where no match exists
SELECT * FROM a FULL OUTER JOIN b ON a.id = b.a_id;

-- SELF JOIN: Join a table to itself (inner join, so employees without a manager are dropped)
SELECT e.name, m.name AS manager_name
FROM employees e
JOIN employees m ON e.manager_id = m.employee_id;

-- CROSS JOIN: Every combination of rows (Cartesian product)
SELECT * FROM a CROSS JOIN b;
```
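
The reference above does not show the pattern for finding unmatched records. Below is a minimal sketch, assuming two hypothetical toy tables (`authors` and `books`, which are not part of the practice database). It uses Python's built-in `sqlite3` with an in-memory database so it runs without the course data; the same SQL is expected to work in DuckDB as well.

```python
import sqlite3

# Hypothetical toy data for illustration only, not part of practice.duckdb.
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE authors (author_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (book_id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO books VALUES (10, 1, 'Notes'), (11, 1, 'Engines'), (12, 3, 'Kernels');
""")

# LEFT JOIN keeps every author. Authors with no book get NULLs in the
# books columns, so filtering on a NULL book key leaves only the unmatched rows.
authors_without_books = demo.execute("""
    SELECT a.author_id, a.name
    FROM authors a
    LEFT JOIN books b ON b.author_id = a.author_id
    WHERE b.book_id IS NULL
    ORDER BY a.author_id
""").fetchall()

print(authors_without_books)
demo.close()
```

The filter tests a column from the right-hand table that is never NULL in a real row (here the primary key), so a NULL can only mean "no match".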

---
## Exercise 1: SELF JOIN - Employee and Manager (Easy)

**Problem:** List each employee with their manager's name.

Return columns: employee_id, employee_name (first + last), manager_name (first + last)

**Hint:** Join employees to itself using manager_id

In [None]:
ex_01 = '''

'''
conn.execute(ex_01).fetchdf()

In [None]:
check("ex_01", ex_01)

---
## Exercise 2: SELF JOIN with LEFT JOIN (Easy)

**Problem:** List ALL employees with their manager's name. Include employees without a manager, showing NULL as the manager name.

Return columns: employee_id, employee_name, manager_name

In [None]:
ex_02 = '''

'''
conn.execute(ex_02).fetchdf()

In [None]:
check("ex_02", ex_02)

---
## Exercise 3: Find Employees Without a Manager (Easy)

**Problem:** Find all employees who don't have a manager (top-level employees).

Return columns: employee_id, first_name, last_name, job_title

In [None]:
ex_03 = '''

'''
conn.execute(ex_03).fetchdf()

In [None]:
check("ex_03", ex_03)

---
## Exercise 4: Category Hierarchy (Medium)

**Problem:** List all categories with their parent category names.

Return columns: category_id, category_name, parent_category_name

**Tables:** categories (has parent_category_id)

**Hint:** Self join on parent_category_id

In [None]:
ex_04 = '''

'''
conn.execute(ex_04).fetchdf()

In [None]:
check("ex_04", ex_04)

---
## Exercise 5: Find Employees in Same Department (Medium)

**Problem:** Find pairs of employees who work in the same department. List each pair only once: if (A, B) appears, (B, A) must not, and no employee should be paired with themselves.

Return columns: emp1_name, emp2_name, department_id

**Hint:** Self join with e1.employee_id < e2.employee_id to avoid duplicates
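
To see why the `<` comparison removes duplicates, here is a small sketch on a hypothetical `players` table (not part of the practice database), again using in-memory `sqlite3` so it runs on its own. Joining on equality of `team` alone would return every ordering of each pair plus each player paired with themselves; adding `p1.player_id < p2.player_id` keeps exactly one ordering.

```python
import sqlite3

# Hypothetical toy data for illustration only.
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE players (player_id INTEGER PRIMARY KEY, name TEXT, team TEXT);
    INSERT INTO players VALUES
        (1, 'Ana', 'red'), (2, 'Ben', 'red'), (3, 'Cy', 'red'), (4, 'Dee', 'blue');
""")

# Same team AND a strictly smaller id on the left side:
# each unordered pair appears once and self-pairs are excluded.
pairs = demo.execute("""
    SELECT p1.name, p2.name, p1.team
    FROM players p1
    JOIN players p2
      ON p1.team = p2.team
     AND p1.player_id < p2.player_id
    ORDER BY p1.player_id, p2.player_id
""").fetchall()

print(pairs)
demo.close()
```

Three players on the same team yield three pairs, and the single player on the other team yields none.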

In [None]:
ex_05 = '''

'''
conn.execute(ex_05).fetchdf().head(20)

In [None]:
check("ex_05", ex_05)

---
## Exercise 6: FULL OUTER JOIN (Medium)

**Problem:** Show all customers and all orders, including customers with no orders and orders with no matching customer (if any).

Return columns: customer_id, first_name, order_id, total_amount

**Tables:** customers, orders

In [None]:
ex_06 = '''

'''
conn.execute(ex_06).fetchdf()

In [None]:
check("ex_06", ex_06)

---
## Exercise 7: Customers Without Orders (Hard)

**Problem:** Find customers who have never placed an order.

Return columns: customer_id, first_name, last_name, email

In [None]:
ex_07 = '''

'''
conn.execute(ex_07).fetchdf()

In [None]:
check("ex_07", ex_07)

---
## Exercise 8: Employees Earning More Than Their Manager (Hard)

**Problem:** Find employees who earn more than their direct manager.

Return columns: employee_id, employee_name, employee_salary, manager_name, manager_salary

In [None]:
ex_08 = '''

'''
conn.execute(ex_08).fetchdf()

In [None]:
check("ex_08", ex_08)

---
## Summary

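- **SELF JOIN** - Compare rows within the same table
- **FULL OUTER JOIN** - All rows from both tables, with NULLs where there is no match
- **CROSS JOIN** - Every combination of rows from both tables
- **Finding unmatched** - LEFT JOIN + WHERE the right-side key IS NULL
- **Non-equi joins** - Join on <, >, BETWEEN, etc.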
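
As a brief illustration of a non-equi join with BETWEEN, here is a sketch that assumes two hypothetical tables, `scores` and `grade_bands` (neither is in the practice database), and uses in-memory `sqlite3` so it is self-contained. Each score is matched to the band whose range contains it, rather than to a row with an equal key.

```python
import sqlite3

# Hypothetical toy data for illustration only.
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE scores (student TEXT, score INTEGER);
    CREATE TABLE grade_bands (grade TEXT, min_score INTEGER, max_score INTEGER);
    INSERT INTO scores VALUES ('Ana', 91), ('Ben', 78), ('Cy', 55);
    INSERT INTO grade_bands VALUES ('A', 90, 100), ('B', 75, 89), ('C', 0, 74);
""")

# The join condition is a range test, not an equality test.
graded = demo.execute("""
    SELECT s.student, s.score, g.grade
    FROM scores s
    JOIN grade_bands g
      ON s.score BETWEEN g.min_score AND g.max_score
    ORDER BY s.student
""").fetchall()

print(graded)
demo.close()
```

Because the bands do not overlap, each score matches exactly one band; overlapping ranges would produce one output row per matching band.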

### Next: Notebook 06 - Subqueries

In [None]:
conn.close()