# Notebook 07: Set Operations

## Learning Objectives
- Use UNION to combine result sets
- Understand UNION vs UNION ALL
- Use INTERSECT to find common rows
- Use EXCEPT to find differences

In [None]:
import os
import sys
from pathlib import Path

project_root = Path.cwd().parent if Path.cwd().name == "notebooks" else Path.cwd()
sys.path.insert(0, str(project_root / "src"))
import duckdb
from sql_exercises import check

os.environ["SQL_NOTEBOOK_NAME"] = "07_set_operations"
conn = duckdb.connect(
    str(project_root / "data" / "databases" / "practice.duckdb"), read_only=True
)
print("Setup complete!")

## Quick Reference
```sql
-- UNION: Combine and remove duplicates
SELECT col FROM a UNION SELECT col FROM b;

-- UNION ALL: Combine keeping duplicates
SELECT col FROM a UNION ALL SELECT col FROM b;

-- INTERSECT: Only rows that appear in both results (duplicates removed)
SELECT col FROM a INTERSECT SELECT col FROM b;

-- EXCEPT: Rows in first but not second
SELECT col FROM a EXCEPT SELECT col FROM b;
```
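
The four operators above can be compared side by side on a tiny data set. The sketch below is only an illustration: the tables `a` and `b` and their contents are made up, and it uses Python's built-in `sqlite3` with an in-memory database instead of this notebook's DuckDB connection, so it runs without `practice.duckdb`. For these simple queries the set-operation syntax is the same in both engines.

```python
import sqlite3

# Hypothetical toy tables; SQLite stands in for DuckDB in this sketch.
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE a (col TEXT);
    CREATE TABLE b (col TEXT);
    INSERT INTO a VALUES ('x'), ('x'), ('y');
    INSERT INTO b VALUES ('y'), ('z');
""")


def run(sql):
    """Run a query and return its single column as a sorted list."""
    return sorted(row[0] for row in demo.execute(sql))


union_rows = run("SELECT col FROM a UNION SELECT col FROM b")
union_all_rows = run("SELECT col FROM a UNION ALL SELECT col FROM b")
intersect_rows = run("SELECT col FROM a INTERSECT SELECT col FROM b")
except_rows = run("SELECT col FROM a EXCEPT SELECT col FROM b")

print(union_rows)      # ['x', 'y', 'z']
print(union_all_rows)  # ['x', 'x', 'y', 'y', 'z']
print(intersect_rows)  # ['y']
print(except_rows)     # ['x']
demo.close()
```

Only `UNION ALL` keeps the repeated `'x'` and `'y'`; the other three operators return each distinct value once.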

---
## Exercise 1: Basic UNION (Easy)

**Problem:** Get a combined list of all first names from both the `employees` and `customers` tables.

Return: One column named `first_name` with unique values only

In [None]:
ex_01 = """

"""
conn.execute(ex_01).fetchdf()

In [None]:
check("ex_01", ex_01)

---
## Exercise 2: UNION ALL (Easy)

**Problem:** Get ALL email addresses from both the `employees` and `customers` tables, keeping any duplicates.

Return: One column named `email`

In [None]:
ex_02 = """

"""
conn.execute(ex_02).fetchdf()

In [None]:
check("ex_02", ex_02)

---
## Exercise 3: UNION with Different Sources (Medium)

**Problem:** Create a contact list with name, email, and source ('employee' or 'customer').

Return columns: full_name, email, source
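
One way to record where each row came from is to add a constant string column to every branch of the UNION. The sketch below shows the idea on made-up tables (`staff` and `clients` are hypothetical, not part of the practice database) and uses the built-in `sqlite3` module so it runs standalone. It does not build a full name or select an email; it only illustrates the label column.

```python
import sqlite3

# Hypothetical tables used only to illustrate a constant label column.
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE staff (name TEXT);
    CREATE TABLE clients (name TEXT);
    INSERT INTO staff VALUES ('Ana');
    INSERT INTO clients VALUES ('Ana'), ('Bo');
""")

# Each branch supplies its own literal for the 'origin' column.
labeled = demo.execute("""
    SELECT name, 'staff' AS origin FROM staff
    UNION ALL
    SELECT name, 'client' AS origin FROM clients
    ORDER BY origin, name
""").fetchall()

print(labeled)  # [('Ana', 'client'), ('Bo', 'client'), ('Ana', 'staff')]
demo.close()
```

Because the label differs between branches, a row for `'Ana'` appears once per source even if plain `UNION` were used instead of `UNION ALL`.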

In [None]:
ex_03 = """

"""
conn.execute(ex_03).fetchdf()

In [None]:
check("ex_03", ex_03)

---
## Exercise 4: INTERSECT (Medium)

**Problem:** Find first names that appear in both employees and customers tables.

Return: One column `first_name`

In [None]:
ex_04 = """

"""
conn.execute(ex_04).fetchdf()

In [None]:
check("ex_04", ex_04)

---
## Exercise 5: EXCEPT (Medium)

**Problem:** Find customer first names that are NOT used by any employee.

Return: One column `first_name`

In [None]:
ex_05 = """

"""
conn.execute(ex_05).fetchdf()

In [None]:
check("ex_05", ex_05)

---
## Exercise 6: UNION with ORDER BY (Medium)

**Problem:** Combine department locations and customer address cities into one list, sorted alphabetically.

Return: One column `city` (from departments.location and addresses.city)
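
Two details matter when the branches of a UNION use different column names: the result takes its column names from the first SELECT, and a trailing ORDER BY sorts the combined result rather than only the last branch. The sketch below demonstrates both on made-up tables (`offices` and `shops` are hypothetical) with the built-in `sqlite3` module; DuckDB is assumed to behave the same way for this simple case.

```python
import sqlite3

# Hypothetical tables whose city-like columns have different names.
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE offices (site TEXT);
    CREATE TABLE shops (town TEXT);
    INSERT INTO offices VALUES ('Oslo'), ('Bergen');
    INSERT INTO shops VALUES ('Cairo'), ('Oslo');
""")

# The alias in the first SELECT names the output column;
# ORDER BY comes last and applies to the whole combined result.
cursor = demo.execute("""
    SELECT site AS place FROM offices
    UNION
    SELECT town FROM shops
    ORDER BY place
""")
column_names = [d[0] for d in cursor.description]
places = [row[0] for row in cursor.fetchall()]

print(column_names)  # ['place']
print(places)        # ['Bergen', 'Cairo', 'Oslo']
demo.close()
```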

In [None]:
ex_06 = """

"""
conn.execute(ex_06).fetchdf()

In [None]:
check("ex_06", ex_06)

---
## Exercise 7: Products Ordered vs Not Ordered (Hard)

**Problem:** Create a report showing each `product_id` with a status of 'ordered' or 'not_ordered'.

Return columns: product_id, status

**Hint:** Use UNION to combine the products that appear in `order_items` (status 'ordered') with the products that do NOT appear in `order_items` (status 'not_ordered').

In [None]:
ex_07 = """

"""
conn.execute(ex_07).fetchdf()

In [None]:
check("ex_07", ex_07)

---
## Exercise 8: Complex Set Operation (Hard)

**Problem:** Find department locations that do not match the city of any customer address.

Return: One column `location`

**Hint:** Select `departments.location`, then use EXCEPT to remove every `addresses.city`.

In [None]:
ex_08 = """

"""
conn.execute(ex_08).fetchdf()

In [None]:
check("ex_08", ex_08)

---
## Summary

- **UNION** - Combines both results and removes duplicates
- **UNION ALL** - Combines both results and keeps every row
- **INTERSECT** - Returns only rows that appear in both results
- **EXCEPT** - Returns rows in the first result that are not in the second
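
Unlike UNION and INTERSECT, EXCEPT depends on the order of its operands. The sketch below illustrates this with made-up tables `a` and `b`, again using the built-in `sqlite3` module as a stand-in for DuckDB.

```python
import sqlite3

# Hypothetical toy tables; swapping the operands changes the EXCEPT result.
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE a (col TEXT);
    CREATE TABLE b (col TEXT);
    INSERT INTO a VALUES ('x'), ('y');
    INSERT INTO b VALUES ('y'), ('z');
""")

a_minus_b = [r[0] for r in demo.execute("SELECT col FROM a EXCEPT SELECT col FROM b")]
b_minus_a = [r[0] for r in demo.execute("SELECT col FROM b EXCEPT SELECT col FROM a")]

print(a_minus_b)  # ['x']
print(b_minus_a)  # ['z']
demo.close()
```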

### Next: Notebook 08 - String & Date Functions

In [None]:
conn.close()