# SQL Troubleshooting: Debugging Common Issues

## Introduction

In real-world data engineering projects, you'll encounter SQL queries that don't work as expected. This notebook contains **problematic SQL queries** based on the sample data from the "2a SQL Joins" notebook.

**Your Task:** 
- Run each query and identify what's wrong
- Determine the root cause of the issue
- Understand why the query fails or returns incorrect results
- Think about how you would fix it (but don't implement the fix yet)

**Important:** These queries have **intentional issues**. Your job is to debug them!

**Dataset:** This notebook uses the same tables as "2a SQL Joins":
- `customers` - Customer information
- `products` - Product catalog
- `orders` - Order information
- `order_items` - Items in each order

---

## How to Approach These Exercises

1. **Read the query carefully** - Understand what it's trying to accomplish
2. **Run the query** - See what error you get (if any)
3. **Analyze the error** - Is it a syntax error, logic error, or data issue?
4. **Identify the root cause** - What specifically is wrong?
5. **Think about the fix** - How would you correct it?
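
Step 3 is the hardest when a query runs without any error message. One way to spot a logic error in that case is to compare row counts before and after a join. The query below is a minimal sketch of that check, not a fix for any particular exercise. It assumes every order references exactly one existing customer, so both counts should match; the `step` labels are illustrative.

```sql
-- Compare the row count of the driving table with the row count after a join.
-- Assumption: each order belongs to exactly one existing customer.
SELECT 'orders only' AS step, COUNT(*) AS row_count
FROM orders
UNION ALL
SELECT 'orders joined to customers', COUNT(*)
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id;
```

If the second count is much larger, the join is probably multiplying rows; if it is smaller, rows are being dropped. Either result tells you where to look before you read the query line by line.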

**Common Issue Categories:**
- ❌ Syntax errors (missing commas, typos, incorrect keywords)
- ❌ Join issues (wrong join type, missing conditions, incorrect keys)
- ❌ Aggregation errors (missing GROUP BY, incorrect aggregate usage)
- ❌ NULL handling problems
- ❌ Data type mismatches
- ❌ Logic errors (wrong WHERE conditions, incorrect calculations)
- ❌ Column reference errors (ambiguous columns, wrong table aliases)
- ❌ Performance issues (missing filters, inefficient joins)

---

## Setup: Verify Data is Available

Before starting, make sure you've run the setup queries from the "2a SQL Joins" notebook to create and populate the tables.
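
A quick way to confirm the setup worked is to count the rows in each table. This is a sketch that assumes the four tables listed above exist in your current database and schema; it does not check the contents of the rows.

```sql
-- Row counts for the four tables used in this notebook.
-- A missing table raises an error; a count of 0 means the inserts did not run.
SELECT 'customers' AS table_name, COUNT(*) AS row_count FROM customers
UNION ALL
SELECT 'products', COUNT(*) FROM products
UNION ALL
SELECT 'orders', COUNT(*) FROM orders
UNION ALL
SELECT 'order_items', COUNT(*) FROM order_items;
```

If any table is missing or empty, rerun the setup cells before starting the exercises. Otherwise you may end up debugging the data instead of the query.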


## Exercise 1:

**Objective:** Find all customers and their orders.

**Query:**


In [None]:
SELECT 
    c.customer_id,
    c.first_name,
    c.last_name,
    o.order_id,
    o.order_date
FROM customers c
INNER JOIN orders o;


---

## Exercise 2: 

**Objective:** Show products and their order details.


**Query:**


In [None]:
SELECT 
    p.product_name,
    oi.quantity,
    oi.unit_price,
    o.order_date
FROM products p
INNER JOIN order_items oi ON p.product_id = oi.order_id
INNER JOIN orders o ON oi.order_id = o.order_id;


---

## Exercise 3:

**Objective:** Calculate total revenue per customer.


**Query:**


In [None]:
SELECT 
    c.customer_id,
    c.first_name || ' ' || c.last_name AS customer_name,
    SUM(o.total_amount) AS total_revenue
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;


---

## Exercise 4:

**Objective:** Show order details with customer and product information.


**Query:**


In [None]:
SELECT 
    order_id,
    first_name,
    last_name,
    product_name,
    quantity,
    unit_price
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
INNER JOIN order_items oi ON o.order_id = oi.order_id
INNER JOIN products p ON oi.product_id = p.product_id;


---

## Exercise 5:

**Objective:** Find all customers and their orders, but only show orders from January 2024.


**Query:**


In [None]:
SELECT 
    c.customer_id,
    c.first_name,
    c.last_name,
    o.order_id,
    o.order_date,
    o.total_amount
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_date >= '2024-01-01' AND o.order_date < '2024-02-01';


---

## Exercise 6:

**Objective:** Find customers who have never placed an order.

**Issue Type:** Wrong join type

**Query:**


In [None]:
SELECT 
    c.customer_id,
    c.first_name,
    c.last_name,
    c.email
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;


---

## Exercise 7: Syntax Error - Missing Comma

**Objective:** Show customer details with order information.

**Issue Type:** Syntax error

**Query:**


In [None]:
SELECT 
    c.customer_id
    c.first_name,
    c.last_name,
    o.order_id,
    o.order_date
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;


---

## Exercise 8: Incorrect Aggregate Function Usage

**Objective:** Show each product with its total quantity sold and number of orders.

**Issue Type:** Aggregate function misuse

**Query:**


In [None]:
SELECT 
    p.product_id,
    p.product_name,
    SUM(oi.quantity) AS total_quantity_sold,
    COUNT(oi.order_id) AS number_of_orders,
    oi.unit_price
FROM products p
LEFT JOIN order_items oi ON p.product_id = oi.product_id
GROUP BY p.product_id, p.product_name;


---

## Exercise 9: Data Type Mismatch in JOIN

**Objective:** Join customers with orders using string comparison.

**Issue Type:** Data type mismatch

**Query:**


In [None]:
SELECT 
    c.customer_id,
    c.first_name,
    o.order_id,
    o.order_date
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id::VARCHAR;


---

## Exercise 10: Missing NULL Handling

**Objective:** Calculate total revenue per customer, including customers with no orders.

**Issue Type:** NULL handling issue

**Query:**


In [None]:
SELECT 
    c.customer_id,
    c.first_name || ' ' || c.last_name AS customer_name,
    SUM(o.total_amount) AS total_revenue,
    COUNT(o.order_id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name;


---

## Exercise 11: Incorrect Table Alias Usage

**Objective:** Show order details with customer information.

**Issue Type:** Table alias error

**Query:**


In [None]:
SELECT 
    ord.order_id,
    ord.order_date,
    cust.first_name,
    cust.last_name,
    ord.total_amount
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;


---

## Exercise 12: Wrong Column in GROUP BY

**Objective:** Show total sales per product category.

**Issue Type:** GROUP BY error

**Query:**


In [None]:
SELECT 
    p.category,
    SUM(oi.quantity * oi.unit_price) AS total_revenue,
    COUNT(DISTINCT p.product_id) AS product_count
FROM products p
LEFT JOIN order_items oi ON p.product_id = oi.product_id
GROUP BY p.product_id, p.product_name;


---

## Exercise 13: Incorrect String Concatenation

**Objective:** Show full customer name with order details.

**Issue Type:** String function error

**Query:**


In [None]:
SELECT 
    c.first_name + ' ' + c.last_name AS customer_name,
    o.order_id,
    o.order_date,
    o.total_amount
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;


---

## Exercise 14: Missing Table in FROM Clause

**Objective:** Show customer email and their order total.

**Issue Type:** Missing table reference

**Query:**


In [None]:
SELECT 
    c.email,
    o.total_amount
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
WHERE o.status = 'delivered';


---

## Exercise 15: Incorrect Date Filter Logic

**Objective:** Find orders from the last 30 days.

**Issue Type:** Date function/logic error

**Query:**


In [None]:
SELECT 
    o.order_id,
    o.order_date,
    o.total_amount,
    c.first_name || ' ' || c.last_name AS customer_name
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date > CURRENT_DATE - 30;


---

## Exercise 16: Duplicate Join Condition

**Objective:** Show products with their order details.

**Issue Type:** Redundant/incorrect join condition

**Query:**


In [None]:
SELECT 
    p.product_name,
    oi.quantity,
    oi.unit_price,
    o.order_date
FROM products p
INNER JOIN order_items oi ON p.product_id = oi.product_id
INNER JOIN orders o ON oi.order_id = o.order_id AND oi.product_id = o.order_id;


---

## Exercise 17: Incorrect HAVING Clause Usage

**Objective:** Show customers who have placed more than 1 order.

**Issue Type:** HAVING vs WHERE confusion

**Query:**


In [None]:
SELECT 
    c.customer_id,
    c.first_name || ' ' || c.last_name AS customer_name,
    COUNT(o.order_id) AS order_count
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
WHERE COUNT(o.order_id) > 1
GROUP BY c.customer_id, c.first_name, c.last_name;


---

## Exercise 18: Missing JOIN for Multi-Table Query

**Objective:** Show customer name, product name, and order date in one result.

**Issue Type:** Missing join between tables

**Query:**


In [None]:
SELECT 
    c.first_name || ' ' || c.last_name AS customer_name,
    p.product_name,
    o.order_date
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
INNER JOIN products p;


---

## Exercise 19: Incorrect Calculation Logic

**Objective:** Calculate the average order value per customer.

**Issue Type:** Calculation logic error

**Query:**


In [None]:
SELECT 
    c.customer_id,
    c.first_name || ' ' || c.last_name AS customer_name,
    SUM(o.total_amount) / COUNT(o.order_id) AS average_order_value
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name;


---

## Exercise 20: UNION Column Mismatch

**Objective:** Combine customer names and product names into one list.

**Issue Type:** UNION column mismatch

**Query:**


In [None]:
SELECT 
    first_name || ' ' || last_name AS name
FROM customers

UNION

SELECT 
    product_name,
    category
FROM products;


---

## Exercise 21: Incorrect NULL Comparison

**Objective:** Find products that have never been ordered.

**Issue Type:** NULL comparison error

**Query:**


In [None]:
SELECT 
    p.product_id,
    p.product_name,
    p.category,
    p.price
FROM products p
LEFT JOIN order_items oi ON p.product_id = oi.product_id
WHERE oi.order_id = NULL;


---

## Exercise 22: Missing DISTINCT in COUNT

**Objective:** Count unique customers who have placed orders.

**Issue Type:** Aggregate function precision

**Query:**


In [None]:
SELECT 
    COUNT(c.customer_id) AS total_customers_with_orders
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;


---

## Exercise 23: Incorrect ORDER BY Column

**Objective:** Show top 3 customers by total revenue.

**Issue Type:** ORDER BY error

**Query:**


In [None]:
SELECT 
    c.customer_id,
    c.first_name || ' ' || c.last_name AS customer_name,
    SUM(o.total_amount) AS total_revenue
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name
ORDER BY o.total_amount DESC
LIMIT 3;


---

## Exercise 24: Wrong Aggregate in SELECT

**Objective:** Show each order with its line item count.

**Issue Type:** Aggregate function in wrong context

**Query:**


In [None]:
SELECT 
    o.order_id,
    o.order_date,
    o.total_amount,
    COUNT(oi.product_id) AS item_count
FROM orders o
INNER JOIN order_items oi ON o.order_id = oi.order_id;


---

## Exercise 25: Case Sensitivity Issue

**Objective:** Find all delivered orders.

**Issue Type:** String comparison case sensitivity

**Query:**


In [None]:
SELECT 
    o.order_id,
    o.order_date,
    o.total_amount,
    o.status
FROM orders o
WHERE o.status = 'Delivered';


---

## Exercise 26: Circular Join Condition

**Objective:** Show order items with product and order information.

**Issue Type:** Incorrect join logic

**Query:**


In [None]:
SELECT 
    oi.order_id,
    oi.product_id,
    oi.quantity,
    p.product_name,
    o.order_date
FROM order_items oi
INNER JOIN products p ON oi.product_id = p.product_id
INNER JOIN orders o ON oi.order_id = o.order_id
INNER JOIN customers c ON o.customer_id = c.customer_id
WHERE oi.order_id = oi.product_id;


---

## Exercise 27: Incorrect COALESCE Usage

**Objective:** Show all products with their total sales, defaulting to 0 if not sold.

**Issue Type:** COALESCE placement error

**Query:**


In [None]:
SELECT 
    p.product_id,
    p.product_name,
    COALESCE(SUM(oi.quantity), 0) AS total_sold
FROM products p
LEFT JOIN order_items oi ON p.product_id = oi.product_id
GROUP BY p.product_id, p.product_name;


---

## Exercise 28: Missing Filter in JOIN

**Objective:** Show all products and only their delivered order items.

**Issue Type:** Filter placement in JOIN

**Query:**


In [None]:
SELECT 
    p.product_id,
    p.product_name,
    oi.quantity,
    oi.unit_price,
    o.order_date
FROM products p
LEFT JOIN order_items oi ON p.product_id = oi.product_id
LEFT JOIN orders o ON oi.order_id = o.order_id
WHERE o.status = 'delivered';


---

## Exercise 29: Incorrect Subquery Logic

**Objective:** Find customers who have placed orders with total amount greater than average order value.

**Issue Type:** Subquery correlation error

**Query:**


In [None]:
SELECT 
    c.customer_id,
    c.first_name || ' ' || c.last_name AS customer_name,
    o.order_id,
    o.total_amount
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
WHERE o.total_amount > (
    SELECT AVG(total_amount)
    FROM orders
    WHERE customer_id = o.customer_id
);


---

## Exercise 30: Performance Issue - Inefficient Join Order

**Objective:** Show all order details with customer and product information.

**Issue Type:** Inefficient join order (conceptual issue - Snowflake optimizes automatically, but the logic is still problematic)

**Query:**


In [None]:
SELECT 
    c.first_name || ' ' || c.last_name AS customer_name,
    p.product_name,
    oi.quantity,
    o.order_date,
    o.total_amount
FROM order_items oi
INNER JOIN products p ON oi.product_id = p.product_id
INNER JOIN orders o ON oi.order_id = o.order_id
INNER JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date BETWEEN '2024-01-01' AND '2024-12-31';


---

## Exercise 31: Incorrect LIMIT Placement

**Objective:** Show top customer by revenue in each country.

**Issue Type:** LIMIT usage error (this is a conceptual issue - LIMIT doesn't work per group)

**Query:**


In [None]:
SELECT 
    c.country,
    c.customer_id,
    c.first_name || ' ' || c.last_name AS customer_name,
    SUM(o.total_amount) AS total_revenue
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.country, c.customer_id, c.first_name, c.last_name
ORDER BY c.country, total_revenue DESC
LIMIT 1;


---

## Exercise 32: Wrong Column Reference in WHERE

**Objective:** Show orders with their customer information, filtered by customer city.

**Issue Type:** Column reference error

**Query:**


In [None]:
SELECT 
    o.order_id,
    o.order_date,
    o.total_amount,
    c.first_name || ' ' || c.last_name AS customer_name
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
WHERE city = 'New York';


---

## Exercise 33: Division by Zero Risk

**Objective:** Calculate average line item price per order.

**Issue Type:** Division by zero potential

**Query:**


In [None]:
SELECT 
    o.order_id,
    o.order_date,
    SUM(oi.quantity * oi.unit_price) / COUNT(oi.product_id) AS avg_line_item_price
FROM orders o
INNER JOIN order_items oi ON o.order_id = oi.order_id
GROUP BY o.order_id, o.order_date;


---

## Exercise 34: Incorrect Date Range Logic

**Objective:** Find orders placed in the current month.

**Issue Type:** Date range boundary error

**Query:**


In [None]:
SELECT 
    o.order_id,
    o.order_date,
    o.total_amount
FROM orders o
WHERE EXTRACT(YEAR FROM o.order_date) = EXTRACT(YEAR FROM CURRENT_DATE)
  AND EXTRACT(MONTH FROM o.order_date) = EXTRACT(MONTH FROM CURRENT_DATE)
  AND o.order_date < CURRENT_DATE;


---

## Exercise 35: Missing JOIN for Filtered Aggregation

**Objective:** Show total revenue per product, but only count delivered orders.

**Issue Type:** Missing join for filter condition

**Query:**


In [None]:
SELECT 
    p.product_id,
    p.product_name,
    SUM(oi.quantity * oi.unit_price) AS total_revenue
FROM products p
INNER JOIN order_items oi ON p.product_id = oi.product_id
WHERE o.status = 'delivered'
GROUP BY p.product_id, p.product_name;


---

## Summary

You've now encountered **35 different types of SQL issues** that commonly occur in production environments:

### Issue Categories Covered:

1. **Join Issues** (Exercises 1, 2, 5, 6, 16, 18, 28, 30, 35)
   - Missing join conditions
   - Wrong join keys
   - Incorrect join types
   - Missing joins between tables

2. **Aggregation Errors** (Exercises 3, 8, 12, 17, 22, 24)
   - Missing GROUP BY
   - Wrong columns in GROUP BY
   - Aggregate function misuse
   - HAVING vs WHERE confusion

3. **Syntax Errors** (Exercises 7, 11, 20)
   - Missing commas
   - Table alias mismatches
   - UNION column mismatches

4. **NULL Handling** (Exercises 10, 21, 27)
   - Missing COALESCE
   - Incorrect NULL comparisons
   - NULL in aggregations

5. **Data Type & Logic Errors** (Exercises 9, 13, 15, 19, 25, 29, 33, 34)
   - Data type mismatches
   - String concatenation issues
   - Date function errors
   - Calculation logic errors
   - Case sensitivity
   - Division by zero

6. **Column Reference Errors** (Exercises 4, 23, 32)
   - Ambiguous columns
   - Wrong table references
   - Missing table aliases

7. **Query Structure Issues** (Exercises 14, 26, 31)
   - Missing tables
   - Circular logic
   - LIMIT placement

---

## Next Steps

1. **Review each exercise** - Make sure you understand why each query fails or returns incorrect results
2. **Practice fixing them** - Try to write corrected versions (but don't look up solutions yet!)
3. **Apply to real projects** - These patterns will help you debug issues in production
4. **Build debugging skills** - Learn to read error messages and trace issues systematically

**Remember:** The best way to learn is by making mistakes and understanding why they happen. These exercises simulate real-world debugging scenarios you'll encounter as a data engineer!
