# Module 02: Filtering & Sorting - Advanced Data Retrieval

**Estimated Time:** 45 minutes

## Learning Objectives

By the end of this module, you will be able to:
- Sort query results using ORDER BY
- Limit and paginate results with LIMIT and OFFSET
- Use the IN operator for multiple value matching
- Filter ranges with BETWEEN
- Find unique values with DISTINCT
- Combine sorting and filtering techniques

In [None]:
# Setup
import sqlite3
import pandas as pd
from pathlib import Path

%load_ext sql

# Connect to database
DB_PATH = Path.cwd().parent / "data" / "databases" / "ecommerce.db"
conn = sqlite3.connect(DB_PATH)
%sql sqlite:///$DB_PATH

print("✓ Connected to ecommerce.db")

## 1. ORDER BY: Sorting Results

The ORDER BY clause sorts query results based on one or more columns.

### Syntax
```sql
SELECT columns
FROM table
ORDER BY column1 [ASC|DESC], column2 [ASC|DESC];
```

- **ASC** (ascending) - default, lowest to highest
- **DESC** (descending) - highest to lowest
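
To see how a multi-column sort resolves, here is a minimal sketch using a throwaway in-memory SQLite database. The `items` table, its rows, and the `demo` connection are hypothetical and separate from `ecommerce.db`; they exist only so the ordering is easy to verify by eye.

```python
import sqlite3

# Hypothetical in-memory table; names and values are illustrative only.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE items (name TEXT, category_id INTEGER, price REAL)")
demo.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [("pen", 2, 1.5), ("desk", 1, 120.0), ("lamp", 1, 35.0), ("ink", 2, 9.0)],
)

# Sort by category_id ascending first; within each category, price descending.
rows = demo.execute(
    "SELECT name, category_id, price FROM items "
    "ORDER BY category_id ASC, price DESC"
).fetchall()
sorted_names = [name for name, _, _ in rows]
print(sorted_names)  # ['desk', 'lamp', 'ink', 'pen']
demo.close()
```

The second sort key only matters among rows that tie on the first one, which is why `desk` and `lamp` (both category 1) appear before either category 2 item.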

In [None]:
%%sql
/* Sort by price (ascending - default) */
SELECT product_name, price
FROM products
ORDER BY price
LIMIT 10

In [None]:
%%sql
/* Sort by price (descending) */
SELECT product_name, price
FROM products
ORDER BY price DESC
LIMIT 10

In [None]:
%%sql
/* Sort by multiple columns */
SELECT product_name, category_id, price
FROM products
ORDER BY category_id ASC, price DESC
LIMIT 15

In [None]:
%%sql
/* Sort by calculated column */
SELECT 
    product_name,
    price,
    stock_quantity,
    price * stock_quantity AS inventory_value
FROM products
ORDER BY inventory_value DESC
LIMIT 10

## 2. LIMIT and OFFSET: Pagination

LIMIT restricts the number of rows returned. OFFSET skips a specified number of rows.

**Use Case:** Implementing pagination in web applications

```sql
-- Page 1 (rows 1-10)
LIMIT 10 OFFSET 0

-- Page 2 (rows 11-20)
LIMIT 10 OFFSET 10

-- Page 3 (rows 21-30)
LIMIT 10 OFFSET 20
```
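
The page-to-offset pattern above is `OFFSET = (page - 1) * page_size`. A minimal sketch of that arithmetic follows, assuming 1-based page numbers. The helpers `page_bounds` and `fetch_page`, the table `t`, and the `demo` connection are hypothetical names introduced for illustration and are not part of the course database.

```python
import sqlite3

def page_bounds(page, page_size):
    """Return (limit, offset) for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return page_size, (page - 1) * page_size

# Hypothetical in-memory table with ids 1..25.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
demo.executemany("INSERT INTO t (id) VALUES (?)", [(i,) for i in range(1, 26)])

def fetch_page(page, page_size=10):
    """Fetch one page of ids, ordered so that pages never overlap."""
    limit, offset = page_bounds(page, page_size)
    rows = demo.execute(
        "SELECT id FROM t ORDER BY id LIMIT ? OFFSET ?", (limit, offset)
    ).fetchall()
    return [row[0] for row in rows]

page_2 = fetch_page(2)  # ids 11 through 20
page_3 = fetch_page(3)  # ids 21 through 25 (a short final page)
demo.close()
```

Passing `LIMIT` and `OFFSET` as bound parameters rather than formatting them into the SQL string keeps the query safe when the page number comes from user input.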

In [None]:
%%sql
/* Get first 5 customers */
SELECT customer_id, first_name, last_name, email
FROM customers
ORDER BY customer_id
LIMIT 5

In [None]:
%%sql
/* Get next 5 customers (pagination - page 2) */
SELECT customer_id, first_name, last_name, email
FROM customers
ORDER BY customer_id
LIMIT 5 OFFSET 5

In [None]:
%%sql
/* Top 10 most expensive products */
SELECT product_name, price
FROM products
ORDER BY price DESC
LIMIT 10

In [None]:
%%sql
/* Products ranked 11-20 by price */
SELECT product_name, price
FROM products
ORDER BY price DESC
LIMIT 10 OFFSET 10

## 3. IN Operator: Multiple Value Matching

The IN operator checks if a value matches any value in a list. It's a shorthand for multiple OR conditions.
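
As a quick check of that equivalence, the sketch below runs the same filter written with `IN` and with chained `OR` against a hypothetical in-memory table (`products_demo` and the `demo` connection are illustrative names, not part of `ecommerce.db`). It also shows one common way to pass a Python list to `IN` using one `?` placeholder per value.

```python
import sqlite3

# Hypothetical in-memory table; the last row has no category (NULL).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE products_demo (name TEXT, category_id INTEGER)")
demo.executemany(
    "INSERT INTO products_demo VALUES (?, ?)",
    [("a", 1), ("b", 2), ("c", 3), ("d", 5), ("e", None)],
)

with_in = demo.execute(
    "SELECT name FROM products_demo "
    "WHERE category_id IN (1, 3, 5) ORDER BY name"
).fetchall()

with_or = demo.execute(
    "SELECT name FROM products_demo "
    "WHERE category_id = 1 OR category_id = 3 OR category_id = 5 ORDER BY name"
).fetchall()

# Build one "?" per value so the list is bound as parameters, not pasted in.
wanted = [1, 3, 5]
placeholders = ", ".join("?" for _ in wanted)
with_params = demo.execute(
    f"SELECT name FROM products_demo "
    f"WHERE category_id IN ({placeholders}) ORDER BY name",
    wanted,
).fetchall()

print(with_in)  # [('a',), ('c',), ('d',)]
demo.close()
```

The row with a NULL `category_id` is returned by none of the three queries, because a comparison with NULL is never true.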

In [None]:
%%sql
/* Products in specific categories */
SELECT product_name, category_id, price
FROM products
WHERE category_id IN (1, 3, 5)
ORDER BY category_id, price

In [None]:
%%sql
/* Customers from specific cities */
SELECT first_name, last_name, city
FROM customers
WHERE city IN ('New York', 'Los Angeles', 'Chicago')
ORDER BY city, last_name

In [None]:
%%sql
/* Orders with specific statuses */
SELECT order_id, customer_id, status, total_amount
FROM orders
WHERE status IN ('Pending', 'Shipped')
ORDER BY status, total_amount DESC
LIMIT 15

In [None]:
%%sql
/* NOT IN - exclude specific categories */
SELECT product_name, category_id, price
FROM products
WHERE category_id NOT IN (1, 2)
ORDER BY price DESC
LIMIT 10

## 4. BETWEEN: Range Filtering

BETWEEN filters values within a range (inclusive).

```sql
WHERE column BETWEEN value1 AND value2
-- Same as:
WHERE column >= value1 AND column <= value2
```
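
The sketch below checks the inclusive boundaries on a hypothetical in-memory table, and then illustrates one pitfall with dates. It assumes dates are stored as ISO-8601 text, which is an assumption for illustration; how `order_date` is actually stored in `ecommerce.db` is not stated here. The `prices` and `stamps` tables and the `demo` connection are made-up names.

```python
import sqlite3

demo = sqlite3.connect(":memory:")

# Numeric range: both boundary values (50 and 100) are included.
demo.execute("CREATE TABLE prices (price INTEGER)")
demo.executemany(
    "INSERT INTO prices VALUES (?)", [(49,), (50,), (75,), (100,), (101,)]
)
between_rows = [
    row[0]
    for row in demo.execute(
        "SELECT price FROM prices WHERE price BETWEEN 50 AND 100 ORDER BY price"
    )
]
compare_rows = [
    row[0]
    for row in demo.execute(
        "SELECT price FROM prices WHERE price >= 50 AND price <= 100 ORDER BY price"
    )
]

# Text dates: a value with a time part sorts AFTER the bare date string,
# so it falls outside an upper bound written as '2024-03-31'.
demo.execute("CREATE TABLE stamps (ts TEXT)")
demo.executemany(
    "INSERT INTO stamps VALUES (?)",
    [("2024-03-31",), ("2024-03-31 14:30:00",)],
)
date_rows = [
    row[0]
    for row in demo.execute(
        "SELECT ts FROM stamps "
        "WHERE ts BETWEEN '2024-01-01' AND '2024-03-31' ORDER BY ts"
    )
]

print(between_rows)  # [50, 75, 100]
print(date_rows)     # ['2024-03-31']
demo.close()
```

If a date column does carry a time component, a half-open range such as `ts >= '2024-01-01' AND ts < '2024-04-01'` avoids dropping rows from the last day.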

In [None]:
%%sql
/* Products in price range $50-$100 */
SELECT product_name, price
FROM products
WHERE price BETWEEN 50 AND 100
ORDER BY price

In [None]:
%%sql
/* Products with stock quantity in range */
SELECT product_name, stock_quantity
FROM products
WHERE stock_quantity BETWEEN 50 AND 150
ORDER BY stock_quantity

In [None]:
%%sql
/* Orders within date range */
SELECT order_id, order_date, total_amount
FROM orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31'
ORDER BY order_date
LIMIT 15

In [None]:
%%sql
/* NOT BETWEEN */
SELECT product_name, price
FROM products
WHERE price NOT BETWEEN 20 AND 80
ORDER BY price
LIMIT 15

## 5. DISTINCT: Unique Values

DISTINCT removes duplicate rows from query results.
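
A minimal sketch of what "duplicate rows" means when more than one column is selected: DISTINCT compares the whole selected row, not each column separately. The `people` table and the `demo` connection below are hypothetical and independent of `ecommerce.db`.

```python
import sqlite3

# Hypothetical in-memory table: "Paris" appears in two different countries.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE people (city TEXT, country TEXT)")
demo.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [
        ("Paris", "France"),
        ("Paris", "France"),
        ("Paris", "USA"),
        ("Lyon", "France"),
    ],
)

# One column: each city name appears once.
cities = [
    row[0]
    for row in demo.execute("SELECT DISTINCT city FROM people ORDER BY city")
]

# Two columns: each (city, country) combination appears once,
# so "Paris" shows up twice with different countries.
pairs = demo.execute(
    "SELECT DISTINCT city, country FROM people ORDER BY country, city"
).fetchall()

print(cities)  # ['Lyon', 'Paris']
print(pairs)   # [('Lyon', 'France'), ('Paris', 'France'), ('Paris', 'USA')]
demo.close()
```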

In [None]:
%%sql
/* All cities, duplicates included (without DISTINCT) */
SELECT city
FROM customers
ORDER BY city
LIMIT 20

In [None]:
%%sql
/* All unique cities (with DISTINCT) */
SELECT DISTINCT city
FROM customers
ORDER BY city

In [None]:
%%sql
/* Distinct countries */
SELECT DISTINCT country
FROM customers
ORDER BY country

In [None]:
%%sql
/* Distinct order statuses */
SELECT DISTINCT status
FROM orders
ORDER BY status

In [None]:
%%sql
/* DISTINCT with multiple columns (unique combinations) */
SELECT DISTINCT city, country
FROM customers
ORDER BY country, city
LIMIT 20

## 6. Combining Techniques

Let's combine everything we've learned for more powerful queries.

In [None]:
%%sql
/* Top 10 in-stock mid-range products (price $30-$100) in specific categories */
SELECT product_name, category_id, price, stock_quantity
FROM products
WHERE price BETWEEN 30 AND 100
  AND category_id IN (1, 2, 3)
  AND stock_quantity > 0
ORDER BY price DESC
LIMIT 10

In [None]:
%%sql
/* High-value orders (over $200) that are shipped or delivered, placed since 2024-01-01 */
SELECT order_id, customer_id, order_date, total_amount, status
FROM orders
WHERE total_amount > 200
  AND status IN ('Shipped', 'Delivered')
  AND order_date >= '2024-01-01'
ORDER BY total_amount DESC
LIMIT 15

In [None]:
%%sql
/* Products needing restocking (pagination example) */
SELECT product_name, stock_quantity, price
FROM products
WHERE stock_quantity BETWEEN 20 AND 100
ORDER BY stock_quantity ASC, price DESC
LIMIT 10 OFFSET 0

## 7. Real-World Examples

Let's apply these concepts to practical business scenarios.

In [None]:
%%sql
/* Example 1: Top 10 Revenue-Generating Orders */
SELECT 
    order_id,
    customer_id,
    order_date,
    total_amount,
    status
FROM orders
WHERE status != 'Cancelled'
ORDER BY total_amount DESC
LIMIT 10

In [None]:
%%sql
/* Example 2: Recent Orders (Last 20) */
SELECT 
    order_id,
    customer_id,
    order_date,
    total_amount,
    status
FROM orders
ORDER BY order_date DESC
LIMIT 20

In [None]:
%%sql
/* Example 3: Stock Alert - Low Stock Products */
SELECT 
    product_name,
    stock_quantity,
    price,
    price * stock_quantity AS inventory_value
FROM products
WHERE stock_quantity < 50
ORDER BY stock_quantity ASC
LIMIT 15

## 8. Exercises

Practice what you've learned with these exercises.

### Exercise 1: Top 15 Products by Price
Find the 15 most expensive products, showing their name, price, and stock quantity. Sort by price (highest first).

In [None]:
%%sql
/* Your code here */

### Exercise 2: Customer Pagination
Retrieve customers 21-40 when sorted by last name alphabetically. Show customer_id, first_name, last_name, and email.

In [None]:
%%sql
/* Your code here */

### Exercise 3: Mid-Range Products
Find all products priced between $25 and $75 in categories 2, 4, or 6. Order by category, then by price descending.

In [None]:
%%sql
/* Your code here */

### Exercise 4: Unique Customer Locations
List all unique city and country combinations from the customers table, ordered by country then city.

In [None]:
%%sql
/* Your code here */

### Exercise 5: Orders in Q1 2024
Find all orders from January 1, 2024 to March 31, 2024 with total_amount over $150. Show order_id, order_date, total_amount, and status. Sort by total_amount (highest first).

In [None]:
%%sql
/* Your code here */

## Summary

In this module, you learned:
- ✓ How to sort results with ORDER BY (ASC/DESC)
- ✓ How to limit and paginate results with LIMIT and OFFSET
- ✓ How to match multiple values with the IN operator
- ✓ How to filter ranges with BETWEEN
- ✓ How to find unique values with DISTINCT
- ✓ How to combine sorting, filtering, and pagination

**Key Takeaways:**
- Always use ORDER BY with LIMIT/OFFSET: without it, row order is not guaranteed, so pages can overlap or skip rows
- BETWEEN is inclusive (includes both boundary values)
- IN is cleaner than multiple OR conditions
- DISTINCT applies to the combination of all selected columns, not to each column separately
- Combine techniques for powerful, precise queries

**Next:** Module 03 - JOINs & Relationships

In [None]:
# Cleanup
conn.close()
print("✓ Database connection closed")