# Window Functions Optimization

This notebook demonstrates techniques for optimizing window functions in PostgreSQL:
* Efficient window function usage
* Frame clause optimization
* Multiple window functions
* Common performance pitfalls

## 1. Basic Window Functions

In [None]:
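-- Simple running total
-- With ORDER BY and no frame clause, the default frame is
-- RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so rows that
-- share an order_date (peers) all receive the same running_total.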
EXPLAIN ANALYZE
SELECT 
    order_date,
    total_amount,
    SUM(total_amount) OVER (
        ORDER BY order_date
    ) as running_total
FROM orders
WHERE order_date >= '2022-01-01'
ORDER BY order_date;

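-- An index on the ORDER BY column can let the planner read rows already
-- sorted and skip the Sort node; including total_amount may also allow an
-- index-only scan. Re-run the EXPLAIN ANALYZE above afterwards to compare plans.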
CREATE INDEX idx_orders_date_amount ON orders(order_date, total_amount);
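
The running total above relies on the default frame, which treats rows with equal `order_date` as peers. The following is a minimal sketch of that behavior, using inline sample rows (made up for illustration) rather than the `orders` table. The second column uses an explicit `ROWS` frame with `total_amount` added as a tiebreaker so that its result is deterministic.

```sql
-- Sample data is hypothetical; two rows share 2022-01-02.
SELECT
    order_date,
    total_amount,
    -- Default frame (RANGE ... CURRENT ROW): peers get the same value
    SUM(total_amount) OVER (ORDER BY order_date) AS running_total_default,
    -- Explicit ROWS frame: one row at a time, tiebreaker fixes the order
    SUM(total_amount) OVER (
        ORDER BY order_date, total_amount
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total_rows
FROM (VALUES
    (DATE '2022-01-01', 10.00),
    (DATE '2022-01-02', 20.00),
    (DATE '2022-01-02', 30.00),
    (DATE '2022-01-03', 40.00)
) AS t(order_date, total_amount)
ORDER BY order_date, total_amount;
```

Both rows dated 2022-01-02 report 60.00 in `running_total_default`, while `running_total_rows` steps through 30.00 and then 60.00.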

## 2. Optimizing Frame Clauses

In [None]:
-- Compare different frame clauses
EXPLAIN ANALYZE
SELECT 
    order_date,
    total_amount,
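    -- ROWS: frame is a fixed number of physical rows (the current row plus
    -- the 3 before it); usually the cheaper choice, no peer detection needed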
    AVG(total_amount) OVER (
        ORDER BY order_date
        ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
    ) as moving_avg_rows,
    -- RANGE: frame is defined by the ORDER BY value (rows dated within the
    -- previous 3 days, including peers); offset RANGE needs PostgreSQL 11+
    AVG(total_amount) OVER (
        ORDER BY order_date
        RANGE BETWEEN INTERVAL '3 days' PRECEDING AND CURRENT ROW
    ) as moving_avg_range
FROM orders
WHERE order_date >= '2022-01-01'
ORDER BY order_date;
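
The two frames above only agree when there is exactly one row per day and no gaps. A small sketch with inline sample rows (hypothetical values, with a gap between 2022-01-02 and 2022-01-10) shows where they diverge. It assumes PostgreSQL 11 or later for the offset `RANGE` frame.

```sql
-- Sample data is hypothetical; note the 8-day gap before the last row.
SELECT
    order_date,
    total_amount,
    -- Previous physical row plus the current row, regardless of dates
    ROUND(AVG(total_amount) OVER (
        ORDER BY order_date
        ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
    ), 2) AS avg_rows,
    -- Only rows dated within 1 day before the current row's date
    ROUND(AVG(total_amount) OVER (
        ORDER BY order_date
        RANGE BETWEEN INTERVAL '1 day' PRECEDING AND CURRENT ROW
    ), 2) AS avg_range
FROM (VALUES
    (DATE '2022-01-01', 10.00),
    (DATE '2022-01-02', 20.00),
    (DATE '2022-01-10', 30.00)
) AS t(order_date, total_amount)
ORDER BY order_date;
```

For the 2022-01-10 row, `avg_rows` is 25.00 because it still averages in the 2022-01-02 row, while `avg_range` is 30.00 because no other row falls within the one-day range.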

## 3. Multiple Window Functions

In [None]:
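-- Verbose: the same window specification is repeated for every function.
-- PostgreSQL still evaluates window definitions with equivalent
-- PARTITION BY and ORDER BY clauses in a single pass, so the cost here is
-- mostly readability and the risk of the copies drifting apart.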
EXPLAIN ANALYZE
SELECT 
    order_date,
    customer_id,
    total_amount,
    SUM(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date) as customer_running_total,
    AVG(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date) as customer_running_avg,
    COUNT(*) OVER (PARTITION BY customer_id ORDER BY order_date) as customer_running_count
FROM orders
WHERE order_date >= '2022-01-01'
ORDER BY customer_id, order_date;

-- Cleaner: define the window once with a WINDOW clause.
-- Compare the two plans; they should normally be the same.
EXPLAIN ANALYZE
SELECT 
    order_date,
    customer_id,
    total_amount,
    SUM(total_amount) OVER w as customer_running_total,
    AVG(total_amount) OVER w as customer_running_avg,
    COUNT(*) OVER w as customer_running_count
FROM orders
WHERE order_date >= '2022-01-01'
WINDOW w AS (PARTITION BY customer_id ORDER BY order_date)
ORDER BY customer_id, order_date;
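
What does add work is a window definition that differs in `PARTITION BY` or `ORDER BY`, because it usually needs its own sort. The sketch below uses inline sample rows (hypothetical) and `EXPLAIN (COSTS OFF)` to make this visible in the plan. The exact plan shape depends on the PostgreSQL version and the data, so treat the expectation as a guide only.

```sql
-- Sample data is hypothetical. The first two functions share a window
-- definition; the third orders descending and so defines a second one.
EXPLAIN (COSTS OFF)
SELECT
    SUM(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date),
    AVG(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date),
    SUM(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date DESC)
FROM (VALUES
    (1, DATE '2022-01-01', 10.00),
    (1, DATE '2022-01-02', 20.00),
    (2, DATE '2022-01-01', 30.00)
) AS t(customer_id, order_date, total_amount);
```

The plan would typically contain two `WindowAgg` nodes, each fed by its own `Sort`, rather than three: the two matching definitions share one.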

## 4. Complex Window Functions

In [None]:
-- Complex analysis with multiple window functions
WITH monthly_sales AS (
    SELECT 
        DATE_TRUNC('month', o.order_date) as sale_month,
        p.category,
        SUM(oi.quantity * oi.unit_price) as revenue
    FROM orders o
    JOIN order_items oi ON o.order_id = oi.order_id
    JOIN products p ON oi.product_id = p.product_id
    WHERE o.status = 'Completed'
    GROUP BY DATE_TRUNC('month', o.order_date), p.category
)
SELECT 
    sale_month,
    category,
    revenue,
    LAG(revenue) OVER w as prev_month_revenue,
    LEAD(revenue) OVER w as next_month_revenue,
    revenue - LAG(revenue) OVER w as revenue_change,
    ROUND(
        ((revenue - LAG(revenue) OVER w) / NULLIF(LAG(revenue) OVER w, 0) * 100)::numeric,
        2
    ) as revenue_change_pct,
    -- Reuses w and adds a frame. This averages the last 3 rows per category,
    -- which equals 3 calendar months only if no month is missing.
    AVG(revenue) OVER (
        w ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) as moving_avg_3month
FROM monthly_sales
WINDOW w AS (PARTITION BY category ORDER BY sale_month)
ORDER BY category, sale_month;
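
The query above writes `LAG(revenue) OVER w` several times. One way to keep the arithmetic readable is to compute the lagged value once in an intermediate CTE and reuse the column. This is a sketch with inline sample rows (hypothetical values) and a hypothetical CTE name `with_prev`, not a replacement for the query above.

```sql
-- Sample data and the CTE name with_prev are hypothetical.
WITH monthly_sales(sale_month, category, revenue) AS (
    VALUES
        (DATE '2022-01-01', 'Books', 100.00),
        (DATE '2022-02-01', 'Books', 150.00),
        (DATE '2022-03-01', 'Books', 120.00)
),
with_prev AS (
    SELECT
        sale_month,
        category,
        revenue,
        -- Compute the previous month's revenue once
        LAG(revenue) OVER (PARTITION BY category ORDER BY sale_month) AS prev_month_revenue
    FROM monthly_sales
)
SELECT
    sale_month,
    category,
    revenue,
    prev_month_revenue,
    revenue - prev_month_revenue AS revenue_change,
    -- NULLIF avoids division by zero; the first month yields NULL
    ROUND((revenue - prev_month_revenue) / NULLIF(prev_month_revenue, 0) * 100, 2) AS revenue_change_pct
FROM with_prev
ORDER BY category, sale_month;
```

With these rows, `revenue_change_pct` is NULL for January, 50.00 for February, and -20.00 for March.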

## 5. Performance Comparison

In [None]:
-- Compare window function vs. correlated subquery approach
-- Window function approach
EXPLAIN ANALYZE
SELECT 
    o1.order_date,
    o1.total_amount,
    SUM(o1.total_amount) OVER (
        ORDER BY o1.order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) as running_total
FROM orders o1
WHERE o1.order_date >= '2022-01-01'
ORDER BY o1.order_date;

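-- Correlated subquery approach: the inner SUM runs once per outer row, so
-- the work grows roughly quadratically with the number of rows.
-- Note: o2.order_date <= o1.order_date includes every row sharing the
-- current date, which matches a RANGE frame; the ROWS frame above can
-- give different totals when dates repeat.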
EXPLAIN ANALYZE
SELECT 
    o1.order_date,
    o1.total_amount,
    (SELECT SUM(o2.total_amount)
     FROM orders o2
     WHERE o2.order_date <= o1.order_date
     AND o2.order_date >= '2022-01-01'
    ) as running_total
FROM orders o1
WHERE o1.order_date >= '2022-01-01'
ORDER BY o1.order_date;

## Best Practices for Window Functions

1. **Optimization Strategies**
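   - Use a WINDOW clause when several functions share a window definition
   - Choose the frame clause explicitly rather than relying on the default
   - Keep PARTITION BY and ORDER BY identical across functions where possible, since each distinct combination can add a sort
   - Index the PARTITION BY and ORDER BY columns, in that order, so the planner can avoid a sort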

2. **Frame Clause Selection**
   - Use ROWS for count-based windows
   - Use RANGE for value-based windows
   - Minimize frame size when possible
   - Consider memory usage

3. **Performance Considerations**
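   - Avoid unnecessary sorting
   - Use appropriate indexes
   - Consider materialized views for expensive window queries that are read repeatedly
   - Monitor memory usage: sorts that exceed work_mem spill to disk, which EXPLAIN ANALYZE reports as an external merge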

4. **Common Pitfalls**
   - Excessive frame sizes
   - Redundant window definitions
   - Missing indexes
   - Unnecessary sorting operations
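
As a sketch of the indexing and sorting advice above, the following creates an index whose column order mirrors `PARTITION BY` and then `ORDER BY`, and checks the plan for a remaining `Sort` node. It assumes the `orders` table used throughout this notebook; the index name `idx_orders_customer_date` is hypothetical. Whether the planner actually uses the index depends on table size, statistics, and the selectivity of the `WHERE` clause.

```sql
-- Hypothetical index name; columns follow PARTITION BY, then ORDER BY.
CREATE INDEX IF NOT EXISTS idx_orders_customer_date
    ON orders (customer_id, order_date);

-- Current per-operation memory budget for sorts
SHOW work_mem;

-- Look for a Sort node under WindowAgg, and for
-- "Sort Method: external merge" (a sort that spilled to disk).
EXPLAIN (ANALYZE, BUFFERS)
SELECT
    customer_id,
    order_date,
    total_amount,
    SUM(total_amount) OVER (
        PARTITION BY customer_id
        ORDER BY order_date
    ) AS customer_running_total
FROM orders
WHERE order_date >= '2022-01-01';
```

If the plan still shows a `Sort`, comparing it against the plan without the index is a reasonable way to decide whether the index is worth its write overhead.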