# Chapter 12: Views and Materialized Views

Views provide abstraction layers over base tables, enabling simplified queries, security restrictions, and schema evolution without breaking applications. Materialized views extend this concept with physical storage for query acceleration. This chapter covers implementation patterns, refresh strategies, and security considerations for production PostgreSQL environments.

## 12.1 Regular Views (Virtual Tables)

### 12.1.1 Creating and Using Views

Views are stored queries that execute dynamically—no data is stored, only the definition.

```sql
-- Basic view: Simplify complex joins
CREATE OR REPLACE VIEW customer_order_summary AS
SELECT 
    c.customer_id,
    c.email,
    c.full_name,
    COUNT(DISTINCT o.order_id) as total_orders,
    COALESCE(SUM(o.total_cents), 0) as lifetime_value_cents,
    MAX(o.created_at) as last_order_date
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id 
                  AND o.status != 'cancelled'
WHERE c.deleted_at IS NULL
GROUP BY c.customer_id, c.email, c.full_name;

-- Usage: Query like a table
SELECT * FROM customer_order_summary 
WHERE lifetime_value_cents > 100000
ORDER BY lifetime_value_cents DESC;

-- View expansion: PostgreSQL rewrites the query
-- The above becomes:
-- SELECT * FROM (underlying query) WHERE lifetime_value_cents > 100000;

-- Column aliases and explicit typing
CREATE OR REPLACE VIEW product_catalog AS
SELECT 
    p.product_id,
    p.sku,
    p.name as product_name,
    p.base_price_cents,
    ROUND(p.base_price_cents / 100.0, 2) as base_price_dollars,
    c.name as category_name,
    s.supplier_name
FROM products p
JOIN categories c ON p.category_id = c.category_id
JOIN suppliers s ON p.supplier_id = s.supplier_id
WHERE p.is_active = TRUE;

-- Benefits:
-- 1. API stability: Applications query catalog, not underlying tables
-- 2. Security: Hide price calculation logic, sensitive columns
-- 3. Maintenance: Change table structure, update view definition, apps unaffected
```
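
To see view expansion in practice, one option is to compare the stored definition with the plan PostgreSQL produces. This is a minimal sketch that assumes the `customer_order_summary` view above exists; the plan output depends on your data and PostgreSQL version, so none is shown.

```sql
-- Show the stored definition (second argument requests pretty-printing)
SELECT pg_get_viewdef('customer_order_summary'::regclass, true);

-- The plan scans customers and orders: the view itself holds no data
EXPLAIN
SELECT * FROM customer_order_summary
WHERE lifetime_value_cents > 100000;
```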

### 12.1.2 Updatable Views

Simple views support automatic `INSERT`, `UPDATE`, and `DELETE` operations. Complex views require triggers.

```sql
-- Automatically updatable view (simple case)
CREATE VIEW active_users AS
SELECT user_id, email, full_name, created_at
FROM users
WHERE deleted_at IS NULL;

-- These work automatically:
UPDATE active_users SET full_name = 'New Name' WHERE user_id = 1;
DELETE FROM active_users WHERE user_id = 2;  -- Deletes from underlying table
-- Insert requires all NOT NULL columns to be present or have defaults
INSERT INTO active_users (email, full_name) VALUES ('new@example.com', 'New User');

-- Conditions for automatic updatability:
-- 1. FROM clause has exactly one entry: a table or another updatable view
-- 2. No DISTINCT, GROUP BY, HAVING, LIMIT, OFFSET, or WITH at the top level
-- 3. No set operations (UNION, INTERSECT, EXCEPT)
-- 4. No aggregate, window, or set-returning functions in the select list
-- Computed columns are allowed but read-only: INSERT/UPDATE cannot assign them

-- Non-updatable view (requires triggers)
CREATE VIEW order_details AS
SELECT 
    o.order_id,
    o.customer_id,
    c.email as customer_email,
    o.total_cents,
    oi.product_sku,
    oi.quantity,
    (oi.quantity * oi.unit_price_cents) as line_total_cents
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN order_items oi ON o.order_id = oi.order_id;

-- This will fail:
-- UPDATE order_details SET customer_email = 'new@example.com' WHERE order_id = 1;
-- ERROR: cannot update view "order_details"

-- Solution: INSTEAD OF triggers
CREATE OR REPLACE FUNCTION update_order_details()
RETURNS TRIGGER AS $$
BEGIN
    IF TG_OP = 'UPDATE' THEN
        -- Update underlying tables
        UPDATE orders 
        SET total_cents = NEW.total_cents
        WHERE order_id = OLD.order_id;
        
        UPDATE customers 
        SET email = NEW.customer_email
        WHERE customer_id = OLD.customer_id;
        
        UPDATE order_items 
        SET quantity = NEW.quantity
        WHERE order_id = OLD.order_id 
          AND product_sku = OLD.product_sku;
          
        RETURN NEW;
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_order_details_update
    INSTEAD OF UPDATE ON order_details
    FOR EACH ROW
    EXECUTE FUNCTION update_order_details();
```
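
Rather than working through the conditions by hand, you can ask the catalog whether PostgreSQL treats a view as updatable. The query below is a sketch that assumes the `active_users` and `order_details` views from this section exist and are visible to the current role.

```sql
SELECT table_name,
       is_updatable,           -- YES if UPDATE and DELETE work automatically
       is_insertable_into,     -- YES if INSERT works automatically
       is_trigger_updatable    -- YES if an INSTEAD OF UPDATE trigger is defined
FROM information_schema.views
WHERE table_name IN ('active_users', 'order_details');
```

With the definitions above, `active_users` should report as automatically updatable, while `order_details` should be updatable only through its trigger.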

### 12.1.3 WITH CHECK OPTION (Data Integrity)

Prevent views from "losing" rows due to WHERE clause filtering—a critical safety mechanism.

```sql
-- Problem: Without CHECK OPTION, updates can disappear from view
CREATE VIEW high_value_orders AS
SELECT * FROM orders
WHERE total_cents > 100000;

-- This succeeds but row disappears from view (confusing for applications)
UPDATE high_value_orders 
SET total_cents = 5000  -- Now below threshold
WHERE order_id = 123;

-- Solution: WITH CHECK OPTION
CREATE OR REPLACE VIEW high_value_orders AS
SELECT * FROM orders
WHERE total_cents > 100000
WITH CHECK OPTION;  -- Enforces WHERE clause on INSERT/UPDATE

-- Now this fails:
UPDATE high_value_orders 
SET total_cents = 5000 
WHERE order_id = 123;
-- ERROR: new row violates check option for view "high_value_orders"

-- Cascading CHECK OPTION (for views on views)
CREATE VIEW domestic_orders AS
SELECT * FROM orders WHERE shipping_country = 'US' WITH LOCAL CHECK OPTION;

CREATE VIEW domestic_high_value AS
SELECT * FROM domestic_orders WHERE total_cents > 50000 WITH CASCADED CHECK OPTION;
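-- CASCADED checks this view's condition and the conditions of every underlying view,
-- whether or not those views declare a check option (country = 'US' AND total_cents > 50000)
-- LOCAL checks only this view's own condition, plus any check option an underlying
-- view declares itself (here domestic_orders would still enforce country = 'US')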
```
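
The difference between `LOCAL` and `CASCADED` only shows when the underlying view has no check option of its own. The following sketch uses hypothetical `demo_*` objects, which are not part of the chapter's schema, to isolate that case.

```sql
-- Hypothetical demo objects for illustration only
CREATE TABLE demo_orders (
    order_id INTEGER PRIMARY KEY,
    shipping_country TEXT,
    total_cents INTEGER
);

-- Base view WITHOUT a check option
CREATE VIEW demo_domestic AS
SELECT * FROM demo_orders WHERE shipping_country = 'US';

CREATE VIEW demo_domestic_local AS
SELECT * FROM demo_domestic WHERE total_cents > 50000
WITH LOCAL CHECK OPTION;

CREATE VIEW demo_domestic_cascaded AS
SELECT * FROM demo_domestic WHERE total_cents > 50000
WITH CASCADED CHECK OPTION;

-- LOCAL: only total_cents > 50000 is checked, so a non-US row is accepted
INSERT INTO demo_domestic_local VALUES (1, 'CA', 60000);

-- CASCADED: the underlying country filter is checked too, so this insert
-- is rejected with a check option violation
INSERT INTO demo_domestic_cascaded VALUES (2, 'CA', 60000);
```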

## 12.2 Security Barrier Views

### 12.2.1 Preventing Information Leakage

Security barrier views prevent malicious functions from observing rows they shouldn't see—critical for row-level security implementations.

```sql
-- Security risk: User-defined function in view can leak data
CREATE VIEW user_own_data AS
SELECT * FROM sensitive_data 
WHERE user_id = current_setting('app.current_user_id')::BIGINT;

-- If malicious_function is defined as:
-- CREATE FUNCTION malicious_function(id BIGINT) RETURNS BOOLEAN AS $$
--   -- Logs all IDs it sees to a file
-- $$ LANGUAGE plpgsql;

-- Attacker could use:
SELECT * FROM user_own_data WHERE malicious_function(user_id);
-- The planner may evaluate a cheap function before the view's own filter,
-- so the function can see rows the view was meant to hide

-- Solution: SECURITY BARRIER (PostgreSQL 9.2+)
-- The option goes between the view name and AS
CREATE OR REPLACE VIEW user_own_data_secure
WITH (security_barrier) AS
SELECT * FROM sensitive_data 
WHERE user_id = current_setting('app.current_user_id')::BIGINT;

-- Now the view's predicates are evaluated before user-supplied functions
-- malicious_function only sees rows matching the user_id filter

-- Performance impact: only LEAKPROOF functions and operators are pushed down
-- into a security barrier view, so indexes may go unused; use only when necessary

-- Alternative: Use Row-Level Security (RLS) instead of security barrier views
-- RLS is the modern, preferred approach (Chapter 7/Security chapters)
ALTER TABLE sensitive_data ENABLE ROW LEVEL SECURITY;
CREATE POLICY user_isolation ON sensitive_data 
    USING (user_id = current_setting('app.current_user_id')::BIGINT);
```
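
An existing view can also be switched to a security barrier without recreating it, and the catalogs show both the option and which functions the planner considers leakproof. This sketch assumes the `user_own_data` view above; the function names in the second query are examples of built-in comparison functions.

```sql
-- Turn the option on for an existing view
ALTER VIEW user_own_data SET (security_barrier = true);

-- View options are stored in pg_class.reloptions
SELECT c.relname, c.reloptions
FROM pg_class c
WHERE c.relkind = 'v'
  AND c.relname = 'user_own_data';

-- Check whether a function is marked LEAKPROOF (eligible for pushdown)
SELECT proname, proleakproof
FROM pg_proc
WHERE proname IN ('int8eq', 'texteq');
```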

### 12.2.2 Column-Level Security with Views

Hide sensitive columns while exposing safe ones—easier to manage than GRANT/REVOKE on columns.

```sql
-- Base table with sensitive data
CREATE TABLE employees (
    employee_id SERIAL PRIMARY KEY,
    full_name VARCHAR(255),
    ssn VARCHAR(11),           -- Highly sensitive
    salary_cents INTEGER,      -- Sensitive
    department_id INTEGER,
    hire_date DATE
);

-- Public view: Safe columns only
CREATE VIEW employees_public AS
SELECT employee_id, full_name, department_id, hire_date
FROM employees;

-- HR view: All columns but row-restricted
CREATE VIEW employees_hr AS
SELECT * FROM employees
WHERE department_id = current_setting('app.user_department_id')::INTEGER
WITH CHECK OPTION;

-- Grant access
GRANT SELECT ON employees_public TO employee_role;
GRANT SELECT ON employees_hr TO hr_role;

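-- Revoke direct table access
-- (PUBLIC holds no table privileges by default, so also revoke from any role granted access)
REVOKE ALL ON employees FROM PUBLIC, employee_role, hr_role;

-- Views run with their owner's privileges by default, so these roles can read
-- through the views without holding any privilege on the base table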
```
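
After setting up the grants, it is worth confirming that the roles can reach the views but not the base table. The check below is a sketch that assumes the `employee_role` role and the objects defined above exist.

```sql
SELECT has_table_privilege('employee_role', 'employees', 'SELECT')         AS base_table,
       has_table_privilege('employee_role', 'employees_public', 'SELECT')  AS public_view,
       has_column_privilege('employee_role', 'employees', 'ssn', 'SELECT') AS ssn_column;
```

If the setup worked, only `public_view` should come back true.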

## 12.3 Materialized Views (Physical Storage)

### 12.3.1 When to Use Materialized Views

Materialized views store query results physically, trading storage and staleness for query performance.

```sql
-- Scenario: Complex report queried 1000x/day, data changes 1x/day
-- Base tables: orders (10M rows), order_items (50M rows), products (10K rows)

-- Without materialized view: Expensive aggregation every query
SELECT 
    DATE_TRUNC('week', o.created_at) as week,
    p.category_id,
    SUM(oi.quantity * oi.unit_price_cents) as revenue_cents,
    COUNT(DISTINCT o.customer_id) as unique_customers
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_sku = p.sku
WHERE o.status = 'completed'
GROUP BY 1, 2;

-- With materialized view: Pre-computed results
CREATE MATERIALIZED VIEW mv_weekly_category_revenue AS
SELECT 
    DATE_TRUNC('week', o.created_at) as week,
    p.category_id,
    SUM(oi.quantity * oi.unit_price_cents) as revenue_cents,
    COUNT(DISTINCT o.customer_id) as unique_customers,
    COUNT(DISTINCT o.order_id) as total_orders  -- COUNT(*) would count order_items rows
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_sku = p.sku
WHERE o.status = 'completed'
GROUP BY 1, 2
WITH DATA;  -- Populate immediately (default)

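-- Queries now read the precomputed rows instead of re-aggregating the base tables
-- (an index scan requires an index on week; see Section 12.3.3)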
SELECT * FROM mv_weekly_category_revenue 
WHERE week >= NOW() - INTERVAL '4 weeks'
ORDER BY week DESC, revenue_cents DESC;

-- Decision matrix:
-- Use Materialized View when:
-- 1. Query is expensive (aggregation, complex joins)
-- 2. Data changes infrequently vs query frequency
-- 3. Staleness is acceptable (minutes/hours, not seconds)
-- 4. Storage overhead acceptable (duplicate data)

-- Use Regular View when:
-- 1. Real-time data required
-- 2. Underlying tables change constantly
-- 3. Query is fast enough without materialization
```
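
A materialized view can also be created empty and filled later, which helps when the initial build should run in a maintenance window. This sketch uses a hypothetical view name, `mv_order_status_counts`, over the chapter's `orders` table.

```sql
-- Store the definition only; querying the view before its first refresh raises an error
CREATE MATERIALIZED VIEW mv_order_status_counts AS
SELECT status, COUNT(*) AS order_count
FROM orders
GROUP BY status
WITH NO DATA;

-- ispopulated is false until the first refresh
SELECT matviewname, ispopulated
FROM pg_matviews
WHERE matviewname = 'mv_order_status_counts';

-- The first refresh cannot use CONCURRENTLY because the view is not yet populated
REFRESH MATERIALIZED VIEW mv_order_status_counts;
```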

### 12.3.2 Refresh Strategies

Materialized views become stale immediately after base table changes. Choose appropriate refresh strategies based on consistency requirements.

```sql
-- Full refresh (takes an ACCESS EXCLUSIVE lock, so reads block until it finishes)
REFRESH MATERIALIZED VIEW mv_weekly_category_revenue;
-- View is inaccessible during refresh (milliseconds to minutes depending on size)

-- Concurrent refresh (PostgreSQL 9.4+, allows reads during refresh)
-- Requirements:
-- 1. The view is already populated
-- 2. A unique index on plain columns (no expressions, no WHERE clause) to identify rows
-- Trade-off: usually slower than a plain refresh, because the new result is built in a
-- temporary table, compared with the current contents, and applied as row-level changes

-- Create the required unique index first
CREATE UNIQUE INDEX idx_mv_weekly_unique 
ON mv_weekly_category_revenue(week, category_id);

REFRESH MATERIALIZED VIEW CONCURRENTLY mv_weekly_category_revenue;

-- Incremental refresh: not built into PostgreSQL
-- Built-in materialized views always recompute the full query on refresh
-- The pg_ivm extension provides Incremental View Maintenance

-- Scheduled refresh (cron/pg_cron)
-- Every hour, refresh concurrently
SELECT cron.schedule('refresh-revenue-stats', '0 * * * *', 
    'REFRESH MATERIALIZED VIEW CONCURRENTLY mv_weekly_category_revenue');

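-- Conditional refresh (application-controlled)
-- Refresh only if data changed significantly
-- PostgreSQL does not record refresh times in pg_stat_user_tables, so this
-- reads them from the refresh_log bookkeeping table defined in Section 12.3.4
CREATE OR REPLACE FUNCTION conditional_refresh()
RETURNS void AS $$
DECLARE
    v_last_refresh TIMESTAMPTZ;    -- v_ prefix avoids clashing with column names
    v_max_order_date TIMESTAMPTZ;
BEGIN
    SELECT r.last_refresh INTO v_last_refresh 
    FROM refresh_log r
    WHERE r.view_name = 'mv_weekly_category_revenue';
    
    SELECT MAX(created_at) INTO v_max_order_date FROM orders;
    
    IF v_last_refresh IS NULL
       OR v_max_order_date > v_last_refresh + INTERVAL '1 hour' THEN
        REFRESH MATERIALIZED VIEW CONCURRENTLY mv_weekly_category_revenue;
        -- Record the refresh so the next call compares against it
        INSERT INTO refresh_log (view_name, last_refresh, refresh_count)
        VALUES ('mv_weekly_category_revenue', NOW(), 1)
        ON CONFLICT (view_name) DO UPDATE
        SET last_refresh = EXCLUDED.last_refresh,
            refresh_count = refresh_log.refresh_count + 1;
    END IF;
END;
$$ LANGUAGE plpgsql;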
```

### 12.3.3 Indexing Materialized Views

Unlike regular views, materialized views support indexes—critical for query performance.

```sql
-- Create materialized view
CREATE MATERIALIZED VIEW mv_customer_metrics AS
SELECT 
    customer_id,
    COUNT(*) as total_orders,
    SUM(total_cents) as lifetime_value_cents,
    MIN(created_at) as first_order_date,
    MAX(created_at) as last_order_date,
    AVG(total_cents) as avg_order_value_cents
FROM orders
GROUP BY customer_id;

-- Primary access pattern: Look up by customer_id
CREATE UNIQUE INDEX idx_mv_customer_metrics_pk 
ON mv_customer_metrics(customer_id);

-- Secondary access patterns
CREATE INDEX idx_mv_customer_metrics_ltv 
ON mv_customer_metrics(lifetime_value_cents DESC) 
WHERE lifetime_value_cents > 1000000;

CREATE INDEX idx_mv_customer_metrics_recency 
ON mv_customer_metrics(last_order_date DESC)
INCLUDE (total_orders, lifetime_value_cents);  -- Covering index

-- Query planner uses indexes
SELECT customer_id, lifetime_value_cents 
FROM mv_customer_metrics 
WHERE customer_id = 12345;
-- Index scan on idx_mv_customer_metrics_pk

-- Index maintenance during refresh
-- Plain REFRESH builds a new copy of the data and rebuilds every index in bulk
-- REFRESH CONCURRENTLY changes the existing data in place, so indexes are
-- maintained row by row (cheap for small changes, slower when most rows change)

-- Optimization: If view is huge and refresh is rare, consider:
-- 1. Before a CONCURRENTLY refresh that changes most rows, drop secondary indexes
--    and recreate them afterwards (the unique index must stay)
-- 2. Use partial indexes to reduce maintenance cost
```
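
To confirm which indexes exist on a materialized view and whether the planner picks them, the catalog and `EXPLAIN` are usually enough. This is a sketch against the `mv_customer_metrics` view above; plan output varies with data volume and statistics, so none is shown.

```sql
-- List indexes defined on the materialized view
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'mv_customer_metrics';

-- Inspect the plan for the primary lookup
EXPLAIN
SELECT customer_id, lifetime_value_cents
FROM mv_customer_metrics
WHERE customer_id = 12345;

-- An explicit ANALYZE after a large refresh can help keep planner estimates current
ANALYZE mv_customer_metrics;
```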

### 12.3.4 Materialized View Limitations and Workarounds

```sql
-- Limitation 1: No automatic updates
-- Must manually refresh or use triggers (not recommended for high volume)

-- Workaround: Lazy refresh with cache stampede prevention
CREATE TABLE refresh_log (
    view_name VARCHAR(100) PRIMARY KEY,
    last_refresh TIMESTAMPTZ DEFAULT NOW(),
    refresh_count INTEGER DEFAULT 0
);

-- Application checks staleness before querying
-- Assumes the pg_background extension (not part of core PostgreSQL) is installed
CREATE OR REPLACE FUNCTION get_customer_metrics(p_customer_id BIGINT)
RETURNS mv_customer_metrics AS $$
DECLARE
    result mv_customer_metrics;
    v_last_refresh TIMESTAMPTZ;  -- v_ prefix avoids clashing with the column name
BEGIN
    SELECT r.last_refresh INTO v_last_refresh 
    FROM refresh_log r WHERE r.view_name = 'mv_customer_metrics';
    
    -- If stale by > 5 minutes (or never refreshed), trigger a refresh
    IF v_last_refresh IS NULL OR v_last_refresh < NOW() - INTERVAL '5 minutes' THEN
        -- Transaction-level advisory lock: only one session launches the refresh
        -- (cache stampede), and the lock is released when the log update commits
        IF pg_try_advisory_xact_lock(hashtext('refresh_mv_customer_metrics')) THEN
            PERFORM pg_background_launch('REFRESH MATERIALIZED VIEW CONCURRENTLY mv_customer_metrics');
            -- Update only this view's row (a bare UPDATE would touch every row)
            INSERT INTO refresh_log (view_name, last_refresh, refresh_count)
            VALUES ('mv_customer_metrics', NOW(), 1)
            ON CONFLICT (view_name) DO UPDATE
            SET last_refresh = EXCLUDED.last_refresh,
                refresh_count = refresh_log.refresh_count + 1;
        END IF;
    END IF;
    
    SELECT * INTO result FROM mv_customer_metrics WHERE customer_id = p_customer_id;
    RETURN result;
END;
$$ LANGUAGE plpgsql;

-- Limitation 2: CONCURRENTLY accepts only a plain unique index
-- The index must use column names only (no expressions) and cover all rows (no WHERE clause)
-- Workaround: compute the key as an ordinary column in the view's SELECT list, then index that column

-- Limitation 3: Cannot reference temporary tables or views in the defining query
-- Materialized views themselves cannot be TEMP or UNLOGGED

-- Limitation 4: No foreign keys to/from materialized views
-- They are not "real" tables, just snapshots
```

## 12.4 Advanced View Patterns

### 12.4.1 Partitioned Views (Legacy Approach)

Before native declarative partitioning (PostgreSQL 10+), views were used to partition data.

```sql
-- Legacy partitioned view pattern (still valid for cross-database access)
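CREATE TABLE orders_2023_q1 (CHECK (created_at >= '2023-01-01' AND created_at < '2023-04-01')) INHERITS (orders);
CREATE TABLE orders_2023_q2 (CHECK (created_at >= '2023-04-01' AND created_at < '2023-07-01')) INHERITS (orders);
CREATE TABLE orders_2023_q3 (CHECK (created_at >= '2023-07-01' AND created_at < '2023-10-01')) INHERITS (orders);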

-- View over partitions ( UNION ALL )
CREATE VIEW orders_all AS
SELECT * FROM orders_2023_q1
UNION ALL
SELECT * FROM orders_2023_q2
UNION ALL
SELECT * FROM orders_2023_q3;

-- Constraint exclusion pushes down predicates (if enabled)
SET constraint_exclusion = partition;
SELECT * FROM orders_all WHERE created_at = '2023-02-15';
-- Only scans orders_2023_q1 due to CHECK constraint

-- Modern approach: Use declarative partitioning (Chapter 13) instead of views for partitioning
```

### 12.4.2 Views for Schema Evolution

Views provide backward compatibility during schema migrations.

```sql
-- Migration: Splitting name into first/last
ALTER TABLE users ADD COLUMN first_name VARCHAR(100);
ALTER TABLE users ADD COLUMN last_name VARCHAR(100);

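-- Migrate data: first word becomes first_name, the remainder becomes last_name
-- (SPLIT_PART(full_name, ' ', 2) alone would drop everything after the second word)
UPDATE users SET 
    first_name = SPLIT_PART(full_name, ' ', 1),
    last_name = NULLIF(SUBSTRING(full_name FROM POSITION(' ' IN full_name) + 1), full_name);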

-- Create compatibility view for old applications
CREATE VIEW users_legacy AS
SELECT 
    user_id,
    email,
    concat_ws(' ', first_name, last_name) as full_name,  -- NULL-safe, unlike ||; computed for backward compatibility
    first_name,
    last_name,
    created_at
FROM users;

-- Old application queries view, new application queries table directly
-- After all apps migrated, drop view and rename table if needed
```
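
If old applications cannot be pointed at a new view name, an alternative is to move the table aside and give the view the original name. The sketch below assumes the old `full_name` column has already been dropped from the table; `users_v2` is a hypothetical name chosen for illustration.

```sql
BEGIN;
-- Move the real table out of the way (hypothetical new name)
ALTER TABLE users RENAME TO users_v2;

-- Old applications keep querying "users" and still see full_name
CREATE VIEW users AS
SELECT user_id,
       email,
       concat_ws(' ', first_name, last_name) AS full_name,
       created_at
FROM users_v2;
COMMIT;
```

Because `full_name` is a computed column here, reads work unchanged but `INSERT` and `UPDATE` through the view cannot assign it; writes that set it would need an `INSTEAD OF` trigger as in Section 12.1.2.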

## 12.5 Performance and Maintenance

### 12.5.1 View Performance Considerations

```sql
-- Predicate pushdown (optimization)
CREATE VIEW recent_orders AS
SELECT * FROM orders WHERE created_at > NOW() - INTERVAL '30 days';

SELECT * FROM recent_orders WHERE customer_id = 123;
-- Optimizer pushes customer_id = 123 into view, uses index on orders(customer_id)

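-- But: a security barrier restricts pushdown to LEAKPROOF predicates
-- (the option goes between the view name and AS)
CREATE VIEW recent_orders_secure
WITH (security_barrier) AS
SELECT * FROM orders WHERE created_at > NOW() - INTERVAL '30 days';

SELECT * FROM recent_orders_secure WHERE customer_id = 123;
-- Equality on built-in integer types is leakproof, so this predicate can still use an index;
-- a predicate calling a non-leakproof function is applied only after the view's own filter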

-- Optimization: CTEs in views (PostgreSQL 12+ behavior change)
-- PostgreSQL 11 and earlier: CTEs are always materialized (optimization fence)
-- PostgreSQL 12+: side-effect-free, non-recursive CTEs referenced once are inlined unless MATERIALIZED is specified
CREATE VIEW complex_calculation AS
WITH params AS (
    SELECT current_setting('app.config_value')::INTEGER as threshold
)
SELECT * FROM large_table 
WHERE value > (SELECT threshold FROM params);
-- In PG12+, params is inlined into the outer query; in PG11 and earlier it is always materialized

-- To force materialization in PG12+ (the keyword does not exist in earlier versions):
-- WITH params AS MATERIALIZED (SELECT ...)
```
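
The effect of the `MATERIALIZED` keyword is easiest to see by comparing plans for the same query written both ways. This self-contained sketch needs PostgreSQL 12 or later and uses `generate_series` in place of a real table; the CTE name `recent` is arbitrary.

```sql
-- Forced materialization: the CTE is computed in full, then filtered
-- (the plan typically shows a CTE Scan node)
EXPLAIN
WITH recent AS MATERIALIZED (
    SELECT g AS n FROM generate_series(1, 1000) AS g
)
SELECT * FROM recent WHERE n = 42;

-- Forced inlining: the filter is applied directly to the underlying scan
EXPLAIN
WITH recent AS NOT MATERIALIZED (
    SELECT g AS n FROM generate_series(1, 1000) AS g
)
SELECT * FROM recent WHERE n = 42;
```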

### 12.5.2 Monitoring View Usage

```sql
-- Regular views: PostgreSQL keeps no access statistics for them
-- (pg_stat_user_tables covers tables and materialized views only, and views occupy no storage)
-- One option, assuming the pg_stat_statements extension is installed:
-- search the recorded statements for the view name
SELECT query, calls
FROM pg_stat_statements
WHERE query ILIKE '%customer_order_summary%'
ORDER BY calls DESC;

-- Find unused materialized views (consuming storage without benefit)
SELECT 
    s.schemaname,
    s.relname as view_name,
    pg_size_pretty(pg_total_relation_size(s.relid)) as storage_size,
    s.seq_scan + COALESCE(s.idx_scan, 0) as total_reads,  -- idx_scan is NULL without indexes
    s.last_vacuum,
    s.last_analyze
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
WHERE c.relkind = 'm'  -- materialized view (relkind lives in pg_class)
ORDER BY pg_total_relation_size(s.relid) DESC;

-- Dependency tracking (what depends on this view?)
SELECT DISTINCT  -- pg_depend holds one row per referenced column
    dependent_ns.nspname as dependent_schema,
    dependent_view.relname as dependent_view,
    source_ns.nspname as source_schema,
    source_table.relname as source_table
FROM pg_depend
JOIN pg_rewrite ON pg_depend.objid = pg_rewrite.oid
JOIN pg_class as dependent_view ON pg_rewrite.ev_class = dependent_view.oid
JOIN pg_class as source_table ON pg_depend.refobjid = source_table.oid
JOIN pg_namespace dependent_ns ON dependent_view.relnamespace = dependent_ns.oid
JOIN pg_namespace source_ns ON source_table.relnamespace = source_ns.oid
WHERE pg_depend.classid = 'pg_rewrite'::regclass   -- dependent object is a rewrite rule
  AND pg_depend.refclassid = 'pg_class'::regclass  -- referenced object is a relation
  AND source_table.relname = 'orders'
  AND dependent_view.relkind = 'v';
-- Shows all views that reference the 'orders' table
```

### 12.5.3 Materialized View Maintenance Windows

```sql
-- Large materialized view refresh strategy
-- 1. Create new view version
CREATE MATERIALIZED VIEW mv_new AS 
SELECT * FROM ...;

-- 2. Build indexes concurrently
CREATE INDEX CONCURRENTLY idx_new ON mv_new(...);

-- 3. Atomic swap (using transactional DDL)
BEGIN;
ALTER MATERIALIZED VIEW mv_old RENAME TO mv_old_backup;
ALTER MATERIALIZED VIEW mv_new RENAME TO mv_old;
COMMIT;

-- 4. Drop old version
DROP MATERIALIZED VIEW mv_old_backup;

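-- Note: This requires 2x storage temporarily; the renames take only a brief lock
-- Views that depend on mv_old keep pointing at the renamed backup and must be
-- recreated before the backup can be dropped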

-- Blocking refresh (for very large views without a unique index)
-- If you cannot use CONCURRENTLY due to lack of a unique index:
CREATE OR REPLACE FUNCTION refresh_large_view()
RETURNS void AS $$
BEGIN
    -- REFRESH takes its own ACCESS EXCLUSIVE lock and holds it until the
    -- surrounding transaction ends, so readers block for the full refresh
    REFRESH MATERIALIZED VIEW mv_large;
END;
$$ LANGUAGE plpgsql;
-- Schedule during low-traffic window (maintenance window)
```

---

## Chapter Summary

In this chapter, you learned:

1. **Regular Views**: Create abstraction layers over complex joins and calculations; automatically updatable when simple (single table, no aggregates), but require `INSTEAD OF` triggers for complex multi-table views; use `WITH CHECK OPTION` to prevent updates that would exclude rows from the view's result set.

2. **Security Barriers**: Use `WITH (security_barrier)` to prevent malicious functions from accessing rows filtered by the view's WHERE clause, which is essential when implementing row-level security via views; note that security barriers restrict predicate pushdown to leakproof functions and operators and may impact performance.

3. **Materialized Views**: Store physical snapshots of expensive queries for read-heavy workloads; require `REFRESH MATERIALIZED VIEW` to update data; use `CONCURRENTLY` option (requires unique index) to allow reads during refresh without blocking; choose refresh strategies based on data volatility—scheduled cron jobs for hourly/daily refreshes, application-triggered refreshes for near-real-time requirements.

4. **Indexing Strategy**: Create unique indexes on materialized views to enable concurrent refresh; add covering indexes for common access patterns; remember that indexes increase refresh time but dramatically improve query performance.

5. **Schema Evolution**: Use views as compatibility layers during migrations—expose old column names/computed columns via views while migrating underlying schema; drop views after all dependent applications are updated.

6. **Operational Considerations**: Monitor materialized view staleness and storage consumption; use `pg_stat_user_tables` to identify unused views consuming resources; plan maintenance windows for large view refreshes that cannot use `CONCURRENTLY`; consider the atomic swap pattern for zero-downtime materialized view updates.

---

**Next:** In Chapter 13, we will explore declarative partitioning—covering range, list, and hash partitioning strategies, partition pruning, partition-wise joins, and managing rolling windows of time-series data.