# Chapter 6: Creating Tables and Constraints

Table creation is the foundation of database design. Poor decisions here cascade into application bugs, performance issues, and maintenance nightmares. This chapter establishes industry-standard patterns for table structure, constraint enforcement, and data integrity.

## 6.1 Primary Key Strategies: The Identity Crisis

The choice of primary key affects storage, performance, join efficiency, and application architecture. This decision is difficult to reverse without full table rebuilds.

### 6.1.1 Surrogate Keys (System-Generated)

**Industry Standard:** Use surrogate keys (BIGINT identity) for most tables. They never change, are compact for joins, and decouple identity from business meaning.

```sql
-- Modern standard: GENERATED ALWAYS AS IDENTITY (PostgreSQL 10+)
CREATE TABLE users (
    user_id BIGINT GENERATED ALWAYS AS IDENTITY 
        (START WITH 1000000 INCREMENT BY 1) 
        PRIMARY KEY,
    -- START WITH 1000000: Avoids conflicts with imported legacy data
    -- IDs below 1M reserved for special system accounts or data migration
    
    email VARCHAR(255) NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Detailed explanation:
-- GENERATED ALWAYS AS IDENTITY:
-- 1. SQL standard syntax (also accepted by Oracle 12c+ and Db2; SQL Server uses its own IDENTITY(seed, increment) form)
-- 2. Sequence automatically owned by table (dropped with table)
-- 3. Cannot accidentally override ID without OVERRIDING clause (safety)
-- 4. Cleaner DDL than SERIAL
-- 5. Easier to modify (change sequence parameters)

-- Behind the scenes:
-- PostgreSQL creates an internal sequence: users_user_id_seq
-- Unlike SERIAL, the column gets no nextval() DEFAULT expression;
-- the sequence is bound to the column's identity property

-- Insert (ID auto-generated)
INSERT INTO users (email) VALUES ('user@example.com');
-- Generated user_id = 1000000 (append RETURNING user_id to get the value back)

-- Attempt to specify ID (fails with GENERATED ALWAYS)
INSERT INTO users (user_id, email) VALUES (999, 'admin@example.com');
-- ERROR: cannot insert into column "user_id"
-- DETAIL: Column "user_id" is an identity column defined as GENERATED ALWAYS.
-- HINT: Use OVERRIDING SYSTEM VALUE to override.

-- Override when necessary (data migration)
INSERT INTO users (user_id, email) 
OVERRIDING SYSTEM VALUE 
VALUES (999, 'admin@example.com');
-- Succeeds, but generally avoid this

-- Alternative: GENERATED BY DEFAULT (allows manual IDs)
CREATE TABLE legacy_migration (
    id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    legacy_id INTEGER,
    data TEXT
);

-- Can insert specific IDs during migration
INSERT INTO legacy_migration (id, legacy_id, data) 
VALUES (1, 9999, 'migrated data');
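-- Caution: explicit IDs do not advance the identity sequence.
-- The next auto-generated value here is still 1, which collides with this row
-- unless the sequence is restarted above the highest migrated ID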
```
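
Explicit IDs inserted into a `GENERATED BY DEFAULT` column leave the backing sequence where it was, so it usually has to be moved past the loaded values before normal inserts resume. A minimal sketch, assuming the `legacy_migration` table above and that no other session is inserting while the sequence is resynchronized:

```sql
-- Move the identity sequence to the highest id already present,
-- so the next generated value is MAX(id) + 1.
-- pg_get_serial_sequence() also resolves the sequence behind an identity column.
SELECT setval(
    pg_get_serial_sequence('legacy_migration', 'id'),
    (SELECT COALESCE(MAX(id), 1) FROM legacy_migration)
);

-- The next insert now receives a fresh, non-colliding id
INSERT INTO legacy_migration (legacy_id, data)
VALUES (10000, 'first row after migration')
RETURNING id;
```

Running the resync once at the end of a bulk load is typically enough; it does not need to follow every explicit insert.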

**Legacy SERIAL Approach (Still Common):**

```sql
-- Old way (still widely used; not formally deprecated, but identity columns are preferred for new tables)
CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    -- SERIAL is not a real type; it's syntactic sugar for:
    -- 1. CREATE SEQUENCE orders_order_id_seq
    -- 2. ALTER COLUMN order_id SET DEFAULT nextval('orders_order_id_seq')
    -- 3. ALTER SEQUENCE orders_order_id_seq OWNED BY orders.order_id
    
    customer_id BIGINT NOT NULL,
    total_cents INTEGER NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- SERIAL vs IDENTITY differences:
-- 1. SERIAL allows manual insertion of IDs without special syntax
--    (Can be dangerous - accidental duplicates)
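-- 2. SERIAL is PostgreSQL-specific, not SQL standard
-- 3. SERIAL's sequence is a separate object: it needs its own GRANTs,
--    and CREATE TABLE ... LIKE copies a default that points at the same sequence
-- 4. IDENTITY is cleaner and safer

-- Migration from SERIAL to IDENTITY (if needed), inside one transaction:
-- 1. ALTER TABLE orders ALTER COLUMN order_id DROP DEFAULT;
-- 2. DROP SEQUENCE orders_order_id_seq;
-- 3. ALTER TABLE orders ALTER COLUMN order_id 
--    ADD GENERATED ALWAYS AS IDENTITY;
-- 4. Restart the new identity sequence above the current MAX(order_id),
--    e.g. ALTER TABLE orders ALTER COLUMN order_id RESTART WITH <max + 1>;
--    otherwise new rows start at 1 and collide with existing IDs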
```

### 6.1.2 Natural Keys (Business Meaning)

Use a natural key only when the value:
- Is immutable (never changes)
- Is compact (not a long string)
- Has business meaning needed outside the system
- Carries no risk of future changes

```sql
-- Appropriate natural key: Country codes (ISO 3166-1 alpha-2)
-- These are standardized, immutable, compact, and widely used
CREATE TABLE countries (
    iso_code CHAR(2) PRIMARY KEY,  -- 'US', 'CA', 'GB' - immutable standard
    name VARCHAR(100) NOT NULL,
    iso_numeric CHAR(3) UNIQUE,    -- '840', '124', '826'
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Why this works:
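-- 1. ISO codes are very stable (changes are rare and tied to events such as a country renaming or dissolving)
-- 2. Compact (CHAR(2) takes about 3 bytes including its length header, vs 8 bytes for BIGINT)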
-- 3. Used in APIs, external systems, URLs
-- 4. Self-documenting ('US' is clearer than country_id 1)

-- Foreign key reference (compact, meaningful)
CREATE TABLE addresses (
    address_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    country_code CHAR(2) NOT NULL REFERENCES countries(iso_code),
    -- References 'US', not 1 (clearer in debugging)
    
    postal_code VARCHAR(20),
    city VARCHAR(100),
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Inappropriate natural key: Email addresses
-- Even if unique, emails can change (user requests change, typo correction)
-- Never use as primary key
CREATE TABLE bad_design (
    email VARCHAR(255) PRIMARY KEY,  -- DON'T DO THIS
    name TEXT
);
-- Problem: If user changes email, you must cascade update all foreign keys
-- Email changes should be simple UPDATE, not complex migration

-- Composite natural keys (use sparingly)
CREATE TABLE order_items (
    order_id BIGINT NOT NULL,
    line_number INTEGER NOT NULL,
    product_id BIGINT NOT NULL,
    quantity INTEGER NOT NULL,
    
    PRIMARY KEY (order_id, line_number),
    -- Natural composite key: Order + Line Number
    -- Line numbers are sequential within order (1, 2, 3...)
    
    FOREIGN KEY (order_id) REFERENCES orders(order_id) ON DELETE CASCADE,
    FOREIGN KEY (product_id) REFERENCES products(product_id)
);

-- Composite key considerations:
-- Advantages:
-- 1. Enforces business rule (no duplicate line numbers per order)
-- 2. Natural ordering (line 1, line 2, line 3)
-- 3. No extra surrogate key needed

-- Disadvantages:
-- 1. Bulky foreign keys (referencing tables need both columns)
-- 2. ORM complexity (composite keys harder to map)
-- 3. URL encoding (order_id/line_number vs single UUID)
-- 4. Changing line numbers (if reordering allowed) requires updates

-- Recommendation: Use composite keys for join tables (many-to-many)
-- Use surrogate keys for entity tables (users, orders, products)
```
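
The email case is usually handled with a surrogate primary key plus a UNIQUE constraint on the natural value, so uniqueness is still enforced while references stay stable. A small sketch; the table names `accounts` and `account_logins` are hypothetical and exist only for this illustration:

```sql
CREATE TABLE accounts (
    account_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email VARCHAR(255) NOT NULL,
    CONSTRAINT accounts_email_unique UNIQUE (email)
);

CREATE TABLE account_logins (
    login_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    account_id BIGINT NOT NULL REFERENCES accounts(account_id),
    logged_in_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Changing the email touches one row; no foreign key has to follow it
UPDATE accounts
SET email = 'new.address@example.com'
WHERE email = 'old.address@example.com';
```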

### 6.1.3 UUID Strategies

Use UUIDs for external-facing identifiers or distributed systems where central ID generation is impractical.

```sql
-- UUID primary key (distributed system safe)
CREATE TABLE external_documents (
    doc_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    -- 16 bytes (vs 8 bytes for BIGINT)
    -- Random insertion order (causes index fragmentation)
    
    content TEXT,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Index fragmentation mitigation:
-- Random UUIDs cause frequent page splits in B-tree indexes
-- Mitigation strategies:
-- 1. Use UUIDv7 (time-ordered; built in as uuidv7() from PostgreSQL 18, via extensions or app-side generation before that)
-- 2. Use bigint + uuid hybrid (internal bigint, external uuid)
-- 3. Accept fragmentation and schedule regular REINDEX

-- Hybrid approach (Industry Standard):
CREATE TABLE secure_orders (
    -- Internal: BIGINT (fast joins, compact indexes)
    order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    
    -- External: UUID (security, non-enumerable)
    order_uuid UUID UNIQUE DEFAULT gen_random_uuid() NOT NULL,
    
    customer_id BIGINT NOT NULL,
    total_cents INTEGER NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Usage:
-- Internal joins: Use order_id (fast, compact)
-- External APIs: Expose order_uuid (prevents enumeration attacks)
-- URLs: /orders/a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11 (not /orders/12345)
```
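
In the hybrid layout, a request typically arrives with the UUID and is translated to the internal key once, at the edge of the query. A sketch against `secure_orders`; the `secure_order_items` table is hypothetical and stands in for any child table keyed by `order_id`:

```sql
-- Hypothetical child table keyed by the internal BIGINT
CREATE TABLE secure_order_items (
    item_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    order_id BIGINT NOT NULL REFERENCES secure_orders(order_id),
    quantity INTEGER NOT NULL
);

-- Resolve the external UUID through its unique index, then join on the BIGINT
SELECT o.order_uuid, o.total_cents, i.item_id, i.quantity
FROM secure_orders AS o
JOIN secure_order_items AS i ON i.order_id = o.order_id
WHERE o.order_uuid = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';
```

Only the outermost lookup pays for the wider UUID comparison; every join behind it uses the 8-byte key.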

## 6.2 Foreign Keys: Referential Integrity

Foreign keys enforce relationships between tables. They are your defense against orphaned data and ensure consistency across the database.

### 6.2.1 Basic Foreign Key Creation

```sql
-- Parent table
CREATE TABLE customers (
    customer_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email VARCHAR(255) NOT NULL UNIQUE,
    name TEXT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Child table with foreign key
CREATE TABLE orders (
    order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    customer_id BIGINT NOT NULL,  -- Foreign key column
    
    order_date DATE NOT NULL DEFAULT CURRENT_DATE,
    status VARCHAR(20) NOT NULL DEFAULT 'pending',
    total_cents INTEGER NOT NULL,
    
    -- Foreign key constraint with explicit behavior
    CONSTRAINT orders_customer_id_fk 
        FOREIGN KEY (customer_id) 
        REFERENCES customers(customer_id)
        -- Default behavior: ON DELETE NO ACTION, ON UPDATE NO ACTION
);

-- Detailed explanation:
-- FOREIGN KEY (customer_id): Column(s) in this table
-- REFERENCES customers(customer_id): Target table and column
-- Must reference a PRIMARY KEY or UNIQUE column in parent table
-- Can reference multiple columns (composite foreign key)

-- Verify foreign key
SELECT
    tc.constraint_name,
    tc.table_name,
    kcu.column_name,
    ccu.table_name AS foreign_table_name,
    ccu.column_name AS foreign_column_name
FROM information_schema.table_constraints AS tc
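JOIN information_schema.key_column_usage AS kcu
    ON tc.constraint_name = kcu.constraint_name
    AND tc.constraint_schema = kcu.constraint_schema
JOIN information_schema.constraint_column_usage AS ccu
    ON ccu.constraint_name = tc.constraint_name
    AND ccu.constraint_schema = tc.constraint_schema
-- Matching on schema as well avoids mixing up same-named constraints in other schemas
WHERE tc.constraint_type = 'FOREIGN KEY'
    AND tc.table_name = 'orders';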
```
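
A foreign key can span several columns when the referenced key is composite. A sketch with hypothetical tables (`invoice_lines`, `invoice_line_notes`), chosen so that it does not depend on the other definitions in this chapter:

```sql
CREATE TABLE invoice_lines (
    invoice_id BIGINT NOT NULL,
    line_number INTEGER NOT NULL,
    description TEXT NOT NULL,
    PRIMARY KEY (invoice_id, line_number)
);

CREATE TABLE invoice_line_notes (
    note_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    invoice_id BIGINT NOT NULL,
    line_number INTEGER NOT NULL,
    note_text TEXT NOT NULL,
    -- Column order and types must line up with the referenced key
    CONSTRAINT invoice_line_notes_line_fk
        FOREIGN KEY (invoice_id, line_number)
        REFERENCES invoice_lines (invoice_id, line_number)
);
```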

### 6.2.2 ON DELETE Behaviors

**Critical Decision:** Choose ON DELETE behavior based on business logic, not convenience.

```sql
-- Option 1: NO ACTION (default) / RESTRICT
-- Prevents deletion of parent if children exist
-- NO ACTION: Checks constraint after statement (allows deferred checks)
-- RESTRICT: Checks immediately (no deferred)

CREATE TABLE order_items_restrict (
    item_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    order_id BIGINT NOT NULL,
    
    CONSTRAINT fk_order_restrict
        FOREIGN KEY (order_id) 
        REFERENCES orders(order_id)
        ON DELETE RESTRICT  -- Cannot delete order if items exist
);

-- Behavior:
-- DELETE FROM orders WHERE order_id = 1;
-- ERROR: update or delete on table "orders" violates foreign key constraint 
-- "fk_order_restrict" on table "order_items_restrict"
-- DETAIL: Key (order_id)=(1) is still referenced from table "order_items_restrict".

-- Use case: Prevent accidental deletion of orders with items
-- Business rule: Must delete items first, or archive, not delete order

-- Option 2: CASCADE
-- Automatically deletes child rows when parent deleted
-- Dangerous but appropriate for truly dependent data

CREATE TABLE order_items_cascade (
    item_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    order_id BIGINT NOT NULL,
    
    CONSTRAINT fk_order_cascade
        FOREIGN KEY (order_id) 
        REFERENCES orders(order_id)
        ON DELETE CASCADE  -- Deleting order deletes all items
);

-- Behavior:
-- DELETE FROM orders WHERE order_id = 1;
-- The order row and all order_items_cascade rows with order_id = 1
-- are removed as part of the same statement

-- Use case: Line items are part of order (no meaning without order)
-- Caution: Cascades are "invisible" in application code
-- Can cause massive deletions if parent deleted accidentally
-- Consider soft deletes (status = 'deleted') instead for important data

-- Option 3: SET NULL
-- Sets foreign key to NULL when parent deleted
-- Requires foreign key column to be nullable

CREATE TABLE customer_notes (
    note_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    customer_id BIGINT,  -- Nullable!
    
    note_text TEXT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    
    CONSTRAINT fk_customer_set_null
        FOREIGN KEY (customer_id) 
        REFERENCES customers(customer_id)
        ON DELETE SET NULL  -- Keep note, but remove customer link
);

-- Behavior:
-- DELETE FROM customers WHERE customer_id = 1;
-- The customer row is deleted and, in the same statement,
-- customer_notes.customer_id is set to NULL wherever it was 1

-- Use case: Historical records that should survive parent deletion
-- Example: Support tickets, audit logs, notes

-- Option 4: SET DEFAULT
-- Sets foreign key to DEFAULT value when parent deleted
-- Rarely used, but useful for "archived" or "system" references

CREATE TABLE user_actions (
    action_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id BIGINT DEFAULT 0,  -- 0 represents "system" or "deleted user"
    
    action_type VARCHAR(50) NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    
    CONSTRAINT fk_user_set_default
        FOREIGN KEY (user_id) 
        REFERENCES users(user_id)
        ON DELETE SET DEFAULT  -- Revert to system user on delete
);

-- Requires DEFAULT value to be valid (user_id 0 must exist in users table)
-- Or use ON DELETE SET NULL if no default makes sense
```
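
Because delete actions are invisible in application code, it can help to audit them from the system catalog. A sketch using `pg_constraint`, whose `confdeltype` column stores the ON DELETE action as a one-letter code:

```sql
SELECT
    conrelid::regclass  AS child_table,
    conname             AS constraint_name,
    confrelid::regclass AS parent_table,
    CASE confdeltype
        WHEN 'a' THEN 'NO ACTION'
        WHEN 'r' THEN 'RESTRICT'
        WHEN 'c' THEN 'CASCADE'
        WHEN 'n' THEN 'SET NULL'
        WHEN 'd' THEN 'SET DEFAULT'
    END AS on_delete
FROM pg_constraint
WHERE contype = 'f'  -- foreign key constraints only
ORDER BY conrelid::regclass::text, conname;
```

Filtering the result for `CASCADE` gives a quick list of the places where a single parent delete can remove child rows.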

### 6.2.3 ON UPDATE Behaviors

```sql
-- ON UPDATE CASCADE: Update foreign keys when primary key changes
-- Rarely needed with surrogate keys (IDs shouldn't change)
-- Useful for natural keys that might change (rare)

CREATE TABLE products (
    sku VARCHAR(50) PRIMARY KEY,  -- Natural key (might change if typo)
    name TEXT NOT NULL,
    price_cents INTEGER NOT NULL
);

CREATE TABLE inventory (
    inventory_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    sku VARCHAR(50) NOT NULL,
    quantity INTEGER NOT NULL,
    
    CONSTRAINT fk_product_update
        FOREIGN KEY (sku) 
        REFERENCES products(sku)
        ON UPDATE CASCADE  -- If SKU corrected, update inventory
        ON DELETE RESTRICT  -- Can't delete product with inventory
);

-- Behavior:
-- UPDATE products SET sku = 'NEW-SKU-001' WHERE sku = 'OLD-SKU-001';
-- Automatically updates inventory SET sku = 'NEW-SKU-001' WHERE sku = 'OLD-SKU-001';

-- Warning: ON UPDATE CASCADE on high-traffic tables can cause lock contention
-- Better to avoid changing primary keys (use surrogate keys)
```

### 6.2.4 Foreign Key Indexing Requirement

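**Critical:** Index foreign key columns manually. PostgreSQL automatically indexes the referenced (parent) key but not the referencing columns. Without an index on the child, every delete or key update on the parent must scan the child table sequentially to look for referencing rows, which is slow and keeps the transaction's locks held for longer.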

```sql
-- Without index: Deleting from parent requires sequential scan of child
-- With index: Deleting from parent uses index lookup (fast, minimal locks)

-- Create foreign key
ALTER TABLE orders 
    ADD CONSTRAINT orders_customer_id_fk 
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id);

-- Create supporting index (REQUIRED for performance)
CREATE INDEX idx_orders_customer_id ON orders(customer_id);

-- Explanation:
-- When you delete a customer:
-- 1. PostgreSQL must check if any orders reference that customer
-- 2. Without index: Seq scan of orders table for each deleted customer (slow; locks are held longer)
-- 3. With index: Index lookup (fast, minimal locking)

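-- An index supports a foreign key when the FK columns are its leading columns,
-- so a composite foreign key needs a composite index in the same column order
CREATE TABLE order_shipments (
    shipment_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    order_id BIGINT NOT NULL,
    shipment_number INTEGER NOT NULL,
    
    CONSTRAINT fk_order_shipment
        FOREIGN KEY (order_id) REFERENCES orders(order_id),
    
    -- The index behind this UNIQUE constraint leads with order_id,
    -- so it also serves lookups for the foreign key
    CONSTRAINT idx_order_shipment_lookup 
        UNIQUE (order_id, shipment_number)
);

-- A separate index is only needed when no existing index leads with order_id:
CREATE INDEX idx_shipments_order ON order_shipments(order_id);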
```
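
Foreign keys without a supporting index can be located from the catalogs. The query below is a sketch of a commonly used pattern: it compares each foreign key's columns (`pg_constraint.conkey`) with the leading columns of every index on the child table (`pg_index.indkey`). Treat it as a starting point; it ignores partial indexes on purpose.

```sql
SELECT
    c.conrelid::regclass AS child_table,
    c.conname            AS fk_name
FROM pg_constraint AS c
WHERE c.contype = 'f'
  AND NOT EXISTS (
      SELECT 1
      FROM pg_index AS i
      WHERE i.indrelid = c.conrelid
        AND i.indpred IS NULL  -- skip partial indexes
        -- the index's leading columns must contain every FK column
        AND (i.indkey::smallint[])[0:cardinality(c.conkey) - 1] @> c.conkey
  )
ORDER BY c.conrelid::regclass::text, c.conname;
```

Not every row returned needs an index: small child tables, or parents whose rows are never deleted or re-keyed, can reasonably go without one.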

## 6.3 CHECK Constraints: Data Quality Gates

CHECK constraints enforce domain-specific rules at the database level. They prevent invalid data regardless of application bugs.

### 6.3.1 Basic CHECK Constraints

```sql
CREATE TABLE products (
    product_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    sku VARCHAR(50) NOT NULL UNIQUE,
    name TEXT NOT NULL,
    
    -- Price must be positive
    price_cents INTEGER NOT NULL 
        CONSTRAINT products_price_positive CHECK (price_cents > 0),
    
    -- Stock can't be negative
    stock_quantity INTEGER NOT NULL DEFAULT 0
        CONSTRAINT products_stock_non_negative CHECK (stock_quantity >= 0),
    
    -- Weight must be reasonable (grams)
    weight_grams INTEGER 
        CONSTRAINT products_weight_reasonable CHECK (weight_grams BETWEEN 1 AND 1000000),
    
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Detailed explanation:
-- CHECK (condition): Evaluates boolean expression on insert/update
-- Rejects the row only if the condition is FALSE; a NULL (unknown) result passes,
-- so pair CHECK with NOT NULL when a value is required
-- Can reference multiple columns
-- Can use functions and operators

-- Test constraints:
INSERT INTO products (sku, name, price_cents, stock_quantity) 
VALUES ('TEST-001', 'Test Product', -100, 0);
-- ERROR: new row for relation "products" violates check constraint "products_price_positive"
-- DETAIL: Failing row contains (1, TEST-001, Test Product, -100, 0, null, ...)

-- Constraint violation details help debugging:
-- Constraint name tells you exactly what failed
-- Failing row shows the data that violated the constraint
```
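
A CHECK constraint rejects a row only when its expression evaluates to FALSE, so a NULL input slips through unless the column is also NOT NULL. A small demonstration on a throwaway temporary table (`check_null_demo` is a hypothetical name):

```sql
CREATE TEMP TABLE check_null_demo (
    weight_grams INTEGER
        CONSTRAINT demo_weight_reasonable
        CHECK (weight_grams BETWEEN 1 AND 1000000)
);

-- Accepted: NULL BETWEEN 1 AND 1000000 evaluates to NULL, not FALSE
INSERT INTO check_null_demo (weight_grams) VALUES (NULL);

-- Rejected: 0 BETWEEN 1 AND 1000000 is FALSE
INSERT INTO check_null_demo (weight_grams) VALUES (0);
-- ERROR: new row for relation "check_null_demo" violates check constraint "demo_weight_reasonable"
```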

### 6.3.2 Complex CHECK Constraints

```sql
CREATE TABLE bookings (
    booking_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    room_id INTEGER NOT NULL,
    check_in_date DATE NOT NULL,
    check_out_date DATE NOT NULL,
    guest_count INTEGER NOT NULL,
    
    -- Multi-column constraints
    CONSTRAINT bookings_dates_valid 
        CHECK (check_out_date > check_in_date),
    -- Check-out must be after check-in
    
    CONSTRAINT bookings_duration_limit
        CHECK (check_out_date - check_in_date <= 30),
    -- Maximum 30-day stay
    
    CONSTRAINT bookings_guests_reasonable
        CHECK (guest_count BETWEEN 1 AND 10),
    
    -- Using functions in CHECK
    CONSTRAINT bookings_no_past_bookings
        CHECK (check_in_date >= CURRENT_DATE),
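    -- Can't book in the past. Caution: CHECK expressions are expected to be immutable,
    -- and CURRENT_DATE is not. The check runs only on INSERT/UPDATE, so stored rows
    -- drift out of compliance, later UPDATEs of old bookings fail, and a dump may
    -- not restore. A trigger is the more robust way to enforce this rule.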
    
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Regex validation in CHECK
CREATE TABLE users (
    user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email VARCHAR(255) NOT NULL UNIQUE
        CONSTRAINT users_email_format 
        CHECK (email ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'),
        -- Basic email format validation
    
    username VARCHAR(50) NOT NULL UNIQUE
        CONSTRAINT users_username_format
        CHECK (username ~ '^[a-zA-Z0-9_]+$'),
        -- Alphanumeric and underscore only
    
    age INTEGER
        CONSTRAINT users_age_range
        CHECK (age IS NULL OR (age >= 13 AND age <= 120)),
        -- Nullable, but if provided must be 13-120
    
    CONSTRAINT users_adult_or_minor
        CHECK (age IS NULL OR age >= 18 OR (age < 18 AND email LIKE '%@parent.%')),
        -- Complex: Minors must have parent email (silly example, but shows complexity)
    
    created_at TIMESTAMPTZ DEFAULT NOW()
);
```

### 6.3.3 Enums vs CHECK Constraints

```sql
-- Option 1: CHECK constraint for status (flexible)
CREATE TABLE orders_check (
    order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    status VARCHAR(20) NOT NULL 
        CONSTRAINT orders_status_check 
        CHECK (status IN ('pending', 'confirmed', 'shipped', 'delivered', 'cancelled'))
);

-- Option 2: ENUM type (compact, ordered)
CREATE TYPE order_status_enum AS ENUM (
    'pending', 'confirmed', 'shipped', 'delivered', 'cancelled'
);

CREATE TABLE orders_enum (
    order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    status order_status_enum NOT NULL DEFAULT 'pending'
);

-- Comparison:
-- CHECK constraint:
-- - Changing the value list is DDL (ALTER TABLE...DROP CONSTRAINT...ADD CONSTRAINT)
-- - Values can be both added and removed; re-adding the constraint re-validates existing rows
-- - No logical ordering (values sort as plain text)
-- - Storage: Variable (text)

-- ENUM:
-- - Compact storage (4 bytes internally)
-- - Ordered (can ORDER BY status and get logical progression)
-- - Harder to modify (ALTER TYPE...ADD VALUE, can't remove values easily)
-- - Adding a value is quick and does not rewrite tables
--   (before PostgreSQL 12 it could not run inside a transaction block)

-- Recommendation:
-- Use ENUM for stable internal state machines (connection states, job statuses)
-- Use CHECK constraints for business values that might change (order types, categories)
-- Use lookup tables for values needing metadata (descriptions, external codes)
```
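
The two options evolve differently when a new status appears. A sketch against `orders_check` and `order_status_enum` from above; the `'returned'` status is a hypothetical addition used only for illustration:

```sql
-- ENUM: add a value at a chosen position in the ordering
ALTER TYPE order_status_enum ADD VALUE IF NOT EXISTS 'returned' AFTER 'delivered';

-- CHECK: swap the constraint in one statement so the column is never unconstrained
ALTER TABLE orders_check
    DROP CONSTRAINT orders_status_check,
    ADD CONSTRAINT orders_status_check
        CHECK (status IN ('pending', 'confirmed', 'shipped',
                          'delivered', 'returned', 'cancelled'));

-- ENUM values sort by declaration order, not alphabetically
SELECT order_id, status
FROM orders_enum
ORDER BY status;
```

Re-adding the CHECK scans existing rows to validate them, which is the main cost of this approach on large tables.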

## 6.4 UNIQUE Constraints: Uniqueness Guarantees

UNIQUE constraints ensure no duplicate values exist in a column or combination of columns.

### 6.4.1 Single and Multi-Column Unique

```sql
-- Single column unique (column constraint syntax)
CREATE TABLE users (
    user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email VARCHAR(255) NOT NULL UNIQUE,  -- Implicit unique constraint
    -- Creates index automatically
    username VARCHAR(50) NOT NULL UNIQUE
);

-- Equivalent table constraint syntax (explicit naming)
CREATE TABLE users_explicit (
    user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email VARCHAR(255) NOT NULL,
    username VARCHAR(50) NOT NULL,
    
    CONSTRAINT users_email_unique UNIQUE (email),
    CONSTRAINT users_username_unique UNIQUE (username)
    -- Explicit names are better for debugging and migrations
);

-- Multi-column unique (combination must be unique)
CREATE TABLE user_permissions (
    user_id BIGINT NOT NULL REFERENCES users(user_id),
    resource_type VARCHAR(50) NOT NULL,
    resource_id BIGINT NOT NULL,
    permission_level VARCHAR(20) NOT NULL,
    
    -- User can have only one permission entry per resource.
    -- The primary key already enforces this; a separate UNIQUE constraint
    -- on the same columns would only add a redundant second index.
    PRIMARY KEY (user_id, resource_type, resource_id)
    -- For join tables, the composite primary key usually doubles as the uniqueness rule
);

-- Partial unique indexes (conditional uniqueness)
CREATE TABLE active_promotions (
    promotion_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    product_id BIGINT NOT NULL REFERENCES products(product_id),
    discount_percent INTEGER NOT NULL,
    start_date DATE NOT NULL,
    end_date DATE NOT NULL,
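    is_active BOOLEAN NOT NULL DEFAULT true
);

-- Only one active promotion per product.
-- A UNIQUE table constraint cannot take a WHERE clause, so conditional
-- uniqueness is declared as a partial unique index instead:
CREATE UNIQUE INDEX one_active_promotion_per_product
    ON active_promotions (product_id)
    WHERE is_active;
-- Enforces uniqueness only among rows where is_active = true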

-- Explanation:
-- Partial unique indexes are powerful for soft deletes, active flags
-- Allows multiple inactive promotions per product
-- Only one active promotion per product enforced
```
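
A unique constraint also gives `INSERT ... ON CONFLICT` a target to arbitrate on, which turns "insert or update" into a single statement. A sketch against `user_permissions`, with illustrative values and assuming a user with `user_id = 1` exists:

```sql
INSERT INTO user_permissions (user_id, resource_type, resource_id, permission_level)
VALUES (1, 'document', 42, 'editor')
ON CONFLICT (user_id, resource_type, resource_id)
DO UPDATE SET permission_level = EXCLUDED.permission_level;
-- EXCLUDED refers to the row that was proposed for insertion
```

The column list in `ON CONFLICT` has to match a unique index or constraint exactly; otherwise PostgreSQL rejects the statement.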

### 6.4.2 Unique with NULL Handling

```sql
-- PostgreSQL unique constraints allow multiple NULLs by default (NULLs are treated as distinct)
-- This is SQL standard behavior; PostgreSQL 15+ can opt out with NULLS NOT DISTINCT

CREATE TABLE user_phones (
    phone_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id BIGINT NOT NULL REFERENCES users(user_id),
    phone_type VARCHAR(20) NOT NULL,  -- 'mobile', 'home', 'work'
    phone_number VARCHAR(20),
    
    -- User can have multiple NULL phone numbers (unlimited)
    -- But if phone_number is provided, must be unique per type
    CONSTRAINT unique_phone_per_type 
        UNIQUE (user_id, phone_type, phone_number)
);

-- This allows:
-- (1, 'mobile', NULL)
-- (1, 'mobile', NULL)  -- Another NULL is allowed!
-- (1, 'home', '555-1234')
-- But prevents:
-- (1, 'home', '555-1234')  -- Duplicate

-- If you need to enforce "only one NULL" (rare), use partial index:
CREATE UNIQUE INDEX idx_one_null_phone 
    ON user_phones(user_id, phone_type) 
    WHERE phone_number IS NULL;
-- This prevents multiple NULLs for same user/type combination
```
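
On PostgreSQL 15 and later, the same effect can be declared on the constraint itself with `NULLS NOT DISTINCT`. A sketch on a hypothetical variant of the table (`user_phones_strict`), leaving out the foreign key so the example stands alone:

```sql
CREATE TABLE user_phones_strict (
    phone_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id BIGINT NOT NULL,
    phone_type VARCHAR(20) NOT NULL,
    phone_number VARCHAR(20),
    CONSTRAINT unique_phone_per_type_strict
        UNIQUE NULLS NOT DISTINCT (user_id, phone_type, phone_number)
);

INSERT INTO user_phones_strict (user_id, phone_type, phone_number)
VALUES (1, 'mobile', NULL);  -- accepted

INSERT INTO user_phones_strict (user_id, phone_type, phone_number)
VALUES (1, 'mobile', NULL);  -- rejected: the two NULLs now count as equal
```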

## 6.5 NOT NULL and DEFAULT Constraints

### 6.5.1 NOT NULL Constraints

```sql
-- NOT NULL ensures column cannot contain NULL
-- Critical for data integrity

CREATE TABLE orders (
    order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    customer_id BIGINT NOT NULL,  -- Every order must have a customer
    order_date DATE NOT NULL DEFAULT CURRENT_DATE,
    status VARCHAR(20) NOT NULL DEFAULT 'pending',
    total_cents INTEGER NOT NULL,
    
    -- Nullable fields (optional data)
    notes TEXT,  -- Can be NULL (no notes)
    shipped_at TIMESTAMPTZ,  -- NULL until shipped
    tracking_number VARCHAR(100)  -- NULL if not shipped
);

-- Adding NOT NULL to existing column (requires care)
-- Step 1: Update existing NULLs
UPDATE orders SET status = 'pending' WHERE status IS NULL;

-- Step 2: Add constraint
ALTER TABLE orders 
    ALTER COLUMN status SET NOT NULL;

-- Optionally set a DEFAULT so future inserts that omit the column get a value
ALTER TABLE orders 
    ALTER COLUMN status SET DEFAULT 'pending';
-- Note: This doesn't change existing NULLs, just sets default for new rows

-- Adding NOT NULL with check (safer)
DO $$
BEGIN
    -- Check if any NULLs exist
    IF EXISTS (SELECT 1 FROM orders WHERE status IS NULL) THEN
        RAISE EXCEPTION 'Cannot add NOT NULL: NULL values exist in status column';
    END IF;
    
    ALTER TABLE orders ALTER COLUMN status SET NOT NULL;
END $$;
```
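
On large tables, `SET NOT NULL` scans every row while holding a strong lock. One way to shorten that window, sketched here under the assumption of PostgreSQL 12 or later, is to prove the absence of NULLs with a CHECK constraint first. The constraint name `orders_status_not_null` is hypothetical.

```sql
-- 1. Add the rule without scanning existing rows (brief lock only)
ALTER TABLE orders
    ADD CONSTRAINT orders_status_not_null
    CHECK (status IS NOT NULL) NOT VALID;

-- 2. Validate existing rows; this scans the table but does not block writes
ALTER TABLE orders VALIDATE CONSTRAINT orders_status_not_null;

-- 3. SET NOT NULL can rely on the validated CHECK and skip its own scan
ALTER TABLE orders ALTER COLUMN status SET NOT NULL;

-- 4. The CHECK is now redundant
ALTER TABLE orders DROP CONSTRAINT orders_status_not_null;
```

Each step is a separate statement, so a failure in step 2 (because NULLs still exist) leaves the table usable and the remaining steps can be retried after cleanup.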

### 6.5.2 DEFAULT Values

```sql
-- DEFAULT provides value when INSERT omits column

CREATE TABLE audit_log (
    log_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    action VARCHAR(50) NOT NULL,
    table_name VARCHAR(100) NOT NULL,
    record_id BIGINT NOT NULL,
    
    -- Automatic timestamp
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    
    -- Automatic user (the session's database role, which is often a shared application role rather than the end user)
    created_by VARCHAR(100) DEFAULT current_user,
    
    -- Computed default (not simple value)
    log_date DATE DEFAULT CURRENT_DATE,
    
    -- Random identifier
    correlation_id UUID DEFAULT gen_random_uuid()
);

-- Default expressions evaluation:
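-- NOW(): Returns the transaction start time, so every row inserted in one transaction gets the same value
--        (use clock_timestamp() when actual wall-clock time is needed)
-- CURRENT_DATE: Also based on the transaction start time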
-- gen_random_uuid(): Evaluated per row (different for each row)

-- Overriding defaults:
INSERT INTO audit_log (action, table_name, record_id, created_at)
VALUES ('DELETE', 'users', 123, '2024-01-15 10:00:00');
-- Explicitly provided created_at overrides DEFAULT

-- Default with function call (advanced)
CREATE OR REPLACE FUNCTION next_order_number()
RETURNS VARCHAR AS $$
DECLARE
    next_num INTEGER;
    year_prefix VARCHAR;
BEGIN
    year_prefix := to_char(current_date, 'YYYY');
    SELECT COALESCE(MAX(NULLIF(regexp_replace(order_number, '^' || year_prefix || '-', ''), '')), '0')::INTEGER + 1
    INTO next_num
    FROM orders
    WHERE order_number LIKE year_prefix || '-%';
    
    RETURN year_prefix || '-' || LPAD(next_num::TEXT, 6, '0');
END;
$$ LANGUAGE plpgsql;

CREATE TABLE orders (
    order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    order_number VARCHAR(20) NOT NULL DEFAULT next_order_number(),
    -- Generates '2024-000001', '2024-000002', etc.
    customer_id BIGINT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);
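-- Caution: This function is not concurrency-safe. Two sessions can read the same MAX()
-- and generate the same number, and the MAX() lookup gets slower as the table grows.
-- Add a UNIQUE constraint on order_number if you keep this approach;
-- prefer sequences/identity for simple incrementing IDs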
```
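
A sequence-backed default avoids the read-then-write race of a `MAX()` lookup. A minimal sketch, using hypothetical names (`order_number_seq`, `orders_numbered`); unlike the function above, the counter does not restart each year and can leave gaps after rollbacks:

```sql
CREATE SEQUENCE order_number_seq;

CREATE TABLE orders_numbered (
    order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    -- nextval() never hands the same value to two sessions
    order_number VARCHAR(20) NOT NULL UNIQUE
        DEFAULT to_char(CURRENT_DATE, 'YYYY') || '-' ||
                lpad(nextval('order_number_seq')::TEXT, 6, '0'),
    customer_id BIGINT NOT NULL
);

INSERT INTO orders_numbered (customer_id)
VALUES (1)
RETURNING order_number;
```

Note that `lpad` truncates values longer than the target width, so the six-digit padding only holds for the first 999,999 numbers; the UNIQUE constraint would surface any collision after that.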

## 6.6 Generated Columns: Derived Data

Generated columns compute their values from other columns automatically. They are always up-to-date and cannot be modified directly.

### 6.6.1 STORED Generated Columns

```sql
-- Generated columns (PostgreSQL 12+)
CREATE TABLE products (
    product_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name TEXT NOT NULL,
    price_cents INTEGER NOT NULL CHECK (price_cents > 0),
    
    -- Generated column: price in dollars (for convenience)
    price_dollars NUMERIC(10, 2) 
        GENERATED ALWAYS AS (price_cents / 100.0) STORED,
    -- STORED: Computed on write/update, stored on disk
    -- Can be indexed, used in foreign keys (unlike virtual columns)
    
    -- Generated from text
    search_vector TSVECTOR
        GENERATED ALWAYS AS (to_tsvector('english', name)) STORED,
    -- Full-text search vector automatically maintained
    
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Usage:
INSERT INTO products (name, price_cents) VALUES ('Widget', 999);
-- price_dollars automatically set to 9.99
-- search_vector automatically set to 'widget':1

UPDATE products SET price_cents = 1299 WHERE product_id = 1;
-- price_dollars automatically updates to 12.99

-- Cannot modify generated column directly:
UPDATE products SET price_dollars = 15.00 WHERE product_id = 1;
-- ERROR: column "price_dollars" can only be updated to DEFAULT
-- DETAIL: Column "price_dollars" is a generated column.

-- Index on generated column (useful for search)
CREATE INDEX idx_products_search ON products USING GIN (search_vector);
```
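
With the GIN index in place, a search only has to match against the stored vector. A short sketch of a query against the `products` table above:

```sql
-- @@ tests whether the stored tsvector matches the tsquery;
-- using the same 'english' configuration keeps stemming consistent
SELECT product_id, name, price_dollars
FROM products
WHERE search_vector @@ to_tsquery('english', 'widget')
ORDER BY price_dollars;
```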

### 6.6.2 Virtual Generated Columns (PostgreSQL 18 and Later)

Note: Through PostgreSQL 17, only STORED generated columns exist. VIRTUAL generated columns (computed on read, not stored) arrive in PostgreSQL 18; other databases (MySQL, SQL Server) have supported them for longer. On earlier PostgreSQL versions, use views or functions for computed-on-read behavior.

```sql
-- Workaround: Use view for virtual computed columns
-- (assumes products also has a cost_cents INTEGER column, which the table above does not define)
CREATE VIEW products_with_margin AS
SELECT 
    product_id,
    name,
    price_cents,
    price_cents / 100.0 as price_dollars,  -- Computed on read
    cost_cents,
    (price_cents - cost_cents)::NUMERIC / price_cents * 100 as margin_percent
FROM products;

-- View is always up-to-date (computed at query time)
-- But cannot be indexed directly (use materialized view if needed)
```

## 6.7 Exclusion Constraints: Advanced Uniqueness

Exclusion constraints enforce that no two rows satisfy a certain operator condition. They are essential for time-slot allocation, non-overlapping ranges, and complex uniqueness rules.

### 6.7.1 Time Slot Exclusion (No Double-Booking)

```sql
-- Enable btree_gist extension (required for scalar exclusion constraints)
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE room_reservations (
    reservation_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    room_id INTEGER NOT NULL,
    user_id BIGINT NOT NULL REFERENCES users(user_id),
    
    -- Using tstzrange (range of timestamps with time zone)
    during TSTZRANGE NOT NULL,
    
    -- Exclusion constraint: No overlapping ranges for same room
    CONSTRAINT no_double_booking 
        EXCLUDE USING GIST (
            room_id WITH =,           -- Same room
            during WITH &&            -- Overlapping time range
        )
    -- && is the overlap operator for ranges
);

-- Detailed explanation:
-- EXCLUDE USING GIST: Uses GiST index for exclusion checking
-- room_id WITH =: If room_id is equal
-- during WITH &&: And during ranges overlap (&& operator)
-- Then reject the insert

-- Test:
INSERT INTO room_reservations (room_id, user_id, during)
VALUES (101, 1, '[2024-01-15 09:00, 2024-01-15 12:00)');

INSERT INTO room_reservations (room_id, user_id, during)
VALUES (101, 2, '[2024-01-15 11:00, 2024-01-15 14:00)');
-- ERROR: conflicting key value violates exclusion constraint "no_double_booking"
-- DETAIL: Key (room_id, during)=(101, ["2024-01-15 11:00:00","2024-01-15 14:00:00")) 
-- conflicts with existing key (room_id, during)=(101, ["2024-01-15 09:00:00","2024-01-15 12:00:00")).

-- But different rooms can overlap:
INSERT INTO room_reservations (room_id, user_id, during)
VALUES (102, 2, '[2024-01-15 11:00, 2024-01-15 14:00)');
-- Succeeds (different room_id)

-- Adjacent bookings (back-to-back) are allowed:
INSERT INTO room_reservations (room_id, user_id, during)
VALUES (101, 3, '[2024-01-15 12:00, 2024-01-15 15:00)');
-- Succeeds (12:00 is exclusive end of first booking, inclusive start of third)
```
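
The overlap operator that the constraint relies on is also useful for checking availability before attempting an insert. A sketch against `room_reservations`, with an illustrative one-hour slot:

```sql
-- Which existing reservations would conflict with a proposed slot?
SELECT reservation_id, user_id, during
FROM room_reservations
WHERE room_id = 101
  AND during && tstzrange('2024-01-15 10:00+00', '2024-01-15 11:00+00', '[)');
```

An empty result suggests the slot is free, but only the exclusion constraint guarantees it, since another session can book the room between the check and the insert.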

### 6.7.2 Exclusion with Additional Conditions

```sql
-- Exclude overlapping ranges, but allow different statuses
-- (e.g., cancelled bookings don't block new bookings)

CREATE TABLE room_reservations_v2 (
    reservation_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    room_id INTEGER NOT NULL,
    user_id BIGINT NOT NULL,
    during TSTZRANGE NOT NULL,
    status VARCHAR(20) NOT NULL DEFAULT 'confirmed',
    
    -- Only confirmed/active bookings block each other
    CONSTRAINT no_active_double_booking 
        EXCLUDE USING GIST (
            room_id WITH =,
            during WITH &&
        )
        WHERE (status = 'confirmed')  -- Partial exclusion constraint
);

-- Cancelled bookings don't block:
INSERT INTO room_reservations_v2 (room_id, user_id, during, status)
VALUES (101, 1, '[2024-01-15 09:00, 2024-01-15 12:00)', 'cancelled');

INSERT INTO room_reservations_v2 (room_id, user_id, during, status)
VALUES (101, 2, '[2024-01-15 10:00, 2024-01-15 13:00)', 'confirmed');
-- Succeeds because first booking is cancelled (WHERE condition excludes it)
```
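
The `WHERE` clause is re-evaluated on updates as well as inserts, so a cancelled booking cannot silently come back to life on top of a confirmed one. The following sketch assumes the two rows inserted above are the only ones in `room_reservations_v2`.

```sql
-- Try to reinstate the cancelled 09:00-12:00 booking for room 101
UPDATE room_reservations_v2
SET status = 'confirmed'
WHERE room_id = 101
  AND user_id = 1
  AND status = 'cancelled';
-- ERROR: conflicting key value violates exclusion constraint "no_active_double_booking"
-- The row now satisfies the WHERE condition and overlaps user 2's confirmed 10:00-13:00 booking

-- Cancelling the confirmed booking first frees the slot
UPDATE room_reservations_v2
SET status = 'cancelled'
WHERE room_id = 101
  AND user_id = 2;

-- Re-running the first UPDATE now succeeds, because no confirmed row overlaps
```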

## 6.8 Deferred Constraints: Transaction-Level Enforcement

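By default, PostgreSQL checks constraints immediately, as each statement runs. Constraints declared `DEFERRABLE` can instead be checked at transaction commit. Only `UNIQUE`, `PRIMARY KEY`, `FOREIGN KEY`, and `EXCLUDE` constraints can be deferred; `CHECK` and `NOT NULL` constraints are always checked immediately.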

### 6.8.1 Deferred Foreign Keys

```sql
-- Scenario: Circular foreign keys or multi-step operations
-- Example: Employee and Department with mutual references

CREATE TABLE departments (
    dept_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name TEXT NOT NULL,
    manager_id BIGINT  -- Will reference employees; the FK is added below, once employees exists
);

CREATE TABLE employees (
    employee_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name TEXT NOT NULL,
    dept_id BIGINT NOT NULL,
    CONSTRAINT valid_department 
        FOREIGN KEY (dept_id) 
        REFERENCES departments(dept_id)
        DEFERRABLE INITIALLY DEFERRED
);

-- A foreign key cannot reference a table that does not exist yet,
-- so the departments -> employees constraint is added after both tables are created
ALTER TABLE departments
    ADD CONSTRAINT valid_manager 
        FOREIGN KEY (manager_id) 
        REFERENCES employees(employee_id)
        DEFERRABLE INITIALLY DEFERRED;
-- DEFERRABLE: Can be deferred
-- INITIALLY DEFERRED: Deferred by default (check at commit)
-- INITIALLY IMMEDIATE: Immediate by default (can be deferred manually)

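-- If both constraints were checked immediately and manager_id were NOT NULL,
-- neither row could be inserted first:
-- Can't insert department without manager
-- Can't insert manager without department

-- With deferred constraints:
BEGIN;
    SET CONSTRAINTS ALL DEFERRED;  -- Defer all deferrable constraints
    -- (redundant for INITIALLY DEFERRED constraints; required for INITIALLY IMMEDIATE ones)
    
    INSERT INTO departments (name, manager_id) 
    VALUES ('Engineering', NULL);  -- Temporary NULL (allowed because manager_id is nullable)
    
    INSERT INTO employees (name, dept_id) 
    VALUES ('Alice', 1);  -- Assumes the department above received dept_id = 1
    
    UPDATE departments SET manager_id = 1 WHERE dept_id = 1;
    -- Assumes Alice received employee_id = 1; constraint check deferred
    
COMMIT;  -- Constraints checked here, all valid

-- If any deferred constraint were violated, COMMIT would fail and the whole transaction would be rolled back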
```
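
In the transaction above, the department starts with a `NULL` manager, so no statement ever points at a missing row. The effect of deferral is easier to see with a deliberate violation. This sketch assumes `departments` and `employees` exist with the deferrable constraints named above; the department id `999` is a hypothetical value chosen because no such row exists.

```sql
-- Case 1: the violation surfaces at COMMIT
BEGIN;
    INSERT INTO employees (name, dept_id)
    VALUES ('Bob', 999);
    -- Accepted for now: valid_department is not checked until commit
COMMIT;
-- ERROR: insert or update on table "employees" violates foreign key constraint "valid_department"
-- DETAIL: Key (dept_id)=(999) is not present in table "departments".
-- The transaction is rolled back; Bob is not stored

-- Case 2: force the check earlier to fail fast
BEGIN;
    INSERT INTO employees (name, dept_id)
    VALUES ('Bob', 999);

    SET CONSTRAINTS valid_department IMMEDIATE;
    -- Raises the same error here instead of at COMMIT
ROLLBACK;
```

Switching a constraint to `IMMEDIATE` partway through a long transaction is a cheap way to find out which step introduced a violation.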

### 6.8.2 Deferred Uniqueness

```sql
-- Swap values in a unique column within a single transaction
CREATE TABLE seats (
    seat_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    row_number INTEGER NOT NULL,
    seat_number INTEGER NOT NULL,
    UNIQUE (row_number, seat_number) DEFERRABLE INITIALLY DEFERRED
);

INSERT INTO seats (row_number, seat_number) VALUES (1, 1), (1, 2);

-- Swap seat numbers
BEGIN;
    SET CONSTRAINTS ALL DEFERRED;
    
    UPDATE seats SET seat_number = 2 WHERE seat_id = 1;
    -- Would violate uniqueness immediately (two seats with number 2)
    -- But deferred until commit
    
    UPDATE seats SET seat_number = 1 WHERE seat_id = 2;
    -- Now unique again
    
COMMIT;  -- Succeeds
```
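
A swap can also be written as a single statement. PostgreSQL checks a non-deferrable unique constraint row by row, so a one-statement swap can fail partway through. A deferrable constraint is checked no earlier than the end of the statement, which is what the SQL standard specifies. A minimal sketch against the `seats` table above:

```sql
-- Swap seat numbers 1 and 2 in row 1 with one UPDATE
UPDATE seats
SET seat_number = CASE seat_number
                      WHEN 1 THEN 2
                      WHEN 2 THEN 1
                  END
WHERE row_number = 1
  AND seat_number IN (1, 2);
-- Succeeds: the deferrable UNIQUE constraint is checked only after both rows
-- have changed (here at commit, because it is INITIALLY DEFERRED)
```

If a single-statement swap is the only thing needed, `DEFERRABLE INITIALLY IMMEDIATE` is enough: the check then runs at the end of each statement instead of waiting for commit.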

## 6.9 Constraint Naming and Documentation

Explicit constraint names are essential for debugging, migrations, and operational clarity.

### 6.9.1 Naming Conventions

```sql
-- Pattern: {table}_{column(s)}_{type}

CREATE TABLE constraint_examples (
    id BIGINT GENERATED ALWAYS AS IDENTITY,
    
    -- Primary Key: {table}_pkey
    CONSTRAINT constraint_examples_pkey PRIMARY KEY (id),
    
    -- Foreign Key: {table}_{column}_fkey
    parent_id BIGINT,
    CONSTRAINT constraint_examples_parent_id_fkey 
        FOREIGN KEY (parent_id) REFERENCES constraint_examples(id),
    
    -- Unique: {table}_{column}_ukey
    code VARCHAR(50),
    CONSTRAINT constraint_examples_code_ukey UNIQUE (code),
    
    -- Check: {table}_{column}_check or {table}_{rule}_check
    status VARCHAR(20),
    CONSTRAINT constraint_examples_status_check 
        CHECK (status IN ('active', 'inactive')),
    
    -- Not Null: (usually not named, but can be)
    name TEXT CONSTRAINT constraint_examples_name_notnull NOT NULL,
    
    -- Default: (not a named constraint, but document it)
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Why naming matters:
-- 1. Error messages show constraint name (easier debugging)
-- 2. Migrations reference names (DROP CONSTRAINT, etc.)
-- 3. Documentation (constraint purpose is clear)
-- 4. Monitoring (identify which constraints cause issues)
```
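
To audit the names actually in use, the system catalog can be queried directly. This is a sketch against the `constraint_examples` table above; `pg_constraint` and `pg_get_constraintdef` are standard PostgreSQL catalog objects, and `legacy_orders` with its constraint names in the second statement is hypothetical.

```sql
-- List every constraint on a table with its type and definition
SELECT
    conname                   AS constraint_name,
    contype                   AS constraint_type,  -- p = primary key, f = foreign key, u = unique, c = check, x = exclusion
    pg_get_constraintdef(oid) AS definition
FROM pg_constraint
WHERE conrelid = 'constraint_examples'::regclass
ORDER BY contype, conname;

-- Bring an auto-generated or legacy name in line with the convention
-- (hypothetical table and constraint names)
ALTER TABLE legacy_orders
    RENAME CONSTRAINT fk1 TO legacy_orders_customer_id_fkey;
```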

### 6.9.2 Adding Constraints to Existing Tables

```sql
-- Adding constraints safely (without locking table for long time)

-- 1. NOT NULL (direct approach)
ALTER TABLE products 
    ALTER COLUMN sku SET NOT NULL;
-- Scans the whole table under an ACCESS EXCLUSIVE lock,
-- blocking reads and writes until every existing row has been checked

-- Safer alternative: Add CHECK constraint first, validate later
ALTER TABLE products 
    ADD CONSTRAINT products_sku_not_null 
    CHECK (sku IS NOT NULL) NOT VALID;
-- NOT VALID: Don't check existing rows (fast; the lock is held only briefly)
-- Constraint enforced for new and updated rows only

-- Then validate in background
ALTER TABLE products 
    VALIDATE CONSTRAINT products_sku_not_null;
-- Scans existing rows under a SHARE UPDATE EXCLUSIVE lock (reads and writes continue)
-- Can be cancelled if it takes too long

-- Then convert to NOT NULL (optional, for semantic clarity)
ALTER TABLE products 
    ALTER COLUMN sku SET NOT NULL;
-- On PostgreSQL 12+, the validated CHECK constraint lets this skip the table scan
-- The CHECK constraint can be dropped afterwards

-- 2. Foreign Key (can be slow on large tables)
-- Create an index on the referencing column first
-- (not required by the constraint, but keeps parent deletes and key updates fast)
CREATE INDEX idx_orders_customer_id ON orders(customer_id);

-- Direct approach: add the constraint in one step
ALTER TABLE orders 
    ADD CONSTRAINT orders_customer_id_fkey 
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id);
-- This validates all existing rows while blocking writes to both tables (can be slow)

-- Safer alternative (run instead of the statement above): Add NOT VALID first
ALTER TABLE orders 
    ADD CONSTRAINT orders_customer_id_fkey 
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
    NOT VALID;
-- Added but not validated (new rows checked, old rows not)

-- Validate later (can run during low traffic)
ALTER TABLE orders 
    VALIDATE CONSTRAINT orders_customer_id_fkey;
-- Scans table but doesn't block reads or writes
```
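
`VALIDATE CONSTRAINT` fails on the first offending row, so it helps to look for violations beforehand and to track which constraints are still waiting for validation. The queries below are a sketch that reuses the `orders` and `customers` tables from the example; `convalidated` is the `pg_constraint` column that records validation status.

```sql
-- Find orders whose customer_id has no matching customer (these would fail validation)
SELECT o.customer_id, count(*) AS orphan_rows
FROM orders o
LEFT JOIN customers c ON c.customer_id = o.customer_id
WHERE o.customer_id IS NOT NULL
  AND c.customer_id IS NULL
GROUP BY o.customer_id
ORDER BY orphan_rows DESC;

-- List constraints that were added NOT VALID and have not been validated yet
SELECT conrelid::regclass AS table_name,
       conname            AS constraint_name
FROM pg_constraint
WHERE NOT convalidated;
```

Fixing or removing the orphan rows before validation makes the `VALIDATE CONSTRAINT` step predictable.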

---

## Chapter Summary

In this chapter, you learned:

1. **Primary Key Strategies**: Use `BIGINT GENERATED ALWAYS AS IDENTITY` as the industry standard for surrogate keys (portable, safe, auto-cleanup). Use natural keys only for immutable, compact, business-meaningful identifiers (ISO codes, not emails). Consider UUIDs for external-facing IDs or distributed systems, preferably in hybrid mode (internal BIGINT, external UUID).

2. **Foreign Keys**: Always specify `ON DELETE` behavior explicitly (`RESTRICT` for safety, `CASCADE` for truly dependent data, `SET NULL` for historical records). Use `ON UPDATE CASCADE` only with natural keys. **Critical**: Index foreign key columns manually. PostgreSQL does not auto-index them, and without an index every parent delete or key update must scan the child table, which is slow and prolongs lock contention.

3. **CHECK Constraints**: Enforce domain rules at the database level (positive prices, valid date ranges, regex patterns). They prevent invalid data regardless of application bugs. Use `NOT VALID` to add them to existing tables with only a brief lock, then run `VALIDATE CONSTRAINT` separately.

4. **UNIQUE Constraints**: Single and multi-column uniqueness, partial unique indexes for conditional uniqueness (e.g., one active promotion per product). Remember PostgreSQL allows multiple NULLs in unique columns (SQL standard behavior).

5. **NOT NULL and DEFAULT**: Use `NOT NULL` for required fields, `DEFAULT` for automatic values (timestamps, UUIDs). Adding `NOT NULL` to existing tables requires care: update existing NULLs first, or add a `CHECK (... IS NOT NULL)` constraint as `NOT VALID` and validate it before running `SET NOT NULL`.

6. **Generated Columns**: Use `GENERATED ALWAYS AS ... STORED` for derived data (price in dollars, search vectors) that must be persisted and indexed. Computed on write/update, stored on disk, cannot be modified directly.

7. **Exclusion Constraints**: Use GiST-based exclusion constraints for complex uniqueness like non-overlapping time ranges (no double-booking). Requires `btree_gist` extension for scalar types. Supports `WHERE` clauses for partial exclusion.

8. **Deferred Constraints**: Use `DEFERRABLE INITIALLY DEFERRED` for circular foreign keys or multi-step operations that temporarily violate constraints. Checked at transaction commit, not immediately.

9. **Constraint Management**: Name constraints explicitly (`{table}_{column}_{type}`) for debugging and migrations. Use `NOT VALID` + `VALIDATE CONSTRAINT` workflow for adding constraints to large tables without long locks.

---

**Next:** In Chapter 7, we will explore CRUD operations and filtering—covering INSERT with RETURNING, UPDATE strategies, DELETE safety patterns, WHERE clause optimization, NULL handling semantics, and pagination techniques that perform at scale.

