# Chapter 25: Extensions and Ecosystem

PostgreSQL's extensibility allows adding functionality through packaged SQL objects and shared libraries without modifying core code. Extensions provide powerful capabilities, ranging from cryptographic functions to geospatial indexing, but they introduce operational complexity around security, replication, and version compatibility. This chapter establishes safe patterns for extension adoption, management, and lifecycle maintenance in production environments.

## 25.1 Extension Architecture and Management

Extensions package SQL objects (functions, types, indexes, tables) together, managed through a centralized catalog system that tracks dependencies and versions.

### 25.1.1 Extension Installation and Registration

```sql
-- View available extensions (present on the server filesystem, whether or not installed in this database):
SELECT * FROM pg_available_extensions 
WHERE name LIKE 'pg%' 
ORDER BY name;

-- Key columns:
-- name: Extension identifier
-- default_version: Version installed by default
-- installed_version: NULL if not installed in current database
-- comment: Description

-- Install an extension into current database:
CREATE EXTENSION IF NOT EXISTS pgcrypto;
-- This executes:
-- 1. Loads shared library (if required)
-- 2. Creates SQL objects (functions, types, operators)
-- 3. Registers in pg_extension catalog
-- 4. Records dependencies in pg_depend

-- Check installed extensions:
SELECT 
    e.extname,
    e.extversion,
    n.nspname as schema,
    c.rolname as owner
FROM pg_extension e
JOIN pg_namespace n ON e.extnamespace = n.oid
JOIN pg_roles c ON e.extowner = c.oid;

-- Extensions create objects in specific schemas:
-- Default: public schema (simpler but can clutter namespace)
-- Better practice: Dedicated schema per extension
CREATE SCHEMA IF NOT EXISTS extensions;
CREATE EXTENSION pgcrypto WITH SCHEMA extensions;
-- Now functions are accessed as: extensions.digest(), extensions.gen_random_uuid()

-- Set search path to include extension schema:
ALTER DATABASE mydb SET search_path = public, extensions;
-- Or set at role level:
ALTER ROLE app_user SET search_path = public, extensions;
```

### 25.1.2 Extension Control Files

```sql
-- Extensions are defined by control files (on server filesystem):
-- /usr/share/postgresql/16/extension/extension_name.control

-- Example pgcrypto.control content:
-- # pgcrypto extension
-- comment = 'cryptographic functions'
-- default_version = '1.3'
-- module_pathname = '$libdir/pgcrypto'
-- relocatable = true

-- Key control file parameters:
-- default_version: Initial version when CREATE EXTENSION runs
-- requires: Dependencies on other extensions
-- schema: Fixed schema (if not relocatable)
-- superuser: Whether superuser required to install

-- Check if extension is relocatable (can move between schemas):
SELECT 
    extname,
    extrelocatable 
FROM pg_extension 
WHERE extname = 'pgcrypto';
-- true = Can use ALTER EXTENSION SET SCHEMA
-- false = Fixed to installation schema (usually for extensions with hardcoded schema references)
```
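
The control-file parameters listed above are also visible from SQL, which avoids needing shell access to the server. A minimal sketch, assuming PostgreSQL 13 or later (the `trusted` column does not exist in older releases) and using `pgcrypto` only as an example name:

```sql
-- One row per version that the server's control and script files provide
SELECT
    name,
    version,
    installed,      -- true if this version is installed in the current database
    superuser,      -- control file: superuser
    trusted,        -- control file: trusted (PostgreSQL 13+)
    relocatable,    -- control file: relocatable
    schema,         -- control file: schema (NULL unless the schema is fixed)
    requires        -- control file: requires
FROM pg_available_extension_versions
WHERE name = 'pgcrypto'
ORDER BY version;
```

If `relocatable` is true, an installed extension can later be moved with `ALTER EXTENSION pgcrypto SET SCHEMA extensions;`.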

## 25.2 Trusted vs Untrusted Extensions

Security classification determines whether extensions can be installed by non-superusers and what execution context they operate in.

### 25.2.1 Trusted Extensions (Safe)

```sql
-- Trusted extensions (PostgreSQL 13+) can be installed by non-superusers with CREATE privilege on the database
-- They run inside the server process but do not expose filesystem or other external resources
-- Marked as trusted in control file: trusted = true

-- Common trusted extensions:
-- uuid-ossp, pgcrypto, citext, pg_trgm, btree_gin, btree_gist
-- (pg_stat_statements is not trusted; installing it requires superuser)

-- Installation by non-superuser:
-- As regular user 'app_owner':
CREATE EXTENSION citext;  -- Succeeds if trusted

-- Use case: Case-insensitive text type
CREATE TABLE users (
    email CITEXT PRIMARY KEY,  -- 'Hello@Example.com' = 'hello@example.com'
    username CITEXT
);
-- Indexes work case-insensitively without functional indexes
```

### 25.2.2 Untrusted Extensions (Privileged)

```sql
-- Untrusted extensions require superuser to install
-- Can access filesystem, network, or execute arbitrary code (C libraries)
-- Examples: plpython3u, plperlu, adminpack (removed in PostgreSQL 17), file_fdw, postgres_fdw

-- Attempt by non-superuser:
CREATE EXTENSION plpython3u;
-- ERROR: permission denied to create extension "plpython3u"
-- HINT: Must be superuser to create this extension.

-- Security implications of untrusted extensions:
-- 1. plpython3u: Python code can read/write filesystem, network calls
-- 2. file_fdw: Can read any file PostgreSQL OS user can access
-- 3. adminpack: Provides file access functions to superusers

-- If application needs untrusted extension:
-- 1. Superuser installs extension
-- 2. Superuser grants EXECUTE on specific functions to application role
-- 3. Application uses wrapper functions with SECURITY DEFINER

-- Example: Wrapping plpython3u (if absolutely necessary):
CREATE EXTENSION plpython3u;  -- Superuser only

-- Superuser creates safe wrapper:
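CREATE FUNCTION calculate_complex_metric(input_data JSONB)
RETURNS DECIMAL AS $$
    # PL/Python body: comments here must use Python's #, not SQL's --
    import json
    data = json.loads(input_data)  # without a transform, JSONB arrives as a JSON string
    if not isinstance(data, dict):
        plpy.error("input_data must be a JSON object")
    # ... complex calculation ...
    result = 0  # placeholder for the computed metric
    return result
$$ LANGUAGE plpython3u SECURITY DEFINER;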

-- Grant to application role:
GRANT EXECUTE ON FUNCTION calculate_complex_metric(JSONB) TO app_role;

-- Application cannot create arbitrary Python code, only use wrapper
```
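
Two details decide whether the wrapper pattern above actually restricts access: new functions are executable by `PUBLIC` by default, and a `SECURITY DEFINER` function resolves unqualified names through the caller's `search_path` unless one is pinned on the function. A hedged sketch of that hardening, reusing `calculate_complex_metric` and `app_role` from the example above:

```sql
-- Remove the default PUBLIC execute privilege so only granted roles can call the wrapper
REVOKE ALL ON FUNCTION calculate_complex_metric(JSONB) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION calculate_complex_metric(JSONB) TO app_role;

-- Pin search_path so objects in caller-writable schemas cannot shadow names used inside the function
ALTER FUNCTION calculate_complex_metric(JSONB)
    SET search_path = pg_catalog, pg_temp;

-- Verify: prosecdef should be true, proconfig should show the pinned search_path,
-- and proacl should list only the owner and app_role
SELECT proname, prosecdef, proconfig, proacl
FROM pg_proc
WHERE proname = 'calculate_complex_metric';
```

Run these statements as the function owner, ideally in the same transaction that creates the function, so there is no window in which `PUBLIC` can execute it.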

### 25.2.3 Extension Security Scanning

```sql
-- Audit installed extensions for security risks:
SELECT 
    e.extname,
    e.extversion,
    CASE 
        WHEN e.extname IN ('plpython2u', 'plpython3u', 'plperlu', 'pltclu') 
            THEN 'UNTRUSTED: Can execute arbitrary code'
        WHEN e.extname IN ('file_fdw', 'adminpack') 
            THEN 'WARNING: File system access'
        WHEN e.extname IN ('dblink', 'postgres_fdw') 
            THEN 'WARNING: Network access to other databases'
        ELSE 'NOT FLAGGED: Verify trusted flag in control file'
    END as security_classification,
    n.nspname as installed_schema
FROM pg_extension e
JOIN pg_namespace n ON e.extnamespace = n.oid;

-- Policy: Document all untrusted extensions in security audit
-- Require: Regular review of functions created by extensions
```
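
The name-based audit above only flags extensions that someone remembered to list. As a complementary check, the catalog can report each installed extension's own `superuser` and `trusted` control-file flags. A sketch assuming PostgreSQL 13 or later; the labels in the `CASE` expression are illustrative:

```sql
SELECT
    e.extname,
    e.extversion,
    v.superuser AS needs_superuser,
    v.trusted,
    CASE
        WHEN v.name IS NULL THEN 'files for this version not found on server: review'
        WHEN v.trusted THEN 'trusted'
        WHEN v.superuser THEN 'superuser-only: review'
        ELSE 'not trusted, no superuser requirement: review'
    END AS install_class
FROM pg_extension e
LEFT JOIN pg_available_extension_versions v
    ON v.name = e.extname
   AND v.version = e.extversion
ORDER BY v.trusted NULLS FIRST, e.extname;
```

The `trusted` flag describes who may install the extension, not everything its functions can do, so extensions such as `dblink` still deserve the manual review described above.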

## 25.3 Essential Extensions Reference

These extensions form the standard toolkit for production PostgreSQL deployments.

### 25.3.1 pgcrypto (Cryptographic Functions)

```sql
CREATE EXTENSION IF NOT EXISTS pgcrypto WITH SCHEMA extensions;

-- Hashing (one-way):
SELECT extensions.digest('password123', 'sha256');  -- Bytea output
SELECT encode(extensions.digest('password123', 'sha256'), 'hex');  -- Hex string (encode() is a core function, not part of pgcrypto)

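-- Use digest() for checksums and fingerprints, not for passwords: a fast, unsalted hash is cheap to brute-force
-- For passwords, prefer crypt() with gen_salt('bf') (bcrypt), or external authentication
-- Never store reversible passwords in database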

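-- Encryption (two-way, symmetric):
-- Encrypt with AES-256 (the default cipher is AES-128, so request it explicitly):
SELECT extensions.pgp_sym_encrypt('sensitive data', 'secret-key', 'cipher-algo=aes256');

-- Decrypt (round trip shown here; in practice the first argument is a bytea column):
SELECT extensions.pgp_sym_decrypt(
    extensions.pgp_sym_encrypt('sensitive data', 'secret-key'),
    'secret-key'
);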

-- Best practices:
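-- 1. Don't store encryption keys in database (use KMS, environment variables); keys passed as SQL literals can also end up in server logs
-- 2. pgcrypto uses OpenSSL (ensure system OpenSSL is updated)
-- 3. Encryption is CPU intensive (pgp_sym_encrypt derives a key from the passphrase on every call); measure on your hardware and batch if possible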

-- Random UUID generation (alternative to uuid-ossp; since PostgreSQL 13 gen_random_uuid() is also built into core):
SELECT extensions.gen_random_uuid();  -- UUID v4 (random)
```
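
For password storage specifically, pgcrypto's `crypt()` and `gen_salt()` provide salted, deliberately slow hashing, unlike `digest()`. A minimal sketch, assuming pgcrypto is installed in the `extensions` schema as above; the `accounts` table and its columns are hypothetical:

```sql
-- Hypothetical table for illustration
CREATE TABLE accounts (
    login         TEXT PRIMARY KEY,
    password_hash TEXT NOT NULL
);

-- Store: gen_salt('bf', 12) selects bcrypt with a work factor of 12 and a fresh random salt
INSERT INTO accounts (login, password_hash)
VALUES ('alice', extensions.crypt('password123', extensions.gen_salt('bf', 12)));

-- Verify: re-hash the candidate password using the stored hash as the salt, then compare
SELECT login
FROM accounts
WHERE login = 'alice'
  AND password_hash = extensions.crypt('password123', password_hash);
```

The plaintext password still travels inside the SQL statement, so pass it as a bind parameter and check statement logging settings before relying on this in production.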

### 25.3.2 uuid-ossp (UUID Generation)

```sql
CREATE EXTENSION IF NOT EXISTS "uuid-ossp" WITH SCHEMA extensions;  -- Name contains a hyphen, so it must be double-quoted

-- UUID v1 (timestamp + MAC address):
SELECT extensions.uuid_generate_v1();  -- Timestamp-based, contains MAC address

-- UUID v1 with random MAC (more private):
SELECT extensions.uuid_generate_v1mc();

-- UUID v4 (random):
SELECT extensions.uuid_generate_v4();  -- Random, like gen_random_uuid()

-- UUID v3 (namespace-based, deterministic):
SELECT extensions.uuid_generate_v3(
    extensions.uuid_ns_url(), 
    'https://example.com/user/123'
);  -- Always same output for same input

-- UUID v5 (SHA-1 namespace):
SELECT extensions.uuid_generate_v5(
    extensions.uuid_ns_dns(), 
    'example.com'
);

-- When to use which:
-- v1: Timestamp-based, but the low-order time bits come first, so values do not sort chronologically; reveals MAC and timestamp
-- v4: Random, no ordering, no privacy leaks, but index fragmentation
-- v3/v5: Deterministic (hash-based), good for merging distributed datasets
```

### 25.3.3 citext (Case-Insensitive Text)

```sql
CREATE EXTENSION IF NOT EXISTS citext;

-- Case-insensitive comparisons:
CREATE TABLE users (
    email CITEXT PRIMARY KEY,
    username CITEXT
);

-- Queries are case-insensitive without LOWER():
SELECT * FROM users WHERE email = 'User@Example.COM';
-- Matches: user@example.com, USER@EXAMPLE.COM, etc.

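-- Index usage:
-- The PRIMARY KEY on email already provides a case-insensitive B-tree index; other columns work the same way:
CREATE INDEX idx_username ON users(username);  -- Standard B-tree works case-insensitively
-- vs TEXT approach:
-- CREATE INDEX idx_email_lower ON users(LOWER(email));
-- Query: WHERE LOWER(email) = LOWER('User@Example.COM')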

-- Limitations:
-- 1. Slightly slower than TEXT (case-folding overhead)
-- 2. LIKE and regular-expression operators are also case-insensitive on CITEXT; cast to TEXT for case-sensitive matching
-- 3. Case folding depends on the database's LC_CTYPE setting, fixed when the database is created

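-- Migration from TEXT to CITEXT:
ALTER TABLE users ALTER COLUMN email TYPE CITEXT;
-- Existing data preserved, comparisons now case-insensitive
-- The change fails on a unique column if existing values differ only by case, so deduplicate first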
```

### 25.3.4 pg_trgm (Trigram Matching)

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Trigrams: Breaking text into three-character groups
-- 'hello' -> '  h', ' he', 'hel', 'ell', 'llo', 'lo '

-- Similarity search (fuzzy matching):
SELECT similarity('hello', 'hallo');  -- Returns ~0.33 (3 shared trigrams out of 9 distinct)

-- Finding similar words:
SELECT word 
FROM dictionary 
WHERE word % 'postgress'  -- % = similarity operator
ORDER BY word <-> 'postgress';  -- <-> = distance operator (lower is closer)

-- GIN index for fast similarity:
CREATE INDEX idx_word_trgm ON dictionary USING GIN(word gin_trgm_ops);

-- Full-text search vs Trigrams:
-- Full-text: Linguistic (stemming, dictionaries), word-based
-- Trigrams: Character-based, good for typos, code, DNA sequences

-- Use cases:
-- 1. Typo-tolerant search boxes
-- 2. Near-duplicate detection
-- 3. Pattern matching with leading wildcards (LIKE '%suffix')
```
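
To see why a similarity score comes out as it does, `show_trgm()` exposes the trigram sets that `similarity()` compares. A `gin_trgm_ops` index can also serve `LIKE` patterns with leading wildcards. A sketch reusing the `dictionary` table and `idx_word_trgm` index from above:

```sql
-- Inspect the trigram sets behind a similarity score
SELECT show_trgm('hello') AS hello_trigrams,
       show_trgm('hallo') AS hallo_trigrams;
-- The two sets share 3 trigrams out of 9 distinct ones, so similarity is 3/9

-- Threshold used by the % operator (default 0.3); the setting becomes visible
-- once pg_trgm has been loaded in the session, which the call above ensures
SHOW pg_trgm.similarity_threshold;

-- Leading-wildcard search; with idx_word_trgm in place the planner can choose the GIN index
EXPLAIN
SELECT word
FROM dictionary
WHERE word LIKE '%gres%';
```

Whether the planner actually picks the index depends on table size and statistics, so check the `EXPLAIN` output on realistic data.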

### 25.3.5 pg_stat_statements (Query Statistics)

```sql
-- Requires shared_preload_libraries configuration (postgresql.conf):
-- shared_preload_libraries = 'pg_stat_statements'

-- PostgreSQL must be restarted after changing shared_preload_libraries

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- View top queries by total time:
SELECT 
    substring(query, 1, 100) as query_snippet,
    calls,
    round(total_exec_time::numeric, 2) as total_time_ms,
    round(mean_exec_time::numeric, 2) as mean_time_ms,
    rows,
    100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0) as hit_percent
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Reset statistics (by default only superusers can call pg_stat_statements_reset):
SELECT pg_stat_statements_reset();

-- Configuration in postgresql.conf:
-- pg_stat_statements.max = 10000
-- pg_stat_statements.track = all
-- pg_stat_statements.track_utility = on
-- pg_stat_statements.track_planning = off  -- PG 13+; off by default because of its overhead

-- Security note: constants in most statements are replaced by placeholders ($1, $2, ...),
-- but utility commands may be stored with their literals intact and can expose sensitive data
-- Consider track_utility = off, and limit who can read query text (superusers and members of pg_read_all_stats)
```

### 25.3.6 btree_gin and btree_gist

```sql
-- Enable B-tree operations in GIN/GiST indexes (for hybrid indexes)

CREATE EXTENSION IF NOT EXISTS btree_gin;
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- Use case: GIN index on multiple columns with different operators
CREATE INDEX idx_hybrid ON documents 
USING GIN (
    title,           -- B-tree equality (via btree_gin)
    to_tsvector('english', content),  -- Full-text
    tags             -- Array containment
);

-- Or GiST index supporting both equality and range:
CREATE INDEX idx_range_gist ON events 
USING GIST (
    event_type,      -- Equality (btree_gist)
    during           -- Range overlap (native GiST)
);
```
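
A common reason to install `btree_gist` is an exclusion constraint that mixes scalar equality with range overlap. A sketch, assuming a hypothetical `room_bookings` table that is not part of the examples above:

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- Hypothetical table for illustration
CREATE TABLE room_bookings (
    room_id INTEGER   NOT NULL,
    during  TSTZRANGE NOT NULL,
    -- No two rows may have the same room_id and overlapping time ranges
    EXCLUDE USING GIST (room_id WITH =, during WITH &&)
);

INSERT INTO room_bookings VALUES (1, '[2024-01-01 09:00+00, 2024-01-01 10:00+00)');

-- Rejected by the exclusion constraint: same room, overlapping range
INSERT INTO room_bookings VALUES (1, '[2024-01-01 09:30+00, 2024-01-01 11:00+00)');
```

Without `btree_gist`, the `room_id WITH =` part has no GiST operator class for `INTEGER` and the table definition is rejected.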

## 25.4 Extension Versioning and Updates

Extensions evolve independently of PostgreSQL core, requiring explicit updates that may involve data migrations.

### 25.4.1 Version Management

```sql
-- Check current version vs available:
SELECT 
    extname,
    extversion as current_version,
    (SELECT default_version FROM pg_available_extensions WHERE name = e.extname) as latest_available
FROM pg_extension e;

-- Update extension to latest version:
ALTER EXTENSION pgcrypto UPDATE;
-- Or specify version:
ALTER EXTENSION pgcrypto UPDATE TO '1.3';

-- What happens during update:
-- PostgreSQL executes update scripts: 
-- /usr/share/postgresql/16/extension/pgcrypto--1.2--1.3.sql
-- These contain ALTER FUNCTION, CREATE OR REPLACE, etc.

-- If update fails (breaking changes):
-- Extension remains at old version
-- Error message indicates conflicting objects

-- Downgrades work only if the extension ships a downgrade script, which is rare
-- Otherwise: DROP EXTENSION CASCADE, reinstall old version (risky, data loss)
```
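
Before running `ALTER EXTENSION ... UPDATE`, it can help to check which version transitions the installed script files actually support. The built-in `pg_extension_update_paths()` function reports this; a sketch using `pgcrypto` as the example:

```sql
-- path is NULL when no chain of update scripts connects source to target
SELECT source, target, path
FROM pg_extension_update_paths('pgcrypto')
WHERE path IS NOT NULL
ORDER BY source, target;
```

A row whose `target` is older than its `source` appears only when the extension ships a script for that direction.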

### 25.4.2 Migration Implications

```sql
-- Breaking changes in extension updates:
-- Example: PostGIS major versions often require dump/restore

-- Pre-update safety check:
BEGIN;
ALTER EXTENSION postgis UPDATE TO '3.4.0';
-- Test application queries
ROLLBACK;  -- If issues found

-- Production update strategy:
-- 1. Test in staging with production-like data
-- 2. Create database backup before update
-- 3. Update during maintenance window
-- 4. Verify with application smoke tests

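-- Extension dependencies:
-- Some extensions depend on others (e.g., postgis_raster depends on postgis)
-- Update order matters: Update dependencies first
-- Extension-to-extension requirements are recorded as normal ('n') dependencies:
SELECT e.extname, r.extname AS requires
FROM pg_depend d
JOIN pg_extension e ON d.objid = e.oid
JOIN pg_extension r ON d.refobjid = r.oid
WHERE d.classid = 'pg_extension'::regclass
  AND d.refclassid = 'pg_extension'::regclass
  AND d.deptype = 'n';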
```

## 25.5 Operational Considerations

Extensions affect backup, replication, and cloud deployment strategies.

### 25.5.1 Backup and Restore

```sql
-- pg_dump behavior:
-- By default, pg_dump includes CREATE EXTENSION statements
-- Restoring to fresh database automatically installs extensions if available

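-- Issues:
-- 1. If extension not available on target (different PostgreSQL version or OS):
-- pg_restore fails with an error such as: extension "postgis" is not available

-- Solution 1: Pre-install extensions on target
-- Solution 2: Exclude the extension from the restore (pg_restore -l to write a list file, edit it, then pg_restore -L),
--             or dump with --exclude-extension (PostgreSQL 17+), then install manually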

-- Custom objects created by extensions:
-- Extension-owned objects are dropped with DROP EXTENSION
-- But if you created indexes using extension operators, those indexes depend on extension

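-- Listing relations that belong to an extension (deptype 'e' = extension member):
SELECT 
    n.nspname || '.' || dependent.relname as object,
    dependent.relkind as type,
    extension.extname as depends_on
FROM pg_depend
JOIN pg_class dependent ON pg_depend.objid = dependent.oid
JOIN pg_namespace n ON dependent.relnamespace = n.oid
JOIN pg_extension extension ON pg_depend.refobjid = extension.oid
WHERE pg_depend.classid = 'pg_class'::regclass
  AND pg_depend.refclassid = 'pg_extension'::regclass
  AND pg_depend.deptype = 'e';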
```

### 25.5.2 Replication and High Availability

```sql
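-- Logical replication:
-- DDL is not replicated, so CREATE EXTENSION must be run on the subscriber for any extension
-- whose types, functions, or operators the replicated tables use
-- Extension updates must be coordinated across publisher and subscribers

-- Streaming replication:
-- Physical replicas receive the extension's catalog entries through WAL,
-- but the extension's files (shared library, control and SQL scripts) must be installed on each standby's filesystem
-- If extension creates user data in custom tables, that data replicates normally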

-- Cloud provider limitations:
-- AWS RDS: Allows many extensions, some require parameter group changes
-- Google Cloud SQL: Curated list, some extensions restricted
-- Azure: Similar restrictions

-- Check cloud compatibility before adopting extension:
-- Verify extension is in provider's allowlist
-- Check if shared_preload_libraries modification is allowed (needed for pg_stat_statements)

-- Extensions behind connection poolers (pgbouncer):
-- Extension objects live in the database, so they are visible through any pooled connection
-- Session state such as SET search_path does not reliably survive transaction or statement pooling
-- Prefer ALTER ROLE / ALTER DATABASE ... SET search_path, or schema-qualify extension functions
```

### 25.5.3 Performance Monitoring

```sql
-- Some extensions add background workers or statistics collectors
-- Monitor overhead:

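-- pg_stat_statements memory usage:
-- Shared memory grows linearly with pg_stat_statements.max (the per-entry size depends on the PostgreSQL version)
-- Query texts are kept in an external file on disk, not in shared memory
-- Measure the actual allocation (PG 13+):
SELECT name, size FROM pg_shmem_allocations WHERE name LIKE 'pg_stat_statements%';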

-- Check for extension-related background workers:
SELECT * FROM pg_stat_activity WHERE backend_type LIKE '%worker%';

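-- Sessions currently waiting inside extension code
-- (extension-defined wait event names appear from PG 17; earlier releases report a generic "Extension" event):
SELECT * FROM pg_stat_activity 
WHERE wait_event_type = 'Extension';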
```

## 25.6 Extension Selection Policy

Establish organizational standards for extension adoption to prevent operational sprawl and security vulnerabilities.

### 25.6.1 Adoption Criteria

```sql
-- Before approving new extension, verify:

-- 1. Maintenance status:
-- Check last commit date in repository
-- Is it part of PostgreSQL contrib (more stable) or third-party?
-- Contrib extensions: pgcrypto, uuid-ossp, pg_trgm, etc. (ship with PostgreSQL)
-- Third-party: TimescaleDB, Citus, etc. (separate installation)

-- 2. Security audit:
-- Is it trusted? (trusted = true)
-- If untrusted, justify need and document security controls
-- Review C code for buffer overflows (if untrusted)

-- 3. Operational impact:
-- Does it require shared_preload_libraries? (Requires restart)
-- Does it add background workers?
-- Disk space usage for extension objects?

-- 4. Migration path:
-- Can it be uninstalled without data loss?
-- Are there breaking changes between versions?
-- Is there a dump/restore requirement for major updates?

-- 5. Licensing:
-- Compatible with your organization's policies?
-- PostgreSQL license vs GPL vs proprietary?
```

### 25.6.2 Approved Extension Registry

```sql
-- Maintain internal documentation of approved extensions:

-- Tier 1 (Standard): Approved for all databases
-- - pgcrypto: Cryptographic hashing
-- - uuid-ossp: UUID generation
-- - citext: Case-insensitive text
-- - pg_trgm: Fuzzy search
-- - btree_gin/btree_gist: Index optimization
-- - pg_stat_statements: Query statistics (if preload allowed)

-- Tier 2 (Restricted): Requires architecture review
-- - postgres_fdw: Foreign data wrappers (security implications)
-- - pg_cron: Job scheduling (operational complexity)
-- - timescaledb: Time-series (significant architectural impact)

-- Tier 3 (Prohibited): Not allowed in production
-- - plpython3u: Python execution (security risk unless heavily sandboxed)
-- - adminpack: File system access (violation of least privilege)

-- Implementation: Database-level event trigger on CREATE EXTENSION, or documentation
-- Enforce via CI/CD pipeline scanning for CREATE EXTENSION statements
```
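
Where a database-level control is wanted in addition to CI/CD scanning, an event trigger can reject extensions outside the approved list. This is a sketch only: the function name, trigger name, and allowlist contents are illustrative, and event triggers must be created by a superuser.

```sql
CREATE FUNCTION enforce_extension_allowlist() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    cmd      record;
    ext_name name;
BEGIN
    -- Inspect every command reported for the statement that just completed
    FOR cmd IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
        IF cmd.object_type = 'extension' THEN
            SELECT extname INTO ext_name FROM pg_extension WHERE oid = cmd.objid;
            IF ext_name NOT IN ('pgcrypto', 'uuid-ossp', 'citext', 'pg_trgm',
                                'btree_gin', 'btree_gist', 'pg_stat_statements') THEN
                -- Raising here aborts the transaction, which undoes the CREATE EXTENSION
                RAISE EXCEPTION 'extension "%" is not on the approved list', ext_name;
            END IF;
        END IF;
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER extension_allowlist
    ON ddl_command_end
    WHEN TAG IN ('CREATE EXTENSION')
    EXECUTE FUNCTION enforce_extension_allowlist();
```

Event triggers are per-database, so the trigger has to be created in each database (or in a template database) where the policy should apply.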

### 25.6.3 Extension Cleanup

```sql
-- Remove unused extensions to reduce attack surface:

-- Identify removal candidates: extensions that own few or no relations
-- (function usage by applications is not tracked, so treat the result as a starting point)
SELECT 
    e.extname,
    e.extversion,
    COUNT(d.objid) as member_relations,
    pg_size_pretty(COALESCE(SUM(pg_total_relation_size(d.objid::regclass)), 0)) as relation_size
FROM pg_extension e
LEFT JOIN pg_depend d ON d.refclassid = 'pg_extension'::regclass
    AND d.refobjid = e.oid 
    AND d.deptype = 'e'
    AND d.classid = 'pg_class'::regclass
WHERE e.extname NOT IN ('plpgsql')  -- Exclude core extensions
GROUP BY e.extname, e.extversion
HAVING COUNT(d.objid) < 5;  -- Few member relations

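-- Safe removal process:
-- 1. Verify no dependencies (rows other than the extension's own members, deptype 'e'):
SELECT * FROM pg_depend
WHERE refclassid = 'pg_extension'::regclass
  AND refobjid = (SELECT oid FROM pg_extension WHERE extname = 'extension_name')
  AND deptype <> 'e';
-- User objects usually depend on member objects (types, functions); DROP EXTENSION without CASCADE reports those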

-- 2. Backup data if extension created custom tables

-- 3. Drop extension:
DROP EXTENSION IF EXISTS extension_name;
-- Use CASCADE only if you're certain about dependent objects:
-- DROP EXTENSION extension_name CASCADE;  -- Dangers: Drops indexes, functions, etc.

-- 4. Verify application still functions (staging environment)
```
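
Before dropping an extension, it is useful to see exactly which objects belong to it and what a drop would take with it. A sketch using `citext` as a stand-in for the extension under review, intended for a staging database because `DROP EXTENSION` takes locks even when rolled back:

```sql
-- Member objects that DROP EXTENSION would remove (similar to psql's \dx+ citext)
SELECT pg_describe_object(d.classid, d.objid, d.objsubid) AS member_object
FROM pg_depend d
JOIN pg_extension e
    ON d.refclassid = 'pg_extension'::regclass
   AND d.refobjid = e.oid
WHERE e.extname = 'citext'
  AND d.deptype = 'e'
ORDER BY 1;

-- Dry run: without CASCADE the drop fails and lists dependent user objects;
-- if it succeeds, ROLLBACK restores the extension
BEGIN;
DROP EXTENSION citext;
ROLLBACK;
```

An error from the dry run names the columns, indexes, or functions that still depend on the extension, which is the list to resolve before a real removal.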

---

## Chapter Summary

In this chapter, you learned:

1. **Extension Management**: Extensions package SQL objects into versioned bundles managed via `CREATE EXTENSION`, `ALTER EXTENSION UPDATE`, and `DROP EXTENSION`. Install extensions into dedicated schemas (e.g., `extensions`) to avoid namespace pollution and manage search paths explicitly.

2. **Security Classifications**: Trusted extensions (e.g., `pgcrypto`, `citext`) operate within database security boundaries and can be installed by non-superusers. Untrusted extensions (e.g., `plpython3u`, `file_fdw`) execute with OS-level privileges, require superuser installation, and must be wrapped with `SECURITY DEFINER` functions for application access.

3. **Essential Toolkit**: `pgcrypto` provides cryptographic hashing and encryption (CPU-intensive, secure key management required); `uuid-ossp` generates various UUID formats (v1 embeds a timestamp and MAC address, v4 is fully random, v3/v5 are deterministic); `citext` enables transparent case-insensitive comparisons without functional indexes; `pg_trgm` powers fuzzy text matching and leading-wildcard LIKE queries; `pg_stat_statements` requires `shared_preload_libraries` configuration and tracks query performance statistics.

4. **Versioning Strategy**: Extension updates execute SQL migration scripts (`--old--new.sql`) during `ALTER EXTENSION UPDATE`. Breaking changes may require dump/restore (e.g., PostGIS major versions). Test updates in transactions that can roll back; downgrades typically require extension removal and reinstallation.

5. **Operational Impact**: Extensions affect logical replication (must be installed on subscribers), physical replication (catalog entries replicate, but extension files must be installed on each standby), and cloud provider compatibility (check allowlists). Some require `shared_preload_libraries` modifications necessitating restarts. Monitor memory usage for statistics-collecting extensions.

6. **Governance**: Maintain tiered approval lists (Standard/Restricted/Prohibited) with criteria for maintenance status, security auditing, and licensing compliance. Scan for unused extensions periodically and remove them to reduce attack surface and maintenance burden.

**Next:** In Chapter 26, we will begin Part VII (Security and Access Control) with an exploration of Roles, Privileges, and Ownership—covering role design patterns, least privilege principles, object ownership management, and default privilege configurations for secure multi-tenant environments.