Skip to content

Database Optimization

Temp edited this page Nov 3, 2025 · 1 revision

Database Optimization

ANALYZE command and performance tuning for D1 databases.

Overview

Database Optimization in D1 Manager provides tools to improve query performance and maintain database health. The primary optimization technique is running ANALYZE to update query statistics.

Key Features:

  • 🔍 ANALYZE Command - Update query optimizer statistics
  • 📊 Bulk Optimization - Optimize multiple databases
  • ⏱️ Progress Tracking - Monitor optimization status
  • 📈 Performance Monitoring - Track query performance

What is ANALYZE?

Definition

ANALYZE is a SQLite command that:

  • Gathers statistics about table content
  • Updates the sqlite_stat1 table
  • Helps query planner choose optimal execution plans
  • Improves query performance

Command:

ANALYZE;

How It Works

Without ANALYZE:

Query Planner: "I don't know table sizes, I'll guess"
Result: Suboptimal query plans, slower queries

With ANALYZE:

Query Planner: "Table users has 10,000 rows, email column has 9,500 unique values"
Result: Optimal index usage, faster queries

When to Run ANALYZE

Run ANALYZE after:

  • ✅ Creating new indexes
  • ✅ Bulk data imports
  • ✅ Major data changes
  • ✅ Schema modifications
  • ✅ Regular maintenance (weekly/monthly)

Skip ANALYZE for:

  • ❌ Small tables (<100 rows)
  • ❌ After single row changes
  • ❌ Read-only databases
  • ❌ Immediately after previous ANALYZE

Accessing Database Optimization

Single Database Optimization

From database view:

  1. Navigate to a database
  2. Click "Optimize" button
  3. Confirm optimization
  4. Wait for completion
┌──────────────────────────────────────────┐
│ Optimize Database: app-production        │
├──────────────────────────────────────────┤
│ This will run ANALYZE to update query    │
│ optimizer statistics.                    │
│                                          │
│ Benefits:                                │
│ • Improved query performance             │
│ • Better index utilization               │
│ • Optimal query plans                    │
│                                          │
│ Duration: ~5-30 seconds                  │
│                                          │
│ [Cancel] [Optimize]                      │
└──────────────────────────────────────────┘

Bulk Database Optimization

From home page (database list):

  1. Select multiple databases using checkboxes
  2. Click "Optimize Selected"
  3. Review databases to optimize
  4. Click "Start Optimization"
  5. Monitor progress
┌──────────────────────────────────────────┐
│ Optimize 3 Databases                     │
├──────────────────────────────────────────┤
│ Selected databases:                      │
│ • app-production                         │
│ • app-staging                            │
│ • analytics-db                           │
│                                          │
│ This will run ANALYZE on each database.  │
│                                          │
│ [Cancel] [Start Optimization]            │
└──────────────────────────────────────────┘

Progress Tracking

Real-time progress for bulk operations:

┌──────────────────────────────────────────┐
│ Optimization Progress                    │
├──────────────────────────────────────────┤
│ ████████████░░░░░░░░░░░░░░  60%          │
│                                          │
│ ✓ app-production (5.2s)                  │
│ ⟳ app-staging (in progress...)           │
│ ⋯ analytics-db (pending)                 │
│                                          │
│ Completed: 1 of 3                        │
│ Failed: 0                                │
│                                          │
│ [Cancel] [Close when done]               │
└──────────────────────────────────────────┘

Optimization Results

Success

┌──────────────────────────────────────────┐
│ Optimization Complete ✓                  │
├──────────────────────────────────────────┤
│ Database: app-production                 │
│ Duration: 5.2 seconds                    │
│                                          │
│ Query statistics updated successfully.   │
│ Queries should now perform better.       │
│                                          │
│ [Close]                                  │
└──────────────────────────────────────────┘

Bulk Results

┌──────────────────────────────────────────┐
│ Bulk Optimization Complete               │
├──────────────────────────────────────────┤
│ Successfully optimized: 2 databases      │
│ ✓ app-production (5.2s)                  │
│ ✓ app-staging (3.8s)                     │
│                                          │
│ Failed: 1 database                       │
│ ✗ analytics-db (timeout error)           │
│                                          │
│ [Retry Failed] [Close]                   │
└──────────────────────────────────────────┘

Performance Tips

1. Create Appropriate Indexes

Before optimization, ensure indexes exist:

-- Index foreign keys
CREATE INDEX idx_posts_user_id ON posts(user_id);

-- Index frequently queried columns
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_posts_created ON posts(created_at);

-- Index columns used in WHERE clauses
CREATE INDEX idx_orders_status ON orders(status);

-- Then run ANALYZE
ANALYZE;

Why: ANALYZE helps the query planner use indexes effectively.

2. Run ANALYZE After Bulk Changes

Example workflow:

-- 1. Import large dataset
INSERT INTO users SELECT * FROM import_data;  -- 100,000 rows

-- 2. Create indexes
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_created ON users(created_at);

-- 3. Run ANALYZE
ANALYZE;

Why: Statistics reflect actual data distribution.

3. Regular Maintenance Schedule

Recommended frequency:

High-Traffic Databases:

  • ANALYZE: Weekly
  • Full optimization: Monthly

Medium-Traffic Databases:

  • ANALYZE: Bi-weekly
  • Full optimization: Quarterly

Low-Traffic Databases:

  • ANALYZE: Monthly
  • Full optimization: Semi-annually

4. Optimize Query Patterns

Use indexes effectively:

-- Good: Uses index on email
SELECT * FROM users WHERE email = 'alice@example.com';

-- Bad: Cannot use index (leading wildcard)
SELECT * FROM users WHERE email LIKE '%@example.com';

-- Better: Use full-text search or different query
SELECT * FROM users WHERE email LIKE 'alice%';

5. Monitor Query Performance

Check slow queries:

-- Before optimization
SELECT * FROM posts WHERE user_id = 123;  -- 250ms

-- After ANALYZE + index
SELECT * FROM posts WHERE user_id = 123;  -- 5ms

VACUUM Operation

What is VACUUM?

VACUUM rebuilds the database file to:

  • Reclaim unused space
  • Defragment database
  • Reduce file size

Command:

VACUUM;

Why Not in D1 Manager?

VACUUM is not available via D1 REST API.

D1 automatically manages space reclamation, so manual VACUUM is typically not needed.

Manual VACUUM (if needed)

Use Wrangler CLI:

wrangler d1 execute my-database --remote --command="VACUUM"

When to use:

  • After deleting large amounts of data
  • File size significantly larger than data
  • Fragmentation issues

Caution: VACUUM requires exclusive access and can take time.

Query Optimization Tips

Use EXPLAIN QUERY PLAN

Analyze how queries execute:

EXPLAIN QUERY PLAN
SELECT * FROM users WHERE email = 'alice@example.com';

Output:

QUERY PLAN
|--SEARCH users USING INDEX idx_users_email (email=?)

Good signs:

  • "USING INDEX" - Index is used
  • "COVERING INDEX" - All columns from index

Bad signs:

  • "SCAN users" - Full table scan (slow)
  • No index mentioned

Add Indexes for Common Queries

Example analysis:

-- Slow query (no index)
SELECT * FROM posts WHERE status = 'published';

-- Add index
CREATE INDEX idx_posts_status ON posts(status);

-- Run ANALYZE
ANALYZE;

-- Fast query (uses index)
SELECT * FROM posts WHERE status = 'published';

Composite Indexes

For queries with multiple WHERE conditions:

-- Query
SELECT * FROM orders 
WHERE customer_id = 123 AND status = 'pending';

-- Composite index
CREATE INDEX idx_orders_customer_status 
ON orders(customer_id, status);

-- Run ANALYZE
ANALYZE;

Order matters:

  • Put most selective column first
  • Put columns used in equality (=) before range (>, <)

Covering Indexes

Include all queried columns in index:

-- Query
SELECT id, email FROM users WHERE username = 'alice';

-- Covering index
CREATE INDEX idx_users_username_email 
ON users(username, email);

-- Run ANALYZE
ANALYZE;

Benefit: Query can be satisfied entirely from index (no table lookup needed).

Performance Monitoring

Query Execution Time

Track query duration:

-- View query history
SELECT query, duration_ms 
FROM query_history 
WHERE database_id = 'your-db-id'
ORDER BY duration_ms DESC 
LIMIT 10;

Identify slow queries:

  • 100ms: Consider optimization

  • 1000ms: Definitely needs optimization

Index Usage Statistics

Check which indexes are used:

-- List all indexes
SELECT name, tbl_name 
FROM sqlite_master 
WHERE type = 'index';

-- Check index details
PRAGMA index_info(idx_users_email);

Table Statistics

View statistics after ANALYZE:

-- Query statistics table
SELECT * FROM sqlite_stat1;

Columns:

  • tbl - Table name
  • idx - Index name
  • stat - Statistics string

Best Practices

1. Optimize After Major Changes

Trigger optimization after:

  • Large imports
  • Schema changes
  • Index creation
  • Bulk deletions

2. Don't Over-Optimize

Avoid:

  • Running ANALYZE after every single query
  • Optimizing every 5 minutes
  • Optimizing empty/small tables

Reason: ANALYZE has overhead; unnecessary runs waste resources.

3. Test Before Production

Safe workflow:

1. Optimize in development
2. Measure performance improvement
3. Test queries thoroughly
4. Deploy optimization to production
5. Monitor results

4. Document Optimization Schedule

Example schedule:

app-production:
  - ANALYZE: Every Sunday 2:00 AM
  - Index review: Monthly
  - Query audit: Quarterly

app-staging:
  - ANALYZE: Every 2 weeks
  - As needed after large imports

5. Monitor After Optimization

Verify improvements:

1. Run ANALYZE
2. Execute common queries
3. Compare execution times
4. Document improvements
5. Adjust strategy as needed

Troubleshooting

Optimization Fails

Cause: Database locked, insufficient permissions, or timeout.

Solution:

  1. Retry operation
  2. Ensure no long-running queries
  3. Check database access permissions
  4. Try during low-traffic period

No Performance Improvement

Cause: Missing indexes or query not using indexes.

Solution:

  1. Check query with EXPLAIN QUERY PLAN
  2. Add appropriate indexes
  3. Run ANALYZE again
  4. Verify index usage

Optimization Times Out

Cause: Very large database or complex statistics.

Solution:

  1. Try during off-peak hours
  2. Optimize smaller batches
  3. Use CLI for very large databases:
wrangler d1 execute db-name --remote --command="ANALYZE"

API Details

Optimize Endpoint

POST /api/databases/optimize
Content-Type: application/json

{
  "databaseIds": ["uuid-1", "uuid-2", "uuid-3"]
}

Response:

{
  "succeeded": ["uuid-1", "uuid-2"],
  "failed": [
    {
      "id": "uuid-3",
      "error": "Timeout error"
    }
  ]
}

See API Reference for complete documentation.

Advanced Topics

Index Selectivity

High selectivity (good):

-- email column: 10,000 unique values in 10,000 rows
-- Selectivity: 100%
CREATE INDEX idx_users_email ON users(email);

Low selectivity (poor):

-- status column: 3 unique values in 10,000 rows  
-- Selectivity: 0.03%
-- Index may not help much
CREATE INDEX idx_orders_status ON orders(status);

Rule: Higher selectivity = more useful index.

Partial Indexes

Create indexes for specific conditions:

-- Index only active users
CREATE INDEX idx_active_users 
ON users(email) 
WHERE is_active = 1;

Benefit: Smaller index, faster queries for active users.

Expression Indexes

Index computed values:

-- Index lowercase email
CREATE INDEX idx_users_email_lower 
ON users(LOWER(email));

-- Query using expression
SELECT * FROM users WHERE LOWER(email) = 'alice@example.com';

Note: Requires consistent use of expression in queries.

Next Steps


Need Help? See Troubleshooting or open an issue.

Clone this wiki locally