Merged
6 changes: 4 additions & 2 deletions src/app.module.ts
@@ -3,8 +3,10 @@ import { APP_GUARD } from '@nestjs/core';
import { ConfigModule } from '@nestjs/config';
import { TypeOrmModule } from '@nestjs/typeorm';
import { ScheduleModule } from '@nestjs/schedule';

import { AppController } from './app.controller';
import { SearchModule } from './search/search.module';
import { IndexOptimizationModule } from './database/index-optimization/index-optimization.module';
import { RateLimitingModule } from './rate-limiting/rate-limiting.module';
import { QuotaGuard } from './rate-limiting/guards/quota.guard';
import { getDatabaseConfig } from './config/database.config';
@@ -15,14 +17,14 @@ import { DataPipelineModule } from './data-pipeline/data-pipeline.module';

const featureFlags = loadFeatureFlags();


@Module({
imports: [
ConfigModule.forRoot({ isGlobal: true }),
TypeOrmModule.forRoot(getDatabaseConfig()),
ScheduleModule.forRoot(),
SessionModule,
SearchModule,
IndexOptimizationModule, // automatic database index optimizer
...(featureFlags.ENABLE_RATE_LIMITING ? [RateLimitingModule] : []),
DebuggingModule,
DataPipelineModule,
@@ -32,4 +34,4 @@ const featureFlags = loadFeatureFlags();
? [{ provide: APP_GUARD, useClass: QuotaGuard }]
: [],
})
export class AppModule { }
export class AppModule {}
80 changes: 80 additions & 0 deletions src/database/index-optimization/README.md
@@ -0,0 +1,80 @@
# Automatic Database Index Optimizer

PostgreSQL-specific tooling that recommends, creates, monitors and retires
indexes by reading the catalog and the statistics collector (`pg_stat_*`). All
analysis is read-only; the only writes are explicit `CREATE INDEX` /
`DROP INDEX` actions, both gated behind configuration flags.

## Capabilities

| Acceptance criterion | Component |
| --------------------------------- | ------------------------------------------- |
| Query analysis for recommendations| `QueryAnalysisService.analyze()` |
| Automatic index creation | `IndexCreationService` |
| Index usage monitoring | `IndexUsageMonitorService` |
| Stale index removal | `StaleIndexService` |
| Scheduled orchestration | `IndexOptimizationService` (`@Cron` weekly) |

## How recommendations are derived

1. **Foreign-key columns without an index.** Postgres does not automatically
index FK columns, which is a frequent cause of slow joins and slow cascading
updates and deletes. The catalog is queried for FK columns that are not
already the leading columns of an existing index, yielding concrete, safe
column suggestions.
2. **Sequential-scan activity.** The sequential and index scan counts in
`pg_stat_user_tables` are used to score and prioritise the FK suggestions
above and to flag heavily seq-scanned tables (`HIGH_SEQ_SCAN`).
3. **Slow statements.** When `pg_stat_statements` is installed, slow queries
are surfaced for context via `GET .../slow-queries`.
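
As a rough illustration of the sequential-scan check, the sketch below flags a table from its scan counters. It is not the module's actual code: the `TableScanStats` shape and the `isHighSeqScan` helper are hypothetical, and it assumes the ratio is `seq_scan / idx_scan`, treated as infinite when a table has no index scans at all.

```ts
// Hypothetical subset of one pg_stat_user_tables row.
interface TableScanStats {
  table: string;
  seqScan: number; // seq_scan counter
  idxScan: number; // idx_scan counter
}

// Mirrors INDEX_OPT_SEQ_SCAN_THRESHOLD and INDEX_OPT_SEQ_SCAN_RATIO.
interface SeqScanThresholds {
  seqScanThreshold: number;
  seqScanRatio: number;
}

// Assumption: ratio = seq_scan / idx_scan, infinite when idx_scan is 0.
function isHighSeqScan(stats: TableScanStats, t: SeqScanThresholds): boolean {
  if (stats.seqScan < t.seqScanThreshold) return false;
  const ratio =
    stats.idxScan === 0 ? Number.POSITIVE_INFINITY : stats.seqScan / stats.idxScan;
  return ratio >= t.seqScanRatio;
}

const defaultThresholds: SeqScanThresholds = { seqScanThreshold: 1000, seqScanRatio: 0.5 };

const sampleTables: TableScanStats[] = [
  { table: 'orders', seqScan: 5000, idxScan: 200 }, // many seq scans, few index scans
  { table: 'users', seqScan: 1200, idxScan: 90000 }, // mostly served by indexes
  { table: 'audit_log', seqScan: 40, idxScan: 0 }, // below the seq scan threshold
];

const flaggedTables = sampleTables
  .filter((s) => isHighSeqScan(s, defaultThresholds))
  .map((s) => s.table);

console.log(flaggedTables);
```

With the default thresholds only `orders` is flagged: `users` passes the scan-count threshold but is dominated by index scans, and `audit_log` never reaches the threshold.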

Generated DDL uses `CREATE INDEX CONCURRENTLY IF NOT EXISTS`, so the build does
not block writes to the table. After creation the index's `indisvalid` flag is
checked; a failed concurrent build leaves an `INVALID` index behind, which is
dropped automatically.
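
A minimal sketch of how such statements could be assembled is shown below. The helper names (`quoteIdent`, `indexName`, `buildCreateIndexDdl`, `buildDropIndexDdl`) and the `idx_<table>_<columns>` naming convention are assumptions for illustration, not the names used by `IndexCreationService`. Note that PostgreSQL does not allow `CREATE INDEX CONCURRENTLY` inside a transaction block, so statements like these have to be executed outside one.

```ts
// Quote a PostgreSQL identifier: wrap in double quotes and double any embedded quote.
function quoteIdent(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

// Hypothetical naming convention: idx_<table>_<col1>_<col2>...
function indexName(table: string, columns: string[]): string {
  return `idx_${table}_${columns.join('_')}`;
}

// Build the additive statement for a recommended index.
function buildCreateIndexDdl(schema: string, table: string, columns: string[]): string {
  if (columns.length === 0) {
    throw new Error('at least one column is required');
  }
  const cols = columns.map(quoteIdent).join(', ');
  return (
    `CREATE INDEX CONCURRENTLY IF NOT EXISTS ${quoteIdent(indexName(table, columns))} ` +
    `ON ${quoteIdent(schema)}.${quoteIdent(table)} (${cols})`
  );
}

// Build the cleanup statement used when a concurrent build left an INVALID index.
function buildDropIndexDdl(schema: string, name: string): string {
  return `DROP INDEX CONCURRENTLY IF EXISTS ${quoteIdent(schema)}.${quoteIdent(name)}`;
}

const createDdl = buildCreateIndexDdl('public', 'orders', ['customer_id']);
const dropDdl = buildDropIndexDdl('public', indexName('orders', ['customer_id']));

console.log(createDdl);
console.log(dropDdl);
```

Quoting every identifier keeps the generated DDL safe for table or column names that contain uppercase letters or reserved words.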

## Stale index removal — safety

An index is only eligible for removal when it is **not** a primary key, **not**
unique, **not** backing any constraint, has scan count ≤ `staleMinScans`, and is
larger than `staleMinSizeBytes`. Drops also use `CONCURRENTLY`.
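
The eligibility rule above can be read as a single predicate. The sketch below is illustrative only: the `IndexInfo` shape and the `isStaleCandidate` helper are hypothetical, and the real `StaleIndexService` may apply these checks in SQL rather than in application code.

```ts
// Hypothetical description of one index, combining catalog and usage data.
interface IndexInfo {
  name: string;
  isPrimary: boolean;
  isUnique: boolean;
  backsConstraint: boolean;
  scans: number; // number of index scans recorded by the statistics collector
  sizeBytes: number;
}

// Mirrors INDEX_OPT_STALE_MIN_SCANS and INDEX_OPT_STALE_MIN_SIZE_BYTES.
interface StaleLimits {
  staleMinScans: number;
  staleMinSizeBytes: number;
}

function isStaleCandidate(ix: IndexInfo, limits: StaleLimits): boolean {
  // Never touch indexes that enforce integrity.
  if (ix.isPrimary || ix.isUnique || ix.backsConstraint) return false;
  // Stale means: rarely used AND large enough to be worth removing.
  return ix.scans <= limits.staleMinScans && ix.sizeBytes > limits.staleMinSizeBytes;
}

const staleLimits: StaleLimits = { staleMinScans: 0, staleMinSizeBytes: 1024 * 1024 };
const MIB = 1024 * 1024;

const sampleIndexes: IndexInfo[] = [
  { name: 'idx_orders_legacy', isPrimary: false, isUnique: false, backsConstraint: false, scans: 0, sizeBytes: 5 * MIB },
  { name: 'users_pkey', isPrimary: true, isUnique: true, backsConstraint: true, scans: 0, sizeBytes: 9 * MIB },
  { name: 'idx_small', isPrimary: false, isUnique: false, backsConstraint: false, scans: 0, sizeBytes: 8192 },
  { name: 'idx_hot', isPrimary: false, isUnique: false, backsConstraint: false, scans: 120, sizeBytes: 7 * MIB },
];

const staleNames = sampleIndexes
  .filter((ix) => isStaleCandidate(ix, staleLimits))
  .map((ix) => ix.name);

console.log(staleNames);
```

Only `idx_orders_legacy` qualifies here: the primary key is protected, `idx_small` is below the size floor, and `idx_hot` is in active use.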

## Configuration (all optional)

| Env var | Default | Purpose |
| -------------------------------- | ------- | ---------------------------------------- |
| `INDEX_OPT_ENABLED` | `false` | Enable the scheduled weekly cycle |
| `INDEX_OPT_DRY_RUN` | `true` | Analyse only; never execute DDL |
| `INDEX_OPT_AUTO_CREATE` | `false` | Allow automatic index creation |
| `INDEX_OPT_AUTO_DROP_STALE` | `false` | Allow automatic stale-index removal |
| `INDEX_OPT_SEQ_SCAN_THRESHOLD` | `1000` | Min seq scans before a table is a candidate |
| `INDEX_OPT_SEQ_SCAN_RATIO` | `0.5` | Min seq/idx scan ratio to flag a table |
| `INDEX_OPT_SLOW_QUERY_MS` | `200` | Mean exec time marking a statement slow |
| `INDEX_OPT_STALE_MIN_SIZE_BYTES` | `1048576` | Ignore stale indexes smaller than this (integer bytes; default is 1 MiB) |
| `INDEX_OPT_STALE_MIN_SCANS` | `0` | Scans at/below which an index is stale |
| `INDEX_OPT_MAX_CREATE_PER_RUN` | `3` | Cap on indexes created per cycle |
| `INDEX_OPT_SCHEMA` | `public`| Schema to operate on |

Even with `INDEX_OPT_ENABLED=true`, the scheduled cycle only reports what it
would create or drop until you also set `INDEX_OPT_DRY_RUN=false` and the
relevant `AUTO_*` flag.
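
The way these flags combine for a scheduled cycle can be sketched as follows. The `GateFlags` type and `planScheduledCycle` helper are hypothetical names used only for this example; the boolean expressions follow the gating described above.

```ts
// Hypothetical subset of the resolved configuration.
interface GateFlags {
  dryRun: boolean; // INDEX_OPT_DRY_RUN
  autoCreate: boolean; // INDEX_OPT_AUTO_CREATE
  autoDropStale: boolean; // INDEX_OPT_AUTO_DROP_STALE
}

// Each stage executes DDL only when dry-run is off AND its own AUTO_* flag is on.
function planScheduledCycle(flags: GateFlags): { createDryRun: boolean; dropDryRun: boolean } {
  return {
    createDryRun: flags.dryRun || !flags.autoCreate,
    dropDryRun: flags.dryRun || !flags.autoDropStale,
  };
}

// Defaults: nothing is ever executed.
const defaultPlan = planScheduledCycle({ dryRun: true, autoCreate: false, autoDropStale: false });

// Opting in to creation only: indexes may be created, stale drops stay simulated.
const createOnlyPlan = planScheduledCycle({ dryRun: false, autoCreate: true, autoDropStale: false });

// Forgetting to turn dry-run off keeps both stages simulated.
const stillDryPlan = planScheduledCycle({ dryRun: true, autoCreate: true, autoDropStale: true });

console.log(defaultPlan, createOnlyPlan, stillDryPlan);
```

Keeping creation and removal behind separate flags lets an operator enable the additive stage first and leave the destructive stage simulated until the stale list has been reviewed.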

## API (admin only)

| Method & path | Description |
| -------------------------------------------------- | --------------------------------- |
| `GET /database/index-optimization/recommendations`| Index recommendations |
| `GET /database/index-optimization/slow-queries` | Slow statements (if enabled) |
| `GET /database/index-optimization/usage` | Index usage statistics |
| `GET /database/index-optimization/stale` | Stale indexes eligible for removal|
| `GET /database/index-optimization/last-run` | Summary of the last cycle |
| `POST /database/index-optimization/run?apply=true` | Run a cycle (dry-run unless apply)|
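
For illustration, a client could assemble the on-demand run request as sketched below. The base URL and token are placeholders, `buildRunRequest` is a hypothetical helper, and sending the JWT as an `Authorization: Bearer` header is an assumption based on the controller's bearer-auth annotation.

```ts
// Hypothetical description of an HTTP request, independent of any HTTP client.
interface RunRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
}

// Build the request for POST /database/index-optimization/run.
// When `apply` is false the query flag is omitted, so the cycle stays in dry-run.
function buildRunRequest(baseUrl: string, token: string, apply: boolean): RunRequest {
  const root = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  const query = apply ? '?apply=true' : '';
  return {
    url: `${root}/database/index-optimization/run${query}`,
    method: 'POST',
    headers: { Authorization: `Bearer ${token}` }, // assumed auth scheme
  };
}

// Placeholder values for the example only.
const dryRunRequest = buildRunRequest('https://api.example.test/', 'ADMIN_JWT', false);
const applyRequest = buildRunRequest('https://api.example.test', 'ADMIN_JWT', true);

console.log(dryRunRequest.url);
console.log(applyRequest.url);
```

The resulting object can be passed to whichever HTTP client the caller already uses.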

## Wiring

```ts
@Module({
  imports: [
    ScheduleModule.forRoot(), // required for the weekly cron
    IndexOptimizationModule,
  ],
})
export class AppModule {}
```
61 changes: 61 additions & 0 deletions src/database/index-optimization/index-optimization.config.ts
@@ -0,0 +1,61 @@
/**
* Centralised configuration for the automatic index optimizer, resolved from
* environment variables. Conservative defaults are chosen so the optimizer is
* safe to enable: it runs in dry-run and never auto-applies DDL unless a human
* opts in.
*
* Env vars (all optional):
* INDEX_OPT_ENABLED – master switch for the scheduled run (default false)
* INDEX_OPT_DRY_RUN – analyse/recommend only, never apply DDL (default true)
* INDEX_OPT_AUTO_CREATE – allow automatic index creation (default false)
* INDEX_OPT_AUTO_DROP_STALE – allow automatic stale-index removal (default false)
* INDEX_OPT_SEQ_SCAN_THRESHOLD – min seq scans before a table is a candidate (default 1000)
* INDEX_OPT_SEQ_SCAN_RATIO – min seq/idx scan ratio to flag a table (default 0.5)
* INDEX_OPT_SLOW_QUERY_MS – mean exec time (ms) marking a statement slow (default 200)
* INDEX_OPT_STALE_MIN_SIZE_BYTES – ignore stale indexes smaller than this, in bytes (default 1048576 = 1 MiB)
* INDEX_OPT_STALE_MIN_SCANS – scans at/below which an index is stale (default 0)
* INDEX_OPT_MAX_CREATE_PER_RUN – cap on indexes created in one cycle (default 3)
* INDEX_OPT_SCHEMA – schema to operate on (default public)
*/
export interface IndexOptimizationConfig {
enabled: boolean;
dryRun: boolean;
autoCreate: boolean;
autoDropStale: boolean;
seqScanThreshold: number;
seqScanRatio: number;
slowQueryMs: number;
staleMinSizeBytes: number;
staleMinScans: number;
maxCreatePerRun: number;
schema: string;
}

const bool = (value: string | undefined, fallback: boolean): boolean =>
value === undefined ? fallback : value.toLowerCase() === 'true';

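const num = (value: string | undefined, fallback: number): number => {
// Number('') is 0, so a missing or blank value must fall back explicitly.
if (value === undefined || value.trim() === '') return fallback;
const parsed = Number(value);
return Number.isFinite(parsed) ? parsed : fallback;
};

const int = (value: string | undefined, fallback: number): number => {
// Parse strictly: parseInt would silently truncate input such as "1MB" to 1.
const parsed = num(value, Number.NaN);
return Number.isInteger(parsed) ? parsed : fallback;
};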

export function resolveIndexOptimizationConfig(): IndexOptimizationConfig {
return {
enabled: bool(process.env.INDEX_OPT_ENABLED, false),
dryRun: bool(process.env.INDEX_OPT_DRY_RUN, true),
autoCreate: bool(process.env.INDEX_OPT_AUTO_CREATE, false),
autoDropStale: bool(process.env.INDEX_OPT_AUTO_DROP_STALE, false),
seqScanThreshold: int(process.env.INDEX_OPT_SEQ_SCAN_THRESHOLD, 1000),
seqScanRatio: num(process.env.INDEX_OPT_SEQ_SCAN_RATIO, 0.5),
slowQueryMs: num(process.env.INDEX_OPT_SLOW_QUERY_MS, 200),
staleMinSizeBytes: int(process.env.INDEX_OPT_STALE_MIN_SIZE_BYTES, 1024 * 1024),
staleMinScans: int(process.env.INDEX_OPT_STALE_MIN_SCANS, 0),
maxCreatePerRun: int(process.env.INDEX_OPT_MAX_CREATE_PER_RUN, 3),
schema: process.env.INDEX_OPT_SCHEMA ?? 'public',
};
}
75 changes: 75 additions & 0 deletions src/database/index-optimization/index-optimization.controller.ts
@@ -0,0 +1,75 @@
import { Controller, Get, Post, Query, UseGuards } from '@nestjs/common';
import { ApiTags, ApiOperation, ApiBearerAuth, ApiQuery } from '@nestjs/swagger';
import { Roles } from '../../auth/decorators/roles.decorator';
import { JwtAuthGuard } from '../../auth/guards/jwt-auth.guard';
import { RolesGuard } from '../../auth/guards/roles.guard';
import { UserRole } from '../../users/entities/user.entity';
import { IndexOptimizationService } from './index-optimization.service';
import { QueryAnalysisService } from './services/query-analysis.service';
import { IndexUsageMonitorService } from './services/index-usage-monitor.service';
import { StaleIndexService } from './services/stale-index.service';

/**
* Admin API for the database index optimizer. Mutating endpoints require an
* explicit `apply=true` flag so DDL is never executed by accident.
*/
@ApiTags('index-optimization')
@Controller('database/index-optimization')
@UseGuards(JwtAuthGuard, RolesGuard)
@ApiBearerAuth()
export class IndexOptimizationController {
constructor(
private readonly optimizer: IndexOptimizationService,
private readonly analysis: QueryAnalysisService,
private readonly usageMonitor: IndexUsageMonitorService,
private readonly staleIndex: StaleIndexService,
) {}

@Get('recommendations')
@Roles(UserRole.ADMIN)
@ApiOperation({ summary: 'Analyse the database and return index recommendations' })
recommendations() {
return this.analysis.analyze();
}

@Get('slow-queries')
@Roles(UserRole.ADMIN)
@ApiOperation({ summary: 'List slow statements from pg_stat_statements (if enabled)' })
slowQueries() {
return this.analysis.getSlowStatements();
}

@Get('usage')
@Roles(UserRole.ADMIN)
@ApiOperation({ summary: 'Get index usage statistics' })
usage() {
return this.usageMonitor.getSnapshot();
}

@Get('stale')
@Roles(UserRole.ADMIN)
@ApiOperation({ summary: 'List indexes judged stale and eligible for removal' })
stale() {
return this.staleIndex.findStaleIndexes();
}

@Get('last-run')
@Roles(UserRole.ADMIN)
@ApiOperation({ summary: 'Get the summary of the last optimization cycle' })
lastRun() {
return this.optimizer.getLastRun() ?? { message: 'No run recorded yet' };
}

@Post('run')
@Roles(UserRole.ADMIN)
@ApiQuery({
name: 'apply',
required: false,
type: Boolean,
description: 'When true, executes DDL instead of running in dry-run mode',
})
@ApiOperation({ summary: 'Run a full optimization cycle (dry-run unless apply=true)' })
run(@Query('apply') apply?: string) {
return this.optimizer.run(apply === 'true');
}
}
39 changes: 39 additions & 0 deletions src/database/index-optimization/index-optimization.module.ts
@@ -0,0 +1,39 @@
import { Module } from '@nestjs/common';
import { TypeOrmModule } from '@nestjs/typeorm';
import { IndexOptimizationController } from './index-optimization.controller';
import { IndexOptimizationService } from './index-optimization.service';
import { QueryAnalysisService } from './services/query-analysis.service';
import { IndexCreationService } from './services/index-creation.service';
import { IndexUsageMonitorService } from './services/index-usage-monitor.service';
import { StaleIndexService } from './services/stale-index.service';

/**
* IndexOptimizationModule wires the automatic database index optimizer:
* - QueryAnalysisService → index recommendations
* - IndexCreationService → automatic index creation
* - IndexUsageMonitorService → index usage monitoring
* - StaleIndexService → stale index removal
* - IndexOptimizationService → scheduled orchestration of the above
*
* Requires a configured TypeORM DataSource (PostgreSQL) and, for scheduling,
* ScheduleModule.forRoot() registered at the application root. The scheduled
* cycle is inert unless INDEX_OPT_ENABLED=true.
*/
@Module({
imports: [TypeOrmModule.forFeature([])],
controllers: [IndexOptimizationController],
providers: [
IndexOptimizationService,
QueryAnalysisService,
IndexCreationService,
IndexUsageMonitorService,
StaleIndexService,
],
exports: [
IndexOptimizationService,
QueryAnalysisService,
IndexUsageMonitorService,
StaleIndexService,
],
})
export class IndexOptimizationModule {}
97 changes: 97 additions & 0 deletions src/database/index-optimization/index-optimization.service.ts
@@ -0,0 +1,97 @@
import { Injectable, Logger } from '@nestjs/common';
import { Cron, CronExpression } from '@nestjs/schedule';
import {
resolveIndexOptimizationConfig,
IndexOptimizationConfig,
} from './index-optimization.config';
import { QueryAnalysisService } from './services/query-analysis.service';
import { IndexCreationService } from './services/index-creation.service';
import { IndexUsageMonitorService } from './services/index-usage-monitor.service';
import { StaleIndexService } from './services/stale-index.service';
import { IOptimizationRunSummary } from './interfaces/index-optimization.interfaces';

/**
* Orchestrates a full index-optimization cycle:
* analyse → create recommended → monitor usage → remove stale
*
* Runs on a weekly schedule when INDEX_OPT_ENABLED=true, and can be triggered
* on demand via the controller. Each stage independently respects the dry-run
* and auto-create / auto-drop flags so an operator can dial in exactly how much
* autonomy the optimizer has.
*/
@Injectable()
export class IndexOptimizationService {
private readonly logger = new Logger(IndexOptimizationService.name);
private readonly config: IndexOptimizationConfig;
private lastRun?: IOptimizationRunSummary;

constructor(
private readonly analysis: QueryAnalysisService,
private readonly creation: IndexCreationService,
private readonly usageMonitor: IndexUsageMonitorService,
private readonly staleIndex: StaleIndexService,
config?: IndexOptimizationConfig,
) {
this.config = config ?? resolveIndexOptimizationConfig();
}

/** Scheduled weekly run; no-op unless explicitly enabled. */
@Cron(CronExpression.EVERY_WEEK)
async scheduledRun(): Promise<void> {
if (!this.config.enabled) {
this.logger.debug('Index optimizer disabled (INDEX_OPT_ENABLED=false)');
return;
}
this.logger.log('Starting scheduled index optimization cycle');
await this.run();
}

/**
* Execute a full cycle.
* @param force when true, applies DDL even if config is dry-run (used by the
* manual "apply" endpoint). Auto-create/auto-drop flags still gate
* destructive vs additive actions.
*/
async run(force = false): Promise<IOptimizationRunSummary> {
const startedAt = new Date().toISOString();

// 1. Query analysis → recommendations.
const recommendations = await this.analysis.analyze();

// 2. Index creation (additive). Always gated by autoCreate; `force` only
// overrides the dry-run setting.
const createDryRun = !this.config.autoCreate || (!force && this.config.dryRun);
const created = await this.creation.createFromRecommendations(
recommendations,
createDryRun,
);

// 3. Usage monitoring snapshot (read-only).
await this.usageMonitor.sample();

// 4. Stale index removal (destructive). Always gated by autoDropStale;
// `force` only overrides the dry-run setting.
const dropDryRun = !this.config.autoDropStale || (!force && this.config.dryRun);
const removedStale = await this.staleIndex.removeStaleIndexes(dropDryRun);

const summary: IOptimizationRunSummary = {
startedAt,
finishedAt: new Date().toISOString(),
dryRun: createDryRun && dropDryRun,
recommendations,
created,
removedStale,
};

this.lastRun = summary;
this.logger.log(
`Index optimization complete: ${recommendations.length} recommendation(s), ` +
`${created.filter((c) => c.created).length} created, ` +
`${removedStale.filter((r) => r.dropped).length} stale removed`,
);
return summary;
}

/** Return the summary of the most recent run, if any. */
getLastRun(): IOptimizationRunSummary | undefined {
return this.lastRun;
}
}