[Epic] 💾 Performance - Database Indexing Optimization

# 💾 Performance - Database Indexing Optimization

## Goal

Implement comprehensive database indexing strategy to optimize query performance:
1. **Add indexes for foreign keys** that are frequently used in JOINs
2. **Add composite indexes** for common multi-column filter patterns
3. **Add indexes for frequently filtered columns** (is_active, status, enabled)
4. **Add indexes for ORDER BY columns** (created_at, updated_at, timestamps)
5. **Add indexes for unique lookups** (email, slug, token)
6. **Optimize many-to-many association tables**
7. **Add covering indexes** where beneficial

This dramatically improves query performance by 10-100x for filtered and joined queries, reduces database CPU usage by 30-60%, and enables efficient pagination.

## Why Now?

Database indexing is critical for scalability and performance:

1. **Query Performance**: Reduce query time from seconds to milliseconds for filtered queries
2. **Scalability**: Support 10x more concurrent users without database bottleneck
3. **Resource Efficiency**: Reduce database CPU usage by 30-60%
4. **Better UX**: Fast page loads even with thousands of records
5. **Federation Performance**: Faster lookups for federated gateway queries
6. **Cost Savings**: Reduce database costs by improving query efficiency

---

## 📖 User Stories

<details>
<summary>US-1: API Client - Fast Filtered Queries</summary>

**As an** API Client
**I want** filtered list queries to execute quickly
**So that** my application responds fast even with large datasets

**Acceptance Criteria:**

```gherkin
Given I request GET /tools with team_id filter
When the database has 10,000+ tools
Then the query should use an index on team_id
And the response time should be <50ms (vs 500-2000ms without index)
And the database should perform an index scan, not a full table scan

Given I request GET /tools?enabled=true&team_id=123
When filtering by multiple columns
Then the query should use a composite index (team_id, enabled)
And the query execution plan should show "Index Scan" not "Seq Scan"
And the response time should be <30ms

Given I request GET /teams with is_active=true filter
When the database has thousands of teams
Then the query should use an index on is_active
And pagination should be efficient with ORDER BY created_at
And the database CPU usage should be minimal
```

**Technical Requirements:**
- Indexes on all foreign key columns
- Composite indexes for common filter combinations
- Indexes on boolean filter columns (is_active, enabled, status)
- Indexes supporting ORDER BY clauses

</details>

<details>
<summary>US-2: Admin User - Fast Team Member Lookups</summary>

**As an** Admin User
**I want** team membership queries to execute instantly
**So that** I can quickly view team members and permissions

**Acceptance Criteria:**

```gherkin
Given I view a team with 100+ members
When the page loads member list
Then the query should use composite index (team_id, is_active)
And the query should execute in <20ms
And sorting by role should be efficient

Given I check if a user is a team member
When querying by team_id and user_email
Then the query should use composite index (team_id, user_email)
And the lookup should execute in <5ms
And the database should perform only an index-only scan

Given I list all teams for a user
When the user is a member of 50+ teams
Then the query should use index on user_email
And JOIN with teams should use index on team_id
And the query should execute in <50ms
```

**Technical Requirements:**
- Composite index on (team_id, user_email, is_active) for email_team_members
- Index on user_email for reverse lookups
- Index on (team_id, role, is_active) for permission checks
- Efficient JOIN performance with indexed foreign keys

</details>

<details>
<summary>US-3: System - Efficient Cascade Deletes and Foreign Key Lookups</summary>

**As a** System
**I want** foreign key constraints and cascade operations to be efficient
**So that** deletes and updates don't cause performance bottlenecks

**Acceptance Criteria:**

```gherkin
Given I delete a team with many associated resources
When cascade delete operations occur
Then all foreign key lookups should use indexes
And the operation should complete in <100ms for 100 related records
And no full table scans should occur during cascade

Given I delete a gateway with federated tools
When the database checks foreign key references
Then the query should use index on tools.gateway_id
And the cascade delete should be efficient
And the operation should not block other queries

Given I update a user's email (primary key)
When foreign key references need to be updated
Then all referencing tables should have indexes on FK columns
And the update should propagate efficiently
And the operation should complete in <200ms
```

**Technical Requirements:**
- Indexes on all foreign key columns for efficient lookups
- Proper cascade delete configuration
- Efficient JOIN performance for related records
- No full table scans during FK constraint checks

</details>

---

## 🏗 Architecture

### Current State (Without Indexes)

```sql
-- Example: Query team members without composite index
EXPLAIN SELECT * FROM email_team_members 
WHERE team_id = '123' AND is_active = true;

-- Result: Seq Scan on email_team_members (cost=0.00..1234.56)
-- Filter: (team_id = '123' AND is_active = true)
-- Planning time: 0.5ms
-- Execution time: 450ms ❌ SLOW - Full table scan!
```

### After Indexing

```sql
-- With composite index: CREATE INDEX idx_team_members_team_active ON email_team_members(team_id, is_active);
EXPLAIN SELECT * FROM email_team_members 
WHERE team_id = '123' AND is_active = true;

-- Result: Index Scan using idx_team_members_team_active (cost=0.42..8.44)
-- Index Cond: (team_id = '123' AND is_active = true)
-- Planning time: 0.2ms
-- Execution time: 3ms ✅ FAST - Index scan!
```

---

## 📋 Summary of Indexes to Add

**150+ indexes across 35 tables** organized into these categories:

### Phase 1: Core Foreign Key Indexes (40+ indexes)
Tools, Resources, Prompts, Servers, Gateways, A2A Agents tables

### Phase 2: RBAC and User Management (25+ indexes)
Roles, UserRoles, PermissionAuditLog, EmailUsers tables

### Phase 3: Team Management (50+ indexes) 
EmailTeams, EmailTeamMembers, EmailTeamInvitations, EmailTeamJoinRequests, History tables

### Phase 4: Metrics Tables (15+ indexes)
Tool, Resource, Server, Prompt, A2A metrics tables

### Phase 5: OAuth and Token Management (15+ indexes)
OAuth tokens, states, API tokens, usage logs, revocations

### Phase 6: Session and SSO (12+ indexes)
Session records, SSO providers, auth sessions, auth events

### Phase 7: Association Tables (8+ indexes)
Server-Tool, Server-Resource, Server-Prompt, Server-A2A associations

### Phase 8: Miscellaneous (5+ indexes)
Resource subscriptions, GRPC services

**Full implementation tasks are detailed in the complete Epic description.**

---

## ✅ Success Criteria

- [ ] All foreign key columns have indexes (50+ indexes)
- [ ] Common filter patterns have composite indexes (30+ indexes)
- [ ] Boolean filter columns have indexes (20+ indexes)
- [ ] Timestamp columns for ORDER BY have indexes
- [ ] Query performance improved by 10-100x
- [ ] No sequential scans on large tables
- [ ] Database CPU usage reduced by 30-60%
- [ ] Migration tested on SQLite and PostgreSQL
- [ ] Documentation complete

---

## 🏁 Definition of Done

- [ ] Alembic migration created with 150+ index additions
- [ ] Migration tested on development and test databases
- [ ] EXPLAIN ANALYZE shows index usage
- [ ] Performance benchmarks show 10-100x improvement
- [ ] No regressions in existing tests
- [ ] Documentation updated
- [ ] Ready for production deployment

---

## 📝 Key Implementation Details

**Index Types:**
- Single column indexes for foreign keys
- Composite indexes for multi-column filters
- Partial indexes for filtered subsets
- Covering indexes to avoid table lookups

**PostgreSQL vs SQLite:**
- PostgreSQL: Use \`CREATE INDEX CONCURRENTLY\` (no locks)
- SQLite: Requires table locking during index creation
- Both support B-tree composite indexes

**Migration Safety:**
- Use concurrent index creation for PostgreSQL
- Test on development database first
- Include downgrade() for rollback
- Document estimated migration time

---

## 🔗 Related Issues

- #1224 API Pagination (indexes enable efficient pagination)
- #1296 Redis Caching (reduces cache miss penalty) 
- #1297 Server Tuning (complements connection pooling)
- #975 Session Pooling (reduces per-query overhead)

---

## 📚 References

- [PostgreSQL Indexes](https://www.postgresql.org/docs/current/indexes.html)
- [SQLite Query Planner](https://www.sqlite.org/queryplanner.html)
- [Use The Index, Luke!](https://use-the-index-luke.com/)
- [SQLAlchemy Indexes](https://docs.sqlalchemy.org/en/20/core/constraints.html#indexes)

## Benchmarks:

- `docker-compose.yml` spins up postgresql (or mariadb) and nginx, or the associated `make minikube*` and `charts/` or `docker-compose-yml`
- `tests/load/` - can load up millions of records in the database, of all types
- `mcp-servers/go/benchmark-server` - MCP server with arbitrary configurable number of tools, resources or prompts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Epic] 💾 Performance - Database Indexing Optimization #1353

💾 Performance - Database Indexing Optimization

Goal

Why Now?

📖 User Stories

🏗 Architecture

Current State (Without Indexes)

After Indexing

📋 Summary of Indexes to Add

Phase 1: Core Foreign Key Indexes (40+ indexes)

Phase 2: RBAC and User Management (25+ indexes)

Phase 3: Team Management (50+ indexes)

Phase 4: Metrics Tables (15+ indexes)

Phase 5: OAuth and Token Management (15+ indexes)

Phase 6: Session and SSO (12+ indexes)

Phase 7: Association Tables (8+ indexes)

Phase 8: Miscellaneous (5+ indexes)

✅ Success Criteria

🏁 Definition of Done

📝 Key Implementation Details

🔗 Related Issues

📚 References

Benchmarks:

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Epic] 💾 Performance - Database Indexing Optimization #1353

Description

💾 Performance - Database Indexing Optimization

Goal

Why Now?

📖 User Stories

🏗 Architecture

Current State (Without Indexes)

After Indexing

📋 Summary of Indexes to Add

Phase 1: Core Foreign Key Indexes (40+ indexes)

Phase 2: RBAC and User Management (25+ indexes)

Phase 3: Team Management (50+ indexes)

Phase 4: Metrics Tables (15+ indexes)

Phase 5: OAuth and Token Management (15+ indexes)

Phase 6: Session and SSO (12+ indexes)

Phase 7: Association Tables (8+ indexes)

Phase 8: Miscellaneous (5+ indexes)

✅ Success Criteria

🏁 Definition of Done

📝 Key Implementation Details

🔗 Related Issues

📚 References

Benchmarks:

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions