feat: enable opensearch slow logging #1423

Merged
gbirman merged 5 commits into main from gab/opensearch-slow-logging on Feb 9, 2026

@gbirman gbirman commented Feb 9, 2026

Enable CloudWatch logging for OpenSearch to diagnose slow query
performance issues identified in production traces.

Infrastructure changes:
- Add CloudWatch Log Groups for index, search, and application logs
- Configure log publishing options on OpenSearch domain
- Set retention: 30 days (prod), 7 days (dev) for slow logs
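
A minimal sketch of the data these infrastructure changes involve, not the PR's actual code. It assumes the three log groups map to the OpenSearch log types `INDEX_SLOW_LOGS`, `SEARCH_SLOW_LOGS`, and `ES_APPLICATION_LOGS`, and that any stack other than `prod` gets the dev retention. The helper names, the ARNs, and the account ID are hypothetical placeholders; in a Pulumi program the resulting array would be passed to the domain's `logPublishingOptions`.

```typescript
// Log types assumed to correspond to the index, search, and application log groups.
type LogType = "INDEX_SLOW_LOGS" | "SEARCH_SLOW_LOGS" | "ES_APPLICATION_LOGS";

interface LogPublishingOption {
  logType: LogType;
  cloudwatchLogGroupArn: string;
  enabled: boolean;
}

// Retention for slow logs as stated in the PR: 30 days in prod, 7 days in dev.
// Assumption: every non-prod stack is treated like dev.
function slowLogRetentionDays(stack: string): number {
  return stack === "prod" ? 30 : 7;
}

// Build one publishing entry per log type from a map of log group ARNs.
function buildLogPublishingOptions(
  arns: Record<LogType, string>,
): LogPublishingOption[] {
  const types: LogType[] = [
    "INDEX_SLOW_LOGS",
    "SEARCH_SLOW_LOGS",
    "ES_APPLICATION_LOGS",
  ];
  return types.map((logType) => ({
    logType,
    cloudwatchLogGroupArn: arns[logType],
    enabled: true,
  }));
}

// Hypothetical ARNs for illustration only.
const exampleArns: Record<LogType, string> = {
  INDEX_SLOW_LOGS: "arn:aws:logs:us-east-1:123456789012:log-group:/example/index-slow",
  SEARCH_SLOW_LOGS: "arn:aws:logs:us-east-1:123456789012:log-group:/example/search-slow",
  ES_APPLICATION_LOGS: "arn:aws:logs:us-east-1:123456789012:log-group:/example/application",
};

const publishingOptions = buildLogPublishingOptions(exampleArns);
console.log(slowLogRetentionDays("prod"), publishingOptions);
```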

Runtime configuration:
- Add configure_slow_logs.ts script to set query thresholds
- Log queries > 1s at WARN, > 500ms at INFO, > 200ms at DEBUG
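
For reference, the thresholds above could be expressed as OpenSearch index settings roughly as follows. This is a sketch under assumptions, not the contents of configure_slow_logs.ts: it only builds the settings body using the standard `index.search.slowlog.threshold.query.*` keys, and the type and function names are hypothetical. The resulting object is the kind of body one would send with `PUT /<index>/_settings`.

```typescript
// Thresholds from the PR description: WARN above 1s, INFO above 500ms, DEBUG above 200ms.
type SlowLogThresholds = { warn: string; info: string; debug: string };

const QUERY_THRESHOLDS: SlowLogThresholds = {
  warn: "1s",
  info: "500ms",
  debug: "200ms",
};

// Map the thresholds onto the search slow log settings for the query phase.
function buildSearchSlowLogSettings(
  t: SlowLogThresholds,
): Record<string, string> {
  return {
    "index.search.slowlog.threshold.query.warn": t.warn,
    "index.search.slowlog.threshold.query.info": t.info,
    "index.search.slowlog.threshold.query.debug": t.debug,
  };
}

const slowLogSettings = buildSearchSlowLogSettings(QUERY_THRESHOLDS);
console.log(JSON.stringify(slowLogSettings, null, 2));
```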

This helps diagnose bottlenecks like the 15s search_unified trace
where database queries were fast but OpenSearch was slow.

Setup: Deploy with pulumi up, then run configure_slow_logs.ts
@gbirman gbirman requested a review from a team as a code owner February 9, 2026 20:47
@github-actions github-actions bot added the infra label Feb 9, 2026

claude bot commented Feb 9, 2026

Code review

Found 1 issue that needs to be addressed:

Missing CloudWatch Logs Resource Policy

Location: infra/stacks/opensearch/index.ts around line 163

The PR creates CloudWatch Log Groups and configures logPublishingOptions on the OpenSearch domain, but is missing a required aws.cloudwatch.LogResourcePolicy resource that grants the OpenSearch service (es.amazonaws.com) permission to write logs to CloudWatch.

Without this resource policy, the deployment will either:

  • Fail with a ValidationException from AWS, or
  • Silently fail to deliver logs (the log groups will exist but OpenSearch won't be able to write to them)

Required fix:

Add a CloudWatch Logs resource policy before the opensearchDomain resource. See:

The policy needs to:

  1. Grant the es.amazonaws.com service principal the logs:PutLogEvents and logs:CreateLogStream actions
  2. Target all three log group ARNs with :* suffix
  3. Be added as a dependency to the opensearchDomain resource to ensure proper ordering
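
The first two requirements can be sketched as a policy document builder. This is an illustrative assumption about the shape of the fix, not the code that was merged: the function name and the example ARNs are hypothetical, and the returned JSON string is what would be supplied as the `policyDocument` of an `aws.cloudwatch.LogResourcePolicy`. The third requirement, ordering via a dependency on the domain, is a Pulumi resource option and is not shown here.

```typescript
// Build a CloudWatch Logs resource policy that lets the OpenSearch service
// principal create log streams and put log events into the given log groups.
function buildLogResourcePolicy(logGroupArns: string[]): string {
  return JSON.stringify({
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: { Service: "es.amazonaws.com" },
        Action: ["logs:PutLogEvents", "logs:CreateLogStream"],
        // Append the ":*" suffix so the policy also covers streams in each group.
        Resource: logGroupArns.map((arn) =>
          arn.endsWith(":*") ? arn : `${arn}:*`,
        ),
      },
    ],
  });
}

// Hypothetical ARNs for illustration only.
const policyDocument = buildLogResourcePolicy([
  "arn:aws:logs:us-east-1:123456789012:log-group:/example/index-slow",
  "arn:aws:logs:us-east-1:123456789012:log-group:/example/search-slow",
  "arn:aws:logs:us-east-1:123456789012:log-group:/example/application",
]);
console.log(policyDocument);
```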

Reference:

},
],
tags,

@gbirman gbirman force-pushed the gab/opensearch-slow-logging branch from e560282 to 143f61c Compare February 9, 2026 20:54
Grant es.amazonaws.com service principal permission to write logs
to CloudWatch Log Groups. This policy is required for OpenSearch
to publish slow query logs to CloudWatch.

Without this policy, OpenSearch domain creation succeeds but logs
are not written to CloudWatch.
@gbirman gbirman merged commit 7dbee64 into main Feb 9, 2026
20 checks passed
@gbirman gbirman deleted the gab/opensearch-slow-logging branch February 9, 2026 22:52