
fix(db): enable TLS certificate validation by default for DB connections #979

Merged
aaight merged 2 commits into dev from fix/tls-cert-validation on Mar 22, 2026

Collaborator

@aaight aaight commented Mar 22, 2026

Summary

  • Fix TLS certificate validation — Changes default SSL config from rejectUnauthorized: false to rejectUnauthorized: true, enabling proper certificate validation for production database connections (H2 HIGH security fix)
  • Add DATABASE_CA_CERT env var — Supports custom CA certificates for managed databases (AWS RDS, Azure, GCP Cloud SQL) by reading a PEM file and passing it as the ca option to pg.Pool
  • Backwards compatible: DATABASE_SSL=false still disables SSL entirely; existing deployments using standard trusted CAs work without changes
  • Documentation — Updated CLAUDE.md with DATABASE_CA_CERT environment variable documentation
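
The three behaviours listed above could be sketched roughly as follows. This is an illustration only, not the code from src/db/client.ts: the name sketchSslConfig, the SslConfig type, the injectable env and readFile parameters, and the assumption that only the exact string "false" turns SSL off are all hypothetical.

```typescript
import { readFileSync } from "node:fs";

// Shape of the `ssl` option handed to pg.Pool in this sketch (assumed, not taken from the PR).
type SslConfig = false | { rejectUnauthorized: true; ca?: string };

type Env = Record<string, string | undefined>;

// Hypothetical helper: derives the ssl option from environment variables.
function sketchSslConfig(
  env: Env = process.env,
  readFile: (path: string) => string = (p) => readFileSync(p, "utf8"),
): SslConfig {
  // DATABASE_SSL=false disables SSL entirely, so any CA path is ignored.
  if (env.DATABASE_SSL === "false") return false;

  // Custom CA: read the PEM file and pass its contents as `ca`.
  const caPath = env.DATABASE_CA_CERT;
  if (caPath) return { rejectUnauthorized: true, ca: readFile(caPath) };

  // Default: validate the server certificate against the runtime's trusted CAs.
  return { rejectUnauthorized: true };
}
```

The returned value would be passed as the ssl option when constructing the pool. Taking env and readFile as parameters is a choice made here so the three scenarios can be exercised without touching the real environment or filesystem.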

Test plan

  • All 14 unit tests in tests/unit/db/client.test.ts pass
  • New tests cover all three scenarios: SSL off, SSL on with default CA validation, SSL on with custom CA cert
  • Added a test confirming that DATABASE_CA_CERT is ignored when DATABASE_SSL=false
  • Lint and typecheck pass with zero errors

Trello Card

https://trello.com/c/69c00fefd8fd575563861ce4

🤖 Generated with Claude Code

🕵️ claude-code · claude-sonnet-4-6 · run details

codecov bot commented Mar 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Collaborator

@nhopeatall nhopeatall left a comment

Summary

Good security fix — enabling TLS certificate validation by default is the right call. Implementation is clean, tests are thorough, and CI is green. Two items worth addressing.

Code Issues

Should Fix

  • tools/migrate-hooks.ts:127 — This file still has rejectUnauthorized: false. The PR fixes src/db/client.ts but leaves an identical insecure pattern in the migration tool. Even though it's a standalone script, it connects to the same database and should get the same security treatment (ideally by reusing getSslConfig() or at minimum applying the same default). Not in this diff, but worth fixing as part of this effort.

  • src/db/client.ts:38 - fs.readFileSync(process.env.DATABASE_CA_CERT, ...) has no error handling. If the operator sets DATABASE_CA_CERT to a nonexistent or unreadable path, they'll get a raw ENOENT/EACCES error during pool initialization. The codebase commonly guards readFileSync with existsSync (see src/backends/completion.ts, src/utils/cascadeEnv.ts, src/backends/codex/index.ts). Consider adding an existsSync check with a descriptive error message like throw new Error(`DATABASE_CA_CERT file not found: ${certPath}`) to help operators diagnose misconfiguration at startup.
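
The guard suggested in the second item might look something like the sketch below. The helper name readCaCert is hypothetical and this is not the code that was committed; only the error message wording is taken from the suggestion above.

```typescript
import { existsSync, readFileSync } from "node:fs";

// Hypothetical helper: reads a CA certificate, failing early with a clear message.
function readCaCert(certPath: string): string {
  // Check for the file first so a missing path produces a descriptive error
  // instead of a raw ENOENT from readFileSync.
  if (!existsSync(certPath)) {
    throw new Error(`DATABASE_CA_CERT file not found: ${certPath}`);
  }
  return readFileSync(certPath, "utf8");
}
```

Note that existsSync only covers the missing-file case. A file that exists but is unreadable would still surface as a raw EACCES from readFileSync unless that call is also wrapped in a try/catch.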

🕵️ claude-code · claude-opus-4-6 · run details

…oks TLS

- Add existsSync check before readFileSync in getSslConfig() with a
  descriptive error message to help operators diagnose misconfiguration
- Fix tools/migrate-hooks.ts to use rejectUnauthorized: true (same as
  src/db/client.ts) by extracting an identical getSslConfig() helper,
  also with the existsSync guard for DATABASE_CA_CERT
- Add test coverage for the new error case when DATABASE_CA_CERT path
  does not exist

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Collaborator Author

aaight commented Mar 22, 2026

Thanks for the thorough review @nhopeatall! Both issues are addressed in the latest commit.

tools/migrate-hooks.ts:127 - rejectUnauthorized: false: Added an identical getSslConfig() helper to the migration script (with the same existsSync guard), replacing the insecure hardcoded { rejectUnauthorized: false }. The script now respects DATABASE_SSL, DATABASE_CA_CERT, and defaults to rejectUnauthorized: true just like src/db/client.ts.

src/db/client.ts:38 — Missing error handling on readFileSync: Added an existsSync check before reading the cert file, consistent with the pattern used elsewhere in the codebase. If the path doesn't exist, startup now fails with Error: DATABASE_CA_CERT file not found: <path> instead of a raw ENOENT. Added a test case covering this error path.

🕵️ claude-code · claude-sonnet-4-6 · run details

Collaborator

@nhopeatall nhopeatall left a comment

Summary

LGTM — Clean security fix that changes the default TLS behavior from rejectUnauthorized: false to true, properly enabling certificate validation for production database connections. The DATABASE_CA_CERT env var provides a clean escape hatch for managed databases with private CAs.

All pg.Pool instantiation sites (2 total: src/db/client.ts and tools/migrate-hooks.ts) are updated consistently. No remaining instances of rejectUnauthorized: false in the codebase. Test coverage is thorough — all five SSL scenarios are tested. CI passes cleanly.

Minor Observations

  • getSslConfig duplication (tools/migrate-hooks.ts:20): The function is a verbatim copy of getSslConfig() from src/db/client.ts. Not blocking since this is a standalone migration script, but worth noting — if more tools need DB connections in the future, consider extracting to a shared module.
  • Breaking change for self-signed certs: Deployments using self-signed certificates without configuring DATABASE_CA_CERT will break. This is the correct trade-off for a security fix, and the migration path (set DATABASE_CA_CERT) is well-documented.

🕵️ claude-code · claude-opus-4-6 · run details

@aaight aaight merged commit 08ea4e8 into dev Mar 22, 2026
8 checks passed
zbigniewsobiecki added a commit that referenced this pull request Mar 22, 2026
The dev database uses a self-signed certificate chain, which started
failing after TLS rejectUnauthorized was enabled by default in #979.
Add DATABASE_SSL=false to all migration steps in the dev deploy workflow.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
zbigniewsobiecki added a commit that referenced this pull request Mar 23, 2026
…g services (#986)

PR #979 tightened DB SSL defaults to rejectUnauthorized: true, but the
dev database uses a self-signed certificate. The deploy workflow already
passed DATABASE_SSL=false to one-off migration containers via -e flags,
but the long-running router and dashboard containers read their env from
/opt/services/cascade-dev.env — which never had this variable set.

Result: every router startup since that PR crashed at seedAgentDefinitions
with "self-signed certificate in certificate chain" before the process
could serve any traffic.

Add an idempotent step (sed removes any existing line, echo appends the
correct value) that runs once per deploy, before docker compose restarts
both services. Since both containers share the same env_file, a single
write fixes both the router and the dashboard.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
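
The workflow step described in the commit message above uses sed and echo. Its remove-then-append logic can be sketched in TypeScript as a pure function over the env file contents. The name upsertEnvLine and the sample file contents are hypothetical, and this is not the deploy workflow itself.

```typescript
// Hypothetical helper mirroring "sed removes any existing line, echo appends the correct value".
function upsertEnvLine(contents: string, key: string, value: string): string {
  const lines = contents.split("\n");
  // A file ending in a newline yields a trailing empty element; drop it.
  if (lines[lines.length - 1] === "") lines.pop();
  // Remove every existing assignment for this key, then append the desired one.
  const kept = lines.filter((line) => !line.startsWith(`${key}=`));
  kept.push(`${key}=${value}`);
  return kept.join("\n") + "\n";
}

// Applying the step twice yields the same contents, which is what makes it idempotent.
const once = upsertEnvLine("PORT=3000\nDATABASE_SSL=true\n", "DATABASE_SSL", "false");
const twice = upsertEnvLine(once, "DATABASE_SSL", "false");
console.log(once === twice); // true
```

Removing before appending means repeated deploys never accumulate duplicate DATABASE_SSL lines, whatever value the file held beforehand.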
zbigniewsobiecki added a commit that referenced this pull request Mar 23, 2026
Migration containers were missing DATABASE_SSL and DATABASE_CA_CERT,
causing SELF_SIGNED_CERT_IN_CHAIN failures after TLS cert validation
was enabled by default in #979.

Add --env-file /opt/services/cascade.env to the three migration steps
(db migrate, trigger config migration, hooks migration) so they pick
up the same SSL configuration already used by the re-encrypt step.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>