Skip to content

[ENG-3281] Add all pools to database healthcheck, add extended pool status data #7856

Merged
johnewart merged 6 commits intomainfrom
johnewart/ENG-3281
Apr 16, 2026
Merged

[ENG-3281] Add all pools to database healthcheck, add extended pool status data #7856
johnewart merged 6 commits intomainfrom
johnewart/ENG-3281

Conversation

@johnewart
Copy link
Copy Markdown
Collaborator

@johnewart johnewart commented Apr 8, 2026

Ticket ENG-3281

Description Of Changes

  • Adds all database pools to the health check (does a SELECT 1 for non-primary DB)
  • This will trigger pre-warming when checking out a connection
  • Added extended status data for pools, whether they are prewarmed, prewarmed status (capacity, checked_in, checked_out, percentage of capacity) that could be used for alerts or status data.

Code Changes

  • Add extended data to health check
  • Add tests

Steps to Confirm

N/A

Pre-Merge Checklist

  • Issue requirements met
  • All CI pipelines succeeded
  • CHANGELOG.md updated
    • Add a db-migration This indicates that a change includes a database migration label to the entry if your change includes a DB migration
    • Add a high-risk This issue suggests changes that have a high-probability of breaking existing code label to the entry if your change includes a high-risk change (i.e. potential for performance impact or unexpected regression) that should be flagged
    • Updates unreleased work already in Changelog, no new entry necessary
  • UX feedback:
    • All UX related changes have been reviewed by a designer
    • No UX review needed
  • Followup issues:
    • Followup issues created
    • No followup issues
  • Database migrations:
    • Ensure that your downrev is up to date with the latest revision on main
    • Ensure that your downgrade() migration is correct and works
      • If a downgrade migration is not possible for this change, please call this out in the PR description!
    • No migrations
  • Documentation:
    • Documentation complete, PR opened in fidesdocs
    • Documentation issue created in fidesdocs
    • If there are any new client scopes created as part of the pull request, remember to update public-facing documentation that references our scope registry
    • No documentation updates required

…for prewarming state (can be alerted on, for example)
@johnewart johnewart requested a review from a team as a code owner April 8, 2026 15:15
@johnewart johnewart requested review from nreyes-dev and removed request for a team April 8, 2026 15:15
@vercel
Copy link
Copy Markdown
Contributor

vercel bot commented Apr 8, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
fides-plus-nightly Ready Ready Preview, Comment Apr 16, 2026 6:28pm
1 Skipped Deployment
Project Deployment Actions Updated (UTC)
fides-privacy-center Ignored Ignored Apr 16, 2026 6:28pm

Request Review

Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The overall structure is solid — extracting per-pool health checks into small helper functions (_check_sync_session, _check_async_session, _get_async_readonly_prewarming_details) makes the main handler readable, and the new PoolStatus/PoolPrewarming models are clean. The prewarm-awareness logic in ensure_async_readonly_pool_prewarmed is correct (double-checked locking, global guarded inside the async lock).

A few issues worth addressing before merge:

Must Fix

  1. database_revision is silently dropped (line 161) — hardcoding it to None is a breaking change for consumers that relied on the migration revision in the health response. If intentional, the field should be removed from the schema and the changelog updated to mention it; otherwise the revision check needs to be preserved.

  2. get_readonly_api_session() failures cause a 500, not a 503 (lines 122–131) — the try/finally block has no except clause, so any exception thrown while acquiring the readonly session propagates up as an unhandled error rather than being reported as an unhealthy pool. An except Exception branch that sets pools["api_sync_readonly"] = PoolStatus(healthy=False) would fix this.

Suggestions

  1. "skipped" in PoolHealth is never used (line 29) — either use it for non-configured pools or remove it from the literal to avoid a confusing dead value.

  2. Non-configured pools are reported as healthy=True (lines 133, 153) — a pool that was never checked isn't really healthy. This is where "skipped" would be the right signal, and it would let callers distinguish "pool is healthy" from "pool is not configured".

  3. session = session # type: AsyncSession (line 186) — no-op reassignment; use a proper inline type annotation instead.

  4. Redundant if/else in test_database_healthcheck (test line ~3436–3441) — both branches assert identically, so the conditional is dead code.

Minor

The "needs migration" status that the old get_db_health could return is gone without mention in the changelog. If migration-state health reporting is intentionally dropped, it's worth a brief note there.

Comment thread src/fides/api/v1/endpoints/health.py Outdated
database="unhealthy" if has_unhealthy_pool else "healthy",
pools=pools,
async_readonly_pool_prewarmed=async_readonly_pool_prewarmed,
database_revision=None,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breaking change: database_revision is now always null

Previously the response populated database_revision with the actual Alembic migration revision (from get_db_health). This hardcodes it to None, silently breaking any consumer of this endpoint that reads database_revision to detect migration state.

If migration-state reporting is intentionally removed, the changelog entry should call that out explicitly and the field should probably be removed from DatabaseHealthCheck rather than left as an always-null optional. If it should still be reported, the revision check needs to be re-added.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds concerning, is dropping migration-state awareness intentional here? Doesn't seem to be part of the ticket. Is claude right about this?

Comment on lines +122 to +124
if CONFIG.database.sqlalchemy_readonly_database_uri:
readonly_db: Optional[Session] = None
try:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing except — errors from get_readonly_api_session() will surface as 500 instead of unhealthy

The try/finally block closes the session but doesn't catch exceptions. If get_readonly_api_session() itself raises (e.g. pool exhausted, misconfigured URI), the exception propagates uncaught from database_health and FastAPI returns a 500 rather than a well-formed {"database": "unhealthy", ...} with 503.

The intent is clearly to surface this as an unhealthy pool. Suggest adding an except branch:

try:
    readonly_db = get_readonly_api_session()
    pools["api_sync_readonly"] = PoolStatus(
        healthy=_check_sync_session(readonly_db) == "healthy"
    )
except Exception as error:
    logger.error("Unable to reach sync readonly pool: {}", Pii(str(error)))
    pools["api_sync_readonly"] = PoolStatus(healthy=False)
finally:
    if readonly_db:
        readonly_db.close()

from fides.config import CONFIG

CacheHealth = Literal["healthy", "unhealthy", "no cache configured", "skipped"]
PoolHealth = Literal["healthy", "unhealthy", "skipped"]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"skipped" is defined in PoolHealth but never returned

All code paths set a pool to either healthy=True or healthy=False via PoolStatus(healthy=...). The PoolHealth literal on the top-level database field never uses "skipped" either. Either use it for non-configured pools (instead of healthy=True) or remove it from the type to avoid confusion.

Comment thread src/fides/api/v1/endpoints/health.py Outdated
Comment on lines +152 to +153
else:
pools["api_async_readonly"] = PoolStatus(healthy=True)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unconfigured pools reported as healthy=True — semantically misleading

When async_readonly_database_uri is not set, the pool entry is still added to the response as healthy: true. The same pattern appears for api_sync_readonly at line 133. A pool that doesn't exist isn't "healthy" — it wasn't checked at all. Since PoolHealth already declares "skipped" as a valid literal, this would be a natural place to use it, and consumers could distinguish "configured and healthy" from "not configured".

Comment thread src/fides/api/v1/endpoints/health.py Outdated
"""Return health status for an async pool-backed session factory."""
try:
async with session_factory() as session:
session = session # type: AsyncSession
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No-op reassignment used as a type comment

session = session # type: AsyncSession is a no-op — it reassigns the variable to itself. The # type: comment form is for legacy Python 2-style annotations and has no effect here either. Use a proper inline annotation instead:

async with session_factory() as session:
    session: AsyncSession  # or just rely on the factory's type annotation
    await session.execute(text("SELECT 1"))

Or, if the goal is just to satisfy a type checker, annotate session_factory more precisely so the checker can infer the session type.

Comment thread tests/ctl/core/test_api.py Outdated
Comment on lines +3436 to +3441
if test_config.database.sqlalchemy_readonly_database_uri:
assert payload["pools"]["api_sync_readonly"]["healthy"] is True
assert payload["pools"]["api_sync_readonly"]["prewarming"] is None
else:
assert payload["pools"]["api_sync_readonly"]["healthy"] is True
assert payload["pools"]["api_sync_readonly"]["prewarming"] is None
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both branches of the if/else are identical — the conditional is dead code

The if test_config.database.sqlalchemy_readonly_database_uri: block and the else: block make exactly the same assertions. The conditional adds no test value and may mask a missing test for the "URI configured" branch (e.g. asserting the pool was actually checked). Simplify to:

assert payload["pools"]["api_sync_readonly"]["healthy"] is True
assert payload["pools"]["api_sync_readonly"]["prewarming"] is None

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 8, 2026

Codecov Report

❌ Patch coverage is 66.95652% with 38 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.03%. Comparing base (0050de4) to head (38a384e).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
src/fides/api/v1/endpoints/health.py 73.52% 20 Missing and 7 partials ⚠️
src/fides/api/db/ctl_session.py 16.66% 10 Missing ⚠️
src/fides/api/db/database.py 0.00% 0 Missing and 1 partial ⚠️

❌ Your patch status has failed because the patch coverage (66.95%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7856      +/-   ##
==========================================
- Coverage   85.06%   85.03%   -0.04%     
==========================================
  Files         629      629              
  Lines       40861    40971     +110     
  Branches     4748     4763      +15     
==========================================
+ Hits        34758    34838      +80     
- Misses       5030     5052      +22     
- Partials     1073     1081       +8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@nreyes-dev nreyes-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for the stuff that Claude pointed to. Particularly the migration thing.

My Claude came up with these two critical things that would need to be fixed for sure before merging this:

  • Missing except on sync readonly session — get_readonly_api_session() failure gives a 500 instead of a properly formatted 503. Real bug.
  • database_revision / migration check silently dropped — not in scope of the ticket, breaks existing behavior. Should either be preserved or explicitly called out as intentional.

Comment thread src/fides/api/v1/endpoints/health.py Outdated
database="unhealthy" if has_unhealthy_pool else "healthy",
pools=pools,
async_readonly_pool_prewarmed=async_readonly_pool_prewarmed,
database_revision=None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds concerning, is dropping migration-state awareness intentional here? Doesn't seem to be part of the ticket. Is claude right about this?

…eck for consistency, uses 'skipped' instead of leaving it for future use, and marks non-configured pools as skipped
@johnewart
Copy link
Copy Markdown
Collaborator Author

/code-review

Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Database Health Endpoint — Per-Pool Details & Readonly Prewarming

This PR expands the /health/database endpoint to report per-pool health, adds PoolStatus/PoolPrewarming response models, and handles optional readonly pools as "skipped". The overall shape is clean and the test coverage is solid. There is one correctness bug that survived the stated "no-op assignment fix", and a few minor issues worth addressing.


Bug (should fix)

database.py:143 — double call to context.get_current_revision()

current_revision is correctly stored on line 141 and returned on line 149, but the if-comparison on line 143 still calls context.get_current_revision() a second time instead of using the stored variable. The commit message describes this as a fixed "no-op assignment" but the fix is incomplete. No observed behavioral bug since the revision is stable between calls, but the code is semantically wrong and the stored value is misleading. Fix: replace context.get_current_revision() on line 143 with current_revision.


Suggestions

health.py:172migration_health == "unhealthy" branch appears to be dead code

get_db_health returns "unhealthy" only when the sync DB is unreachable, which also causes the sync primary pool check to return "unhealthy", making has_unhealthy_pool true first. This branch can never be reached. Either add a comment explaining it as an intentional safety net or remove it.

health.py:206 — intermediate sess: AsyncSession variable

Exists only to satisfy the type checker because session_factory: Any makes the async with result untyped. Better to type session_factory properly (e.g. Callable[[], AsyncContextManager[AsyncSession]]) and drop the intermediate variable. See inline comment.

ctl_session.py:207 — lock scope comment

The return ASYNC_READONLY_POOL_WARMED at the end of ensure_async_readonly_pool_prewarmed is read outside the lock. This is safe (Python GIL, bool assignment is atomic), but a brief comment explaining the lock is only guarding against double-warming would help future readers.

test_api.py:test_database_healthcheck — consider splitting the test

The single integration test uses if/else branches for readonly-configured vs. not-configured paths. In most CI environments only one branch exercises. Splitting into two named tests would make failures immediately identifiable from the test name.


What looks good

  • PoolHealth = Literal["healthy", "unhealthy", "skipped"] is a clean addition; "skipped" is correctly excluded from the has_unhealthy_pool check.
  • async_readonly_pool_prewarmed = None when the pool is unconfigured is the right sentinel (distinct from False meaning "configured but not warmed").
  • The readonly_db.close() in a finally block correctly handles cleanup even when _check_sync_session raises.
  • New tests cover all major code paths: unhealthy pool → 503, needs migration → 503, readonly session failure → 503, prewarmed details in response.
  • _get_async_readonly_prewarming_details correctly guards target <= 0 to avoid a division-by-zero for capacity_percentage.

🔬 Codegraph: unavailable


💡 Write /code-review in a comment to re-run this review.

Comment thread src/fides/api/v1/endpoints/health.py Outdated
"""Return health status for an async pool-backed session factory."""
try:
async with session_factory() as raw_session:
sess: AsyncSession = raw_session
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/fides/api/v1/endpoints/health.py:206

The intermediate sess: AsyncSession = raw_session assignment is only here to satisfy the type checker, because session_factory: Any makes the async with result untyped. Consider typing session_factory more precisely instead — e.g.:

from collections.abc import Callable, AsyncGenerator

async def _check_async_session(
    session_factory: Callable[[], AsyncContextManager[AsyncSession]]
) -> PoolHealth:
    try:
        async with session_factory() as sess:
            await sess.execute(text("SELECT 1"))
        return "healthy"
    ...

This removes the intermediate variable and makes the intent clearer.

overall_database: DatabaseResponseHealth = "unhealthy"
elif migration_health == "needs migration":
overall_database = "needs migration"
elif migration_health == "unhealthy":
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/fides/api/v1/endpoints/health.py:168-173

Minor: the migration_health == "unhealthy" branch (line 172) is currently dead code. get_db_health in database.py only returns "healthy", "unhealthy", or "needs migration", and the "unhealthy" case from get_db_health is already subsumed by has_unhealthy_pool (since the sync primary pool will also be unhealthy whenever get_db_health returns "unhealthy"). Either document why this branch exists as a safety net, or remove it to avoid confusion.

Copy link
Copy Markdown
Contributor

@nreyes-dev nreyes-dev Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what claude said here is true, isn't it? Approving anyways, worst case this is just a dead branch of the if/else block.

readonly_async_engine,
)
ASYNC_READONLY_POOL_WARMED = True

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/fides/api/db/ctl_session.py:207

ensure_async_readonly_pool_prewarmed always returns ASYNC_READONLY_POOL_WARMED at the end, but the value is read under the lock only within the if not ASYNC_READONLY_POOL_WARMED branch. If the pool was already warmed before this call, the function returns True without entering the lock — which is fine for the warm check, but means the return ASYNC_READONLY_POOL_WARMED at line 207 reads the global without the lock held. This is benign (it's a bool and Python GIL protects single-variable reads), but it's worth a comment explaining that the lock is only needed to prevent double-warming, not to protect the final read.

assert detail["database_revision"] == "9e83545ed9b6"

def test_database_healthcheck_readonly_pool_session_failure_returns_503(
self,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests/ctl/core/test_api.py:3438 (the test_database_healthcheck function)

The single test function contains if/else branches for both "readonly configured" and "readonly not configured" paths. In CI the test will silently skip the readonly branch assertions if the pool isn't configured. Consider splitting into two separate tests (e.g. test_database_healthcheck_no_readonly and test_database_healthcheck_with_readonly) with appropriate pytest.mark.skipif or fixture-driven conditions, so a failing readonly assertion is immediately identifiable from the test name rather than requiring test output inspection.

@johnewart
Copy link
Copy Markdown
Collaborator Author

/code-review

Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Database Health Pool Details (#7856)

This PR expands the /health/database endpoint from a single database/database_revision response into a structured per-pool health report. The direction is solid — per-pool visibility is a meaningful observability improvement — and the overall structure is clean. A few things worth addressing before merge:

Issues

Redundant flag read after ensure_async_readonly_pool_prewarmed (minor)
ensure_async_readonly_pool_prewarmed() already returns ASYNC_READONLY_POOL_WARMED, but the return value is dropped and is_async_readonly_pool_prewarmed() is called immediately after. The two calls can be collapsed to a single await ensure_async_readonly_pool_prewarmed().

capacity_percentage can exceed 1.0 (minor)
The calculation (checked_in + checked_out) / target does not account for SQLAlchemy pool overflow connections. Values > 1.0 are possible and may surprise consumers. The field should be documented or clamped, or renamed to signal that overflow is possible.

migration_health == "unhealthy" branch appears unreachable (minor)
get_db_health in database.py only produces "healthy" or "needs migration". If this third branch is genuinely reachable it needs a test; if it isn't, it should be removed to avoid confusion.

Health endpoint now has up to 4 sequential pool round-trips with no timeout (suggestion)
A hung pool will stall the entire health response. Consider per-probe timeouts so a slow pool surfaces as "unhealthy" quickly rather than blocking the endpoint.

Primary sync pool SELECT 1 is redundant with get_db_health (nit)
Connectivity to the primary sync pool is already exercised by the migration check. The extra SELECT 1 doesn't add health signal, but it does give the pool a consistent named slot in the response. Worth a brief comment explaining the intent.

ensure_async_readonly_pool_prewarmed return semantics are ambiguous (nit)
False is returned both when the pool is not configured and when it's configured but not yet warmed. A docstring clarifying the three states would help future callers.

What's good

  • The database.py change avoiding a duplicate context.get_current_revision() call is a clean micro-improvement.
  • The test rewrite — replacing a single parametrized test with focused, named scenario tests — is a clear readability win and makes failure messages much more actionable.
  • The finally: readonly_db.close() guard in the sync readonly path correctly prevents session leaks on error.
  • Using Pii(str(error)) consistently in error logs maintains existing PII-scrubbing discipline.
  • The PoolStatus(health="skipped") pattern for unconfigured optional pools is a clean API contract.

🔬 Codegraph: connected (46725 nodes)


💡 Write /code-review in a comment to re-run this review.

else:
pools["api_async_readonly"] = PoolStatus(health="skipped")
async_readonly_pool_prewarmed = None

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/fides/api/v1/endpoints/health.py:168

ensure_async_readonly_pool_prewarmed() already returns the current value of ASYNC_READONLY_POOL_WARMED, but the return value is discarded and is_async_readonly_pool_prewarmed() is called immediately after to read the same flag. The two calls can be collapsed:

async_readonly_pool_prewarmed = await ensure_async_readonly_pool_prewarmed()

This removes the redundant is_async_readonly_pool_prewarmed() call and eliminates the tiny window (irrelevant in practice, but semantically cleaner) between the two reads of the global flag.

capacity_percentage: Optional[float]
if target <= 0:
capacity_percentage = None
else:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/fides/api/v1/endpoints/health.py:236

capacity_percentage is computed as (checked_in + checked_out) / target, which can exceed 1.0 when SQLAlchemy's max_overflow allows connections beyond the configured pool size. Consumers of this field may reasonably expect a value in [0, 1]. Consider either:

  1. Clamping: min(1.0, (checked_in + checked_out) / target)
  2. Renaming the field to pool_utilization_ratio and documenting that values > 1.0 indicate overflow connections are in use.

At minimum, add a note to PoolPrewarming.capacity_percentage that values > 1.0 are possible.

else:
overall_database = "healthy"

response = DatabaseHealthCheck(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/fides/api/v1/endpoints/health.py:183

The migration_health == "unhealthy" branch appears to be dead code (or at least untested). get_db_health in database.py only sets db_health to "needs migration" or "healthy" — it doesn't appear to return "unhealthy" under normal failure paths. If this branch is genuinely reachable (e.g., via an exception path elsewhere in get_db_health), it should have a corresponding test. If it isn't reachable, it should be removed to avoid confusion.

@@ -83,21 +114,136 @@ def get_cache_health() -> str:
},
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/fides/api/v1/endpoints/health.py:114

The endpoint now issues up to four sequential network round-trips (two sync pool probes + two async pool probes). If any pool is hung or slow, the health check will block for a correspondingly long time, which can cause monitoring systems to time out and falsely report the service as down.

Consider wrapping each pool probe with asyncio.wait_for(..., timeout=<seconds>) (or a connect_args socket timeout at the engine level) so that a single slow pool surfaces as "unhealthy" quickly rather than stalling the whole response.

return response


def _check_sync_session(db: Session) -> PoolHealth:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/fides/api/v1/endpoints/health.py:198

_check_sync_session runs SELECT 1 on the DI-injected db session that get_db_health has already used (via db.connection()). The connectivity has effectively already been verified by the migration check. The primary sync pool entry does add a named slot in the response, but it doesn't give incremental health signal over what get_db_health already established. This is fine as-is, but worth a brief comment explaining the intent (e.g., "provides a consistent pools entry even though connectivity was already verified by get_db_health").

)
ASYNC_READONLY_POOL_WARMED = True

# Lock only serializes warm_async_pool; the flag read below is safe without the lock
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/fides/api/db/ctl_session.py:208

The function returns ASYNC_READONLY_POOL_WARMED read outside the lock. The comment explains this is safe within asyncio's single-threaded event loop (no concurrent coroutines can mutate the flag between the lock release and this read). That reasoning is correct, but the two early-return False paths (lines before the lock) mean the function can also return False when the pool is simply not configured — which is indistinguishable from "configured but not yet warmed". This is fine for the current call sites but could be surprising to future callers. A docstring clarifying the three return states ("not configured", "configured but not warmed", "configured and warmed") would help.

Copy link
Copy Markdown
Contributor

@erosselli erosselli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes thanks, thanks! let's make sure to get CI green , or add our coverage ignore comments if there's parts of the code we can't test

overall_database: DatabaseResponseHealth = "unhealthy"
elif migration_health == "needs migration":
overall_database = "needs migration"
elif migration_health == "unhealthy":
Copy link
Copy Markdown
Contributor

@nreyes-dev nreyes-dev Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what claude said here is true, isn't it? Approving anyways, worst case this is just a dead branch of the if/else block.

@johnewart johnewart enabled auto-merge April 16, 2026 18:22
@johnewart johnewart added this pull request to the merge queue Apr 16, 2026
Merged via the queue into main with commit 3bf2e3e Apr 16, 2026
68 of 70 checks passed
@johnewart johnewart deleted the johnewart/ENG-3281 branch April 16, 2026 18:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants