make Postgres connect timeout configurable#148
Merged
Conversation
There was a problem hiding this comment.
Pull request overview
This PR makes the Postgres “connect with retry” wall-clock budget configurable via AGENTEVALS_DB_CONNECT_TIMEOUT_S, with a new 600s default intended to better tolerate slow Kubernetes database bring-up during startup.
Changes:
- Replace the fixed connect-retry deadline constant usage with a resolver function (
connect_deadline_seconds()) that readsAGENTEVALS_DB_CONNECT_TIMEOUT_S. - Increase the default connect-retry budget from 60s to 600s and document the env var override behavior.
- Add a Helm value (
database.postgres.connectTimeoutSeconds) and wire it into the Deployment asAGENTEVALS_DB_CONNECT_TIMEOUT_S.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/agentevals/storage/postgres/pool.py | Uses the new deadline resolver when retrying asyncpg pool warmup. |
| src/agentevals/storage/postgres/migrator.py | Introduces env-var based connect-retry budget resolution and updates retry deadline usage/default. |
| charts/agentevals/values.yaml | Adds a configurable Helm value for the DB connect timeout with documentation. |
| charts/agentevals/templates/deployment.yaml | Plumbs the Helm value into the container env as AGENTEVALS_DB_CONNECT_TIMEOUT_S. |
Comments suppressed due to low confidence (1)
src/agentevals/storage/postgres/migrator.py:288
connect_deadline_seconds()accepts non-finite floats likeNaN(e.g. env var value "nan"), which will propagate into the retry deadline math and can makesleep_forbecome NaN, causingasyncio.sleep()to raise and abort startup. Consider rejecting non-finite values (e.g.math.isfinite(val)), and falling back to the default with a warning as intended by the docstring.
def connect_deadline_seconds() -> float:
"""Resolve the connect-retry budget. Reads ``AGENTEVALS_DB_CONNECT_TIMEOUT_S``
and falls back to :data:`CONNECT_RETRY_DEADLINE_S` if the env var is
unset, empty, non-numeric, or non-positive."""
raw = os.getenv("AGENTEVALS_DB_CONNECT_TIMEOUT_S")
if raw is None or raw == "":
return CONNECT_RETRY_DEADLINE_S
try:
val = float(raw)
except ValueError:
logger.warning(
"Invalid AGENTEVALS_DB_CONNECT_TIMEOUT_S=%r (not a number); using default %.0fs",
raw,
CONNECT_RETRY_DEADLINE_S,
)
return CONNECT_RETRY_DEADLINE_S
if val <= 0:
logger.warning(
"Invalid AGENTEVALS_DB_CONNECT_TIMEOUT_S=%r (must be positive); using default %.0fs",
raw,
CONNECT_RETRY_DEADLINE_S,
)
return CONNECT_RETRY_DEADLINE_S
return val
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR makes the Postgres connect-retry budget tunable via
AGENTEVALS_DB_CONNECT_TIMEOUT_Swith a 600s default.