Fix: Connection test to not just check 200#688

Merged
tdewanNvidia merged 1 commit into main from tdewan/fix_conn on Mar 11, 2026
Conversation

@tdewanNvidia
Contributor

@tdewanNvidia tdewanNvidia commented Mar 11, 2026

Description

The connection check should not only check for a 200 response; it should retry on 429 and handle other status codes.

Issue - None

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Summary by CodeRabbit

  • New Features
    • URL test configuration now allows optional strict expected status or no expectation for flexible validation.
    • Added configurable retriable HTTP status codes to mark temporary failures vs permanent ones.
    • Behavior updated: if an expected status is set, responses must match; otherwise retriable codes trigger retries, 5xx are treated as failures, and other codes are treated as success.
    • Success logging now includes the observed status; retry/backoff behavior unchanged.
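The two configuration fields described above could look roughly like the sketch below. This is a hypothetical stand-in, not the actual class from connection_validator.py: the use of a dataclass and the example URL are assumptions; only the field names expected_status_code, retriable_status_codes, and url, and the defaults None and [429, 503], come from this PR.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class URLTestConfig:
    """Hypothetical stand-in for the real config class (assumed to be a dataclass)."""

    url: str
    # None means no strict expectation: the validator uses flexible handling.
    expected_status_code: Optional[int] = None
    # Codes treated as temporary failures. default_factory gives each
    # instance its own list instead of sharing one mutable default.
    retriable_status_codes: List[int] = field(default_factory=lambda: [429, 503])


# Flexible mode: any non-retriable, non-5xx response counts as reachable.
flexible = URLTestConfig(url="https://example.invalid/health")

# Strict mode: only an exact 200 counts as success.
strict = URLTestConfig(url="https://example.invalid/health", expected_status_code=200)
```

Leaving expected_status_code unset keeps an endpoint that answers 401 or 404 from being reported as unreachable, since such a response still shows the service is up.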

@tdewanNvidia tdewanNvidia requested a review from a team as a code owner March 11, 2026 19:43
@coderabbitai
coderabbitai bot commented Mar 11, 2026

Walkthrough

HTTP status handling in the connection validator was changed: expected_status_code on URLTestConfig is now optional (default None) and retriable_status_codes: List[int] was added (default [429, 503]). _connection_test now branches between strict equality (when expected set) and categorized handling (retriable, 5xx failure, or success) and includes observed status in success messaging.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Connection Validator: src/operator/utils/node_validation_test/connection_validator.py | Changed URLTestConfig.expected_status_code from int (default 200) to Optional[int] (default None). Added retriable_status_codes: List[int] (default [429, 503]). Updated _connection_test logic: if expected_status_code is set → strict equality check; if None → treat codes in retriable_status_codes as retriable (log warning, return None), treat 5xx as service failure (log error, return None), otherwise mark reachable (log info). Success logging and NodeCondition message now include the observed status code. |
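The branching summarized above can be sketched as a small pure function. This is an illustration under assumptions, not the merged _connection_test: the Outcome enum and classify_status are hypothetical names, and the real method is described as returning a NodeCondition on success and None for both the retry and failure cases.

```python
from enum import Enum
from typing import Optional, Sequence


class Outcome(Enum):
    """Hypothetical labels; the real code returns a NodeCondition or None."""

    SUCCESS = "success"
    RETRY = "retry"
    FAILURE = "failure"


def classify_status(
    status_code: int,
    expected_status_code: Optional[int] = None,
    retriable_status_codes: Sequence[int] = (429, 503),
) -> Outcome:
    """Classify an HTTP status code following the rules in the summary."""
    # Strict mode: only an exact match counts as success.
    if expected_status_code is not None:
        if status_code == expected_status_code:
            return Outcome.SUCCESS
        return Outcome.FAILURE

    # Flexible mode: the retriable check runs first, so a 503 listed as
    # retriable is retried rather than caught by the 5xx rule below.
    if status_code in retriable_status_codes:
        return Outcome.RETRY
    if status_code >= 500:
        return Outcome.FAILURE
    return Outcome.SUCCESS
```

For example, classify_status(404) yields Outcome.SUCCESS because the endpoint responded, while classify_status(404, expected_status_code=200) yields Outcome.FAILURE.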

Sequence Diagram

sequenceDiagram
    participant Client as HTTP Endpoint
    participant Validator as Connection Validator
    participant Config as URLTestConfig

    Client->>Validator: Respond with HTTP status
    Validator->>Config: Read expected_status_code
    alt expected_status_code is set
        Config-->>Validator: configured value
        Validator->>Validator: Compare observed == expected
        alt equal
            Validator->>Validator: Log success (include status), return NodeCondition
        else not equal
            Validator->>Validator: Log error, return None
        end
    else expected_status_code is None
        Config-->>Validator: None (flexible mode)
        Validator->>Config: Check retriable_status_codes
        alt status in retriable_status_codes
            Validator->>Validator: Log warning, return None (retry)
        else not retriable
            Validator->>Validator: Check if status is 5xx
            alt is 5xx
                Validator->>Validator: Log error, return None
            else
                Validator->>Validator: Log info, return NodeCondition (include status)
            end
        end
    end

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hopped to check a URL, ears all keen and bright,
If the code is 429 I pause and think tonight,
Five-hundreds make me thump — an error I will flag,
But two-hundreds make me cheer and nibble on a rag! 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Fix: Connection test to not just check 200' directly and clearly summarizes the main change: modifying connection validation to handle status codes beyond just HTTP 200. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
src/operator/utils/node_validation_test/connection_validator.py (1)

165-172: Consider clarifying the fallback behavior for 5xx codes.

The condition status_code in url_config.failure_status_codes or status_code >= 500 means all 5xx codes are treated as failures regardless of the failure_status_codes configuration. The explicit list provides documentation value but can't be used to exclude specific 5xx codes from being failures.

If this is intentional (5xx should always fail), consider simplifying or adding a comment to clarify:

# All 5xx codes are failures; explicit list is for clarity/documentation
if status_code >= 500:

If users should be able to configure which 5xx codes are failures, remove the or status_code >= 500 fallback.
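To make the difference between the two options concrete, here is a minimal sketch of each predicate next to the condition under review. The function names are hypothetical; failure_status_codes is the config field named in this comment, passed here as a plain collection.

```python
from typing import Collection


def is_failure_reviewed(status_code: int, failure_status_codes: Collection[int]) -> bool:
    """Condition as reviewed: the explicit list OR any 5xx."""
    return status_code in failure_status_codes or status_code >= 500


def is_failure_always_5xx(status_code: int) -> bool:
    """Option A: every 5xx is a failure; no list is consulted."""
    return status_code >= 500


def is_failure_configured_only(status_code: int, failure_status_codes: Collection[int]) -> bool:
    """Option B: only codes in the configured list are failures."""
    return status_code in failure_status_codes


configured = [500, 502]

# A 504 is not in the list, so only option B lets it through.
reviewed_504 = is_failure_reviewed(504, configured)
option_a_504 = is_failure_always_5xx(504)
option_b_504 = is_failure_configured_only(504, configured)
```

With this list, the reviewed condition and option A both flag 504 as a failure, while option B does not. When the list holds only 5xx codes, the reviewed condition and option A agree for every input.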

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/operator/utils/node_validation_test/connection_validator.py` around lines
165 - 172, The current check in the connection validator treats every 5xx as
failure by combining status_code in url_config.failure_status_codes or
status_code >= 500; decide and implement one of two fixes: (A) if 5xx should
always fail, simplify the condition to just if status_code >= 500 and add a
clarifying comment above it, or (B) if callers should be able to control which
5xx codes fail, remove the "or status_code >= 500" clause so only values in
url_config.failure_status_codes trigger the failure branch; update the logging
message in the failure branch (the block using status_code and url_config.url)
if you change the condition to reflect the chosen behavior.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c8137bdc-f014-4189-845c-c619b5502587

📥 Commits

Reviewing files that changed from the base of the PR and between 4cc76c2 and b13aabb.

📒 Files selected for processing (1)
  • src/operator/utils/node_validation_test/connection_validator.py

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
src/operator/utils/node_validation_test/connection_validator.py (1)

124-131: Docstring references hardcoded status codes that are actually configurable.

The docstring says "Retriable codes (429, 503)" but these values come from url_config.retriable_status_codes and can be customized per URL config. Consider updating the docstring to reflect this:

📝 Suggested docstring update
         Status code handling:
             - If expected_status_code is set, only that code is considered success
             - If expected_status_code is None (default):
-                - Retriable codes (429, 503) trigger retry
+                - Codes in retriable_status_codes (default: 429, 503) trigger retry
                 - Any other 5xx indicates service is down
                 - All other codes (2xx, 3xx, 4xx) indicate service is reachable
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/operator/utils/node_validation_test/connection_validator.py` around lines
124 - 131, Update the docstring to stop hardcoding "Retriable codes (429, 503)"
and instead state that retriable codes come from the URL configuration
(url_config.retriable_status_codes) and may be customized per URL; keep the rest
of the logic description (expected_status_code behavior, 5xx handling, reachable
codes) but reference url_config.retriable_status_codes where retriable codes are
mentioned and note the default set is used when not overridden.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 376c3327-4518-4e8a-a747-0a8421e2341e

📥 Commits

Reviewing files that changed from the base of the PR and between b13aabb and aef0005.

📒 Files selected for processing (1)
  • src/operator/utils/node_validation_test/connection_validator.py

@tdewanNvidia tdewanNvidia merged commit 71969cc into main Mar 11, 2026
9 checks passed
@tdewanNvidia tdewanNvidia deleted the tdewan/fix_conn branch March 11, 2026 20:28