Skip to content

Conversation

@bryancall
Copy link
Contributor

When a plugin uses TSHttpTxnServerAddrSet() to set a server address and the connection fails, ATS should retry by calling the OS_DNS hook again to allow the plugin to provide an alternative address. However, the retry logic was missing handling for the USE_API case.

Changes:

  • Added retry handling for ResolveInfo::OS_Addr::USE_API in handle_response_from_server()
  • Clears dns_info.resolved_p to trigger OS_DNS hook re-execution
  • Resets os_addr_style to TRY_DEFAULT to allow normal flow
  • Clears server_request to allow it to be rebuilt for new destination
  • Added api_server_addr_set_retried flag to prevent infinite retry loops
  • Added test plugin and autest to verify the fix
  • Updated TSHttpTxnServerAddrSet documentation to clarify retry behavior

Fixes #12611

When a plugin uses TSHttpTxnServerAddrSet() to set a server address and the
connection fails, ATS should retry by calling the OS_DNS hook again to allow
the plugin to provide an alternative address. However, the retry logic was
missing handling for the USE_API case.

Changes:
- Added retry handling for ResolveInfo::OS_Addr::USE_API in handle_response_from_server()
- Clears dns_info.resolved_p to trigger OS_DNS hook re-execution
- Resets os_addr_style to TRY_DEFAULT to allow normal flow
- Clears server_request to allow it to be rebuilt for new destination
- Added api_server_addr_set_retried flag to prevent infinite retry loops
- Added test plugin and autest to verify the fix

This matches the documented behavior in TSHttpTxnServerAddrSet API docs which
states: 'Plugins should be prepared for TS_HTTP_OS_DNS_HOOK and any subsequent
hooks to be called multiple times.'
@bryancall bryancall added this to the 10.2.0 milestone Dec 5, 2025
@bryancall bryancall self-assigned this Dec 5, 2025
@bryancall
Copy link
Contributor Author

bryancall commented Dec 5, 2025

Why limit to 1 retry:

The retry logic intentionally allows only one additional OS_DNS hook call rather than using connect_attempts_max_retries. This avoids a combinatorial explosion of connection attempts:

  • With 1 OS_DNS retry and connect_attempts_max_retries=3: up to 6 total connection attempts (2 addresses × 3 retries each)
  • If we used connect_attempts_max_retries for OS_DNS retries too: up to 12 attempts (4 addresses × 3 retries each)

The single retry gives the plugin one opportunity to provide an alternative address on connection failure, which matches what issue #12611 requested. This keeps latency bounded while still enabling the documented retry behavior.

Since this is working and we have a test, adding an additional configuration option on how many times to retry the DNS would be easy to do.

The StillRunningAfter check was failing because processes terminate
after the test completes. Since we only need to verify the log content,
not that processes stay running, remove this check.
@bryancall bryancall requested a review from masaori335 December 8, 2025 22:51
Copy link
Contributor

@masaori335 masaori335 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding ResolveInfo::OS_Addr::USE_API case seems make sense. Also, retrying only once is reasonable.

@bryancall bryancall merged commit bb34d9e into apache:master Dec 10, 2025
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

retry fails if api_server_addr_set is true

2 participants