Skip to content

Conversation

XJDKC
Copy link
Member

@XJDKC XJDKC commented Jun 18, 2025

Summary

Fixes an issue where retry flag was-retried was not accessible after request execution due to context isolation.

Previously, requests were executed with a BasicHttpContext instance, which caused Apache HttpClient to create internal child contexts. As a result, any attributes (such as was-retried) set during the retry process were not visible to the caller after the response.

This change ensures that a shared HttpClientContext is used when executing requests, allowing retry-related attribute to persist and be accessible for error handling and observability.

Prior work: #12818

Changes

Use HttpClientContext.create() instead of BasicHttpContext.

Testing

Adding a unit test that simulates a self-conflict table corruption scenario.

@github-actions github-actions bot added the core label Jun 18, 2025
Copy link
Contributor

@singhpk234 singhpk234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awaiting UT test as per offline discussion !

ErrorResponse enrichedErrorResponse =
ErrorResponse.builder()
.wasRetried(wasRetried == Boolean.TRUE)
.wasRetried(Boolean.TRUE.equals(wasRetried))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it required ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not required, I just got used to using equals after seeing some static analysis warnings. Using == may not pass certain checks, even though it's safe here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also interested in this change, but if it avoids a warning, sounds good to me :)

Copy link
Contributor

@singhpk234 singhpk234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM !
Thanks @XJDKC ! Nice catch.

// headers.
// Return the path generated for the test case, so that the client can call that path to exercise
// it.
private static String addRequestTestCaseAndGetPath(HttpMethod method, Item body, int statusCode)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since nothing else broke I think the change here is ok, but we should probably update the doc to note that this will only set the response for the "next" invocation of the path correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the doc, I'll address this more thoroughly in the follow-up refactoring.

In most cases, we should return a consistent response. However, for retry scenarios, we may need finer control over the responses across retries. For example, it may make sense for the final mock request in a retry sequence to return a fixed, constant response.

Copy link
Member

@RussellSpitzer RussellSpitzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, thanks for validating the issue with non 503 errors as well. I left a few notes on test layout.

@RussellSpitzer RussellSpitzer requested a review from rdblue June 18, 2025 16:13
@RussellSpitzer
Copy link
Member

@XJDKC We may be going a different route with this, there were some offline discussions and we are considering just disabling all retries to just avoid this issue entirely and just throw CSU for all 5XX errors

@XJDKC
Copy link
Member Author

XJDKC commented Jun 18, 2025

@XJDKC We may be going a different route with this, there were some offline discussions and we are considering just disabling all retries to just avoid this issue entirely and just throw CSU for all 5XX errors

Got it, so we're planning to retry only when the error code is 429?

@XJDKC XJDKC closed this Jun 18, 2025
@XJDKC
Copy link
Member Author

XJDKC commented Jun 18, 2025

Just realized that this PR is to fix the HttpContext creation, so we still need to check in this fix but it's not a required fix for the protential table corruption.
cc: @RussellSpitzer

@XJDKC XJDKC reopened this Jun 18, 2025
@nastra
Copy link
Contributor

nastra commented Jun 24, 2025

given that the was-retried flag is being removed in #13352 is this PR still needed?

Copy link

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants