-
Notifications
You must be signed in to change notification settings - Fork 845
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add timeout feature for synchronous api calls #724
Conversation
@@ -299,9 +300,14 @@ default Builder retryPolicy(Consumer<RetryPolicy.Builder> retryPolicy) { | |||
* requests that don't get aborted until several seconds after the timer has been breached. Because of this, the client | |||
* execution timeout feature should not be used when absolute precision is needed. | |||
* | |||
* <p> | |||
* For synchronous streaming operations, customized implemenations of {@link ResponseTransformer} must handle interrupt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: just "implementations of".
} | ||
|
||
SyncTimeoutTask(Thread threadToInterrupt) { | ||
this.threadToInterrupt = Validate.paramNotNull(threadToInterrupt, "threadToInterrupt"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check seems strange to me seeing as the default ctor sets it to null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, when using the default ctor, the thread is not required. It is intended to avoid null threadToInterrupt
only when using this ctor. I've seen similar pattern in AwsBasicCredentials
.
Lines 57 to 68 in 92f35e0
protected AwsBasicCredentials(String accessKeyId, String secretAccessKey) { | |
this(accessKeyId, secretAccessKey, true); | |
} | |
private AwsBasicCredentials(String accessKeyId, String secretAccessKey, boolean validateCredentials) { | |
this.accessKeyId = trimToNull(accessKeyId); | |
this.secretAccessKey = trimToNull(secretAccessKey); | |
if (validateCredentials) { | |
Validate.notNull(this.accessKeyId, "Access key ID cannot be blank."); | |
Validate.notNull(this.secretAccessKey, "Secret access key cannot be blank."); | |
} |
I can probably create another private ctor adding boolean validateThread
param to make it more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that sort of makes sense, but I don't think the AwsBasicCredentials
case is the same since the validation behavior is controlled by another parameter. I just thought it was weird to bother checking since the class works whether or not the thread is null
since there's always a null check before using it
timeoutExceptionAssertion().accept(() -> retryableCallable().call()); | ||
} | ||
|
||
public static class SlowBytesResponseTransformer<ResponseT> implements ResponseTransformer<ResponseT, ResponseBytes<ResponseT>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason to test with these different specific transformer types as a opposed to just some generic "slow" transformer? The tests that use them look identical otherwise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, those tests are created to make sure we have InterruptMonitor.checkInterrupted();
in each of the response transformer types to cover the scenarios where the request time exceeds the configured time right after entering our defined response transformer.
https://github.com/aws/aws-sdk-java-v2/pull/724/files#diff-66508e304899c9905bc9d659b856b50dR119
|
||
return requestCallable.call(); | ||
try { | ||
return requestCallable.call(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In V1, for non-streaming, if request execution timeout is enabled, we count time to read the full HTTP response towards the timeout (by buffering the entity first). Do we this in v2? If not should we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't currently do this in V2, but I guess we can add a isBufferResponse
in ExecuteRequest
and pass it to the http clients. This seems a bit out of scope of this PR and I'd say creating another PR for this feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the goals of the new timers is to make the per attempt and per call timeouts behave more similiarly. So the per attempt timeout would not just be an abort on the HTTP request but a timer that interrupts as well and is scoped to the entirety of the things we do per request (i.e. unmarshalling for example). This would elimnate the need for buffering as we consume the data in unmarshalling which is subject to the timeout. Is this how the timeouts work (haven't looked at the PR)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the current timeout implementation is the same as v1, so attempt timeout is just aborting the request. I like the idea of scoping it to the entire thing and will update the PR to see how it looks. One concern I have though is now that both call timemout and attempt timeout can interrupt the thread, handling interrupt exception can get tricky and we might get more issues with uncleared interrupt status.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's always the tricky part of this feature. We also need to make sure the right exception is thrown. I.E. if api call attempt and api call timeouts trigger at the same time then api call timeout should be thrown as it's non-retryable.
timeoutTracker.abortable(requestCallable); | ||
|
||
context.apiCallAttemptTimeoutTracker(timeoutTracker); | ||
context.apiCallTimeoutTracker().abortable(requestCallable); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this redundant with line 71?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intended. line 71 is setting the abortable for ApiCallAttemptTimeoutTracker
whereas line 74 is setting the abortable for ApiCallTimeoutTracker
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah okay I see it
} | ||
} | ||
|
||
public static void wastingTimeInterrupitably() throws InterruptedException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "interruptibly" mispelled
* Schedule a {@link TimeoutTask} that aborts the task if not otherwise completed before the given timeout. | ||
* | ||
* @param timeoutExecutor the executor to execute the {@link TimeoutTask} | ||
* @return a {@link TimeoutTracker} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add docs for the other params as well. It's hard to tell what isInterruptedThread
is without looking at the source. I'd also suggest changing it to interruptCurrentThread
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad. will fix.
* InterruptedException} is thrown from a interruptible task, you should either re-interrupt the current thread or throw that | ||
* {@link InterruptedException} from the {@link #apply(Object, AbortableInputStream)} method. Failure to do these things may | ||
* prevent the SDK from stopping the request in a timely manner in the event the thread is interrupted externally. | ||
* InterruptedException} is thrown from a interruptible task, you should throw that {@link InterruptedException} from the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we remove the ability for customers to just interrupt the thread without (re)throwing an InterruptedException
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the customer just sets the interrupt flag, we are relying on InterruptMonitor.checkInterrupted();
to throw InterruptedException
and it can only happen after transform
method, so the timeout might be far off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you mean? Can we add a check for Thread.isInterrupted()
after we call transform
?
To me it's perfectly valid for customers to do this in their transform methods:
MyObject transform() {
try {
// something interruptible
} catch (InterruptedException e) {
log.error("Transform was interrupted", e);
Thread.interrupt();
return null;
}
}
We have an explicit test for this behavior in the S3 module (though it's currently disabled): https://github.com/aws/aws-sdk-java-v2/blob/2.0.0-preview-12/services/s3/src/it/java/software/amazon/awssdk/services/s3/GetObjectFaultIntegrationTest.java#L108-L129
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, we can add the check after we call transform
, but the time when the request actually gets timed out might be far from accurate.
For example, say we set the request timeout to 1s
and if the transform
method takes more than 5s
, the request only will throw TimeoutException
after 5s when it goes to InterruptMonitor.checkInterrupted()
, or even worse, if the request gets stuck in transform
, TimeoutException will never get thrown.
Yeah, I can see it's valid for customers to just re-interrupt the thread, but my point is it's highly possible that they get TimeoutException long after the configured timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we already state in the doc that customers should throw an InterruptedException
if their transform method gets interrupted, so I guess I don't see how that's too different (as far as timing accuracy) if instead of re-throwing the exception, they set the flag and return right away and on the SDK side we have a check for the flag as soon as transform()
returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we should be fine for most of the cases. I was thinking of one edge case where the customers just set the interrupted flag without re-throwing any exception and do some time-consuming task after it.
I will update the docs.
*/ | ||
private RuntimeException handleInterruptedException(RequestExecutionContext context, InterruptedException e) { | ||
if (e instanceof SdkInterruptedException) { | ||
((SdkInterruptedException) e).getResponseStream().ifPresent(r -> invokeSafely(r::close)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it would be better to set the current response in RequestExecutionContext and close it on any exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are actually closing inputStream in HandleResponseStage
.
This is to close the response stream if the thread gets interrupted before that line.
Line 63 in 28a5f48
closeInputStreamIfNeeded(httpResponse, didRequestFail); |
import software.amazon.awssdk.core.exception.SdkServiceException; | ||
import software.amazon.awssdk.core.http.HttpResponseHandler; | ||
|
||
public class ResponseHandlerTestUtils { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit final
@@ -81,10 +83,14 @@ | |||
*/ | |||
default ReturnT apply(ResponseT response, AbortableInputStream inputStream) throws Exception { | |||
try { | |||
return transform(response, inputStream); | |||
} catch (RetryableException e) { | |||
InterruptMonitor.checkInterrupted(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason we do this here instead instead of in
Line 124 in 716acc5
return responseTransformer.apply(resp, response.content().get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, same as exception handling logic here. I think we should probably move the whole thing to the SyncClientHandler
public void run() { | ||
log.debug(() -> "Timing out, aborting the task"); | ||
hasExecuted = true; | ||
if (!threadToInterrupt.isInterrupted()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this check necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, will remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f2aca09
to
a62a2c5
Compare
a62a2c5
to
4ab4f25
Compare
…637064dd Pull request: release <- staging/e795d56a-1e26-460a-848f-bacb637064dd
Description
Add timeout feature for synchronous api calls to allow customers to configure how long the api call or api call attempt should timeout if it didn't complete within the configured time.
It works the same way as v1.Updated: the apiCallTimeout works the same way as v1,ApiCallAttemptTimeout
, on the other hand, behaves differently compared with v1. It will not only abort the request but also interrupt the thread just like apiCallTimeout.ApiCallTimeout
: a timeout task is scheduled to a separate thread pool and it will interrupt and the current thread and abort the underlying http request when the total request time exceeds the configured timeout.ApiCallAttemptTimeout
: a timeout task is scheduled to a separate thread pool and it will interrupt the thread and abort the http request when times up.NOTE:
ApiCallAttemptTimeout
will not abort the request if the request is stuck in theResponseTransformer
because the timer task is being cancelled inMakeHttpRequestStage
andResponseTransformer
is triggered after that. This is different from asynchronous timeout where the timer task is being cancelled upon the completion of the completable future.Testing
Integration tests for timeouts are passed.
All integration tests are passed.
Screenshots (if appropriate)
Types of changes
Checklist
mvn install
succeedsLicense