-
Notifications
You must be signed in to change notification settings - Fork 820
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add apiCallAttempt timeout and apiCall timeout feature for asynchrounous operations #657
Conversation
9425025
to
a913b3d
Compare
Still going through it but I had a some questions regarding the async timeouts: What are the situations in which a customer would want to use a timer on an async call? Since we return a e.g. CompletableFuture<GetObjectResponse> f = s3Client.getObject(request, toFileTransformer);
// do other stuff
f.get(10, TimeUnit.MILLISECONDS); I think we should also clearly define what it means to timeout an async task since by definition it doesn't happen synchronously with the current thread. For example, does the timer start when we submit the request to async client? Or when the async client actually starts executing the request? |
I think timeouts for async is more about releasing resources than customer SLAs. I.E. close the connection and give it back to the pool, don't submit to our future completion executor, etc. |
Customer might still want to timeout if the request is taking too long and could take up resources and there might be other chained operations in the For example:
Side note: Java 9 introduced
Yeah, I'll add more javadocs. For For |
@shorea I thought about that as well, but it seems like timeout is the wrong place to hang it on. Resource cleanup should all take place if the customer cancels the future while the request is executing as well right? I'm not sure sure if we do that today, but I don't think we do. |
@@ -87,6 +94,10 @@ | |||
public static final SdkClientOption<SdkHttpClient> SYNC_HTTP_CLIENT = | |||
new SdkClientOption<>(SdkHttpClient.class); | |||
|
|||
public static final SdkClientOption<Duration> API_CALL_ATTEMPT_TIME_OUT = new SdkClientOption<>(Duration.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should be TIMEOUT
to match spelling of the getter/setter as well as the spelling of timeout in other places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 good catch. it's too tempting to separate them when they are all uppercase!
} | ||
|
||
@Override | ||
public void run() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to do anything special when streaming responses? Do we bubble an exception to the Subscriber
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that I'm aware of. The exception is bubbled to AsyncResponseTransformer.exceptionOccurred
.
We have a test for streaming operation: https://github.com/aws/aws-sdk-java-v2/pull/657/files#diff-a78559873b6762403653c7ac0f1beb9eR103
|
||
/** | ||
* @return True if client execution has been aborted by the timer task. False otherwise | ||
*/ | ||
boolean hasTimeoutExpired(); | ||
boolean isTimedOut(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: this makes it sound like the task itself has timed out. Maybe something like hasExecuted
?
+1 to cleaning up if the user cancels the future (eg. if they're using Java 9's I still think we should support global timeouts within the SDK so that the customer doesn't always have to enforce it at the future level, and can use the same timeout configuration with sync and async. |
@@ -73,6 +79,43 @@ protected RequestOverrideConfiguration(Builder<?> builder) { | |||
return apiNames; | |||
} | |||
|
|||
/** | |||
* The amount of time to wait for the http request to complete before giving up and timing out. A positive value | |||
* enables this feature. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be "this value will always be positive, if present"? (same comment below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, fixed
* don't get aborted until several seconds after the timer has been breached. Because of this, the request timeout | ||
* feature should not be used when absolute precision is needed. | ||
* | ||
* @see Builder#apiCallAttemptTimeout(Duration) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a section here that differentiates between this and apiCallTimeout
? I see there's one down there already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 fixed
@@ -226,6 +230,7 @@ private SdkClientConfiguration finalizeAsyncConfiguration(SdkClientConfiguration | |||
*/ | |||
private SdkClientConfiguration finalizeConfiguration(SdkClientConfiguration config) { | |||
return config.toBuilder() | |||
.option(TIMEOUT_EXECUTOR_SERVICE, resolveTimeoutExecutorService()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial thought without looking further in the code first: We already have an internal scheduled executor service for retries. Can we use the same one for timeouts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, we can, probably need to rename the existingASYNC_RETRY_EXECUTOR_SERVICE
to SDK_INTERNAL_EXECUTOR_SERVICE
(or something else)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reused the scheduled executor and renamed it to SCHEDULED_EXECUTOR_SERVICE
} | ||
|
||
@Override | ||
public Builder toBuilder() { | ||
return new DefaultClientOverrideConfigurationBuilder().advancedOptions(advancedOptions.toBuilder()) | ||
.headers(headers) | ||
.retryPolicy(retryPolicy) | ||
.apiCallAttemptTimeout(apiCallAttemptTimeout) | ||
.apiCallAttemptTimeout(apiCallAttemptTimeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch! fixed
public static ApiCallAttemptTimeoutException create() { | ||
return builder().message("HTTP request execution did not complete before the specified timeout configuration.") | ||
.build(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we specify the timeout that was configured here as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 updated
return requestConfig.apiCallAttemptTimeout().get().toMillis(); | ||
} | ||
|
||
if (apiCallAttemptTimeout != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems duplicated from elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the other place is getApiTimoeut
and this is getApiCallAttemptTimeout
@@ -51,6 +51,7 @@ private SdkException handleServiceException(SdkHttpFullResponse response, Execut | |||
try { | |||
return delegate.handle(response, executionAttributes); | |||
} catch (InterruptedException e) { | |||
Thread.currentThread().interrupt(); | |||
throw e; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question here. Where were we not interrupting the thread before? Can we just fix it there instead of re-interrupting here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, reverted the change and will create a backlog item to track those.
@@ -31,7 +31,7 @@ log4j.logger.org.apache.http.wire=INFO | |||
log4j.logger.org.apache.http=DEBUG | |||
# log4j.logger.org.apache.http.wire=DEBUG | |||
# | |||
# log4j.logger.com.amazonaws=DEBUG | |||
log4j.logger.sofeware.amazonaws.awssdk=TRACE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
software*? needed?
@@ -27,8 +27,9 @@ log4j.appender.A1.layout.ConversionPattern=%d [%t] %-5p %c - %m%n | |||
#log4j.logger.httpclient.wire=DEBUG | |||
|
|||
# HttpClient 4 Wire Logging | |||
log4j.logger.org.apache.http.wire=DEBUG | |||
log4j.logger.org.apache=TRACE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these needed?
public static Duration isPositiveIfPresent(Duration duration, String fieldName) { | ||
if (duration == null) { | ||
return null; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe isPositiveOrNull
? "Present" implies there's an optional involved, to me. Can isPositive
delegate to here now for its positivity check?
I also think the naming convention elsewhere is to include "param" in the method name if the second parameter is a field name. Otherwise it would be a string format for the caller to specify the whole message... but it looks like that's already been broken by isPositive
, so nevermind!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might have missed it, but are API call attempt timeouts implement for sync calls?
context.apiCallTimeoutTracker(timeoutTracker); | ||
|
||
future.whenComplete((o, t) -> { | ||
if (t != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this check here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need the checks here because we don't want it to timeout if the completableFuture fails exceptionally. I'll add a test for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the future has already been completed, the timeout task should not have any effect though right? Unless it tries to force the timeout exception by using obtrudeException
: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletableFuture.html#obtrudeException-java.lang.Throwable-
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops.. I meant to write if (t == null)
(The unit tests succeeded because there were either response unmarshaller expception or ApiCallAttemptTimeoutException got retried). But yeah, I think you are right, we should remove the check here.
public void run() { | ||
log.trace(() -> "Times up. Aborting the request"); | ||
hasExecuted = true; | ||
if (completableFuture != null && !completableFuture.isDone()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would we have a null
future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, it should not be null. I'll add the validation in the constructor.
|
||
@Override | ||
public void cancel() { | ||
if (future != null && !future.isDone()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, would we ever have a null
future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup. same as above, will remove the null check here and add validation in the constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some weird linebreaks in the javadocs for apiCallTimeout
, apiCallAttemptTimeout
getters/setters. Can we fix those as well?
|
||
@Override | ||
public void cancel() { | ||
if (!future.isDone()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need this check. If the future is already complete cancel()
should have no effect. There's also no guarantee isDone
would still be false when we call cancel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 fixed
public static <T> TimeoutTracker timeCompletableFuture(CompletableFuture<T> completableFuture, | ||
ScheduledExecutorService timeoutExecutor, | ||
SdkClientException exceptionToThrow, | ||
long timeoutInMills) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: Taking Duration
as a param would be more flexible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm. I actually prefer using long here because we don't need another null check here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough; it's an internal util so I' don't feel strongly either way.
* | ||
* @param duration Number to validate | ||
* @param fieldName Field name to display in exception message if not positive. | ||
* @return Duration if positive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "or null"
return firstValue; | ||
} | ||
|
||
T value = fallbackValue.get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If fallbackValue
is empty get()
will throw an exception here. I think we can just return fallbackValue
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops nm, it's a Supplier
, not an Optional
. In that case we can just change this to Optional.ofNullable(value)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Can you add a changelog message? Can you also clarify in your commit and and changelog message that it's for async only?
6d498ac
to
cf225b2
Compare
…1d0a417c Pull request: release <- staging/388a3a8b-bd1c-4999-82b9-975d1d0a417c
.option(SdkClientOption.ASYNC_RETRY_EXECUTOR_SERVICE, Executors.newScheduledThreadPool(1)) | ||
.option(SdkAdvancedAsyncClientOption.FUTURE_COMPLETION_EXECUTOR, | ||
Runnable::run) | ||
.option(SdkClientOption.SCHEDULED_EXECUTOR_SERVICE, Executors.newScheduledThreadPool(1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isnt this thread pool technically unbounded? The core pool size is 1, but the max pool is int.max.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the work queue used by the scheduledThreadPool is unbounded. Because the JDK classes only use the core pool size until the work queue is full, and the work queue is unbounded, the maximumPoolSize within the scheduled thread pool should never used.
Add apiCallAttempt timeout and apiCall timeout feature for asynchronous operations.
I renamed the original timeout classes. Below are the new classes:
TimeoutTask
interface of the timer task to abort the abortable->
AsyncTimeoutTask
implementation for asynchronous operationTimeoutTracker
interface to track theTimeoutTask
and theScheduledFuture
->
ApiCallTimeoutTracker
default implementations to track Api call time out taskAsyncTimeoutTask
->
NoOpTimeoutTracker
when timeout is not enabledI removed all the existing timer unit tests using theAmazonSyncHttpClient
because I'm working on adding more tests using the generated clients and those tests would cover the existing tests.Second thoughts: will fix them and add them backAdded them back.
TODOs:
add more protocol tests with both apiCallTimeout and apiCallAttemptTimeout enabled.add javadocsclean up logs