Skip to content

Conversation

@ianbotsf
Copy link
Contributor

Issue #

Addresses #20 and smithy-kotlin#224.

Description of changes

This change adds support for retries in the runtime (but does not yet integrate them into codegen—separate PR coming). See the included design doc (docs/design/retries.md) for more details.

Companion PR: smithy-kotlin#487

Scope

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Copy link
Contributor

@aajtodd aajtodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, couple questions

import aws.smithy.kotlin.runtime.retries.RetryErrorType.*
import aws.smithy.kotlin.runtime.retries.impl.StandardRetryPolicy

object SdkRetryPolicy : StandardRetryPolicy() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit/style

AwsDefaultRetryPolicy

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix

This needs to evaluate transient errors doesn't it?

"RequestLimitExceeded" to Throttling,
"RequestThrottled" to Throttling,
"RequestThrottledException" to Throttling,
"RequestTimeout" to Timeout,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correctness

Aren't both RequestTimeout and RequestTImeoutException classified as Transient?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec calls these kinds of errors "transient" but our code calls them "timeout" errors. Transient isn't a good name for these since that's a class of error that also encompasses throttling and many server-side error conditions.

"TransactionInProgressException" to Throttling,
)

override fun evaluateOtherExceptions(ex: Throwable): RetryDirective? = when (ex) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correctness

Don't we need to classify raw HTTP status codes 500, 502, 503, and 504 as Transient?

Copy link
Contributor Author

@ianbotsf ianbotsf Oct 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, seems I missed that part. In order to accomplish that, we'll need access to the status code of the response which is only available in the metadata when casting the protocolResponse to an HttpProtocolResponse, which is defined in smithy-kotlin:runtime:protocol:http. Given that SdkRetryPolicy is currently located in aws-sdk-kotlin:aws-runtime:aws-core, it seems we have some options:

  • Make aws-sdk-kotlin:aws-runtime:aws-core depend on smithy-kotlin:runtime:protocol:http. Seems odd to have a "core" package depend on a protocol implementation but maybe that's okay for AWS SDK since all the services run on HTTP?
  • Move the retry policy to aws-sdk-kotlin:aws-runtime:protocols:http, which already depends on smithy-kotlin:runtime:protocol:http. Seems odd to have a retry policy defined somewhere protocol-specific but again maybe that's okay for AWS SDK since all the services run on HTTP?
  • Move statusCode up to ProtocolResponse and make it a generic feature of all protocols. While this is the most expedient, I like it the least because I don't think it's true all protocols will have responses with status codes.

Which is least odd?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make aws-sdk-kotlin:aws-runtime:aws-core depend on smithy-kotlin:runtime:protocol:http

I don't love this. I just moved config out of aws-core for similar reasons. It's an option though.

Move the retry policy to aws-sdk-kotlin:aws-runtime:protocols:http, which already depends on smithy-kotlin:runtime:protocol:http. Seems odd to have a retry policy defined somewhere protocol-specific but again maybe that's okay for AWS SDK since all the services run on HTTP?

This is closer to what makes sense to me. I could also see defining the core logic in aws-core (basically what you have now) and having it be an open class that can be further customized for HTTP protocols?

FWIW this looks to be where Rust defined theirs (aws-http).

I don't love (3) either because it defeats the entire point of the generic interface. At that point we should maybe reconsider the design/relationships.

Copy link
Contributor

@aajtodd aajtodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix and ship!

"TransactionInProgressException" to Throttling,
)

internal val knownStatusCodes = mapOf(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correctness

Aren't these Transient errors?

Copy link
Contributor

@aajtodd aajtodd Oct 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also don't we need to define error codes for RequestTimeout and RequestTimeoutException as Transient error codes?
Nevermind, saw your other comment. We apparently call these Timeout

The known status codes still looks incorrect though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right you are, they should be Timeout.

@ianbotsf ianbotsf merged commit 4ae0349 into main Oct 5, 2021
@ianbotsf ianbotsf deleted the retries branch October 5, 2021 19:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants