Various performance improvements.#3178
Conversation
millems
commented
May 5, 2022
- Replace uses of String.replace with StringUtils.replace and StringUtils.replaceEach to avoid the overhead of regex parsing the search term.
- Remove uses of SdkHttpRequest and SdkHttpResponse's headers and rawQueryParameters calls in favor of more specific methods that do not require data copying. Added spotbugs rules to prevent uses of these methods in the future.
- Update SdkHttpRequest and SdkHttpResponse to not copy data in headers and rawQueryParameters when round-tripped to/from builder unless it is actually modified.
- Reduce string copies performed during message signing.
- Remove uses of String.format that were performance-impacting on profiler output.
- Remove construction of log strings within the netty client unless the logging is enabled.
- Move most user-agent construction to client creation time instead of at request time.
- Initialize collection/map capacities to prevent resizes that will always occur.
- Prevent creation of RateLimitingTokenBucket unless adaptive retries are enabled.
1. Replace uses of String.replace with StringUtils.replace and StringUtils.replaceEach to avoid the overhead of regex parsing the search term. 2. Remove uses of SdkHttpRequest and SdkHttpResponse's headers and rawQueryParameters calls in favor of more specific methods that do not require data copying. Added spotbugs rules to prevent uses of these methods in the future. 3. Update SdkHttpRequest and SdkHttpResponse to not copy data in headers and rawQueryParameters when round-tripped to/from builder unless it is actually modified. 4. Reduce string copies performed during message signing. 5. Remove uses of String.format that were performance-impacting on profiler output. 6. Remove construction of log strings within the netty client unless the logging is enabled. 7. Move most user-agent construction to client creation time instead of at request time. 8. Initialize collection/map capacities to prevent resizes that will always occur. 9. Prevent creation of RateLimitingTokenBucket unless adaptive retries are enabled.
| this.dependencies = dependencies; | ||
| this.requestPipeline = requestPipeline; | ||
| this.rateLimitingTokenBucket = new RateLimitingTokenBucket(); | ||
| this.rateLimitingTokenBucket = null; |
There was a problem hiding this comment.
Since you're reviewing, @dagnir: Isn't the RetryableStage created with each request, so this rate limiting token bucket is never reused between requests, making it essentially do nothing?
There was a problem hiding this comment.
I didn't fix that here. I can create a backlog item to fix it...
| this.dependencies = dependencies; | ||
| this.requestPipeline = requestPipeline; | ||
| this.rateLimitingTokenBucket = new RateLimitingTokenBucket(); | ||
| this.rateLimitingTokenBucket = null; |
|
Awesome stuff! |
|
SonarCloud Quality Gate failed. |
S3 team had been, partly on our reports [1] working on numerous performance enhancements [2] [3] that Apache James IMAP workload would definitely benefit from. [1] aws/aws-sdk-java-v2#3162 [2] aws/aws-sdk-java-v2#3175 [3] aws/aws-sdk-java-v2#3178 Upgrade
|
|
||
| SdkHttpFullRequest.Builder builder = doSign(request, requestParams, signingParams, | ||
| new ContentChecksum(digestHex, sdkChecksum)); | ||
| new ContentChecksum(digestHex, sdkChecksum)); |
There was a problem hiding this comment.
Is the indent change here intended? Or an IDE fail?
There was a problem hiding this comment.
Not intended, but more of an IDE success. The indent is an improvement, imo.
| * @return true if the character is white space, false otherwise. | ||
| */ | ||
| private boolean isWhiteSpace(final char ch) { | ||
| return ch == ' ' || ch == '\t' || ch == '\n' || ch == '\u000b' || ch == '\r' || ch == '\f'; |
There was a problem hiding this comment.
Guava's charmatcher allows denser checks, with a binary search and binary sets, which might be handy with 6 + chars to check for.... Especially if in the future you decide to add more.
There was a problem hiding this comment.
I considered doing something like a 256 size boolean array lookup table for chars being whitespace (with some special handling for unicode), but this method didn't show up very high on a profiler so it felt like premature optimization. It would be nice if we could have a Guava dependency, but since we have to implement it ourselves (or deal with importing it), I wasn't inclined.
| byte[] kSecret = ("AWS4" + credentials.secretAccessKey()) | ||
| .getBytes(Charset.forName("UTF-8")); | ||
| .getBytes(StandardCharsets.UTF_8); |
There was a problem hiding this comment.
Does this needs to be recomputed all of the time? Can't the credentials holds kSecret?
There was a problem hiding this comment.
That's a cool idea. We'd probably want to use a concurrent hash map with weak reference keys from credential to cached values so that we don't leak that information out of the signer, but that could definitely save some computation for reused credentials at the cost of some volatile reads.
There was a problem hiding this comment.
I thought credentials was immutable when commenting. If mutability is in the game it would ruin potentially ruin gains...
There was a problem hiding this comment.
Credentials is immutable, but they might be used for other signing algorithms, so we don't really want to pre-calculate it within the credentials when it's created.
| .stream() | ||
| .collect(Collectors.toMap(Map.Entry::getKey, e -> HeaderValue.fromString( | ||
| firstIfPresent(e.getValue())))); | ||
| Map<String, HeaderValue> headers = new LinkedHashMap<>(); |
There was a problem hiding this comment.
Anything preventing us to pre-size this hash map?
S3 team had been, partly on our reports [1] working on numerous performance enhancements [2] [3] that Apache James IMAP workload would definitely benefit from. [1] aws/aws-sdk-java-v2#3162 [2] aws/aws-sdk-java-v2#3175 [3] aws/aws-sdk-java-v2#3178







