Reducing memory consumption of the AWS SDK for Java v2 when loading S3 objects into memory
This repo evaluates different approaches to reducing the memory consumption of the AWS SDK for Java v2 async method `AsyncResponseTransformer.toBytes()`.
The resulting PR is aws/aws-sdk-java-v2#4355.
Approaches under consideration for reducing memory usage:
- Change A: Optimise byte storage, in two successive parts (A1 and A2)
- Change B: Avoid performing a byte array copy when creating the `ResponseBytes` instance
Approaches A & B can be considered individually, but work much better when combined.
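To make the idea behind Change B concrete, here is a minimal, self-contained sketch. It is not the SDK's code: `BytesHolder`, `copyOf`, and `wrap` are hypothetical names used only for illustration. It shows that a defensive copy allocates a second array the same size as the payload, while wrapping the existing array does not.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a response wrapper; NOT the SDK's ResponseBytes. */
final class BytesHolder {
    private final byte[] bytes;

    private BytesHolder(byte[] bytes) {
        this.bytes = bytes;
    }

    /** Defensive copy: allocates a second array of the same length as the source. */
    static BytesHolder copyOf(byte[] source) {
        return new BytesHolder(Arrays.copyOf(source, source.length));
    }

    /** No copy: the holder shares the caller's array, so the caller must not mutate it afterwards. */
    static BytesHolder wrap(byte[] source) {
        return new BytesHolder(source);
    }

    /** True if this holder is backed by exactly the given array instance. */
    boolean sharesArrayWith(byte[] other) {
        return bytes == other;
    }

    int length() {
        return bytes.length;
    }

    public static void main(String[] args) {
        byte[] downloaded = new byte[1024]; // stands in for the downloaded object
        System.out.println("copyOf shares array: " + copyOf(downloaded).sharesArrayWith(downloaded));
        System.out.println("wrap shares array:   " + wrap(downloaded).sharesArrayWith(downloaded));
    }
}
```

The trade-off is safety versus memory: the wrapping variant is only sound if nothing else modifies the array after it has been handed over.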
The test script `findMinMem.sh` repeatedly downloads a 258 MB object from S3 into memory, while varying the amount of Java heap memory allocated with `-Xmx`, to find the amount of memory necessary for the download to consistently succeed with the given approach.
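The search for a minimum heap size can be sketched as follows. This is an illustration under stated assumptions, not a description of how `findMinMem.sh` actually works: it assumes success is monotonic in heap size and uses a binary search, and the names `MinHeapSearch`, `findMinHeapMb`, and `consistently` are hypothetical. The predicate stands in for forking a JVM with a given `-Xmx` value and checking whether the download succeeded.

```java
import java.util.function.IntPredicate;

/** Hypothetical sketch of a minimum-heap search; the real script may use a different strategy. */
final class MinHeapSearch {

    /**
     * Returns the smallest heap size in [lowMb, highMb] for which the trial succeeds,
     * or -1 if the trial fails even at highMb.
     * Assumption: if a trial succeeds at N MB, it also succeeds at any larger size.
     */
    static int findMinHeapMb(int lowMb, int highMb, IntPredicate succeedsWithHeapMb) {
        if (!succeedsWithHeapMb.test(highMb)) {
            return -1;
        }
        int lo = lowMb;
        int hi = highMb; // invariant: a trial at hi succeeds
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (succeedsWithHeapMb.test(mid)) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return hi;
    }

    /** A heap size counts as sufficient only if every one of the repeated runs succeeds. */
    static IntPredicate consistently(IntPredicate singleRun, int repetitions) {
        return heapMb -> {
            for (int i = 0; i < repetitions; i++) {
                if (!singleRun.test(heapMb)) {
                    return false;
                }
            }
            return true;
        };
    }

    public static void main(String[] args) {
        // Simulated run: pretend the download needs at least 357 MB of heap.
        // A real run would instead fork a JVM with -Xmx<heapMb>m and check its exit status.
        IntPredicate simulatedRun = heapMb -> heapMb >= 357;
        System.out.println(findMinHeapMb(256, 2048, consistently(simulatedRun, 20)));
    }
}
```

Requiring every repetition to succeed matters because an out-of-memory failure near the threshold can be intermittent, so a single successful run is weak evidence.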
These are the resulting memory requirements, in MB, for every combination of the changes:
| | Exclude A | A1 | A1+A2 |
|---|---|---|---|
| Exclude B | 1070 | 934 | 644 |
| Include B | 787 | 657 | 357 |
From the results, it's clear that we need all 3 changes (A1, A2, & B) in order to get the lowest memory usage: 357 MB, compared with 1070 MB with none of the changes applied, a reduction of roughly two thirds.
- JVM invocation: forked (for each invocation of `GetObject`)
- Invocation repetition count for success: 20
- SdkAsyncHttpClient: `AwsCrtAsyncHttpClient` (the AWS CRT-based HTTP client)
- Download integrity: SHA-256 check against known hash
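The download integrity check listed above can be sketched with the JDK's `MessageDigest`. This is a minimal illustration, not the repo's actual check: the class and method names (`Sha256Check`, `sha256Hex`, `matches`) are hypothetical, and the example hashes a short string rather than a downloaded S3 object.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that compares in-memory bytes against a known SHA-256 hex digest. */
final class Sha256Check {

    /** Returns the SHA-256 digest of the data as a lowercase hex string. */
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(data);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform implementation is required to support SHA-256.
            throw new IllegalStateException(e);
        }
    }

    /** True if the data hashes to the expected hex digest (case-insensitive). */
    static boolean matches(byte[] data, String expectedHex) {
        return sha256Hex(data).equalsIgnoreCase(expectedHex);
    }

    public static void main(String[] args) {
        byte[] data = "abc".getBytes(StandardCharsets.US_ASCII);
        String known = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad";
        System.out.println(matches(data, known));
    }
}
```

Hashing the bytes after they are fully in memory confirms that a memory-saving change has not corrupted or truncated the payload.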