Reducing memory consumption of the AWS SDK for Java v2 when loading S3 objects into memory

rtyley/aws-sdk-async-response-bytes

aws-sdk-async-response-bytes

This repo is for evaluation of different approaches to reducing memory consumption of the AWS SDK Java v2 async method AsyncResponseTransformer.toBytes(). The resulting PR is aws/aws-sdk-java-v2#4355.

Approaches under consideration for reducing memory usage:

  • Change A: Optimise byte storage, in two successive parts:
    • A1: Use the content length to initialise the ByteArrayOutputStream with a byte array of the right size.
    • A2: Use a simple fixed-size byte array in preference to a ByteArrayOutputStream.
  • Change B: Avoid performing a byte array copy when creating the ResponseBytes instance.

Changes A and B can each be applied on their own, but they reduce memory usage much further when combined.
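The difference between these approaches can be sketched with plain JDK classes. This is a minimal illustration, not the SDK's code: the class `BufferingSketch` and its method names are hypothetical, the incoming response is modelled as a list of `ByteBuffer` chunks, and the content length is assumed to be known and to fit in an `int`.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; names here are not taken from the SDK.
final class BufferingSketch {

    // Baseline: a default-sized stream reallocates its internal array as it grows
    // (old and new arrays coexist during each growth step), and toByteArray()
    // then makes one more full copy.
    static byte[] baseline(List<ByteBuffer> chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        appendAll(out, chunks);
        return out.toByteArray();
    }

    // A1: presize the stream from the content length so it never has to grow.
    // toByteArray() still copies the data once.
    static byte[] presized(List<ByteBuffer> chunks, int contentLength) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(contentLength);
        appendAll(out, chunks);
        return out.toByteArray();
    }

    // A2 + B: write straight into one fixed-size array and hand that same array
    // to the caller, with no intermediate stream and no final copy.
    static byte[] fixedArray(List<ByteBuffer> chunks, int contentLength) {
        byte[] result = new byte[contentLength];
        int offset = 0;
        for (ByteBuffer chunk : chunks) {
            ByteBuffer view = chunk.duplicate(); // leave the caller's position untouched
            int n = view.remaining();
            view.get(result, offset, n);
            offset += n;
        }
        if (offset != contentLength) {
            throw new IllegalStateException("Expected " + contentLength + " bytes, got " + offset);
        }
        return result;
    }

    private static void appendAll(ByteArrayOutputStream out, List<ByteBuffer> chunks) {
        for (ByteBuffer chunk : chunks) {
            ByteBuffer view = chunk.duplicate();
            byte[] tmp = new byte[view.remaining()];
            view.get(tmp);
            out.write(tmp, 0, tmp.length);
        }
    }

    public static void main(String[] args) {
        List<ByteBuffer> chunks = Arrays.asList(
                ByteBuffer.wrap("hello ".getBytes(StandardCharsets.UTF_8)),
                ByteBuffer.wrap("world".getBytes(StandardCharsets.UTF_8)));
        // All three strategies produce the same bytes; they differ only in how
        // many temporary arrays are alive at the same time.
        System.out.println(Arrays.equals(baseline(chunks), fixedArray(chunks, 11)));
    }
}
```

Returning the array without a defensive copy is only safe if nothing else keeps a reference to it and mutates it afterwards, which is the trade-off Change B accepts.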

Automated Memory-Consumption tests

The test script findMinMem.sh repeatedly downloads a 258 MB object from S3 into memory, while varying the amount of Java heap memory allocated with -Xmx, to find the amount of memory necessary for the download to consistently succeed with the given approach.
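One way such a search could work is a binary search over the `-Xmx` value. The sketch below is an assumption about the general technique, not a transcription of findMinMem.sh: the class `MinHeapSearch`, its methods, and the idea of a single pass/fail predicate are all hypothetical, and it assumes that any heap size above a succeeding one also succeeds.

```java
import java.io.IOException;
import java.util.function.IntPredicate;

// Hypothetical sketch; the real findMinMem.sh may search differently.
final class MinHeapSearch {

    // Returns the smallest heap size in [lowMb, highMb] for which the predicate
    // holds, assuming success is monotonic in the heap size.
    static int findMinHeapMb(int lowMb, int highMb, IntPredicate succeedsWithHeapMb) {
        if (!succeedsWithHeapMb.test(highMb)) {
            throw new IllegalArgumentException("Download fails even with " + highMb + " MB");
        }
        int lo = lowMb;
        int hi = highMb; // invariant: hi is known to succeed
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (succeedsWithHeapMb.test(mid)) {
                hi = mid;      // mid is enough, try smaller
            } else {
                lo = mid + 1;  // mid is too small
            }
        }
        return hi;
    }

    // Runs a child JVM with the given heap limit and reports whether it exited
    // cleanly. The classpath and main class are placeholders supplied by the caller.
    static boolean runsWithHeap(int heapMb, String classpath, String mainClass)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(
                "java", "-Xmx" + heapMb + "m", "-cp", classpath, mainClass)
                .inheritIO()
                .start();
        return process.waitFor() == 0;
    }
}
```

Because a download close to the limit may succeed only some of the time, a predicate used for a real measurement would need to repeat the run several times and count the heap size as sufficient only if every run succeeds.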

These are the resulting memory requirements, in MB, for every combination of the changes:

             Exclude A    A1      A1+A2
Exclude B    1070         934     644
Include B    787          657     357

The results show that all three changes (A1, A2, and B) are needed to reach the lowest memory requirement: 357 MB, down from 1070 MB with none of them applied.
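To put these figures in proportion, each requirement can be divided by the 258 MB object size. The small sketch below does only that arithmetic on the numbers reported above; the class name `HeapOverhead` is hypothetical.

```java
// Hypothetical helper: expresses each measured heap requirement as a multiple
// of the 258 MB object size, using only the figures reported above.
final class HeapOverhead {

    static final int OBJECT_MB = 258;

    // Required heap divided by the object size.
    static double ratio(int requiredHeapMb) {
        return (double) requiredHeapMb / OBJECT_MB;
    }

    public static void main(String[] args) {
        String[] labels = {"none", "A1", "A1+A2", "B", "A1+B", "A1+A2+B"};
        int[] requiredMb = {1070, 934, 644, 787, 657, 357};
        for (int i = 0; i < labels.length; i++) {
            System.out.printf("%-8s %5d MB  %.2fx object size%n",
                    labels[i], requiredMb[i], ratio(requiredMb[i]));
        }
    }
}
```

By this measure the unmodified code needs a heap a little over four times the object size, while the combination of all three changes needs well under one and a half times.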

Test details
