Improve InputStreamResponseTransformer subscription feedback #19

Merged

Conversation

@grddev grddev commented Apr 28, 2020

Sadly, it turns out that the simplifications in #16 didn't really cut it from a performance perspective, so this instead tries to improve upon the existing byte-based buffering.

Ideally, none of this would be a problem: if the numbers you request from the aws-sdk were expressed in bytes rather than chunks, you could simply bump the request up to the buffer size whenever the combined buffered size dropped below that point.

The main problem in estimating the impact of the additional requests is that we don't know how large the corresponding chunks will be. This uses a simple exponentially-weighted average over chunk sizes received in the past and assumes that the in-flight chunks are equally large. If the chunk sizes stay consistent, or don't change too dramatically in size, this should work well.

In the event that the chunk sizes grow significantly we might still overshoot by a lot, so as an additional precaution we don't allow more than 1000 chunks of unknown size in flight. If the chunk sizes change rapidly and grow beyond 32k we could still be in trouble, but that seems very unlikely to happen, and we have to make some assumptions, or else we can't request any data at all using the requests abstraction.
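To make the mechanism concrete, here is a minimal sketch of the scheme. It reuses the field names from the diff below, but the class name, the constants, and the request logic are illustrative approximations of the described behavior, not the actual patch:

```java
// Hypothetical sketch of the buffering feedback described above; class and
// constant names are illustrative, not the actual InputStreamResponseTransformer.
import java.util.concurrent.atomic.AtomicInteger;
import org.reactivestreams.Subscription;

class ChunkRequestEstimator {
    private static final long BUFFER_TARGET = 32L * 1024 * 1024; // byte threshold for buffered data
    private static final int MAX_UNKNOWN_IN_FLIGHT = 1000;       // cap on requested-but-unseen chunks

    private final AtomicInteger approximateBufferSize = new AtomicInteger(0);
    private final AtomicInteger requests = new AtomicInteger(0);
    private volatile float approximateChunkSize = 8192f;         // running estimate, adjusted per chunk

    void maybeRequestMore(Subscription subscription) {
        int inFlight = requests.get();
        // Estimate how many bytes the already-requested chunks will add to the buffer.
        float estimatedInFlightBytes = inFlight * approximateChunkSize;
        if (approximateBufferSize.get() + estimatedInFlightBytes < BUFFER_TARGET
                && inFlight < MAX_UNKNOWN_IN_FLIGHT) {
            requests.incrementAndGet();
            subscription.request(1);
        }
    }
}
```

The key point of the sketch is that the decision to request another chunk accounts for chunks that have already been requested but not yet received, rather than looking at the buffered bytes alone.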

@grddev grddev self-assigned this Apr 28, 2020
@stenlarsson stenlarsson left a comment


Nice! 👍

@@ -28,12 +30,15 @@
private Throwable error;
private AtomicBoolean complete;
private AtomicInteger approximateBufferSize;
private AtomicInteger requests;
private volatile float approximateChunkSize;
Contributor

This is not initialised anywhere, which is dangerous. Setting it to the length of the first chunk received should be a good starting point.

Contributor Author

It would be nontrivial to initialize this from the first chunk (and also ensure that it is nonzero), so I think the better option is to initialize it to a fixed value. I picked 8192 completely arbitrarily; with the continued adjustment the exact value shouldn't matter much, as long as it isn't too small.
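For illustration only, a minimal sketch of that choice: it starts from the arbitrary 8192-byte estimate and lets an exponentially-weighted average converge towards the observed chunk sizes. The smoothing factor ALPHA is an assumed value, not taken from the actual patch:

```java
// Illustrative only: arbitrary 8192-byte starting estimate, adjusted per chunk.
class ChunkSizeEstimate {
    private static final float ALPHA = 0.5f; // assumed smoothing factor

    private volatile float approximateChunkSize = 8192f;

    void record(int chunkLength) {
        // Blend the newly observed size into the running estimate.
        approximateChunkSize = ALPHA * chunkLength + (1 - ALPHA) * approximateChunkSize;
    }

    float current() {
        return approximateChunkSize;
    }
}
```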

In short, the problem with the previous version was that the throttling didn't take into account how much we had previously requested, and thus it would over-request data by at least 10x.

The somewhat longer version is that when we receive (or finish) a chunk of the response, and we are below the threshold, we request 10 more chunks. With the threshold set to 32MB, and the typical chunk size around 1500 bytes, we will have processed well over 22k chunks before we reach the threshold. Each and every one of those times, we will request 10 additional chunks (plus an additional 10 if the chunk was non-empty and we finished it before reaching the threshold). This means that we sign up for 10x the number of chunks required to reach the threshold, and we will buffer up to that amount, as there is no means of telling the producer we want less. Assuming the chunk sizes stay consistent, this behavior is actually independent of the actual chunk sizes.
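As a back-of-the-envelope check of those figures (a standalone sketch, assuming "32MB" means 32 * 1024 * 1024 bytes):

```java
// Rough arithmetic for the old behaviour described above.
public class OverRequestEstimate {
    public static void main(String[] args) {
        long thresholdBytes = 32L * 1024 * 1024; // the 32MB buffer threshold
        long chunkBytes = 1500;                  // the typical chunk size mentioned above

        long chunksToThreshold = thresholdBytes / chunkBytes; // ~22,369 chunks to fill the buffer
        long chunksRequested = chunksToThreshold * 10;        // request(10) issued for each of them
        long bytesBuffered = chunksRequested * chunkBytes;    // what can actually end up buffered

        System.out.printf("chunks to reach threshold: %d%n", chunksToThreshold);
        System.out.printf("chunks requested (old behaviour): %d, ~%d MB buffered%n",
                chunksRequested, bytesBuffered / (1024 * 1024));
    }
}
```

Running this prints roughly 22k chunks to reach the threshold, but around 220k chunks requested, i.e. about 320 MB potentially buffered against a 32 MB target.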

By changing the number of chunks we request to 1, we only grow the number of in-flight chunks when finishing a chunk, and since we check the threshold at that point as well, and likewise only increase by 1, we can't really overshoot.

One potential drawback with this approach is that the ramp-up in buffer size will be dependent on the rate and amount of data processing.
It seems the simplistic approach becomes fairly slow when scaled down, and this once again amplifies the feedback, but with some limits in place.

@grddev grddev force-pushed the input-stream-response-transformer-approximate-chunk-sizes branch from 71cfc51 to 42322d4 on April 29, 2020 07:47
@grddev grddev merged commit 207bd15 into master Apr 29, 2020
@grddev grddev deleted the input-stream-response-transformer-approximate-chunk-sizes branch April 29, 2020 07:53