Improve InputStreamResponseTransformer subscription feedback #19

Merged

Conversation

@grddev grddev commented Apr 28, 2020

Sadly, it turns out that the simplifications in #16 didn't really cut it from a performance perspective, so this instead tries to improve upon the existing byte-based buffering.

Ideally, none of this would be a problem: if the numbers you request from the aws-sdk were expressed in bytes rather than chunks, you could simply bump the request up to the buffer size whenever the combined buffered size dropped below that point.

The main problem in estimating the impact of the additional requests is that we don't know how large the corresponding chunks will be. This uses a simple exponentially-weighted average over chunk sizes received in the past and assumes that the in-flight chunks are equally large. If the chunk sizes stay consistent, or don't change too dramatically in size, this should work well.

In the event that the chunk sizes grow significantly we might still overshoot by a lot, so as an additional precaution we don't allow more than 1000 chunks of unknown size in flight. If the chunk sizes change rapidly and grow beyond 32k we could still be in trouble, but that seems very unlikely to happen, and we have to make some assumptions, or else we can't request any data at all using the requests abstraction.
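To make the mechanism concrete, here is a minimal sketch of the scheme. It reuses the field names from the diff below, but the class name, the constants, and the request logic are illustrative approximations of the described behavior, not the actual patch:

```java
// Hypothetical sketch of the buffering feedback described above; class and
// constant names are illustrative, not the actual InputStreamResponseTransformer.
import java.util.concurrent.atomic.AtomicInteger;
import org.reactivestreams.Subscription;

class ChunkRequestEstimator {
    private static final long BUFFER_TARGET = 32L * 1024 * 1024; // byte threshold for buffered data
    private static final int MAX_UNKNOWN_IN_FLIGHT = 1000;       // cap on requested-but-unseen chunks

    private final AtomicInteger approximateBufferSize = new AtomicInteger(0);
    private final AtomicInteger requests = new AtomicInteger(0);
    private volatile float approximateChunkSize = 8192f;         // running estimate, adjusted per chunk

    void maybeRequestMore(Subscription subscription) {
        int inFlight = requests.get();
        // Estimate how many bytes the already-requested chunks will add to the buffer.
        float estimatedInFlightBytes = inFlight * approximateChunkSize;
        if (approximateBufferSize.get() + estimatedInFlightBytes < BUFFER_TARGET
                && inFlight < MAX_UNKNOWN_IN_FLIGHT) {
            requests.incrementAndGet();
            subscription.request(1);
        }
    }
}
```

The key point of the sketch is that the decision to request another chunk accounts for chunks that have already been requested but not yet received, rather than looking at the buffered bytes alone.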

@grddev grddev self-assigned this Apr 28, 2020
@stenlarsson stenlarsson left a comment


Nice! 👍

@@ -28,12 +30,15 @@
private Throwable error;
private AtomicBoolean complete;
private AtomicInteger approximateBufferSize;
private AtomicInteger requests;
private volatile float approximateChunkSize;
Contributor

This is not initialised anywhere, which is dangerous. Setting it to the length of the first chunk received should be a good starting point.

Contributor Author

It would be nontrivial to initialize this from the first chunk (and also ensure that it is nonzero), so I think the better option is to initialize it to a fixed value. I picked 8192 completely arbitrarily; with the continued adjustment the exact value shouldn't matter much, as long as it isn't too small.
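For illustration only, a minimal sketch of that choice: it starts from the arbitrary 8192-byte estimate and lets an exponentially-weighted average converge towards the observed chunk sizes. The smoothing factor ALPHA is an assumed value, not taken from the actual patch:

```java
// Illustrative only: arbitrary 8192-byte starting estimate, adjusted per chunk.
class ChunkSizeEstimate {
    private static final float ALPHA = 0.5f; // assumed smoothing factor

    private volatile float approximateChunkSize = 8192f;

    void record(int chunkLength) {
        // Blend the newly observed size into the running estimate.
        approximateChunkSize = ALPHA * chunkLength + (1 - ALPHA) * approximateChunkSize;
    }

    float current() {
        return approximateChunkSize;
    }
}
```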

In short, the problem with the previous version was that the throttling didn't take into account how much we had previously requested, and thus it would over-request data by at least 10x.

The somewhat longer version is that when we receive (or finish) a chunk of the response, and we are below the threshold, we request 10 more chunks. With the threshold set to 32MB, and the typical chunk size around 1500 bytes, we will have processed well over 22k chunks before we reach the threshold. Each and every one of those times, we will request 10 additional chunks (plus an additional 10 if the chunk was non-empty and we finished it before reaching the threshold). This means that we sign up for 10x the number of chunks required to reach the threshold, and we will buffer up to that amount, as there is no means of telling the producer we want less. Assuming the chunk sizes stay consistent, this behavior is actually independent of the actual chunk sizes.
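As a back-of-the-envelope check of those figures (a standalone sketch, assuming "32MB" means 32 * 1024 * 1024 bytes):

```java
// Rough arithmetic for the old behaviour described above.
public class OverRequestEstimate {
    public static void main(String[] args) {
        long thresholdBytes = 32L * 1024 * 1024; // the 32MB buffer threshold
        long chunkBytes = 1500;                  // the typical chunk size mentioned above

        long chunksToThreshold = thresholdBytes / chunkBytes; // ~22,369 chunks to fill the buffer
        long chunksRequested = chunksToThreshold * 10;        // request(10) issued for each of them
        long bytesBuffered = chunksRequested * chunkBytes;    // what can actually end up buffered

        System.out.printf("chunks to reach threshold: %d%n", chunksToThreshold);
        System.out.printf("chunks requested (old behaviour): %d, ~%d MB buffered%n",
                chunksRequested, bytesBuffered / (1024 * 1024));
    }
}
```

Running this prints roughly 22k chunks to reach the threshold, but around 220k chunks requested, i.e. about 320 MB potentially buffered against a 32 MB target.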

By changing the number of chunks we request to 1, we only grow the number of in-flight chunks when finishing a chunk, and since we check the threshold at that point as well, and likewise only increase by 1, we can't really overshoot.

One potential drawback with this approach is that the ramp-up in buffer size will be dependent on the rate and amount of data processing.
It seems the simplistic approach becomes fairly slow when scaled down, and this once again amplifies the feedback, but with some limits in place.

@grddev grddev force-pushed the input-stream-response-transformer-approximate-chunk-sizes branch from 71cfc51 to 42322d4 on April 29, 2020 07:47
@grddev grddev merged commit 207bd15 into master Apr 29, 2020
@grddev grddev deleted the input-stream-response-transformer-approximate-chunk-sizes branch April 29, 2020 07:53