
Eliminate possibility of OutOfMemory errors from scaling Kinesis Shards #250

Open
jbeemster opened this issue Feb 8, 2022 · 1 comment


@jbeemster (Member)

Currently the S3 Loader buffers operate on a per-shard basis. This means that if your rotation buffer is 64 MB, each shard allocated to the consumer can consume that much memory. If the number of shards suddenly scales up, the consumer risks needing not 64 MB for buffering but N × MaxByteBuffer, on top of the general overhead of the JVM and record processing.

This behavior makes it impossible to auto-scale consistently, because you never know how much memory an individual consumer might end up needing.
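
For illustration, a rough worst-case calculation of the buffer memory alone (the 64 MB figure matches the example above; the shard counts are hypothetical):

```scala
// Worst-case heap needed for rotation buffers alone (illustrative figures).
val maxByteBuffer: Long = 64L * 1024 * 1024 // 64 MB rotation buffer per shard

// With per-shard buffers, worst-case buffer memory grows linearly with shards:
def worstCaseBufferBytes(shards: Int): Long = shards.toLong * maxByteBuffer

// e.g. a resharding event from 4 to 32 shards on a single consumer:
// worstCaseBufferBytes(4)  =  256 MB
// worstCaseBufferBytes(32) = 2048 MB -- before any JVM or processing overhead
```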

@istreeter (Contributor)

A good way to fix this is to rewrite the loader as an fs2 app, using fs2-aws as the Kinesis source. With that library, the shards share a single buffer, so memory usage should stay approximately constant even as the number of input shards increases.

This would be a big rewrite of the app, but it's a change we want to make anyway, and it is consistent with how we now write all other Snowplow apps.
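
A minimal fs2 sketch of the shared-buffer idea. The record source below is a stand-in, not the fs2-aws API; in the real rewrite, the merged stream would come from fs2-aws's Kinesis consumer. The key point is that all shards feed one stream, and `groupWithin` bounds the buffered batch once for everyone:

```scala
import cats.effect.{IO, IOApp}
import fs2.Stream
import scala.concurrent.duration._

object SharedBufferSketch extends IOApp.Simple {
  // Stand-in for one shard's record stream; with fs2-aws this would come
  // from its Kinesis consumer instead (hypothetical placeholder).
  def shardRecords(shardId: Int): Stream[IO, String] =
    Stream.awakeEvery[IO](100.millis).map(_ => s"record-from-shard-$shardId")

  def run: IO[Unit] = {
    val shardCount = 8 // however many shards the stream currently has

    // Merge all shards into ONE stream, then buffer once for all of them.
    // groupWithin caps each batch at 500 records or 10 seconds, whichever
    // comes first, so buffered memory no longer grows with shard count.
    Stream
      .range(0, shardCount)
      .covary[IO]
      .map(shardRecords)
      .parJoinUnbounded
      .groupWithin(500, 10.seconds)
      .evalMap(batch => IO.println(s"flushing batch of ${batch.size} records"))
      .compile
      .drain
  }
}
```

The batch cap here is by record count for simplicity; a byte-size cap like the loader's MaxByteBuffer would need a custom accumulator, but the memory bound is shared across shards either way.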
