S3 input can take a long time to start and a long time to stop #80

Open
ph opened this Issue Apr 6, 2016 · 3 comments

@ph
Contributor

ph commented Apr 6, 2016

When a bucket contains a very large number of files, the plugin can take a while to start because of all the API calls the code has to make. #25 reduces the number of calls to a reasonable amount by using v2 of the API, but this is still problematic.

The plugin can also take a really long time to stop. The current architecture of the plugin is single-threaded, which means that listing the remote files, downloading, uncompressing, and the actual processing are all done in a single thread.

The stop doesn't correctly interrupt this chain.

We need to decouple these parts into different stages to better control the flow of execution of this plugin.
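To make the idea of separate stages concrete, here is a minimal sketch in Python. It is illustrative only and not the plugin's code: `run_pipeline`, `fetch`, `process`, and `queue_size` are hypothetical names, and the sketch assumes a shared stop flag that every stage polls. Listing and downloading run in one thread, processing in another, joined by a bounded queue, so a stop request can interrupt either stage between items.

```python
import queue
import threading


def run_pipeline(keys, fetch, process, stop, queue_size=4):
    """Run download and processing as two stages joined by a bounded queue.

    keys    -- iterable of object keys (hypothetical listing result)
    fetch   -- callable(key) -> body (hypothetical download step)
    process -- callable(key, body) (hypothetical processing step)
    stop    -- threading.Event; setting it asks both stages to exit
    """
    downloaded = queue.Queue(maxsize=queue_size)
    done = object()  # sentinel marking the end of the key listing

    def put(item):
        # Use a timeout so a full queue never blocks a stop request.
        while not stop.is_set():
            try:
                downloaded.put(item, timeout=0.1)
                return True
            except queue.Full:
                continue
        return False

    def downloader():
        for key in keys:
            if stop.is_set():
                return
            if not put((key, fetch(key))):
                return
        put(done)

    def processor():
        while not stop.is_set():
            try:
                item = downloaded.get(timeout=0.1)
            except queue.Empty:
                continue
            if item is done:
                return
            key, body = item
            process(key, body)

    threads = [
        threading.Thread(target=downloader),
        threading.Thread(target=processor),
    ]
    for t in threads:
        t.start()
    return threads
```

Calling `stop.set()` and then joining the returned threads ends the pipeline after the item currently in flight. A single long `fetch` or `process` call still has to finish first, so this only bounds the stop time to one item per stage.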

@ph ph added the bug label Apr 6, 2016

@ph


Contributor

ph commented Apr 6, 2016

I am in the process of merging the logic of #25 and decoupling the code a bit to have better control of the execution. The problem with the v2 API is that the way things are mocked has changed a lot since v1.

So I am taking the time to clean things up to see if I can improve performance and my confidence in the changes.

@ph


Contributor

ph commented Apr 6, 2016

Also, large files are killing the performance of this plugin.
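One common way to keep a large object from stalling everything is to stream it line by line instead of handling it as a whole, checking the stop flag between lines. Whether this is the actual bottleneck in the plugin is an assumption. The Python sketch below is illustrative only; `stream_lines` is a hypothetical helper, and it assumes gzip-compressed, UTF-8, newline-delimited content.

```python
import gzip
import io
import threading


def stream_lines(fileobj, stop, compressed=True):
    """Yield decoded lines one at a time, checking `stop` between lines.

    fileobj -- binary file-like object (for example a downloaded temp file)
    stop    -- threading.Event; once set, iteration ends at the next line
    """
    # Decompress lazily so the whole file is never held in memory at once.
    raw = gzip.GzipFile(fileobj=fileobj) if compressed else fileobj
    with io.TextIOWrapper(raw, encoding="utf-8") as text:
        for line in text:
            if stop.is_set():
                return
            yield line.rstrip("\n")
```

With this shape, a stop request takes effect after at most one more line is read, regardless of how large the file is.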

@ph ph self-assigned this Apr 6, 2016

@andrewvc andrewvc added the P2 label May 17, 2016

@kylegoch


kylegoch commented Jul 11, 2017

Any update on this? We keep all of our CloudTrail logs in an S3 bucket, and as the months have gone by the plugin has become more and more difficult to rely on, since it is now checking months of CloudTrail log files.
