aws-s3 multithread, limits and relative start #1669

Open
spartantri opened this Issue Oct 16, 2018 · 1 comment

spartantri commented Oct 16, 2018

Feature request

Compatibility
manager

Component involved
aws wodle

Description

The current version of the aws-s3.py script has some limitations that make it fine for very small deployments but unusable for larger ones.

  • iter_files_in_bucket verifies and parses files sequentially, which can take ages. Consider processing the objects in parallel up to a limit (e.g. 1000, or a value read from an ossec.conf key).
  • The retain_db_records value is very low and hardcoded; it would be better to read it from an ossec.conf key. What happens to buckets with more than 1000 objects in them?
  • Start after is a nice feature but is also hardcoded. Could it be relative instead, like -1d?
  • filter_args is also hardcoded to 1000. What happens to buckets with more than 1000 objects in them?
  • iter_events could benefit from a whitelist/blacklist filter to avoid importing noisy or useless logs.
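
To illustrate the first point, here is a minimal sketch of bounded concurrency, assuming the per-object work is dominated by S3 network I/O. The names process_object, process_bucket and MAX_WORKERS are hypothetical and not existing aws-s3.py code; time.sleep stands in for the download.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 8  # hypothetical limit; could be read from an ossec.conf option


def process_object(key):
    """Stand-in for downloading and parsing one S3 object (hypothetical)."""
    time.sleep(0.05)  # simulates blocking network I/O
    return key, len(key)


def process_bucket(keys, max_workers=MAX_WORKERS):
    """Process keys concurrently, with at most max_workers running at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as the input keys
        return list(pool.map(process_object, keys))


keys = ["logs/2018/10/16/file_%03d.json.gz" % i for i in range(40)]
results = process_bucket(keys)
```

The pool size caps how many objects are in flight, so the limit can stay small while still overlapping the waits on the network.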

The last rewrite improved things a lot since I last checked in 3.4 :)

Member

mgmacias95 commented Oct 17, 2018

Hello @spartantri,

Thank you so much for this feedback 😄!!

  • You're right about the iter_files_in_bucket function, but it's difficult to write truly parallel code in Python. There's a threading module available, but because of Python's GIL, threads can't execute Python bytecode in parallel. We need to think about how to increase performance in that function.
  • There's already an issue about retain_db_records: #1367. If you have more than 1K log files daily in your bucket, you will get repeated events in the alerts. A fix is already in #1602, which we're testing.
  • The -1d idea is great! We should definitely add that 😄!
  • What do you mean by whitelist/blacklist? We do that using rules and CDB lists.
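
As a rough sketch of how a relative value such as -1d could be resolved into a concrete start time: the function name resolve_relative_start and the supported units (h, d, w) are assumptions for illustration, not existing aws-s3.py behaviour.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical unit suffixes mapped to timedelta keyword arguments
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}
_OFFSET = re.compile(r"^-(\d+)([hdw])$")


def resolve_relative_start(value, now=None):
    """Turn an offset such as '-1d' into a datetime relative to now."""
    match = _OFFSET.match(value.strip())
    if match is None:
        raise ValueError("expected an offset such as -1d, got %r" % value)
    amount, unit = int(match.group(1)), match.group(2)
    if now is None:
        now = datetime.now(timezone.utc)
    return now - timedelta(**{_UNITS[unit]: amount})


reference = datetime(2018, 10, 17, 12, 0)
one_day_back = resolve_relative_start("-1d", now=reference)
```

Passing now explicitly keeps the function easy to test; in real use it would default to the current UTC time.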

Also, we'll keep improving this module. Check #1522 to see our current roadmap.

Best regards,
Marta

@mgmacias95 mgmacias95 referenced this issue Oct 17, 2018

Open

AWS wodle improvements #1522

2 of 26 tasks complete