RuleLoader cleanup and optimizations #2609

Open
4 tasks
Tracked by #3556
brokensound77 opened this issue Mar 1, 2023 · 5 comments
Labels
backlog python Internal python for the repository

Comments

@brokensound77
Collaborator

brokensound77 commented Mar 1, 2023

Overview

The purpose of this issue is to identify opportunities to clean up the code that makes up the rule loader (Rule, rule_validators, etc.). Loading the rules has gotten significantly slower, and while some of that is due to the necessity of expanding validation, this issue should explore opportunities for optimization.

rule loader profiling

image

Observations

@terrancedejesus
Collaborator

An example of something related is the "validate against ECS/Beats/Non-ECS.json AND THEN validate against integrations schema" logic. Related: #2627

A stop-gap may be to add a small patch to this validation logic. In the meantime any integration rule that uses EQL can have the integration specific fields added to the non-ecs file.

@botelastic

botelastic bot commented May 8, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@botelastic botelastic bot added the stale 60 days of inactivity label May 8, 2023
@botelastic

botelastic bot commented May 15, 2023

This has been closed due to inactivity. If you feel this is an error, please re-open and include a justifying comment.

@botelastic botelastic bot closed this as completed May 15, 2023
@w0rk3r w0rk3r removed the stale 60 days of inactivity label May 15, 2023
@w0rk3r w0rk3r reopened this May 15, 2023
@w0rk3r w0rk3r added the backlog label May 15, 2023
@terrancedejesus terrancedejesus added the python Internal python for the repository label Jul 10, 2023
@Mikaayenson
Collaborator

Mikaayenson commented Jan 2, 2024

Screenshot 2024-01-02 at 1 59 34 PM

  • See the screenshot for some profiling I did with the latest code:
  • I personally like the idea of limiting what we're validating. In addition to perhaps validating integrations IFF it fails the ecs check, we may want to consider separating query validation from the rule loader.
    • Rule loader could still have some light validation (e.g. only validating the latest version supported with the latest schemas). This would help us every time we wanted to just load the rules.
    • Perhaps then, we could set an environment variable flag that specifies VALIDATE_ALL_VERSIONS that can be disabled via GitHub label for CI purposes. It's still unclear when we would be comfortable using this, but it gives us another option.
  • Lastly, it doesn't appear that multithreading will buy us much (and it may even be an expensive refactor, since the current implementation is not thread safe).
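
The "validate integrations IFF it fails the ecs check" idea from the list above could be sketched roughly as follows. This is only a minimal illustration, not the repository's code: `unknown_fields`, `validate_fields`, and the sample field sets are hypothetical names, and the real schemas are much richer than flat sets of field names.

```python
from typing import Iterable, Set, Tuple


def unknown_fields(query_fields: Iterable[str], schema: Set[str]) -> Set[str]:
    """Return the query fields that the given schema does not define."""
    return {field for field in query_fields if field not in schema}


def validate_fields(
    query_fields: Iterable[str],
    ecs_fields: Set[str],
    integration_fields: Set[str],
) -> Tuple[Set[str], bool]:
    """Check the ECS schema first; consult integrations only on failure.

    Returns (fields still unknown, whether the integration schema was used).
    """
    missing = unknown_fields(query_fields, ecs_fields)
    if not missing:
        # Fast path: ECS alone explains every field, so skip the second pass.
        return set(), False
    # Slow path: only the leftover fields are checked against integrations.
    return unknown_fields(missing, integration_fields), True


# Hypothetical sample data, for illustration only.
ecs = {"process.name", "host.os.type"}
integration = {"aws.cloudtrail.event_type"}

print(validate_fields(["process.name"], ecs, integration))
print(validate_fields(["process.name", "aws.cloudtrail.event_type"], ecs, integration))
```

With this ordering, rules that only use ECS fields never pay for the integration pass, which is where the saving would come from if most rules fall into that group.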

@Mikaayenson
Collaborator

Mikaayenson commented Jan 2, 2024

I added a couple of skips in key places throughout the code where we loop through ALL stack versions or ALL integration versions, etc., essentially limiting validation to the latest versions.

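# Stop after the first (latest) version when DR_FAST is set to "true".
# .get() avoids a KeyError when unset; the string "false" must not count as on.
if os.environ.get("DR_FAST", "false").lower() == "true":
    break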

Toggling the environment variable, I consistently saw faster speeds, which is expected since we're NOT traversing EVERY version.

(venv) ubuntu@trade-linux-testing:~/detection-rules-main$ time python test.py 

real    0m34.850s
user    0m34.218s
sys     0m0.608s
(venv) ubuntu@trade-linux-testing:~/detection-rules-main$ export DR_FAST=false
(venv) ubuntu@trade-linux-testing:~/detection-rules-main$ time python test.py 

real    3m32.382s
user    3m31.080s
sys     0m0.912s
(venv) ubuntu@trade-linux-testing:~/detection-rules-main$ export DR_FAST=false
(venv) ubuntu@trade-linux-testing:~/detection-rules-main$ export DR_FAST=true
(venv) ubuntu@trade-linux-testing:~/detection-rules-main$ time python test.py 

real    0m35.043s
user    0m34.382s
sys     0m0.589s
(venv) ubuntu@trade-linux-testing:~/detection-rules-main$ export DR_FAST=false
(venv) ubuntu@trade-linux-testing:~/detection-rules-main$ time python test.py 

real    3m33.100s
user    3m32.106s
sys     0m0.928s
(venv) ubuntu@trade-linux-testing:~/detection-rules-main$ 
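
For context, the skip described above could sit inside a version loop along these lines. This is a sketch under assumptions rather than the actual change: `fast_mode_enabled`, `versions_to_validate`, and the sample version list are hypothetical, and it assumes the versions are ordered newest first.

```python
import os


def fast_mode_enabled(environ=os.environ) -> bool:
    """Treat DR_FAST as on only for explicit truthy strings.

    Indexing os.environ directly raises KeyError when the variable is unset,
    and any non-empty string (including "false") is truthy in Python.
    """
    return environ.get("DR_FAST", "false").strip().lower() in {"1", "true", "yes"}


def versions_to_validate(all_versions, environ=os.environ):
    """Return only the newest version in fast mode, otherwise every version.

    Assumes `all_versions` is ordered newest first (an assumption of this sketch).
    """
    versions = list(all_versions)
    return versions[:1] if fast_mode_enabled(environ) else versions


# Hypothetical version list, newest first.
stack_versions = ["8.12", "8.11", "8.10"]
for version in versions_to_validate(stack_versions):
    print(f"validating against stack {version}")
```

Selecting the versions up front, instead of breaking out of each loop, keeps the flag check in one place, which may make it easier to disable in CI.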
