AnalysisD: 4.0.0 vs 4.0 with PCRE2 support [Profiling] #6522

Closed
juliancnn opened this issue Nov 6, 2020 · 0 comments

juliancnn commented Nov 6, 2020

Profiling - AnalysisD: 4.0.0 vs 4.0 with PCRE2 support

Brief

This report analyzes heap memory usage and the impact on the loading of rules and decoders after PCRE2 support was implemented.

Test environment

  • PCRE2 testing was done on the development branch (referred to below as PR Expression).
  • Testing without these changes was done on the v4.0.0 tag (referred to below as Master).
  • The ruleset used is the one tagged 4.0.0 in the Wazuh-ruleset repository.

HEAP analysis:

Data obtained with massif:

valgrind --tool=massif --massif-out-file=/massif/massif.out.%p --peak-inaccuracy=1.0 --stacks=no --heap=yes --max-snapshots=1000 --time-unit=ms /var/ossec/bin/ossec-analysisd -f -u root

In the massif memory snapshot taken after loading rules and decoders (and before the queues are initialized), heap usage is as follows:

PR Expression:
13.24 MBytes (13887085 B) - snapshot=668
Master:
13.19 MBytes (13835574 B) - snapshot=668
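
As a rough illustration of how a per-snapshot heap figure can be read out of a massif output file, here is a minimal sketch. It assumes the plain-text `massif.out` layout of `snapshot=N` and `mem_heap_B=N` lines; the `SAMPLE` text and the helper name `heap_bytes_by_snapshot` are made up for the example and are not taken from the attached files.

```python
# Hand-written excerpt in the massif.out layout; all values are placeholders.
SAMPLE = """\
desc: --stacks=no
cmd: ./prog
time_unit: ms
#-----------
snapshot=0
#-----------
time=0
mem_heap_B=0
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=1
#-----------
time=12
mem_heap_B=4096
mem_heap_extra_B=16
mem_stacks_B=0
heap_tree=empty
"""


def heap_bytes_by_snapshot(text):
    """Map each snapshot number to its mem_heap_B value (hypothetical helper)."""
    result = {}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # separator lines and other lines without '='
        if key == "snapshot":
            current = int(value)
        elif key == "mem_heap_B" and current is not None:
            result[current] = int(value)
    return result


print(heap_bytes_by_snapshot(SAMPLE))
```

With a real file, the same function could be applied to the file contents and indexed by the snapshot number of interest (668 in this report).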

Supporting PCRE2 costs approximately 0.37% more heap memory (50.3 KBytes) at this point. After the queues are started, the overhead drops to 0.33%. The extra memory comes from the new structure that stores all regex types, so it grows with the number of regexes, but the overall footprint is small.
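
The overhead figures can be checked from the two byte counts reported for snapshot 668. A small sketch of the arithmetic (variable names are only for this example; KBytes and MBytes are taken as 1024 and 1024² bytes, which matches the reported values):

```python
# Heap sizes at snapshot 668, as reported above.
master_bytes = 13835574      # Master (without PCRE2)
expression_bytes = 13887085  # PR Expression (with PCRE2)

delta_bytes = expression_bytes - master_bytes
overhead_pct = 100.0 * delta_bytes / master_bytes

print(f"Master:        {master_bytes / 1024**2:.2f} MBytes")
print(f"PR Expression: {expression_bytes / 1024**2:.2f} MBytes")
print(f"Difference:    {delta_bytes} B = {delta_bytes / 1024:.1f} KBytes")
print(f"Overhead:      {overhead_pct:.2f}%")
```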

Without PCRE2:

[image: MAssif_MASTER]

With PCRE2:

[image: massif_expression]

Performance at startup

Data obtained with callgrind:

valgrind --tool=callgrind --separate-threads=yes --callgrind-out-file=/callgrind/callgrind.out.%p ./ossec-analysisd -f -u root

Callgrind estimates the impact from the number of instructions executed. This means that file load times, memory searches, cache misses, etc. are not taken into account. The results may still vary depending on the architecture, compiler, and optimization flags.

Column description:
- Incl: The total instructions performed by this function and all the functions it calls, shown as a percentage (instructions x 100 / total instructions).
- Self: The instructions performed exclusively by this function, not counting any instructions used by the functions it calls.
- Called: The number of times the function was called.
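
To make the relationship between Incl and Self concrete, here is a toy sketch. The function names, call tree, and instruction counts are invented for illustration and do not come from the attached callgrind results. It also assumes each callee is called once from a single caller, with no recursion, which is simpler than what callgrind handles.

```python
# Toy data: Self instruction counts per function (made up for illustration).
self_cost = {
    "main": 10,
    "load_rules": 30,
    "parse_xml": 50,
    "compile_regex": 10,
}

# Toy call tree: which functions each function calls (also made up).
calls = {
    "main": ["load_rules"],
    "load_rules": ["parse_xml", "compile_regex"],
    "parse_xml": [],
    "compile_regex": [],
}


def inclusive(fn):
    """Inclusive cost: the function's own instructions plus those of its callees."""
    return self_cost[fn] + sum(inclusive(callee) for callee in calls[fn])


total = inclusive("main")


def incl_percent(fn):
    """Incl column: inclusive instructions x 100 / total instructions."""
    return 100.0 * inclusive(fn) / total


for name in self_cost:
    print(f"{name:15s} Incl={incl_percent(name):6.2f}%  Self={self_cost[name]}")
```

In this toy tree, `load_rules` has a small Self cost but a large Incl value, because most of its instructions are spent in the functions it calls.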

[image: callgrind]

The impact of the w_expression_compile and w_calloc_expression functions is so small that it does not reach 0.00%. This was to be expected, because they are simply a switch.

Most of the CPU time is spent on parsing the XML files and the associated memory allocation, which account for 87% of the startup cost.

[image: ParseXML]

Attached test results.

Regards,
Julian

callgrind_205-regex-pcre2.zip
callgrind_master.zip
massif.out.zip

@juliancnn juliancnn self-assigned this Nov 6, 2020
@JcabreraC JcabreraC added this to the Sprint 120 - Core milestone Nov 6, 2020
@jnasselle jnasselle added the module/analysis Issues related to the Analysis daemon label Nov 6, 2020