Today, we match bytes by doing a prefix search against encountered bytes (up to 0x100 long). Since many of the byte sequences we search for share a common length, like a GUID or a cryptographic S-Box, we can optimize some of these searches by indexing the encountered bytes by their prefix (for common lengths, like 8, 16, 32, and 64 bytes). Then, when the wanted bytes feature has one of these lengths, we can do `if feature in features` rather than `for bytes in features: if bytes.startswith(feature)`.
This can also help the rule logic planner, since it can pre-filter more rules when the hashable features are known.
The tradeoff is that we generate N (probably 4-5) more features per bytes feature.
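A minimal sketch of the idea, not capa's actual implementation: the names `INDEXED_LENGTHS`, `index_prefixes`, `build_index`, and `match` are hypothetical, and the set of indexed lengths is just the example list from above. Each encountered buffer contributes one extra hashable prefix per indexed length, which is the tradeoff mentioned; in exchange, a wanted feature of an indexed length becomes a single set lookup.

```python
# Hypothetical sketch; names and the chosen lengths are illustrative assumptions.
INDEXED_LENGTHS = (8, 16, 32, 64)


def index_prefixes(extracted: bytes) -> set:
    """Return one hashable prefix per indexed length that the buffer can supply."""
    return {extracted[:n] for n in INDEXED_LENGTHS if len(extracted) >= n}


def build_index(buffers: list) -> set:
    """Collect the indexed prefixes of every encountered buffer."""
    index = set()
    for buf in buffers:
        index |= index_prefixes(buf)
    return index


def match(wanted: bytes, buffers: list, index: set) -> bool:
    """Use a set lookup for indexed lengths, else fall back to the prefix scan."""
    if len(wanted) in INDEXED_LENGTHS:
        # A buffer starts with `wanted` exactly when its prefix of that
        # length equals `wanted`, so the lookup is equivalent to the scan.
        return wanted in index
    return any(buf.startswith(wanted) for buf in buffers)


guid = bytes(range(16))
buffers = [guid + b"\xaa" * 20]  # one 36-byte buffer that begins with the GUID
index = build_index(buffers)
found_by_lookup = match(guid, buffers, index)    # 16 bytes: set lookup
found_by_scan = match(guid[:5], buffers, index)  # 5 bytes: linear scan
```

In this sketch the 36-byte buffer yields three extra entries (its 8, 16, and 32 byte prefixes), which illustrates the extra features generated per bytes feature.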
Definitely do 16 (the size of a GUID).
8, 256, and 64 also look nice and round (and probably not-domain-specific), so consider those. 9 comes from OpenSSL SHA constants. 171 comes from Tiger S-Boxes.
Against mimikatz with the changes in #2080, we have the following evaluation counts by Bytes feature size:
| feature class | evaluation count |
|---|---|
| evaluate.feature.bytes | 261,464 |
| evaluate.feature.bytes.171 | 71,400 |
| evaluate.feature.bytes.64 | 35,794 |
| evaluate.feature.bytes.256 | 34,002 |
| evaluate.feature.bytes.16 | 24,226 |
| evaluate.feature.bytes.9 | 18,837 |
| evaluate.feature.bytes.128 | 17,002 |
| evaluate.feature.bytes.8 | 10,576 |
| evaluate.feature.bytes.56 | 10,200 |
| evaluate.feature.bytes.28 | 7,176 |
| evaluate.feature.bytes.48 | 6,800 |
| evaluate.feature.bytes.32 | 6,091 |
| evaluate.feature.bytes.7 | 3,588 |
| evaluate.feature.bytes.5 | 3,588 |
| evaluate.feature.bytes.20 | 3,400 |
| evaluate.feature.bytes.72 | 3,400 |
| evaluate.feature.bytes.121 | 1,794 |
| evaluate.feature.bytes.40 | 897 |
| evaluate.feature.bytes.6 | 897 |
| evaluate.feature.bytes.4 | 897 |
| evaluate.feature.bytes.12 | 897 |
| evaluate.feature.bytes.232 | 2 |
Indexing the power-of-2 lengths would save about 49% of the scanning evaluations. I'm not sure what this costs in runtime. Will investigate before going deeper.
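For reference, the 49% figure can be reproduced from the counts listed above. This is only an arithmetic check; the dictionary layout and variable names are my own, and it assumes every power-of-2 length would be indexed, including 4 and 128.

```python
# Evaluation counts by bytes feature length, copied from the list above.
counts = {
    171: 71_400, 64: 35_794, 256: 34_002, 16: 24_226, 9: 18_837,
    128: 17_002, 8: 10_576, 56: 10_200, 28: 7_176, 48: 6_800,
    32: 6_091, 7: 3_588, 5: 3_588, 20: 3_400, 72: 3_400,
    121: 1_794, 40: 897, 6: 897, 4: 897, 12: 897, 232: 2,
}


def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ...: exactly one bit set."""
    return n > 0 and n & (n - 1) == 0


total = sum(counts.values())
indexed = sum(c for length, c in counts.items() if is_power_of_two(length))
share = indexed / total

print(f"{indexed:,} of {total:,} evaluations ({share:.1%})")
```

The per-length counts sum to the `evaluate.feature.bytes` total of 261,464, and the power-of-2 lengths (4, 8, 16, 32, 64, 128, 256) account for 128,588 of them.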