
Convert string and char_buf matches to hash look ups #1408

Merged
kevsecurity merged 7 commits into main from pr/kevsecurity/string-match-hash on Sep 12, 2023

Conversation

kevsecurity
Contributor

Currently string, char buf, file, fd, and path matchers (equal, prefix, postfix, plus the 'not' equivalents for file, fd and path) are all written in BPF using loops to inspect (potentially) every character in the strings to be matched.

This series (see commits for details) converts all these matchers to hash map look ups, using a set of 6 hash maps of varying sizes for equal matches, and LPM TRIE maps for prefix and postfix look ups. In addition, the 'not' equivalents are implemented for string and char buf.

The default string length for the matches has been set at 128 bytes, but could be increased to an arbitrary size for equals, and up to 256 bytes for prefix and postfix. For small programs (<5.3), the postfix limit (which is bounded by a byte-by-byte reverse copy) has been raised from 40 chars to 96 chars. For large programs, the postfix limit is set at 128 chars (raised from 50 chars).

@kevsecurity kevsecurity requested a review from a team as a code owner August 30, 2023 16:42
@kevsecurity kevsecurity force-pushed the pr/kevsecurity/string-match-hash branch from cf91e38 to e5890d1 Compare August 30, 2023 16:52
@netlify

netlify bot commented Aug 30, 2023

Deploy Preview for tetragon ready!

Name Link
🔨 Latest commit 4e4f713
🔍 Latest deploy log https://app.netlify.com/sites/tetragon/deploys/64f0aae21577c800087df4d2
😎 Deploy Preview https://deploy-preview-1408--tetragon.netlify.app
@kevsecurity kevsecurity added the release-note/minor This PR introduces a minor user-visible change label Aug 30, 2023
@kevsecurity kevsecurity force-pushed the pr/kevsecurity/string-match-hash branch 2 times, most recently from 5230a73 to 4e4f713 Compare August 31, 2023 14:59
Contributor

@kkourt kkourt left a comment

Hi,

Had a look at the first patch. The idea of using hash maps makes perfect sense to me, so thanks for taking this on.

One thing I'm not so sure about is whether it is worth the complexity of having multiple maps instead of one. As far as I understand the motivation here is to save space. Could you provide some examples of how much space we would save using this approach? I think it depends on the use-case but my feeling is that if we are talking about 10-20 string values, then the difference is not that big.

In any case, I would suggest that we keep the Go structures simpler than what we do in BPF, and then deal with the exact structure of the BPF maps when we load the data.

bpf/process/string_maps.h
pkg/sensors/tracing/selectors.go (outdated)
pkg/selectors/selectors.go (outdated)
bpf/process/types/basic.h
bpf/process/string_maps.h
pkg/sensors/tracing/selectors.go (outdated)
examples/tracingpolicy/hardlink-observe.yaml
pkg/selectors/selectors.go (outdated)
pkg/selectors/kernel_test.go
@kevsecurity
Contributor Author

One thing I'm not so sure about is whether it is worth the complexity of having multiple maps instead of one. As far as I understand the motivation here is to save space.

The motivation is to save CPU cycles in the look up code. If we only have one hash map with a key size of (say) 128 bytes, then every string we try to look up will require hashing 128 bytes, regardless of whether many of them are 0x00. Using several maps with different key sizes means we only hash a number of bytes close to the length of the string we wish to look up.

@kevsecurity
Contributor Author

In any case, I would suggest that we keep the Go structures simpler than what we do in BPF, and then deal with the exact structure of the BPF maps when we load the data.

The problem with keeping the Go structures simpler is that we need to know the array indices of the hash maps at the point that we build the kernel selectors. The easiest way to know these is to make lists of hash maps that mirror those in BPF. That said, some level of harmonisation is possible, which I shall do.

@tixxdz
Member

tixxdz commented Sep 5, 2023

One thing I'm not so sure about is whether it is worth the complexity of having multiple maps instead of one. As far as I understand the motivation here is to save space.

The motivation is to save CPU cycles in the look up code. If we only have one hash map with a key size of (say) 128 bytes, then every string we try to look up will require hashing 128 bytes, regardless of whether many of them are 0x00. Using several maps with different key sizes means we only hash a number of bytes close to the length of the string we wish to look up.

During an offline sync we discussed briefly why there are different maps, and to me it is fine. Maybe it adds complexity from a readability perspective, but from a performance point of view, even if we don't have numbers: 1. these selectors were not working properly and are now being fixed, and 2. file name and string matching is the most basic and common use case. Users may watch many different paths, so having better benchmark results on top is always good.

@kevsecurity kevsecurity force-pushed the pr/kevsecurity/string-match-hash branch 3 times, most recently from c57141f to 503053e Compare September 5, 2023 15:47
@kkourt
Contributor

kkourt commented Sep 6, 2023

One thing I'm not so sure about is whether it is worth the complexity of having multiple maps instead of one. As far as I understand the motivation here is to save space, and it would be nice to justify the code complexity of having multiple maps with some numbers.

The motivation is to save CPU cycles in the look up code. If we only have one hash map with a key size of (say) 128 bytes, then every string we try to look up will require hashing 128 bytes, regardless of whether many of them are 0x00. Using several maps with different key sizes means we only hash a number of bytes close to the length of the string we wish to look up.

During an offline sync we discussed briefly why there are different maps, and to me it is fine. Maybe it adds complexity from a readability perspective, but from a performance point of view, even if we don't have numbers: 1. these selectors were not working properly and are now being fixed, and 2. file name and string matching is the most basic and common use case. Users may watch many different paths, so having better benchmark results on top is always good.

(1) and (2) make sense to me, but they could also be implemented with a single map for all sizes. Having different maps for different sizes is a performance optimization (for CPU cycles as you mention above), and, given the code complexity it introduces, I think it's reasonable to ask: is it worth it? How much time do we save compared to when we use a single map for all sizes?

@kevsecurity
Contributor Author

(1) and (2) make sense to me, but they could also be implemented with a single map for all sizes. Having different maps for different sizes is a performance optimization (for CPU cycles as you mention above), and, given the code complexity it introduces, I think it's reasonable to ask: is it worth it? How much time do we save compared to when we use a single map for all sizes?

jhash costs a little over one CPU cycle per byte of data hashed, but we can approximate it as one cycle per byte; i.e. hashing 128 bytes costs ~128 CPU cycles (see https://fosdem.org/2023/schedule/event/bpf_hashing/ for details/confirmation). The existing byte-by-byte comparison similarly has a CPU cost that grows with the length of the data to compare.

While fixing (1) and (2) above, we also don't want to incur extra CPU cost if we can avoid it. In this situation, we want to give users the ability to specify fairly long strings/filenames to match on – say ~128 bytes. If we take the simplistic approach of a single map, then all look ups will require hashing 128 bytes, with a cost of ~128 CPU cycles. By dividing the strings into different maps, we reduce this cost to approximately the size of the key. Therefore, assuming many look up strings are less than 24 bytes in length, many look ups would cost 25 CPU cycles compared with 128 CPU cycles. We could assume that the next biggest category would be slightly longer strings, with these costing 49 CPU cycles compared to 128 CPU cycles.

Compared to these gains, the code isn't really that complex. Rather than putting all the strings in the same map, they are simply divided into six maps based on their length; and then at look up, the appropriate map is chosen based on the length of the look up string. An advantage of this approach is that we could increase the maximum string length to, say, 1024 bytes with no additional CPU cost – there may be a small memory cost, however.
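
For illustration, here is a minimal sketch of that length-based bucket selection (the key sizes come from the commit message further down; the function and constant names are hypothetical, not the actual identifiers in string_maps.h):

#include <stdint.h>

/* Assumed padded key sizes: one length byte plus a payload, each one
 * byte longer than a multiple of 12 to suit jhash's 12-byte blocks. */
static const uint32_t key_sizes[6] = { 25, 49, 73, 97, 121, 145 };

/* Pick the smallest map whose key holds the length byte plus the
 * string; the hashing cost is then proportional to that key size
 * rather than to the maximum (145 bytes). */
static int string_map_index(uint32_t str_len)
{
	for (int i = 0; i < 6; i++)
		if (str_len + 1 <= key_sizes[i])
			return i;
	return -1; /* too long to match */
}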

Given that Tetragon hooks kernel functions and performs comparisons during many of those hooks, it always costs the system some CPU time. Looking at Sysmon (Windows) as an example of a tool with millions of installs, many of its use cases feature string and filename matching. IMO it's not a case of "Is it worth it?" but more a case of "How could we justify not doing it?"

@kevsecurity
Contributor Author

kevsecurity commented Sep 6, 2023

I've taken some measurements using 50,000,000 calls to a kprobe:

  • The difference between no hash and the hash based on 6 maps is 60.2ns/kprobe.
  • The difference between no hash and the hash based on a single map is 127.6ns/kprobe.
  • The difference between the hash based on 6 maps and the hash based on a single map is 67.4ns/kprobe.

Objectively, the hash for a single map is more than twice the cost of the hash for 6 maps. This is admittedly very small compared to the total cost of the kprobe (2.92us), and represents an increase of 2.3%.

@olsajiri
Contributor

olsajiri commented Sep 6, 2023

I'm getting the following error for the spec below, which I assume should work now

time="2023-09-06T13:17:54Z" level=fatal msg="Failed to start tetragon" error="sensor gkp-sensor-1 from collection sys-symlink-passwd failed to load: failed prog bpf/objs/bpf_multi_kprobe_v61.o kern_version 393730 loadInstance: failed to insert string_maps_0_8: update: key too big for map: argument list too long"

it seems there's a double iteration over the values, first in writeMatchValues and then in writeMatchStrings,
but I still don't fully get the mapid/map_idx correlation.. so I might be off ;-)

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "sys-symlink-passwd"
spec:
  kprobes:
  - call: "sys_symlinkat"
    syscall: true
    args:
    - index: 0
      type: "string"
    - index: 1
      type: "int"
    - index: 2
      type: "string"
    selectors:
    - matchArgs:
      - index: 0
        operator: "Equal"
        values:
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"

@kevsecurity
Contributor Author

I'm getting the following error for the spec below, which I assume should work now

time="2023-09-06T13:17:54Z" level=fatal msg="Failed to start tetragon" error="sensor gkp-sensor-1 from collection sys-symlink-passwd failed to load: failed prog bpf/objs/bpf_multi_kprobe_v61.o kern_version 393730 loadInstance: failed to insert string_maps_0_8: update: key too big for map: argument list too long"

it seems there's a double iteration over the values, first in writeMatchValues and then in writeMatchStrings, but I still don't fully get the mapid/map_idx correlation.. so I might be off ;-)

That's odd. I'll take a look. Thanks for trying it out.

Member

@mtardy mtardy left a comment

So I think after slowly reviewing this I learned many things, thanks – that's a really cool use of BPF_MAP_TYPE_LPM_TRIE! I must admit the multiple maps + padding stuff does make everything a bit more complex and harder to read. Given your benchmarks, I wonder whether you want to remove that optimization or not.

Anyway, that would be helpful as well for #1278; as of now we use sets as well for In and NotIn, which we could replace with those already-implemented operators. Paths will be similar to string or char_buf.

I made some annoying remarks because things were confusing or just convention – feel free to ignore them, just sayin'.

It would be good to do a pass on the documentation as well (after this PR) to remove the trailing \0; even if they are harmless (thanks to your fix), they are confusing:

-  - "/etc/passwd\0"
+  - "/etc/passwd"

bpf/process/string_maps.h (outdated)
pkg/selectors/selectors.go (outdated)
pkg/selectors/selectors.go (outdated)
pkg/selectors/selectors.go (outdated)
pkg/sensors/tracing/selectors.go (outdated)
@kevsecurity kevsecurity force-pushed the pr/kevsecurity/string-match-hash branch from 503053e to 71ed2ff Compare September 6, 2023 17:32
@kevsecurity
Contributor Author

it seems there's a double iteration over the values, first in writeMatchValues and then in writeMatchStrings, but I still don't fully get the mapid/map_idx correlation.. so I might be off ;-)

Yep that's what it was – I fixed it in the last push. When I simplified the code I moved writeMatchStrings into writeMatchValues as I thought it made more sense there, but this introduced the double iteration. All looks good now.

Regarding mapid/map_idx: consider just the smallest-keyed map (key size 25 bytes). A single kprobe could have more than one string Equal match section – maybe for different parameters or different actions. Therefore there could be two or more hash tables of strings, one for each match section. Each of these is placed inside an array_of_hash map at indices 0, 1, 2... These indices are then stored in the kernel selectors, so each match knows which hash table to use.

But of course there are six key sizes, so therefore six array_of_hash maps, and therefore the above process six times. In the selector we store the index for the relevant hash map in the first array_of_hash, then the index for the relevant hash map in the second array_of_hash, and so on for all six. In BPF, we take the input string, work out which key size it would fall into, get the index from the selector for that hash map, and look it up in the relevant array_of_hash. Then we can pad our string to that key size and look up the string in that hash map.
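
For illustration, a rough BPF-side sketch of that two-level look up (assumes <vmlinux.h> and <bpf/bpf_helpers.h>; map names, entry counts, and value types here are hypothetical, not the exact Tetragon definitions):

/* Inner map: one hash table of padded strings per match section. */
struct string_hash_25 {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 1024);
	__type(key, __u8[25]);   /* smallest padded key size */
	__type(value, __u8);
};

/* Outer map: an array_of_hash holding the per-section inner maps at
 * indices 0, 1, 2... – one outer map per key size, six in total. */
struct {
	__uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
	__uint(max_entries, 16);
	__type(key, __u32);
	__array(values, struct string_hash_25);
} string_maps_25 SEC(".maps");

/* map_idx comes from the kernel selector; padded_key is the
 * length-prefixed, zero-padded input string. */
static inline __u8 *string_lookup_25(__u32 map_idx, __u8 *padded_key)
{
	void *inner = bpf_map_lookup_elem(&string_maps_25, &map_idx);

	if (!inner)
		return 0;
	return bpf_map_lookup_elem(inner, padded_key);
}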

HTH

@kevsecurity kevsecurity force-pushed the pr/kevsecurity/string-match-hash branch from 71ed2ff to 923dd6f Compare September 7, 2023 13:45
Contributor

@olsajiri olsajiri left a comment

I'd like to see more tests given the size of new code (also for prefix/postfix),
but I guess that can be added later, anyway it looks good to me now

@kevsecurity
Contributor Author

I'd like to see more tests given the size of new code (also for prefix/postfix), but I guess that can be added later, anyway it looks good to me now

Thanks for the approval. My thought about tests was that this isn't new functionality, so while more tests might be a good idea, we were already okay with the testing of the existing implementation (which happened to have bugs that I fixed a little while back!).

@mtardy
Member

mtardy commented Sep 8, 2023

I'd like to see more tests given the size of new code (also for prefix/postfix), but I guess that can be added later, anyway it looks good to me now

Thanks for the approval. My thought about tests was that this isn't new functionality, so while more tests might be a good idea, we were already okay with the testing of the existing implementation (which happened to have bugs that I fixed a little while back!).

yep understandable, it would be nice though to add some tests on the limits, like the one Jiri proposed #1408 (comment). But as he said, we can add them later.

@kevsecurity
Contributor Author

yep understandable, it would be nice though to add some tests on the limits, like the one Jiri proposed #1408 (comment). But as he said, we can add them later.

I added tests to the first commit for the 6-map string look up, all focused on the limits. Prefix and postfix could potentially benefit from tests at the max string length, but I don't feel they are urgent. The fact that tests already exist for them (from the previous implementation) satisfies me.

Contributor

@kkourt kkourt left a comment

Thanks! This looks good to me.

I have some suggestions, but I feel the PR is definitely an improvement from the current state of things, so I'm happy to merge it as is. The only thing I would like to clarify before merging is the numbers. First, thanks for measuring! The summary seems to be that this gives us a 2.3% improvement, which is small but an improvement nevertheless.

I'm not sure I understand the numbers though.

I've taken some measurements using 50,000,000 calls to a kprobe:

The difference between no hash and the hash based on 6 maps is 60.2ns/kprobe.
The difference between no hash and the hash based on a single map is 127.6ns/kprobe.
The difference between the hash based on 6 maps and the hash based on a single map is 67.4ns/kprobe.

Objectively, the hash for a single map is more than twice the cost of the hash for 6 maps. This is admittedly very small compared to the total cost of the kprobe (2.92us), and represents an increase of 2.3%.

For example:

The difference between the hash based on 6 maps and the hash based on a single map is 67.4ns/kprobe.

So my reading of this is that we are saving 67.4ns for every invocation of the hook.
But then you say:

This is admittedly very small compared to the total cost of the kprobe (2.92us), and represents an increase of 2.3%.

So I'm not sure where the 2.3% comes from. Am I missing something?
I was missing something: 67.4 / 2920 = 2.3%.

Could you add the measurements (maybe the absolute values?) for every run in the commit message so that it is available as part of git history?

Thanks!

bpf/process/types/basic.h (outdated)
pkg/selectors/kernel.go (outdated)
pkg/selectors/selectors.go (outdated)
pkg/selectors/selectors.go (outdated)
pkg/selectors/selectors.go
bpf/process/types/basic.h
bpf/process/types/basic.h
bpf/process/types/basic.h (outdated)
Full string matches in argMatch selectors are limited to two strings per
selector. This limitation is due to the complexity of looping over every
byte of long strings.

This commit reimplements this as a set of hash look ups to reduce
complexity and permit an arbitrary number of full string matches.

BPF hash maps use jhash to hash their keys. jhash operates on 12 byte
blocks, followed by a final block. It appears that the most efficient
use of jhash is when the input is 1 byte longer than a multiple of 12.

Using a single hash map would require all inputs to be the same length,
as the key is a fixed length. This is inefficient as short strings will
involve a lot of hashing of padding bytes. This commit implements six
string hash maps, for padded strings of sizes 25, 49, 73, 97, 121, and
145 bytes.

In order to not confuse character buffers of different lengths where one
buffer ends in padding and the other ends in matching 0s, the first byte
of the padded buffer is the length. This permits this approach to be
used for character buffers in addition to null-terminated strings.

Note, as a consequence of implementing hash look ups for string
matching, we no longer need to specify trailing null characters ("\0")
in tracing policies. Any trailing null characters will be removed
internally before being padded and inserted into the maps.

I've taken some measurements using 50,000,000 calls to a kprobe:

* The difference between no hash and the hash based on 6 maps is
60.2ns/kprobe.
* The difference between no hash and the hash based on a single map is
127.6ns/kprobe.
* The difference between the hash based on 6 maps and the hash based on a
single map is 67.4ns/kprobe.

Objectively, the hash for a single map is more than twice the cost of
the hash for 6 maps. This is admittedly very small compared to the total
cost of the kprobe (2.92us), and represents an increase of 2.3%.

Signed-off-by: Kevin Sheldrake <kevin.sheldrake@isovalent.com>
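
For illustration, a sketch of the padded-key construction the commit message describes (a hypothetical C helper; in Tetragon the insertion side lives in the Go loader):

#include <stdint.h>
#include <string.h>

/* Pad `str` (len bytes, trailing NULs already stripped) into a key of
 * key_size bytes (25/49/73/97/121/145). The leading length byte stops
 * a char_buf that genuinely ends in 0x00 bytes from colliding with a
 * shorter string that merely ends in zero padding. */
static void pad_string_key(uint8_t *key, uint32_t key_size,
			   const char *str, uint8_t len)
{
	memset(key, 0, key_size);
	key[0] = len;
	memcpy(key + 1, str, len);
}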
Prefix string matches are limited to two strings per selector. This
limitation is due to the complexity of looping over every byte in long
strings.

This commit replaces the prefix matching with LPM TRIE look ups. Prefix
strings are stored in a LPM TRIE map with the prefixes set to the length
of the strings (in bits). Look ups are performed by setting the look up
prefix to the length of the string to look up (in bits) and calling the
usual map look up helper.

This permits any number of prefix strings to be matched against.

Signed-off-by: Kevin Sheldrake <kevin.sheldrake@isovalent.com>
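
For illustration, a minimal sketch of the LPM TRIE key described above (the leading prefixlen field is required by BPF_MAP_TYPE_LPM_TRIE; the 128-byte data size and the names are assumptions based on this PR's defaults):

/* LPM trie keys must begin with a prefix length given in bits. */
struct string_prefix_key {
	__u32 prefixlen;   /* string length * 8 */
	__u8  data[128];   /* assumed maximum prefix length */
};

/* Userspace stores each match prefix with prefixlen = strlen(p) * 8.
 * In BPF, looking up the argument with prefixlen = arg_len * 8 hits
 * if any stored prefix is a prefix of the argument. */
static inline void *prefix_lookup(void *lpm_map, struct string_prefix_key *key,
				  __u32 arg_len)
{
	key->prefixlen = arg_len * 8;
	return bpf_map_lookup_elem(lpm_map, key);
}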
@kevsecurity kevsecurity force-pushed the pr/kevsecurity/string-match-hash branch 2 times, most recently from ea8550c to 4d15cd7 Compare September 11, 2023 16:46
Postfix string matches are limited to two strings per selector. This
limitation is due to the complexity of looping over every byte in long
strings.

This commit replaces the postfix matching with LPM TRIE look ups.
Postfix strings are stored (in reverse) in a LPM TRIE map with the
prefixes set to the length of the strings (in bits). Look ups are
performed by setting the look up prefix to the length of the string to
look up (in bits), and the data to the reverse of the string to look up,
and calling the usual map look up helper.

This permits any number of postfix strings to be matched against.

As hash look ups are now used for postfix matching, the null characters
(\0) are no longer required on the end of match strings. Any null
characters found on the ends of the match strings will be gracefully
removed.

Signed-off-by: Kevin Sheldrake <kevin.sheldrake@isovalent.com>
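
And a corresponding sketch of the postfix path, reusing the hypothetical struct from the previous sketch: the argument is copied into the key in reverse (this bounded byte-by-byte copy is what limits the postfix length on older kernels), and the same longest-prefix look up then applies.

static inline void *postfix_lookup(void *lpm_map, struct string_prefix_key *key,
				   const __u8 *arg, __u32 arg_len)
{
	__u32 i;

	if (arg_len > sizeof(key->data))
		return 0;
	/* Reverse copy: a postfix of arg becomes a prefix of key->data. */
	for (i = 0; i < arg_len; i++)
		key->data[i] = arg[arg_len - 1 - i];
	key->prefixlen = arg_len * 8;
	return bpf_map_lookup_elem(lpm_map, key);
}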
Just like in the previous commit regarding string matches, this commit
converts file and FD equal matches to hash look ups. It also covers
notEqual matches. See previous commits for details.

Signed-off-by: Kevin Sheldrake <kevin.sheldrake@isovalent.com>
As for the previous commits that converted string prefix and postfix
matches to LPM TRIE look ups, this commit converts file and FD prefix
and postfix matches to the same. Also includes 'not' matches.

Note: TestMultipleMountPathFiltered on kernels <v5.2 has been modified
as the policy specified a postfix path that was longer than could
possibly be matched. The test now specifies a policy with a path that
can be matched correctly. The test passed prior to this commit because
the postfix comparison started at the end and stopped as soon as one of
the strings was exhausted. This meant that the full specified postfix
string wasn't checked. This was an incorrect result as the start of the
path could have contained any directories and it would still have
matched. The updated test now only contains a postfix path that can
actually be checked.

Signed-off-by: Kevin Sheldrake <kevin.sheldrake@isovalent.com>
Having implemented all the equal/prefix/postfix matches in hashes, and
having implemented the not versions for file/fd, now implement the not
versions for string and char buffers.

Signed-off-by: Kevin Sheldrake <kevin.sheldrake@isovalent.com>
When the string and file matches relied on byte-by-byte matching in
BPF, postfix matches were limited to 50 chars on kernels >=5.4 and 40
chars on kernels <5.4. With the new hash match look ups, the >=5.4
limit was raised to the standard 128 characters by default. For <5.4,
we can now raise the 'small program' limit to 96 without exceeding the
instruction count.

Signed-off-by: Kevin Sheldrake <kevin.sheldrake@isovalent.com>
Contributor

@kkourt kkourt left a comment

Thanks! 🚀

@kevsecurity kevsecurity merged commit bbfee9d into main Sep 12, 2023
26 checks passed
@kevsecurity kevsecurity deleted the pr/kevsecurity/string-match-hash branch September 12, 2023 08:58