
Convert string and char_buf matches to hash look ups #1408

Merged
kevsecurity merged 7 commits into main from pr/kevsecurity/string-match-hash on Sep 12, 2023

Conversation

kevsecurity
Contributor

Currently string, char buf, file, fd, and path matchers (equal, prefix, postfix, plus the 'not' equivalents for file, fd and path) are all written in BPF using loops to inspect (potentially) every character in the strings to be matched.

This series (see commits for details) converts all these matchers to hash map look ups, using a set of 6 hash maps of varying sizes for equal matches, and LPM TRIE maps for prefix and postfix look ups. In addition, the 'not' equivalents are implemented for string and char buf.

The default string length for the matches has been set at 128 bytes, but could be increased to an arbitrary size for equals, and up to 256 bytes for prefix and postfix. For small programs (<5.3), the postfix limit (which is bounded by a byte-by-byte reverse copy) has been raised from 40 chars to 96 chars. For large programs, the postfix limit is set at 128 chars (raised from 50 chars).

@kevsecurity kevsecurity requested a review from a team as a code owner August 30, 2023 16:42
@kevsecurity kevsecurity force-pushed the pr/kevsecurity/string-match-hash branch from cf91e38 to e5890d1 Compare August 30, 2023 16:52
@netlify

netlify bot commented Aug 30, 2023

Deploy Preview for tetragon ready!

Name Link
🔨 Latest commit 4e4f713
🔍 Latest deploy log https://app.netlify.com/sites/tetragon/deploys/64f0aae21577c800087df4d2
😎 Deploy Preview https://deploy-preview-1408--tetragon.netlify.app
@kevsecurity kevsecurity added the release-note/minor This PR introduces a minor user-visible change label Aug 30, 2023
@kevsecurity kevsecurity force-pushed the pr/kevsecurity/string-match-hash branch 2 times, most recently from 5230a73 to 4e4f713 Compare August 31, 2023 14:59
Contributor

@kkourt kkourt left a comment

Hi,

Had a look at the first patch. The idea of using hash maps makes perfect sense to me, so thanks for taking this on.

One thing I'm not so sure about is whether it is worth the complexity of having multiple maps instead of one. As far as I understand the motivation here is to save space. Could you provide some examples of how much space we would save using this approach? I think it depends on the use-case but my feeling is that if we are talking about 10-20 string values, then the difference is not that big.

In any case, I would suggest that we keep the Go structures simpler than what we do in BPF, and then deal with the exact structure of the BPF maps when we load the data.

bpf/process/string_maps.h
pkg/sensors/tracing/selectors.go (outdated)
pkg/selectors/selectors.go (outdated)
bpf/process/types/basic.h
bpf/process/string_maps.h
pkg/sensors/tracing/selectors.go (outdated)
examples/tracingpolicy/hardlink-observe.yaml
pkg/selectors/selectors.go (outdated)
pkg/selectors/kernel_test.go
@kevsecurity
Contributor Author

One thing I'm not so sure about is whether it is worth the complexity of having multiple maps instead of one. As far as I understand the motivation here is to save space.

The motivation is to save CPU cycles in the look up code. If we only have one hash map with a key size of (say) 128 bytes, then every string we try to look up will require hashing 128 bytes, regardless of whether many of them are 0x00. Using several maps with different key sizes means we only hash a number of bytes close to the length of the string we wish to look up.

@kevsecurity
Contributor Author

In any case, I would suggest that we keep the Go structures simpler than what we do in BPF, and then deal with the exact structure of the BPF maps when we load the data.

The problem with keeping the Go structures simpler is that we need to know the array indices of the hash maps at the point that we build the kernel selectors. The easiest way to know these is to make lists of hash maps that mirror those in BPF. That said, some level of harmonisation is possible, which I shall do.

@tixxdz
Member

tixxdz commented Sep 5, 2023

One thing I'm not so sure about is whether it is worth the complexity of having multiple maps instead of one. As far as I understand the motivation here is to save space.

The motivation is to save CPU cycles in the look up code. If we only have one hash map with a key size of (say) 128 bytes, then every string we try to look up will require hashing 128 bytes, regardless of whether many of them are 0x00. Using several maps with different key sizes means we only hash a number of bytes close to the length of the string we wish to look up.

During an offline sync we discussed briefly why there are different maps, and to me it is fine. Maybe it adds complexity from a readability perspective, but from a performance point of view, even if we don't have numbers: 1. these selectors were not working properly and are now being fixed, and 2. file name and string matching is the most basic and common use case. Users may watch many different paths, so having better benchmark results on top is always good.

@kevsecurity kevsecurity force-pushed the pr/kevsecurity/string-match-hash branch 3 times, most recently from c57141f to 503053e Compare September 5, 2023 15:47
@kkourt
Contributor

kkourt commented Sep 6, 2023

One thing I'm not so sure about is whether it is worth the complexity of having multiple maps instead of one. As far as I understand the motivation here is to save space, and it would be nice to justify the code complexity of having multiple maps with some numbers.

The motivation is to save CPU cycles in the look up code. If we only have one hash map with a key size of (say) 128 bytes, then every string we try to look up will require hashing 128 bytes, regardless of whether many of them are 0x00. Using several maps with different key sizes means we only hash a number of bytes close to the length of the string we wish to look up.

During an offline sync we discussed briefly why there are different maps, and to me it is fine. Maybe it adds complexity from a readability perspective, but from a performance point of view, even if we don't have numbers: 1. these selectors were not working properly and are now being fixed, and 2. file name and string matching is the most basic and common use case. Users may watch many different paths, so having better benchmark results on top is always good.

(1) and (2) make sense to me, but they could also be implemented with a single map for all sizes. Having different maps for different sizes is a performance optimization (for CPU cycles as you mention above), and, given the code complexity it introduces, I think it's reasonable to ask: is it worth it? How much time do we save compared to when we use a single map for all sizes?

@kevsecurity
Contributor Author

(1) and (2) make sense to me, but they could also be implemented with a single map for all sizes. Having different maps for different sizes is a performance optimization (for CPU cycles as you mention above), and, given the code complexity it introduces, I think it's reasonable to ask: is it worth it? How much time do we save compared to when we use a single map for all sizes?

jhash costs a little over one CPU cycle per byte of data hashed, but we can approximate it as one cycle per byte; i.e. hashing 128 bytes costs ~128 CPU cycles (see https://fosdem.org/2023/schedule/event/bpf_hashing/ for details/confirmation). The existing byte-by-byte comparison similarly has a CPU cost that grows with the length of the data to compare.

While fixing (1) and (2) above, we also don't want to incur extra CPU cost if we can avoid it. In this situation, we want to give users the ability to specify fairly long strings/filenames to match on – say ~128 bytes. If we take the simplistic approach of a single map, then all look ups will require hashing 128 bytes, with a cost of ~128 CPU cycles. By dividing the strings into different maps, we reduce this cost to approximately the size of the key. Therefore, assuming many look up strings are less than 24 bytes in length, many look ups would cost 25 CPU cycles compared with 128 CPU cycles. We could assume that the next biggest category would be slightly longer strings, with these costing 49 CPU cycles compared to 128 CPU cycles.

Compared to these gains, the code isn't really that complex. Rather than putting all the strings in the same map, they are simply divided into six maps based on their length; and then at look up, the appropriate map is chosen based on the length of the look up string. An advantage of this approach is that we could increase the maximum string length to, say, 1024 bytes with no additional CPU cost – there may be a small memory cost, however.
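
For illustration, here is a minimal sketch of that length-based bucket selection (the key sizes come from the commit message further down; the function and constant names are hypothetical, not the actual identifiers in string_maps.h):

#include <stdint.h>

/* Assumed padded key sizes: one length byte plus a payload, each one
 * byte longer than a multiple of 12 to suit jhash's 12-byte blocks. */
static const uint32_t key_sizes[6] = { 25, 49, 73, 97, 121, 145 };

/* Pick the smallest map whose key holds the length byte plus the
 * string; the hashing cost is then proportional to that key size
 * rather than to the maximum (145 bytes). */
static int string_map_index(uint32_t str_len)
{
	for (int i = 0; i < 6; i++)
		if (str_len + 1 <= key_sizes[i])
			return i;
	return -1; /* too long to match */
}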

Given that Tetragon hooks kernel functions and performs comparisons during many of those hooks, it always costs the system some CPU time. Looking at Sysmon (Windows) as an example of a tool with millions of installs, many of its use cases feature string and filename matching. IMO it's not a case of "Is it worth it?" but more a case of "How could we justify not doing it?"

@kevsecurity
Contributor Author

kevsecurity commented Sep 6, 2023

I've taken some measurements using 50,000,000 calls to a kprobe:

  • The difference between no hash and the hash based on 6 maps is 60.2ns/kprobe.
  • The difference between no hash and the hash based on a single map is 127.6ns/kprobe.
  • The difference between the hash based on 6 maps and the hash based on a single map is 67.4ns/kprobe.

Objectively, the hash for a single map is more than twice the cost of the hash for 6 maps. This is admittedly very small compared to the total cost of the kprobe (2.92us), and represents an increase of 2.3%.

@olsajiri
Contributor

olsajiri commented Sep 6, 2023

I'm getting the following error for the spec below, which I assume should work now

time="2023-09-06T13:17:54Z" level=fatal msg="Failed to start tetragon" error="sensor gkp-sensor-1 from collection sys-symlink-passwd failed to load: failed prog bpf/objs/bpf_multi_kprobe_v61.o kern_version 393730 loadInstance: failed to insert string_maps_0_8: update: key too big for map: argument list too long"

it seems there's a double iteration over the values, first in writeMatchValues and then in writeMatchStrings,
but I still don't fully get the mapid/map_idx correlation.. so I might be off ;-)

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "sys-symlink-passwd"
spec:
  kprobes:
  - call: "sys_symlinkat"
    syscall: true
    args:
    - index: 0
      type: "string"
    - index: 1
      type: "int"
    - index: 2
      type: "string"
    selectors:
    - matchArgs:
      - index: 0
        operator: "Equal"
        values:
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"
        - "AAAAAAAAAAAA"

@kevsecurity
Contributor Author

I'm getting the following error for the spec below, which I assume should work now

time="2023-09-06T13:17:54Z" level=fatal msg="Failed to start tetragon" error="sensor gkp-sensor-1 from collection sys-symlink-passwd failed to load: failed prog bpf/objs/bpf_multi_kprobe_v61.o kern_version 393730 loadInstance: failed to insert string_maps_0_8: update: key too big for map: argument list too long"

it seems there's a double iteration over the values, first in writeMatchValues and then in writeMatchStrings, but I still don't fully get the mapid/map_idx correlation.. so I might be off ;-)

That's odd. I'll take a look. Thanks for trying it out.

Member

@mtardy mtardy left a comment

So I think after slowly reviewing this I learned many things, thanks – that's a really cool use of BPF_MAP_TYPE_LPM_TRIE! I must admit the multiple maps + padding stuff does make everything a bit more complex and harder to read. Given your benchmarks, I wonder whether you want to remove that optimization or not.

Anyway, that would be helpful as well for #1278; as of now we use sets as well for In and NotIn, which we could replace with those already-implemented operators. Paths will be similar to string or char_buf.

I made some annoying remarks because things were confusing or just convention – feel free to ignore them, just sayin'.

It would be good to do a pass on the documentation as well (after this PR) to remove the trailing \0; even if they are harmless (thanks to your fix), they are confusing:

-  - "/etc/passwd\0"
+  - "/etc/passwd"

bpf/process/string_maps.h (outdated)
pkg/selectors/selectors.go (outdated)
pkg/selectors/selectors.go (outdated)
pkg/selectors/selectors.go (outdated)
pkg/sensors/tracing/selectors.go (outdated)
@kevsecurity kevsecurity force-pushed the pr/kevsecurity/string-match-hash branch from 503053e to 71ed2ff Compare September 6, 2023 17:32
@kevsecurity
Contributor Author

it seems there's a double iteration over the values, first in writeMatchValues and then in writeMatchStrings, but I still don't fully get the mapid/map_idx correlation.. so I might be off ;-)

Yep that's what it was – I fixed it in the last push. When I simplified the code I moved writeMatchStrings into writeMatchValues as I thought it made more sense there, but this introduced the double iteration. All looks good now.

Regarding mapid/map_idx: consider just the smallest-keyed map (key size 25 bytes). A single kprobe could have more than one string Equal match section – maybe for different parameters or different actions. Therefore there could be two or more hash tables of strings, one for each match section. Each of these is placed inside an array_of_hash map at indices 0, 1, 2... These indices are then stored in the kernel selectors, so each match knows which hash table to use.

But of course there are six key sizes, so therefore six array_of_hash maps, and therefore the above process six times. In the selector we store the index for the relevant hash map in the first array_of_hash, then the index for the relevant hash map in the second array_of_hash, and so on for all six. In BPF, we take the input string, work out which key size it would fall into, get the index from the selector for that hash map, and look it up in the relevant array_of_hash. Then we can pad our string to that key size and look up the string in that hash map.
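
For illustration, a rough BPF-side sketch of that two-level look up (assumes <vmlinux.h> and <bpf/bpf_helpers.h>; map names, entry counts, and value types here are hypothetical, not the exact Tetragon definitions):

/* Inner map: one hash table of padded strings per match section. */
struct string_hash_25 {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 1024);
	__type(key, __u8[25]);   /* smallest padded key size */
	__type(value, __u8);
};

/* Outer map: an array_of_hash holding the per-section inner maps at
 * indices 0, 1, 2... – one outer map per key size, six in total. */
struct {
	__uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
	__uint(max_entries, 16);
	__type(key, __u32);
	__array(values, struct string_hash_25);
} string_maps_25 SEC(".maps");

/* map_idx comes from the kernel selector; padded_key is the
 * length-prefixed, zero-padded input string. */
static inline __u8 *string_lookup_25(__u32 map_idx, __u8 *padded_key)
{
	void *inner = bpf_map_lookup_elem(&string_maps_25, &map_idx);

	if (!inner)
		return 0;
	return bpf_map_lookup_elem(inner, padded_key);
}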

HTH

@kevsecurity kevsecurity force-pushed the pr/kevsecurity/string-match-hash branch from 71ed2ff to 923dd6f Compare September 7, 2023 13:45
Contributor

@olsajiri olsajiri left a comment

I'd like to see more tests given the size of new code (also for prefix/postfix),
but I guess that can be added later, anyway it looks good to me now

@kevsecurity
Contributor Author

I'd like to see more tests given the size of new code (also for prefix/postfix), but I guess that can be added later, anyway it looks good to me now

Thanks for the approval. My thought about tests was that this isn't new functionality, so while more tests might be a good idea, we were already okay with the testing of the existing implementation (which happened to have bugs that I fixed a little while back!).

@mtardy
Member

mtardy commented Sep 8, 2023

I'd like to see more tests given the size of new code (also for prefix/postfix), but I guess that can be added later, anyway it looks good to me now

Thanks for the approval. My thought about tests was that this isn't new functionality, so while more tests might be a good idea, we were already okay with the testing of the existing implementation (which happened to have bugs that I fixed a little while back!).

yep understandable, it would be nice though to add some tests on the limits, like the one Jiri proposed #1408 (comment). But as he said, we can add them later.

@kevsecurity
Contributor Author

yep understandable, it would be nice though to add some tests on the limits, like the one Jiri proposed #1408 (comment). But as he said, we can add them later.

I added tests to the first commit for the 6-map string look up, all focused on the limits. Prefix and postfix could potentially benefit from tests at the max string length, but I don't feel they are urgent. The fact that tests already exist for them (from the previous implementation) satisfies me.

Contributor

@kkourt kkourt left a comment

Thanks! This looks good to me.

I have some suggestions, but I feel the PR is definitely an improvement from the current state of things, so I'm happy to merge it as is. The only thing I would like to clarify before merging is the numbers. First, thanks for measuring! The summary seems to be that this gives us a 2.3% improvement, which is small but an improvement nevertheless.

I'm not sure I understand the numbers though.

I've taken some measurements using 50,000,000 calls to a kprobe:

The difference between no hash and the hash based on 6 maps is 60.2ns/kprobe.
The difference between no hash and the hash based on a single map is 127.6ns/kprobe.
The difference between the hash based on 6 maps and the hash based on a single map is 67.4ns/kprobe.

Objectively, the hash for a single map is more than twice the cost of the hash for 6 maps. This is admittedly very small compared to the total cost of the kprobe (2.92us), and represents an increase of 2.3%.

For example:

The difference between the hash based on 6 maps and the hash based on a single map is 67.4ns/kprobe.

So my reading of this is that we are saving 67.4ns for every invocation of the hook.
But then you say:

This is admittedly very small compared to the total cost of the kprobe (2.92us), and represents an increase of 2.3%.

So I'm not sure where the 2.3% comes from. Am I missing something?
I was missing something: 67.4 / 2920 = 2.3%.

Could you add the measurements (maybe the absolute values?) for every run in the commit message so that it is available as part of git history?

Thanks!

bpf/process/types/basic.h (outdated)
pkg/selectors/kernel.go (outdated)
pkg/selectors/selectors.go (outdated)
pkg/selectors/selectors.go (outdated)
pkg/selectors/selectors.go
bpf/process/types/basic.h
bpf/process/types/basic.h
bpf/process/types/basic.h (outdated)
Full string matches in argMatch selectors are limited to two strings per
selector. This limitation is due to the complexity of looping over every
byte of long strings.

This commit reimplements this as a set of hash look ups to reduce
complexity and permit an arbitrary number of full string matches.

BPF hash maps use jhash to hash their keys. jhash operates on 12 byte
blocks, followed by a final block. It appears that the most efficient
use of jhash is when the input is 1 byte longer than a multiple of 12.

Using a single hash map would require all inputs to be the same length,
as the key is a fixed length. This is inefficient as short strings will
involve a lot of hashing of padding bytes. This commit implements six
string hash maps, for padded strings of sizes 25, 49, 73, 97, 121, and
145 bytes.

In order to not confuse character buffers of different lengths where one
buffer ends in padding and the other ends in matching 0s, the first byte
of the padded buffer is the length. This permits this approach to be
used for character buffers in addition to null-terminated strings.

Note, as a consequence of implementing hash look ups for string
matching, we no longer need to specify trailing null characters ("\0")
in tracing policies. Any trailing null characters will be removed
internally before being padded and inserted into the maps.

I've taken some measurements using 50,000,000 calls to a kprobe:

* The difference between no hash and the hash based on 6 maps is
60.2ns/kprobe.
* The difference between no hash and the hash based on a single map is
127.6ns/kprobe.
* The difference between the hash based on 6 maps and the hash based on a
single map is 67.4ns/kprobe.

Objectively, the hash for a single map is more than twice the cost of
the hash for 6 maps. This is admittedly very small compared to the total
cost of the kprobe (2.92us), and represents an increase of 2.3%.

Signed-off-by: Kevin Sheldrake <kevin.sheldrake@isovalent.com>
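
For illustration, a sketch of the padded-key construction the commit message describes (a hypothetical C helper; in Tetragon the insertion side lives in the Go loader):

#include <stdint.h>
#include <string.h>

/* Pad `str` (len bytes, trailing NULs already stripped) into a key of
 * key_size bytes (25/49/73/97/121/145). The leading length byte stops
 * a char_buf that genuinely ends in 0x00 bytes from colliding with a
 * shorter string that merely ends in zero padding. */
static void pad_string_key(uint8_t *key, uint32_t key_size,
			   const char *str, uint8_t len)
{
	memset(key, 0, key_size);
	key[0] = len;
	memcpy(key + 1, str, len);
}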
Prefix string matches are limited to two strings per selector. This
limitation is due to the complexity of looping over every byte in long
strings.

This commit replaces the prefix matching with LPM TRIE look ups. Prefix
strings are stored in a LPM TRIE map with the prefixes set to the length
of the strings (in bits). Look ups are performed by setting the look up
prefix to the length of the string to look up (in bits) and calling the
usual map look up helper.

This permits any number of prefix strings to be matched against.

Signed-off-by: Kevin Sheldrake <kevin.sheldrake@isovalent.com>
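
For illustration, a minimal sketch of the LPM TRIE key described above (the leading prefixlen field is required by BPF_MAP_TYPE_LPM_TRIE; the 128-byte data size and the names are assumptions based on this PR's defaults):

/* LPM trie keys must begin with a prefix length given in bits. */
struct string_prefix_key {
	__u32 prefixlen;   /* string length * 8 */
	__u8  data[128];   /* assumed maximum prefix length */
};

/* Userspace stores each match prefix with prefixlen = strlen(p) * 8.
 * In BPF, looking up the argument with prefixlen = arg_len * 8 hits
 * if any stored prefix is a prefix of the argument. */
static inline void *prefix_lookup(void *lpm_map, struct string_prefix_key *key,
				  __u32 arg_len)
{
	key->prefixlen = arg_len * 8;
	return bpf_map_lookup_elem(lpm_map, key);
}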
@kevsecurity kevsecurity force-pushed the pr/kevsecurity/string-match-hash branch 2 times, most recently from ea8550c to 4d15cd7 Compare September 11, 2023 16:46
Postfix string matches are limited to two strings per selector. This
limitation is due to the complexity of looping over every byte in long
strings.

This commit replaces the postfix matching with LPM TRIE look ups.
Postfix strings are stored (in reverse) in a LPM TRIE map with the
prefixes set to the length of the strings (in bits). Look ups are
performed by setting the look up prefix to the length of the string to
look up (in bits), and the data to the reverse of the string to look up,
and calling the usual map look up helper.

This permits any number of postfix strings to be matched against.

As hash look ups are now used for postfix matching, the null characters
(\0) are no longer required on the end of match strings. Any null
characters found on the ends of the match strings will be gracefully
removed.

Signed-off-by: Kevin Sheldrake <kevin.sheldrake@isovalent.com>
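
And a corresponding sketch of the postfix path, reusing the hypothetical struct from the previous sketch: the argument is copied into the key in reverse (this bounded byte-by-byte copy is what limits the postfix length on older kernels), and the same longest-prefix look up then applies.

static inline void *postfix_lookup(void *lpm_map, struct string_prefix_key *key,
				   const __u8 *arg, __u32 arg_len)
{
	__u32 i;

	if (arg_len > sizeof(key->data))
		return 0;
	/* Reverse copy: a postfix of arg becomes a prefix of key->data. */
	for (i = 0; i < arg_len; i++)
		key->data[i] = arg[arg_len - 1 - i];
	key->prefixlen = arg_len * 8;
	return bpf_map_lookup_elem(lpm_map, key);
}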
Just like in the previous commit regarding string matches, this commit
converts file and FD equal matches to hash look ups. It also covers
notEqual matches. See previous commits for details.

Signed-off-by: Kevin Sheldrake <kevin.sheldrake@isovalent.com>
As for the previous commits that converted string prefix and postfix
matches to LPM TRIE look ups, this commit converts file and FD prefix
and postfix matches to the same. Also includes 'not' matches.

Note: TestMultipleMountPathFiltered on kernels <v5.2 has been modified
as the policy specified a postfix path that was longer than could
possibly be matched. The test now specifies a policy with a path that
can be matched correctly. The test passed prior to this commit because
the postfix comparison started at the end and stopped as soon as one of
the strings was exhausted. This meant that the full specified postfix
string wasn't checked. This was an incorrect result as the start of the
path could have contained any directories and it would still have
matched. The updated test now only contains a postfix path that can
actually be checked.

Signed-off-by: Kevin Sheldrake <kevin.sheldrake@isovalent.com>
Having implemented all the equal/prefix/postfix matches in hashes, and
having implemented the not versions for file/fd, now implement the not
versions for string and char buffers.

Signed-off-by: Kevin Sheldrake <kevin.sheldrake@isovalent.com>
When the string and file matches relied on byte-by-byte matching in
BPF, postfix matches were limited to 50 chars on kernels >=5.4 and 40
chars on kernels <5.4. With the new hash match look ups, the >=5.4
limit was raised to the standard 128 characters by default. For <5.4,
we can now raise the 'small program' limit to 96 without exceeding the
instruction count.

Signed-off-by: Kevin Sheldrake <kevin.sheldrake@isovalent.com>
Contributor

@kkourt kkourt left a comment

Thanks! 🚀

@kevsecurity kevsecurity merged commit bbfee9d into main Sep 12, 2023
26 checks passed
@kevsecurity kevsecurity deleted the pr/kevsecurity/string-match-hash branch September 12, 2023 08:58