Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

link: do not use regexp pkg #642

Merged
merged 2 commits into from Apr 27, 2022
Merged

link: do not use regexp pkg #642

merged 2 commits into from Apr 27, 2022

Conversation

kolyshkin
Copy link
Contributor

For ID validity checks, replace regular expressions with custom functions checking input strings
byte by byte.

For sanitizeSymbol, replace a regexp with a for loop. Add some more test cases while at it.

Regular expressions bring in a big library as a dependency, and they
incur some overhead during startup (to compile regexes).

@lmb lmb self-requested a review April 26, 2022 06:52
Copy link
Collaborator

@lmb lmb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good to me. Out of curiosity, do you have numbers for size / start up time impact?

link/kprobe.go Outdated Show resolved Hide resolved
link/kprobe.go Outdated Show resolved Hide resolved
link/uprobe.go Show resolved Hide resolved
link/uprobe.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@ti-mo ti-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really worth it? I'd be curious to see some numbers.

link/kprobe.go Show resolved Hide resolved
link/perf_event.go Show resolved Hide resolved
@kolyshkin
Copy link
Contributor Author

kolyshkin commented Apr 27, 2022

Is this really worth it? I'd be curious to see some numbers.

Out of curiosity, do you have numbers for size / start up time impact?

The initialization overhead I mentioned in the description is probably negligible, so not doing any measurements (yet it calls regexp.MustCompile from the init section for every global variable like the removed rgxEventSymbol).

Let's see the binary size overhead. Using Go 1.18.1 on Linux/amd64, -orig is with cilium/ebpf v0.8.1, the no-suffix version is compiled with the version from this PR:

[kir@kir-rhat ebpf]$ ls -l ebpf-test-orig ebpf-test
-rwxrwxr-x. 1 kir kir 3187473 Apr 26 20:22 ebpf-test
-rwxrwxr-x. 1 kir kir 3387158 Apr 26 20:20 ebpf-test-orig

The difference is close to 200K. Let's see the size results:

$ size ebpf-test ebpf-test-orig 
   text	   data	    bss	    dec	    hex	filename
1993489	 112896	 217376	2323761	 237531	ebpf-test
2121479	 121456	 217760	2460695	 258c17	ebpf-test-orig

So, about 130KB from text+data+bss, I guess most of it is regexp and regexp/syntax packages.

Note the measurements might not be entirely valid as I'm comparing the latest released version of github.com/cilium/ebpf (v0.8.1) against the code from this PR.

I have used this non-program calling uprobe (which is being patched to not use regexp):

package main

import (
	"log"

	"github.com/cilium/ebpf"
	"github.com/cilium/ebpf/asm"
	"github.com/cilium/ebpf/link"
)

func main() {
	prog, err := ebpf.NewProgram(&ebpf.ProgramSpec{
		Type:    ebpf.Kprobe,
		License: "MIT",
		Instructions: asm.Instructions{
			asm.Mov.Imm(asm.R0, 0),
			asm.Return(),
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	bashEx, err := link.OpenExecutable("/bin/bash")
	if err != nil {
		log.Fatal(err)
	}

	up, err := bashEx.Uprobe("main", prog, nil)
	if err != nil {
		log.Fatal(err)
	}
	up.Close()
}

with this go.mod file:

module github.com/kolyshkin/ebpf-test

go 1.18

require (
	github.com/cilium/ebpf v0.8.1
	golang.org/x/sys v0.0.0-20210906170528-6f6e22806c34 // indirect
)

replace github.com/cilium/ebpf => github.com/kolyshkin/ebpf v0.8.2-0.20220425190318-df1de70d40a1

(Remove the last line to go back to stock v0.8.1)

Replace regular expressions with custom functions checking input strings
byte by byte.

Regular expressions bring in a big library as a dependency, and they
incur some overhead during startup (to compile regexes).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Instead of using a regular expression in sanitizeSymbol, write a small
for loop to replace any repetitions of unknown characters with a "_".

This removes the last use of regexp from this package.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@kolyshkin
Copy link
Contributor Author

Resolved all review comments except #642 (comment), rebased.

@kolyshkin kolyshkin requested a review from ti-mo April 27, 2022 04:51
@lmb
Copy link
Collaborator

lmb commented Apr 27, 2022

Thanks for the numbers, seems worth it.

@ti-mo ti-mo merged commit 8a8aa58 into cilium:master Apr 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants