Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kallsyms: use ebpf alternative when kallsyms not available #2280

Merged
merged 2 commits into from
Mar 28, 2024

Conversation

alban
Copy link
Member

@alban alban commented Dec 10, 2023

kallsyms: use ebpf alternative when kallsyms not available

On the minikube kernel 5.10.57, /proc/kallsyms does not provide the following symbols:

  • bpf_prog_fops (used by the top-ebpf gadget)
  • socket_file_ops (used by the socket enricher)

This patch provides an alternative to get the address of those symbols using ebpf.

Most of the required ebpf code was already implemented for the container-hook package. The file is moved to make it available in a generic package:
pkg/container-hook/bpf/privatedata.bpf.c -> pkg/kfilefields/bpf/filefields.bpf.c

How to use

No changes

Testing done

minikube start -p minikube-kvm2 --driver=kvm2
kubectl run -ti --rm --image=busybox --restart=Never shell01 -- sh
nc -l -p 8080 &

minikube cp ig minikube-kvm2:/usr/bin/ig
minikube ssh
sudo chmod +x /usr/bin/ig
sudo ig top ebpf
sudo ig trace dns
sudo ig trace network

Tested by adding a fmt.Printf:

$ sudo ig trace dns
...
symbol socket_file_ops found at ffffffff94ea0660 using ebpf
...
$ sudo ig top ebpf
...
symbol bpf_prog_fops found at ffffffff94d99ea0 using ebpf
...

Tested by reverting 681c370 (#2270). It works too.

GKE

integration tests run on GKE are passing now since symbol are being resolved:

Links

TODO

@@ -179,6 +179,12 @@ func specUpdateAddresses(
}
}

// List of symbols that we're able to find without using /proc/kallsyms
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if we cannot use this approach for all symbols and get rid of /proc/kallsyms?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kallsyms is also used in:

  • profile-cpu gadget (kAllSyms.LookupByInstructionPointer(ip))
  • top-block-io gadget (__blk_account_io_start, blk_account_io_start, __blk_account_io_done, blk_account_io_done

In those cases, the symbol is not used in any field of a struct file *, so the same trick to pass a file descriptor in a unix socket (SCM_RIGHTS) cannot be used.

@eiffel-fl
Copy link
Member

Should #2270 be reverted in this PR?

Why not, but if everything failed (reading /proc/kallsyms and using eBPF), we should still print a warning.

@alban
Copy link
Member Author

alban commented Dec 12, 2023

I've refactorised the code to make it possible to add some unit tests. It is now possible to test that the two methods to get the addresses of bpf_prog_fops and socket_file_ops return the same values.

Should #2270 be reverted in this PR?

Why not, but if everything failed (reading /proc/kallsyms and using eBPF), we should still print a warning.

After more thinking, I think both could be done in a future PR.

Copy link
Member

@eiffel-fl eiffel-fl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more comments but it is looking great:

pkg/kallsyms/kallsyms.go Outdated Show resolved Hide resolved
pkg/kallsyms/kallsyms.go Outdated Show resolved Hide resolved
@eiffel-fl
Copy link
Member

I've refactorised the code to make it possible to add some unit tests. It is now possible to test that the two methods to get the addresses of bpf_prog_fops and socket_file_ops return the same values.

Should #2270 be reverted in this PR?

Why not, but if everything failed (reading /proc/kallsyms and using eBPF), we should still print a warning.

After more thinking, I think both could be done in a future PR.

What do you mean by both?

Copy link
Member

@eiffel-fl eiffel-fl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested it and it works fine but I am wondering about #2270.

@alban
Copy link
Member Author

alban commented Jan 17, 2024

I've refactorised the code to make it possible to add some unit tests. It is now possible to test that the two methods to get the addresses of bpf_prog_fops and socket_file_ops return the same values.

Should #2270 be reverted in this PR?

Why not, but if everything failed (reading /proc/kallsyms and using eBPF), we should still print a warning.

After more thinking, I think both could be done in a future PR.

What do you mean by both?

Sorry I was confused. At the moment, we print a warning when both methods (kallsyms and eBPF) fail, so I think it is fine as it is.

Copy link
Member

@mauriciovasquezbernal mauriciovasquezbernal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments. Will look at tests later on.

pkg/kallsyms/bpf/privatedata.bpf.c Outdated Show resolved Hide resolved
pkg/kallsyms/kallsyms.go Outdated Show resolved Hide resolved
pkg/kallsyms/kallsyms.go Show resolved Hide resolved
Comment on lines 183 to 191
// List of symbols that we're able to find in 'struct file *' using eBPF
symbolsBypass := map[string]fdType{
"socket_file_ops": fdTypeSocket,
"bpf_prog_fops": fdTypeEbpfProgram,
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

general golang question: is it better to use a map than a switch case?

pkg/kallsyms/kallsyms.go Outdated Show resolved Hide resolved

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments about previous implementation:

  • Would merging symbolsMap and triedGetAddr maps in a single one containing a type symbolResolve struct { addr uint64; err error } as value make the code better?
  • Does SymbolExists have to acquire the lock? Why it doesn't trigger the lookup logic and only returns from cache?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps specUpdateAddresses can be refactored in a function that just returns the symbols and it'll be easier to test without having to always use the bpf spec?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't touch these to avoid changing to much but happy to revisit if you think they should be part of this PR.

@alban alban self-assigned this Jan 18, 2024
pkg/kallsyms/privatedata.go Outdated Show resolved Hide resolved
pkg/kallsyms/privatedata.go Outdated Show resolved Hide resolved

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps specUpdateAddresses can be refactored in a function that just returns the symbols and it'll be easier to test without having to always use the bpf spec?

@mqasimsarfraz
Copy link
Member

It seems this is affecting integrations runs on GKE (OS: Container-Optimized OS from Google, Kernel: 5.15.133+) since I see following warnings in the log:

time="2024-02-09T13:38:05Z" level=warning msg="updating socket_file_ops address with ksyms: file does not exist\nEither you cannot access /proc/kallsyms or this file does not contain socket_file_ops"
time="2024-02-09T13:38:17Z" level=warning msg="updating socket_file_ops address with ksyms: file does not exist\nEither you cannot access /proc/kallsyms or this file does not contain socket_file_ops"

I tested it with the image from this PR and results are looking better.

Copy link
Member

@eiffel-fl eiffel-fl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi!

What is the status of this PR?
I did take a quick look but I can see there are still some previous unresolved comments.

Best regards.

pkg/kallsyms/kallsyms.go Outdated Show resolved Hide resolved
@mqasimsarfraz
Copy link
Member

What is the status of this PR? I did take a quick look but I can see there are still some previous unresolved comments.

Thanks for the quick look, I am still working on unresolved comments. I will mark this PR draft and open it for reviews once I am done!

@mqasimsarfraz mqasimsarfraz marked this pull request as draft February 26, 2024 09:48
@mqasimsarfraz mqasimsarfraz force-pushed the alban_privatedata branch 4 times, most recently from d08a73d to 661af6b Compare February 28, 2024 17:21
alban and others added 2 commits March 27, 2024 19:30
On the minikube kernel 5.10.57, /proc/kallsyms does not provide the
following symbols:
- bpf_prog_fops (used by the top-ebpf gadget)
- socket_file_ops (used by the socket enricher)

This patch provides an alternative to get the address of those symbols
using ebpf.

Most of the required ebpf code was already implemented for the
container-hook package. The file is moved to make it available in a
generic package:
pkg/container-hook/bpf/privatedata.bpf.c -> pkg/kfilefields/bpf/filefields.bpf.c

Co-authored-by: Qasim Sarfraz <qasimsarfraz@microsoft.com>
Co-authored-by: Alban Crequy <albancrequy@linux.microsoft.com>
Signed-off-by: Alban Crequy <albancrequy@linux.microsoft.com>
Signed-off-by: Qasim Sarfraz <qasimsarfraz@microsoft.com>
@mqasimsarfraz
Copy link
Member

Thanks for the reviews!

@mqasimsarfraz mqasimsarfraz merged commit 150ec74 into main Mar 28, 2024
123 checks passed
@mqasimsarfraz mqasimsarfraz deleted the alban_privatedata branch March 28, 2024 14:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

4 participants