Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ig: Make container-hook optional #2071

Merged
merged 1 commit into from
Sep 27, 2023

Conversation

mauriciovasquezbernal
Copy link
Member

72e7751f9c11 ("pkg: Allow to get events for all processes.") introduced
support for showing events on the host, however, ig still requires
runc to be installed to work. This requirement doesn't make sense anymore
as one could want to get host events without having any container.

This commit makes the container-hook initialization optional for the ig
case, instead of error it prints out a warning message. 

How to test

  1. Get a system without runc installed (I used vagrant with "ubuntu/mantic64")
$ runc
Command 'runc' not found, but can be installed with:
apt install runc
Please ask your administrator.

Before this commit

$ sudo ./ig trace open --host
Error: initializing operators: initializing operator "LocalManager": initializing manager: starting container fanotify: no container runtime can be monitored with fanotify. The following paths were tested: /bin/runc,/usr/bin/runc,/usr/sbin/runc,/usr/local/bin/runc,/usr/local/sbin/runc,/usr/lib/cri-o-runc/sbin/runc,/run/torcx/unpack/docker/bin/runc,/usr/bin/crun,/usr/bin/conmon. You can use the RUNTIME_PATH env variable to specify a custom path. If you are successful doing so, please open a PR to add your custom path to runtimePaths

After this commit

$ sudo ./ig trace open --host
WARN[0001] failed to start container-hook: container enrichment and filtering won't work 
RUNTIME.CONTAINERNAME                                      PID        COMM             FD    ERR PATH
                                                           1          systemd          8     0   /proc/496/cgroup 

@@ -98,7 +98,7 @@ func NewManager(runtimes []*containerutilsTypes.RuntimeConfig) (*IGManager, erro
containercollection.WithCgroupEnrichment(),
containercollection.WithLinuxNamespaceEnrichment(),
containercollection.WithMultipleContainerRuntimesEnrichment(runtimes),
containercollection.WithContainerFanotifyEbpf(),
containercollection.WithContainerFanotifyEbpf(false),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be false only if users pass --host? If such a flag is not passed, it means users want to debug only containers, so it doesn't make sense to go ahead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thoughts:

  • It doesn't seem too easy to get the value of such parameter there. I see it's called from LocalManager.Init(), and there the operator params are available, but --host is not an operator but a gadget param IIUC.

func (l *LocalManager) Init(operatorParams *params.Params) error {

  • In the current implementation we could know the value of --host, but once [RFE] Adopt client/server setup also for ig #1681 is implemented we won't know that value at the time the daemon is started. It's even possible that the user tries to run different gadgets with and without --host.
  • I'm not totally convinced that the default should be --host=false, so I rather prefer not to tie those two things there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the client/server approach, I agree it should go ahead. However, we need to provide some feedback when the user starts the server, the container-hook fails to get initialized, and they start a gadget without --host. Otherwise, container enrichment and filtering won't work, and they will have no idea what's happening because the error Error: initializing operators: initializing operator "LocalManager": initializing manager: starting will be only on the server side, when it was launched.

Maybe a new gadgetctl status could provide the server's status, including which hooks were installed and their statuses? It could also provide other info like the one for BTF: it was downloaded, or it's using the one from the system, etc. That could be useful to request when folks report an issue.

Copy link
Member

@alban alban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I wonder if some users would want a way to fail when they pass flags to filter by containers but the filters don't work. But for now I think the warning is good enough.

@mauriciovasquezbernal
Copy link
Member Author

I decided to take another approach with this PR, now the logic is one level above and we're able to provide warnings when the gadget is run:

$ sudo ./ig trace exec
WARN[0001] Failed to create container-collection        
RUNTIME.CONTAINERNAME                             PID        PPID       COMM             RET ARGS                                                             
Error: running gadget: gadget prerun: pre gadget run on operator "LocalManagerTrace": container-collection isn't available

$ sudo ./ig trace exec --host
WARN[0001] Failed to create container-collection        
RUNTIME.CONTAINERNAME                             PID        PPID       COMM             RET ARGS                                                             
WARN[0001] container-collection isn't available no container filtering will be done 

$ sudo ./ig trace dns
WARN[0001] Failed to create container-collection        
RUNTIME.CONTAINERNAME             PID        TID        COMM             QR TYPE      QTYPE      NAME                              RCODE              NUMANSW…
Error: running gadget: gadget prerun: pre gadget run on operator "LocalManagerTrace": container-collection isn't available

$ sudo ./ig trace dns --host
WARN[0001] Failed to create container-collection        
RUNTIME.CONTAINERNAME             PID        TID        COMM             QR TYPE      QTYPE      NAME                              RCODE              NUMANSW…
WARN[0002] container-collection isn't available, no containers will be traced 

What do you think?

Copy link
Member

@blanquicet blanquicet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer this approach. It covers all the cases and provides proper info to users.

pkg/operators/localmanager/localmanager.go Outdated Show resolved Hide resolved
72e7751 ("pkg: Allow to get events for all processes.") introduced
support for showing events on the host, however, ig still requires
runc to be installed to work. This requirement doesn't make sense anymore
as one could want to get host events without having any container.

This commit makes the container-collection optional in the localmanager
case, instead of error it prints out a warning message.

Signed-off-by: Mauricio Vásquez <mauriciov@microsoft.com>
@mauriciovasquezbernal mauriciovasquezbernal merged commit 7f2ed5a into main Sep 27, 2023
49 checks passed
@mauriciovasquezbernal mauriciovasquezbernal deleted the mauricio/make-container-hook-optional branch September 27, 2023 13:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants