Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ig: add daemon mode #2078

Merged
merged 8 commits into from
Sep 29, 2023
Merged

ig: add daemon mode #2078

merged 8 commits into from
Sep 29, 2023

Conversation

flyth
Copy link
Member

@flyth flyth commented Sep 25, 2023

This adds support to run ig in daemon mode as a service. A new binary called gadgetctl has been added to serve as a client for ig running in said daemon mode. By default, ig listens on a unix socket when started with ./ig daemon and gadgetctl uses that same socket to connect to by default.

The grpc-runtime has been changed to also allow direct gRPC connections - still: most of the code is shared.

Running as root

First terminal

./ig daemon

Second terminal

./gadgetctl trace open

Daemon as root, gadgetctl as other user

Use the gid of a group that you want to have access to the service.

First terminal (as root)

./ig daemon --daemon-group 1000

Second terminal (as member of group 1000)

./gadgetctl trace open

Todos:

  • documentation
  • CI

Resolves #1681
Requires #2067

@flyth flyth force-pushed the michael/ig-daemon branch 4 times, most recently from abaf5e5 to ada23a9 Compare September 25, 2023 18:03
@flyth flyth marked this pull request as ready for review September 25, 2023 18:30
Copy link
Member

@alban alban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Patch 1 and 3 looks generally good, just small comments. I didn't review the gadgetctl patch yet.

cmd/ig/daemon.go Outdated Show resolved Hide resolved
cmd/ig/daemon.go Show resolved Hide resolved
@alban
Copy link
Member

alban commented Sep 26, 2023

How do I specify the unix socket to connect to?

$ ./gadgetctl trace mount
RUNTIME.CONTAINERNAME                                  COMM             PID        TID        CALL                                                                                                                                            
Error: running gadget: dialing gadget pod on node "": dialing node "": context deadline exceeded

It takes 30 seconds to print the error. The column line is printed immediately, leading the user to think there is no problem for 30 seconds.

$ time ./gadgetctl trace open 
RUNTIME.CONTAINERNAME                                                                           PID        COMM             FD        ERR PATH                                                                                                
Error: running gadget: dialing gadget pod on node "": dialing node "": context deadline exceeded

real	0m30.042s
user	0m0.026s
sys	0m0.023s

It needs a version command:

$ ./gadgetctl version
Error: unknown command "version" for "gadgetctl"
Run 'gadgetctl --help' for usage.

@flyth
Copy link
Member Author

flyth commented Sep 26, 2023

How do I specify the unix socket to connect to?

I expected it to be --remote-address but that didn't properly get added as a flag. Will check what I missed when splitting the code, thanks! :)

It needs a version command

Thanks, will add it!

Copy link
Member

@mauriciovasquezbernal mauriciovasquezbernal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good. I added a bunch of nits but IMO it's almost ready to go.

cmd/ig/daemon.go Outdated Show resolved Hide resolved
pkg/gadget-service/service.go Outdated Show resolved Hide resolved
pkg/gadget-service/service.go Outdated Show resolved Hide resolved
cmd/gadgetctl/main.go Show resolved Hide resolved
pkg/runtime/grpc/grpc-runtime.go Outdated Show resolved Hide resolved
pkg/runtime/grpc/grpc-runtime.go Outdated Show resolved Hide resolved
cmd/ig/daemon.go Outdated

func addDaemonCommand(rootCmd *cobra.Command, runtime runtime.Runtime) {
daemonCmd := &cobra.Command{
Use: "daemon",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering about the meaning of daemon here. From wikipedia it's defined as "computer program that runs as a background process, rather than being under the direct control of an interactive user", but in this PR is used more like a server. Should we call it server instead of daemon? /cc @alban

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think in its current state "server" would indeed be a better choice, as it just serves requests and doesn't do any background running of things. With the long running gadgets that will however change again and things will indeed run in the background and you just control using what exactly that is using the socket (in addition to still be able to use it as a "server").

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will be the daemon behaviour in the future? Will it fork and run in background? (Somehow related to Jose's comment in #2078 (comment))

Ideally the solution will be to have an gadgetd / igd and gadgetctl binaries, but we also want it to be able to work alone. I think the current solution is good enough, just be sure to document what daemon is about.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also agree with this comment as I tend to see daemon as software calling daemon() or following man daemon.7.
As ig daemon remains in the foreground, it does not really follow the daemon definition.
So, it depends it we want to make it a daemon in the future?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xref discussion here

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably it can still be called server and the matter it's a daemon or not is controlled by systemd?, i.e. we don't provide an option to run as a daemon directly. WDYT?

/cc @alban @blanquicet pinging you folks to consolidate the discussion on this thread.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree: let's call the CLI command "ig server" and let's run this in the foreground.

If ever needed later, we could add an option "ig server -d" to run in the background, but that should not be necessary with systemd, so I prefer not to add that in this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion on this, but it seems™ like right now it's behaving the same as dockerd.

cmd/ig/daemon.go Outdated Show resolved Hide resolved
cmd/ig/daemon.go Show resolved Hide resolved
pkg/runtime/grpc/info.go Outdated Show resolved Hide resolved
@mauriciovasquezbernal
Copy link
Member

I also think there is something off with the timeout, ./gadgetctl -h takes 30 seconds even if the socket file doesn't exist, shouldn't it return immediately in that case?

Copy link
Member

@blanquicet blanquicet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, it is looking good, and you kept it still "simple". My comments are mostly nits, so it is fine for me to go ahead with this PR (I just tested the direct mode)

cmd/ig/daemon.go Outdated
"github.com/inspektor-gadget/inspektor-gadget/pkg/runtime"
)

func addDaemonCommand(rootCmd *cobra.Command, runtime runtime.Runtime) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, but I was expecting this to be run in the background by default.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean ig should fork itself and stay active in the background with the foreground process exiting? I think we just need a proper service definition that starts ig daemon - that way logging is also routed through systemd (or whatever).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, running ig as a server should not "daemonize" (fork/detach from terminal) by default, so it is easier to integrate with systemd.

cmd/ig/daemon.go Outdated Show resolved Hide resolved
cmd/ig/daemon.go Show resolved Hide resolved
cmd/ig/daemon.go Outdated Show resolved Hide resolved
cmd/ig/main.go Show resolved Hide resolved
pkg/runtime/grpc/grpc-runtime.go Outdated Show resolved Hide resolved
pkg/runtime/grpc/grpc-runtime.go Show resolved Hide resolved
pkg/runtime/grpc/grpc-runtime.go Outdated Show resolved Hide resolved
pkg/runtime/grpc/grpc-runtime.go Outdated Show resolved Hide resolved
}
targets = append(targets, target{
name: t,
node: purl.Hostname(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmmm, the node is always empty for me. What does node represent in the direct mode?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's true for local unix socket connections (unix:///path/to/socket is missing a hostname); if you'd use tcp, this would be the IP or hostname. Should we use a hardcoded "localhost" when hostname is empty?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This target works very well for kubernetes, there you have a pod and node, but for connection mode direct it's confusing. I'll say to leave the node empty and to udpate all debug / error messages to only include it if it's not empty.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed it so that for unix sockets it'll now say "local" and for tcp targets it will be the target hostname.

@flyth flyth force-pushed the michael/ig-daemon branch 3 times, most recently from c09b97b to a01d20d Compare September 27, 2023 11:19
Copy link
Member

@eiffel-fl eiffel-fl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will test it later but so far it looks OK.
Please, think to split your commits as it would make review easier.

cmd/ig/daemon.go Show resolved Hide resolved
cmd/ig/daemon.go Outdated Show resolved Hide resolved
cmd/ig/daemon.go Outdated Show resolved Hide resolved
pkg/gadget-service/service.go Outdated Show resolved Hide resolved
pkg/gadget-service/service.go Outdated Show resolved Hide resolved
cmd/ig/version.go Outdated Show resolved Hide resolved
pkg/runtime/grpc/info.go Show resolved Hide resolved
.github/workflows/inspektor-gadget.yml Outdated Show resolved Hide resolved
Makefile Show resolved Hide resolved
Makefile Show resolved Hide resolved
@flyth flyth force-pushed the michael/ig-daemon branch 2 times, most recently from b8756aa to 39cf7a6 Compare September 28, 2023 07:54
Copy link
Member

@eiffel-fl eiffel-fl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took another look and I have some comments but nothing I find critical.
Some documentation would nonetheless be welcome.
In the meanwhile, I will compile everything and test it to ensure it does not explode in flight.

Comment on lines 271 to 268
// If the given path is the default, try to create it and change permissions; if it's not the default, it is
// up to the user to manage it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do totally agree with you but maybe an improved error message like:

  return nil, fmt.Errorf("listening address %q, got: %w\nMaybe you forgot to create the socket?", address, err)

To indicate the user a potential problem.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean for listener, err := net.Listen(network, address)? There could be several reasons for that to fail (permissions, missing dir, ...), but you're right, it's missing the info on what it tried to do, so changing it to:

return nil, fmt.Errorf("creating listener of type %q at %q: %w", network, address, err)

cmd/common/sync.go Show resolved Hide resolved
pkg/runtime/grpc/grpc-runtime.go Outdated Show resolved Hide resolved
pkg/runtime/grpc/grpc-runtime.go Outdated Show resolved Hide resolved
pkg/runtime/grpc/info.go Outdated Show resolved Hide resolved
pkg/runtime/grpc/info.go Show resolved Hide resolved
@eiffel-fl
Copy link
Member

I tested it and got this behavior:

francis@pwmachine:~/Codes/kinvolk/inspektor-gadget$ ./gadgetctl trace open --host                                 michael/ig-daemon % u=
Error: unknown flag: --host
francis@pwmachine:~/Codes/kinvolk/inspektor-gadget$ sudo ./gadgetctl trace open --host                            michael/ig-daemon % u=
[sudo] Mot de passe de francis : 
RUNTIME.CONTAINERNAME                            PID        COMM             FD    ERR PATH                                              
                                                 3892       ksystemstats     11    0   /proc/meminfo                                     

It is OK, but it should rather explode yelling I should be root?
Also, is there any way to easily test the TCP way? I would like to take a look at it.

Copy link
Member

@mauriciovasquezbernal mauriciovasquezbernal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding some more comments.

pkg/gadget-service/service.go Outdated Show resolved Hide resolved
pkg/gadget-service/service.go Outdated Show resolved Hide resolved
pkg/gadget-service/service.go Outdated Show resolved Hide resolved
pkg/gadget-service/service.go Outdated Show resolved Hide resolved
pkg/gadget-service/service.go Outdated Show resolved Hide resolved
return fmt.Errorf("creating unix listener: %w", err)
}
s.listener = listener
default:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to have this default here, but them comment on L45 (SocketType) should be updated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think it's better to have a "tcp" case here and make the default return an error. (Will leave the default case at the bottom ;))

cmd/ig/daemon.go Outdated Show resolved Hide resolved
cmd/ig/daemon.go Show resolved Hide resolved
pkg/runtime/grpc/grpc-runtime.go Outdated Show resolved Hide resolved
}
targets = append(targets, target{
name: t,
node: purl.Hostname(),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This target works very well for kubernetes, there you have a pod and node, but for connection mode direct it's confusing. I'll say to leave the node empty and to udpate all debug / error messages to only include it if it's not empty.

@flyth flyth force-pushed the michael/ig-daemon branch 2 times, most recently from 8632162 to 974736c Compare September 28, 2023 21:32
@flyth
Copy link
Member Author

flyth commented Sep 28, 2023

I tested it and got this behavior:

francis@pwmachine:~/Codes/kinvolk/inspektor-gadget$ ./gadgetctl trace open --host                                 michael/ig-daemon % u=
Error: unknown flag: --host
francis@pwmachine:~/Codes/kinvolk/inspektor-gadget$ sudo ./gadgetctl trace open --host                            michael/ig-daemon % u=
[sudo] Mot de passe de francis : 
RUNTIME.CONTAINERNAME                            PID        COMM             FD    ERR PATH                                              
                                                 3892       ksystemstats     11    0   /proc/meminfo                                     

It is OK, but it should rather explode yelling I should be root?

I'm not sure I fully understand the issue here. Did you specify a valid user group like this: ig daemon --group GID? You should not need to be root to use gadgetctl if you specified that (and of course are member of that group).

Also, is there any way to easily test the TCP way? I would like to take a look at it.

That should now work again:

./ig daemon -H tcp://127.0.0.1:9999
./gadgetctl trace open --remote-address tcp://127.0.0.1:9999

@eiffel-fl
Copy link
Member

I tested it and got this behavior:

francis@pwmachine:~/Codes/kinvolk/inspektor-gadget$ ./gadgetctl trace open --host                                 michael/ig-daemon % u=
Error: unknown flag: --host
francis@pwmachine:~/Codes/kinvolk/inspektor-gadget$ sudo ./gadgetctl trace open --host                            michael/ig-daemon % u=
[sudo] Mot de passe de francis : 
RUNTIME.CONTAINERNAME                            PID        COMM             FD    ERR PATH                                              
                                                 3892       ksystemstats     11    0   /proc/meminfo                                     

It is OK, but it should rather explode yelling I should be root?

I'm not sure I fully understand the issue here. Did you specify a valid user group like this: ig daemon --group GID? You should not need to be root to use gadgetctl if you specified that (and of course are member of that group).

I ran ig daemon as root and then tried to run gadgetctl as myself.
So, it should have explode saying I cannot run it, instead it complained about an unknown flag, which should be known.

Also, is there any way to easily test the TCP way? I would like to take a look at it.

That should now work again:

./ig daemon -H tcp://127.0.0.1:9999
./gadgetctl trace open --remote-address tcp://127.0.0.1:9999

OK, I will take a look.

Copy link
Member

@eiffel-fl eiffel-fl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EDIT 2: Apply the following and it does the trick:

diff --git a/cmd/ig/daemon.go b/cmd/ig/daemon.go
index f7624294..461894a6 100644
--- a/cmd/ig/daemon.go
+++ b/cmd/ig/daemon.go
@@ -67,13 +67,13 @@ func newDaemonCommand(runtime runtime.Runtime) *cobra.Command {
                }
 
                gid := 0
-               if tmpGroup, err := user.LookupGroup(group); err == nil {
+               if tmpGid, err := strconv.Atoi(group); err == nil {
+                       gid = tmpGid
+               } else if tmpGroup, err := user.LookupGroup(group); err == nil {
                        gid, err = strconv.Atoi(tmpGroup.Gid)
                        if err != nil {
                                return fmt.Errorf("unexpected non-numeric group id %q for group %q", tmpGroup.Gid, group)
                        }
-               } else if tmpGid, err := strconv.Atoi(group); err == nil {
-                       gid = tmpGid
                } else {
                        return fmt.Errorf("group %q not found", group)
                }

I will take a look to see if there is a problem upstream.


I tried using -G and I am getting this error:

francis@pwmachine:~/Codes/kinvolk/inspektor-gadget$ sudo ./ig daemon -G 42                                       michael/ig-daemon *% u=
fatal error: unexpected signal during runtime execution
[signal SIGFPE: floating-point exception code=0x1 addr=0x7f376a375d69 pc=0x7f376a375d69]

runtime stack:
runtime.throw({0x2bae339?, 0x215bd1a?})
        /usr/local/go/src/runtime/panic.go:1047 +0x5d fp=0x7ffdecb4d260 sp=0x7ffdecb4d230 pc=0x43cadd
runtime.sigpanic()
        /usr/local/go/src/runtime/signal_unix.go:823 +0x369 fp=0x7ffdecb4d2b0 sp=0x7ffdecb4d260 pc=0x453549

goroutine 1 [syscall]:
runtime.cgocall(0x20cd6b0, 0xc0008bfa00)
        /usr/local/go/src/runtime/cgocall.go:158 +0x5c fp=0xc0008bf9d8 sp=0xc0008bf9a0 pc=0x40597c
os/user._Cfunc_mygetgrnam_r(0xc0003fd250, 0xc000712920, 0x5a02ee0, 0x400, 0xc000598ac8)
        _cgo_gotypes.go:126 +0x4c fp=0xc0008bfa00 sp=0xc0008bf9d8 pc=0x51a22c
os/user.lookupGroup.func1.1({0xc0003fd250, 0x10?, 0x49d66a0?}, 0xc0003fd250?, 0x0?, 0x40f9a7?)
        /usr/local/go/src/os/user/cgo_lookup_unix.go:143 +0xbb fp=0xc0008bfa60 sp=0xc0008bfa00 pc=0x51b6fb
os/user.lookupGroup.func1()
        /usr/local/go/src/os/user/cgo_lookup_unix.go:143 +0x31 fp=0xc0008bfaa0 sp=0xc0008bfa60 pc=0x51b611
os/user.retryWithBuffer(0xc00061b660, 0xc0008bfb90)
        /usr/local/go/src/os/user/cgo_lookup_unix.go:244 +0x39 fp=0xc0008bfae8 sp=0xc0008bfaa0 pc=0x51c0f9
os/user.lookupGroup({0x7ffdecb4e897, 0x2})
        /usr/local/go/src/os/user/cgo_lookup_unix.go:138 +0x169 fp=0xc0008bfbe0 sp=0xc0008bfae8 pc=0x51b409
os/user.LookupGroup(...)
        /usr/local/go/src/os/user/lookup.go:51
main.newDaemonCommand.func1(0xc0005bd200?, {0xc000712900?, 0x2?, 0x2?})
        /go/src/github.com/inspektor-gadget/inspektor-gadget/cmd/ig/daemon.go:71 +0x1f9 fp=0xc0008bfd08 sp=0xc0008bfbe0 pc=0x20cd019
github.com/spf13/cobra.(*Command).execute(0xc0005bd200, {0xc0007128e0, 0x2, 0x2})
        /go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:940 +0x862 fp=0xc0008bfe40 sp=0xc0008bfd08 pc=0x615162
github.com/spf13/cobra.(*Command).ExecuteC(0xc0003e2300)
        /go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3bd fp=0xc0008bfef8 sp=0xc0008bfe40 pc=0x615a5d
github.com/spf13/cobra.(*Command).Execute(...)
        /go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992
main.main()
        /go/src/github.com/inspektor-gadget/inspektor-gadget/cmd/ig/main.go:72 +0x239 fp=0xc0008bff80 sp=0xc0008bfef8 pc=0x20cd5f9
runtime.main()
        /usr/local/go/src/runtime/proc.go:250 +0x212 fp=0xc0008bffe0 sp=0xc0008bff80 pc=0x43f752
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc0008bffe8 sp=0xc0008bffe0 pc=0x46f581

goroutine 2 [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:363 +0xd6 fp=0xc000084fb0 sp=0xc000084f90 pc=0x43fb16
runtime.goparkunlock(...)
        /usr/local/go/src/runtime/proc.go:369
runtime.forcegchelper()
        /usr/local/go/src/runtime/proc.go:302 +0xad fp=0xc000084fe0 sp=0xc000084fb0 pc=0x43f9ad
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc000084fe8 sp=0xc000084fe0 pc=0x46f581
created by runtime.init.6
        /usr/local/go/src/runtime/proc.go:290 +0x25

goroutine 18 [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:363 +0xd6 fp=0xc000080790 sp=0xc000080770 pc=0x43fb16
runtime.goparkunlock(...)
        /usr/local/go/src/runtime/proc.go:369
runtime.bgsweep(0x0?)
        /usr/local/go/src/runtime/mgcsweep.go:297 +0xd7 fp=0xc0000807c8 sp=0xc000080790 pc=0x4297b7
runtime.gcenable.func1()
        /usr/local/go/src/runtime/mgc.go:178 +0x26 fp=0xc0000807e0 sp=0xc0000807c8 pc=0x41e346
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc0000807e8 sp=0xc0000807e0 pc=0x46f581
created by runtime.gcenable
        /usr/local/go/src/runtime/mgc.go:178 +0x6b

goroutine 19 [GC scavenge wait]:
runtime.gopark(0xc00010e000?, 0x30dad28?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:363 +0xd6 fp=0xc000080f70 sp=0xc000080f50 pc=0x43fb16
runtime.goparkunlock(...)
        /usr/local/go/src/runtime/proc.go:369
runtime.(*scavengerState).park(0x49d54e0)
        /usr/local/go/src/runtime/mgcscavenge.go:389 +0x53 fp=0xc000080fa0 sp=0xc000080f70 pc=0x427793
runtime.bgscavenge(0x0?)
        /usr/local/go/src/runtime/mgcscavenge.go:622 +0x65 fp=0xc000080fc8 sp=0xc000080fa0 pc=0x427da5
runtime.gcenable.func2()
        /usr/local/go/src/runtime/mgc.go:179 +0x26 fp=0xc000080fe0 sp=0xc000080fc8 pc=0x41e2e6
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc000080fe8 sp=0xc000080fe0 pc=0x46f581
created by runtime.gcenable
        /usr/local/go/src/runtime/mgc.go:179 +0xaa

goroutine 3 [finalizer wait]:
runtime.gopark(0x49d66a0?, 0xc000007520?, 0x0?, 0x0?, 0xc000084770?)
        /usr/local/go/src/runtime/proc.go:363 +0xd6 fp=0xc000084628 sp=0xc000084608 pc=0x43fb16
runtime.goparkunlock(...)
        /usr/local/go/src/runtime/proc.go:369
runtime.runfinq()
        /usr/local/go/src/runtime/mfinal.go:180 +0x10f fp=0xc0000847e0 sp=0xc000084628 pc=0x41d3cf
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc0000847e8 sp=0xc0000847e0 pc=0x46f581
created by runtime.createfing
        /usr/local/go/src/runtime/mfinal.go:157 +0x45

goroutine 4 [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:363 +0xd6 fp=0xc000085750 sp=0xc000085730 pc=0x43fb16
runtime.gcBgMarkWorker()
        /usr/local/go/src/runtime/mgc.go:1235 +0xf1 fp=0xc0000857e0 sp=0xc000085750 pc=0x420491
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc0000857e8 sp=0xc0000857e0 pc=0x46f581
created by runtime.gcBgMarkStartWorkers
        /usr/local/go/src/runtime/mgc.go:1159 +0x25

goroutine 20 [GC worker (idle)]:
runtime.gopark(0x621af85e9e1?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:363 +0xd6 fp=0xc000081750 sp=0xc000081730 pc=0x43fb16
runtime.gcBgMarkWorker()
        /usr/local/go/src/runtime/mgc.go:1235 +0xf1 fp=0xc0000817e0 sp=0xc000081750 pc=0x420491
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc0000817e8 sp=0xc0000817e0 pc=0x46f581
created by runtime.gcBgMarkStartWorkers
        /usr/local/go/src/runtime/mgc.go:1159 +0x25

goroutine 21 [GC worker (idle)]:
runtime.gopark(0x621af86b3df?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:363 +0xd6 fp=0xc000081f50 sp=0xc000081f30 pc=0x43fb16
runtime.gcBgMarkWorker()
        /usr/local/go/src/runtime/mgc.go:1235 +0xf1 fp=0xc000081fe0 sp=0xc000081f50 pc=0x420491
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc000081fe8 sp=0xc000081fe0 pc=0x46f581
created by runtime.gcBgMarkStartWorkers
        /usr/local/go/src/runtime/mgc.go:1159 +0x25

goroutine 5 [GC worker (idle)]:
runtime.gopark(0x621af85e367?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:363 +0xd6 fp=0xc000085f50 sp=0xc000085f30 pc=0x43fb16
runtime.gcBgMarkWorker()
        /usr/local/go/src/runtime/mgc.go:1235 +0xf1 fp=0xc000085fe0 sp=0xc000085f50 pc=0x420491
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc000085fe8 sp=0xc000085fe0 pc=0x46f581
created by runtime.gcBgMarkStartWorkers
        /usr/local/go/src/runtime/mgc.go:1159 +0x25

goroutine 6 [GC worker (idle)]:
runtime.gopark(0x621af85aafc?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:363 +0xd6 fp=0xc000086750 sp=0xc000086730 pc=0x43fb16
runtime.gcBgMarkWorker()
        /usr/local/go/src/runtime/mgc.go:1235 +0xf1 fp=0xc0000867e0 sp=0xc000086750 pc=0x420491
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc0000867e8 sp=0xc0000867e0 pc=0x46f581
created by runtime.gcBgMarkStartWorkers
        /usr/local/go/src/runtime/mgc.go:1159 +0x25

goroutine 34 [GC worker (idle)]:
runtime.gopark(0x621af875b6a?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:363 +0xd6 fp=0xc000508750 sp=0xc000508730 pc=0x43fb16
runtime.gcBgMarkWorker()
        /usr/local/go/src/runtime/mgc.go:1235 +0xf1 fp=0xc0005087e0 sp=0xc000508750 pc=0x420491
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc0005087e8 sp=0xc0005087e0 pc=0x46f581
created by runtime.gcBgMarkStartWorkers
        /usr/local/go/src/runtime/mgc.go:1159 +0x25

goroutine 35 [GC worker (idle)]:
runtime.gopark(0x621af85a5ce?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:363 +0xd6 fp=0xc000508f50 sp=0xc000508f30 pc=0x43fb16
runtime.gcBgMarkWorker()
        /usr/local/go/src/runtime/mgc.go:1235 +0xf1 fp=0xc000508fe0 sp=0xc000508f50 pc=0x420491
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc000508fe8 sp=0xc000508fe0 pc=0x46f581
created by runtime.gcBgMarkStartWorkers
        /usr/local/go/src/runtime/mgc.go:1159 +0x25

goroutine 36 [GC worker (idle)]:
runtime.gopark(0x621af85f2ca?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:363 +0xd6 fp=0xc000509750 sp=0xc000509730 pc=0x43fb16
runtime.gcBgMarkWorker()
        /usr/local/go/src/runtime/mgc.go:1235 +0xf1 fp=0xc0005097e0 sp=0xc000509750 pc=0x420491
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc0005097e8 sp=0xc0005097e0 pc=0x46f581
created by runtime.gcBgMarkStartWorkers
        /usr/local/go/src/runtime/mgc.go:1159 +0x25

Note that, I got this with any group ID.
I do not know if this on our side or not...
It comes from here but I do not see anything wrong:

b9691f9#diff-9d3ae444cb907b9257ef6dfc4124c9b9df6bf3562f3bac164be73b06765292e9R70

EDIT: I also got it without the option, i.e. sudo ig daemon.

@eiffel-fl
Copy link
Member

eiffel-fl commented Sep 29, 2023

OK, it works well for the -G argument.
Regarding --host, gadgetctl is not aware of it as it is a ig global flag, hence:

francis@pwmachine:~/Codes/kinvolk/inspektor-gadget$ ./gadgetctl trace open --host                                michael/ig-daemon *% u=
Error: unknown flag: --host

This can be addressed later, as it would mean porting --host to gadgetctl.


I also tried the -H option and it works like a charm!

Copy link
Member

@eiffel-fl eiffel-fl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation was the only missing point and you added it, thank you!

Regarding the naming, I am fine with both (daemon or server).
But we should avoid to change our mind once we released and advertised it.

docs/ig.md Outdated Show resolved Hide resolved
docs/ig.md Show resolved Hide resolved
Global runtime flags have previously not been used but they will be used by gadgetctl for now.
However, gadgetctl will register those flags on the root command level and not per-gadget.
This change exports AddFlags so that it is available to the main packages and removes
registering it in AddCommandFromRegistry to avoid duplicate params.

Signed-off-by: Michael Friese <mfriese@microsoft.com>
…ckets

This adds a RunConfig, which allows to configure the service to listen on
a tcp socket instead of just a previously used unix socket. In addition,
this now allows specifying a group id, which will then be used as the gid
for the unix socket to allow access to it using a usergroup.

Signed-off-by: Michael Friese <mfriese@microsoft.com>
The version and sync commands will be used in multiple binaries. In order to reduce code,
this moves the commands to the common package.

Signed-off-by: Michael Friese <mfriese@microsoft.com>
This adds a daemon mode to ig; ig will listen on either a unix or tcp socket
for incoming gRPC requests that it will then execute. This allows using ig as
a service rather than starting it individually for every gadget. A new binary
will be added to be able to serve as a client.

Signed-off-by: Michael Friese <mfriese@microsoft.com>
Prior to this change, we'd always tunnel connections to our gRPC service through
the k8s API server. This allows a "direct mode" that will instead directly
connect to one or multiple gRPC endpoints.

Signed-off-by: Michael Friese <mfriese@microsoft.com>
Copy link
Member

@mauriciovasquezbernal mauriciovasquezbernal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ready to go! Last minor nits but it's looking fine!

Good job!

docs/ig.md Outdated Show resolved Hide resolved
docs/ig.md Show resolved Hide resolved
This change introduces a new binary (for now called `gadgetctl`), to serve as the client
for ig that is run in daemon mode. The connection is handled by the grpc-runtime, that
has been extended to allow direct connections to gRPC endpoints in addition to proxied
connections through the kubernetes API server.

Signed-off-by: Michael Friese <mfriese@microsoft.com>
Signed-off-by: Michael Friese <mfriese@microsoft.com>
This adds explanations on how to use ig in combination with gadgetctl, both
using unix sockets (with an unprivileged user) and tcp sockets.

Signed-off-by: Michael Friese <mfriese@microsoft.com>
@flyth
Copy link
Member Author

flyth commented Sep 29, 2023

Thanks for the reviews!

@flyth flyth merged commit 781bea6 into main Sep 29, 2023
54 checks passed
@flyth flyth deleted the michael/ig-daemon branch September 29, 2023 19:52
}
client := api.NewGadgetManagerClient(conn)
defer conn.Close()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is conn closed? I think getClientFromRandomTarget should return the connection instead of the client, so the caller has the opportunity to call defer conn.Close(). (opened #2092 to fix it)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RFE] Adopt client/server setup also for ig
6 participants