trace exec: add upper_layer field #2353

Merged
merged 2 commits into from
Jan 18, 2024

Member

@alban alban commented Jan 10, 2024

trace exec: add upper_layer field

The upper_layer field says whether the program file is on the upper layer of overlayfs, i.e. it was modified in the container.

How to use

$ sudo -E ig trace exec
RUNTIME.CONTAINERNAME PID   PPID  COMM  RET ARGS                                    UPPE…
lucid_carver          48098 48076 sh    0   /usr/bin/sh -c cp /bin/echo /bin/echo2… false
lucid_carver          48121 48098 cp    0   /usr/bin/cp /bin/echo /bin/echo2        false
lucid_carver          48122 48098 echo  0   /bin/echo lower                         false
lucid_carver          48123 48098 echo2 0   /bin/echo2 upper                        true
$ docker run -ti --rm ubuntu sh -c 'cp /bin/echo /bin/echo2 ; /bin/echo lower ; /bin/echo2 upper'
lower
upper

As you can see in the output, only /bin/echo2 is reported as being on the overlay upper layer.
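As a rough mental model of what the field reports, here is a user-space toy in Go. It is not the gadget's eBPF implementation; the `overlay` type and its methods are hypothetical names introduced only for illustration. Files shipped in the image live on the lower layer, and anything created or modified in the container lands on the upper layer.

```go
package main

import "fmt"

// overlay is a toy model of an overlayfs mount: two sets of paths.
// It is purely illustrative and never touches the real filesystem.
type overlay struct {
	lower map[string]bool // files shipped in the container image
	upper map[string]bool // files created or modified in the container
}

// onUpperLayer reports whether path would be served from the upper layer.
func (o overlay) onUpperLayer(path string) bool {
	return o.upper[path]
}

// copyFile mimics `cp src dst`: a newly written file lands on the upper layer.
func (o overlay) copyFile(src, dst string) {
	if o.lower[src] || o.upper[src] {
		o.upper[dst] = true
	}
}

func main() {
	o := overlay{
		lower: map[string]bool{"/bin/echo": true},
		upper: map[string]bool{},
	}
	o.copyFile("/bin/echo", "/bin/echo2")
	for _, p := range []string{"/bin/echo", "/bin/echo2"} {
		fmt.Printf("%s upper_layer=%v\n", p, o.onUpperLayer(p))
	}
}
```

In this model only /bin/echo2 is reported as being on the upper layer, which matches the trace output above.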

Testing done

See above.

TODOs

  • Update documentation
  • Update tests
  • Should I use a bitfield? Answer: use bools
  • What should the output look like for the user? Answer: show booleans (true/false)

Limitations

The reporting is incorrect if the exec failed (e.g. due to lack of +x permission). Example:

$ docker run -ti --rm ubuntu sh -c 'cat /bin/echo > /bin/echo2 ; /bin/echo lower ; /bin/echo2 upper'
lower
sh: 1: /bin/echo2: Permission denied
$ go run -exec 'sudo -E' ./cmd/ig/... trace exec -o columns=comm,ret,args,upperlayer
INFO[0000] Experimental features enabled                
COMM             RET ARGS                                                   UPPERLAYER
sh               0   /usr/bin/sh -c cat /bin/echo > /bin/echo2 ; /bin/echo… 0         
cat              0   /usr/bin/cat /bin/echo                                 0         
echo             0   /bin/echo lower                                        0         
sh               -13 /bin/echo2 upper                                       0
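
Until this limitation is addressed, a consumer of the events could treat the field as unknown whenever the exec failed. A minimal sketch, assuming a hypothetical `execEvent` struct with `Comm`, `Ret`, and `UpperLayer` fields (these are not the gadget's actual types):

```go
package main

import "fmt"

// execEvent is a hypothetical, trimmed-down stand-in for a trace exec event.
type execEvent struct {
	Comm       string
	Ret        int
	UpperLayer bool
}

// upperLayerString renders the field, returning "unknown" when the exec
// failed, because the value is not reliable in that case.
func upperLayerString(ev execEvent) string {
	if ev.Ret != 0 {
		return "unknown"
	}
	return fmt.Sprintf("%t", ev.UpperLayer)
}

func main() {
	events := []execEvent{
		{Comm: "echo", Ret: 0, UpperLayer: false},
		{Comm: "sh", Ret: -13, UpperLayer: false}, // /bin/echo2: Permission denied
	}
	for _, ev := range events {
		fmt.Printf("%-5s ret=%-4d upperlayer=%s\n", ev.Comm, ev.Ret, upperLayerString(ev))
	}
}
```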

Misc

Fixes #2345

See also #1913

cc @galofir-ms

Member

@eiffel-fl eiffel-fl left a comment

Hi!

I took a quick look and have some comments; I will test it later.

Should I use a bitfield?

For now, it is OK to use a uint8, but if we plan to add plenty of booleans, a bitfield would definitely be welcome.
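
For reference, a bitfield over a single byte could look like the following Go sketch. The type `execFlags` and the constant names are hypothetical; only the upper-layer flag corresponds to something discussed in this PR, and the second flag exists purely to show how more booleans would fit.

```go
package main

import "fmt"

// execFlags packs several booleans into one byte.
type execFlags uint8

const (
	// flagUpperLayer: the program file is on the overlayfs upper layer.
	flagUpperLayer execFlags = 1 << iota
	// flagExampleOther is a made-up second flag, for illustration only.
	flagExampleOther
)

// has reports whether the given flag bit is set.
func (f execFlags) has(flag execFlags) bool { return f&flag != 0 }

func main() {
	var f execFlags
	f |= flagUpperLayer
	fmt.Println(f.has(flagUpperLayer), f.has(flagExampleOther))
}
```

With a single boolean the two approaches cost the same byte; the bitfield only pays off once several flags share it.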

What should the output look like for the user?

What about:

COMM             RET ARGS                                                   UPPERLAYER
sh               0   /usr/bin/sh -c cat /bin/echo > /bin/echo2 ; /bin/echo… false

or

COMM             RET ARGS                                                   UPPERLAYER
sh               0   /usr/bin/sh -c cat /bin/echo > /bin/echo2 ; /bin/echo… ❌

Best regards.

@@ -163,6 +163,7 @@ func (t *Tracer) run() {
Gid: bpfEvent.Gid,
LoginUid: bpfEvent.Loginuid,
SessionId: bpfEvent.Sessionid,
UpperLayer: uint32(bpfEvent.UpperLayer),
Member

@eiffel-fl eiffel-fl Jan 11, 2024

Why uint32, when it is a uint8 in the eBPF code?

Member Author

I changed it to a bool so that it shows true or false in the output.
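
A minimal sketch of that conversion, assuming hypothetical `rawEvent` and `Event` types (the real structs in the tracer have many more fields): the eBPF side delivers a uint8, and the Go side exposes a bool so that both column and JSON output show true or false.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawEvent stands in for the struct decoded from the eBPF program.
type rawEvent struct {
	UpperLayer uint8
}

// Event stands in for the user-facing event.
type Event struct {
	UpperLayer bool `json:"upperlayer"`
}

// toEvent converts the raw byte into a bool: any non-zero value is true.
func toEvent(raw rawEvent) Event {
	return Event{UpperLayer: raw.UpperLayer != 0}
}

func main() {
	out, err := json.Marshal(toEvent(rawEvent{UpperLayer: 1}))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```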

@eiffel-fl
Member

By the way, would it make sense to port everything to the image-based flavor?

@alban
Member Author

alban commented Jan 15, 2024

The reporting is incorrect if the exec failed (e.g. due to lack of +x permission).

My plan is to document this limitation and work on it in a future PR.

@alban
Member Author

alban commented Jan 16, 2024

I updated the PR:

  • Add the feature in the image-based gadget too
  • Use bools and show true/false in the output
  • Add documentation, including about 2 limitations:
    • The upper_layer field is not computed when execve returns an error
    • For shell scripts, upper_layer is reported for the interpreter and not for the script (both could be useful, but only one is implemented for now)
  • Add tests

@alban
Member Author

alban commented Jan 17, 2024

The CI fails because:

  • README: temporarily dead links
  • cri-o test failure on TestTraceNetwork:
2024-01-17T11:13:56.2788453Z         time="2024-01-17T11:13:45Z" level=error msg="namespace enricher: failed to get mnt namespace on container 34ce46b7bda2cb050eba4573a68ddcf893f6e76ab5c2acfea8fed797eced2d73: stat /proc/7347/ns/mnt: no such file or directory"
2024-01-17T11:13:56.2790989Z         time="2024-01-17T11:13:45Z" level=warning msg="WithTracerCollection: failed to open mntns reference for container 34ce46b7bda2cb050eba4573a68ddcf893f6e76ab5c2acfea8fed797eced2d73: no such file or directory"
2024-01-17T11:13:56.2793665Z         time="2024-01-17T11:13:45Z" level=error msg="cgroup enricher: failed to get cgroup paths on container b500d81aeadc4796941c5162c46087af882309aa2544463fa8a4cde8539f7930: parsing cgroup: open /proc/4377/cgroup: no such file or directory"
2024-01-17T11:13:56.2796070Z         time="2024-01-17T11:13:45Z" level=error msg="namespace enricher: failed to get mnt namespace on container b500d81aeadc4796941c5162c46087af882309aa2544463fa8a4cde8539f7930: stat /proc/4377/ns/mnt: no such file or directory"
2024-01-17T11:13:56.2798323Z         time="2024-01-17T11:13:45Z" level=warning msg="WithTracerCollection: failed to open mntns reference for container b500d81aeadc4796941c5162c46087af882309aa2544463fa8a4cde8539f7930: no such file or directory"
2024-01-17T11:13:56.2800277Z         time="2024-01-17T11:13:46Z" level=warning msg="start tracing container \"test-pod\": getting network namespace of pid 8048: stat /proc/8048/ns/net: no such file or directory"
2024-01-17T11:13:56.2801525Z         time="2024-01-17T11:13:46Z" level=warning msg="stop tracing container \"test-pod\": pid 8048 is not attached"
...
2024-01-17T11:13:56.3781612Z     helpers.go:145: output doesn't contain the expected entry
2024-01-17T11:13:56.3782065Z         captured:
...
2024-01-17T11:13:56.4611587Z         expected:
2024-01-17T11:13:56.4613753Z         {"runtime":{"runtimeName":"cri-o","containerName":"test-pod","containerImageName":"docker.io/library/busybox:latest"},"k8s":{"namespace":"test-trace-network-2779704655061405693","podName":"test-pod","containerName":"test-pod"},"type":"normal","comm":"wget","uid":0,"gid":0,"pktType":"OUTGOING","proto":"TCP","port":80,"dst":{"addr":"10.244.0.12","version":4}}

Both are unrelated to this PR.

@alban alban requested a review from eiffel-fl January 17, 2024 11:38
@alban alban self-assigned this Jan 18, 2024
Member

@mauriciovasquezbernal mauriciovasquezbernal left a comment

I tested it and it works fine. LGTM, thanks!

Tested with:

$ go run -exec 'sudo -E' ./cmd/ig/... trace exec
RUNTIME.CONTAINERNAME PID   PPID  COMM  RET ARGS                                    UPPE…
lucid_carver          48098 48076 sh    0   /usr/bin/sh -c cp /bin/echo /bin/echo2… false
lucid_carver          48121 48098 cp    0   /usr/bin/cp /bin/echo /bin/echo2        false
lucid_carver          48122 48098 echo  0   /bin/echo lower                         false
lucid_carver          48123 48098 echo2 0   /bin/echo2 upper                        true

$ docker run -ti --rm ubuntu sh -c 'cp /bin/echo /bin/echo2 ; /bin/echo lower ; /bin/echo2 upper'
lower
upper

Signed-off-by: Alban Crequy <albancrequy@linux.microsoft.com>
Tested with:

$ sudo -E ./ig image build -t execsnoop ./gadgets/trace_exec/

$ sudo -E ./ig run execsnoop
INFO[0000] Experimental features enabled
RUNTIME.CONTAINERNAME         PID              PPID            UID             GID             R… UPP… COMM
test                          97952            97928           0               0               0  fal… sh
test                          97976            97952           0               0               0  fal… cp
test                          97977            97952           0               0               0  fal… date
test                          97952            97928           0               0               0  true date

$ docker run -ti --rm --name=test busybox sh -c 'cp /bin/date /date ; date ; /date'

Signed-off-by: Alban Crequy <albancrequy@linux.microsoft.com>
@alban alban merged commit d7f610b into main Jan 18, 2024
54 checks passed
@alban alban deleted the alban_upperlayer branch January 18, 2024 15:59
@mauriciovasquezbernal
Member

@alban it seems it broke the CI on AKS and ARO:

AKS

upperlayer for sh event is true, but it should be false:

captured:

{
        "runtime": {
                "containerImageName": "docker.io/library/busybox:latest"
        },
        "k8s": {
                "namespace": "test-trace-exec-7345717025894922932",
                "podName": "test-pod",
                "containerName": "test-pod"
        },
        "type": "normal",
        "comm": "sh",
        "args": [
                "/bin/sh",
                "-c",
                "cp /bin/date /date ; setuidgid 1000:1111 sh -c 'while true; do /date ; /bin/sleep 0.1; done'"
        ],
        "uid": 0,
        "gid": 0,
        "upperlayer": true,
        "loginuid": 0,
        "sessionid": 0,
        "cwd": "/"
}

expected:

{
        "runtime": {
                "containerImageName": "docker.io/library/busybox:latest"
        },
        "k8s": {
                "namespace": "test-trace-exec-7345717025894922932",
                "podName": "test-pod",
                "containerName": "test-pod"
        },
        "type": "normal",
        "comm": "sh",
        "args": [
                "/bin/sh",
                "-c",
                "cp /bin/date /date ; setuidgid 1000:1111 sh -c 'while true; do /date ; /bin/sleep 0.1; done'"
        ],
        "uid": 0,
        "gid": 0,
        "upperlayer": false,
        "loginuid": 0,
        "sessionid": 0,
        "cwd": "/"
}

https://github.com/inspektor-gadget/inspektor-gadget/actions/runs/7572632543/attempts/1#summary-20624245156
https://github.com/inspektor-gadget/inspektor-gadget/actions/runs/7572632543/attempts/1#summary-20624245512

ARO

Event for /date has upperlayer=false, but it should be true.

captured:

{
        "runtime": {
                "containerImageName": "docker.io/library/busybox:latest"
        },
        "k8s": {
                "namespace": "test-trace-exec-5865986315796987308",
                "podName": "test-pod",
                "containerName": "test-pod"
        },
        "type": "normal",
        "comm": "date",
        "args": [
                "/date"
        ],
        "uid": 1000,
        "gid": 1111,
        "upperlayer": false,
        "loginuid": 0,
        "sessionid": 0,
        "cwd": "/"
}

expected:

{
        "runtime": {
                "containerImageName": "docker.io/library/busybox:latest"
        },
        "k8s": {
                "namespace": "test-trace-exec-5865986315796987308",
                "podName": "test-pod",
                "containerName": "test-pod"
        },
        "type": "normal",
        "comm": "date",
        "args": [
                "/date"
        ],
        "uid": 1000,
        "gid": 1111,
        "upperlayer": true,
        "loginuid": 0,
        "sessionid": 0,
        "cwd": "/"
}

https://github.com/inspektor-gadget/inspektor-gadget/actions/runs/7572632543/attempts/1#summary-20624245813
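
When reading such captured/expected pairs, it can help to decode both and compare only the field of interest. The following Go sketch does that for the ARO case; `execEvent` and `upperLayerMismatch` are hypothetical helpers and not part of the project's test framework, and the JSON literals are trimmed versions of the events above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// execEvent decodes only the fields needed for the comparison;
// encoding/json ignores the other keys present in the event.
type execEvent struct {
	Comm       string   `json:"comm"`
	Args       []string `json:"args"`
	UpperLayer bool     `json:"upperlayer"`
}

// upperLayerMismatch decodes two JSON events and reports whether
// their upperlayer fields differ.
func upperLayerMismatch(captured, expected string) (bool, error) {
	var c, e execEvent
	if err := json.Unmarshal([]byte(captured), &c); err != nil {
		return false, fmt.Errorf("decoding captured event: %w", err)
	}
	if err := json.Unmarshal([]byte(expected), &e); err != nil {
		return false, fmt.Errorf("decoding expected event: %w", err)
	}
	return c.UpperLayer != e.UpperLayer, nil
}

func main() {
	captured := `{"comm":"date","args":["/date"],"uid":1000,"gid":1111,"upperlayer":false}`
	expected := `{"comm":"date","args":["/date"],"uid":1000,"gid":1111,"upperlayer":true}`
	mismatch, err := upperLayerMismatch(captured, expected)
	if err != nil {
		panic(err)
	}
	fmt.Println("upperlayer mismatch:", mismatch)
}
```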

@alban
Member Author

alban commented Jan 22, 2024

I tested manually on AKS and it worked for me:

$ ./kubectl-gadget trace exec -o columns=k8s.node,k8s.container,comm,ret,args,upperlayer
INFO[0000] Experimental features enabled                
K8S.NODE                   K8S.CONTAINER              COMM           RET ARGS                               UPPERLAYER
aks-userpool-…5-vmss000000 shell01                    date           0   /usr/bin/date                      false     
aks-userpool-…5-vmss000000 shell01                    date           0   /date                              true
$ kubectl run -ti --rm --image=ubuntu --restart=Never shell01 -- sh
# bash
root@shell01:/# cp /usr/bin/date /date
root@shell01:/# date ; /date
root@shell01:/# uname -a
Linux shell01 5.15.0-1053-azure #61-Ubuntu SMP Tue Nov 21 14:16:01 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

I also tested with an Azure node pool, and it worked fine too:

Linux normal-pod-tc82l 5.15.138.1-4.cm2 #1 SMP Thu Nov 30 21:48:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

The kernel version in the CI logs is:

OS detected: CBL-Mariner/Linux
Kernel detected: 5.15.138.1-4.cm2

This is exactly the same version, so the difference is not explained by a different kernel.

More logs show that upperlayer is inverted: it is true when it should be false, and false when it should be true:

Command      Actual upperlayer   Expected upperlayer
/bin/sh      true                false
/date        false               true
/bin/sleep   true                false

/ # chroot /host /opt/ig trace exec -o columns=runtime.containerName,comm,ret,args,upperlayer|grep mkdir
privileged-pod                 mkdir            0   /bin/mkdir /tmp/1                        true      
privileged-pod                 mkdir            0   /host/bin/mkdir /tmp/1                   false     
privileged-pod                 cp               0   /bin/cp /bin/mkdir /mkdir                true      
privileged-pod                 mkdir            0   /mkdir /tmp/1                            false     
privileged-pod                 sh               -2  /mkdirX /tmp/1                           false     
  • When the execution fails, I get upperlayer=false (correct).
  • When the binary is not on overlayfs (/host/bin/mkdir), I get upperlayer=false (correct).
  • When the binary is on overlayfs, but not on the upper layer (/bin/mkdir), I get upperlayer=true (bug!)
  • When the binary is on overlayfs, and on the upper layer (/mkdir), I get upperlayer=false (bug!)

@alban
Member Author

alban commented Jan 22, 2024

Kernel configuration:

config RANDSTRUCT
        def_bool !RANDSTRUCT_NONE
config GCC_PLUGIN_RANDSTRUCT
        def_bool GCC_PLUGINS && RANDSTRUCT
        help
          Use GCC plugin to randomize structure layout.

On my laptop:

$ cat /boot/config-`uname -r`|grep RANDSTRUCT
CONFIG_RANDSTRUCT_NONE=y
$ cat /boot/config-`uname -r`|grep CONFIG_GCC_PLUGINS  
# CONFIG_GCC_PLUGINS is not set

On the Azure node:

# cat /boot/config-`uname -r`|grep RANDSTRUCT
# CONFIG_GCC_PLUGIN_RANDSTRUCT is not set
# cat /boot/config-`uname -r`|grep CONFIG_GCC_PLUGINS  
CONFIG_GCC_PLUGINS=y
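
The same check can be scripted. Below is a small Go sketch that looks up an option in kernel config text, treating a `# CONFIG_X is not set` line the same as an absent option. The function name `configOption` is hypothetical, and the sample input is the grep output from the Azure node above rather than a full config file.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// configOption returns the value of a CONFIG_ option found in kernel
// config text. The bool is false when the option is absent or appears
// only as a "# CONFIG_X is not set" comment.
func configOption(config, name string) (string, bool) {
	prefix := name + "="
	sc := bufio.NewScanner(strings.NewReader(config))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, prefix) {
			return strings.TrimPrefix(line, prefix), true
		}
	}
	return "", false
}

func main() {
	azureNode := "# CONFIG_GCC_PLUGIN_RANDSTRUCT is not set\nCONFIG_GCC_PLUGINS=y\n"
	for _, name := range []string{"CONFIG_GCC_PLUGINS", "CONFIG_GCC_PLUGIN_RANDSTRUCT"} {
		value, set := configOption(azureNode, name)
		fmt.Printf("%s: set=%v value=%q\n", name, set, value)
	}
}
```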

In our current code, we rely on the relative positions of fields:

// We only rely on vfs_inode and __upperdentry relative positions

But we cannot rely on the relative positions of fields in a struct when field positions are randomized. It might or might not work, depending on the random order that was selected.
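
To make the general point concrete, here is a user-space Go analogy (not kernel code). `layoutA` and `layoutB` are hypothetical structs holding the same fields in a different order, standing in for one struct built with two different randomization seeds; the field names only loosely echo the kernel ones. The distance from one field to another is a property of the layout, so a constant derived from one build does not carry over to the other.

```go
package main

import (
	"fmt"
	"unsafe"
)

// layoutA places the pointer-like field before the embedded blob.
type layoutA struct {
	upperDentry uintptr
	flags       uint64
	vfsInode    [16]byte
}

// layoutB holds the same fields, but in a different order.
type layoutB struct {
	vfsInode    [16]byte
	upperDentry uintptr
	flags       uint64
}

// relOffsetA is the signed distance from vfsInode to upperDentry in layoutA.
func relOffsetA() int {
	var s layoutA
	return int(unsafe.Offsetof(s.upperDentry)) - int(unsafe.Offsetof(s.vfsInode))
}

// relOffsetB is the same distance measured in layoutB.
func relOffsetB() int {
	var s layoutB
	return int(unsafe.Offsetof(s.upperDentry)) - int(unsafe.Offsetof(s.vfsInode))
}

func main() {
	fmt.Println("layoutA relative offset:", relOffsetA())
	fmt.Println("layoutB relative offset:", relOffsetB())
}
```

Code that hard-codes the relative offset from one layout would read the wrong bytes under the other, which is consistent with results that are wrong only on some machines.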

@alban
Member Author

alban commented Jan 22, 2024

struct ovl_inode does not have __randomize_layout (and it does not consist entirely of function pointers, which would cause its layout to be randomized), so relying on relative positions within it should work fine. But struct inode does (see include/linux/fs.h#L733). I wonder whether __randomize_layout changes the size (sizeof) or the packing of struct inode, which would cause the problem.

According to this lkml email, GCC_PLUGIN_RANDSTRUCT generates invalid BTF. Using bpftool btf dump file /tmp/vmlinux-5.15.138.1-4.cm2 format c, I confirmed that /sys/kernel/btf/vmlinux (copied from the Azure node) does not reflect the actual randomized layout of struct inode.

Successfully merging this pull request may close these issues.

[RFE] [trace exec] upper_layer
3 participants