Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recently running & actively running process events output at startup #1209

Closed
2 tasks done
rayjanoka opened this issue Jul 10, 2023 · 13 comments
Closed
2 tasks done

Recently running & actively running process events output at startup #1209

rayjanoka opened this issue Jul 10, 2023 · 13 comments
Labels
kind/bug Something isn't working

Comments

@rayjanoka
Copy link

rayjanoka commented Jul 10, 2023

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

For a few seconds after I restart my tetragon daemonSet I always get a bunch of events for things that I have set to be filtered out via the exportDenyList.

We have a lot of activity in the cluster (monitoring 5k pods across 70 nodes w/tetra) and normal executions are filtered out, but when I restart the ds I get a flood of events that should have been filtered out. A few seconds after the tetragon startup I stop receiving the excess events and the deny filters work as expected.

Tetragon Version

0.9.0 & 0.10.0-pre.2

Kernel Version

5.19.0-43-generic

Kubernetes Version

1.26.4

Bugtool

No response

Relevant log output

example deny list item
  exportAllowList: |-
    {"pod_regex":["^game-i-\\w+$"],"event_set":["PROCESS_EXEC", "PROCESS_KPROBE", "PROCESS_UPROBE"]}
exportDenyList: |-
    {"binary_regex":["^/usr/bin/curl$"],"arguments_regex":["^-s -k --retry 5 --retry-delay 10 https://asdf.bbb.svc:8080/version$"], "pod_regex":["^asdf-\\w+$"]}
    ...we have about 50 of these deny rules...

Anything else?

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@rayjanoka rayjanoka added the kind/bug Something isn't working label Jul 10, 2023
@kkourt
Copy link
Contributor

kkourt commented Jul 14, 2023

Thanks for the report!Could you please provide some examples of events that you 'd expect to be matched by the filter rules but are not?

@kkourt
Copy link
Contributor

kkourt commented Jul 14, 2023

Also, how are you gathering the events? Are you using the tetra CLI (tetra getevents) or do you export in JSON? Note that export filters configured in the daemonset do not apply to tetra getevents.

@rayjanoka
Copy link
Author

ahh, upon closer inspection I think I see what is going on.

the events I was looking at weren't filtered events, they were actually valid events. it looks like when tetragon starts up it prints out the process_exec start event for all the already running processes. Since I've already scanned these events when they first start I don't want to look at them again. I think I can look at the start_time to determine if these are old or new and get this solved. thanks!

@kkourt
Copy link
Contributor

kkourt commented Jul 17, 2023

ahh, upon closer inspection I think I see what is going on.

the events I was looking at weren't filtered events, they were actually valid events. it looks like when tetragon starts up it prints out the process_exec start event for all the already running processes. Since I've already scanned these events when they first start I don't want to look at them again. I think I can look at the start_time to determine if these are old or new and get this solved. thanks!

Cheers! Another way to distinguish these events is to check for "proc" in the flags field (since these events are generated form /proc).

@rayjanoka
Copy link
Author

rayjanoka commented Jul 17, 2023

thanks @kkourt

ok, getting closer, many of them are flags: procFS auid and those look good, but some are flags: execve clone.

As best I can tell these execve clone event pids are not running. I found the description for this clone flag, but I'm not sure why I'm seeing these events on startup.

https://github.com/cilium/tetragon/blob/main/pkg/api/flags.go#L69-L75

execve clone event
{
  "process_exec": {
    "process": {
      "exec_id": "Ym00NToyNzk3Mjc4NTQ4MzAyMTk2OjE2MDY2Njg=",
      "pid": 1606668,
      "uid": 100,
      "cwd": "/home/user/server",
      "binary": "/usr/bin/chmod",
      "arguments": "755 /tmp/sqlite-3.7.2-libsqlitejdbc.so",
      "flags": "execve clone",
      "start_time": "2023-07-17T00:33:10.815357493Z",
      "auid": 4294967295,
      "pod": {
        "namespace": "game",
        "name": "game-i-64aa20c54055f4ac2c902a9d",
        "container": {
          "id": "containerd://d9620432962d2f6e20c1622736631c0f2ca7dd97b7e322a763d6dabfc7d2291b",
          "name": "instance",
          "image": {
            "id": "x",
            "name": "x"
          },
          "start_time": "2023-07-17T00:32:56Z",
          "pid": 190
        },
        "pod_labels": {
          "app": "game-instance"
        }
      },
      "docker": "d9620432962d2f6e20c1622736631c0",
      "parent_exec_id": "Ym00NToyNzk3MjY5NjI3MTc3MTM2OjE2MDM5ODU=",
      "cap": {},
      "ns": {
        "uts": {
          "inum": 4026537567
        },
        "ipc": {
          "inum": 4026537568
        },
        "mnt": {
          "inum": 4026537570
        },
        "pid": {
          "inum": 4026537571
        },
        "pid_for_children": {
          "inum": 4026537571
        },
        "net": {
          "inum": 4026536499
        },
        "time": {
          "inum": 4026531834,
          "is_host": true
        },
        "time_for_children": {
          "inum": 4026531834,
          "is_host": true
        },
        "cgroup": {
          "inum": 4026537610
        },
        "user": {
          "inum": 4026531837,
          "is_host": true
        }
      },
      "tid": 1606668
    },
  },
  "node_name": "zm45",
  "time": "2023-07-17T00:33:10.815357303Z"
}

@rayjanoka rayjanoka reopened this Jul 17, 2023
@rayjanoka rayjanoka changed the title Filters are not applied immediately on startup Existing process event output at startup Jul 17, 2023
@kkourt
Copy link
Contributor

kkourt commented Jul 21, 2023

thanks @kkourt

ok, getting closer, many of them are flags: procFS auid and those look good, but some are flags: execve clone.

As best I can tell these execve clone event pids are not running. I found the description for this clone flag, but I'm not sure why I'm seeing these events on startup.

https://github.com/cilium/tetragon/blob/main/pkg/api/flags.go#L69-L75
execve clone event

The exec clone events are generated after tetragon starts running. I think the most plausible explanation is that the programs do not exist anymore (i.e., they were terminated). You can verify this by looking for ProcessExit events.

@rayjanoka
Copy link
Author

rayjanoka commented Aug 16, 2023

hmm, I'm not currently receiving the ProcessExit events at all...but I could.

If I restart tetragon in succession, on startup it will keep telling me about the same old ProcessExec clone events with the same exec_ids I've already seen for a short running process, like a chmod with a start_time from 14 hours ago.

I guess the old data is just buffered somewhere in bpf land?

It looks like I'll have to either check the start_time or the ProcessExit events...ya?

thanks!

@kkourt
Copy link
Contributor

kkourt commented Aug 21, 2023

Hi,

hmm, I'm not currently receiving the ProcessExit events at all...but I could.

If I restart tetragon in succession, on startup it will keep telling me about the same old ProcessExec clone events with the same exec_ids I've already seen for a short running process, like a chmod with a start_time from 14 hours ago.

I guess the old data is just buffered somewhere in bpf land?

It looks like I'll have to either check the start_time or the ProcessExit events...ya?

thanks!

If you start Tetraragon multiple times, you will get the events from proc (the ones with the procFS flag) multiple times.

@rayjanoka
Copy link
Author

rayjanoka commented Aug 22, 2023

These aren't proc though, they are execve clone from hours before the restart.

@kkourt
Copy link
Contributor

kkourt commented Aug 23, 2023

These aren't proc though, they are execve clone from hours before the restart.

How are you retrieving those events?

@rayjanoka
Copy link
Author

rayjanoka commented Aug 29, 2023

Just from the k8s logs on startup, I get a few of these from each node.

For instance, if I restart tetragon on this node now, Aug 29 05:31:56 UTC 2023, then on startup of the new pod I get this event from back on Aug 27 19:54:13 UTC 2023 (over a day old).

➜ stern -n kube-system  --since 600s tetragon-hbmf6 -c export-stdout
{
  "process_exec": {
    "process": {
      "exec_id": "Ym01MTo2NDA5MzE5ODk0MTE2MjAxOjIyNDMwMjQ=",
      "pid": 2243024,
      "uid": 100,
      "cwd": "/home/minecraft/server",
      "binary": "/usr/bin/cowsay",
      "arguments": "-d -W 15",
      "flags": "execve clone",
      "start_time": "2023-08-27T19:54:13.171416449Z",
      "auid": 4294967295,
      "pod": {
        "namespace": "game",
        "name": "game-i-xxx",
        "container": {
          "id": "containerd://76cc6f952f35bca49ac86df959be6c47a9fed2f295d4e691c6077dde08561796",
          "name": "instance",
          "image": {
            "id": "player-server@sha256:eb78f4ee1d7b4d0a8953966886e3ed8cea855c59bcd34f1047f169bff143f90a",
            "name": "player-server:b584d2adff9ceafe01f5cb8954c4cc58865040f9-live"
          },
          "start_time": "2023-08-27T19:43:19Z",
          "pid": 1713
        },
        "pod_labels": {
          
        }
      },
      "docker": "76cc6f952f35bca49ac86df959be6c4",
      "parent_exec_id": "Ym01MTo2NDA5MzE5ODkzMzM0MzQxOjIyNDMwMjI=",
      "cap": {},
      "ns": {
        "uts": {
          "inum": 4026539836
        },
        "ipc": {
          "inum": 4026539837
        },
        "mnt": {
          "inum": 4026539839
        },
        "pid": {
          "inum": 4026539840
        },
        "pid_for_children": {
          "inum": 4026539840
        },
        "net": {
          "inum": 4026539047
        },
        "time": {
          "inum": 4026531834,
          "is_host": true
        },
        "time_for_children": {
          "inum": 4026531834,
          "is_host": true
        },
        "cgroup": {
          "inum": 4026539841
        },
        "user": {
          "inum": 4026531837,
          "is_host": true
        }
      },
      "tid": 2243024
    },
    "parent": {
      "exec_id": "Ym01MTo2NDA5MzE5ODkzMzM0MzQxOjIyNDMwMjI=",
      "pid": 2243022,
      "uid": 100,
      "cwd": "/home/minecraft/server",
      "binary": "/usr/bin/python3",
      "arguments": "/usr/bin/python3 -version",
      "flags": "execve",
      "start_time": "2023-08-27T19:54:13.170634939Z",
      "auid": 4294967295,
      "pod": {
        "namespace": "game",
        "name": "game-i-xxx",
        "container": {
          "id": "containerd://76cc6f952f35bca49ac86df959be6c47a9fed2f295d4e691c6077dde08561796",
          "name": "instance",
          "image": {
            "id": "player-server@sha256:eb78f4ee1d7b4d0a8953966886e3ed8cea855c59bcd34f1047f169bff143f90a",
            "name": "player-server:b584d2adff9ceafe01f5cb8954c4cc58865040f9-live"
          },
          "start_time": "2023-08-27T19:43:19Z",
          "pid": 1712
        },
        "pod_labels": {
        }
      },
      "docker": "76cc6f952f35bca49ac86df959be6c4",
      "parent_exec_id": "Ym01MTo2NDA5MzE5ODkyMzQ5MTY1OjIyNDMwMjI=",
      "refcnt": 3,
      "cap": {},
      "ns": {
        "uts": {
          "inum": 4026539836
        },
        "ipc": {
          "inum": 4026539837
        },
        "mnt": {
          "inum": 4026539839
        },
        "pid": {
          "inum": 4026539840
        },
        "pid_for_children": {
          "inum": 4026539840
        },
        "net": {
          "inum": 4026539047
        },
        "time": {
          "inum": 4026531834,
          "is_host": true
        },
        "time_for_children": {
          "inum": 4026531834,
          "is_host": true
        },
        "cgroup": {
          "inum": 4026539841
        },
        "user": {
          "inum": 4026531837,
          "is_host": true
        }
      },
      "tid": 2243022
    }
  },
  "node_name": "xxx51",
  "time": "2023-08-27T19:54:13.171416329Z"
}
...

@kkourt
Copy link
Contributor

kkourt commented Aug 30, 2023

I think export-stdout just tails a file in the export directory, and this file is shared across multiple instances of the tetragon pod.

ref: https://github.com/cilium/tetragon/blob/main/install/kubernetes/templates/_container_export_stdout.tpl

So I believe what you are seeing is expected.

@rayjanoka
Copy link
Author

rayjanoka commented Aug 30, 2023

ahhh I see it uses a hostPath so it would persist between pod restarts.

Ok, adding -n 0 to the tail command resolved my final issue. thanks @kkourt!!!

export:
  stdout:
    commandOverride:
      - sh
    argsOverride:
      - -c
      - tail -n 0 -q -F /var/run/cilium/tetragon/tetragon.log 2> /dev/null

@rayjanoka rayjanoka changed the title Existing process event output at startup Formerly running & currently running process events output at startup Aug 30, 2023
@rayjanoka rayjanoka changed the title Formerly running & currently running process events output at startup Recently running & actively running process events output at startup Aug 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants