Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tracee event context has type overflows #3690

Open
rafaeldtinoco opened this issue Nov 15, 2023 · 4 comments
Open

tracee event context has type overflows #3690

rafaeldtinoco opened this issue Nov 15, 2023 · 4 comments
Assignees
Labels
Milestone

Comments

@rafaeldtinoco
Copy link
Contributor

rafaeldtinoco commented Nov 15, 2023

Description

While aware of #2870, we still need to support/fix current tracee event context, as it has data types overflows that are likely causing context to be lost (all uint32 values, from the eBPF context, might be overflowed when decoded into Event struct, such as the PIDNS value, declared as int, which defaults to very high numbers in my env).

The new event.proto takes care of correct types (using google.protobuf.UInt32Value in the right places, kudos to @josedonizetti).

More details

While playing with some marshalling and unmarshalling ideas (for other output printers, for example), I got the following error after a event -> json -> parquet conversion:

{
  "level": "error",
  "ts": 1700017058.446953,
  "msg": "Error encoding event to parquet",
  "error": "integer overflow on token 4026531841",
  "event": {
    "timestamp": 1700017058317424171,
    "threadStartTime": 1699918738371868249,
    "processorId": 7,
    "processId": 463085,
    "cgroupId": 4794,
    "threadId": 463085,
    "parentProcessId": 462937,
    "hostProcessId": 463085,
    "hostThreadId": 463085,
    "hostParentProcessId": 462937,
    "userId": 1000,
    "mountNamespace": 4026531841,
    "pidNamespace": 4026531836,
    "processName": "code",
    "executable": {
      "path": ""
    },
    "hostName": "rugged",
    "containerId": "",
    "container": {},
    "kubernetes": {},
    "eventId": "257",
    "eventName": "openat",
    "matchedPolicies": [
      ""
    ],
    "argsNum": 4,
    "returnValue": 45,
    "syscall": "openat",
    "stackAddresses": null,
    "contextFlags": {
      "containerStarted": false,
      "isCompat": false
    },
    "threadEntityId": 1656711223,
    "processEntityId": 1656711223,
    "parentEntityId": 723639701,
    "args": [
      {
        "name": "dirfd",
        "type": "int",
        "value": -100
      },
      {
        "name": "pathname",
        "type": "const char*",
        "value": "/proc/463548/cmdline"
      },
      {
        "name": "flags",
        "type": "string",
        "value": ""
      },
      {
        "name": "mode",
        "type": "mode_t",
        "value": 0
      }
    ]
  }
}

and:

$ sudo lsns | grep 4026531836
4026531836 pid       316      1 root            /sbin/init

and

11101111111111111111111111111100

isn't quite an unsigned 32 bit overflow but it is a signed 32 bit overflow.

Meaning that a bunch of Event fields that are int should actually be uint32:

type Context struct {
	Ts              uint64
	StartTime       uint64
	CgroupID        uint64
	Pid             uint32
	Tid             uint32
	Ppid            uint32
	HostPid         uint32
	HostTid         uint32
	HostPpid        uint32
	Uid             uint32
	MntID           uint32
	PidID           uint32
	Comm            [16]byte
	UtsName         [16]byte
	Flags           uint32
	LeaderStartTime uint64
	ParentStartTime uint64
	EventID         events.ID // int32
	Syscall         int32
	MatchedPolicies uint64
	Retval          int64
	StackID         uint32
	ProcessorId     uint16
	_               [2]byte // padding
}
type Event struct {
	Timestamp             int          `json:"timestamp"`
	ThreadStartTime       int          `json:"threadStartTime"`
	ProcessorID           int          `json:"processorId"`
	ProcessID             int          `json:"processId"`
	CgroupID              uint         `json:"cgroupId"`
	ThreadID              int          `json:"threadId"`
	ParentProcessID       int          `json:"parentProcessId"`
	HostProcessID         int          `json:"hostProcessId"`
	HostThreadID          int          `json:"hostThreadId"`
	HostParentProcessID   int          `json:"hostParentProcessId"`
	UserID                int          `json:"userId"`
	MountNS               int          `json:"mountNamespace"`
	PIDNS                 int          `json:"pidNamespace"`
	ProcessName           string       `json:"processName"`
	Executable            File         `json:"executable"`
	HostName              string       `json:"hostName"`
	ContainerID           string       `json:"containerId"`
	Container             Container    `json:"container,omitempty"`
	Kubernetes            Kubernetes   `json:"kubernetes,omitempty"`
	EventID               int          `json:"eventId,string"`
	EventName             string       `json:"eventName"`
	PoliciesVersion       uint16       `json:"-"`
	MatchedPoliciesKernel uint64       `json:"-"`
	MatchedPoliciesUser   uint64       `json:"-"`
	MatchedPolicies       []string     `json:"matchedPolicies,omitempty"`
	ArgsNum               int          `json:"argsNum"`
	ReturnValue           int          `json:"returnValue"`
	Syscall               string       `json:"syscall"`
	StackAddresses        []uint64     `json:"stackAddresses"`
	ContextFlags          ContextFlags `json:"contextFlags"`
	ThreadEntityId        uint32       `json:"threadEntityId"`  // thread task unique identifier (*)
	ProcessEntityId       uint32       `json:"processEntityId"` // process unique identifier (*)
	ParentEntityId        uint32       `json:"parentEntityId"`  // parent process unique identifier (*)
	Args                  []Argument   `json:"args"`            // args are ordered according their appearance in the original event
	Metadata              *Metadata    `json:"metadata,omitempty"`
}

The event struct dates from 2020 and since the beginning has uin32 types declared as ints. We probably never payed hard attention because most of the values don't overflow often (and if they do we wouldn't catch in any test I believe). But the default PID namespaces are overflowing in my environments. It happened by accident because Parquet does not support unsigned integers on its data format.

@geyslan
Copy link
Member

geyslan commented Nov 27, 2023

Is this awaiting any related changes or can I tackle the Event int change to uint32?

@rafaeldtinoco
Copy link
Contributor Author

Is this awaiting any related changes or can I tackle the Event int change to uint32?

Please do tackle this. Make sure to test it extensively (including internal E2E) because Im a bit afraid of this change. Let me know if you need anything. Thanks a lot!

@rafaeldtinoco rafaeldtinoco added this to the v0.20.0 milestone Nov 27, 2023
@geyslan
Copy link
Member

geyslan commented Nov 29, 2023

After offline discussion we decided to wait the Event be replaced by https://github.com/aquasecurity/tracee/blob/main/api/v1beta1/event.proto#L11 which already tackled the type normalisation. @josedonizetti please refer (close) this in your upcoming work. Thanks.

@josedonizetti
Copy link
Collaborator

@geyslan @rafaeldtinoco yeah, I think waiting is for the best now since it seems we will migrate the event structure for the 0.20.0 release, but if we see it is not the case in the next few weeks, this can should indeed be fixed.

@josedonizetti josedonizetti modified the milestones: v0.20.0, v0.21.0 Feb 3, 2024
@yanivagman yanivagman modified the milestones: v0.21.0, v0.22.0 Apr 18, 2024
@yanivagman yanivagman modified the milestones: v0.22.0, v0.23.0 May 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants