Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CONTINT-3542] Send in-<inode> using the cgroup controller when container-id cannot be retrieved #291

Merged
merged 10 commits into from
Jan 11, 2024

Conversation

AliDatadog
Copy link
Contributor

@AliDatadog AliDatadog commented Nov 28, 2023

This Pull Request implements the support of Origin Detection when the container-id is unavailable from inside the application container only with cgroupv2.

It first retrieves the cgroup node path by parsing proc/self/cgroup which is formatted differently in cgroup v1 and v2. See https://man7.org/linux/man-pages/man7/cgroups.7.html.

The format is a list of hierarchy-ID:controller-list:cgroup-path similar to:

0::<path>
1:name=systemd:... // only in cgroupv1

Once the path is retrieved, we retrieve the inode of /sys/fs/cgroup + <path> where <path> is the path retrieved previously.

Not that:

  • We use os.Stat which follows symlinks. Thus, we assume that the path is not a symlink
  • We ignore cases where the inode is less than 2.

How to test

  1. Create a dummy app or use mine docker.io/alidatadog/dummy-dsd-app:0.1.0@sha256:0eca30d71b4fc4ff667b17f46e8e0333712566b7cac2bbc65f60515b8b13e0cc
package main

import (
	"log"
	"time"

	"github.com/DataDog/datadog-go/v5/statsd"
)

type dummyWriter struct{}

func (d dummyWriter) Write(p []byte) (n int, err error) {
	log.Println(string(p))
	return len(p), nil
}

func (d dummyWriter) Close() error {
	return nil
}

func main() {
	// start a statsd client that logs everything to stdout
	statsd, err := statsd.NewWithWriter(dummyWriter{})
	if err != nil {
		log.Fatal(err)
	}

	for {
		// send a counter metric with the value of 1
		statsd.Count("page.views", 1, nil, 1)
		// sleep for 10 seconds
		time.Sleep(10 * time.Second)
	}
}

Dockerfile:

# Use the official Golang image to create a build artifact.
# This is based on Debian and sets the GOPATH to /go.
FROM golang:latest as builder

# Set the Current Working Directory inside the container
WORKDIR /app

# Copy the source from the current directory to the Working Directory inside the container
COPY . .

# Build the Go app
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o main .

# Command to run the executable
CMD ["./main"]
  1. Deploy the app on a kind 1.28.0 cluster
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: dummy-go-app
  spec:
    replicas: 1
    selector:
      matchLabels:
        app: dummy-go-app
    template:
      metadata:
        labels:
          app: dummy-go-app
          admission.datadoghq.com/enabled: "false"
      spec:
        containers:
        - name: dummy-go-app
          image: docker.io/alidatadog/dummy-dsd-app:0.1.0
  1. Look at the logs of the app
❯ k logs dummy-go-app-6688bbddd4-rn6mg
2023/11/29 16:44:57 page.views:1|c|c:in-35044

2023/11/29 16:45:05 datadog.dogstatsd.client.aggregated_context:1|c|#client:go,client_version:5.3.0,client_transport:custom|c:in-35044

…up to retrieve the cgroup node, read mountspoint+cgroupnode to retrieve the inode and store it as the new container id
@AliDatadog AliDatadog force-pushed the ali/send-inode branch 2 times, most recently from f106b04 to 7918f9e Compare November 29, 2023 16:20
@AliDatadog AliDatadog changed the title Send in-<inode> when container-id cannot be retrieved Send in-<inode> when container-id cannot be retrieved on cgroupv2 Nov 29, 2023
@carlosroman carlosroman requested a review from a team November 30, 2023 08:58
)

// initContainerID initializes the container ID.
func init() {
initContainerID = dummyInitContainerId
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was necessary because currently most of the tests will be broken depending on your environment. I noticed that a containerID was added to a lot of metrics when the expected result was hardcoded without containerID.

@@ -153,6 +222,9 @@ func initContainerID(userProvidedID string, cgroupFallback bool) {
} else {
containerID = readMountinfo(selfMountInfoPath)
}
if containerID != "" {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we don't have a very reliable way to identify if we're inside a container. It means that even if the tracer is running on host, we'll send the inode of the cgroup

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually thinking more about that. If you are in cgroup host namespace, it's useless to get inode because /proc/self/cgroup is going yield a container id if present.

So I think it's fair to check for cgroup namesapce != host cgroup namespace to get inode. This will avoid some work for the lib and the Agent.
You can see how it's implemented from this code:
https://github.com/DataDog/datadog-agent/blob/main/pkg/util/containers/metrics/system/collector_linux.go#L84-L91

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense, I applied it in the last commit

@AliDatadog AliDatadog changed the title Send in-<inode> when container-id cannot be retrieved on cgroupv2 Send in-<inode> using the cgroup controller when container-id cannot be retrieved Dec 4, 2023
@AliDatadog AliDatadog force-pushed the ali/send-inode branch 3 times, most recently from f98e4a1 to ab13dd5 Compare December 4, 2023 18:35
@@ -153,6 +215,9 @@ func initContainerID(userProvidedID string, cgroupFallback bool) {
} else {
containerID = readMountinfo(selfMountInfoPath)
}
if containerID != "" {
containerID = getCgroupInode(mountsPath, cgroupPath)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as in trace lib.

@AliDatadog AliDatadog changed the title Send in-<inode> using the cgroup controller when container-id cannot be retrieved [CONTINT-3542] Send in-<inode> using the cgroup controller when container-id cannot be retrieved Dec 11, 2023
@@ -3,7 +3,7 @@

package statsd

func initContainerID(userProvidedID string, cgroupFallback bool) {
var initContainerID = func(userProvidedID string, cgroupFallback bool) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was necessary because initContainerID would be used in every test, breaking them in some env

vboulineau
vboulineau previously approved these changes Dec 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants