Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enhancement(kubernetes_logs source): Increase default for max_line_bytes #7483

Closed
wants to merge 2 commits into from
Closed

enhancement(kubernetes_logs source): Increase default for max_line_bytes #7483

wants to merge 2 commits into from

Conversation

spencergilbert
Copy link
Contributor

Signed-off-by: Spencer Gilbert spencer.gilbert@gmail.com

Closes #6967, based on this comment #6966 (comment)

Signed-off-by: Spencer Gilbert <spencer.gilbert@gmail.com>
@spencergilbert spencergilbert self-assigned this May 17, 2021
@spencergilbert spencergilbert requested review from a team May 17, 2021 14:49
Copy link
Member

@jszwedko jszwedko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like there might be something else going on here as the line limit does indeed appear to be 16 KiB rather than 16K characters:

https://github.com/moby/moby/blob/7b9275c0da707b030e62c96b679a976f31f929d3/daemon/logger/copier.go#L19-L21

Used down here:

https://github.com/moby/moby/blob/7b9275c0da707b030e62c96b679a976f31f929d3/daemon/logger/copier.go#L52-L179

It looks like log drivers can override that value, but the cloudwatch logs one is the only one I see doing that.

🤔

src/sources/kubernetes_logs/mod.rs Outdated Show resolved Hide resolved
Signed-off-by: Spencer Gilbert <spencer.gilbert@gmail.com>
Copy link
Contributor

@MOZGIII MOZGIII left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should work.
If this keeps creating issues, I suggest taking a deeper look at the actual k8s source.
Here's a good pointer.
Note that in the surrounding code at the Kubernetes, the buffers are dynamically sized (https://github.com/kubernetes/kubernetes/blob/c5bd36ef90140a7f3b3d676d319196e684d3d802/pkg/kubelet/kuberuntime/logs/logs.go#L309).

Also, the limit of 16 KB mentioned in the code is for the log line contents only, not for the surrounding metadata (like the timestamp). However, I'm pretty positive that limit is intended to be the byte-size limit, rather than char-size limit by design. That said, things might've changed, or there may be a bug somewhere, so it makes sense to assume the worst-case on our end - which would be the max UTF-8 charset length. I'm all for this change.

@jszwedko
Copy link
Member

Thanks @MOZGIII !

Note that in the surrounding code at the Kubernetes, the buffers are dynamically sized (https://github.com/kubernetes/kubernetes/blob/c5bd36ef90140a7f3b3d676d319196e684d3d802/pkg/kubelet/kuberuntime/logs/logs.go#L309).

This seems to be for reading logs, yes? I'm assuming this codepath is hit by kubectl logs.

That said, things might've changed, or there may be a bug somewhere, so it makes sense to assume the worst-case on our end - which would be the max UTF-8 charset length

I'm not convinced this is the worst-case scenario though. It would be if they were actually counting characters instead of bytes, but I don't see that happening. It seems more likely to me that the docker metadata isn't counted in that limit and so even this new limit could be too low. We should be able to verify that empirically.

I was curious what fluent-bit does and they seem to:

@jszwedko
Copy link
Member

We should be able to verify that empirically.

I verified that it seems to be 16K, before metadata by running:

 docker run  debian:9 bash -c 'yes ❤ | head -n 17000 | tr -d "\n"'

and looking at the raw JSON logs in /var/lib/docker/containers.

I also verified that it seems to be bytes by running:

docker run  debian:9 bash -c 'yes ❤ | head -n 17000 | tr -d "\n"'

And observed that it sliced after 5463 characters / 16387 bytes. It is odd that it ended up with one more byte than 16K, but that might be due to it slicing in the middle of a character and the way that wc to counts them.

$ docker version
Client:
 Version:           20.10.4
 API version:       1.41
 Go version:        go1.15.8
 Git commit:        d3cb89e
 Built:             Mon Mar 29 18:54:36 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.4
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.15.8
  Git commit:       363e9a8
  Built:            Mon Mar 29 18:55:03 2021
  OS/Arch:          linux/amd64
  Experimental:     true
 containerd:
  Version:          1.4.4
  GitCommit:        05f951a3781f4f2c1911b05e61c160e9c30eaa8e
 runc:
  Version:          1.0.0-rc93
  GitCommit:        12644e614e25b05da6fd08a38ffa0cfe1903fdec
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

@jszwedko
Copy link
Member

Given we haven't heard back from the user with some example long log lines to figure out what's going on here, I'm inclined to close this and let it bubble back up if another user reports it. What do you think @spencergilbert ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kubernetes_logs hardcoded max_line_bytes is too low
3 participants