enhancement(kubernetes_logs source): Increase default for max_line_bytes #7483

spencergilbert · 2021-05-17T14:49:41Z

Signed-off-by: Spencer Gilbert spencer.gilbert@gmail.com

Closes #6967, based on this comment #6966 (comment)

Signed-off-by: Spencer Gilbert <spencer.gilbert@gmail.com>

jszwedko

I feel like there might be something else going on here as the line limit does indeed appear to be 16 KiB rather than 16K characters:

https://github.com/moby/moby/blob/7b9275c0da707b030e62c96b679a976f31f929d3/daemon/logger/copier.go#L19-L21

Used down here:

https://github.com/moby/moby/blob/7b9275c0da707b030e62c96b679a976f31f929d3/daemon/logger/copier.go#L52-L179

It looks like log drivers can override that value, but the cloudwatch logs one is the only one I see doing that.

🤔

src/sources/kubernetes_logs/mod.rs

Signed-off-by: Spencer Gilbert <spencer.gilbert@gmail.com>

MOZGIII

This should work.
If this keeps creating issues, I suggest taking a deeper look at the actual k8s source.
Here's a good pointer.
Note that in the surrounding code at the Kubernetes, the buffers are dynamically sized (https://github.com/kubernetes/kubernetes/blob/c5bd36ef90140a7f3b3d676d319196e684d3d802/pkg/kubelet/kuberuntime/logs/logs.go#L309).

Also, the limit of 16 KB mentioned in the code is for the log line contents only, not for the surrounding metadata (like the timestamp). However, I'm pretty positive that limit is intended to be the byte-size limit, rather than char-size limit by design. That said, things might've changed, or there may be a bug somewhere, so it makes sense to assume the worst-case on our end - which would be the max UTF-8 charset length. I'm all for this change.

jszwedko · 2021-05-24T13:47:53Z

Thanks @MOZGIII !

Note that in the surrounding code at the Kubernetes, the buffers are dynamically sized (https://github.com/kubernetes/kubernetes/blob/c5bd36ef90140a7f3b3d676d319196e684d3d802/pkg/kubelet/kuberuntime/logs/logs.go#L309).

This seems to be for reading logs, yes? I'm assuming this codepath is hit by kubectl logs.

That said, things might've changed, or there may be a bug somewhere, so it makes sense to assume the worst-case on our end - which would be the max UTF-8 charset length

I'm not convinced this is the worst-case scenario though. It would be if they were actually counting characters instead of bytes, but I don't see that happening. It seems more likely to me that the docker metadata isn't counted in that limit and so even this new limit could be too low. We should be able to verify that empirically.

I was curious what fluent-bit does and they seem to:

cap at 5 MB lines https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/output/elasticsearch/fluent-bit-configmap.yaml
skip any lines longer than that

jszwedko · 2021-05-24T13:59:44Z

We should be able to verify that empirically.

I verified that it seems to be 16K, before metadata by running:

 docker run  debian:9 bash -c 'yes ❤ | head -n 17000 | tr -d "\n"'

and looking at the raw JSON logs in /var/lib/docker/containers.

I also verified that it seems to be bytes by running:

docker run  debian:9 bash -c 'yes ❤ | head -n 17000 | tr -d "\n"'

And observed that it sliced after 5463 characters / 16387 bytes. It is odd that it ended up with one more byte than 16K, but that might be due to it slicing in the middle of a character and the way that wc to counts them.

$ docker version
Client:
 Version:           20.10.4
 API version:       1.41
 Go version:        go1.15.8
 Git commit:        d3cb89e
 Built:             Mon Mar 29 18:54:36 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.4
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.15.8
  Git commit:       363e9a8
  Built:            Mon Mar 29 18:55:03 2021
  OS/Arch:          linux/amd64
  Experimental:     true
 containerd:
  Version:          1.4.4
  GitCommit:        05f951a3781f4f2c1911b05e61c160e9c30eaa8e
 runc:
  Version:          1.0.0-rc93
  GitCommit:        12644e614e25b05da6fd08a38ffa0cfe1903fdec
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

jszwedko · 2021-06-18T20:06:18Z

Given we haven't heard back from the user with some example long log lines to figure out what's going on here, I'm inclined to close this and let it bubble back up if another user reports it. What do you think @spencergilbert ?

enhancement(kubernetes_logs source): Increase default for max_line_bytes

bc9f579

Signed-off-by: Spencer Gilbert <spencer.gilbert@gmail.com>

spencergilbert requested a review from jszwedko May 17, 2021 14:49

spencergilbert self-assigned this May 17, 2021

spencergilbert requested review from a team May 17, 2021 14:49

jszwedko reviewed May 17, 2021

View reviewed changes

src/sources/kubernetes_logs/mod.rs Outdated Show resolved Hide resolved

Fix typo

e2f4de2

Signed-off-by: Spencer Gilbert <spencer.gilbert@gmail.com>

MOZGIII approved these changes May 22, 2021

View reviewed changes

spencergilbert closed this Jun 21, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

enhancement(kubernetes_logs source): Increase default for max_line_bytes #7483

enhancement(kubernetes_logs source): Increase default for max_line_bytes #7483

spencergilbert commented May 17, 2021

jszwedko left a comment

MOZGIII left a comment •

edited

jszwedko commented May 24, 2021

jszwedko commented May 24, 2021

jszwedko commented Jun 18, 2021

enhancement(kubernetes_logs source): Increase default for max_line_bytes #7483

enhancement(kubernetes_logs source): Increase default for max_line_bytes #7483

Conversation

spencergilbert commented May 17, 2021

jszwedko left a comment

Choose a reason for hiding this comment

MOZGIII left a comment • edited

Choose a reason for hiding this comment

jszwedko commented May 24, 2021

jszwedko commented May 24, 2021

jszwedko commented Jun 18, 2021

MOZGIII left a comment •

edited