enhancement(kubernetes_logs source): Increase default for max_line_bytes #7483
Conversation
Signed-off-by: Spencer Gilbert <spencer.gilbert@gmail.com>
I feel like there might be something else going on here, as the line limit does indeed appear to be 16 KiB rather than 16K characters:
Used down here:
It looks like log drivers can override that value, but the cloudwatch logs one is the only one I see doing that.
🤔
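(For reference, one way to check where that limit is applied is to compare the byte length of the `log` field against the byte length of the whole serialized entry in the container's json-file log. A rough sketch, assuming serde/serde_json and a log file copied out of /var/lib/docker/containers/ rather than the real path:)

```rust
use std::fs::File;
use std::io::{BufRead, BufReader};

use serde::Deserialize;

/// Shape of one entry written by Docker's `json-file` log driver.
#[derive(Deserialize)]
struct JsonFileEntry {
    log: String,
    stream: String,
    time: String,
}

fn main() -> std::io::Result<()> {
    // Path is illustrative; the real file lives under
    // /var/lib/docker/containers/<container-id>/<container-id>-json.log.
    let file = File::open("container-json.log")?;

    for line in BufReader::new(file).lines() {
        let raw = line?;
        let entry: JsonFileEntry =
            serde_json::from_str(&raw).expect("not a json-file entry");

        // Compare the size of the message itself with the size of the
        // whole serialized line (message + stream + time + JSON framing).
        println!(
            "log bytes: {:5}  total line bytes: {:5}  stream: {}  time: {}",
            entry.log.len(),
            raw.len(),
            entry.stream,
            entry.time,
        );
    }
    Ok(())
}
```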
This should work.
If this keeps creating issues, I suggest taking a deeper look at the actual k8s source.
Here's a good pointer.
Note that in the surrounding Kubernetes code, the buffers are dynamically sized (https://github.com/kubernetes/kubernetes/blob/c5bd36ef90140a7f3b3d676d319196e684d3d802/pkg/kubelet/kuberuntime/logs/logs.go#L309).
Also, the limit of 16 KB mentioned in the code is for the log line contents only, not for the surrounding metadata (like the timestamp). However, I'm pretty positive that limit is intended to be a byte-size limit rather than a char-size limit by design. That said, things might've changed, or there may be a bug somewhere, so it makes sense to assume the worst case on our end - which would be the maximum UTF-8 character length. I'm all for this change.
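(To make that worst-case reasoning concrete: if the 16 KiB limit were ever applied to characters instead of bytes, a line of maximum-length UTF-8 characters could expand to four times that many bytes on disk. A small sketch of the arithmetic; the 4-byte figure is just the UTF-8 maximum, not something Docker is known to do:)

```rust
fn main() {
    // Docker's json-file driver splits container output at 16 KiB.
    let limit_bytes: usize = 16 * 1024;

    // Worst case *if* the limit were counted in characters rather than
    // bytes: every character could be a 4-byte UTF-8 sequence.
    let max_utf8_bytes_per_char = 4;
    let worst_case_bytes = limit_bytes * max_utf8_bytes_per_char;

    println!("limit as bytes:              {limit_bytes}");      // 16384
    println!("worst case if counted chars: {worst_case_bytes}"); // 65536
}
```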
Thanks @MOZGIII !
This seems to be for reading logs, yes? I'm assuming this codepath is hit by
I'm not convinced this is the worst-case scenario though. It would be if they were actually counting characters instead of bytes, but I don't see that happening. It seems more likely to me that the docker metadata isn't counted in that limit and so even this new limit could be too low. We should be able to verify that empirically. I was curious what fluent-bit does and they seem to:
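(Related background: when Docker splits a long line at 16 KiB, only the final json-file entry for that logical line carries a trailing newline in its `log` field, which is one way downstream collectors can stitch the pieces back together. A minimal sketch of that merge idea, not tied to Vector's or fluent-bit's actual implementation:)

```rust
/// Merge json-file `log` fields back into complete logical lines.
/// Docker marks the final piece of a split line with a trailing '\n'.
fn merge_partial_lines<I>(entries: I) -> Vec<String>
where
    I: IntoIterator<Item = String>,
{
    let mut lines = Vec::new();
    let mut buffer = String::new();

    for log in entries {
        buffer.push_str(&log);
        if buffer.ends_with('\n') {
            // A complete logical line; strip the newline and emit it.
            lines.push(buffer.trim_end_matches('\n').to_string());
            buffer.clear();
        }
    }
    // Anything left over is an unterminated partial line.
    if !buffer.is_empty() {
        lines.push(buffer);
    }
    lines
}

fn main() {
    let entries = vec![
        "first 16 KiB of a long line...".to_string(),
        "...the rest of it\n".to_string(),
        "a short line\n".to_string(),
    ];
    for line in merge_partial_lines(entries) {
        println!("{} bytes: {:?}", line.len(), line);
    }
}
```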
I verified that it seems to be 16 KiB before metadata by running:
and looking at the raw JSON logs on disk. I also verified that it seems to be bytes by running:
And observed that it sliced after 5463 characters / 16387 bytes. It is odd that it ended up a few bytes past 16 KiB, but that might be due to it slicing in the middle of a multi-byte character and the way that
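(For anyone wanting to reproduce this, one option is to emit a single long line built from a 3-byte UTF-8 character: a byte-based cutoff should land near 16384 / 3 ≈ 5461 characters, while a character-based cutoff would land near 16384 characters. A hypothetical generator along those lines:)

```rust
fn main() {
    // '€' is a 3-byte character in UTF-8 (0xE2 0x82 0xAC).
    let ch = '€';
    assert_eq!(ch.len_utf8(), 3);

    // Emit one line of 20,000 such characters (~60,000 bytes), well past
    // the 16 KiB json-file limit, then inspect where the runtime split it.
    let line: String = std::iter::repeat(ch).take(20_000).collect();
    println!("{line}");

    eprintln!(
        "emitted {} chars / {} bytes",
        line.chars().count(),
        line.len()
    );
}
```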
Given we haven't heard back from the user with some example long log lines to figure out what's going on here, I'm inclined to close this and let it bubble back up if another user reports it. What do you think @spencergilbert?
Closes #6967, based on this comment #6966 (comment)