Reconstruct split docker log lines in Promtail #2281
Doing some checking on this: unfortunately that metadata isn't present in Kubernetes logs. It looks like the most common alternative is looking for the missing trailing newline on split entries; moby/moby#34855 has more details and references to different solutions.
In our case
So we have a newline at the end of every split log line.
You got me scared for a second. I checked the log format of our AWS EKS (1.18) nodes and can confirm the behaviour regarding newlines is as expected: when split, the split entries have no newline.
I still don't like all of this, because relying on newlines seems unreliable (e.g. what happens if there is an intentional newline right at the 64k limit?).
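The trailing-newline heuristic discussed above can be sketched in a few lines of Python. This is only an illustration of the idea, not Promtail code, and the sample entries are made up:

```python
import json

def reassemble(lines):
    """Join docker json-file entries whose `log` field lacks a trailing
    newline with the entries that follow them: split parts carry no
    newline, and only the final part of a message ends with one."""
    out, buf = [], ""
    for line in lines:
        entry = json.loads(line)
        buf += entry["log"]
        if buf.endswith("\n"):
            out.append(buf.rstrip("\n"))
            buf = ""
    if buf:  # trailing partial that never got its terminating newline
        out.append(buf)
    return out

# A long message split by the runtime into two parts, plus a normal line:
lines = [
    '{"log": "part one of a long line ", "stream": "stdout"}',
    '{"log": "part two\\n", "stream": "stdout"}',
    '{"log": "a short line\\n", "stream": "stdout"}',
]
print(reassemble(lines))  # → ['part one of a long line part two', 'a short line']
```

As the comment above notes, this breaks down when a message legitimately ends exactly at the split boundary with its own newline, which is precisely the unreliability being discussed.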
Sorry, my bad: in our case the newlines were added by a PHP Monolog app at about 10 KB.
You might be interested in the work Vector did to address this problem: vectordotdev/vector#1488. I use it to ingest Docker logs into Loki and it seems to work fine.
The multi-line support recently merged into Promtail could possibly solve this; however, it has not been tested.
Can confirm that multiline can be used to restructure the JSON. After multiline, the line breaks need to be removed to keep the JSON valid.
Awesome @triplaaj! Thanks for this update!
This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
I have a related question: is the Loki docker-logging-driver using Promtail to read the json.log and then already applying pipeline stages to extract the message from the JSON lines? How can multiline logs be combined with the docker logging driver? Example: source json.log file in the Loki plugin root:
The config in docker-compose: logging:
driver: loki
options:
loki-batch-size: '102400'
loki-external-labels: container_name={{.Name}},category=dockerlogs
loki-pipeline-stages: |
- multiline:
# Identify zero-width space as first line of a multiline block. Note the string should be in single quotes.
firstline: '^\x{200B}\['
max_wait_time: 3s
loki-retries: '10'
loki-url: https://username:pass@loki.example.com/loki/api/v1/push
max-buffer-size: 10m
max-file: '50'
max-size: 100m
mode: non-blocking

@triplaaj The replace stage in your comment would also remove
@triplaaj @slim-bean I tried a combination of multiline and the json stage and that didn't seem to work. The multiline did work, but Promtail was apparently not able to JSON-parse or extract labels from the multiline JSON. Is there a way to make this work?
but the label is not applied for multiline entries (apparently?)
For easier debugging I've created a docker-compose repo that reproduces this problem: https://github.com/MurzNN/loki-long-lines
The workaround from #2281 (comment) should work well for most cases, but not always: if the long JSON line is split by Docker exactly on
Does an issue exist on the Docker side about allowing the max length of its log line to be extended?
I have the same issue when I want to parse the Elasticsearch log files. ES prints JSON logs and, in case of errors, it prints multiline JSON. I can't find out how to create the Promtail pipelines. My current pipeline part:

- match:
    selector: '{container_name="elasticsearch_es_1"}'
    stages:
      - multiline:
          firstline: '(^\{(.*))|([^\}\s]\s$)'
          max_wait_time: 1s
          source: eslog
      - replace:
          expression: '(\n)'
          replace: ''
          source: eslog
      - json:
          expressions:
            output: message
            time: timestamp
            level: level
            node_name: node.name
            #component: component
          source: eslog
Hi all, I have also been facing this same issue. The multiline parser suggested by @triplaaj seems to work in the majority of cases. Does this happen when using containerd instead of Docker and the
What do you mean by "this"? containerd also splits logs. Unfortunately I couldn't find many references online for how that happens, besides this rather old issue: containerd/cri#283
Hey, I'm currently facing the same issue with containerd and
Each log message that is split has its parts tagged with
Both Fluent-Bit and FileBeat have working options to enable reconstruction of split Docker as well as containerd/CRI logs: FileBeat
Fluent-Bit
It would be really cool if Promtail had similar options that work out of the box :)
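For containerd/CRI the partial tag makes reconstruction more reliable than the newline heuristic. A minimal sketch, assuming the standard CRI log line layout (`<timestamp> <stream> <P|F> <content>`, where `P` marks a partial part and `F` the full/final one); this is an illustration, not Promtail code:

```python
def reassemble_cri(lines):
    """Reassemble containerd/CRI log lines using the partial tag:
    accumulate content from P-tagged lines until an F-tagged line
    closes the message."""
    out, buf = [], ""
    for line in lines:
        ts, stream, tag, content = line.split(" ", 3)
        buf += content
        if tag == "F":
            out.append(buf)
            buf = ""
    return out

lines = [
    "2024-01-01T00:00:00.0Z stdout P first half of a long line ",
    "2024-01-01T00:00:00.1Z stdout F second half",
    "2024-01-01T00:00:01.0Z stdout F a short line",
]
print(reassemble_cri(lines))
```

Because the runtime tags the parts explicitly, there is no ambiguity about intentional newlines, which is why the FileBeat and Fluent-Bit options mentioned above can work out of the box.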
Hi, yes, sorry for the lack of context: by "this" I meant the long log line getting split.
For JSON, this works:
To keep Grafana happy and make it valid JSON again:
Result: you can now happily ingest JSON up to the Loki item size limit (which is 64 KB in Grafana Cloud at the time of writing) 🥳 🎉
@icebob This configuration seems to work fine for Elasticsearch, collecting error-level logs correctly:

pipeline_stages:
  - multiline:
      firstline: '^\{'
      max_wait_time: 3s
  - replace:
      expression: '([\n])'
      replace: ''
@nantiferov This will work well until the line break occurs exactly before the
And to properly handle this case, you would need to write a much more complex construction that counts all opening and closing braces... 😞
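The brace-counting idea mentioned above can be sketched like this; note that it has to ignore braces inside JSON string literals, or a `{` in a log message would throw the count off (illustrative Python, not a Promtail stage):

```python
def json_complete(text):
    """Return True when every opening brace in `text` has been closed,
    ignoring braces that occur inside JSON string literals."""
    depth, in_string, escaped = 0, False, False
    for ch in text:
        if escaped:
            escaped = False
        elif ch == "\\" and in_string:
            escaped = True
        elif ch == '"':
            in_string = not in_string
        elif not in_string:
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
    return depth == 0 and not in_string

# Two fragments of one message, as Docker might deliver them:
part1 = '{"msg": "a {brace} inside a string", "nested": {'
part2 = '"k": 1}}'
print(json_complete(part1))          # the first fragment alone is incomplete
print(json_complete(part1 + part2))  # the joined fragments form complete JSON
```

A buffer that appends fragments until `json_complete` returns True would handle splits landing before a closing brace, at the cost of noticeably more state than the regex-based multiline stage.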
@nantiferov thanks, currently we are using this (similar to your solution) and it also works:

- match:
    selector: '{container_name="elasticsearch_es_1"}'
    stages:
      - multiline:
          firstline: '(^\{)'
          max_wait_time: 1s
      - replace:
          expression: '(\n)'
          replace: ' '
      - json:
          expressions:
            message: message
            time: timestamp
            level: level
            node_name: node.name
            #component: component
      - labels:
          level:
          node_name:
      - timestamp:
          format: RFC3339Nano
          source: time
          action_on_failure: fudge
      - output:
          source: message
Hi, there is something that I don't understand. I am seeing the same behaviour with lines broken at 16k in Loki, but when I execute "kubectl logs my-pod" I see the whole lines. Is that correct? I am using EKS v1.21.14 and Loki/Promtail 2.5.0.
@icebob - will that work with Grafana Agent too, or is it Promtail-specific?
I think it should work with Grafana Agent as well.
@icebob this
For
It seems that
Got bit by this recently, and it seems to still be an issue with Loki and/or the Docker Loki log plugin. I also found a discussion about making the size limit in Docker configurable (moby/moby#32923); in that issue the Docker maintainers suggest that the best way forward is for consumers (plugins among them) to start handling split log lines. From the rest of the discussion it also seems like split log lines are marked in some way: https://github.com/moby/moby/blob/cd14846d0cde098bb83037d99104db6fadfef039/daemon/logger/copier.go#L139-L152
Starting in moby/moby#22982, Docker now splits log lines longer than ~16 KB.
This can totally break JSON processing of these lines in Promtail.
There are a few moving pieces here to support this; the big one is adding support for multi-line logs in Promtail, which we have been avoiding, mostly because there were other valid solutions like "show context" in Explore. However, not being able to parse JSON in Promtail because of this split is certainly something we want to fix, so I think we are going to finally have to go down the multi-line road.
It does appear Docker has some specific metadata to help with this, however: moby/moby@0b4b0a7, which could help special-case reconstructing split Docker lines.
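A minimal sketch of how a consumer could use that kind of partial-log metadata (an ID shared by all parts of one message, an ordinal, and a `last` flag) to reassemble split lines. The dict field names here are hypothetical stand-ins, not moby's actual wire format:

```python
def reassemble_partials(entries):
    """Group log entries by partial-metadata ID, order parts by ordinal,
    and emit the joined message once the part marked `last` arrives.
    Entries without metadata pass through unchanged."""
    pending = {}
    out = []
    for e in entries:
        meta = e.get("partial_metadata")
        if meta is None:
            out.append(e["line"])
            continue
        parts = pending.setdefault(meta["id"], [])
        parts.append((meta["ordinal"], e["line"]))
        if meta["last"]:
            parts.sort()  # ordinals guarantee correct order even if delivery reorders
            out.append("".join(text for _, text in parts))
            del pending[meta["id"]]
    return out

entries = [
    {"line": "first ", "partial_metadata": {"id": "a1", "ordinal": 1, "last": False}},
    {"line": "second", "partial_metadata": {"id": "a1", "ordinal": 2, "last": True}},
    {"line": "whole line", "partial_metadata": None},
]
print(reassemble_partials(entries))  # → ['first second', 'whole line']
```

Compared to the newline and brace-counting heuristics discussed earlier in the thread, explicit metadata like this is unambiguous, which is why it would be the preferred basis for a fix.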