
Reconstruct split docker log lines in Promtail #2281

Closed
slim-bean opened this issue Jul 1, 2020 · 31 comments · Fixed by #12374
Labels: component/promtail, keepalive (an issue or PR that will be kept alive and never marked as stale)

Comments

@slim-bean
Collaborator

As of moby/moby#22982, Docker now splits log lines longer than ~16kb.

This can totally break JSON processing of these lines in Promtail.

There are a few moving pieces here to support this. The big one is adding support for multi-line logs in Promtail, which we have been avoiding, mostly because there were other valid solutions like "show context" in Explore.

However, not being able to parse JSON in Promtail because of this split is certainly something we want to fix, so I think we are going to finally have to go down the multi-line road.

It does appear Docker has some specific metadata to help with this, however: moby/moby@0b4b0a7, which could help special-case reconstructing split Docker lines.
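
For illustration, a minimal Go sketch of how a log consumer could use that partial-message metadata to reassemble split lines. The PartialLogMetaData struct below is an assumption mirroring the fields moby attaches to split messages (ID, Ordinal, Last); this is not Promtail code.

package partiallog

import "strings"

// PartialLogMetaData is an assumed mirror of the metadata moby attaches
// to split messages: ID groups the parts of one logical line, Ordinal
// orders them, and Last marks the final part.
type PartialLogMetaData struct {
	ID      string
	Ordinal int
	Last    bool
}

// Message is one entry as emitted by the logging driver.
type Message struct {
	Line    string
	Partial *PartialLogMetaData // nil for ordinary, unsplit lines
}

// Reassembler buffers partial parts by ID and emits completed lines.
type Reassembler struct {
	pending map[string]*strings.Builder
}

func NewReassembler() *Reassembler {
	return &Reassembler{pending: map[string]*strings.Builder{}}
}

// Push returns the reconstructed line and true once a message is complete.
func (r *Reassembler) Push(m Message) (string, bool) {
	if m.Partial == nil {
		return m.Line, true // not split: pass through unchanged
	}
	buf, ok := r.pending[m.Partial.ID]
	if !ok {
		buf = &strings.Builder{}
		r.pending[m.Partial.ID] = buf
	}
	buf.WriteString(m.Line)
	if m.Partial.Last {
		delete(r.pending, m.Partial.ID)
		return buf.String(), true
	}
	return "", false // more parts to come
}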

@slim-bean added the keepalive label Jul 1, 2020
@slim-bean
Collaborator Author

Doing some checking on this: unfortunately that metadata isn't present in Kubernetes logs. It looks like the most common alternative is looking for a \n character at the end of the log line; if the line is split, the partial parts don't have a trailing newline:

moby/moby#34855 has some more details and references to different solutions.
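
As a sketch of that trailing-newline heuristic (an illustration only, assuming Docker's json-file format with the log/stream/time fields shown in the comments below): buffer entries per stream and emit only once a part ends in a newline.

package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"strings"
)

// jsonLogEntry mirrors one line written by Docker's json-file log driver.
type jsonLogEntry struct {
	Log    string `json:"log"`
	Stream string `json:"stream"`
	Time   string `json:"time"`
}

func main() {
	// Partial lines are buffered per stream until a part ends with '\n'.
	partial := map[string]*strings.Builder{}
	scanner := bufio.NewScanner(os.Stdin)
	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for scanner.Scan() {
		var e jsonLogEntry
		if err := json.Unmarshal(scanner.Bytes(), &e); err != nil {
			continue // skip malformed lines in this sketch
		}
		buf, ok := partial[e.Stream]
		if !ok {
			buf = &strings.Builder{}
			partial[e.Stream] = buf
		}
		buf.WriteString(e.Log)
		if strings.HasSuffix(e.Log, "\n") {
			fmt.Printf("[%s] %s", e.Stream, buf.String())
			buf.Reset()
		}
	}
}

As the next comments show, the assumption that partial parts carry no trailing newline holds for some Docker setups but not all, so it should be verified against the actual json.log output first.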

@123BLiN

123BLiN commented Oct 29, 2020

In our case (Docker version 18.09.9, build 039a7df9ba) we have docker logs like:

{"log":"{\"@timestamp\":\"2020-10-28T15:53:14.460060-04:00\",first part of the huge one line JSON here\n","stream":"stderr","time":"2020-10-28T19:53:13.080198941Z"}
{"log":"second part of huge one line JSON here\n","stream":"stderr","time":"2020-10-28T19:53:13.080332682Z"}
{"log":"last part of huge one line JSON here\"}\n","stream":"stderr","time":"2020-10-28T19:53:13.080332682Z"}

So we have a newline at the end of every split log line.
Just FYI: this solution may not work for everyone. You should check your docker JSON logs first.

@cf-sewe

cf-sewe commented Oct 29, 2020

> So we have a newline at the end of every split log line.
> Just FYI: this solution may not work for everyone. You should check your docker JSON logs first.

You got me scared for a second. I checked the log format of our AWS EKS (1.18) nodes and can confirm the behaviour regarding newlines is as expected: when split, the split entries have no trailing newline.

{"log":"{\"@timestamp\":\"2020-10-29T11:12:03.015Z\",\"@version\":\"1\",\"message\":\"==== INCREMENTAL DUMP ====\\n\\nauditSlidingTimeFrameTimer ... 4)\\njava.util.concurr","stream":"stdout","time":"2020-10-29T11:12:03.01653623Z"}

I still don't like all of this, because relying on newlines seems unreliable (e.g. what happens if there is an intentional newline right at the 64k limit?).

@123BLiN

123BLiN commented Oct 30, 2020

Sorry, my bad: in our case the newlines were added by the PHP Monolog app at about 10Kb.
I like containerd's solution and log format: they have a P/F field that tells the collector whether a line is partial or full.

@nivekuil

You might be interested in the work Vector did to address this problem: vectordotdev/vector#1488. I use it to ingest docker logs into Loki and it seems to work fine.

@slim-bean removed the keepalive label Jan 21, 2021
@slim-bean
Copy link
Collaborator Author

The multi-line support recently merged into Promtail could possibly solve this; however, it has not been tested.

@triplaaj

Can confirm that the multiline stage can be used to reconstruct the JSON. After multiline, the line breaks need to be removed to keep the JSON valid.

- multiline:
    # Heuristic start-of-block: a line beginning with '{', plus a
    # fallback pattern for lines that did not end a JSON object.
    firstline: '(^\{(.*))|([^\}\s]\s$)'
- replace:
    # Remove the newlines the multiline stage inserts between the
    # joined parts, so the result is a single valid JSON line again.
    expression: '(\n)'
    replace: ''

@slim-bean
Collaborator Author

Awesome @triplaaj! Thanks for this update!!!

@stale

stale bot commented Mar 20, 2021

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

The stale bot added the stale label Mar 20, 2021
@pgassmann
Contributor

pgassmann commented Mar 22, 2021

I have a related question: is the Loki docker-logging-driver using Promtail to read the json.log and already applying pipeline stages to extract the message from the JSON lines?
We have Java applications and want to use the multiline stage to combine the stacktraces etc. into one message again. The application is configured to log with the zero-width-space character, as recommended in the multiline docs.

How can multiline logs be combined with the docker-logging driver?

Example:

Source json.log file in the loki plugin root: /var/lib/docker/plugins/f7623531faa0d4bc052a32f5878645955bd9c81296b88992d29dde8f6b7c2836/rootfs/var/log/docker/a7491ff6dd01dc84f9d7a14a6c64ab1d470498b196a71826d96ef0049006fdb5/json.log

{"log":"<U+200B>[22-03-2021 10:07:32.467] [main] \u001b[34mINFO \u001b[0;39m \u001b[36mo.e.j.s.AbstractConnector\u001b[0;39m - Started ServerConnector@642505c7{HTTP/1.1, (http/1.1)}{0.0.0.0:8086}\n","stream":"stdout","time":"2021-03-22T10:07:32.468129427Z"}
{"log":"<U+200B>[22-03-2021 10:07:32.468] [main] \u001b[34mINFO \u001b[0;39m \u001b[36mo.e.j.s.Server\u001b[0;39m - Started @2626ms\n","stream":"stdout","time":"2021-03-22T10:07:32.468132422Z"}
{"log":"WARNING: An illegal reflective access operation has occurred\n","stream":"stderr","time":"2021-03-22T10:07:32.96668075Z"}
{"log":"WARNING: Illegal reflective access by com.thoughtworks.xstream.core.util.Fields (file:/app/libs/xstream-1.4.11.1.jar) to field java.util.TreeMap.comparator\n","stream":"stderr","time":"2021-03-22T10:07:32.966686732Z"}
{"log":"WARNING: Please consider reporting this to the maintainers of com.thoughtworks.xstream.core.util.Fields\n","stream":"stderr","time":"2021-03-22T10:07:32.966688455Z"}
{"log":"WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations\n","stream":"stderr","time":"2021-03-22T10:07:32.966689677Z"}
{"log":"WARNING: All illegal access operations will be denied in a future release\n","stream":"stderr","time":"2021-03-22T10:07:32.966690118Z"}

The config in docker-compose:

    logging:
      driver: loki
      options:
        loki-batch-size: '102400'
        loki-external-labels: container_name={{.Name}},category=dockerlogs
        loki-pipeline-stages: |
          - multiline:
              # Identify zero-width space as first line of a multiline block. Note the string should be in single quotes.
              firstline: '^\x{200B}\['
              max_wait_time: 3s
        loki-retries: '10'
        loki-url: https://username:pass@loki.example.com/loki/api/v1/push
        max-buffer-size: 10m
        max-file: '50'
        max-size: 100m
        mode: non-blocking

@triplaaj The replace stage in your comment would also remove \n characters that are part of the message, like in my example.
@slim-bean can you prevent this from becoming stale & closed?

The stale bot removed the stale label Mar 23, 2021
@patrickhuy
Contributor

patrickhuy commented Apr 9, 2021

@triplaaj @slim-bean I tried a combination of multiline and the json stage, and that didn't seem to work. The multiline stage did work, but Promtail was apparently not able to JSON-parse / extract labels from the multiline JSON. Is there a way to make this work?
I had something like this:

- multiline:
    firstline: '(^\{(.*))|([^\}\s]\s$)'
- replace:
    expression: '(\n)'
    replace: ''
- json:
    expressions:
      msg: msg
      level: level
- labels:
    level:

but the label is not applied for multiline entries (apparently?).
Is there something that could be done to make this work?

@stale

stale bot commented Jun 2, 2021

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

The stale bot added the stale label Jun 2, 2021
The stale bot closed this as completed Jun 9, 2021
cyriltovena pushed a commit to cyriltovena/loki that referenced this issue Jun 11, 2021
@dannykopping reopened this Jul 29, 2021
The stale bot removed the stale label Jul 29, 2021
@dannykopping added the component/promtail and keepalive labels Jul 29, 2021
@MurzNN

MurzNN commented Sep 29, 2021

For easier debugging I've created a docker-compose repo that reproduces this problem: https://github.com/MurzNN/loki-long-lines

@MurzNN

MurzNN commented Sep 29, 2021

The workaround from #2281 (comment) should work well for most cases, but not always: if the long JSON line is split by Docker exactly at a { character, it will not merge the JSON into one line.

@MurzNN

MurzNN commented Sep 29, 2021

Does an issue exist on the Docker side to allow extending the max length of its log line?

@icebob

icebob commented Jan 13, 2022

I have the same issue when I want to parse the Elasticsearch log files. ES prints JSON logs, and in the case of errors it prints multiline JSON. I can't figure out how to create the Promtail pipeline stages.

My current pipeline part:

    - match:
        selector: '{container_name="elasticsearch_es_1"}'
        stages:
          - multiline:
              firstline: '(^\{(.*))|([^\}\s]\s$)'
              max_wait_time: 1s
              source: eslog
          - replace:
              expression: '(\n)'
              replace: ''
              source: eslog
          - json:
              expressions:
                output: message
                time: timestamp
                level: level
                node_name: node.name
                #component: component
              source: eslog

@hgranillo

hgranillo commented Jan 19, 2022

Hi all, I've also been facing this same issue. The multiline parser suggested by triplaaj seems to work in the majority of cases.

Does this happen when using containerd instead of Docker with the cri parser? It might be a good excuse to just switch container runtimes.

@patrickhuy
Contributor

> Does this happen when using containerd instead of Docker with the cri parser? It might be a good excuse to just switch container runtimes.

What do you mean by "this"? containerd also splits logs. Unfortunately, I couldn't find many references online about how that happens besides this rather old issue: containerd/cri#283

@stefan-fast

stefan-fast commented Jan 21, 2022

Hey, I'm currently facing the same issue with containerd and the cri parser. containerd also splits log lines greater than 16kb by default. I found a parameter max_container_log_line_size here, but I'm unable to change it because of missing access to that config in my cloud environment.
From what I can tell, containerd logs are split in the following way:

2022-01-21T10:10:24.993695084Z stdout P <very long log message 1>
2022-01-21T10:10:24.993695085Z stdout P <still log message 1>
2022-01-21T10:10:24.993846586Z stdout F <final part of log message 1>

Each log message that is split has its parts tagged with P and a final part tagged F, indicating that the next line is a new log message.
Shorter log messages that don't need splitting have the F tag right away:

2022-01-21T10:10:24.993846587Z stdout F <short log message 2>
2022-01-21T10:10:24.993846587Z stdout F <short log message 3>

Both Fluent-Bit and FileBeat have working options to enable reconstruction of split Docker as well as containerd/cri logs:

FileBeat

* [cri.parse_flags](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-docker.html#_cri_parse_flags) (cri)
* [combine_partial](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-docker.html#_combine_partial) (docker)

Fluent-Bit

* [multiline-parsing](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/multiline-parsing) (`docker` & `cri` parser)

It would be really cool if Promtail had similar options that work out of the box :)
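
For illustration, a minimal Go sketch of reassembling the CRI format shown above (timestamp, stream, P/F tag, content), in the spirit of what the collectors linked above do; this is an assumed consumer, not Promtail's actual cri stage:

package crilog

import (
	"fmt"
	"strings"
)

// Joiner reassembles CRI log lines of the form
// "<RFC3339 time> <stdout|stderr> <P|F> <content>", buffering P(artial)
// parts per stream until the F(inal) part arrives.
type Joiner struct {
	partial map[string]*strings.Builder // keyed by stream
}

func NewJoiner() *Joiner {
	return &Joiner{partial: map[string]*strings.Builder{}}
}

// Push returns the complete message and true when an F-tagged part closes it.
func (j *Joiner) Push(line string) (string, bool, error) {
	fields := strings.SplitN(line, " ", 4)
	if len(fields) != 4 {
		return "", false, fmt.Errorf("malformed cri line: %q", line)
	}
	stream, tag, content := fields[1], fields[2], fields[3]
	buf, ok := j.partial[stream]
	if !ok {
		buf = &strings.Builder{}
		j.partial[stream] = buf
	}
	buf.WriteString(content)
	if tag == "F" {
		msg := buf.String()
		buf.Reset()
		return msg, true, nil
	}
	return "", false, nil // P: keep buffering
}

Feeding the three P/P/F example lines above through Push would yield one reconstructed <log message 1>.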

@hgranillo

> Does this happen when using containerd instead of Docker with the cri parser? It might be a good excuse to just switch container runtimes.

> What do you mean by "this"? containerd also splits logs. Unfortunately, I couldn't find many references online about how that happens besides this rather old issue: containerd/cri#283

Hi, yes, sorry for the lack of context: by "this" I meant the long log line getting split.

@BitProcessor

For JSON, this works:

- multiline:
    # Identify zero-width space as first line of a multiline block.
    # Note the string should be in single quotes.
    firstline: '^\x{200B}\{'
    max_wait_time: 3s
- replace:
    expression: '^(\x{200B})'
    replace: ''
- replace:
    expression: '([\n])'
    replace: ''

To keep Grafana happy and make it valid JSON again:

* the zero-width space character is removed again
* the newlines inserted in the Promtail reconstruction are removed again

Result: you can now happily ingest JSON up to the Loki item size limit (which is 64KB in Grafana Cloud at the time of writing) 🥳 🎉

@nantiferov

nantiferov commented Oct 17, 2022

@icebob This configuration seems to work fine for Elasticsearch, collecting error-level logs correctly:

    pipeline_stages:
      - multiline:
          firstline: '^\{'
          max_wait_time: 3s
      - replace:
          expression: '([\n])'
          replace: ''

@MurzNN

MurzNN commented Oct 17, 2022

@nantiferov This will work well until a line break occurs exactly before a { character in the middle of a log line's text. The probability of this is small, but still not zero!

And to handle this case properly, you would need to write a much more complex construction that counts all opening and closing braces... 😞
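
A sketch of what that more complex construction could look like: tracking brace depth while skipping braces that occur inside JSON strings or after escape characters. This is an illustration only, under the assumption that each log line is a single JSON object (arrays and top-level scalars are ignored):

package jsoncheck

// Complete reports whether s contains one balanced JSON object,
// ignoring braces inside strings and after escape characters.
func Complete(s string) bool {
	depth, started := 0, false
	inString, escaped := false, false
	for _, r := range s {
		switch {
		case escaped:
			escaped = false
		case inString:
			if r == '\\' {
				escaped = true
			} else if r == '"' {
				inString = false
			}
		case r == '"':
			inString = true
		case r == '{':
			depth++
			started = true
		case r == '}':
			depth--
			if depth < 0 {
				return false // stray closing brace: never completes
			}
		}
	}
	return started && depth == 0
}

A collector could buffer split parts until Complete returns true, instead of relying on a ^\{ firstline heuristic alone.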

@icebob

icebob commented Oct 17, 2022

@nantiferov thanks, currently we are using this (similar to your solution) and it also works:

    - match:
        selector: '{container_name="elasticsearch_es_1"}'
        stages:
          - multiline:
              firstline: '(^\{)'
              max_wait_time: 1s
          - replace:
              expression: '(\n)'
              replace: ' '
          - json:
              expressions:
                message: message
                time: timestamp
                level: level
                node_name: node.name
                #component: component
          - labels:
              level:
              node_name:
          - timestamp:
              format: RFC3339Nano
              source: time
              action_on_failure: fudge
          - output:
              source: message

@mbonanata

Hi, there is something that I don't understand. I am seeing the same behaviour with lines broken at 16k in Loki, but when I execute "kubectl logs my-pod" I see the whole lines. Is that correct?

I am using EKS v1.21.14 and Loki/Promtail 2.5.0.

@marinnedea

@icebob - will that work with Grafana Agent too, or is it Promtail-specific?

@icebob

icebob commented Jul 11, 2023

I think it should work with Grafana Agent as well.

@hu-chia

hu-chia commented Oct 24, 2023

@icebob this replace stage replaces all line breaks with a space? Why not just remove them?

@stefan-fast

> Hey, I'm currently facing the same issue with containerd and the cri parser. [...] It would be really cool if Promtail had similar options that work out of the box :)

For the cri stage, reconstructing partial logs is implemented in the latest versions of Promtail (see the docs for the cri stage) and, for me, works out of the box without any configuration. Thank you! 🎉

@jonaslb
Contributor

jonaslb commented Mar 13, 2024

It seems that docker_sd_configs does combine the split logs from docker now, without any multiline configuration. It just seems to do it in an incorrect manner (see #12197).

@andoks

andoks commented Mar 19, 2024

Got bit by this recently; it seems to still be an issue with Loki and/or the Loki Docker log plugin.

I also found a discussion about making the size limit in Docker configurable (moby/moby#32923), and in that issue the Docker maintainers suggest that the best way forward is for consumers (plugins among them) to start handling split log lines. From the rest of the discussion it also seems like split log lines are marked in some way: https://github.com/moby/moby/blob/cd14846d0cde098bb83037d99104db6fadfef039/daemon/logger/copier.go#L139-L152
