
Docker_mode to recombine multiline records in json-log from docker #1115

Open
epcim opened this issue Feb 15, 2019 · 5 comments

@epcim commented Feb 15, 2019

Problem
If an application running in Kubernetes logs a multiline message, Docker splits it into multiple json-log records.

The actual output from the application:

[2019-02-15 10:36:31.224][38][debug][http] source/common/http/conn_manager_impl.cc:521] [C463][S12543431219240717937] request headers complete (end_stream=true):
':authority', 'customer1.demo1.acme.us'
':path', '/api/config/namespaces/test/routes'
':method', 'GET'
'user-agent', 'Go-http-client/1.1'
'cookie', 'X-ACME-GW-AUTH=eyJpc3N1ZWxxxxxxxx948b94'
'accept-encoding', 'gzip'
'connection', 'close'

In the Docker json-log, which Fluent Bit's in_tail then reads, this ends up as follows (the example below comes from a different request than the one above):

{"log":"[2019-02-15 11:00:08.688][9][debug][router] source/common/router/router.cc:303] [C0][S14319188767040639561] router decoding headers:\n","stream":"stderr","time":"2019-02-15T11:00:08.688733409Z"}
{"log":"':method', 'POST'\n","stream":"stderr","time":"2019-02-15T11:00:08.688736209Z"}
{"log":"':path', '/envoy.api.v2.ClusterDiscoveryService/StreamClusters'\n","stream":"stderr","time":"2019-02-15T11:00:08.688757909Z"}
{"log":"':authority', 'xds_cluster'\n","stream":"stderr","time":"2019-02-15T11:00:08.688760809Z"}
{"log":"':scheme', 'http'\n","stream":"stderr","time":"2019-02-15T11:00:08.688763609Z"}
{"log":"'te', 'trailers'\n","stream":"stderr","time":"2019-02-15T11:00:08.688766209Z"}
{"log":"'content-type', 'application/grpc'\n","stream":"stderr","time":"2019-02-15T11:00:08.688768809Z"}
{"log":"'x-envoy-internal', 'true'\n","stream":"stderr","time":"2019-02-15T11:00:08.688771609Z"}
{"log":"'x-forwarded-for', '192.168.6.6'\n","stream":"stderr","time":"2019-02-15T11:00:08.688774309Z"}
{"log":"\n","stream":"stderr","time":"2019-02-15T11:00:08.688777009Z"}

According to the documentation, Docker_Mode On should recombine split Docker log lines before passing them to any parser as configured above.

I would expect this to cover the case above as well, however it does not. My configuration is provided below.

Describe the solution you'd like

in_tail with Docker_Mode should offer the possibility to read Docker's json-log as a stream of the original text. The JSON parser here would act only as a pre-processor that buffers the "log" key, so that multiline regexp patterns can be applied later.
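
The sketch below (plain Python, not Fluent Bit code) illustrates the kind of pre-processing this request describes: parse each json-log line, buffer the "log" fragments, and start a new record whenever a fragment matches the application's first-line pattern. The first-line regexp is an assumption based on the Envoy log format shown above.

import json
import re
import sys

# Assumed first-line pattern: the application starts each message with a
# bracketed timestamp such as "[2019-02-15 11:00:08.688]".
FIRST_LINE = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\]")

def recombine(lines):
    """Yield (time, stream, message) tuples with multiline messages re-joined."""
    buf = None  # (time, stream, [fragments]) of the record being built
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        fragment = record["log"]
        if buf is None or FIRST_LINE.match(fragment):
            if buf is not None:
                yield buf[0], buf[1], "".join(buf[2])
            buf = (record["time"], record["stream"], [fragment])
        else:
            buf[2].append(fragment)
    if buf is not None:
        yield buf[0], buf[1], "".join(buf[2])

if __name__ == "__main__":
    # Feed the json-log lines above on stdin; each multiline message comes
    # out as one record again, keeping the timestamp of its first fragment.
    for time, stream, message in recombine(sys.stdin):
        print(json.dumps({"time": time, "stream": stream, "log": message}))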

Describe alternatives you've considered

I believe this problem can be avoided if:

  1. docker logs are sent directly to fluentd (Docker fluentd logging driver, https://docs.docker.com/config/containers/logging/fluentd/; see the daemon.json sketch below)
  2. docker logs are sent to journal/syslog, etc.
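
For reference, switching the Docker daemon to the fluentd logging driver is the kind of daemon-level change meant in point 1, roughly as below in /etc/docker/daemon.json (the address is a placeholder):

{
  "log-driver": "fluentd",
  "log-opts": {
    "fluentd-address": "127.0.0.1:24224"
  }
}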

However:

  • on hosted Kubernetes platforms you are not allowed to change the Docker logging driver (Azure AKS, for example).
  • on hosted environments, mixing your Docker logs with system logs (journal, syslog) is not desired.

Fluent Bit FILTERS are applied after parsing, so they cannot transform the stream early enough.

Additional context

The Fluent Bit config I am using:

  input-kubernetes.conf: |
    [INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/containers/*.log
        Parser            docker
        DB                /var/log/flb_kube.db
        Skip_Long_Lines   Off
        Docker_Mode       On
        Refresh_Interval  10
        Chunk_Size        32k
        Buffer_Max_Size   2M
  filter-kubernetes.conf: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc.cluster.local:443
        Merge_Log           On
        K8S-Logging.Parser  On

    [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On
        # Command      |  Decoder | Field | Optional Action
        # =============|==================|=================
        Decode_Field_As   escaped_utf8    log    do_next
        Decode_Field_As   escaped         log    do_next
        Decode_Field_As   json            log
  
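For illustration, with the configuration above each json-log line becomes its own record once the docker parser runs; the second line from the example above ends up roughly as the record below (exact rendering depends on the output plugin), carrying only a fragment of the original message in the log key. This is why a later multiline regexp has nothing adjacent to join against.

{"log":"':method', 'POST'\n","stream":"stderr","time":"2019-02-15T11:00:08.688736209Z"}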

@epcim commented Feb 15, 2019

Related:

@epcim commented Feb 19, 2019

Repository that can be used for testing: https://github.com/epcim/fluentbit-sandbox

@etwillbefine commented May 26, 2019

Hey, I'm struggling with the same thing right now. Is there any planned feature or bug fix for this? Docker_Mode On is exactly what I want, so that my parsers can extract fields afterwards. I'm struggling to find any solution for Spring Boot stack traces with Fluent Bit at all (using either Multiline or Docker_Mode). Any update or feedback would be appreciated.

@sysword commented Oct 31, 2019

I'm struggling with this right now. Is there any solution for multiline logs in Kubernetes?

@sreedharbukya commented Nov 4, 2019

I am also stuck on the same issue. The multiline log parser is not working in Kubernetes.
