
Nested JSON parsing stopped working with fluent/fluentd-kubernetes-daemonset:v0.12-debian-elasticsearch #2073

Closed
arikunbotify opened this issue Jul 15, 2018 · 14 comments


@arikunbotify

Hi,
I'm using fluent/fluentd-kubernetes-daemonset:v0.12-debian-elasticsearch, and after updating to the new image (based on 0.12.43, and after solving the UID=0 issue reported here) I've stopped getting nested objects parsed. The kubernetes and docker fields are parsed, but the inner message in "log", which is standard JSON from the application I run, is no longer parsed.
Has anyone encountered this issue with the new image?

(Also, the image based on 0.12.33 doesn't start at all for some reason, and I can't find older version tags to try.)

Best,
AA

@repeatedly
Member

The problem may be that kubernetes-metadata-filter introduced breaking changes.
Using the parser filter resolves the problem. See #2021

@arikunbotify
Author

Thanks. In case anyone else wonders how to combine nested JSON parsing with Kubernetes fields, this is what works for me (in kubernetes.conf):

    <filter kubernetes.**>
      @type kubernetes_metadata
      @id filter_kube_metadata
    </filter>

    <filter kubernetes.var.log.containers.**>
      @type parser
      <parse>
        @type json
        json_parser json
      </parse>
      replace_invalid_sequence true
      emit_invalid_record_to_error false
      key_name log
      reserve_data true
    </filter>

@calinah

calinah commented Jul 27, 2018

Hey @arikunbotify, can you please share your full configuration? I have been troubleshooting this problem for days and my log messages are not passed as JSON to either Elasticsearch or stdout. I added the filter you suggested to my configuration, but still no luck. I can see the messages are escaped correctly, but when passed on they arrive as text, not JSON.

@arikunbotify
Author

arikunbotify commented Aug 1, 2018

@calinah I totally forgot to mention I switched to:
fluent/fluentd-kubernetes-daemonset:v1.2.2-debian-elasticsearch

I think this is the relevant config part:

    <match fluent.**>
      @type null
    </match>

    <source>
      @type tail
      @id in_tail_container_logs
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      read_from_head false
      <parse>
        @type json
        json_parser json
        time_format %Y-%m-%dT%H:%M:%S.%NZ
      </parse>
    </source>

    <filter kubernetes.**>
      @type kubernetes_metadata
      @id filter_kube_metadata
    </filter>

    <filter kubernetes.var.log.containers.**>
      @type parser
      <parse>
        @type json
        json_parser json
      </parse>
      replace_invalid_sequence true
      emit_invalid_record_to_error false
      key_name log
      reserve_data true
    </filter>

Hope it helps.

@Datise

Datise commented Oct 23, 2018

@arikunbotify Sorry to dredge this up, but what is your strategy for adding the filter to the daemonset? I'm attempting to load it via a ConfigMap and am not having much luck. I would love to avoid the initContainer solution I see here:
fluent/fluentd-kubernetes-daemonset#174 (comment)
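
For reference, one common way to do this (a sketch only, with hypothetical names, not taken from this image's documentation) is to keep the image's default fluent.conf and mount just a kubernetes.conf from a ConfigMap over the default file using subPath:

    # In the DaemonSet, under the fluentd container (hypothetical names):
    volumeMounts:
    - name: fluentd-extra-config
      mountPath: /fluentd/etc/kubernetes.conf
      subPath: kubernetes.conf
    # ...and under the pod spec:
    volumes:
    - name: fluentd-extra-config
      configMap:
        name: fluentd-extra-config

The ConfigMap would then carry a kubernetes.conf key containing the filters shown earlier in this thread. Note that subPath mounts are not refreshed when the ConfigMap changes, so the pods need a restart after editing it.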

@theothermike

Since this feature used to work, why not add that config to the Docker image by default, so that everyone doesn't need to override it manually with custom ConfigMaps?

@warrenackerman

We are having this parsing issue and followed @arikunbotify's example, but the log field is not returning individual fields in Kibana. It is a single log entry and the JSON still shows escape characters.

    <source>
      @id goapp_logs
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/goapp-log.pos
      tag kubernetes.*
      read_from_head false
      <parse>
        @type json
        json_parser json
        time_format %Y-%m-%dT%H:%M:%S.%NZ
      </parse>
    </source>
    <filter kubernetes.var.log.containers.**>
      @type parser
      <parse>
        @type json
        json_parser json
      </parse>
      replace_invalid_sequence true
      emit_invalid_record_to_error false
      key_name log
      reserve_data true
    </filter>

Results:

"time=\"2019-02-07T22:29:13Z\" level=info msg=\"started handling request\" method=GET remote=\"10.1.1.1:34234\" request=/healthz source=\"blahh@v1.1.0/entry.go:111\"\n"

Any advice? We want the Kibana table results to show:

level         info
msg         started handling request
method    GET
remote    10.1.1.1:34234
etc...
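
These log lines are logfmt-style key=value pairs rather than JSON, so a @type json parse can never split them. A minimal sketch of a regexp-based extraction (illustrative only; it assumes the level, msg, method, and remote fields always appear in that order, and it is not a full logfmt parser):

    <filter kubernetes.var.log.containers.**>
      @type parser
      key_name log
      reserve_data true
      emit_invalid_record_to_error false
      <parse>
        @type regexp
        expression /level=(?<level>\S+) msg="(?<msg>[^"]*)" method=(?<method>\S+) remote="(?<remote>[^"]*)"/
      </parse>
    </filter>

For anything beyond a couple of fixed fields, having the application emit JSON instead of logfmt is the more robust option.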

@Gambler13

@Datise
Did you solve your problem? I'm struggling with the exact same one.

@kompiuter

kompiuter commented May 23, 2019

The following worked for me:

fluentd-config-map.yml

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
data:
  fluent.conf: |
    <match fluent.**>
      @type null
    </match>

    <match kubernetes.var.log.containers.**fluentd**.log>
      @type null
    </match>

    <match kubernetes.var.log.containers.**kube-system**.log>
      @type null
    </match>

    <match kubernetes.var.log.containers.**kibana**.log>
      @type null
    </match>

    <source>
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      read_from_head false
      <parse>
        @type json
        json_parser oj
        time_format %Y-%m-%dT%H:%M:%S
      </parse>
    </source>

    <filter kubernetes.**>
      @type kubernetes_metadata
      @id filter_kube_metadata
    </filter>

    <filter kubernetes.var.log.containers.**>
      @type parser
      <parse>
        @type json
        json_parser oj
        time_format %Y-%m-%dT%H:%M:%S
      </parse>
      key_name log
      replace_invalid_sequence true
      emit_invalid_record_to_error true
      reserve_data true
    </filter>

    <match kubernetes.**>
      @type elasticsearch
      @log_level debug
      host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
      port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
      scheme "#{ENV['FLUENT_ELASTICSEARCH_SCHEME'] || 'http'}"
      ssl_verify "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERIFY'] || 'true'}"
      user "#{ENV['FLUENT_ELASTICSEARCH_USER']}" # remove these lines if not needed
      password "#{ENV['FLUENT_ELASTICSEARCH_PASSWORD']}" # remove these lines if not needed
      logstash_format true
      logstash_prefix fluentd
      logstash_dateformat %Y%m%d
      include_tag_key true
      reload_connections true
      log_es_400_reason true
      <buffer>
        flush_thread_count 8
        flush_interval 5s
        chunk_limit_size 2M
        queue_limit_length 32
        retry_max_interval 30
        retry_forever true
      </buffer>
    </match>

fluentd-daemonset.yml

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-logging
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.4-debian-elasticsearch-1
        env:
          - name:  FLUENT_ELASTICSEARCH_HOST
            value: "elasticsearch.default"
          - name:  FLUENT_ELASTICSEARCH_PORT
            value: "9200"
          - name: FLUENT_ELASTICSEARCH_SCHEME
            value: "http"
          - name: FLUENT_UID
            value: "0"
          - name: FLUENT_ELASTICSEARCH_USER
            value: "foo"
          - name: FLUENT_ELASTICSEARCH_PASSWORD 
            value: "bar"
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: fluentd-config
          mountPath: /fluentd/etc
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: fluentd-config
        configMap:
          name: fluentd-config

elasticsearch image: docker.elastic.co/elasticsearch/elasticsearch:7.1.0
kibana image: docker.elastic.co/kibana/kibana:7.1.0

@jpugliesi

In our case, running fluent/fluentd-kubernetes-daemonset/v1.7.4-debian-elasticsearch7-1.0, we saw that some types of Kubernetes JSON logs were not being parsed by fluentd. The fix was adding reserve_time true to the filter, like so:

..... # standard kubernetes.conf
<filter kubernetes.**>
  @type kubernetes_metadata
  @id filter_kube_metadata
</filter>

# Fixes parsing nested json in the docker json logs
<filter kubernetes.**>
  @id filter_parser
  @type parser
  key_name log
  reserve_data true
  remove_key_name_field true
  replace_invalid_sequence true
  reserve_time true
  <parse>
    @type multi_format
    <pattern>
      format json
      json_parser json
    </pattern>
    <pattern>
      format none
    </pattern>
  </parse>
</filter>

In our case, the JSON logs that failed to parse had a time field, which apparently doesn't play nicely with the fluentd configuration unless reserve_time true is added.
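
For illustration, a hypothetical payload of this shape is the kind of record that was affected; with reserve_time true the parser filter keeps the original event time instead of deriving it from the record's own time field:

    {"time":"2020-03-01T12:00:00.000Z","level":"info","msg":"request handled","status":200}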

@peetasan

peetasan commented Sep 10, 2020

I had an issue with this config (and the original from https://github.com/fluent/fluentd-kubernetes-daemonset/tree/master/docker-image/v1.11/debian-graylog/conf) where my JSON log was parsed correctly, but the k8s metadata was packed into a kubernetes key as a single JSON value. This way I can't filter by pod_name or anything like that. Any ideas why this data is not at the top level of the record that gets sent to the output (Graylog in my case)?
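
One possible workaround (a sketch only, not tested against Graylog; it assumes the kubernetes_metadata filter has placed pod_name and namespace_name under a kubernetes key, as in the configs above) is to copy the fields you need to filter on up to the top level with the built-in record_transformer filter, placed after the kubernetes_metadata filter:

    <filter kubernetes.**>
      @type record_transformer
      enable_ruby true
      <record>
        # copy selected nested metadata fields to the top level of the record
        pod_name ${record.dig("kubernetes", "pod_name")}
        namespace_name ${record.dig("kubernetes", "namespace_name")}
      </record>
    </filter>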

@ediezh

ediezh commented Dec 7, 2020

I'm having the same issue as @peetasan

@Sieabah

Sieabah commented Dec 17, 2020

For those wondering why the "fixed" version might also no longer work (thanks fluentd, really making me work to get my logs ingested): using multi_format together with the filter causes the following error to arise.

/fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.5/lib/fluent/plugin.rb:125:in `new_parser': undefined method `[]' for nil:NilClass (NoMethodError)
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluent-plugin-multi-format-parser-1.0.0/lib/fluent/plugin/parser_multi_format.rb:21:in `block in configure'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluent-plugin-multi-format-parser-1.0.0/lib/fluent/plugin/parser_multi_format.rb:17:in `each'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluent-plugin-multi-format-parser-1.0.0/lib/fluent/plugin/parser_multi_format.rb:17:in `configure'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.5/lib/fluent/plugin.rb:173:in `configure'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.5/lib/fluent/plugin_helper/parser.rb:90:in `block in configure'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.5/lib/fluent/plugin_helper/parser.rb:85:in `each'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.5/lib/fluent/plugin_helper/parser.rb:85:in `configure'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.5/lib/fluent/plugin/in_tail.rb:128:in `configure'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.5/lib/fluent/plugin.rb:173:in `configure'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.5/lib/fluent/root_agent.rb:317:in `add_source'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.5/lib/fluent/root_agent.rb:158:in `block in configure'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.5/lib/fluent/root_agent.rb:152:in `each'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.5/lib/fluent/root_agent.rb:152:in `configure'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.5/lib/fluent/engine.rb:105:in `configure'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.5/lib/fluent/engine.rb:80:in `run_configure'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.5/lib/fluent/supervisor.rb:555:in `run_supervisor'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.5/lib/fluent/command/fluentd.rb:341:in `<top (required)>'
        from /usr/local/lib/ruby/2.6.0/rubygems/core_ext/kernel_require.rb:54:in `require'
        from /usr/local/lib/ruby/2.6.0/rubygems/core_ext/kernel_require.rb:54:in `require'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.5/bin/fluentd:8:in `<top (required)>'
        from /fluentd/vendor/bundle/ruby/2.6.0/bin/fluentd:23:in `load'
        from /fluentd/vendor/bundle/ruby/2.6.0/bin/fluentd:23:in `<main>'

Below is the config that works for me. It excludes the fluent logs (which the previous config still breaks on), and it also breaks out the Kubernetes metadata; the result looks like the following in Kibana.

[screenshot: the log fields and kubernetes.* metadata shown as individual fields in Kibana]

<source>
  @type tail
  @id in_tail_container_logs
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  exclude_path ["/var/log/containers/fluent*"]
  read_from_head true
  <parse>
    @type regexp
    expression /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/
  </parse>
</source>
<filter kubernetes.**>
  @type kubernetes_metadata
  @id filter_kube_metadata
</filter>

<filter kubernetes.var.log.containers.**>
  @type parser
  <parse>
    @type json
    json_parser json
  </parse>
  replace_invalid_sequence true
  emit_invalid_record_to_error false
  key_name log
  reserve_data true
</filter>

@jamietanna

Sorry to necrobump, but this StackOverflow answer worked for me; it handles multiple formats using the multi-format parser plugin:

   <filter **>
     @type parser
     key_name message
     reserve_data true
     remove_key_name_field true
     <parse>
       @type multi_format
       <pattern>
         format json
       </pattern>
       <pattern>
         format none
       </pattern>
     </parse>
   </filter>
