
Kubernetes Pods Logging (escaped nested) JSON to Std Out Isn't Correctly Parsed #691

@therealdwright

Description

Bug Report

Describe the bug

When running a Docker container inside a Kubernetes cluster where the container already outputs JSON lines, it is not possible to parse the nested JSON in a way that lets Elasticsearch correctly index the key:value pairs. Running the Docker parser over the log with the escaped JSON decoder correctly parses everything in the log except the original JSON message.

If this is a misconfiguration on my end, assistance would be greatly appreciated. However, I think I have run into a limitation: when the original log line is itself JSON, the Docker log wrapper produces escaped JSON containing a nested escaped JSON object, and there appears to be no way to fully decode this in fluent-bit.
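To make the failure mode concrete, here is a minimal standalone Python sketch (not fluent-bit code) of why a single unescape pass is not enough: after decoding Docker's wrapper, the application's JSON is still a plain string and needs a second decode.

```python
import json

# A Docker json-file log line: the application's own JSON output ends up
# string-escaped inside the "log" field (timestamp illustrative).
raw_line = '{"log":"{\\"Message\\": \\"JSON Parsing in Kubernetes is fun!\\"}\\n","stream":"stdout","time":"2018-07-21T01:17:48.4295262Z"}'

record = json.loads(raw_line)           # first pass: decodes Docker's wrapper
print(type(record["log"]).__name__)     # 'str' -- still an escaped string, not an object

inner = json.loads(record["log"])       # a second pass is needed to get key/value pairs
print(inner["Message"])                 # 'JSON Parsing in Kubernetes is fun!'
```

The escaped decoder effectively performs only the first pass, which is why the "Message" key never surfaces as an indexable field.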

To Reproduce

  • In a Kubernetes or Minikube cluster, run the fluent-bit DaemonSet found here: https://github.com/fluent/fluent-bit-kubernetes-logging (I have set my environment up to forward to fluentd, which sends to logz.io, but sending directly to an Elasticsearch instance would yield the same result.)
  • Start a container that outputs JSON to stdout; I did so by running kubectl run echo-chamber --quiet --image=quay.io/therealdwright/echo-chamber:latest
  • Observe the container's logs by running kubectl logs -f deployments/echo-chamber; each line is output as follows:
{"Message": "JSON Parsing in Kubernetes is fun!"}
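Note that the Docker json-file log driver wraps each stdout line before the tail input ever sees it. Assuming the standard json-file format, the file under /var/log/containers contains something like this (timestamp illustrative):

```json
{"log":"{\"Message\": \"JSON Parsing in Kubernetes is fun!\"}\n","stream":"stdout","time":"2018-07-21T01:17:48.4295262Z"}
```

It is this wrapped form, with the application's JSON escaped inside the "log" field, that the docker parser receives.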

Expected behavior

I expect the logs to be parsed in a way that Elasticsearch is able to index the key "Message" in a way that can be queried by Lucene.

Screenshots

[screenshot: screen shot 2018-07-21 at 11 19 04 am]

Your Environment

  • Version used: official fluent/fluent-bit:0.13.4 Docker image (as a DaemonSet) in a forward configuration, passing logs to a fluentd container running version 1.2.3 with the logz.io plugin.
  • Configuration:
    fluent-bit input config:
[INPUT]
    Name              tail
    Tag               kube.*
    Path              /var/log/containers/*.log
    Parser            docker
    DB                /var/log/flb_kube.db

docker parser config:

[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On
    # Command      |  Decoder | Field   | Optional Action
    # =============|==========|=========|=================
    Decode_Field_As   escaped    log
  • Environment name and version: Kubernetes 1.9.6
  • Server type and version: AWS EC2 Instance
  • Operating System and version: Debian Jessie k8s-1.9-debian-jessie-amd64-hvm-ebs-2018-03-11 (ami-db8546b9)
  • Filters and plugins: fluent-bit tail with Docker parser.
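For comparison, I believe the decoder documentation describes an optional do_next action that lets a second Decode_Field_As rule run on the same field. A variant of the parser along those lines might be worth trying (I have not confirmed it resolves the nested case here):

```
[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On
    # Command      |  Decoder      | Field | Optional Action
    # =============|===============|=======|=================
    Decode_Field_As   escaped_utf8   log     do_next
    Decode_Field_As   json           log
```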

Additional context

I have tried many configuration variants based on the GitHub issues I could find. I have also looked through the source, and it appears that only a single "Decode_Field_As" rule can be applied to a given log field.

Additionally, when enabling Merge_JSON_Log on the Kubernetes filter, I see many "could not merge JSON log as requested" errors in the fluent-bit container.
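For reference, the Kubernetes filter stanza in question looks roughly like this (the Match pattern is my assumption, mirroring the tail input's tag):

```
[FILTER]
    Name            kubernetes
    Match           kube.*
    Merge_JSON_Log  On
```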

An example raw log sent to logz.io (retrieved by inspecting the JSON payload) is as follows:

{
  "_index": "logzioCustomerIndex180721_v3",
  "_type": "http-bulk",
  "_id": "AWS6ag4I1yq4qwVVbz95.account-38207",
  "_version": 1,
  "_score": null,
  "_source": {
    "kubernetes": {
      "container_name": "echo-chamber",
      "host": "ip-10-41-24-84.ap-southeast-2.compute.internal",
      "annotations": {
        "kubernetes.io/limit-ranger": "LimitRanger plugin set: cpu request for container echo-chamber"
      },
      "docker_id": "d012be600cdb0f322203da4a4161da42e898cc8ce4a4dd5d79a1cad1a4d80537",
      "pod_id": "89f03ebb-8c82-11e8-b430-0a948d0850cc",
      "pod_name": "echo-chamber-59bbd6495d-4n9x7",
      "namespace_name": "default",
      "labels": {
        "pod-template-hash": "1566820518",
        "run": "echo-chamber"
      }
    },
    "@timestamp": "2018-07-21T01:17:48.000+00:00",
    "log": "{\"Message\": \"JSON Parsing in Kubernetes is fun!\"}\n",
    "stream": "stdout",
    "@log_name": "kube.var.log.containers.echo-chamber-59bbd6495d-4n9x7_default_echo-chamber-d012be600cdb0f322203da4a4161da42e898cc8ce4a4dd5d79a1cad1a4d80537.log",
    "time": "2018-07-21T01:17:48.4295262Z",
    "type": "http-bulk",
    "tags": [
      "_logz_http_bulk_json_8070"
    ]
  },
  "fields": {
    "@timestamp": [
      1532135868000
    ]
  },
  "highlight": {
    "kubernetes.labels.run": [
      "@kibana-highlighted-field@echo-chamber@/kibana-highlighted-field@"
    ]
  },
  "sort": [
    1532135868000
  ]
}
