
Enable JSON logs parsing by default #234

Open
clouedoc opened this issue May 27, 2023 · 2 comments
clouedoc commented May 27, 2023

I'm trying to enable parsing of the JSON logs of my Kubernetes cluster.

Each line of my logs is a JSON object, like this:

{"account":"xxxxxxxxxxxxxxx","date":"2023-05-27T00:35:00.122Z","level":"debug","message":"\t[xx] : xxxxxxxxxxxxxxxxxxxx","ms":"+0ms","service":"xxxxxxxxxxxxxxxxxxxxx","timestamp":"2023-05-27T00:35:00.122Z","uuid":"xxxxxxxxxxxxxxxxxxxxxxxxx"}

They are successfully ingested into SigNoz, but I cannot use the fields for indexing.

[screenshot: SigNoz logs UI]
What I have tried already

I have tried the following Helm configurations, without success.

This one has no effect:

        presets:
          logsCollection:
            operators:
              # Find out which format is used by kubernetes
              - type: router
                id: get-format
                routes:
                  - output: parser-docker
                    expr: 'body matches "^\\{"'
                  - output: parser-crio
                    expr: 'body matches "^[^ Z]+ "'
                  - output: parser-containerd
                    expr: 'body matches "^[^ Z]+Z"'
              # Parse CRI-O format
              - type: regex_parser
                id: parser-crio
                regex: '^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$'
                output: extract_metadata_from_filepath
                timestamp:
                  parse_from: attributes.time
                  layout_type: gotime
                  layout: '2006-01-02T15:04:05.000000000-07:00'
              # Parse CRI-Containerd format
              - type: regex_parser
                id: parser-containerd
                regex: '^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$'
                output: extract_metadata_from_filepath
                timestamp:
                  parse_from: attributes.time
                  layout: '%Y-%m-%dT%H:%M:%S.%LZ'
              # Parse Docker format
              - type: json_parser
                id: parser-docker
                output: extract_metadata_from_filepath
                timestamp:
                  parse_from: attributes.time
                  layout: '%Y-%m-%dT%H:%M:%S.%LZ'
              # Extract metadata from file path
              - type: regex_parser
                id: extract_metadata_from_filepath
                regex: '^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]+)\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$'
                parse_from: attributes["log.file.path"]
                output: add_cluster_name
              # Add cluster name attribute from environment variable
              - id: add_cluster_name
                type: add
                field: resource["k8s.cluster.name"]
                value: EXPR(env("K8S_CLUSTER_NAME"))
                output: move_stream
              # Rename attributes
              - type: move
                id: move_stream
                from: attributes.stream
                to: attributes["log.iostream"]
                output: move_container_name
              - type: move
                id: move_container_name
                from: attributes.container_name
                to: resource["k8s.container.name"]
                output: move_namespace
              - type: move
                id: move_namespace
                from: attributes.namespace
                to: resource["k8s.namespace.name"]
                output: move_pod_name
              - type: move
                id: move_pod_name
                from: attributes.pod_name
                to: resource["k8s.pod.name"]
                output: move_restart_count
              - type: move
                id: move_restart_count
                from: attributes.restart_count
                to: resource["k8s.container.restart_count"]
                output: move_uid
              - type: move
                id: move_uid
                from: attributes.uid
                to: resource["k8s.pod.uid"]
                output: move_log
              # Clean up log body
              - type: move
                id: move_log
                from: attributes.log
                to: body
              #### CUSTOM: parse JSON logs ####
              - type: json_parser
                timestamp:
                  parse_from: attributes.timestamp
                  layout: '%Y-%m-%dT%H:%M:%S.%fZ'
              - type: move
                from: attributes.message
                to: body
              - type: remove
                field: attributes.timestamp
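For what it's worth, this is my understanding of how the custom block attaches to the preset chain (the ids and comments below are my own annotations, not tested): `move_log` has no `output`, so it hands each entry to the next operator in the list, and `json_parser` parses from `body` by default.

        #### CUSTOM: parse JSON logs ####
        - type: json_parser
          id: parse-json-body          # receives entries from move_log above
          parse_from: body             # default; the raw JSON string was just moved here
          timestamp:
            parse_from: attributes.timestamp
            layout: '%Y-%m-%dT%H:%M:%S.%fZ'
        - type: move
          id: promote-message
          from: attributes.message     # keep only the human-readable message in body
          to: body
        - type: remove
          id: drop-timestamp
          field: attributes.timestamp  # already captured as the entry timestamp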

This one causes an error to appear in the otel-collector pod:

        otelCollector:
          config:
            receivers:
              otlp:
                protocols:
                  grpc:
                    endpoint: 0.0.0.0:4317
                    max_recv_msg_size_mib: 16
                  http:
                    endpoint: 0.0.0.0:4318
                operators:
                  - type: json_parser
                    timestamp:
                      parse_from: attributes.timestamp
                      layout: '%Y-%m-%dT%H:%M:%S.%fZ'
                  - type: move
                    from: attributes.message
                    to: body
                  - type: remove
                    field: attributes.timestamp

Error:

        Error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:
        * error decoding 'receivers': error reading configuration for "otlp": 1 error(s) decoding:
        * '' has invalid keys: operators
        2023/05/27 00:23:06 application run finished with error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:
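Reading the error, `operators` is not a valid key for the `otlp` receiver; as far as I can tell, stanza operators only run inside log receivers such as `filelog`, or via the contrib `logstransform` processor. A rough sketch of the processor-based variant (the processor name, pipeline wiring, and omitted receivers/exporters are placeholders for whatever the chart already configures):

        otelCollector:
          config:
            processors:
              logstransform/parse-json:
                operators:
                  - type: json_parser
                    timestamp:
                      parse_from: attributes.timestamp
                      layout: '%Y-%m-%dT%H:%M:%S.%fZ'
                  - type: move
                    from: attributes.message
                    to: body
                  - type: remove
                    field: attributes.timestamp
            service:
              pipelines:
                logs:
                  # keep the receivers/exporters the chart already defines;
                  # only logstransform/parse-json is added to the processor list
                  processors: [logstransform/parse-json, batch]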


How should I go about parsing my JSON logs?
I believe this should be enabled by default, or at least better documented.


Thanks for your attention 🙏

prashant-shahi (Member) commented

/cc @nityanandagohain

egandro commented Jun 22, 2023

I am writing an article about how to do this. Please be patient (or ping me here in a few days).
