Not parsing, but regex is matching #1203

Closed
simonwh opened this issue Mar 16, 2019 · 8 comments

simonwh commented Mar 16, 2019

Bug Report

My nginx ingress controller logs are not getting parsed, even though they match the regex defined.

Rubular: https://rubular.com/r/KCWlI2X95tabLI

Log message

10.244.0.1 - [10.244.0.1] - - [16/Mar/2019:19:50:51 +0000] "GET /swagger/v3/swagger.json HTTP/2.0" 200 765 "https://api.stage.mydomain.se/index.html" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36" 86 0.010 [stage-my-api-80] 10.244.0.232:80 2740 0.008 200 de00d568d2952f165f402328d4519bef

Parser

[PARSER]
        Name        k8s-nginx-ingress
        Format      regex
        Regex       ^(?<host>[^ ]*) - \[(?<real_ip>[^ ]*)\] - (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] (?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<last>[^$]*)
        Time_Key    time
        Time_Format %d/%b/%Y:%H:%M:%S %z
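As a quick sanity check outside Rubular, the regex and Time_Format above can be tried against the sample log line in Python. This is only a rough sketch: Fluent Bit uses the Onigmo engine, not Python's `re`, so the helper `to_python_regex` (a hypothetical name, not part of any library) rewrites the `(?<name>...)` groups into Python's `(?P<name>...)` form, and a match here does not prove that Fluent Bit itself applies the parser.

```python
import re
from datetime import datetime

# Regex copied from the k8s-nginx-ingress [PARSER] section (Onigmo/Ruby syntax).
ONIGMO_REGEX = (
    r'^(?<host>[^ ]*) - \[(?<real_ip>[^ ]*)\] - (?<user>[^ ]*) \[(?<time>[^\]]*)\] '
    r'"(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) '
    r'"(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) '
    r'\[(?<proxy_upstream_name>[^ ]*)\] (?<upstream_addr>[^ ]*) '
    r'(?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) '
    r'(?<upstream_status>[^ ]*) (?<last>[^$]*)'
)

# Time_Format from the same parser; %b assumes an English/C locale.
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

# Sample log message from the report.
LOG_LINE = (
    '10.244.0.1 - [10.244.0.1] - - [16/Mar/2019:19:50:51 +0000] '
    '"GET /swagger/v3/swagger.json HTTP/2.0" 200 765 '
    '"https://api.stage.mydomain.se/index.html" '
    '"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36" '
    '86 0.010 [stage-my-api-80] 10.244.0.232:80 2740 0.008 200 '
    'de00d568d2952f165f402328d4519bef'
)


def to_python_regex(onigmo_regex):
    """Rewrite (?<name>...) named groups as (?P<name>...) for Python's re."""
    return re.sub(r'\(\?<([A-Za-z_][A-Za-z0-9_]*)>', r'(?P<\1>', onigmo_regex)


def parse(line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = re.match(to_python_regex(ONIGMO_REGEX), line)
    return match.groupdict() if match else None


fields = parse(LOG_LINE)
timestamp = datetime.strptime(fields["time"], TIME_FORMAT)

for key, value in fields.items():
    print(f"{key}: {value}")
print("parsed time:", timestamp.isoformat())
```

If this prints every field, the pattern and the time format are at least self-consistent with the sample line, which points away from the regex itself as the cause.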

Result
Screenshot 2019-03-16 at 21.52.57

Expected behavior
Expected parser to extract data from the log string.

Your Environment

  • Version used: 1.0.4

Followed https://fluentbit.io/documentation/0.14/installation/kubernetes.html

I've been trying to make this work for hours on end... :-)

Is there anything I can do to:

  • See if it has registered parser correctly?
  • Call fluentbit directly from CLI to see how it parses different log messages?

Any hints appreciated!

Member

edsiper commented Mar 17, 2019

Please share your Fluent Bit configuration file or configmap.

Author

simonwh commented Mar 17, 2019

This is my entire configmap with all configs:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
  labels:
    k8s-app: fluent-bit
data:
  # Configuration files: server, input, filters and output
  # ======================================================
  fluent-bit.conf: |
    [SERVICE]
        Flush         1
        Log_Level     info
        Daemon        off
        Parsers_File  parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020

    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE output-elasticsearch.conf

  input-kubernetes.conf: |
    [INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/containers/*.log
        Parser            docker
        DB                /var/log/flb_kube.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10

  filter-kubernetes.conf: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc.cluster.local:443
        Merge_Log           On
        K8S-Logging.Parser  On

  output-elasticsearch.conf: |
    [OUTPUT]
        Name            es
        Match           *
        Host            ${FLUENT_ELASTICSEARCH_HOST}
        Port            ${FLUENT_ELASTICSEARCH_PORT}
        HTTP_User       ${FLUENT_ELASTICSEARCH_USER}
        HTTP_Passwd     ${FLUENT_ELASTICSEARCH_PASSWORD}
        tls             On
        Logstash_Format On
        Retry_Limit     False
        Logstash_Prefix ${FLUENT_ELASTICSEARCH_LOGSTASH_PREFIX}

  parsers.conf: |
    [PARSER]
        Name   apache
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache2
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache_error
        Format regex
        Regex  ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$

    [PARSER]
        Name        k8s-nginx-ingress
        Format      regex
        Regex       ^(?<host>[^ ]*) - \[(?<real_ip>[^ ]*)\] - (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] (?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<last>[^$]*)
        Time_Key    time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   nginx
        Format regex
        Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   json
        Format json
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On
        # Command      |  Decoder | Field | Optional Action
        # =============|==================|=================
        Decode_Field_As   escaped    log

    [PARSER]
        Name        syslog
        Format      regex
        Regex       ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key    time
        Time_Format %b %d %H:%M:%S
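One way to picture the pipeline in this configmap, as a rough sketch and not how Fluent Bit is actually implemented: the tail input uses the docker parser (Format json), so each record starts out with only the keys Docker wrote, and the nginx text sits inside the `log` key until some regex parser is applied to it. The configmap as posted does not show how the k8s-nginx-ingress parser gets selected for the ingress pods. In the sketch below, the Docker line, the function names and the shortened regex (a prefix of the k8s-nginx-ingress one, already in Python syntax) are all assumptions made for illustration.

```python
import json
import re

# Hypothetical line in Docker's json-file format; the log text is a truncated
# copy of the sample message from the report.
docker_line = json.dumps({
    "log": '10.244.0.1 - [10.244.0.1] - - [16/Mar/2019:19:50:51 +0000] '
           '"GET /swagger/v3/swagger.json HTTP/2.0" 200 765 '
           '"https://api.stage.mydomain.se/index.html"\n',
    "stream": "stdout",
    "time": "2019-03-16T19:50:51.000000000Z",
})

# Shortened prefix of the k8s-nginx-ingress regex, using Python's (?P<name>...) groups.
NGINX_PREFIX = re.compile(
    r'^(?P<host>[^ ]*) - \[(?P<real_ip>[^ ]*)\] - (?P<user>[^ ]*) \[(?P<time>[^\]]*)\] '
    r'"(?P<method>\S+)(?: +(?P<path>[^\"]*?)(?: +\S*)?)?" (?P<code>[^ ]*) (?P<size>[^ ]*)'
)


def docker_json_step(raw_line):
    """Stand-in for the tail input's 'Parser docker': decode the JSON wrapper."""
    return json.loads(raw_line)


def regex_step(record):
    """Stand-in for a regex parser applied to the 'log' key of a record."""
    match = NGINX_PREFIX.match(record["log"])
    if match is None:
        return record  # no match: the record passes through unchanged
    merged = dict(record)  # whether 'log' is kept afterwards is not modeled here
    merged.update(match.groupdict())
    return merged


record = docker_json_step(docker_line)
print(sorted(record))   # only the Docker keys so far
parsed = regex_step(record)
print(sorted(parsed))   # Docker keys plus the extracted nginx fields
```

If the records arriving in Elasticsearch only ever look like the first printout, the regex step is probably never being applied to `log`, regardless of whether the pattern itself is correct.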


kiich commented Mar 21, 2019

@simonwh Not sure if this helps, but I struggled with a similar issue when I was using a fluent-bit sidecar in Kubernetes to forward to another fluent-bit that runs on our nodes as a DaemonSet, which then sends to our Splunk. I was never able to see my logs, even though my regex matched them perfectly in Rubular.

In the end, when I increased Flush to 30 (or some higher number), I was finally able to see them in my Splunk dashboard.

As for "Call fluentbit directly from CLI to see how it parses different log messages?": I was also doing this via the Docker container, passing in my log files and fluent-bit conf. There I could see that it matched fine, which made me think my regex was fine in fluent-bit too. FYI, the issue I had was #1214.

I suspect my issue had something to do with sending the output to another fluent-bit and then to Splunk, so I'm not sure whether that was affecting it.


kiich commented Mar 21, 2019

Ah, though after seeing the result picture, it looks like it is picking up your logs but not matching them, so the Flush change probably won't have any effect. Apologies for the confusion!


michtek commented Mar 21, 2019

@simonwh I had the same issue as long as I was using the following config:

[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On
    # Command      |  Decoder | Field | Optional Action
    # =============|==================|=================
    Decode_Field_As   escaped    log

It's actually missing an extra line at the end:

    Decode_Field_As   escaped    stream

compared to https://raw.githubusercontent.com/fluent/fluent-bit/master/conf/parsers.conf

Additionally, as I was getting complaints about multiple @timestamp fields, I had to add an extra filter:

[FILTER]
    Name        record_modifier
    Match       *
    Remove_key  @timestamp

@d-dmitry

@simonwh see this ticket: #729

@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions bot added the Stale label Jan 28, 2022
Contributor

github-actions bot commented Feb 2, 2022

This issue was closed because it has been stalled for 5 days with no activity.

github-actions bot closed this as completed Feb 2, 2022