One of my nginx access logs reports "_jsonparsefailure" after being added to ELK #7779

Closed
KeithTt opened this issue Jul 24, 2017 · 4 comments

@KeithTt KeithTt commented Jul 24, 2017

ELK version: 5.4.1

Here is one line from my nginx access log:

{"@timestamp":"2017-07-22T17:14:23+08:00","host":"117.119.33.237","clientip":"182.246.61.241","remote_user":"-","request":"GET /s?slot=-1933190001&cb=jsonpCallback_37&timestamp=1500714862506 HTTP/1.1","http_user_agent":"Mozilla/5.0 (Linux; Android 5.1; ZTE BA510 Build/LMY47D) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/33.0.0.0 Mobile Safari/537.36","cookie_uid":"-","size":4452,"responsetime":0.290,"upstreamtime":"0.289","upstreamhost":"192.168.10.12:8080","http_host":"c.bxb.oupeng.com","url":"/s","domain":"c.bxb.oupeng.com","xff":"-","referer":"http://www.opgirl.cn/?did=202","status":"200"}

Here is what the JSON view in Kibana Discover shows:

{
  "_index": "c-adbxb-cn-nginx-access-2017.07.22",
  "_type": "c-adbxb-cn-nginx-access",
  "_id": "AV1pkriwOTwuD3j9tnO5",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2017-07-22T09:13:18.554Z",
    "offset": 116378647,
    "@version": "1",
    "beat": {
      "hostname": "uy03-03",
      "name": "uy03-03",
      "version": "5.5.0"
    },
    "input_type": "log",
    "host": "uy03-03",
    "source": "/usr/local/nginx/logs/c.adbxb.cn.access.log",
    "message": "120.132.95.115 - - [22/Jul/2017:00:25:19 +0800] \"POST /c/ads/wifi HTTP/1.1\" \"\\x0A\\x10a125f4c8acbf7f2a\\x12\\x0E223.88.161.186\\x1AFDalvik/1.6.0 (Linux; U; Android 4.0.4; ZTE U795+ Build/IMM76D)(\\x022\\x07android:\\x054.0.4B\\x1C\\x08\\x01\\x10\\xE0\\x03\\x18\\xA0\\x06 \\x01(\\x002\\x03ZTE:\\x09ZTE U795+J&\\x0A\\x05A0008\\x12\\x10wifi\\xE4\\xB8\\x87\\xE8\\x83\\xBD\\xE9\\x92\\xA5\\xE5\\x8C\\x99\\x1A\\x043060*\\x05baiduR6\\x0A\\x0F868155010512989\\x12\\x11b4:98:42:e2:2c:80\\x1A\\x10ee0a80afd554b380Z7\\x0A\\x0825513984\\x10\\xD0\\x05\\x18\\xC8\\x01 \\x00(\\x000\\x038\\xAC\\x02@\\x04h\\x03h\\x06h\\xBF\\xFB\\xC2\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\x01h\\xBE\\xFB\\xC2\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\x01p\\x09\\xBA\\x01\\x011\\xC8\\x01\\x00\" 200 29 \"-\" \"Dalvik/1.6.0 (Linux; U; Android 4.0.4; ZTE U795+ Build/IMM76D)\" - 0.006  0.002",
    "type": "c-adbxb-cn-nginx-access",
    "tags": [
      "_jsonparsefailure",
      "beats_input_codec_json_applied"
    ]
  },
  "fields": {
    "@timestamp": [
      1500714798554
    ]
  },
  "sort": [
    1500714798554
  ]
}
{
  "_index": "www-opgirl-cn-nginx-access-2017.07.22",
  "_type": "www-opgirl-cn-nginx-access",
  "_id": "AV1ppNkgOTwuD3j97yQF",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2017-07-22T09:33:06.494Z",
    "offset": 679426855,
    "@version": "1",
    "input_type": "log",
    "beat": {
      "hostname": "uy01-04",
      "name": "uy01-04",
      "version": "5.5.0"
    },
    "host": "uy01-04",
    "source": "/usr/local/nginx/logs/www.opgirl.cn.access.log",
    "message": "182.202.168.176 - - [22/Jul/2017:06:16:45 +0800] \"GET /picture/list?gid=2885296&pl=0&strategy=1 HTTP/1.1\" \"-\" 200 683 \"http://www.opgirl.cn/?did=72\" \"Mozilla/5.0 (Linux; Android 5.1.1; vivo Y31A Build/LMY47V) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/38.0.0.0 Mobile Safari/537.36 VivoBrowser/5.1.2\" - 0.002  0.002",
    "type": "www-opgirl-cn-nginx-access",
    "tags": [
      "_jsonparsefailure",
      "beats_input_codec_json_applied"
    ]
  },
  "fields": {
    "@timestamp": [
      1500715986494
    ]
  },
  "sort": [
    1500715986494
  ]
}

Is there a string that cannot be recognized?

I also found many errors in /var/log/logstash/logstash-plain.log:

[2017-07-22T21:28:00,201][ERROR][logstash.codecs.json     ] JSON parse error, original data now in message field {:error=>#<LogStash::Json::ParserError: Unexpected character ('.' (code 46)): Expected space separating root-level values
 at [Source: 101.227.103.243 - - [22/Jul/2017:06:08:37 +0800] "POST /b/ads HTTP/1.1" "\x0A*1500674919139_1638310_101227103243_22671_d\x12\x02\x08\x03\x1A*\x12\x17com.pajk.personaldoctor\x1A\x0F\xE5\xB9\xB3\xE5\xAE\x89\xE5\xA5\xBD\xE5\x8C\xBB\xE7\x94\x9F\x22V\x08\x02\x12\x04\x08\x01\x10\x01*\x0C223.85.218.22\x05Apple:\x07iPhone6B\x06\x08\xD0\x05\x10\x80\x0AJ$7BEC5316-0661-4CB2-965B-E943F74BCF8E`\x01*\x15\x0A\x09972676580\x12\x06\x08\x80\x05\x10\xC0\x07\x18\x01" 200 569 "-" "Jakarta Commons-HttpClient/3.1" - 0.177  0.176; line: 1, column: 9]>, :data=>"101.227.103.243 - - [22/Jul/2017:06:08:37 +0800] \"POST /b/ads HTTP/1.1\" \"\\x0A*1500674919139_1638310_101227103243_22671_d\\x12\\x02\\x08\\x03\\x1A*\\x12\\x17com.pajk.personaldoctor\\x1A\\x0F\\xE5\\xB9\\xB3\\xE5\\xAE\\x89\\xE5\\xA5\\xBD\\xE5\\x8C\\xBB\\xE7\\x94\\x9F\\x22V\\x08\\x02\\x12\\x04\\x08\\x01\\x10\\x01*\\x0C223.85.218.22\\x05Apple:\\x07iPhone6B\\x06\\x08\\xD0\\x05\\x10\\x80\\x0AJ$7BEC5316-0661-4CB2-965B-E943F74BCF8E`\\x01*\\x15\\x0A\\x09972676580\\x12\\x06\\x08\\x80\\x05\\x10\\xC0\\x07\\x18\\x01\" 200 569 \"-\" \"Jakarta Commons-HttpClient/3.1\" - 0.177  0.176"}
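
Note: in both Kibana documents and in the Logstash error above, the message field holds a line in nginx's plain-text access log format rather than a JSON object, so a JSON parser gives up at the first token it cannot interpret. The following is a minimal Python sketch for sorting log lines into categories before they are shipped. The names `classify_line` and `summarize` and the shortened sample lines are hypothetical, and Python's `json` module only stands in for the parser Logstash uses, so edge cases may differ.

```python
import json
from collections import Counter
from typing import Iterable


def classify_line(line: str) -> str:
    """Return 'ok', 'not_json' or 'bad_json' for one access-log line."""
    text = line.strip()
    # Plain-text access log lines start with an IP address, not with '{'.
    if not text.startswith("{"):
        return "not_json"
    try:
        json.loads(text)
    except json.JSONDecodeError:
        # Looks like a JSON object but contains something the parser rejects.
        return "bad_json"
    return "ok"


def summarize(lines: Iterable[str]) -> Counter:
    """Count how many non-empty lines fall into each category."""
    return Counter(classify_line(line) for line in lines if line.strip())


if __name__ == "__main__":
    # Shortened, illustrative lines modelled on the samples in this issue.
    samples = [
        '182.202.168.176 - - [22/Jul/2017:06:16:45 +0800] "GET /picture/list HTTP/1.1" 200 683',
        '{"clientip":"182.246.61.241","status":"200"}',
        r'{"http_user_agent":"Dalvik/2.1.0 \xC3\xA3 Build/LMY47V","status":"200"}',
    ]
    print(summarize(samples))
```

To check a real file, pass an open file handle to `summarize`, for example one opened on the `source` path shown in the Kibana document.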

Contributor

@jakelandis jakelandis commented Jul 24, 2017

Are you using the beats input with the JSON codec?

Author

@KeithTt KeithTt commented Jul 25, 2017

@jakelandis yes, exactly.

Here is my pipeline config file:

input {
  beats {
    port => 5044
    codec => "json"
  }
}

output {
  if [type] == "zixun-nginx-access" {
    elasticsearch {
      hosts => ["192.168.3.56:9200","192.168.3.49:9200","192.168.3.57:9200"]
      index => "zixun-nginx-access-%{+YYYY.MM.dd}"
      document_type => "%{[@metadata][type]}"
      template_overwrite => true
    }
  }
  ...
}

Is there a way to solve this issue, or to filter out these failures?

Author

@KeithTt KeithTt commented Jul 25, 2017

I tried to add a condition:

input {
  beats {
    port => 5044
    codec => "json"
  }
}

output {
  if "_jsonparsefailure" not in [tags] {
    if [type] == "zixun-nginx-access" {
      elasticsearch {
        hosts => ["192.168.3.56:9200","192.168.3.49:9200","192.168.3.57:9200"]
        index => "zixun-nginx-access-%{+YYYY.MM.dd}"
        document_type => "%{[@metadata][type]}"
        template_overwrite => true
      }
    }
    ...
  }
}

But there are still many errors in the log:

# tail -f logstash-plain.log
[2017-07-25T11:01:12,924][ERROR][logstash.codecs.json     ] JSON parse error, original data now in message field {:error=>#<LogStash::Json::ParserError: Unrecognized character escape 'x' (code 120)
 at [Source: {"@timestamp":"2017-07-25T11:01:06+08:00","host":"117.119.33.237","clientip":"106.75.6.196","remote_user":"-","request":"POST /c/ads/wifi HTTP/1.1","http_user_agent":"Dalvik/2.1.0 &#40;Linux; U; Android 5.1.1; \xC3\xA3\xC2\x80\xC2\x80\xC3\xA3\xC2\x80\xC2\x80 Build/LMY47V&#41;","cookie_uid":"-","size":7513,"responsetime":0.142,"upstreamtime":"0.141","upstreamhost":"192.168.10.38:8080","http_host":"c.adbxb.cn","url":"/c/ads/wifi","domain":"c.adbxb.cn","xff":"-","referer":"-","status":"200"}; line: 1, column: 213]>, :data=>"{\"@timestamp\":\"2017-07-25T11:01:06+08:00\",\"host\":\"117.119.33.237\",\"clientip\":\"106.75.6.196\",\"remote_user\":\"-\",\"request\":\"POST /c/ads/wifi HTTP/1.1\",\"http_user_agent\":\"Dalvik/2.1.0 &#40;Linux; U; Android 5.1.1; \\xC3\\xA3\\xC2\\x80\\xC2\\x80\\xC3\\xA3\\xC2\\x80\\xC2\\x80 Build/LMY47V&#41;\",\"cookie_uid\":\"-\",\"size\":7513,\"responsetime\":0.142,\"upstreamtime\":\"0.141\",\"upstreamhost\":\"192.168.10.38:8080\",\"http_host\":\"c.adbxb.cn\",\"url\":\"/c/ads/wifi\",\"domain\":\"c.adbxb.cn\",\"xff\":\"-\",\"referer\":\"-\",\"status\":\"200\"}"}
...

confused...
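
Note: judging by the logger name in the error (logstash.codecs.json), the message is written by the JSON codec on the beats input while it decodes the line. That happens before the event reaches the output section, so the conditional only changes what is sent to Elasticsearch and does not silence the log entry. To see what the parser objects to in a given line, a small Python sketch like the one below can help. The name `locate_json_error` and the sample line are hypothetical, and the exact message and column reported by Python's `json` module may differ from what Logstash prints.

```python
import json
from typing import Optional


def locate_json_error(line: str, width: int = 20) -> Optional[str]:
    """Describe where JSON parsing fails, or return None if the line is valid."""
    try:
        json.loads(line)
    except json.JSONDecodeError as err:
        # err.pos is a 0-based index into the line; show the text around it.
        start = max(0, err.pos - width)
        snippet = line[start:err.pos + width]
        return f"{err.msg} near column {err.colno}: {snippet!r}"
    return None


if __name__ == "__main__":
    # Shortened, illustrative line modelled on the log entry above.
    sample = r'{"http_user_agent":"Dalvik/2.1.0 \xC3\xA3 Build/LMY47V","status":"200"}'
    print(locate_json_error(sample))
```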

Contributor

@robbavey robbavey commented Jul 25, 2017

@KeithTt The issue is that the User-Agent field contains '\x__' escape sequences. This is the default escaping for NGINX access logs and, unfortunately, it is not valid JSON. If you are using a newer version of NGINX (>= 1.11.8), you can set escape=json as a parameter of the log_format directive, which will produce properly escaped JSON.

See https://github.com/elastic/examples/tree/master/Common%20Data%20Formats/nginx_json_logs#warning-invalid-json for more details.

I've added a copy of this reply to your forum post, and we can follow up on any questions there.

Thanks!
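
Note: for NGINX versions older than 1.11.8, where escape=json is not available, one possible workaround (not suggested in this thread) is to rewrite each `\xNN` sequence as the JSON escape `\u00NN` before parsing. The Python sketch below illustrates the idea. The names `repair_hex_escapes` and `parse_nginx_json` and the sample line are hypothetical, and the sketch assumes that every `\x` in the line was produced by NGINX's default escaping, which also writes double quotes and backslashes as `\x22` and `\x5C`.

```python
import json
import re

# Matches one NGINX default escape such as \x0A or \xE4.
HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def repair_hex_escapes(line: str) -> str:
    """Rewrite \\xNN sequences as \\u00NN, which JSON parsers accept."""
    return HEX_ESCAPE.sub(lambda match: "\\u00" + match.group(1), line)


def parse_nginx_json(line: str) -> dict:
    """Parse a JSON-style NGINX log line after repairing its escapes."""
    return json.loads(repair_hex_escapes(line))


if __name__ == "__main__":
    # Shortened, illustrative line modelled on the log entry in this issue.
    sample = r'{"http_user_agent":"Dalvik/2.1.0 \xC3\xA3 Build/LMY47V","status":"200"}'
    print(parse_nginx_json(sample))
```

Each escaped byte becomes a single code point in the range U+0000 to U+00FF, so a multi-byte UTF-8 character shows up as several separate characters in the parsed value. Turning those back into the original text is outside the scope of this sketch.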

@robbavey robbavey closed this Jul 25, 2017