Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent format of error field in JSON logs #3335

Closed
jakubgs opened this issue Jan 28, 2022 · 17 comments
Closed

Inconsistent format of error field in JSON logs #3335

jakubgs opened this issue Jan 28, 2022 · 17 comments
Labels
bug Something isn't working 🛂 logging

Comments

@jakubgs
Copy link
Member

jakubgs commented Jan 28, 2022

Similar as in #2287, #2252, and #2283 I've found another JSON field in logs that causes Logstash parsing errors:

[WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"logstash-2022.01.28", :routing=>nil}, {"syslogtag"=>"beacon-node-mainnet-stable-02[702486]:", "pid"=>"702486", "severity"=>7, "@timestamp"=>2022-01-28T13:38:31.435Z, "severity_name"=>"debug", "nim_protocol"=>"mainnet", "facility"=>22, "logsource"=>"metal-01.he-eu-hel1.nimbus.mainnet", "type"=>"rsyslog", "nim_network"=>"stable", "fleet"=>"nimbus.mainnet", "raw"=>"{\"lvl\":\"DBG\",\"ts\":\"2022-01-28 13:38:31.435+00:00\",\"msg\":\"A future has failed, enable trace logging for details\",\"topics\":\"libp2p connmanager\",\"error\":\"DialFailedError\"}", "host"=>"10.14.0.17", "facility_name"=>"local6", "logsource_ip"=>"95.217.87.121", "data_center"=>"he-eu-hel1", "@version"=>"1", "json"=>{"error"=>"DialFailedError", "ts"=>"2022-01-28 13:38:31.435+00:00", "topics"=>"libp2p connmanager", "lvl"=>"DBG", "msg"=>"A future has failed, enable trace logging for details"}, "program"=>"beacon-node-mainnet-stable-02"}], :response=>{"index"=>{"_index"=>"logstash-2022.01.28", "_type"=>"_doc", "_id"=>"PmXpoH4BY1PMA8j3cU1n", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"object mapping for [json.error] tried to parse field [error] as object, but found a concrete value"}}}}

Log examples:

{"lvl":"DBG","ts":"2022-01-28 13:44:48.180+00:00","msg":"A future has failed, enable trace logging for details","topics":"libp2p connmanager","error":"LPStreamEOFError"}
{"lvl":"DBG","ts":"2022-01-28 13:45:09.098+00:00","msg":"Dropping aggregate","error":{"Field0":2,"Field1":"Aggregate: validator has already aggregated in epoch"},"aggregate":{"aggregation_bits":"0x77f99df2ff75ffdfd77befdfe6bffb7f7e","data":{"slot":2239125,"index":15,"beacon_block_root":"cb492ed2","source":"69971:6cba3436","target":"69972:7d799c16"},"signature":"88b2ea26"},"wallSlot":2239125,"aggregator_index":223010,"signature":"973cc708","selection_proof":"ac9c621c"}

Once as string, once as object.

Also, why is there both err and error used in log messages?

@jakubgs
Copy link
Member Author

jakubgs commented Feb 17, 2022

Can we please fix this?

admin@node-01.he-eu-hel1.log-aggr.hq:~ % grep 'failed to parse field \[json.error\]' /var/log/docker/logstash-app/docker.log | wc -l
213592

@jakubgs
Copy link
Member Author

jakubgs commented Feb 17, 2022

For now I'm going to just ignore these fields in beacon node logs: https://github.com/status-im/infra-hq/commit/d0defdcf

@arnetheduck
Copy link
Member

Also, why is there both err and error used in log messages?

the log names are "free form", there's no standard - in an application composed of several libraries, it's difficult to keep all logs in sync globally (different libraries, different devs etc)

@arnetheduck
Copy link
Member

cc @Menduist @dryajov the "future has failed" messages should really be hunted down and handled, they are an indication that that somewhere, error flows where not properly considered

@jakubgs
Copy link
Member Author

jakubgs commented Feb 18, 2022

it's difficult to keep all logs in sync globally (different libraries, different devs etc)

So couldn't they just be stringified into a single text field? I don't know of a way to index fields that have mixed type.

What I could do is apply index: not_analyzed to the err and error fields so it's not indexed and avoid the errors altogether.

@arnetheduck
Copy link
Member

What I could do is apply index: not_analyzed to the err and error fields so it's not indexed and avoid the errors altogether.

can you do this only for the conflicty fields?

@jakubgs
Copy link
Member Author

jakubgs commented Feb 18, 2022

No, I don't think it's that clever.

@arnetheduck
Copy link
Member

What I could do is apply index: not_analyzed to the err and error fields so it's not indexed and avoid the errors altogether.

sgtm

@arnetheduck
Copy link
Member

it'll still index message, right?

@arnetheduck
Copy link
Member

So couldn't they just be stringified into a single text field? I don't know of a way to index fields that have mixed type.

because sometimes errors contain error-specific data beyond a string, and ideally that should be captured in a structured way

@jakubgs
Copy link
Member Author

jakubgs commented Feb 18, 2022

A field that's not indexed is still available in the store, just won't be easily or quickly searchable.

@jakubgs
Copy link
Member Author

jakubgs commented Feb 20, 2022

One idea I had, is that maybe I could force these fields to be strings on Logstash level, that way they can be indexed.

I'll see about that next week.

@Menduist
Copy link
Contributor

"future has failed"

Most probably coming from

proc onConnEvent(node: Eth2Node, peerId: PeerID, event: ConnEvent) {.async.} =

libp2p expects that conn event handlers don't fail, but errors are not handled properly in this conn handler

Shouldn't cause anything bad though

@jakubgs
Copy link
Member Author

jakubgs commented Feb 21, 2022

I though I could use json_encode filter, but it appears to not be installed by default:

Trying to load the json_encode filter plugin resulted in this error:
Unable to load the requested plugin named json_encode of type filter. The plugin is not installed.

@jakubgs
Copy link
Member Author

jakubgs commented Feb 21, 2022

I've added configuration in Logstash to stringify json.error and json.err fields: https://github.com/status-im/infra-hq/commit/3168e286

    # Avoid errors due to mixed field data types.
    if ([json][error]) {
        json_encode {
          source => "[json][error]"
          target => "[json][error]"
        }
    }
    if ([json][err]) {
        json_encode {
          source => "[json][err]"
          target => "[json][err]"
        }
    }

https://github.com/status-im/infra-hq/blob/3168e286e417afad19f73c46dd7bff9e464a0455/ansible/roles/logstash/templates/pipe/52-filter-beacon-node.conf#L19-L31

Which results in a string that's unnecessarily quotes bare strings, but at least any format of error will be parsed and indexed:

image

What do you think @arnetheduck ?

@tersec
Copy link
Contributor

tersec commented Feb 11, 2024

#5839 and #5878 switch Dropping aggregate and the other Dropping gossip message logging to use the reason, not error field. Is this issue still applicable?

@jakubgs
Copy link
Member Author

jakubgs commented Feb 12, 2024

I think this can be closed. Thanks.

@jakubgs jakubgs closed this as completed Feb 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working 🛂 logging
Projects
None yet
Development

No branches or pull requests

4 participants