Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better logging with MDC and maybe jsonevent-layout #36

Closed
danizen opened this issue Mar 6, 2019 · 4 comments
Closed

Better logging with MDC and maybe jsonevent-layout #36

danizen opened this issue Mar 6, 2019 · 4 comments

Comments

@danizen
Copy link
Contributor

danizen commented Mar 6, 2019

@simonjupp , I have been working on enhancing logging a bit.

One piece of the puzzle has been getting to a structured lodestar.log - and the simplest way to do that is by adding a dependency for https://github.com/logstash/log4j-jsonevent-layout, e.g.

        <dependency>
            <groupId>net.logstash.log4j</groupId>
            <artifactId>jsonevent-layout</artifactId>
            <version>1.7</version>
        </dependency>

However, another part of it is to make the query and other parameters appear as structured extra variables, something done with Mapped Diagnostic Context, something like

MDC.put("query", query);
...
MDC.remove("query");

I noted some very interesting stuff in the 1.4 branch, which is what I'm running with currently:

log.info("What user information can we retrieve here");

I'm tempted also to upgrade to log4j 2, which would mean using a different Maven package for structured logging. The benefit of upgrading to log4j 2 is that there would be current documentation for that configuration. log4j 1.2 is getting so old that a search on apache log4j is going to find version 2.

@simonjupp
Copy link
Collaborator

Sounds good to me. The 'interesting stuff' was some additional logging that @LLTommy added so we could do some deeper analysis of queries from the logs. We have an ELK stack configured to analyse this output.

@danizen
Copy link
Contributor Author

danizen commented Mar 7, 2019

Sounds good to me. The 'interesting stuff' was some additional logging that @LLTommy added
so we could do some deeper analysis of queries from the logs. We have an ELK stack configured
to analyse this output.

Would you prefer a pull request to the 1.4 branch or to master? Which is really most current at this point?

@danizen
Copy link
Contributor Author

danizen commented Mar 7, 2019

@simonjupp , I decided I should of course merge to master, but that 1.4-release should also be merged into master. So, I've created a pull request just for that. I'll proceed on the assumption that 1.4 release should be merged first, but if I send a new pull request, it will be against master.

@danizen
Copy link
Contributor Author

danizen commented Mar 8, 2019

I've looked at three different ways to get to logs already in JSON, e.g. each line is JSON:

The first two are part of the logstash organization on github, but the last is just some guy. So, I think it is best to stick with log4j 1.2.17, however, moving to a JSON encoding and making use of Mapped Diagnostic Context (MDC) would be a better way to capture the extra information - you guys probably have some complicated logstash configuration turning multiple log messages into one event, or you just look at them and know.

With the first of these, I was able to achieve the following log entry from our SPARQL endpoint:

{
    "@timestamp": "2019-03-07T23:32:36.097Z",
    "source_host": "NLM01890936OCCS",
    "file": "SparqlController.java",
    "method": "query",
    "level": "INFO",
    "line_number": "74",
    "thread_name": "http-nio-8080-exec-2",
    "@version": 1,
    "logger_name": "gov.nih.nlm.lode.api",
    "message": "sparql query",
    "class": "gov.nih.nlm.lode.servlet.SparqlController",
    "mdc": {
        "cliaddr": "0:0:0:0:0:0:0:1",
        "format": "JSON",
        "query": "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\r\nPREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\r\nPREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\r\nPREFIX owl: <http://www.w3.org/2002/07/owl#>\r\nPREFIX meshv: <http://id.nlm.nih.gov/mesh/vocab#>\r\nPREFIX mesh: <http://id.nlm.nih.gov/mesh/>\r\nPREFIX mesh2019: <http://id.nlm.nih.gov/mesh/2019/>\r\nPREFIX mesh2018: <http://id.nlm.nih.gov/mesh/2018/>\r\nPREFIX mesh2017: <http://id.nlm.nih.gov/mesh/2017/>\r\n\r\nSELECT distinct ?d ?dLabel \r\nFROM <http://id.nlm.nih.gov/mesh>\r\nWHERE {\r\n  ?d meshv:allowableQualifier ?q .\r\n  ?q rdfs:label 'adverse effects'@en . \r\n  ?d rdfs:label ?dLabel . \r\n} \r\nORDER BY ?dLabel \r\n",
        "session-id": "C4A0A12448D59EB069F628C00F21A2A5",
        "offset": 0,
        "session-time": "2019-03-07T18:32:35.018-05:00",
        "inference": false,
        "ua": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36",
        "limit": 50,
        "webui": true
    }
}

@danizen danizen closed this as completed Mar 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants