Implement info logging #11

Closed
renekrie opened this issue Mar 16, 2021 · 2 comments

@renekrie
Collaborator

renekrie commented Mar 16, 2021

Implement the ES equivalent of info logging (https://docs.querqy.org/querqy/solr-plugin-configuration.html#info-logging).

The goal is to track information emitted by rewriters. Querqy Core already provides an InfoLogging framework, and the Common Rules Rewriter already reports which rules were applied. As part of the general info logging framework, this information can be sent to a Sink.

In Solr, the only Sink implementation returns the log messages as part of the search request response. This option is not available in ES as we cannot manipulate the response. The idea is to create a Sink implementation that simply logs the messages using Java logging (as usual in ES). We should provide some request parameter that will be passed through and appended to the log message so that the message can be related to a query or to a request id.

@renekrie
Collaborator Author

renekrie commented Mar 26, 2021

First implementation (commit f0dda61):

Logging must be enabled per rewriter. This must be done in the rewriter configuration (https://docs.querqy.org/querqy/rewriters.html). For this, the rewriter configuration is extended by an info_logging object. It specifies to which logging sinks the log messages should be routed. Currently, the only available sink is log4j:

{
  "class": "querqy.elasticsearch.rewriter.SimpleCommonRulesRewriterFactory",
  "info_logging": {
    "sinks": ["log4j"]
  },
  "config": {
    "rules": "notebook =>\nSYNONYM: laptop"
  }
}
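As a minimal sketch of how a client might enable logging for an existing rewriter definition, the helper below adds the info_logging object to a configuration before it is uploaded. The names with_info_logging and KNOWN_SINKS are hypothetical and not part of Querqy; the only sink assumed to exist is log4j, as stated above. The sketch only builds the JSON body and does not talk to Elasticsearch.

```python
import copy
import json

# Sinks named in this issue; "log4j" is the only one mentioned so far.
KNOWN_SINKS = {"log4j"}


def with_info_logging(rewriter_config, sinks=("log4j",)):
    """Return a copy of rewriter_config with an info_logging object added.

    Hypothetical helper for illustration; it leaves the input untouched.
    """
    unknown = [s for s in sinks if s not in KNOWN_SINKS]
    if unknown:
        raise ValueError(f"unknown sink(s): {unknown}")
    enabled = copy.deepcopy(rewriter_config)
    enabled["info_logging"] = {"sinks": list(sinks)}
    return enabled


base = {
    "class": "querqy.elasticsearch.rewriter.SimpleCommonRulesRewriterFactory",
    "config": {"rules": "notebook =>\nSYNONYM: laptop"},
}
body = with_info_logging(base)
print(json.dumps(body, indent=2))
```

Copying the configuration rather than mutating it keeps the version without logging available, which may be handy when only some rewriters should log.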

We can add an additional specification for logging per search request:

{
  "query": {
    "querqy": {
      "matching_query": {
        "query": "notebook"
      },
      "query_fields": [
        "title^3.0", "brand^2.1", "shortSummary"
      ],
      "info_logging": {
        "id": "Some request or query identifier",
        "type": "DETAIL"
      },
      "rewriters": [
        "common_rules"
      ]
    }
  }
}

The id can be used to identify the request or the query or anything else in the log output. type defines what will be written to the log output. Valid values:

  • REWRITER_ID (default): Only the rewriter ID (like 'common_rules') will be logged, provided that the rewriter issues a log message at all
  • DETAIL: Log a map with rewriter IDs as keys and their complete log messages as values
  • NONE: Don't log anything.
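The request-level settings above could be assembled on the client side roughly as follows. This is a sketch only: querqy_query and VALID_TYPES are hypothetical names, the request shape is copied from the example in this comment, and the set of valid type values is taken from the list above. Whether the id is required is not stated here, so the sketch simply always sends it.

```python
import json

# Values listed in this comment; REWRITER_ID is described as the default.
VALID_TYPES = ("REWRITER_ID", "DETAIL", "NONE")


def querqy_query(query, query_fields, rewriters, log_id, log_type="REWRITER_ID"):
    """Build a querqy search request body with per-request info logging.

    Hypothetical helper; it only builds a dict and sends nothing.
    """
    if log_type not in VALID_TYPES:
        raise ValueError(f"type must be one of {VALID_TYPES}, got {log_type!r}")
    return {
        "query": {
            "querqy": {
                "matching_query": {"query": query},
                "query_fields": list(query_fields),
                "info_logging": {"id": log_id, "type": log_type},
                "rewriters": list(rewriters),
            }
        }
    }


request = querqy_query(
    "notebook",
    ["title^3.0", "brand^2.1", "shortSummary"],
    ["common_rules"],
    log_id="id-1001",
    log_type="DETAIL",
)
print(json.dumps(request, indent=2))
```

Rejecting unknown type values before the request is sent is a design choice of this sketch, not documented plugin behaviour.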

Example log messages:

Rewriters 'common_rules1' and 'common_rules2' logging their full details. Request id is id-1001, the 'applied rules' had their log messages configured as 'msg1' and 'msg2':

[2021-03-26T13:23:43,006][INFO ][q.e.i.Log4jSink ] [node_s_0]DETAIL[ QUERQY ] {"id":"id-1001","msg":{"common_rules1":[{"APPLIED_RULES":["msg1"]}],"common_rules2":[{"APPLIED_RULES":["msg2"]}]}}

Rewriter 'common_rules' logging its rewriter ID. Request id is id-1002:

[2021-03-26T13:28:47,454][INFO ][q.e.i.Log4jSink ] [node_s_0]REWRITER_ID[ QUERQY ] {"id":"id-1002","msg":["common_rules"]}

The log output prints out DETAIL[ QUERQY ] and REWRITER_ID[ QUERQY ]. These are Log4j markers that you can use to route/configure the log output. Both cases use the parent marker QUERQY and they add a child marker depending on whether the log message type is DETAIL or REWRITER_ID.
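To relate log lines back to requests, the JSON payload can be pulled out of each line. The following sketch assumes the exact line layout of the two example messages above (timestamp, level, logger, node name, child marker, parent marker, JSON payload); the real layout depends on the Log4j pattern configured for the node, so the regular expression is an assumption and would need adjusting. The function name parse_querqy_line is hypothetical.

```python
import json
import re

# Layout assumed from the two example lines in this comment only.
LINE_PATTERN = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]"
    r"\[(?P<level>\w+)\s*\]"
    r"\[(?P<logger>[^\]\s]+)\s*\]"
    r"\s*\[(?P<node>[^\]]+)\]"
    r"(?P<marker>\w+)\[ (?P<parent>\w+) \]"
    r"\s+(?P<payload>\{.*\})\s*$"
)


def parse_querqy_line(line):
    """Return the parsed fields of a Querqy info log line, or None if it does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # The payload is the JSON object carrying the request id and the message.
    fields["payload"] = json.loads(fields["payload"])
    return fields


detail_line = (
    '[2021-03-26T13:23:43,006][INFO ][q.e.i.Log4jSink ] [node_s_0]DETAIL[ QUERQY ] '
    '{"id":"id-1001","msg":{"common_rules1":[{"APPLIED_RULES":["msg1"]}],'
    '"common_rules2":[{"APPLIED_RULES":["msg2"]}]}}'
)
rewriter_id_line = (
    '[2021-03-26T13:28:47,454][INFO ][q.e.i.Log4jSink ] [node_s_0]REWRITER_ID[ QUERQY ] '
    '{"id":"id-1002","msg":["common_rules"]}'
)

detail = parse_querqy_line(detail_line)
rewriter_id = parse_querqy_line(rewriter_id_line)
print(detail["marker"], detail["payload"]["id"])
print(rewriter_id["marker"], rewriter_id["payload"]["msg"])
```

Note that msg is a map for DETAIL and a list for REWRITER_ID, so downstream code has to branch on the child marker.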

Unfortunately, we will have to change the mapping for the .querqy index in ES to:

{
  "properties": {
    "class": {"type": "keyword"},
    "type": {"type": "keyword"},
    "info_logging": {
      "properties": {
        "sinks": {"type": "keyword"}
      }
    },
    "config": {
      "type": "keyword",
      "index": false
    }
  }
}

We need to add the info_logging object. TODO: check/update this on startup or leave it to Querqy users.
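For the startup check mentioned in the TODO, the decision logic could look roughly like this. It is a sketch under assumptions: info_logging_mapping_update is a hypothetical name, the input is taken to be the properties object of the current .querqy mapping, and fetching or applying the mapping against Elasticsearch is deliberately left out.

```python
# Mapping fragment for the new object, copied from the mapping above.
INFO_LOGGING_MAPPING = {"properties": {"sinks": {"type": "keyword"}}}


def info_logging_mapping_update(current_properties):
    """Return the mapping body to add, or None if info_logging is already mapped.

    current_properties is assumed to be the "properties" object of the
    existing .querqy index mapping.
    """
    if "info_logging" in current_properties:
        return None
    return {"properties": {"info_logging": INFO_LOGGING_MAPPING}}


old_properties = {
    "class": {"type": "keyword"},
    "type": {"type": "keyword"},
    "config": {"type": "keyword", "index": False},
}
update = info_logging_mapping_update(old_properties)
print(update)
```

Only the missing field is returned, so an index that already has the new mapping is left alone.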

@renekrie
Collaborator Author

renekrie commented May 9, 2021

Implemented via #13
