# Grokker

This presentation introduces the features of the `Grokker` processor and shows how to configure it.

### The challenge

I want to dissect a field into different target fields using Logstash-based grok patterns.

from this:

In [1]:
document = {"message": "2020-07-16T19:20:30.45+01:00 DEBUG This is a sample log"}

to this:

In [2]:
expected = {
    "message": "2020-07-16T19:20:30.45+01:00 DEBUG This is a sample log",
    "@timestamp": "2020-07-16T19:20:30.45+01:00",
    "logLevel": "DEBUG",
    "logMessage": "This is a sample log",
}

### Create rule and processor

create the rule:

In [3]:
import sys
sys.path.append("../../../../../")
import tempfile
from pathlib import Path

rule_yaml = """---
filter: "message"
grokker:
  mapping:
    message: "%{TIMESTAMP_ISO8601:@timestamp} %{LOGLEVEL:logLevel} %{GREEDYDATA:logMessage}"
"""

rule_path = Path(tempfile.gettempdir()) / "grokker"
rule_path.mkdir(exist_ok=True)
rule_file = rule_path / "grokker.yml"
rule_file.write_text(rule_yaml)

135
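To build some intuition for what the grok pattern in the rule does, here is a rough sketch using only Python's `re` module. It is not how `Grokker` is implemented: the three regex fragments are simplified stand-ins (assumptions) for `TIMESTAMP_ISO8601`, `LOGLEVEL` and `GREEDYDATA`, and `dissect` is a hypothetical helper. Python group names cannot contain `@`, so the sketch captures `timestamp` and writes it to `@timestamp` afterwards.

```python
import re

# Simplified stand-ins for the grok patterns (assumptions, the real
# grok definitions accept many more variants).
TIMESTAMP = r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})"
LOGLEVEL = r"[A-Z]+"
GREEDYDATA = r".*"

# Each %{PATTERN:field} in the rule corresponds to one named group here.
PATTERN = re.compile(
    rf"(?P<timestamp>{TIMESTAMP}) (?P<logLevel>{LOGLEVEL}) (?P<logMessage>{GREEDYDATA})"
)


def dissect(event):
    """Return a copy of `event` with the captured groups added as new fields."""
    result = dict(event)
    match = PATTERN.match(event["message"])
    if match is None:
        # No match: leave the event unchanged.
        return result
    result["@timestamp"] = match["timestamp"]
    result["logLevel"] = match["logLevel"]
    result["logMessage"] = match["logMessage"]
    return result


sample = {"message": "2020-07-16T19:20:30.45+01:00 DEBUG This is a sample log"}
print(dissect(sample))
```

The original `message` field is kept and the captured values are added next to it, which mirrors the expected result from the challenge.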

create the processor config:

In [4]:
processor_config = {
    "mygrokker": {
        "type": "grokker",
        "specific_rules": [str(rule_path)],
        "generic_rules": ["/dev"],
    }
}

create the processor with the factory:

In [5]:
from unittest import mock
from logprep.factory import Factory

mock_logger = mock.MagicMock()
grokker = Factory.create(processor_config, mock_logger)
grokker.setup()

### Process event

In [6]:
from copy import deepcopy
mydocument = deepcopy(document)


print(f"before: {mydocument}")
grokker.process(mydocument)
print(f"after: {mydocument}")
print(mydocument == expected)

before: {'message': '2020-07-16T19:20:30.45+01:00 DEBUG This is a sample log'}
after: {'message': '2020-07-16T19:20:30.45+01:00 DEBUG This is a sample log', '@timestamp': '2020-07-16T19:20:30.45+01:00', 'logLevel': 'DEBUG', 'logMessage': 'This is a sample log'}
True
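Besides comparing against `expected`, it can be handy to list only the fields that processing added. The following is a minimal sketch in plain Python; `added_fields` is a hypothetical helper, not part of logprep, and the `update` call merely stands in for `grokker.process` by reusing the values from the output above.

```python
from copy import deepcopy


def added_fields(before, after):
    """Return the key/value pairs present in `after` but not in `before`."""
    return {key: value for key, value in after.items() if key not in before}


before = {"message": "2020-07-16T19:20:30.45+01:00 DEBUG This is a sample log"}
after = deepcopy(before)

# Stand-in for grokker.process(after): values copied from the output above.
after.update(
    {
        "@timestamp": "2020-07-16T19:20:30.45+01:00",
        "logLevel": "DEBUG",
        "logMessage": "This is a sample log",
    }
)

print(added_fields(before, after))
```

Working on a deep copy, as the cell above does, keeps the original document available for this kind of before/after comparison.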
