support input transform pipeline on raw log lines #148

Open
bkcsfi opened this issue Dec 14, 2018 · 2 comments
bkcsfi commented Dec 14, 2018

I am running a web server in a Docker container, managed by ContainerPilot inside the container and deployed via docker stack.

ContainerPilot captures stdout from the web server and prefixes it with additional fields before writing to its own stdout, which Logagent eventually processes.

e.g. a sample output line as seen by Logagent and recorded as 'message' in ELK:

2018-12-14T16:22:40.497203741Z quote-server 20 10.0.6.16 - - [14/Dec/2018:16:22:40 +0000] "GET /-/metrics?module=http_2xx&target=10.0.6.24%3A8000 HTTP/1.1" 200 12485 "-" "Prometheus/2.5.0"

or, as seen by the docker logs command:

2018-12-14T20:54:35.108408562Z 2018-12-14T20:54:35.108156363Z quote-server 20 127.0.0.1 - - [14/Dec/2018:20:54:35 +0000] "GET /-/health HTTP/1.1" 200 16 "-" "curl/7.52.1"
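A prefix of this shape could be stripped with a simple regex before normal pattern matching runs. A minimal sketch (the function and field names are my own illustration, not part of Logagent or ContainerPilot):

```javascript
// Hypothetical helper: split a containerpilot-style prefix
// "<ISO timestamp ending in Z> <process name> <number> <original line>"
// into the original line plus captured metadata.
const PREFIX_RE = /^(\S+Z)\s+(\S+)\s+(\d+)\s+(.*)$/

function stripContainerpilotPrefix (line) {
  const m = PREFIX_RE.exec(line)
  if (!m) {
    return { line: line, meta: null } // unprefixed lines pass through untouched
  }
  return {
    line: m[4], // the web server's original output
    meta: { timestamp: m[1], process: m[2], loglevel: m[3] }
  }
}
```

Whether the numeric field is a log level or a PID depends on the ContainerPilot configuration; the name `loglevel` above is just an assumption based on the description in this issue.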

I can handle this situation by copying the existing httpd pattern and adding extra fields corresponding to the datestamp, process name, and log level.

However, I will have other processes deployed in a similar manner that won't be web servers. Rather than writing custom plugins just to handle the additional fields prefixed by ContainerPilot, I wonder if it would be possible to have some kind of globalTransform that runs BEFORE the existing pattern matching, on non-JSON input.

This could be something like the grep input filter, except with the ability to:

a. transform the data before passing it on to the callback for subsequent processing

b. optionally capture metadata at this stage of the pipeline (such as the log level, in my example)

In some sense, this could be a group of transforms that each try to match the raw input line; the first input transform that "matches" would be the only one allowed to alter the raw input line, and processing would then continue just as it does now with input filters and patterns.
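The "first match wins" idea above could be sketched roughly like this (a hypothetical design, not an existing Logagent feature; names are illustrative):

```javascript
// Hypothetical transform table: each entry tries to rewrite the raw
// line; only the first matching transform runs, then normal input
// filters and pattern matching would continue on the rewritten line.
const transforms = [
  {
    // containerpilot-style prefix: "<timestamp> <process> <number> <rest>"
    match: /^(\S+Z)\s+(\S+)\s+(\d+)\s+(.*)$/,
    apply: (m) => ({
      line: m[4],
      fields: { '@timestamp': m[1], process: m[2], loglevel: m[3] }
    })
  }
  // further transforms for other prefix formats would go here
]

function applyFirstMatch (rawLine) {
  for (const t of transforms) {
    const m = t.match.exec(rawLine)
    if (m) {
      return t.apply(m) // first matching transform wins
    }
  }
  return { line: rawLine, fields: {} } // no transform matched
}
```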

Maybe this could also handle the case where a container application outputs JSON, which needs to be converted back into regular text for pattern matching (e.g. a transform from JSON to 'text'), or even nested JSON extraction, etc.
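A JSON-to-text step like that might look as follows. This is only a sketch; the `log` field name is an assumption borrowed from Docker's json-file logging driver, which wraps each line as `{"log": "...", "stream": "...", "time": "..."}`:

```javascript
// Hypothetical JSON-to-text transform: if the raw line is a JSON
// object with an embedded plain-text line, extract it so the usual
// pattern matching can run on the text; otherwise pass through.
function jsonToText (rawLine) {
  try {
    const obj = JSON.parse(rawLine)
    if (typeof obj.log === 'string') {
      return obj.log
    }
  } catch (e) {
    // not JSON, fall through and return the line unchanged
  }
  return rawLine
}
```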

megastef (Contributor) commented Dec 17, 2018

I did not know about ContainerPilot.

+1 to creating an input filter for ContainerPilot. The input filter could set a log context object, and an output filter could add the fields from the log context back to the log message object. Just an idea for implementing what you want with the existing Logagent mechanisms.
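The input-filter/output-filter pairing suggested above could be sketched like this. The function shapes follow what I understand of Logagent's filter interfaces (input filters receiving the raw line with a callback, output filters receiving the parsed event), but the shared `logContext` store and all names here are my own assumptions for illustration:

```javascript
// Assumed shared store keyed by log source, linking the two filters.
const logContext = {}

// Input filter sketch: strip the containerpilot-style prefix and
// remember the captured metadata for this source.
function containerpilotInputFilter (sourceName, config, data, callback) {
  const m = /^(\S+Z)\s+(\S+)\s+(\d+)\s+(.*)$/.exec(data)
  if (m) {
    logContext[sourceName] = { process: m[2], loglevel: m[3] }
    return callback(null, m[4]) // pass the unprefixed line downstream
  }
  callback(null, data) // no prefix: pass through unchanged
}

// Output filter sketch: merge the remembered metadata back into the
// parsed log event before it is shipped.
function containerpilotOutputFilter (context, config, eventEmitter, data, callback) {
  const meta = logContext[context.sourceName]
  if (meta) {
    Object.assign(data, meta)
  }
  callback(null, data)
}
```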

If you have more questions, please feel free to reach out ...

megastef (Contributor) commented Apr 4, 2019

@bkcsfi We recently did something similar, parsing containerd log headers.
See the plugin code: https://github.com/sematext/logagent-js/blob/master/lib/plugins/input-filter/kubernetesContainerd.js
