[Beats] Hints autodiscovery support with Filestream input #35984

Closed
gizas opened this issue Jul 3, 2023 · 4 comments
Labels
Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team

Comments

@gizas
Contributor

gizas commented Jul 3, 2023

Describe the enhancement:
The filestream input is the recommended input type for log processing with Filebeat.

Hints-based autodiscovery is currently built on the container input type (see code here), so by default logs are collected with the container input.

With this enhancement, users should be able to configure hints-based autodiscovery with the filestream input type, so that:

  1. The existing co.elastic.logs/* hint annotations on the application pods keep working, and
  2. Those annotations configure the relevant parsers of the filestream input.

Describe a specific use case for the enhancement or feature:

Filebeat configuration:

filebeat.autodiscover:
      providers:
        - type: kubernetes
          node: ${NODE_NAME}
          hints.enabled: true
          hints.default_config:
            type: filestream
            prospector.scanner.symlinks: true
            id: filestream-kubernetes-pod-${data.kubernetes.container.id}
            take_over: true
            paths:
            - /var/log/containers/*-${data.kubernetes.container.id}.log
            parsers:
            - container: ~

Note that type is set to filestream above.

The user then defines the following annotations on the pod:

annotations:
        co.elastic.logs/json.add_error_key: "true"
        co.elastic.logs/json.expand_keys: "true"
        co.elastic.logs/json.ignore_decoding_error: "true"
        co.elastic.logs/json.keys_under_root: "true"
        co.elastic.logs/json.message_key: "message"

And those will produce the following block:

parsers:
   - ndjson:
         add_error_key: "true"
         ignore_decoding_error: "true"
         expand_keys: "true"
         keys_under_root: "true"
         message_key: "message"

So the overall Filebeat configuration should become:

filebeat.autodiscover:
      providers:
        - type: kubernetes
          node: ${NODE_NAME}
          hints.enabled: true
          hints.default_config:
            type: filestream
            prospector.scanner.symlinks: true
            id: filestream-kubernetes-pod-${data.kubernetes.container.id}
            take_over: true
            paths:
            - /var/log/containers/*-${data.kubernetes.container.id}.log
            parsers:
            - container: ~
            - ndjson:
                add_error_key: "true"
                ignore_decoding_error: "true"
                expand_keys: "true"
                keys_under_root: "true"
                message_key: "message"

The same logic needs to be supported for the rest of the parsers.
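As a rough illustration of what this could look like for another parser, the multiline hints might translate as sketched below. This is an assumption about the eventual mapping, not implemented behavior: the co.elastic.logs/multiline.* hints already exist for the container input, while the filestream multiline parser also needs a type, which is assumed here to be filled in as pattern.

```yaml
# Pod annotations (existing multiline hints)
annotations:
  co.elastic.logs/multiline.pattern: '^\['
  co.elastic.logs/multiline.negate: "true"
  co.elastic.logs/multiline.match: "after"
---
# Parsers block the hints might produce for filestream (assumed mapping)
parsers:
  - container: ~
  - multiline:
      type: pattern      # assumed default; not part of the hints above
      pattern: '^\['
      negate: true
      match: after
```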

This was previously discussed in #34354; see that issue for additional context.

@gizas gizas added the Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team label Jul 3, 2023
@gizas gizas changed the title Hints autodiscovery support with Filestream input [Beats] Hints autodiscovery support with Filestream input Jul 3, 2023
@ChrsMark
Member

ChrsMark commented Jul 3, 2023

Thanks @gizas for filing this. I'm gonna close #34354 then and reference it here in the description, so we don't lose the content and discussions that happened there.

@MichaelKatsoulis
Contributor

All the json options of the log input are identical to the ndjson options of filestream, except one.

In the log input, the keys_under_root boolean controls whether the decoded JSON values are placed under a new json object or at the root of the event.

In the filestream input, the ndjson target option explicitly sets the object under which the decoded data is placed. If it is left empty, the data goes under the root (equivalent to keys_under_root == true).
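To make the difference concrete, the two input definitions below should both place the decoded keys at the root of the event. This is only a minimal sketch; the paths and the filestream id are placeholders, not values from this issue.

```yaml
# log input: keys_under_root moves the decoded keys to the root
- type: log
  paths:
    - /var/log/app/*.log        # placeholder path
  json.keys_under_root: true

# filestream input: an empty ndjson target has the same effect
- type: filestream
  id: app-logs                  # placeholder id
  paths:
    - /var/log/app/*.log        # placeholder path
  parsers:
    - ndjson:
        target: ""
```

With json.keys_under_root left at its default of false, the log input puts the decoded data under a json key, which on the filestream side corresponds to target: "json".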

We need to decide how to tackle this. There are two options:

  1. If keys_under_root is false in the hints, or not set (it defaults to false), we set target=json. If keys_under_root=true, we leave target empty (which defaults to root).
  2. We also allow users to set the target option in the hints. This needs to be documented, because right now the hints doc links only to the log input's json options. We should also link to filestream's ndjson options and note that when filestream is the default input, the ndjson options should be used, while keeping the same hints format co.elastic.logs/json.*

@gizas, @ChrsMark what are your thoughts on this?
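For reference, option 1 would roughly mean that the generated parsers block depends on the keys_under_root hint as sketched here. This shows the proposed translation only; nothing in it is implemented behavior.

```yaml
# json hints present, keys_under_root missing or "false"
# -> decoded data goes under the json object
parsers:
  - container: ~
  - ndjson:
      target: "json"
---
# co.elastic.logs/json.keys_under_root: "true"
# -> decoded data goes to the root of the event
parsers:
  - container: ~
  - ndjson:
      target: ""
```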

@gizas
Contributor Author

gizas commented Oct 17, 2023

For me, we need to document and make clear:

  • co.elastic.logs/json.* with the Container input -> full example with parameters -> link to the log input's json options
  • co.elastic.logs/json.* with the Filestream input -> full example with parameters -> link to the ndjson parser

And in the second case, with filestream, I would allow users to set target.

@ChrsMark
Member

+1 for the 2nd option. Since only the Filestream input will be used under the hood, I would create a docs matrix that maps the Log input's json options to the Filestream ones.
Most of them will be the same, right?
For keys_under_root, we would map it to target, and we can additionally introduce a new explicit hint called target for those who would like to use it directly.

This way we don't break existing users, and we also have an updated set of hints that corresponds to the Filestream input.
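A sketch of how the explicit hint could look from the user's side, assuming it follows the existing co.elastic.logs/json.* naming. The json.target annotation is the proposed hint and does not exist yet, and the value app is made up for the example.

```yaml
# Pod annotations
annotations:
  co.elastic.logs/json.message_key: "message"
  co.elastic.logs/json.target: "app"   # proposed hint, hypothetical value
---
# Parsers block the hints would be expected to generate
parsers:
  - container: ~
  - ndjson:
      message_key: "message"
      target: "app"
```

Pods that keep using only keys_under_root would still get a target derived from it, so existing annotations would not need to change.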
