Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Event log file gets flooded with copy_fields processor error #5299

Open
belimawr opened this issue Aug 14, 2024 · 5 comments
Open

Event log file gets flooded with copy_fields processor error #5299

belimawr opened this issue Aug 14, 2024 · 5 comments
Labels
bug Something isn't working Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team

Comments

@belimawr
Copy link
Contributor

For confirmed bugs, please report:

  • Version: 8.15.0

The filestream-monitoring deployed by Elastic-Agent to collect its own logs is flooding the event logs with an copy_fields processor error:

{
  "log.level": "debug",
  "@timestamp": "2024-08-14T13:40:27.920Z",
  "message": "Failed to copy fields in copy_fields processor: target field data_stream.dataset already exists, drop or rename this field first",
  "component": {
    "binary": "filebeat",
    "dataset": "elastic_agent.filebeat",
    "id": "filestream-monitoring",
    "type": "filestream"
  },
  "log": {
    "source": "filestream-monitoring"
  },
  "log.type": "event",
  "ecs.version": "1.6.0",
  "log.logger": "copy_fields",
  "log.origin": {
    "file.line": 81,
    "file.name": "actions/copy_fields.go",
    "function": "github.com/elastic/beats/v7/libbeat/processors/actions.(*copyFields).Run"
  },
  "service.name": "filebeat"
}

On my test, that accounted for about 38% of the entries in the event log.

This seems to be coming from the following chain of processors (from components/filestream-monitoring/beat-rendered-config.yml):

        - copy_fields:
            fields:
                - from: data_stream.dataset
                  to: data_stream.dataset_original
        - drop_fields:
            fields:
                - data_stream.dataset
        - copy_fields:
            fail_on_error: false
            fields:
                - from: component.dataset
                  to: data_stream.dataset
            ignore_missing: true
        - copy_fields:
            fail_on_error: false
            fields:
                - from: data_stream.dataset_original
                  to: data_stream.dataset

The third and forth processors copy different fields to the same destination, so if the third runs successfully, then the forth will always fail and generate the log message above.

While this can be the intended behaviour (trying to set data_stream.dataset from multiple sources), it is flooding our logs.

We can use some processor conditions to avoid running the copy_fields processor if the field is already present, thus avoiding the flood of debug logs.

Steps to reproduce

  • Generate logs with a reasonable rate for testing, like 1 event per second, keep it running
  • Add the Custom Logs integration to an Elastic-Agent policy to ingest the log file
  • Let the Elastic-Agent run for a minute or so
  • Collect the diagnostics
  • Inspect the event log file (logs/elastic-agent-8.15.0-25075f/events/*.ndjosn)
@belimawr belimawr added bug Something isn't working Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team labels Aug 14, 2024
@elasticmachine
Copy link
Contributor

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@belimawr
Copy link
Contributor Author

Another option is to move the log entry inside the fail_on_error check, so we only log in case the processor actually failed.

Here is where we're logging it: https://github.com/elastic/beats/blob/f6b8701e8c8034836becb9ccaf3f4b2449fc589f/libbeat/processors/actions/copy_fields.go#L81C4-L87

@blakerouse
Copy link
Contributor

@belimawr I believe the change in beats is a better change, seems weird that a fail_on_error: true would flood the logs.

@belimawr
Copy link
Contributor Author

@belimawr I believe the change in beats is a better change, seems weird that a fail_on_error: true would flood the logs.

Did you mean fail_on_error: false? I'd expect the processor to be verbose if it fails so users can easily know something is not working as expected.

Even with fail_on_error: false it fails on every event, hence, it floods the logs. Regardless of changing when/how it logs, I believe the Elastic-Agent should not be relying on processors silently failing when trying to populate some fields, we have mechanisms to avoid this, and I believe they should be used.

@jgreene-TrappTech
Copy link

jgreene-TrappTech commented Oct 26, 2024

Bump

{
    "log.level": "debug",
    "@timestamp": "2024-10-26T09:53:50.428Z",
    "message": "Failed to copy fields in copy_fields processor: target field data_stream.dataset already exists, drop or rename this field first",
    "component": {
        "binary": "filebeat",
        "dataset": "elastic_agent.filebeat",
        "id": "filestream-monitoring",
        "type": "filestream"
    },
    "log": {
        "source": "filestream-monitoring"
    },
    "log.logger": "copy_fields",
    "log.origin": {
        "file.line": 81,
        "file.name": "actions/copy_fields.go",
        "function": "github.com/elastic/beats/v7/libbeat/processors/actions.(*copyFields).Run"
    },
    "service.name": "filebeat",
    "log.type": "event",
    "ecs.version": "1.6.0",
    "ecs.version": "1.6.0"
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team
Projects
None yet
Development

No branches or pull requests

4 participants