Integration for 30s metrics #3854

Open
3 tasks
leehinman opened this issue Nov 30, 2023 · 4 comments
@leehinman
Contributor

Describe the enhancement:

Pull together #3826 and #3853 into an integration

Describe a specific use case for the enhancement or feature:

Easy for users & developers to install

What is the definition of done?

  • integration package that can be installed
  • Loads Dashboards
  • Loads Alerts
@leehinman leehinman self-assigned this Nov 30, 2023
@leehinman
Contributor Author

The starting point will be an integration based on the filestream input. The variable to configure will be the path to the logs from the unzipped diagnostic bundle.

Logs will be ingested into data_stream.dataset: elastic_agent_diagnostics.logs

The ndjson parser for filestream should be used, with publisher_pipeline.disable_host set to true:

parsers:
- ndjson:
    expand_keys: true
    overwrite_keys: true
    add_error_key: true
publisher_pipeline.disable_host: true
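
As a rough illustration of what expand_keys does to an ingested line, the sketch below expands dotted keys into nested objects, so that {"a.b": 1} becomes {"a": {"b": 1}}. It is not Filebeat's implementation; the function name `expand_keys` and the sample log line are hypothetical, and conflicts between a dotted key and an existing non-object value are not handled.

```python
import json


def expand_keys(flat: dict) -> dict:
    """Recursively expand dotted keys into nested dicts (illustration only)."""
    out: dict = {}
    for key, value in flat.items():
        if isinstance(value, dict):
            value = expand_keys(value)
        parts = key.split(".")
        node = out
        # Walk or create the intermediate objects for every part but the last.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return out


# Hypothetical, heavily trimmed log line; real lines carry many more fields.
line = (
    '{"log.level": "info", "monitoring": {"metrics": {"libbeat": '
    '{"output": {"events": {"total": 120, "acked": 118}}}}}}'
)
parsed = expand_keys(json.loads(line))
print(parsed["log"]["level"])
```

After expansion the dotted key is reachable as a nested object (`parsed["log"]["level"]`), which is the shape the dashboard field paths listed further down assume.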

The Dashboards should include the following:

  • Events per second: chart both monitoring.metrics.libbeat.output.events.total and monitoring.metrics.libbeat.output.events.acked
  • Bytes written per second: chart monitoring.metrics.libbeat.output.write.bytes
  • Errors: chart each of monitoring.metrics.libbeat.output.events.failed, monitoring.metrics.libbeat.output.events.dropped, monitoring.metrics.libbeat.output.events.toomany, monitoring.metrics.libbeat.output.batches.split, monitoring.metrics.libbeat.output.write.errors
  • Depth of queue: chart monitoring.metrics.libbeat.pipeline.events.active
  • Events per batch: chart monitoring.metrics.libbeat.output.events.total divided by monitoring.metrics.libbeat.output.events.batches
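
The arithmetic behind these charts can be sketched as follows, assuming each ingested document is one 30-second delta report. The helper names `get_path` and `dashboard_values` and the sample numbers are hypothetical. Treating pipeline.events.active as an instantaneous value rather than a rate is also an assumption here.

```python
INTERVAL_SECONDS = 30  # assumed reporting period of each metrics log line


def get_path(doc: dict, dotted: str, default=0):
    """Look up a dotted field path in a nested dict; return default if missing."""
    node = doc
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


def dashboard_values(doc: dict, interval: int = INTERVAL_SECONDS) -> dict:
    """Derive the per-second and per-batch values from one delta report."""
    prefix = "monitoring.metrics.libbeat."
    total = get_path(doc, prefix + "output.events.total")
    acked = get_path(doc, prefix + "output.events.acked")
    batches = get_path(doc, prefix + "output.events.batches")
    written = get_path(doc, prefix + "output.write.bytes")
    return {
        "events_total_per_sec": total / interval,
        "events_acked_per_sec": acked / interval,
        "bytes_written_per_sec": written / interval,
        # Assumption: charted as reported, not divided by the interval.
        "queue_depth": get_path(doc, prefix + "pipeline.events.active"),
        # Avoid dividing by zero when no batches were sent in the interval.
        "events_per_batch": total / batches if batches else None,
    }


# Made-up sample values for illustration.
sample = {"monitoring": {"metrics": {"libbeat": {
    "output": {
        "events": {"total": 600, "acked": 570, "batches": 12},
        "write": {"bytes": 30000},
    },
    "pipeline": {"events": {"active": 42}},
}}}}
values = dashboard_values(sample)
print(values)
```

Missing fields default to 0 in this sketch, on the assumption that a counter absent from a report did not change during that interval.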

@joshdover
Contributor

@leehinman a couple of comments here:

@leehinman
Contributor Author

  • What's the reasoning behind a separate integration?

The separate integration is mostly for support and developers. In that scenario you have a diagnostic from someone else's cluster and you want to analyze those logs with Kibana. For that it is easier to ingest the data as a "log" into its own data stream rather than as a "metric", which is how it arrives during monitoring. The field names also differ slightly between monitoring and the 30sec metrics. Lastly, the 30sec metrics are deltas: you only get the change since the previous 30sec metric, not running counters as you do during monitoring, so the types can differ and have to be handled differently.
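
To make the delta versus counter distinction concrete, here is a minimal sketch that converts between the two shapes. The function names and sample numbers are hypothetical, and treating a drop in a counter as a reset (for example after a restart) is an assumption of this sketch.

```python
from itertools import accumulate


def counters_to_deltas(samples: list[int]) -> list[int]:
    """Turn cumulative counter samples into per-interval deltas."""
    deltas = []
    prev = None
    for value in samples:
        if prev is not None:
            # Assumption: a lower value means the counter restarted from
            # zero, so the new value itself is the change for the interval.
            deltas.append(value - prev if value >= prev else value)
        prev = value
    return deltas


def deltas_to_counter(deltas: list[int], start: int = 0) -> list[int]:
    """Rebuild a cumulative series from per-interval deltas."""
    return list(accumulate(deltas, initial=start))


counter_series = [100, 120, 145, 145]  # what monitoring-style counters look like
delta_series = counters_to_deltas(counter_series)  # what the 30sec metrics look like
print(delta_series)
```

A chart built on counters needs a derivative or rate step, while a chart built on deltas can sum or average the values directly, which is one reason the two sources cannot share visualizations unchanged.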

When we find things that "work", we can implement them in the elastic-agent package. That is what happened with elastic/integrations#8834

  • I think we have to skip the alert part since it's not yet supported by packages:

Even if we just end up documenting the alerts that are useful for catching performance issues or "canary" events that signal some kind of degradation, I'd consider it a win. Where it lives isn't the important part; getting it written down and available to all team members is.

@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Jun 4, 2024
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
