[ML] AIOps: Add API for log rate analysis for stack / solutions usage #178613

Open
3 of 4 tasks
Tracked by #178501
walterra opened this issue Mar 13, 2024 · 3 comments
Assignees
Labels
Feature:ML/AIOps ML AIOps features: Change Point Detection, Log Pattern Analysis, Log Rate Analysis :ml v8.16.0

Comments

@walterra
Contributor

walterra commented Mar 13, 2024

The Kibana API for Log Rate Analysis is internal and has so far been treated as an implementation detail of the UI. There are several use cases where a more generic API would be useful for other parts of the platform and for solutions:

  • Augment existing date histogram charts like in Discover or Logs Explorer with Log Rate Analysis
  • Enable AI Assistant to call log rate analysis as a registered function. Note that the AI Assistant does not need a REST API; the registered function can reuse the same logic that the API exposes. ([ML] Enhance log rate analysis API for use with AI Assistant #178501)
  • Integrate log rate analysis into alerting

The existing internal API is also somewhat complex: it streams custom NDJSON, and the results include data, such as detailed histogram data, that might not be necessary for the above use cases.

The public API could look like this:

{
    index: string;
    start: number; // histogram start timestamp
    end: number; // histogram end timestamp
    interval?: number; // bucket interval for the histogram
    fields?: string[]; // Override to provide index fields up front and avoid auto detection
    query?: object; // Optional DSL query
    groupResults?: boolean; // Whether to run additional analysis that groups co-occurring field/values. Defaults to false.
    randomSamplingProbability?: number; // Optional random sampler probability. Defaults to 1.
 ...
}
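As a rough sketch of how the defaults in this proposal could be applied, the snippet below declares the request shape as a TypeScript interface and normalizes it. The names `LogRateAnalysisRequest` and `normalizeRequest` are hypothetical and not part of any existing Kibana API, the validation rules are assumptions, and only the fields listed above are covered (the elided part of the proposal is left out).

```typescript
// Hypothetical request shape, mirroring the fields listed in the proposal.
interface LogRateAnalysisRequest {
  index: string;
  start: number; // histogram start timestamp
  end: number; // histogram end timestamp
  interval?: number; // bucket interval for the histogram
  fields?: string[]; // skip auto detection when provided
  query?: object; // optional DSL query
  groupResults?: boolean; // defaults to false
  randomSamplingProbability?: number; // defaults to 1
}

// Same shape, but with the two defaulted options guaranteed to be set.
type NormalizedLogRateAnalysisRequest = LogRateAnalysisRequest & {
  groupResults: boolean;
  randomSamplingProbability: number;
};

// Applies the proposed defaults and rejects obviously invalid input.
// The checks below are assumptions, not rules stated in the proposal.
function normalizeRequest(
  req: LogRateAnalysisRequest
): NormalizedLogRateAnalysisRequest {
  if (req.index.trim() === '') {
    throw new Error('index must not be empty');
  }
  if (!(req.start < req.end)) {
    throw new Error('start must be earlier than end');
  }
  const probability = req.randomSamplingProbability ?? 1;
  if (!(probability > 0 && probability <= 1)) {
    throw new Error('randomSamplingProbability must be in (0, 1]');
  }
  return {
    ...req,
    groupResults: req.groupResults ?? false,
    randomSamplingProbability: probability,
  };
}
```

A caller that only passes `index`, `start` and `end` would get `groupResults: false` and `randomSamplingProbability: 1` back.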

The API would then internally:

  • Run change point detection on the provided time range
  • Identify baseline and deviation time ranges based on the identified change point
  • If no fields are provided, auto-identify suitable fields for analysis
  • Run a significant terms aggregation on the fields, using the p-value score option.
  • Optionally group co-occurring field/values.
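Two of these steps can be sketched without touching a cluster: deriving the baseline and deviation ranges from a change point, and building the search body for a significant terms aggregation with the `p_value` heuristic. This is a minimal sketch under several assumptions: timestamps are epoch milliseconds, the baseline is everything before the change point and the deviation everything after it, and the time field is passed in by the caller. The function names are hypothetical; the actual implementation may pick the windows differently.

```typescript
interface TimeRange {
  from: number; // inclusive, epoch millis (assumption)
  to: number; // exclusive, epoch millis (assumption)
}

// Assumed split: baseline before the change point, deviation after it.
function splitAtChangePoint(
  start: number,
  end: number,
  changePoint: number
): { baseline: TimeRange; deviation: TimeRange } {
  if (!(start < changePoint && changePoint < end)) {
    throw new Error('change point must lie strictly inside the time range');
  }
  return {
    baseline: { from: start, to: changePoint },
    deviation: { from: changePoint, to: end },
  };
}

// Elasticsearch range filter restricted to one time window.
function rangeFilter(timeField: string, range: TimeRange) {
  return {
    range: {
      [timeField]: { gte: range.from, lt: range.to, format: 'epoch_millis' },
    },
  };
}

// Search body for one field: the deviation window is the foreground set,
// the baseline window is the background set, scored with p_value.
function buildSignificantTermsSearch(
  field: string,
  timeField: string,
  baseline: TimeRange,
  deviation: TimeRange
) {
  return {
    size: 0,
    query: rangeFilter(timeField, deviation),
    aggs: {
      significant: {
        significant_terms: {
          field,
          background_filter: rangeFilter(timeField, baseline),
          // The two windows do not overlap, so the background
          // is not a superset of the foreground.
          p_value: { background_is_superset: false },
        },
      },
    },
  };
}
```

One such body would be built per candidate field and sent to the search endpoint; grouping of co-occurring field/values would be a separate step on top of the returned buckets.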

The API would not support streaming but would return results as a single JSON object.

The existing UI and internal API are GA, so they are not good candidates for PoCs and experimentation. This additional public API could be tagged experimental until the first use cases pick it up and are more fleshed out.

Tasks

  1. :ml Feature:ML/AIOps backport:skip release_note:skip v8.16.0
    walterra
@walterra walterra added :ml Feature:ML/AIOps ML AIOps features: Change Point Detection, Log Pattern Analysis, Log Rate Analysis v8.14.0 labels Mar 13, 2024
@walterra walterra self-assigned this Mar 13, 2024
@elasticmachine
Contributor

Pinging @elastic/ml-ui (:ml)

@walterra
Contributor Author

walterra commented Apr 4, 2024

Thoughts on a future alerts integration: It is likely too expensive to run field caps on every alert run. Instead, we could do something similar to how the anomaly alert caches field formats.

Identifying fields for analysis is not necessary on every alert run as long as the underlying index structure doesn't change. So caching those fields would greatly improve runtime of the analysis. This could even be part of the setup process of the alert: we'd run field identification once and the user would decide which fields to use for the alert. An alert check would then only need to run significant terms, without grouping.
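A minimal sketch of the caching idea, assuming an in-memory cache keyed by index name and a caller-supplied loader that performs the expensive field identification. `FieldCandidateCache` and `FieldLoader` are hypothetical names, and how a changed index structure gets detected is left to the caller via `invalidate`.

```typescript
// Hypothetical loader that runs the expensive field identification once.
type FieldLoader = (index: string) => Promise<string[]>;

class FieldCandidateCache {
  private readonly cache = new Map<string, Promise<string[]>>();
  private readonly load: FieldLoader;

  constructor(load: FieldLoader) {
    this.load = load;
  }

  // Returns the cached field list, running the loader only on a miss.
  get(index: string): Promise<string[]> {
    const cached = this.cache.get(index);
    if (cached !== undefined) {
      return cached;
    }
    const pending = this.load(index);
    this.cache.set(index, pending);
    // Drop failed lookups so the next alert run retries.
    pending.catch(() => {
      if (this.cache.get(index) === pending) {
        this.cache.delete(index);
      }
    });
    return pending;
  }

  // To be called when the underlying index structure changes.
  invalidate(index: string): void {
    this.cache.delete(index);
  }
}
```

With this shape, repeated alert checks against the same index share a single lookup, and the loader only runs again after an explicit `invalidate` or a failed lookup.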

@peteharverson peteharverson changed the title [ML] AIOps: Public API for log rate analysis [ML] AIOps: Add API for log rate analysis for stack / solutions usage Apr 10, 2024
@sorenlouv
Member

So caching those fields would greatly improve runtime of the analysis.

This sounds like a great idea. There's no need to do this for every alert execution.

4 participants