label breakdown API #12341

Open
4 of 6 tasks
trevorwhitney opened this issue Mar 25, 2024 · 1 comment
Assignees
Labels
type/feature Something new we should do

Comments

@trevorwhitney
Collaborator

trevorwhitney commented Mar 25, 2024

Is your feature request related to a problem? Please describe.
To improve the experience of exploring logs, it is useful to show a few different histograms beyond the default aggregation by level that Explore provides today. For frontends to offer this, they need a subset of indexed labels that are "interesting" or "useful". For example, breaking logs down by GUID is less useful than breaking them down by component or job.

It would be nice if Loki had an endpoint that leveraged the stream information in the index to return a list of "used" labels, ordered by "usefulness".

Describe the solution you'd like
An API that returns an array of label names, ordered by usefulness, where the criteria to determine usefulness may include:

  • fewer than a configurable number of values
  • more than a configurable minimum number of values (e.g. 2?)
  • known common labels (e.g. namespace, app, cluster) are prioritized
  • redundant labels are removed (e.g. job and component overlap by ~80%)
  • IDs, SHAs, and other values that are not human-readable are removed
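
The criteria above could be combined roughly as follows. This is only a sketch, not how Loki implements anything: the thresholds, the `looks_like_id` heuristic, the use of Jaccard similarity over value sets as the "overlap" measure, and the tie-breaking order are all assumptions made for illustration.

```python
from typing import Dict, List, Set

# Hypothetical defaults; the issue only says these should be configurable.
MIN_VALUES = 2            # a label needs at least this many values
MAX_VALUES = 50           # and strictly fewer than this many
COMMON_LABELS = ("namespace", "app", "cluster")
OVERLAP_THRESHOLD = 0.8   # "job and component overlap by ~80%"

HEX_DIGITS = set("0123456789abcdefABCDEF")


def looks_like_id(value: str) -> bool:
    """Crude, assumed heuristic: long hex strings (GUIDs, SHAs) are not human-readable."""
    stripped = value.replace("-", "")
    return len(stripped) >= 16 and all(c in HEX_DIGITS for c in stripped)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of values two labels have in common (assumed overlap measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_labels(label_values: Dict[str, Set[str]]) -> List[str]:
    """Return label names ordered by an assumed notion of usefulness."""
    candidates = []
    for name, values in label_values.items():
        # Drop labels with too few or too many values.
        if not (MIN_VALUES <= len(values) < MAX_VALUES):
            continue
        # Drop labels whose values all look like opaque identifiers.
        if all(looks_like_id(v) for v in values):
            continue
        candidates.append(name)

    # Known common labels first, then fewer values first, then by name.
    candidates.sort(key=lambda n: (n not in COMMON_LABELS, len(label_values[n]), n))

    # Drop a label if it mostly duplicates one that is already kept.
    kept: List[str] = []
    for name in candidates:
        if any(jaccard(label_values[name], label_values[k]) >= OVERLAP_THRESHOLD for k in kept):
            continue
        kept.append(name)
    return kept
```

With this ordering, a redundant label loses to whichever overlapping label sorts first, so the choice between `job` and `component` depends entirely on the assumed sort key.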

This endpoint needs to be quick, so the queryable time range might need to be hard-coded, and it should leverage some form of caching.
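
One way to read "hard-coded time range plus caching" is a small time-to-live cache in front of the expensive index lookup. The sketch below is an assumption about the shape of such a cache, not a description of Loki's internals: `LOOKBACK_SECONDS`, `CACHE_TTL_SECONDS`, and the `compute` callback are hypothetical names.

```python
import time
from typing import Callable, Dict, List, Tuple

LOOKBACK_SECONDS = 3600   # hypothetical hard-coded query window
CACHE_TTL_SECONDS = 60    # hypothetical cache lifetime


class LabelBreakdownCache:
    """Caches label breakdowns per query string (not thread-safe)."""

    def __init__(
        self,
        compute: Callable[[str, float, float], List[str]],
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._compute = compute   # the expensive lookup: (query, start, end) -> labels
        self._clock = clock       # injectable so the cache can be tested
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def get(self, query: str) -> List[str]:
        now = self._clock()
        hit = self._entries.get(query)
        if hit is not None and now - hit[0] < CACHE_TTL_SECONDS:
            return hit[1]
        # Callers cannot choose the range; it is always the fixed lookback window.
        labels = self._compute(query, now - LOOKBACK_SECONDS, now)
        self._entries[query] = (now, labels)
        return labels
```

Keying only on the query string works here because the time range is fixed; if callers could pick arbitrary ranges, the range would have to be part of the cache key.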

Describe alternatives you've considered
Determine this in the frontend, based on the number of series exceeding some threshold or on a metric query returning a 500. The problem is that the frontend still has to issue the metric query, so this does not prevent the load on the backend.

Tasks

API Contract:

POST /api/alphav1/detected_fields

    start: <timestamp>
    end: <timestamp>
    query: <string>

Response:
[ ... labels ]
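
For illustration, a client for the contract above might look like the following. The path and the `start`, `end`, and `query` parameters come from this issue; the JSON body encoding, the integer timestamps, the `Content-Type` header, and the `loki.example` host are assumptions, since the contract does not specify them.

```python
import json
import urllib.request
from typing import List

ENDPOINT_PATH = "/api/alphav1/detected_fields"  # path as written in this issue


def build_request(base_url: str, query: str, start: int, end: int) -> urllib.request.Request:
    """Build (but do not send) the POST request; JSON encoding is an assumption."""
    body = json.dumps({"start": start, "end": end, "query": query}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + ENDPOINT_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_response(raw: bytes) -> List[str]:
    """Parse the response, assumed to be a JSON array of label names."""
    labels = json.loads(raw)
    if not isinstance(labels, list) or not all(isinstance(item, str) for item in labels):
        raise ValueError("expected a JSON array of label names")
    return labels


# Usage (hypothetical host; sending requires a reachable server):
# req = build_request("http://loki.example:3100", '{app="checkout"}', 1711324800, 1711328400)
# with urllib.request.urlopen(req) as resp:
#     labels = parse_response(resp.read())
```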

@cyriltovena
Contributor

Moving to done for now but there's still some outstanding work.


4 participants