hubble: Introduce metric for lost events #15112

Closed
gandro opened this issue Feb 25, 2021 · 1 comment · Fixed by #22865
Labels
kind/feature This introduces new functionality. pinned These issues are not marked stale by our issue bot. sig/hubble Impacts hubble server or relay

Comments

@gandro
Member

gandro commented Feb 25, 2021

Hubble currently emits LostEvents whenever it detects that an internal buffer has overflowed. These are stored in the ring buffer and sent downstream via the GetFlows API, but this has not proven to be very useful:

  • Filtering does not apply to lost events, resulting in very confusing behavior for users
  • When the system is overloaded and produces lost events, those less useful lost events displace useful flow events in the historic Hubble ring buffer

It is probably more useful to expose this information as Prometheus metrics and rate-limited warning messages. If lost events are exposed as part of the API, the message sent to the client should be a single aggregated statistic of how many events were lost during the request, rather than individual messages.

@gandro gandro added kind/feature This introduces new functionality. sig/hubble Impacts hubble server or relay labels Feb 25, 2021
@gandro gandro self-assigned this Feb 25, 2021
@gandro gandro removed their assignment Jul 18, 2022
@github-actions

This issue has been automatically marked as stale because it has not
had recent activity. It will be closed if no further activity occurs.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Sep 17, 2022
@rolinh rolinh added pinned These issues are not marked stale by our issue bot. and removed stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. labels Sep 17, 2022
lambdanis added a commit to lambdanis/cilium that referenced this issue Mar 27, 2023
Currently numbers of lost events are visible in logs and in the observer, but
they tend to be noisy and difficult to consume in such form. A Prometheus
counter is much more suitable for this kind of information, thus this patch
adds hubble_lost_events_total metric.

Fixes cilium#15112

Signed-off-by: Anna Kapuscinska <anna@isovalent.com>
michi-covalent pushed a commit that referenced this issue Mar 30, 2023
Currently numbers of lost events are visible in logs and in the observer, but
they tend to be noisy and difficult to consume in such form. A Prometheus
counter is much more suitable for this kind of information, thus this patch
adds hubble_lost_events_total metric.

Fixes #15112

Signed-off-by: Anna Kapuscinska <anna@isovalent.com>