Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

promtail: add metrics on sent and dropped log entries #952

Merged

Conversation

pracucci
Copy link
Contributor

What this PR does / why we need it:
If the ingester is unavailable, log entries batches are dropped (discarded) once all maxretries are exausted, but there's no metric - exported by promtail - to alert on it.

In this PR I've added the following metrics exported by promtail:

  • promtail_dropped_bytes_total: Number of bytes of log lines dropped because failed to be sent to the ingester after all retries
  • promtail_log_entries_sent_total: Total number of log entries sent to the ingester
  • promtail_log_entries_dropped_total: Total number of log entries dropped because failed to be sent to the ingester after all retries

Which issue(s) this PR fixes:
Fixes #925

Checklist

  • Documentation added
  • Tests updated

@pracucci pracucci force-pushed the add-further-metrics-to-promtail-client branch from d2c411f to 7b1336b Compare August 30, 2019 16:17
@slim-bean
Copy link
Collaborator

:) will merge this tomorrow, I was just looking at adding these exact metrics today

Copy link
Collaborator

@slim-bean slim-bean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@slim-bean
Copy link
Collaborator

Interesting, a conversation around #963 just came up which points out the accuracy of these metrics as implemented in this PR will be off a little. That is, the pushes to Loki are not atomic, some of the entries may have succeeded while some didn't.

I'm tempted to merge this as is until improvements are made to return the actual count of success vs failure in the error response. (I think @wardbekker is looking at adding this)

WDYT @pracucci ?

@pracucci
Copy link
Contributor Author

pracucci commented Sep 5, 2019

@slim-bean As you noted, the metrics added in this PR are specular to the already existing ones, which also suffers the issue described in the PR #963. These metrics are inaccurate in case the same push request only a subset (but not all) of them fail to be ingested.

I would personally merge this PR given it doesn't introduce a new issue (the metrics we already have suffer it too), but at the same time start working to fix #963 (I will follow up there).

Up to you picking the last decision 🙏

@slim-bean slim-bean merged commit 62d5122 into grafana:master Sep 6, 2019
@slim-bean
Copy link
Collaborator

I agree this is still valuable as is, merged!

@pracucci pracucci deleted the add-further-metrics-to-promtail-client branch September 6, 2019 13:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

promtail: lack of metrics on dropped batches because of ingester unavailable
2 participants