Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ref(statsd): Add metric name as a tag for Sentry errors #1797

Merged
merged 4 commits into from Jan 31, 2023

Conversation

iker-barriocanal
Copy link
Member

@iker-barriocanal iker-barriocanal commented Jan 30, 2023

From time to time, we still get errors from relay dropping metrics due to the channel being full, see #1680. This PR attempts to add a tag to that Sentry issue, to identify more easily what metrics may be worsening the bottleneck. Next steps TBD, based on conclusions from the errors. This code is likely to be temporary and be removed after we figure out what and how to deal with the errors.

The approach

The original idea was to extract the error and directly include it in the error here:

if let Err(error) = metric.try_send() {
relay_log::error!(

There are two problems with that approach. Firstly, the error types come from statsd's source code and it's hard to consider every case. Secondly, the code is too generic and closed that it's hard to inject something nicely without a decent amount of workarounds. Lastly, I also tried replicating the issue locally generating a lot of metrics (lots of threads, queue size to 0 and 1, sample rate 1.0), but I have not been successful.

In the end, the simplest solution was to add the name as a tag to the scope where there's access to it, wrapping the call where the actual error is logged.

@iker-barriocanal iker-barriocanal requested a review from a team January 30, 2023 21:50
@iker-barriocanal iker-barriocanal self-assigned this Jan 30, 2023
@jan-auer
Copy link
Member

jan-auer commented Jan 31, 2023

Adding the name as tag will not help us to know, which metrics are being logged most. You simply know the one that's next in line. Compared with the increased complexity, I'm not sure this change is worth it.

@iker-barriocanal
Copy link
Member Author

Adding the name as tag will not help us to know, which metrics are being logged most.

No, but metrics that are logged more often are more likely to be in tags, so I believe it can give us an approximation. Building something to efficiently (we don't want to lose even more metrics, especially these) record the metrics we log and analyzing results is too much effort, vs going through the metrics in the codebase and directly deleting some. In the end, this is about the best effort and finding a temporary solution. If this, still, doesn't provide enough value, we can call it off.

Copy link
Member

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure there's ways to reduce code duplication here, but since these macro definitions are rarely touched, I think we can go ahead with the change as-is.

relay-statsd/src/lib.rs Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
Co-authored-by: Oleksandr <1931331+olksdr@users.noreply.github.com>
CHANGELOG.md Outdated Show resolved Hide resolved
Co-authored-by: Oleksandr <1931331+olksdr@users.noreply.github.com>
@iker-barriocanal iker-barriocanal merged commit d873ce2 into master Jan 31, 2023
@iker-barriocanal iker-barriocanal deleted the iker/ref/lost-metrics-tag-names branch January 31, 2023 15:56
jan-auer added a commit that referenced this pull request Feb 1, 2023
* master:
  ref(statsd): Add metric name as a tag for Sentry errors (#1797)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants