Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(server): Send metrics via the global endpoint #2902

Merged
merged 10 commits into from Jan 2, 2024

Conversation

jan-auer
Copy link
Member

@jan-auer jan-auer commented Dec 22, 2023

Introduces an option http.global_metrics. When enabled and not in
processing mode, Relay sends metrics to the global batch endpoint at
/api/0/relays/metrics/ instead of envelopes. This endpoint allows for
batched submission of metrics from multiple projects, which should
reduce the overall number of requests.

Bug Fixes

This change contains additional bug fixes that were discoverd during
implementation:

  • The batch metrics endpoint no longer requires the emit_outcomes
    outcomes flag to be set. This was invalid copy & paste from the
    outcomes endpoint.
  • Requests with HTTP encoding received a signature computed from the
    compressed body. However, Relay requires the signature on the
    uncompressed body.

Details

Building the request occurs in the EnvelopeProcessor in place of
building envelopes in the following steps:

  1. Iterate buckets from all projects and stream them into partitions.
  2. While collecting, check the size of the partition. As soon as a
    partition reaches the batch size limit, flush the partition eagerly.
    Buckets at the border may be split.
  3. At the end flush all remaining partitions with data.
  4. To flush a partition, JSON-encode the list of buckets and optionally
    apply HTTP encoding (compression).
  5. Submit the SendMetricsRequest with the payload and outcome
    metadata directly to the upstream.
  6. In the response callback, log outcomes directly. Therefore, the
    request does not have to be awaited.

In processing mode, Relay still produces to Kafka. The configuration
option does not have an effect in processing mode.

A note on stability: This endpoint and functionality is meant for
operation at scale within a distributed Sentry installation. At this
moment, it is not recommended to enable this option for external Relays.

Tasks

  • PR description
  • Changelog entry with option
  • Integration test
  • Doc comments
  • BUG: Batch metrics endpoint checks outcomes flag
  • BUG: Invalid signature applied to compressed requests

Possible Improvements

The changes in this PR allow for further optimization and changes. In
order to keep the scope of this PR to a necessary minimum, these have
not been included yet:

  • Submit regular envelopes directly from the EnvelopeProcessor,
    avoiding the roundtrip through the EnvelopeManager. This also avoids
    a redundant call back to compress envelopes in the processor, should
    HTTP encoding be enabled.
  • Improve handling of the ExtractionMode. It can be accessed more
    conveniently through a getter. Additionally, it resides on project
    config where it should actually be included in global config.
    Currently it is being passed around in excess.

@jan-auer jan-auer requested a review from a team as a code owner December 22, 2023 15:32
@jan-auer jan-auer mentioned this pull request Dec 22, 2023
6 tasks
@jan-auer jan-auer self-assigned this Dec 22, 2023
tags: &self.tags,
}
.hash64()
let mut hasher = FnvHasher::default();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BucketKeyRef is no longer required. The hash created here and the hash used for partitioning do not have to be the same, which is why we can hash differently.

///
/// The distribution of buckets should be even.
/// If it is not, this metric should expose it.
PartitionKeys,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As partitioning is now exclusively in the processor, this metric has moved here from relay-metrics. Its key is still the same.

bucket: self.current.bucket + at,
};
}
self.current = Index {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check for whether the bucket can split has been moved into split_at. Invariants for splitting are still checked in the inner iterator.

Copy link
Member

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment about test coverage, apart from that LGTM!

relay-metrics/src/view.rs Outdated Show resolved Hide resolved
relay-server/src/http.rs Show resolved Hide resolved
})
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of metrics-specific methods here now, maybe we should move them to a submodule (as standalone functions) in a follow-up PR.

relay-server/src/actors/processor.rs Show resolved Hide resolved
@jan-auer jan-auer merged commit 8a6b8da into master Jan 2, 2024
20 checks passed
@jan-auer jan-auer deleted the feat/global-flush-impl branch January 2, 2024 10:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants