Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ref(server): Transform EnvelopeManager into a sequential pipeline #1416

Merged
merged 13 commits into from
Aug 18, 2022

Conversation

jan-auer
Copy link
Member

@jan-auer jan-auer commented Aug 16, 2022

Removes the long-running handler future in EnvelopeManager, and instead pass envelopes through explicit queues the ingestion stages. Actual error handling is now localized in every step of the pipeline.

This helps to improve Relay's response time when buffering requests.

Endpoint

The store endpoint handler remains unchanged. It calls the same messages and eventually invokes QueueEnvelope. That message splits up the Envelope and immediately sends it to the ProjectCache. In a follow-up, this can be moved directly into the endpoint, the EnvelopeManager is no longer needed for this.

Project Cache Queues

The biggest change is within the ProjectCache and projects. Instead of responding to CheckEnvelope, the projects now take ownership of envelopes and put them into queues if they are not up-to-date. The precise sequence is:

  1. Move envelopes into the project via the new ValidateEnvelope message. If the project state is valid, this immediately validates and forwards the envelopes to the next stage (see 3). Otherwise, put them on a local queue on the project.
  2. Trigger a fire-and-forget update of the project state in the background. The update will eventually invoke update_state on the project, which flushes the queues of pending envelopes. This is guaranteed to run eventually, if project fetching fails or times out.
  3. If the envelope needs dynamic sampling, send it back into the project cache to retrieve the second project state via AddSamplingState. Otherwise, directly forward to processing (step 5).
  4. Identical queueing behavior as in step 2. To match the prior behavior this still forwards envelopes if the sampling state cannot be fetched. Such envelopes will all be ingested.
  5. Send the envelope via ProcessEnvelope to the envelope processor, along with the project states.

Remaining Queueing

Control flow of the remaining pipeline steps did not change:

  • The project cache submits envelopes as ProcessEnvelope to the EnvelopeProcessor.
  • Internal errors in the processing pool are handled locally and logged immediately.
  • The envelope is finally passed to the EnvelopeManager with a new message SubmitEnvelope. This message runs the final dispatch between HTTP upstream (SendEnvelope), Kafka (StoreEnvelope), or capturing (Capture).
  • Errors during submission are handled in a guarded future in the EnvelopeManager. This future is expected to last significantly shorter than project fetching.

Drop Guards

There are now 3 places where futures can be dropped, which are individually monitored:

  • Projects gained a Drop implementation that logs how many events have been dropped. In normal operation, only empty projects are evicted from the cache.
  • The EnvelopeProcessor does not have any drop guarding. This is complicated to achieve, and the processor's queues are usually near empty. We can introduce a drop guard when migrating the processor to tokio.
  • The SubmitEnvelope still features the former drop guarded future.

Resources

Prepared by:

Follow-ups:

  • Remove unneeded response structs or fields for CheckEnvelope and ProcessEnvelope.
  • Remove the QueueEnvelope message and instead invoke this as function directly in the endpoint handler.
  • Tie EnvelopeContext into ProcessingError, clean up metrics emission, and improve control flow.
  • Do not overwrite a project state if it's not expired and incoming is an error in update_state.
  • Update the graph and documentation in relay-server.

#skip-changelog

@jan-auer jan-auer self-assigned this Aug 16, 2022
@jan-auer jan-auer force-pushed the ref/envelope-pipeline branch 8 times, most recently from 11d2d7f to 453b92d Compare August 17, 2022 11:50
* master:
  feat(server): Garbage collector thread for project cache eviction (#1410)
@jan-auer jan-auer marked this pull request as ready for review August 18, 2022 10:06
@jan-auer jan-auer requested a review from a team August 18, 2022 10:06
Copy link
Member Author

@jan-auer jan-auer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guide for Reviewers: Start in endpoints/common.rs and then follow the messages QueueEnvelope -> ValidateEnvelope -> AddSamplingState -> ProcessEnvelope -> SubmitEnvelope.

relay-server/src/actors/envelopes.rs Show resolved Hide resolved
relay-server/src/actors/envelopes.rs Show resolved Hide resolved
relay-server/src/actors/outcome.rs Show resolved Hide resolved
relay-server/src/actors/processor.rs Show resolved Hide resolved
relay-server/src/actors/envelopes.rs Show resolved Hide resolved
relay-server/src/actors/project.rs Show resolved Hide resolved
relay-server/src/actors/project_cache.rs Show resolved Hide resolved
relay-server/src/statsd.rs Show resolved Hide resolved
relay-server/src/actors/project.rs Outdated Show resolved Hide resolved
relay-server/src/actors/project.rs Outdated Show resolved Hide resolved
relay-server/src/actors/project.rs Show resolved Hide resolved
relay-server/src/service.rs Outdated Show resolved Hide resolved
Copy link
Member

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff!

context.notify(HandleEnvelope {
envelope: event_envelope,
envelope_context: event_context,
ProjectCache::from_registry().do_send(ValidateEnvelope::new(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be super-nice is a diagram with Actors and Messages and how they changed. I would even volunteer to draw it :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 much appreciated. One of the follow-ups mentioned in the description includes updating the documentation, which contains a graph that would need updating: https://getsentry.github.io/relay/relay_server/index.html#sequence-diagram-of-components-inside-relay

relay-server/src/actors/project_cache.rs Show resolved Hide resolved
relay-server/src/actors/project.rs Show resolved Hide resolved
jan-auer and others added 3 commits August 18, 2022 19:03
Co-authored-by: Floris Bruynooghe <flub@sentry.io>
* master:
  feat(actix): New Register (#1421)
  fix(server): Fix eviction metric (#1422)
@jan-auer jan-auer merged commit f6b789f into master Aug 18, 2022
@jan-auer jan-auer deleted the ref/envelope-pipeline branch August 18, 2022 18:51
jan-auer added a commit that referenced this pull request Aug 19, 2022
The QueueEnvelope message no longer needs to reside on the
EnvelopeManager since #1416. Instead, it has become a simple dispatch
function that calls ProcessMetrics and ValidateEnvelope directly on the
target actors.

To skip a redundant message, this PR removes this message entirely and
moves its code into the endpoint handler code. The BufferGuard has moved
into utilities.

There is no expected change in behavior or performance, although there
could be a slight decrease in latency since the endpoint no longer has
to wait for this actor.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants