Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(replay): Handle compression failures more robustly #6988

Merged
merged 2 commits into from Feb 1, 2023

Conversation

mydea
Copy link
Member

@mydea mydea commented Jan 31, 2023

We now stop the recording when either addEvent or finish fail.

Other semi-related changes:

  • We send the event as a string to the worker now. Currently, we serialize/deserialize/serialize it, we really only need the string form on the worker anyhow, that should save a bit of processing.
  • We do not really need the pendingEvents on the buffer, as well as pendingLength. We really only need to know if any event has been added (I think?).
  • I renamend the worker method init to clear, as that is IMHO clearer.
  • I split the event buffer tests up into separate files per module

Closes #6923

@mydea mydea requested review from billyvg and Lms24 January 31, 2023 13:13
@mydea mydea self-assigned this Jan 31, 2023
this._ensureWorkerIsLoadedPromise = this._ensureWorkerIsLoaded().catch(() => {
// Ignore errors here
});
this._ensureWorkerIsLoadedPromise = this._ensureWorkerIsLoaded();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to catch this, as we catch everything in this method now.

@@ -772,7 +772,7 @@ export class ReplayContainer implements ReplayContainerInterface {
await this._addPerformanceEntries();

// Check eventBuffer again, as it could have been stopped in the meanwhile
if (!this.eventBuffer || !this.eventBuffer.pendingLength) {
if (!this.eventBuffer || !this.eventBuffer.hasEvents) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this was actually not really correct before? as pendingLength could be 0 in a compression worker if everything was correctly compressed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is hasEvents different from pendingLength?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pendingLength in the compression worker was subtracted when an event was successfully compressed (=sent to the worker). So if there would have been 10 events in the worker, but nothing pending to be sent, this would have been 0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah it doesn't matter now, but pendingLength was never subtracted, it only got reset when we finish the compression payload, so I think these two are the same

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah you're right of course, I misunderstood this. I guess it's fine anyhow 👍 sorry about the confusion!

@@ -39,5 +41,10 @@ export async function addEvent(
replay.getContext().earliestEvent = timestampInMs;
}

return replay.eventBuffer.addEvent(event, isCheckout);
try {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the "actual" change. Note that for finish we already try-catch this (when flushing).

@github-actions
Copy link
Contributor

github-actions bot commented Jan 31, 2023

size-limit report 📦

Path Size
@sentry/browser - ES5 CDN Bundle (gzipped + minified) 19.87 KB (-0.01% 🔽)
@sentry/browser - ES5 CDN Bundle (minified) 61.57 KB (0%)
@sentry/browser - ES6 CDN Bundle (gzipped + minified) 18.54 KB (-0.01% 🔽)
@sentry/browser - ES6 CDN Bundle (minified) 54.88 KB (0%)
@sentry/browser - Webpack (gzipped + minified) 20.29 KB (0%)
@sentry/browser - Webpack (minified) 66.37 KB (0%)
@sentry/react - Webpack (gzipped + minified) 20.31 KB (0%)
@sentry/nextjs Client - Webpack (gzipped + minified) 47.6 KB (0%)
@sentry/browser + @sentry/tracing - ES5 CDN Bundle (gzipped + minified) 26.79 KB (-0.01% 🔽)
@sentry/browser + @sentry/tracing - ES6 CDN Bundle (gzipped + minified) 25.08 KB (-0.01% 🔽)
@sentry/replay ES6 CDN Bundle (gzipped + minified) 44.16 KB (-0.1% 🔽)
@sentry/replay - Webpack (gzipped + minified) 38.89 KB (-0.17% 🔽)
@sentry/browser + @sentry/tracing + @sentry/replay - ES6 CDN Bundle (gzipped + minified) 61.5 KB (-0.09% 🔽)

@mydea mydea changed the base branch from master to develop January 31, 2023 15:38
@mydea mydea force-pushed the fn/replay-handle-worker-errors branch from 27439bb to 4f7e6d5 Compare January 31, 2023 16:06
@@ -0,0 +1,106 @@
import { logger } from '@sentry/utils';
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I extracted this out of the EventBufferCompressionWorker, first because we may need it in follow ups (for error handling etc.), second I think it makes sense architecturally to have this separated - no need to leak this through the buffer implementation! Also allows us to easier encapsulate the id handling stuff etc.

Copy link
Member

@billyvg billyvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM aside from a few questions

Comment on lines -39 to -40
public get pendingEvents(): RecordingEvent[] {
return this._pendingEvents;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pendingEvents was added to attempt to send a segment on unload (without compression), but since we have decided to pause on that, I think it's good to remove this 👍

if (!data) {
throw new Error('Adding invalid event');
}
// If the event is not the first event, we need to prefix it with a `,` so
// that we end up with a list of events
const prefix = this.added > 0 ? ',' : '';
const prefix = this._hasEvents ? ',' : '';
// TODO: We may want Z_SYNC_FLUSH or Z_FULL_FLUSH (not sure the difference)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove this now 😅

@@ -772,7 +772,7 @@ export class ReplayContainer implements ReplayContainerInterface {
await this._addPerformanceEntries();

// Check eventBuffer again, as it could have been stopped in the meanwhile
if (!this.eventBuffer || !this.eventBuffer.pendingLength) {
if (!this.eventBuffer || !this.eventBuffer.hasEvents) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is hasEvents different from pendingLength?

Comment on lines +47 to +48
__DEBUG_BUILD__ && logger.error(error);
replay.stop();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this gets caught and doesn't bubble up, the callers will continue to run -- is that going to cause issues?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be, I guess in the worst case we call addEvent again somewhere, which will no-op because it is stopped.

// TODO: We may want Z_SYNC_FLUSH or Z_FULL_FLUSH (not sure the difference)
// Using NO_FLUSH here for now as we can create many attachments that our
// web UI will get API rate limited.
this.deflate.push(prefix + JSON.stringify(data), constants.Z_SYNC_FLUSH);
this.deflate.push(prefix + data, constants.Z_SYNC_FLUSH);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@mydea mydea force-pushed the fn/replay-handle-worker-errors branch from 4f7e6d5 to 0f115ab Compare February 1, 2023 08:55
Copy link
Member

@Lms24 Lms24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mydea mydea force-pushed the fn/replay-handle-worker-errors branch from 0f115ab to 39e7908 Compare February 1, 2023 09:34
@mydea mydea enabled auto-merge (squash) February 1, 2023 09:35
@mydea mydea force-pushed the fn/replay-handle-worker-errors branch from 39e7908 to 5b53ef6 Compare February 1, 2023 10:14
@mydea mydea merged commit 5c41ab7 into develop Feb 1, 2023
@mydea mydea deleted the fn/replay-handle-worker-errors branch February 1, 2023 10:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error in compression worker being spammed after adding replay integration
3 participants