Envelopes stuck in __processing until restart #3384

Open
bitsandfoxes opened this issue May 23, 2024 · 2 comments · May be fixed by #3438
Labels: Bug (Something isn't working), Offline Caching
Comments

bitsandfoxes (Contributor) commented May 23, 2024

When offline, the CachingTransport leaves envelopes stuck in __processing. Even once the client goes back online and starts sending new events, the old envelopes remain there.
Only on restart do those envelopes get moved back into the cache and then sent by the worker.
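A minimal sketch of the kind of setup that hits this (the DSN and cache path are placeholders, and the offline/online steps are manual):

// Minimal repro sketch; the DSN and cache path below are placeholders.
// With CacheDirectoryPath set, the SDK routes envelopes through the
// CachingTransport and its on-disk cache.
using System;
using Sentry;

SentrySdk.Init(options =>
{
    options.Dsn = "https://publicKey@o0.ingest.sentry.io/0";
    options.CacheDirectoryPath = "/tmp/sentry-cache";
    options.Debug = true;
});

// 1. Go offline, then capture an event. The envelope is cached and moved
//    into __processing, where the send fails.
SentrySdk.CaptureMessage("captured while offline");

// 2. Go back online and capture another event. The new envelope gets sent,
//    but the one from step 1 stays in __processing until the app restarts.
SentrySdk.CaptureMessage("captured after reconnecting");

await SentrySdk.FlushAsync(TimeSpan.FromSeconds(5));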

bitsandfoxes added the Bug and Offline Caching labels May 23, 2024
bitsandfoxes (Contributor, Author)

I might be missing something, but how is the retry supposed to work

try
{
    // Wait a bit before retrying
    await Task.Delay(500, _workerCts.Token).ConfigureAwait(false);
}

if we're waiting for a signal here before doing any processing?
try
{
    await _workerSignal.WaitAsync(_workerCts.Token).ConfigureAwait(false);
    _options.LogDebug("CachingTransport worker signal triggered.");
    await ProcessCacheAsync(_workerCts.Token).ConfigureAwait(false);
}
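To make the concern concrete, here's a simplified stand-alone model of that signal-gated loop (illustrative only, not the SDK's actual code):

// Simplified model of a signal-gated worker loop; illustrative only,
// not the actual CachingTransport implementation.
using System;
using System.Threading;
using System.Threading.Tasks;

var signal = new SemaphoreSlim(initialCount: 0);
using var cts = new CancellationTokenSource();

var worker = Task.Run(async () =>
{
    while (!cts.IsCancellationRequested)
    {
        try
        {
            // The worker is gated here: it only processes the cache after
            // something releases the signal (e.g. a new envelope is enqueued).
            await signal.WaitAsync(cts.Token).ConfigureAwait(false);
            Console.WriteLine("processing cache...");
            throw new InvalidOperationException("simulated send failure while offline");
        }
        catch (OperationCanceledException)
        {
            break;
        }
        catch (Exception)
        {
            // "Retry": wait 500 ms, then loop back to WaitAsync above. Without
            // another Release(), the failed envelopes are never picked up again.
            await Task.Delay(500, cts.Token).ConfigureAwait(false);
        }
    }
});

signal.Release();        // one "enqueue" -> one processing attempt, which fails
await Task.Delay(2000);  // by now the worker is parked on WaitAsync again
cts.Cancel();
await Task.WhenAny(worker, Task.Delay(1000));
Console.WriteLine("no retry happened without another Release()");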

jamescrosswell (Collaborator)

Lines 162-170 wait for either 500 ms or the cancellation token (so in normal circumstances it just delays 500 ms).

... then back at the start of the loop, lines 149-151 execute.

It looks like the only time the _workerSignal gets released is here:

Possibly we could release it again on line 166 as well, after delaying. That would retry every 500 ms while offline, though... so some kind of progressive backoff might be good here: 500 ms after the first failure, 1000 ms after the second, and so on, up to some maximum backoff that gets reset back to 500 ms after a successful iteration.
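A rough sketch of that backoff (the type and member names here are made up; in a real fix this would live inside the CachingTransport worker loop rather than a separate class):

using System;
using System.Threading;
using System.Threading.Tasks;

// Illustrative sketch of the suggested progressive backoff; names are made up.
internal sealed class RetryBackoff
{
    private static readonly TimeSpan InitialDelay = TimeSpan.FromMilliseconds(500);
    private static readonly TimeSpan MaxDelay = TimeSpan.FromSeconds(30); // arbitrary cap

    private TimeSpan _currentDelay = InitialDelay;

    // After a successful iteration: reset back to the initial 500 ms.
    public void Reset() => _currentDelay = InitialDelay;

    // After a failed iteration: wait, then double the delay up to the cap.
    public async Task DelayAsync(CancellationToken cancellationToken)
    {
        await Task.Delay(_currentDelay, cancellationToken).ConfigureAwait(false);
        var doubled = TimeSpan.FromTicks(_currentDelay.Ticks * 2);
        _currentDelay = doubled < MaxDelay ? doubled : MaxDelay;
    }
}

The worker loop would call Reset() after a successful ProcessCacheAsync and DelayAsync(...) in the failure path, ideally followed by releasing the worker signal again so the next iteration actually processes the cache.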
