
Sync: chunk user requests #1367

Merged: Den4200 merged 2 commits into master from bug/backend/bot-4x/chunk-sync-requests on Jan 19, 2021

MarkKoz (Contributor) commented Jan 19, 2021

The site can't handle huge syncs. Even a bulk patch of 10k users will crash the service. Chunk the requests into groups of 1000 users and await them sequentially. Testing showed that concurrent requests are not scalable and would also crash the service.
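
As a rough illustration of the approach described above, here is a minimal sketch of chunked, sequential requests. It is not the code from this PR: `chunks`, `sync_users`, `send_chunk`, and `fake_patch` are hypothetical names, and `send_chunk` stands in for whatever coroutine performs the bulk patch against the site. Only the chunk size of 1000 comes from the description.

```python
import asyncio
from typing import Awaitable, Callable, Iterator, Sequence

CHUNK_SIZE = 1000  # group size named in the PR description


def chunks(items: Sequence[dict], size: int) -> Iterator[Sequence[dict]]:
    """Yield successive slices of `items`, each holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


async def sync_users(
    users: Sequence[dict],
    send_chunk: Callable[[Sequence[dict]], Awaitable[None]],
    size: int = CHUNK_SIZE,
) -> int:
    """Send `users` chunk by chunk and return the number of requests made.

    `send_chunk` is a hypothetical stand-in for the real bulk PATCH call.
    """
    sent = 0
    for chunk in chunks(users, size):
        # Sequential: wait for this request to finish before starting the next,
        # so only one request is in flight at any time.
        await send_chunk(chunk)
        sent += 1
    return sent


if __name__ == "__main__":
    async def fake_patch(chunk: Sequence[dict]) -> None:
        # Hypothetical placeholder; a real client would issue an HTTP request here.
        await asyncio.sleep(0)

    users = [{"id": i} for i in range(10_000)]
    print(asyncio.run(sync_users(users, fake_patch)))  # prints 10
```

With 10k users and chunks of 1000, the sketch makes 10 requests, one after another.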

10k user patch without chunking. This caused the site service to crash and respond with 524.

[image]
[image]

10k user patch with chunking and concurrent requests. The operation succeeded but there was some CPU throttling. It implies that this won't scale well (the situation that triggered the discovery of this issue was a sync of over 80k users).
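
For contrast, the concurrent variant tested here might be sketched as follows. This is an assumption about its general shape, not the code that was actually run: `sync_users_concurrently` and `InFlightCounter` are hypothetical names, and the counter only simulates a server to show that every chunk is in flight at the same time.

```python
import asyncio
from typing import Awaitable, Callable, Sequence


async def sync_users_concurrently(
    users: Sequence,
    send_chunk: Callable[[Sequence], Awaitable[None]],
    size: int = 1000,
) -> int:
    """Send all chunks at once with asyncio.gather; return the number of requests."""
    parts = [users[i:i + size] for i in range(0, len(users), size)]
    # Every request is started before any of them has completed.
    await asyncio.gather(*(send_chunk(part) for part in parts))
    return len(parts)


class InFlightCounter:
    """Hypothetical stand-in for the site that records peak simultaneous requests."""

    def __init__(self) -> None:
        self.current = 0
        self.peak = 0

    async def __call__(self, chunk: Sequence) -> None:
        self.current += 1
        self.peak = max(self.peak, self.current)
        await asyncio.sleep(0)  # yield control so the other requests can start
        self.current -= 1


if __name__ == "__main__":
    counter = InFlightCounter()
    asyncio.run(sync_users_concurrently(list(range(10_000)), counter))
    print(counter.peak)  # prints 10: all ten chunks were in flight together
```

The peak number of simultaneous requests grows with the number of chunks, which is consistent with the concern that this would not scale to a sync of over 80k users.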

[image]

This compares the two aforementioned tests (start times are indicated by the vertical lines):

[image]

I have no graphs for chunking without concurrent requests (i.e. what this PR does), but we know that this approach works, as it was also done in production with internal evals.

MarkKoz added the t: bug, a: backend, and p: 1 - high labels on Jan 19, 2021
coveralls commented Jan 19, 2021

Coverage increased (+0.05%) to 56.745% when pulling a1c9e00 on bug/backend/bot-4x/chunk-sync-requests into cbeb6eb on master.

jb3 (Member) left a comment

Looks good. A similar solution has already been tested in the production cluster and proven to work, so this should fix our timeout problems.

MarkKoz requested a review from Akarys42 as a code owner on January 19, 2021 01:01
Den4200 (Member) left a comment

Was lurking in #dev-ops and tested in the test server!

Den4200 merged commit 966ad16 into master on Jan 19, 2021
Den4200 deleted the bug/backend/bot-4x/chunk-sync-requests branch on January 19, 2021 01:13

Labels

a: backend (Related to internal functionality and utilities: error_handler, logging, security, utils and core)
p: 1 - high (High Priority)
t: bug (Something isn't working)
