improve ddb TransactWriteItems performance + small refactor #10464

Merged
merged 2 commits into from Mar 18, 2024

@bentsku bentsku commented Mar 15, 2024

Motivation

Follow-up to #10415 and especially #9410, where we refactored the logic for BatchWriteItem.
TransactWriteItems was not touched by those changes and should benefit considerably from the same approach.

Changes

  • use the same logic as BatchWriteItem, slightly modified to account for the differences
  • add a new test for streaming with TransactWriteItems, as there was none
  • refactor the logic to use structural pattern matching, which is well suited to this use case with unknown input
  • remove the calls to deepcopy in the event creation in favor of just creating a new event (avoids pickling/unpickling)

Benchmarks

Funnily enough, the gains are smaller than expected. Twisted, calling DynamoDB Local directly without going through the Gateway, and Kinesis batching relieving pressure on the Gateway had already sped up TransactWriteItems considerably compared to, for example, 3.2. The gains are therefore not as massive as we have been used to (still around 2x to 4x).

I've added 3.2.0 to the benchmarks to show how far we've come in around 2 weeks.

The "latest" and "with fix" runs were measured with Twisted. 3.2.0 did not have it yet, so that run still uses hypercorn.

| No Stream | 3.2.0 | latest | with fix |
|---|---|---|---|
| Throughput (higher is better) | | | |
| TransactWriteItems (10) throughput item/s | 365.61 | 1169.23 | 2964.81 |
| TransactWriteItems (25) throughput item/s | 286.30 | 1459.05 | 5075.70 |
| Total time (lower is better) | | | |
| TransactWriteItems (10) total seconds (5k items) | 27.35 | 4.2763 | 1.6864 |
| TransactWriteItems (25) total seconds (12.5k items) | 43.66 | 8.5672 | 2.4627 |

| With DDB Streams | 3.2.0 | latest | with fix |
|---|---|---|---|
| Throughput (higher is better) | | | |
| TransactWriteItems (10) throughput item/s | 99.41 | 835.92 | 1492.23 |
| TransactWriteItems (25) throughput item/s | 82.77 | 1149.21 | 2746.00 |
| Total time (lower is better) | | | |
| TransactWriteItems (10) total seconds (5k items) | 50.29 | 5.9814 | 3.3507 |
| TransactWriteItems (25) total seconds (12.5k items) | 151.02 | 10.8770 | 4.5521 |

| With Kinesis Streams | 3.2.0 | latest | with fix |
|---|---|---|---|
| Throughput (higher is better) | | | |
| TransactWriteItems (10) throughput item/s | 102.85 | 899.99 | 1679.19 |
| TransactWriteItems (25) throughput item/s | 85.54 | 1261.92 | 3251.62 |
| Total time (lower is better) | | | |
| TransactWriteItems (10) total seconds (5k items) | 48.6138 | 5.5556 | 2.9776 |
| TransactWriteItems (25) total seconds (12.5k items) | 146.1307 | 9.9055 | 3.8442 |

So in the end, we can still provide a 2x to 3.5x improvement over latest.

For a batch size of 25 with no streams, latest already provided a 5x improvement over 3.2.0, and this change pushes it to 17.5x.
For a batch size of 25 with streams enabled, latest already provided a 15x improvement over 3.2.0, and this change pushes it to an average of 35x.
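
The multipliers quoted here are throughput ratios. The short sketch below recomputes them for a batch size of 25 from the item/s figures in the benchmark tables; the speedup helper and the dictionary layout are introduced only for this illustration.

```python
# Throughput in items/s for TransactWriteItems (25), copied from the benchmark tables.
throughput = {
    "No Stream": {"3.2.0": 286.30, "latest": 1459.05, "with fix": 5075.70},
    "With DDB Streams": {"3.2.0": 82.77, "latest": 1149.21, "with fix": 2746.00},
    "With Kinesis Streams": {"3.2.0": 85.54, "latest": 1261.92, "with fix": 3251.62},
}


def speedup(new: float, old: float) -> float:
    """Ratio of two throughput values; greater than 1 means `new` is faster."""
    return new / old


if __name__ == "__main__":
    for scenario, t in throughput.items():
        latest_vs_320 = speedup(t["latest"], t["3.2.0"])
        fix_vs_320 = speedup(t["with fix"], t["3.2.0"])
        fix_vs_latest = speedup(t["with fix"], t["latest"])
        print(
            f"{scenario}: latest is {latest_vs_320:.1f}x over 3.2.0, "
            f"with fix is {fix_vs_320:.1f}x over 3.2.0 and {fix_vs_latest:.1f}x over latest"
        )
```

The "average of 35x" for streams corresponds to the mean of the DDB Streams and Kinesis Streams ratios against 3.2.0.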

@bentsku bentsku added labels Mar 15, 2024: area: performance (Make LocalStack go rocket-fast), aws:dynamodb (Amazon DynamoDB), aws:dynamodbstreams (AWS DynamoDB Streams), semver: patch (Non-breaking changes which can be included in patch releases)
@bentsku bentsku requested a review from thrau March 15, 2024 02:26
@bentsku bentsku self-assigned this Mar 15, 2024

github-actions bot commented Mar 15, 2024

LocalStack Community integration with Pro

    2 files  ± 0      2 suites  ±0   1h 28m 17s ⏱️ - 2m 40s
2 726 tests +16  2 471 ✅ +16  255 💤 ±0  0 ❌ ±0 
2 728 runs  +16  2 471 ✅ +16  257 💤 ±0  0 ❌ ±0 

Results for commit 863cc94. ± Comparison against base commit 4be0da8.

♻️ This comment has been updated with latest results.

@bentsku bentsku marked this pull request as ready for review March 15, 2024 03:32

thrau commented Mar 15, 2024

i'll let @giograno take the lead on this review!

@viren-nadkarni viren-nadkarni left a comment

Love the data-driven approach, nice work @bentsku !

Review thread on localstack/services/dynamodb/provider.py (outdated, resolved)
@bentsku bentsku merged commit bbd5350 into master Mar 18, 2024
29 checks passed
@bentsku bentsku deleted the improve-ddb-transwrite-perf branch March 18, 2024 14:20
3 participants