Apply doc-count batching policy to transactions before pipelining #1808

wotbrew · 2022-08-26T11:47:48Z

Problem

In #1762 pipelining was introduced during indexing from the golden stores. This improves IO resource utilisation and delivers a substantial improvement to ingest throughput.

However, one implication is that document stores can receive a much larger set of ids to fetch at a time, because the documents for many transactions are now fetched at once. There is some concern (#1800) that the increased concurrent request volume may be triggering errors or breaching requests-per-second limits.

Solution

This PR is an early mitigation: it still allows some transaction batching, while limiting the number of fetches a document store is asked to perform in a single eager operation.

The mechanism added batches transactions according to the number of referenced docs, so that many small transactions can still be batched together, while larger transactions are issued in smaller batches.

Note that a single transaction can reference any number of documents, so for large enough transactions this PR will not help. However, that issue already existed before #1762.
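
As a rough illustration of the batching policy described above, here is a minimal sketch in Python. It is not the code in core/src/xtdb/tx.clj: the function name, the representation of a transaction as a list of doc ids, and the exact rule for closing a batch (close it when adding the next transaction would exceed the preferred count) are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, List, Sequence


def batch_by_doc_count(
    txs: Iterable[Sequence[str]],
    preferred_doc_count: int,
) -> Iterator[List[Sequence[str]]]:
    """Group consecutive transactions into batches by referenced doc count.

    Hypothetical sketch: each transaction is modelled as a sequence of the
    doc ids it references. A batch is closed when adding the next
    transaction would push it past preferred_doc_count. A transaction is
    never split, so one that is larger than the limit forms its own batch.
    """
    batch: List[Sequence[str]] = []
    docs_in_batch = 0
    for tx in txs:
        n = len(tx)
        # Only close a non-empty batch; an oversized tx still gets emitted.
        if batch and docs_in_batch + n > preferred_doc_count:
            yield batch
            batch, docs_in_batch = [], 0
        batch.append(tx)
        docs_in_batch += n
    if batch:
        yield batch


# Three small transactions, one large one, then another small one.
txs = [["a"], ["b", "c"], ["d"], ["e", "f", "g", "h", "i", "j"], ["k"]]
batches = list(batch_by_doc_count(txs, preferred_doc_count=4))
```

With a preferred count of 4, the three small transactions share one batch, while the six-document transaction is issued alone and still exceeds the limit, which is the caveat noted above.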

This represents a tactical, speculative change and may not resolve the S3 / R2 problem raised in #1800. Further benchmarks are necessary to determine what impact the batching policy has on overall ingest throughput, though since there was no batching at all prior to #1762, I assume no performance regression relative to that baseline.

Configuration

A configuration variable is available to vary the ideal batch doc-count on the ingester: :batch-preferred-doc-count.
I would consider this variable a temporary tuning parameter that we can use early on while testing, but I would not consider it part of the configuration surface of XTDB.

core/src/xtdb/tx.clj

Partition transactions according to number of referenced docs, so that many small transactions can benefit from batching without overwhelming existing fetch impls or exhausting memory/cpu resources. Relates to xtdb#1800 - may fix

wotbrew requested a review from jarohen August 26, 2022 11:51

wotbrew force-pushed the tx-partitioning branch from 93b4172 to cc9b049 Compare August 26, 2022 12:09

jarohen reviewed Aug 26, 2022

jarohen reviewed Aug 26, 2022

jarohen reviewed Aug 26, 2022

wotbrew force-pushed the tx-partitioning branch from cc9b049 to bf2912f Compare August 26, 2022 15:09

jarohen approved these changes Aug 26, 2022

wotbrew force-pushed the tx-partitioning branch from bf2912f to 6da8ba6 Compare August 30, 2022 11:24

wotbrew merged commit e4f983c into xtdb:master Aug 30, 2022

jarohen mentioned this pull request Sep 5, 2022

fetch-docs can send a huge number of requests causing ingestion to fail #1800

Closed

jarohen added this to the 1.22.0 milestone Sep 14, 2022

jarohen added the 1.x label Apr 21, 2023
