
services/horizon/internal/ingest: Fix deadlock in parallel ingestion #5263

Merged

tamirms merged 3 commits into stellar:master on Apr 15, 2024

Conversation

tamirms
Contributor

@tamirms commented Mar 27, 2024

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with name of package that is most changed in the PR, ex.
    services/friendbot, or all or doc if the changes are broad or impact many
    packages.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated any docs (developer docs, .md
    files, etc... affected by this change). Take a look in the docs folder for a given service,
    like this one.

Release planning

  • I've updated the relevant CHANGELOG (here for Horizon) if
    needed with deprecations, added features, breaking changes, and DB schema changes.
  • I've decided if this PR requires a new major/minor version according to
    semver, or if it's mainly a patch change. The PR is targeted at the next
    release branch if it's not a patch change.

What

I tried running the reingest command with 2 parallel workers and ran into the following Postgres deadlock error, which caused the reingestion to fail:

2024-03-27 09:13:38.176 UTC [80] ERROR:  deadlock detected
2024-03-27 09:13:38.176 UTC [80] DETAIL:  Process 80 waits for ShareLock on transaction 593; blocked by process 81.
	Process 81 waits for ShareLock on transaction 592; blocked by process 80.
	Process 80: 
		WITH r AS
			(SELECT unnest($1::character varying(64)[]) /* address */)
		INSERT INTO history_accounts
			(address)
		SELECT * from r
		ON CONFLICT (address) DO NOTHING
	Process 81: 
		WITH r AS
			(SELECT unnest($1::character varying(64)[]) /* address */)
		INSERT INTO history_accounts
			(address)
		SELECT * from r
		ON CONFLICT (address) DO NOTHING
2024-03-27 09:13:38.176 UTC [80] HINT:  See server log for query details.
2024-03-27 09:13:38.176 UTC [80] CONTEXT:  while inserting index tuple (532,3) in relation "history_accounts"
2024-03-27 09:13:38.176 UTC [80] STATEMENT:  
		WITH r AS
			(SELECT unnest($1::character varying(64)[]) /* address */)
		INSERT INTO history_accounts
			(address)
		SELECT * from r
		ON CONFLICT (address) DO NOTHING

This deadlock occurs because we have two goroutines which both try to insert the same addresses into the history_accounts table, but in a different order. For example:

  • At time t, goroutine 1 tries to insert account A.
  • At time t+1, goroutine 2 tries to insert account B.
  • At time t+2, goroutine 2 tries to insert account A and is blocked by goroutine 1, which holds the ShareLock on row A. Goroutine 2 cannot progress until goroutine 1 commits or rolls back its transaction.
  • At time t+3, goroutine 1 tries to insert account B and is blocked by goroutine 2, which holds the ShareLock on row B. Now there is a deadlock, because the two transactions form a cycle of lock dependencies.

This deadlock scenario should not be possible because we sort the account addresses before inserting them into history_accounts. However, it turns out that sorting alone is not sufficient to prevent this type of deadlock, because of the data flow used for parallel reingestion.
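
To make the failure mode concrete, below is a minimal, standalone Go sketch (not Horizon code) that reproduces this kind of lock-ordering deadlock against a local Postgres instance. The connection string, the simplified table definition, and the helper names are assumptions made purely for illustration:

package main

// Assumes a local Postgres database containing a table that resembles the
// lookup table, e.g.:
//   CREATE TABLE history_accounts (address text PRIMARY KEY);

import (
    "database/sql"
    "fmt"
    "sync"

    _ "github.com/lib/pq"
)

func main() {
    // Hypothetical connection string for a local test database.
    db, err := sql.Open("postgres", "postgres://localhost/deadlock_demo?sslmode=disable")
    if err != nil {
        panic(err)
    }

    // Barrier so that both sessions hold their first row lock before
    // attempting to take the second one.
    barrier := &sync.WaitGroup{}
    barrier.Add(2)
    var wg sync.WaitGroup
    wg.Add(2)

    // Session 1 inserts A then B; session 2 inserts B then A.
    // Real values would be Stellar account addresses.
    go insertInOrder(db, []string{"ACCOUNT_A", "ACCOUNT_B"}, barrier, &wg)
    go insertInOrder(db, []string{"ACCOUNT_B", "ACCOUNT_A"}, barrier, &wg)
    wg.Wait()
}

func insertInOrder(db *sql.DB, addresses []string, barrier, wg *sync.WaitGroup) {
    defer wg.Done()
    tx, err := db.Begin()
    if err != nil {
        fmt.Println("begin failed:", err)
        return
    }
    defer tx.Rollback() // no-op if the transaction was committed

    for i, address := range addresses {
        // ON CONFLICT ... DO NOTHING still locks the conflicting (or
        // speculatively inserted) row until the transaction ends.
        _, err := tx.Exec(
            `INSERT INTO history_accounts (address) VALUES ($1) ON CONFLICT (address) DO NOTHING`,
            address,
        )
        if err != nil {
            // Postgres aborts one of the two sessions with
            // "pq: deadlock detected".
            fmt.Println("insert failed:", err)
            return
        }
        if i == 0 {
            barrier.Done()
            barrier.Wait() // both sessions now hold one row lock each
        }
    }
    if err := tx.Commit(); err != nil {
        fmt.Println("commit failed:", err)
    }
}

With two such sessions, one should fail after roughly a second (the default deadlock_timeout) with pq: deadlock detected, mirroring the server log above.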

Let's say we want to reingest ledgers 1 - 1000 with 2 parallel workers. Horizon will spin up 2 goroutines. The first goroutine will ingest ledgers 1 - 500 and the second goroutine will ingest ledgers 501 - 1000. Each goroutine will perform the ingestion of its subrange in a single transaction.

However, each goroutine will further subdivide its subrange into batches of ledgers. Each batch is ingested separately, which means that we insert rows into history_accounts when ingesting each batch. So even if the account addresses are sorted within each individual batch, the account addresses across all the batches executed within a single transaction are not necessarily sorted. Therefore, the possibility of a deadlock remains.

To fix this issue I have encapsulated the ingestion of each ledger batch within its own transaction. Each goroutine no longer performs the ingestion of its subrange in a single transaction; instead, the goroutines execute multiple transactions (one per ledger batch) during their ingestion of the subrange.
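
The shape of the change, per worker goroutine, resembles the following sketch (illustrative only, not the actual Horizon code; ledgerRange, ledgerBatches, and ingestBatch are hypothetical stand-ins for the real ingestion machinery):

package sketch

import "database/sql"

type ledgerRange struct{ from, to uint32 }

// ledgerBatches splits a worker's subrange into fixed-size batches of ledgers.
func ledgerBatches(from, to, size uint32) []ledgerRange {
    var batches []ledgerRange
    for start := from; start <= to; start += size {
        end := start + size - 1
        if end > to {
            end = to
        }
        batches = append(batches, ledgerRange{start, end})
    }
    return batches
}

// ingestBatch stands in for ingesting one batch of ledgers inside tx, with the
// batch's account addresses sorted before they are inserted.
func ingestBatch(tx *sql.Tx, batch ledgerRange) error { return nil }

// reingestSubrange is what each worker goroutine runs. Previously the whole
// subrange was wrapped in one transaction, so history_accounts row locks
// accumulated across batches in an order that was not globally sorted. Now
// each batch commits (and releases its locks) before the next batch begins,
// so the sorted-per-batch insert order is enough to rule out the deadlock.
func reingestSubrange(db *sql.DB, from, to, batchSize uint32) error {
    for _, batch := range ledgerBatches(from, to, batchSize) {
        tx, err := db.Begin()
        if err != nil {
            return err
        }
        if err := ingestBatch(tx, batch); err != nil {
            tx.Rollback()
            return err
        }
        if err := tx.Commit(); err != nil {
            return err
        }
    }
    return nil
}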

With the change described above, a transaction in parallel ingestion may block on another if both are inserting common addresses into history_accounts, but there should never be a deadlock. However, I noticed that performance could be improved by using more fine-grained transactions. The workflow of each transaction resembles the following:

BEGIN;

-- insert into lookup tables such as history_accounts, history_claimable_balances, etc
INSERT INTO history_accounts (address)
VALUES
('GA226TSKWZ2VSCJKPTXNH2N3E3UZTUMIBKPTIGCVMBGVMOQ5633D436Z'), ..., ('GA22GNOGDBCNJVTCGRAJSJ7YYRQOBBUD5HBAXO2MQ2ZFDPFOIVAQC3SW');
...

-- insert into history tables such as history_transaction_participants, history_transactions, etc
INSERT INTO history_transaction_participants (history_account_id, history_transaction_id)
VALUES (23, 218982738697388032), ...;
...
COMMIT;

First, we insert into the lookup tables, and then we insert into the history tables using the ids from the lookup tables. The first part of the transaction can block if other transactions insert common addresses into history_accounts; however, the second part of the transaction should never block on other concurrent transactions.

So, by splitting the workflow into two transactions we can reduce contention and allow more concurrency when inserting rows into history tables:

BEGIN; -- this tx may block on other tx which insert common rows

-- insert into lookup tables such as history_accounts, history_claimable_balances, etc
INSERT INTO history_accounts (address)
VALUES
('GA226TSKWZ2VSCJKPTXNH2N3E3UZTUMIBKPTIGCVMBGVMOQ5633D436Z'), ..., ('GA22GNOGDBCNJVTCGRAJSJ7YYRQOBBUD5HBAXO2MQ2ZFDPFOIVAQC3SW');
...

COMMIT;

BEGIN; -- this tx should not block on any other running tx

-- insert into history tables such as history_transaction_participants, history_transactions, etc
INSERT INTO history_transaction_participants (history_account_id, history_transaction_id)
VALUES (23, 218982738697388032), ...;
...
COMMIT;

After implementing that change, I observed a 10-15% speedup in parallel ingestion when testing locally on my laptop using 2 workers.

I think there could be other data flow changes which could also improve performance by reducing contention in the postgres transactions. One data flow which I think could be promising is:

  1. stream ledger transactions for a large number of ledgers in parallel
  2. accumulate all lookup table entries in loaders which are shared by all the parallel workers
  3. insert rows in lookup tables within a single tx
  4. insert rows in history tables concurrently with 1 tx per worker

We can experiment with this idea in another PR.
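
For reference, a very rough sketch of what the shared-loader part of that flow could look like (hypothetical names and types, not a design commitment; it assumes a history_accounts table with an id column and a unique address column):

package sketch

import (
    "database/sql"
    "sort"
    "sync"
)

// sharedAccountLoader accumulates the addresses needed by all parallel
// workers (step 2) so the lookup-table rows can be written in one place.
type sharedAccountLoader struct {
    mu  sync.Mutex
    ids map[string]int64 // address -> history_accounts id, 0 until resolved
}

// queue is called by workers while they stream ledgers (step 1).
func (l *sharedAccountLoader) queue(address string) {
    l.mu.Lock()
    defer l.mu.Unlock()
    if _, ok := l.ids[address]; !ok {
        l.ids[address] = 0
    }
}

// resolve inserts every queued address in a single transaction (step 3) and
// records the resulting ids, which the workers would then use when inserting
// into the history tables concurrently, one transaction per worker (step 4).
// It assumes all workers have finished queueing.
func (l *sharedAccountLoader) resolve(db *sql.DB) error {
    addresses := make([]string, 0, len(l.ids))
    for address := range l.ids {
        addresses = append(addresses, address)
    }
    sort.Strings(addresses) // keep the insert order deterministic

    tx, err := db.Begin()
    if err != nil {
        return err
    }
    defer tx.Rollback()
    for _, address := range addresses {
        var id int64
        // Simplified one-row-at-a-time stand-in for the batched
        // INSERT ... ON CONFLICT statements the real loaders use.
        if err := tx.QueryRow(
            `INSERT INTO history_accounts (address) VALUES ($1)
             ON CONFLICT (address) DO UPDATE SET address = EXCLUDED.address
             RETURNING id`, address,
        ).Scan(&id); err != nil {
            return err
        }
        l.ids[address] = id
    }
    return tx.Commit()
}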

Known limitations

If an individual transaction fails, it could leave the DB in a partially ingested state. This can still be avoided by ingesting without concurrency (albeit without the performance speedup that concurrency gives you). But this limitation existed even prior to this change.

@tamirms marked this pull request as ready for review March 28, 2024 09:55
@tamirms requested a review from a team March 28, 2024 10:14
@sreuland
Contributor

I think there could be other data flow changes which could also improve performance by reducing contention in the postgres transactions

Is it worth taking a step back from the current lookup tables, which are based on a sequential id generator, and exploring any other table schema that might do better in the current flow, since the application flows are now getting more complex due to non-app-related db concerns? One option: lookup tables use a hash of the string lookup key to generate a deterministic id, and the corresponding batch insert builders use UPSERT on the lookup tables, which may avoid the table contention issues caused by INSERTs.
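
A small sketch of what the deterministic-id half of that option could look like (purely illustrative, names hypothetical); as noted in the reply below, distinct keys can hash to the same id, so collisions would need to be detected or tolerated:

package sketch

import (
    "crypto/sha256"
    "encoding/binary"
)

// deterministicID derives a lookup-table id from a hash of the address, so
// every worker computes the same id without consulting a database sequence.
// Truncating the digest to 63 bits means distinct addresses can collide.
func deterministicID(address string) int64 {
    sum := sha256.Sum256([]byte(address))
    // Keep the id positive by clearing the sign bit.
    return int64(binary.BigEndian.Uint64(sum[:8]) &^ (1 << 63))
}

The batch insert builders would then UPSERT (id, address) pairs instead of relying on ids drawn from a sequence.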

@sreuland
Contributor

Is there an issue/ticket that captures this for visibility on the project side? It seems like this would be part of Sustainability/Object-8 so this time/effort can be accounted for.

@tamirms
Contributor Author

tamirms commented Mar 28, 2024

I think there could be other data flow changes which could also improve performance by reducing contention in the postgres transactions

Is it worth taking a step back from the current lookup tables, which are based on a sequential id generator, and exploring any other table schema that might do better in the current flow, since the application flows are now getting more complex due to non-app-related db concerns? One option: lookup tables use a hash of the string lookup key to generate a deterministic id, and the corresponding batch insert builders use UPSERT on the lookup tables, which may avoid the table contention issues caused by INSERTs.

The challenge with that approach is dealing with collisions, and we'd also have to reingest. It's worth thinking about, but it falls outside the scope of this PR.

@tamirms
Contributor Author

tamirms commented Mar 28, 2024

Is there an issue/ticket that captures this for visibility on the project side? It seems like this would be part of Sustainability/Object-8 so this time/effort can be accounted for.

this PR fixes a bug so it doesn't get pointed

@sreuland
Contributor

I think there could be other data flow changes which could also improve performance by reducing contention in the postgres transactions. One data flow which I think could be promising is:

@tamirms, one more change is to skip the filtered tmp processor during reingest (pr); I could merge that into here or do it separately.

Contributor

@sreuland left a comment

lgtm

@tamirms enabled auto-merge (squash) April 15, 2024 16:52
@tamirms
Contributor Author

tamirms commented Apr 15, 2024

@sreuland I think it would be easier to review that change separately. I've configured auto-merge on this PR, so it should be merged soon.

@tamirms merged commit 6bb4a44 into stellar:master on Apr 15, 2024
23 checks passed
@sreuland
Contributor

this PR fixes a bug so it doesn't get pointed

was there a GH issue on the bug?

@tamirms
Contributor Author

tamirms commented Apr 15, 2024

was there a GH issue on the bug?

no

@devfed1

devfed1 commented Apr 18, 2024

I can confirm that I encountered the same error: job failed, recommended restart range: [544001, 800000]: error when processing [544001, 560000] range: error processing ledger range 557301 - 557400: Error flushing changes from processor: error during lazy loader resolution, *history.AccountLoader.Exec: exec failed: pq: deadlock detected

Horizon version: 2.29.0-4a00bc7e1afbb895dc669853c98af8f6f6663223
go1.22.1

I need to monitor the ingest process and restart the ingestion after each failure. I ended up using reingestion with no parallel-workers option.

@mollykarcher
Contributor

@devfed1 this fix was released in version 2.30.0
