Currently, we use the `BatchInsertBuilder` to batch history table rows into large Postgres `INSERT` statements. However, we have discovered that it is more efficient to use the Postgres `COPY` statement for bulk loading. Note that `COPY` does not support updating existing rows, but that is fine because the Horizon history tables are immutable (rows are only inserted, never updated).
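For context, lib/pq (the Postgres driver used here) exposes `COPY ... FROM STDIN` through `pq.CopyIn`. The following is a minimal sketch of the pattern the new component would build on; the function, table name, and columns are placeholders, not the actual Horizon schema or API:

```go
package history

import (
	"database/sql"

	"github.com/lib/pq"
)

// copyRows bulk loads rows into a table using COPY FROM STDIN via
// lib/pq's CopyIn helper. Table and column names are placeholders.
func copyRows(db *sql.DB, rows [][]interface{}) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	// pq.CopyIn builds the `COPY "some_history_table" (col_a, col_b)
	// FROM STDIN` statement text for Prepare.
	stmt, err := tx.Prepare(pq.CopyIn("some_history_table", "col_a", "col_b"))
	if err != nil {
		tx.Rollback()
		return err
	}
	for _, row := range rows {
		// Each Exec streams one row into the COPY buffer.
		if _, err := stmt.Exec(row...); err != nil {
			stmt.Close()
			tx.Rollback()
			return err
		}
	}
	// An Exec with no arguments flushes any buffered rows.
	if _, err := stmt.Exec(); err != nil {
		stmt.Close()
		tx.Rollback()
		return err
	}
	// Close finalizes the COPY; the rows become visible only once the
	// surrounding transaction commits.
	if err := stmt.Close(); err != nil {
		tx.Rollback()
		return err
	}
	return tx.Commit()
}
```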
To complete this issue we will need to:

- Implement a component similar to `BatchInsertBuilder` which uses `COPY` instead of `INSERT` (feel free to use the prototype from the spike).
  - The unit tests for this component should check that the transaction can still be rolled back if an error is encountered before `stmt.Close()` (this should address bartek's concerns from Ingestion performance fixes #316 (comment)); a test sketch follows the table list below.
- Once the component is implemented, we will need to integrate it into all of the `services/horizon/internal/db2/history` code that inserts rows into the history tables (see master...tamirms:go:ingest-perf-pq for how this is done in the spike branch). After this task is complete, the following tables will be populated by the `COPY` batch insert builder:
- history_effects
- history_ledgers
- history_operation_claimable_balances
- history_operation_liquidity_pools
- history_operation_participants
- history_operations
- history_trades
- history_transaction_claimable_balances
- history_transaction_liquidity_pools
- history_transaction_participants
- history_transactions
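To make the rollback requirement concrete, here is a rough sketch of the kind of unit test the first task calls for. Everything below (the table name, column, and connection URL) is a placeholder rather than Horizon's real schema or test harness:

```go
// An illustrative sketch only; a real test would use the existing
// Horizon test database helpers and schema.
package history

import (
	"database/sql"
	"testing"

	"github.com/lib/pq"
)

// testDatabaseURL is a placeholder connection string.
const testDatabaseURL = "postgres://localhost/horizon_test?sslmode=disable"

func TestCopyRollsBackOnErrorBeforeClose(t *testing.T) {
	db, err := sql.Open("postgres", testDatabaseURL)
	if err != nil {
		t.Fatal(err)
	}
	defer db.Close()

	tx, err := db.Begin()
	if err != nil {
		t.Fatal(err)
	}
	stmt, err := tx.Prepare(pq.CopyIn("some_history_table", "id"))
	if err != nil {
		t.Fatal(err)
	}

	// Send a valid row, then a row the server cannot coerce to the
	// column type; the bad row fails the COPY (and the transaction).
	_, _ = stmt.Exec(int64(1))
	_, execErr := stmt.Exec("not-an-int")

	// The error path the real component should follow: close the
	// statement (which surfaces the COPY failure), then roll back.
	closeErr := stmt.Close()
	if execErr == nil && closeErr == nil {
		t.Fatal("expected the bad row to fail the COPY")
	}
	if err := tx.Rollback(); err != nil {
		t.Fatal(err)
	}

	// Nothing from the aborted COPY should have been persisted.
	var count int
	err = db.QueryRow("SELECT COUNT(*) FROM some_history_table").Scan(&count)
	if err != nil {
		t.Fatal(err)
	}
	if count != 0 {
		t.Fatalf("expected 0 rows after rollback, found %d", count)
	}
}
```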
> Note that `COPY` does not provide support for updating existing rows but that is ok because the Horizon history tables are immutable (rows are only inserted and never updated).
@tamirms, what happens during reingestion of a ledger range? I understand that it updates the history tables. I recall reading somewhere that it's ok for there to be overlapping ledgers because reingestion simply overwrites the existing data. Is that correct?