Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update transaction tables to reduce queries dispatched & executed by Postgres #102

Closed
simerplaha opened this issue Dec 21, 2021 · 3 comments

Comments

@simerplaha
Copy link
Member

simerplaha commented Dec 21, 2021

Overview

Update utransactions & transactions tables to use foreign key constraints

  • To reduce the number of queries required to fetch a single transaction to 1.
  • To make all inserts transactional.

Issue

To fetch a single transaction 3 queries are dispatched to Postgres.

select "hash", "chain_from", "chain_to", "gas-amount", "gas-price" from "utransactions" where "hash" = '4b9f7a4102027a0595f707d0849f2bee4fce4820a40822109edef96368494444'
select "tx_hash", "hint", "output_ref_key", "unlock_script" from "uinputs" where "tx_hash" = '4b9f7a4102027a0595f707d0849f2bee4fce4820a40822109edef96368494444'
select "tx_hash", "amount", "address", "lock_time" from "uoutputs" where "tx_hash" = '4b9f7a4102027a0595f707d0849f2bee4fce4820a40822109edef96368494444'

Similar occurs for a write.

  • 5 queries in total
  • Inserts start individual transactions for single request and are not within a single transaction.
insert into "utransactions" ("hash","chain_from","chain_to","gas-amount","gas-price") ... $1 = '4b9f7a4102027a0595f707d0849f2bee4fce4820a40822109edef96368494444', $2 = '32', ...
BEGIN
update "uinputs" set "hint"=$1,"unlock_script"=$2 where "tx_hash"=$3 and "output_ref_key"=$4 ... $3 = '4b9f7a4102027a0595f707d0849f2bee4fce4820a40822109edef96368494444' ...
insert into "uinputs" ("tx_hash","hint","output_ref_key","unlock_script") ... $1 = '4b9f7a4102027a0595f707d0849f2bee4fce4820a40822109edef96368494444' ...
COMMIT
BEGIN
update "uoutputs" set "amount"=$1,"lock_time"=$2 where "tx_hash"=$3 and "address"=$4 ... $3 = '4b9f7a4102027a0595f707d0849f2bee4fce4820a40822109edef96368494444', ...
insert into "uoutputs" ("tx_hash","amount","address","lock_time") ... $1 = '4b9f7a4102027a0595f707d0849f2bee4fce4820a40822109edef96368494444', ...
COMMIT

Options

1. Foreign key constraints

Use foreign key constraints on existing tables so Postgres knows the relationships between the tables, and we can reduce
the number of queries dispatched & executed by Postgres for each read and write to 1.

2. Merge all data into a single transaction table

I'm seeing these transaction queries always fetch their relevant inputs and outputs rows. inputs and outputs tables are never queried individually. Merging inputs and outputs tables into the main transaction table as single column or two columns and storing them as un-indexed varchar or blob would increase performance by a large factor over option 1 above because

  • Postgres maintains a different file for each table & index. Merging them into a single table reduces IOps.
  • Increases write performance because only one index is maintained for transactions table and none for inputs
    and outputs (which are much larger tables with larger indexes).
  • Increases read performance because only a single index is queried.

We can store them as jsonb if we want indexing. But if indexing is needed option 1 is easier to maintain and jsonb's value size is limited to 268.435455 MB (from StackOverflow question).

Drawback of using this option over option 1 - In-case we want to query inputs and outputs individually, separate tables are much easier to maintain. But if we want performance this option 2 will be noticeably better.

Thoughts?

@tdroxler
Copy link
Member

We do use inputs and outputs for example to get the balance of an address:

private val getBalanceQuery = Compiled { address: Rep[Address] =>
outputsTable
.filter(output => output.mainChain && output.address === address)
.map(output => (output.key, output.amount, output.lockTime))
.joinLeft(inputsTable.filter(_.mainChain))
.on(_._1 === _.outputRefKey)
.filter(_._2.isEmpty)
.map { case ((_, amount, lockTime), _) => (amount, lockTime) }
}
def getBalanceAction(address: Address): DBActionR[(U256, U256)] = {
getBalanceQuery(address).result.map { outputs =>
val now = TimeStamp.now()
outputs.foldLeft((U256.Zero, U256.Zero)) {
case ((total, locked), (amount, lockTime)) =>
val newTotal = total.addUnsafe(amount)
val newLocked = if (lockTime.map(_.isBefore(now)).getOrElse(true)) {
locked
} else {
locked.addUnsafe(amount)
}
(newTotal, newLocked)
}
}
}
no tx involved here.

While writing this I also realize we need to have those two tables for example to be able to find the spent output when searching the inputs:

private val getInputsQuery = Compiled { (txHash: Rep[Transaction.Hash]) =>
mainInputs
.filter(_.txHash === txHash)
.join(mainOutputs)
.on(_.outputRefKey === _.key)
.sortBy(_._1.order)
.map {
case (input, output) =>
(input.hint,
input.outputRefKey,
input.unlockScript,
output.txHash,
output.address,
output.amount)
}
}

In the UTXO model, an input is always linked to an older output, but when we insert the transaction with the outputs, we don't know in advance which inputs of a future tx will use those outputs. So we have to do all those join between outputs/inputs` table.

So I guess we need to go for option 1

Your thoughts @polarker ?

@simerplaha
Copy link
Member Author

Hey @tdroxler, you are right about this on our Slack conversion. I got Slick confused with Hibernate. In Hibernate adding foreign-key-constraints automatically optimises queries to use joins but in Slick we have to do this manually. My excuse: haven't used ORMs in a while :(

And now since we are writing raw SQL, adding foreign-key-constraints to optimise for performance makes even less sense. So this issue is invalid.

Thank you for pointing this out. If you are ok I will close this task?

@tdroxler
Copy link
Member

👍 all good to close it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants