Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

show us the encrypted data #407

Merged
merged 5 commits into from May 12, 2023
Merged

show us the encrypted data #407

merged 5 commits into from May 12, 2023

Conversation

pro-wh
Copy link
Collaborator

@pro-wh pro-wh commented May 5, 2023

alright, if you really want to look 馃し

no api

philosophy:

  • collapse all callformat variants into a looser type with variable size nonce+data fields.
  • same for public key, there's just a variable size slot for it.
  • store the format alongside, in case anyone wants to make sense of the keys+nonces+data
  • the "public key" in the tx is the one provided by the caller. I'm not aware of us capturing the runtime's public key.
  • if anything goes wrong, log and leave it nil. the drawback is that NULL in the db could mean either plain or something failed during analysis (e.g. new callformat that we don't support)
  • kinda oddly asymmetric, we don't store the data+result when it's unencrypted

@pro-wh pro-wh marked this pull request as draft May 5, 2023 00:24
@pro-wh pro-wh force-pushed the pro-wh/feature/envelope branch 3 times, most recently from 0dbece1 to 2027e2b Compare May 5, 2023 23:07
@pro-wh pro-wh marked this pull request as ready for review May 5, 2023 23:07
@@ -254,8 +254,8 @@ const (
VALUES ($1, $2, $3, $4)`

RuntimeTransactionInsert = `
INSERT INTO chain.runtime_transactions (runtime, round, tx_index, tx_hash, tx_eth_hash, fee, gas_limit, gas_used, size, timestamp, method, body, "to", amount, success, error_module, error_code, error_message)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18)`
INSERT INTO chain.runtime_transactions (runtime, round, tx_index, tx_hash, tx_eth_hash, fee, gas_limit, gas_used, size, timestamp, method, body, "to", amount, evm_encrypted_format, evm_encrypted_public_key, evm_encrypted_data_nonce, evm_encrypted_data_data, evm_encrypted_result_nonce, evm_encrypted_result_data, success, error_module, error_code, error_message)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

馃殏馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃殝馃挩

Copy link
Collaborator

@mitjat mitjat May 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

馃槅

I can think of a few ways around the runaway train:

  1. Create a separate evm_transactions table, move the new fields (and probably also tx_eth_hash) in there. I'm a little worried about performance because we'll need probably at least tx_eth_hash and evm_encrypted_format (?) every time we're retrieving txs, even if en masse. Or we could denormalize slightly: we create an evm_encryption table, put just the new fields in it, and additionally store an is_encrypted flag in the main runtime_transactions table, because that's likely all we'll need for the non-detailed view. It's also nice because it generalizes to non-evm encryption (= Cipher).

  2. Using a postgres composite type, i.e. a struct-typed column, to store the evm encryption info. You create them like so (first format), and use them from Go via pgtype.CompositeFields like so. The downside is that we're adding a little complexity to the DB interface/structure/usage.

  3. Same as number 2, but with JSON instead of composite types. Pretty yuck and space-hungry. I prefer the train over this.

I started off favoring 2, but I'm now more in favor of the denormalized variant of 1.

Note on performance of 1 vs 2: Bulky table rows hurt performance; first in gradual ways (because of page size and disk caches), then at 8kB per row abruptly, because postgres stores rows over 8kB in a TOAST table, so every row lookup performs an implicit JOIN. This would imply 2 is faster, because we JOIN only explicitly, and only for single-tx results. However my understanding of TOAST is that only bulky columns are moved to the overflow (= TOAST) table, so with some luck, our composite type and the existing body type would be the ones to get moved, which should result in about the same performance as 1 if we only SELECT those columns for single-tx results.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

too scary to do in this PR

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

@mitjat mitjat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for figuring out the deserialization incantations!

analyzer/runtime/evm.go Outdated Show resolved Hide resolved
analyzer/runtime/evm.go Outdated Show resolved Hide resolved
analyzer/runtime/extract.go Outdated Show resolved Hide resolved
@@ -254,8 +254,8 @@ const (
VALUES ($1, $2, $3, $4)`

RuntimeTransactionInsert = `
INSERT INTO chain.runtime_transactions (runtime, round, tx_index, tx_hash, tx_eth_hash, fee, gas_limit, gas_used, size, timestamp, method, body, "to", amount, success, error_module, error_code, error_message)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18)`
INSERT INTO chain.runtime_transactions (runtime, round, tx_index, tx_hash, tx_eth_hash, fee, gas_limit, gas_used, size, timestamp, method, body, "to", amount, evm_encrypted_format, evm_encrypted_public_key, evm_encrypted_data_nonce, evm_encrypted_data_data, evm_encrypted_result_nonce, evm_encrypted_result_data, success, error_module, error_code, error_message)
Copy link
Collaborator

@mitjat mitjat May 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

馃槅

I can think of a few ways around the runaway train:

  1. Create a separate evm_transactions table, move the new fields (and probably also tx_eth_hash) in there. I'm a little worried about performance because we'll need probably at least tx_eth_hash and evm_encrypted_format (?) every time we're retrieving txs, even if en masse. Or we could denormalize slightly: we create an evm_encryption table, put just the new fields in it, and additionally store an is_encrypted flag in the main runtime_transactions table, because that's likely all we'll need for the non-detailed view. It's also nice because it generalizes to non-evm encryption (= Cipher).

  2. Using a postgres composite type, i.e. a struct-typed column, to store the evm encryption info. You create them like so (first format), and use them from Go via pgtype.CompositeFields like so. The downside is that we're adding a little complexity to the DB interface/structure/usage.

  3. Same as number 2, but with JSON instead of composite types. Pretty yuck and space-hungry. I prefer the train over this.

I started off favoring 2, but I'm now more in favor of the denormalized variant of 1.

Note on performance of 1 vs 2: Bulky table rows hurt performance; first in gradual ways (because of page size and disk caches), then at 8kB per row abruptly, because postgres stores rows over 8kB in a TOAST table, so every row lookup performs an implicit JOIN. This would imply 2 is faster, because we JOIN only explicitly, and only for single-tx results. However my understanding of TOAST is that only bulky columns are moved to the overflow (= TOAST) table, so with some luck, our composite type and the existing body type would be the ones to get moved, which should result in about the same performance as 1 if we only SELECT those columns for single-tx results.

storage/migrations/02_runtimes.up.sql Outdated Show resolved Hide resolved
analyzer/runtime/evm.go Outdated Show resolved Hide resolved
Comment on lines 57 to 58
EVMEncryptedPublicKey *[]byte
EVMEncryptedDataNonce *[]byte
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we able to decode the format/nonces/publickey into a more concrete type here? Or can it vary?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format is CallFormat from the sdk. it's an enum

in x25519 (callformat 1), the public key and nonce are specific length byte arrays, which I'm sure the cryptosystem is happy to keep opaque on the outside

storage/migrations/02_runtimes.up.sql Show resolved Hide resolved
storage/migrations/02_runtimes.up.sql Outdated Show resolved Hide resolved
analyzer/runtime/extract.go Outdated Show resolved Hide resolved
Co-authored-by: mitjat <mitjat@users.noreply.github.com>
@pro-wh pro-wh merged commit 92251b7 into main May 12, 2023
5 checks passed
@pro-wh pro-wh deleted the pro-wh/feature/envelope branch May 12, 2023 00:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants