show us the encrypted data #407

pro-wh · 2023-05-05T00:24:26Z

alright, if you really want to look 🤷

no api

philosophy:

collapse all callformat variants into a looser type with variable-size nonce and data fields.
same for the public key: there's just a variable-size slot for it.
store the format alongside, in case anyone wants to make sense of the keys+nonces+data.
the "public key" in the tx is the one provided by the caller. I'm not aware of us capturing the runtime's public key.
if anything goes wrong, log and leave it nil. the drawback is that NULL in the db could mean either that the tx was plain or that something failed during analysis (e.g. a new callformat that we don't support).
kinda oddly asymmetric: we don't store the data+result when the tx is unencrypted.
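
A minimal Go sketch of the "looser type" idea above, not the code from this PR. The names `EncryptedEnvelope`, `x25519Call`, and `envelopeOrNil` are hypothetical, and the format strings and the 32-byte and 15-byte field sizes are assumptions for illustration. It shows how a fixed-size variant flattens into variable-size slices, and how plain calls and unsupported formats both end up as nil.

```go
package main

import (
	"fmt"
	"log"
)

// EncryptedEnvelope is a hypothetical loose container: every callformat
// variant is flattened into variable-size byte slices plus a format tag.
type EncryptedEnvelope struct {
	Format      string // kept alongside so readers can interpret the bytes
	PublicKey   []byte // the caller-provided key, not the runtime's
	DataNonce   []byte
	Data        []byte
	ResultNonce []byte
	ResultData  []byte
}

// x25519Call stands in for one strictly typed variant. The field sizes are
// assumptions for illustration.
type x25519Call struct {
	PublicKey [32]byte
	Nonce     [15]byte
	Data      []byte
}

// envelopeOrNil returns nil both for plain calls and for anything it cannot
// handle, which is the same ambiguity a NULL column would carry.
func envelopeOrNil(format string, call *x25519Call) *EncryptedEnvelope {
	switch format {
	case "plain":
		return nil
	case "encrypted/x25519-deoxysii":
		if call == nil {
			log.Printf("missing call body for format %q", format)
			return nil
		}
		return &EncryptedEnvelope{
			Format:    format,
			PublicKey: call.PublicKey[:],
			DataNonce: call.Nonce[:],
			Data:      call.Data,
		}
	default:
		log.Printf("unsupported callformat %q, leaving encrypted fields nil", format)
		return nil
	}
}

func main() {
	env := envelopeOrNil("encrypted/x25519-deoxysii", &x25519Call{Data: []byte("ciphertext")})
	fmt.Println(env.Format, len(env.PublicKey), len(env.DataNonce), len(env.Data))
	fmt.Println(envelopeOrNil("plain", nil) == nil, envelopeOrNil("future-format", nil) == nil)
}
```

A caller that needs to tell "plain" apart from "analysis failed" would have to store an extra marker, since both paths return nil here.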

pro-wh · 2023-05-05T23:10:21Z

analyzer/queries/queries.go

@@ -254,8 +254,8 @@ const (
      VALUES ($1, $2, $3, $4)`

 	RuntimeTransactionInsert = `
-    INSERT INTO chain.runtime_transactions (runtime, round, tx_index, tx_hash, tx_eth_hash, fee, gas_limit, gas_used, size, timestamp, method, body, "to", amount, success, error_module, error_code, error_message)
-      VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18)`
+    INSERT INTO chain.runtime_transactions (runtime, round, tx_index, tx_hash, tx_eth_hash, fee, gas_limit, gas_used, size, timestamp, method, body, "to", amount, evm_encrypted_format, evm_encrypted_public_key, evm_encrypted_data_nonce, evm_encrypted_data_data, evm_encrypted_result_nonce, evm_encrypted_result_data, success, error_module, error_code, error_message)


🚂🚋🚋🚋🚋🚋🚋🚋🚋🚋🚋🚋🚋🚋🚋🚋🚋🚋🚋🚋🚋🚋🚋🚋🚋💨

😆

I can think of a few ways around the runaway train:

1. Create a separate evm_transactions table and move the new fields (and probably also tx_eth_hash) in there. I'm a little worried about performance, because we'll probably need at least tx_eth_hash and evm_encrypted_format (?) every time we retrieve txs, even en masse. Or we could denormalize slightly: we create an evm_encryption table, put just the new fields in it, and additionally store an is_encrypted flag in the main runtime_transactions table, because that's likely all we'll need for the non-detailed view. It's also nice because it generalizes to non-evm encryption (= Cipher).

2. Use a postgres composite type, i.e. a struct-typed column, to store the evm encryption info. You create them with CREATE TYPE ... AS (...) and use them from Go via pgtype.CompositeFields. The downside is that we're adding a little complexity to the DB interface/structure/usage.

3. Same as number 2, but with JSON instead of composite types. Pretty yuck and space-hungry. I prefer the train over this.

I started off favoring 2, but I'm now more in favor of the denormalized variant of 1.

Note on performance of 1 vs 2: Bulky table rows hurt performance; first in gradual ways (because of page size and disk caches), then abruptly once a row grows past the TOAST threshold (about 2kB by default, with 8kB pages), because postgres then moves oversized values to a separate TOAST table, so every lookup of those values performs an implicit JOIN. This would imply 1 is faster, because there we JOIN only explicitly, and only for single-tx results. However, my understanding of TOAST is that only bulky columns are moved to the overflow (= TOAST) table, so with some luck, our composite type and the existing body column would be the ones to get moved, which should give 2 about the same performance as 1 if we only SELECT those columns for single-tx results.
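
A rough Go sketch of the denormalized variant of option 1, under stated assumptions. The `evm_encryption` table and `is_encrypted` flag come from the comment above, but the column lists, the `Tx`, `Envelope`, and `statement` types, and `insertsFor` are hypothetical and heavily trimmed. It only shows the shape: the wide fields go to a side table, and the main row keeps a single boolean.

```go
package main

import "fmt"

// Envelope holds the encryption fields that would move to the side table.
type Envelope struct {
	Format      string
	PublicKey   []byte
	DataNonce   []byte
	Data        []byte
	ResultNonce []byte
	ResultData  []byte
}

// Tx is a trimmed-down transaction; Envelope is nil for plain calls.
type Tx struct {
	Runtime  string
	Round    uint64
	TxIndex  uint32
	Envelope *Envelope
}

// statement pairs a query with its positional arguments.
type statement struct {
	SQL  string
	Args []interface{}
}

// Hypothetical queries; the real tables have many more columns.
const (
	txInsert = `
    INSERT INTO chain.runtime_transactions (runtime, round, tx_index, is_encrypted)
      VALUES ($1, $2, $3, $4)`
	encryptionInsert = `
    INSERT INTO chain.evm_encryption (runtime, round, tx_index, format, public_key, data_nonce, data_data, result_nonce, result_data)
      VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)`
)

// insertsFor always emits the narrow main row, and adds a side-table row
// only when the transaction is encrypted.
func insertsFor(tx Tx) []statement {
	stmts := []statement{{
		SQL:  txInsert,
		Args: []interface{}{tx.Runtime, tx.Round, tx.TxIndex, tx.Envelope != nil},
	}}
	if e := tx.Envelope; e != nil {
		stmts = append(stmts, statement{
			SQL: encryptionInsert,
			Args: []interface{}{
				tx.Runtime, tx.Round, tx.TxIndex,
				e.Format, e.PublicKey, e.DataNonce, e.Data, e.ResultNonce, e.ResultData,
			},
		})
	}
	return stmts
}

func main() {
	plain := Tx{Runtime: "sapphire", Round: 10, TxIndex: 0}
	secret := Tx{Runtime: "sapphire", Round: 10, TxIndex: 1, Envelope: &Envelope{Format: "encrypted/x25519-deoxysii"}}
	fmt.Println(len(insertsFor(plain)), len(insertsFor(secret)))
}
```

List views would read only `is_encrypted` from the main table; the detailed single-tx view would JOIN the side table on (runtime, round, tx_index).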

too scary to do in this PR

mitjat

Thank you for figuring out the deserialization incantations!

analyzer/runtime/evm.go

analyzer/runtime/extract.go

mitjat · 2023-05-08T18:51:42Z

storage/migrations/02_runtimes.up.sql

analyzer/runtime/evm.go

Andrew7234 · 2023-05-10T06:02:55Z

analyzer/runtime/extract.go

+	EVMEncryptedPublicKey   *[]byte
+	EVMEncryptedDataNonce   *[]byte


are we able to decode the format/nonces/publickey into a more concrete type here? Or can it vary?

format is CallFormat from the sdk. it's an enum

in x25519 (callformat 1), the public key and nonce are specific length byte arrays, which I'm sure the cryptosystem is happy to keep opaque on the outside
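
For illustration, a small Go sketch of the enum-plus-opaque-bytes view described in this reply. It does not use the SDK: the `CallFormat` type, its constants, and the string names below are local stand-ins that mirror the numbering mentioned here, and the 32-byte key and 15-byte nonce sizes are assumptions about X25519 and Deoxys-II. `checkSizes` is a hypothetical helper.

```go
package main

import "fmt"

// CallFormat is a local stand-in for the SDK's enum.
type CallFormat uint8

const (
	CallFormatPlain                   CallFormat = 0
	CallFormatEncryptedX25519DeoxysII CallFormat = 1
)

// String converts the enum to a readable name, falling back to the number
// for formats this sketch does not know.
func (cf CallFormat) String() string {
	switch cf {
	case CallFormatPlain:
		return "plain"
	case CallFormatEncryptedX25519DeoxysII:
		return "encrypted/x25519-deoxysii"
	default:
		return fmt.Sprintf("unknown(%d)", uint8(cf))
	}
}

// Assumed sizes for callformat 1.
const (
	x25519PublicKeySize = 32
	deoxysIINonceSize   = 15
)

// checkSizes verifies that opaque byte slices have the lengths expected for
// the given format, without interpreting their contents.
func checkSizes(cf CallFormat, publicKey, nonce []byte) error {
	switch cf {
	case CallFormatPlain:
		return nil
	case CallFormatEncryptedX25519DeoxysII:
		if len(publicKey) != x25519PublicKeySize {
			return fmt.Errorf("%s: public key is %d bytes, want %d", cf, len(publicKey), x25519PublicKeySize)
		}
		if len(nonce) != deoxysIINonceSize {
			return fmt.Errorf("%s: nonce is %d bytes, want %d", cf, len(nonce), deoxysIINonceSize)
		}
		return nil
	default:
		return fmt.Errorf("unsupported callformat %s", cf)
	}
}

func main() {
	err := checkSizes(CallFormatEncryptedX25519DeoxysII, make([]byte, 32), make([]byte, 15))
	fmt.Println(CallFormatEncryptedX25519DeoxysII, err)
	fmt.Println(checkSizes(CallFormat(7), nil, nil))
}
```

Keeping the fields as plain byte slices in storage and checking lengths only where a format is known leaves room for future callformats with different sizes.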

storage/migrations/02_runtimes.up.sql

analyzer/runtime/extract.go

Co-authored-by: mitjat <mitjat@users.noreply.github.com>

pro-wh requested review from aefhm, Andrew7234 and mitjat as code owners May 5, 2023 00:24

pro-wh marked this pull request as draft May 5, 2023 00:24

pro-wh force-pushed the pro-wh/feature/envelope branch 3 times, most recently from 0dbece1 to 2027e2b Compare May 5, 2023 23:07

pro-wh marked this pull request as ready for review May 5, 2023 23:07

pro-wh commented May 5, 2023

mitjat reviewed May 8, 2023

Andrew7234 reviewed May 10, 2023

mitjat reviewed May 10, 2023

storage: typo in comment

602f256

pro-wh force-pushed the pro-wh/feature/envelope branch from 8ff1eef to f9c0338 Compare May 10, 2023 21:32

pro-wh mentioned this pull request May 10, 2023

runtime_transactions table is getting very wide #414

Open

pro-wh added 2 commits May 10, 2023 14:48

analyzer: note evm encrypted data

56b501c

storage: add runtime_transactions fields for evm encrypted data

b4f4a80

pro-wh force-pushed the pro-wh/feature/envelope branch from f9c0338 to b4f4a80 Compare May 10, 2023 21:52

runtime: convert callformats to string type

cd360e3

pro-wh requested a review from mitjat May 10, 2023 22:37

Andrew7234 mentioned this pull request May 10, 2023

Andrew7234/feature/envelope api #415

Merged

mitjat approved these changes May 11, 2023

runtime: err naming

27f74fd

Co-authored-by: mitjat <mitjat@users.noreply.github.com>

pro-wh force-pushed the pro-wh/feature/envelope branch from daa75c0 to 27f74fd Compare May 11, 2023 23:39

pro-wh merged commit 92251b7 into main May 12, 2023
5 checks passed

pro-wh deleted the pro-wh/feature/envelope branch May 12, 2023 00:11
