Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

analyzers, api: Fix dead reckoning of gas_used, num_transfers, total_balance #512

Merged
merged 5 commits into from Sep 11, 2023

Conversation

mitjat
Copy link
Collaborator

@mitjat mitjat commented Aug 31, 2023

This PR fixes bugs in the tracking of three dead-reckoned values:

  • gas_used in the sense of "gas used by other to call EVM contract X".
    • This has been renamed internally (but not in the API) to gas_for_calling(X), to distinguish it from gas_used(X) which normally has the meaning "gas that account X paid".
    • The column was moved from chain.evm_contract to chain.runtime_accounts. When contract X is called, the former table might not have a row for X yet, which led to copmlications (and errors) in tracking gas_for_calling.
  • num_transfers and total_balance for EVM tokens
    • These were previously tracked for every token X in chain.evm_tokens if it had an entry for X, and in analysis.evm_tokens otherwise. Entries were then transferred

The PR also simplifies the implementation, and makes it more robust (see below).

Motivation for the PR:

  • Bugs in old implementation.
  • Because the order of inserting into one or the other table was important, the old tracking approach was not compatible with parallel indexing that we're introducing in fast-sync mode.

Testing:

  • I adapted our e2e regression test so it exercises EVM stuff also, and ran it on blocks 1_003_298 to 1_004_298 on mainnet emerald.
    • It turned out that the old implementation fails the test (results of a run are not deterministic).
    • New code produces deterministic results.
  • For gas_for_calling and num_transfers, I calculated the correct number with a separate SQL query (see below) and compared it with what the analyzer came up with. The results were consistently correct with new code, and incorrect with old.
SQL queries for comparing correct values with analyzer-produced ones
-- CHECK ERRORS IN: num_transfers
SELECT * FROM (
  WITH transfers AS (
      SELECT runtime, DECODE(body ->> 'address', 'base64') AS eth_addr, COUNT(*) AS num_xfers
          FROM chain.runtime_events
          WHERE evm_log_signature='\xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef'::bytea -- ERC-20 and ERC-721 Transfer
          GROUP BY runtime, eth_addr
  )
  SELECT
  tokens.runtime,
  tokens.token_address,
  tokens.num_transfers,
  COALESCE((
    SELECT transfers.num_xfers FROM transfers
    LEFT JOIN chain.address_preimages as preimages 
    ON
        preimages.address_data = transfers.eth_addr AND
        preimages.context_identifier = 'oasis-runtime-sdk/address: secp256k1eth' AND
        preimages.context_version = 0
    WHERE
        tokens.runtime = transfers.runtime AND
        tokens.token_address = preimages.address
  ), 0) AS true_num_transfers
  FROM chain.evm_tokens as tokens
) x ORDER BY abs(true_num_transfers-num_transfers) desc;

-- CHECK ERRORS IN: gas_for_calling
select * from (
  select
    runtime,
    contract_address,
    (SELECT gas_for_calling FROM chain.runtime_accounts ra WHERE (ra.runtime = c.runtime) AND (ra.address = c.contract_address)) AS gas_used,
    (select sum(gas_used) from chain.runtime_transactions rt where c.contract_address = rt.to ) as true_gas_for_calling,
    (select timestamp from chain.runtime_transactions rt where c.creation_tx = rt.tx_hash) as created_at
  from chain.evm_contracts c
) x order by abs(gas_used - true_gas_for_calling) desc, runtime, contract_address
  • For total_supply, the new code agreed with the old one, except for one token (wROSE). I checked manually, and:
    • Old code reported in the DB that the last update to wROSE balance happened at round 1003301.
    • The new code reported that the last update to wROSE balance happened at round 1004297.
    • ⚠️ I don't see how either of those blocks changes the total supply.
    • That said, I manually queried the runtime for the total supply at round 1004298 (where the analyzer was instructed to finish), and it matches what the new code comes up with.

analyzer/evmtokens/evm_tokens.go Show resolved Hide resolved
@@ -625,6 +608,7 @@ var (
LEFT JOIN chain.address_preimages AS account_preimage ON
account_preimage.address = balance_analysis.account_address
WHERE
evm_tokens.token_type IS NOT NULL AND
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before, token_type was guaranteed to be NOT NULL. Now, a type of NULL means "this is not a token for sure yet, but it's a candidate"

@mitjat mitjat force-pushed the mitjat/simplify-evm-dead-reckon branch from c21e7d1 to 17ebf3a Compare August 31, 2023 06:33
FROM analysis.evm_tokens AS tokens
ON CONFLICT(runtime, token_address) DO UPDATE SET
total_supply = EXCLUDED.total_supply,
num_transfers = EXCLUDED.num_transfers,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing value for num_transfers that we're pulling in from analysis.evm_tokens here is known to be a little off for some tokens. But I'd rather just accept the small discrepancy for now: People only care about the approx value anyway, plus the problem will fix itself when we do a full reindex.
The alternative is to have a fairly convoluted join here (the migration is unreadable enough as is ...) that is also quite expensive (requires join with address_preimages and a linear scan of all events, our largest table), so it would make nexus unavailable for a while.

analyzer/evmtokens/evm_tokens.go Show resolved Hide resolved
storage/migrations/12_evm_contract_gas.up.sql Outdated Show resolved Hide resolved
Comment on lines +58 to +71
-- Moves the chain.evm_contracts.gas_used column to chain.runtime_accounts.gas_for_calling

ALTER TABLE chain.runtime_accounts
-- Total gas used by all txs addressed to this account. Primarily meaningful for accounts that are contracts.
ADD COLUMN gas_for_calling UINT63 NOT NULL DEFAULT 0;

UPDATE chain.runtime_accounts as ra
SET gas_for_calling = c.gas_used
FROM chain.evm_contracts c
WHERE c.runtime = ra.runtime AND
c.contract_address = ra.address;

ALTER TABLE chain.evm_contracts
DROP COLUMN gas_used;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm is it out of place to track gas_for_calling here, instead of in evm_contracts? It seems like we currently only use this information for the evm contracts endpoint. Are there non-evm-contract addresses we want to track gas_for_calling for?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are currently on non-(evm-contract) addresses that we'd want to track gas_for_calling for. If we add support for cipher, there will be (non-evm)-contract addresses that we might well want to include in this reckoning.

I could also see a potential argument for why we'd want to track this for non-contracts: If people spend a lot of gas for sending money to an account, that's a noteworthy account.

All of that said, the main reason to track this outside the evm_contracts table is to avoid the problem of how/where to do dead reckoning for an address before we know for sure that it's a contract (and thus before it has an entry in evm_contracts). Well, and also Cipher or any future runtimes; those contracts won't be in evm_contracts.

I think it's fair to have this here; it's a property of every account, it's just that it's the most interesting for contract accounts. I hope the comment in line 61 clarifies enough.

storage/migrations/15_runtime_account_stats.up.sql Outdated Show resolved Hide resolved
if possibleToken.Mutated || possibleToken.TotalSupplyChange.Cmp(&big.Int{}) != 0 || possibleToken.NumTransfersChange != 0 {
// One of the mutable-and-queriable-from-the-runtime properties has changed in this round; mark that in the DB.
// NOTE: If _only_ numTransfers (which we cannot query from the EVM) changed, there's seemingly little point in
// requesting a re-download. But in practice, totalSupply will change without an explicit burn/mint event,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

totalSupply will change without an explicit burn/mint event

Oh huh, do we know what causes this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No :(

I looked into this some, but I only have observations, no deep understanding. I tested everythin on the wROSE contract, 0x21c718c22d52d0f3a789b752d4c2fd5908a8a733

  • totalSupply changes in most rounds that emit an event (not necessarily burn or mint!) involving that address. Those rounds are NOT at all the same as the rounds with txs referencing wROSE as a standalone arg.
  • of the two txs that I checked and that generate events involving wROSE, both are about yuzuswap; example
    • to make things more confusing, that tx involves some transfer of 5 wROSE, but after the round, the total supply of wROSE increases by 1 (or 0.1 or some other power of 10; not sure about decimals).

IDK I'm inclined to say that the YUZURouter contract changes the total supply of wROSE somehow (internal transactions?) and events should be generated but they're not. I'm far from confident though. In case it's of interest, and in case I resume with this at some point, I'm attaching "private lab notes style" snippets I used when looking into this.

Log of things I tried / helpful snippets

This to find all the forms of wROSE's address: oasis format, eth format in hex and base64:

indexer=# select * from chain.evm_tokens  where symbol ='wROSE';
 runtime |                 token_address                  | token_type |  token_name  | symbol | decimals |        total_supply         | num_transfers | last_mutate_round | last_download_round 
---------+------------------------------------------------+------------+--------------+--------+----------+-----------------------------+---------------+-------------------+---------------------
 emerald | oasis1qpgcp5jzlgk4hcenaj2x82rqk8rrve2keyuc8aaf |         20 | Wrapped ROSE | wROSE  |       18 | 134273688211406143160770252 |          2113 |           1004297 |             1004298
indexer=# select address_data FROM chain.address_preimages where address=('oasis1qpgcp5jzlgk4hcenaj2x82rqk8rrve2keyuc8aaf');
                address_data                
--------------------------------------------
 \x21c718c22d52d0f3a789b752d4c2fd5908a8a733
oasisindexer=> select encode('\x21c718c22d52d0f3a789b752d4c2fd5908a8a733'::bytea, 'base64');
            encode            
------------------------------
 IccYwi1S0POnibdS1ML9WQiopzM=
(1 row)

This to dump the total balance for ~1000 rounds; I just plopped this at the beginning of the evm_tokens analyzer:

	for round := 1003301; round <= 1004297; round++ {
		var wrose []byte = make([]byte, 20)
		// parse hex string `s` into x
		hex.Decode(wrose, ([]byte)("21c718c22d52d0f3a789b752d4c2fd5908a8a733"))
		tokenData, err := evm.EVMDownloadNewToken(
			ctx,
			m.logger,
			m.source,
			uint64(round),
			wrose,
		)
		if err != nil {
			m.logger.Error("error downloading token", "err", err)
			return
		}
		m.logger.Info("wROSE total supply", "round", round, "totalSupply", tokenData.TotalSupply)
	}
	os.Exit(0)

The log output (round, totalSupply) for the first several rounds is (annotations manual):

[1003301,133550131463522080000000000]
[1003302,133550131463522080000000000]
[1003303,133550131463522080000000000]
[1003304,133550131463522080000000000]
[1003305,133550131463522080000000000]
[1003306,133550136463522090000000000] change! has wROSE event.
[1003307,133550136463522090000000000]
[1003308,133550136463522090000000000]
[1003309,133557178326858330000000000] change! has wROSE event.
[1003310,133563967671930200000000000] change! has wROSE event.
[1003311,133563967671930200000000000]
[1003312,133570740220382000000000000] change! has wROSE event.
[1003313,133570740220382000000000000]
[1003314,133577253255795830000000000] change! has wROSE event.
[1003315,133577253255795830000000000]
[1003316,133577239443979430000000000] change! has wROSE event.
[1003317,133583821956026310000000000] change! has wROSE event.
[1003318,133583821956026310000000000] NO change. has wROSE event.
[1003319,133583821956026310000000000]
[1003320,133590316151646200000000000] change! has wROSE event.

This to find relevant events and txs (used to generate above annotations):

-- select * from chain.runtime_events where (body::text like '%IccYwi1S0POnibdS1ML9WQiopzM%' or body::text like '%21c718c22%' or body::text like '%oasis1qpgcp5jzlgk4hcenaj2x82rqk8rrve2keyuc8aaf%' or related_accounts::text like '%oasis1qpgcp5jzlgk4hcenaj2x82rqk8rrve2keyuc8aaf%') and runtime='emerald' and round=1004297;
select * from chain.runtime_events where runtime='emerald' and round>=1003301 and round<=1004297 and body::text like '%IccYwi1S0POnibdS1ML9WQiopzM=%';
select round from chain.runtime_transactions where runtime='emerald' and round>=1003301 and round<=1004297 and body::text like '%IccYwi1S0POnibdS1ML9WQiopzM=%'

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the thorough investigation!

if transactionData.Method == "evm.Call" {
// Dead-reckon gas used by contracts
batch.Queue(queries.RuntimeEVMContractGasUsedUpdate,
if (transactionData.Method == "evm.Call" || transactionData.Method == "evm.Create") && transactionData.To != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch - was this the reason for the wrong accounting?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can transactionData.To ever be null for successful EVM transactions? Or does that likely indicate an error in the BlockTransactionData parsing code?

If the former, a comment would be suitable. If the latter, maybe move this below into a separate conditional, and log a warning.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Andrew7234: As far as I understand, no. When the old code inserted into chain.evm_tokens for the first time, it calculated and inserted this:

    WITH
      contract_gas_used AS (
        SELECT SUM(gas_used) AS total_gas
        FROM chain.runtime_transactions
        WHERE runtime = $1 AND "to" = $2::text
      )

which includes the evm.Create gas. So basically I added it here to match the old behavior. Whether that's more "correct" from a product perspective, I'm actually in doubt. But mainly I think it doesn't matter either way, as long as we're consistent :)
I don't know the reason the old code was a little off.

@ptrus: Fun corner case: it is nil for reverted evm.Create. Added a comment.

@pro-wh pro-wh mentioned this pull request Sep 6, 2023
Copy link
Collaborator Author

@mitjat mitjat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, and sorry this took long. I wanted to look into why totalSupply changes for seemingly no good reason.

if transactionData.Method == "evm.Call" {
// Dead-reckon gas used by contracts
batch.Queue(queries.RuntimeEVMContractGasUsedUpdate,
if (transactionData.Method == "evm.Call" || transactionData.Method == "evm.Create") && transactionData.To != nil {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Andrew7234: As far as I understand, no. When the old code inserted into chain.evm_tokens for the first time, it calculated and inserted this:

    WITH
      contract_gas_used AS (
        SELECT SUM(gas_used) AS total_gas
        FROM chain.runtime_transactions
        WHERE runtime = $1 AND "to" = $2::text
      )

which includes the evm.Create gas. So basically I added it here to match the old behavior. Whether that's more "correct" from a product perspective, I'm actually in doubt. But mainly I think it doesn't matter either way, as long as we're consistent :)
I don't know the reason the old code was a little off.

@ptrus: Fun corner case: it is nil for reverted evm.Create. Added a comment.

if possibleToken.Mutated || possibleToken.TotalSupplyChange.Cmp(&big.Int{}) != 0 || possibleToken.NumTransfersChange != 0 {
// One of the mutable-and-queriable-from-the-runtime properties has changed in this round; mark that in the DB.
// NOTE: If _only_ numTransfers (which we cannot query from the EVM) changed, there's seemingly little point in
// requesting a re-download. But in practice, totalSupply will change without an explicit burn/mint event,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No :(

I looked into this some, but I only have observations, no deep understanding. I tested everythin on the wROSE contract, 0x21c718c22d52d0f3a789b752d4c2fd5908a8a733

  • totalSupply changes in most rounds that emit an event (not necessarily burn or mint!) involving that address. Those rounds are NOT at all the same as the rounds with txs referencing wROSE as a standalone arg.
  • of the two txs that I checked and that generate events involving wROSE, both are about yuzuswap; example
    • to make things more confusing, that tx involves some transfer of 5 wROSE, but after the round, the total supply of wROSE increases by 1 (or 0.1 or some other power of 10; not sure about decimals).

IDK I'm inclined to say that the YUZURouter contract changes the total supply of wROSE somehow (internal transactions?) and events should be generated but they're not. I'm far from confident though. In case it's of interest, and in case I resume with this at some point, I'm attaching "private lab notes style" snippets I used when looking into this.

Log of things I tried / helpful snippets

This to find all the forms of wROSE's address: oasis format, eth format in hex and base64:

indexer=# select * from chain.evm_tokens  where symbol ='wROSE';
 runtime |                 token_address                  | token_type |  token_name  | symbol | decimals |        total_supply         | num_transfers | last_mutate_round | last_download_round 
---------+------------------------------------------------+------------+--------------+--------+----------+-----------------------------+---------------+-------------------+---------------------
 emerald | oasis1qpgcp5jzlgk4hcenaj2x82rqk8rrve2keyuc8aaf |         20 | Wrapped ROSE | wROSE  |       18 | 134273688211406143160770252 |          2113 |           1004297 |             1004298
indexer=# select address_data FROM chain.address_preimages where address=('oasis1qpgcp5jzlgk4hcenaj2x82rqk8rrve2keyuc8aaf');
                address_data                
--------------------------------------------
 \x21c718c22d52d0f3a789b752d4c2fd5908a8a733
oasisindexer=> select encode('\x21c718c22d52d0f3a789b752d4c2fd5908a8a733'::bytea, 'base64');
            encode            
------------------------------
 IccYwi1S0POnibdS1ML9WQiopzM=
(1 row)

This to dump the total balance for ~1000 rounds; I just plopped this at the beginning of the evm_tokens analyzer:

	for round := 1003301; round <= 1004297; round++ {
		var wrose []byte = make([]byte, 20)
		// parse hex string `s` into x
		hex.Decode(wrose, ([]byte)("21c718c22d52d0f3a789b752d4c2fd5908a8a733"))
		tokenData, err := evm.EVMDownloadNewToken(
			ctx,
			m.logger,
			m.source,
			uint64(round),
			wrose,
		)
		if err != nil {
			m.logger.Error("error downloading token", "err", err)
			return
		}
		m.logger.Info("wROSE total supply", "round", round, "totalSupply", tokenData.TotalSupply)
	}
	os.Exit(0)

The log output (round, totalSupply) for the first several rounds is (annotations manual):

[1003301,133550131463522080000000000]
[1003302,133550131463522080000000000]
[1003303,133550131463522080000000000]
[1003304,133550131463522080000000000]
[1003305,133550131463522080000000000]
[1003306,133550136463522090000000000] change! has wROSE event.
[1003307,133550136463522090000000000]
[1003308,133550136463522090000000000]
[1003309,133557178326858330000000000] change! has wROSE event.
[1003310,133563967671930200000000000] change! has wROSE event.
[1003311,133563967671930200000000000]
[1003312,133570740220382000000000000] change! has wROSE event.
[1003313,133570740220382000000000000]
[1003314,133577253255795830000000000] change! has wROSE event.
[1003315,133577253255795830000000000]
[1003316,133577239443979430000000000] change! has wROSE event.
[1003317,133583821956026310000000000] change! has wROSE event.
[1003318,133583821956026310000000000] NO change. has wROSE event.
[1003319,133583821956026310000000000]
[1003320,133590316151646200000000000] change! has wROSE event.

This to find relevant events and txs (used to generate above annotations):

-- select * from chain.runtime_events where (body::text like '%IccYwi1S0POnibdS1ML9WQiopzM%' or body::text like '%21c718c22%' or body::text like '%oasis1qpgcp5jzlgk4hcenaj2x82rqk8rrve2keyuc8aaf%' or related_accounts::text like '%oasis1qpgcp5jzlgk4hcenaj2x82rqk8rrve2keyuc8aaf%') and runtime='emerald' and round=1004297;
select * from chain.runtime_events where runtime='emerald' and round>=1003301 and round<=1004297 and body::text like '%IccYwi1S0POnibdS1ML9WQiopzM=%';
select round from chain.runtime_transactions where runtime='emerald' and round>=1003301 and round<=1004297 and body::text like '%IccYwi1S0POnibdS1ML9WQiopzM=%'

@mitjat mitjat force-pushed the mitjat/simplify-evm-dead-reckon branch from 17ebf3a to 818e080 Compare September 11, 2023 20:09
@mitjat mitjat merged commit 4fb8274 into main Sep 11, 2023
6 checks passed
@mitjat mitjat deleted the mitjat/simplify-evm-dead-reckon branch September 11, 2023 20:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants