Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

trace compatibility testing with Trueblocks #3721

Open
1 task done
wakamex opened this issue Jul 11, 2023 · 25 comments
Open
1 task done

trace compatibility testing with Trueblocks #3721

wakamex opened this issue Jul 11, 2023 · 25 comments
Assignees
Labels
A-rpc Related to the RPC implementation C-bug An unexpected or incorrect behavior S-needs-investigation This issue requires detective work to figure out what's going wrong

Comments

@wakamex
Copy link
Contributor

wakamex commented Jul 11, 2023

Describe the bug

trying to create a local trueblocks unchained index, I'm getting some warnings:

Unknown trace type selfdestruct
Unknown trace type selfdestruct
Unknown trace type selfdestructng 391  of 2000 at block 11182389 of 17659147 (6476758 blocks from head)
Unknown trace type selfdestruct
Unknown trace type selfdestruct

I'm not clear how impactful this particular warning is to Trueblocks operation, but it points to something non-standard in Reth's trace implementation.

Digging into this further, I couldn't find specifically where in the trueblocks or erigon codebases they deal with selfdestruct in a way that could explain this discrepancy.

However, Trueblocks has a long list of tests for expected tracing functionality in /src/other/trace_tests/curl_tests. Running these on my reth node, I get a difference from expected for 26 out of 35 tests. You can find the diffs of those tests in this PR: https://github.com/TrueBlocks/trueblocks-core/pull/3037/files

@tjayrush could explain these further.

Steps to reproduce

reproduce warnings above:

  1. sync erigon
  2. install trueblocks
  3. run chifra scrape

get comparison vs. expected Trueblocks trace tests

  1. run_all in /src/other/trace_tests/curl_tests of trueblocks repo (modify line 3 as needed: export RPC_ENDPOINT="http://localhost:23456")
  2. compare_expected_produced.sh

Node logs

No response

Platform(s)

Linux (x86)

What version/commit are you on?

reth Version: 0.1.0-alpha.2 Commit
SHA: 1330fc1
Build Timestamp: 2023-07-08T22:15:01.820025074Z
Build Features:
Build Profile: release

What database version are you on?

Current database version: 1
Local database is uninitialized

If you've built Reth from source, provide the full command you used

cargo build --release

Code of Conduct

  • I agree to follow the Code of Conduct
@wakamex wakamex added C-bug An unexpected or incorrect behavior S-needs-triage This issue needs to be labelled labels Jul 11, 2023
@mattsse
Copy link
Collaborator

mattsse commented Jul 11, 2023

this is very helpful, thanks a lot

looking into this

@mattsse mattsse added S-needs-investigation This issue requires detective work to figure out what's going wrong A-rpc Related to the RPC implementation and removed S-needs-triage This issue needs to be labelled labels Jul 11, 2023
@mattsse mattsse self-assigned this Jul 11, 2023
@tjayrush
Copy link

Thanks @wakamex. @mattsse, please makes sure to ping me either in discord or here if you have any questions about any of these tests. We wrote these tests when we were porting from OpenEthereum to Erigon and as far as I know they worked about two and a half years ago with Erigon.

Things may have changed under the covers with Erigon (although I hope not). So, if a test does not seem right, don't assume there's something wrong with Reth. Ping me and I can help you figure it out.

I'm going to install Reth this weekend, so should have a locally running copy soon to help test this.

@mattsse
Copy link
Collaborator

mattsse commented Jul 12, 2023

looked at the unknown trace type error, and turns out the OE traces never migrated from suicide to selfdestruct

https://github.com/TrueBlocks/trueblocks-core/blob/7c65387d5d5aa97af7ccf6269281f200711a74ca/src/apps/chifra/internal/scrape/handle_blaze_blaze.go#L294-L294

so in order to be compatible I think we should rename this to suicide as well

@mattsse
Copy link
Collaborator

mattsse commented Jul 12, 2023

@wakamex
Copy link
Contributor Author

wakamex commented Jul 12, 2023

is the insufficient funds error erigon, and the success response reth?

yup, you can see the reth output for each test in /src/other/trace_tests/curl_tests/reth

for that test in src/other/trace_tests/curl_tests/reth/trace_callMany_all.json we have:
https://github.com/TrueBlocks/trueblocks-core/pull/3037/files?file-filters%5B%5D=.json&show-viewed-files=true#diff-43fbbd2af40737486871fc9b12f73ae842426b16a37678e8b5ece40be1be9f78R1-R8

Perhaps more useful would be running these tests again today against an erigon node, as that seems to be a better source of truth than OE?

@tjayrush
Copy link

We should definitely run the tests again against Erigon as these 'expected' results are about two and a half years old. While it would be surprising if they changed things, it's very possible that they did.

@tjayrush
Copy link

Just a quick note. These tests are purposefully using curl (as opposed to say, TrueBlocks) so as to eliminate any weird processing in TrueBlocks. The curl responses from both nodes should be identical (at least that's what I think).

@mattsse
Copy link
Collaborator

mattsse commented Jul 12, 2023

I think the insufficient fund error can be explained by:

https://github.com/ledgerwatch/erigon/blob/af504f675a84868b941fbad840a3d2819fb525bf/turbo/jsonrpc/trace_adhoc.go#L967-L967

which is different from eth_call for example:

https://github.com/ledgerwatch/erigon/blob/af504f675a84868b941fbad840a3d2819fb525bf/turbo/transactions/call.go#L86-L86

note the disabled basefee, reth traceCall currently behaves as eth_call: basefee disabled, gas price 0

which differs from erigon

@mattsse
Copy link
Collaborator

mattsse commented Jul 12, 2023

it's still suicide

https://github.com/ledgerwatch/erigon/blob/af504f675a84868b941fbad840a3d2819fb525bf/turbo/jsonrpc/trace_adhoc.go#L40-L40

so I changed selfdestruct to suicide here for compact #3736

@tjayrush
Copy link

We should probably check against a current version of Erigon. These 'expected' results are two and a half years old. @wakamex Do you have an Erigon node?

@gakonst
Copy link
Member

gakonst commented Jul 12, 2023

@tjayrush reaching out in DMs with an RPC.

@wakamex
Copy link
Contributor Author

wakamex commented Jul 12, 2023

I got my hands on an erigon node. Here are the reth vs. erigon results:

Comparing reth_2023_07_11/ to erigon_2023_07_12/, placing results into reth_erigon/
Command executed on Wed Jul 12 19:36:46 UTC 2023
=== Not found ===

=== Identical ===
erigon_2023_07_12/trace_get_0.json
erigon_2023_07_12/trace_get_3.json
erigon_2023_07_12/trace_replayBlockTransactions_trace.json
erigon_2023_07_12/trace_replayTransaction_all.json
erigon_2023_07_12/trace_replayTransaction_trace.json
=== Difference found ===
erigon_2023_07_12/trace_block.json
erigon_2023_07_12/trace_callMany_all.json
erigon_2023_07_12/trace_callMany_none.json
erigon_2023_07_12/trace_callMany_stateDiff.json
erigon_2023_07_12/trace_callMany_trace.json
erigon_2023_07_12/trace_callMany_vmTrace.json
erigon_2023_07_12/trace_call_all.json
erigon_2023_07_12/trace_call_error.json
erigon_2023_07_12/trace_call_none.json
erigon_2023_07_12/trace_call_stateDiff.json
erigon_2023_07_12/trace_call_trace.json
erigon_2023_07_12/trace_call_view.json
erigon_2023_07_12/trace_call_view_none.json
erigon_2023_07_12/trace_call_vmTrace.json
erigon_2023_07_12/trace_filter.json
erigon_2023_07_12/trace_get_10.json
erigon_2023_07_12/trace_get_5.json
erigon_2023_07_12/trace_rawTransaction_none.json
erigon_2023_07_12/trace_rawTransaction_trace.json
erigon_2023_07_12/trace_rawTransaction_vmTrace.json
erigon_2023_07_12/trace_replayBlockTransactions_all.json
erigon_2023_07_12/trace_replayBlockTransactions_none.json
erigon_2023_07_12/trace_replayBlockTransactions_stateDiff.json
erigon_2023_07_12/trace_replayBlockTransactions_vmTrace.json
erigon_2023_07_12/trace_replayTransaction_none.json
erigon_2023_07_12/trace_replayTransaction_stateDiff.json
erigon_2023_07_12/trace_replayTransaction_vmTrace.json
erigon_2023_07_12/trace_transaction.json

You can find the diffs here: https://github.com/wakamex/trueblocks-core/tree/compare_script/src/other/trace_tests/curl_tests/reth_erigon

See this PR for the scripts I used, or run it yourselves: TrueBlocks/trueblocks-core#3037

@tjayrush
Copy link

I thought I'd add some color to some of these test cases. This is from memory, so take it with a grain of salt, but...


These failed tests return null instead of [] for traceAddress. This should be simple to fix.


We thought that someone had hand-coded assembly language for this transaction and had made a mistake, so reporting an error is correct. The error, I think, is correct if you look at the actual data. Not sure why it's saying something about the nonce.


Be a little bit careful with this one. I'm not sure this is the exact transaction, but I remember that OpenEthereum may have had a bug in certain places where the first trace in some transactions was incorrectly reported. Erigon found may have hit on this same issue (which is why this test is included). If I remember right, Erigon ended up adding a "compatibility" mode for some cases where they would act correctly by default but allow the user to cause old OpenEthereum bugs to persist. I don't know if this is such a case, but be careful here.


This may have been one of those backward compatibility things. The spec may say it requires a u64, but most of the other RPC endpoints require hex. So, as a compromise, even thought it's not in the spec, this was allowed. The spec is not really a spec after all.


Personally, I think better error messages are better, but this may break existing code. It won't break TrueBlocks, but it may break others.

@mattsse
Copy link
Collaborator

mattsse commented Jul 14, 2023

tysm for these

for trace_rawTransaction, I think the nonce error occurs because the transaction's nonce is lower that what's in latest state.
so this tx would be invalid at this state

@gakonst
Copy link
Member

gakonst commented Jul 14, 2023

We'll get back to these - thank you TJ, we're resyncing that node for now so it won't be available if you're trying to do more things.

@wakamex
Copy link
Contributor Author

wakamex commented Jul 14, 2023

updated the baseline expectation in favor of modern Erigon in these benign cases: changed gas value, remove of null transactionHash and transactionPosition, improved error message (for more info see wakamex/trace_tests#2)

this leaves only 2 categories of differences between this baseline and modern Erigon, which require further investigation:

  • error where there was result before
    • can't tell if the previous results were bogus or just more verbose (5 trace_rawTransaction_* tests)
  • different behavior of trace_get returning chained traces (2 trace_get_* tests)

result of test_trace_get.sh (Array is the second parameter as described here, table displays result.traceAddress field):

Array reth erigon
0 [0] [0]
1 [1] [1]
2 [2] [2]
3 [3] [3]
4 [4] [4]
5 [5] [4,0]
6 [6] [4,1]
7 null [4,2]
8 null [5]
9 null [5,0]
10 null [5,1]
11 null [5,2]
12 null [6]
13 null [6,0]
14 null [6,1]
15 null [6,2]
16 null null

@tjayrush
Copy link

tjayrush commented Jul 14, 2023

Just one quick comment. The traceAddress example above is super interesting and important for getting the correct calling structure of the transaction. Basically, the array is a call tree. It's almost certainly zero-based, but it may be one-based. I can't remember. [6,2] is the sixth call from the second call in the top level (or seventh and third if zero-based, but you get the idea.)

I'm 99% sure the very top level (trace) is null. The top level trace basically is the transaction itself with slight (very confusing) differences in the gas calc.

@mattsse
Copy link
Collaborator

mattsse commented Jul 19, 2023

thanks for the example I was able to track this down,

previously we matched the indices to the trace address which explains the null values.

should be fixed in #3852

@gakonst
Copy link
Member

gakonst commented Jul 20, 2023

What else do we need to close this? I spoke with Peter and he said 50M gas for tracing was a heuristic, so maybe we should just set it to that, and call it a day?

@tjayrush
Copy link

Are the tests passing? The tests represent previous behaviour.

@yabirgb
Copy link

yabirgb commented Jul 28, 2023

@gakonst any update on this? Was the limit increased?

@mattsse
Copy link
Collaborator

mattsse commented Jul 28, 2023

yes, we also have some open tracing fixes that will land shortly, I will check compatibility again

@tjayrush
Copy link

tjayrush commented Aug 7, 2023

I'm in the process of syncing my node. Once it's synced, the first thing I'll do is double-check these issues and try to make them more consumable (by creating a separate, much simpler, example for each outstanding issue).

@mattsse
Copy link
Collaborator

mattsse commented Aug 7, 2023

tysm for this,

fyi we're aiming for a new alpha release within the next few days

@onbjerg
Copy link
Member

onbjerg commented Apr 2, 2024

@wakamex @tjayrush Hi, I realize this issue is pretty old, but I am wondering if this is still an issue for you? We've had quite a few tracing fixes since the versions mentioned in here

@onbjerg onbjerg self-assigned this Apr 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-rpc Related to the RPC implementation C-bug An unexpected or incorrect behavior S-needs-investigation This issue requires detective work to figure out what's going wrong
Projects
Status: Todo
Development

No branches or pull requests

6 participants