erigon eats 100gb+ of memory when tracing a certain tx #4637
Comments
@banteg is there anything more recent that shows this or similar behaviour? I have a pruned node, so I can't check that far back in history. Or do similar transactions that are less than half a year old trace just fine? |
This is also affecting the latest
|
This is because of this PR: #2779 |
ah, okay, then we probably need to think about adding some kind of pagination/limitation for these traces, or some binary response |
I also wonder if another JSON serialization lib could work there; Go's standard encoding/json marshalling isn't the most frugal code |
no, my dataset consisted of 11,000 transactions and only these three had this behavior |
I'm also having this issue with stable release, tx |
okay, what we can try is changing the JSON serialization library in an experiment branch, and then @banteg @darkhorse-spb, if you can, test it on your machines and see if that helps at all |
@mandrigin It's impossible to stream JSON and still return an error if one happens in the middle of streaming, because JSON is not a streaming-friendly format. |
I also have a weird idea of using ETL to first dump everything to binary files, check for errors, and then stream the results. |
but the question is also, what eats all this RAM? @banteg can I ask you to run Erigon with the built-in rpc daemon and with |
We decided to re-enable the streaming feature by default: #4647 Erigon has enabled JSON streaming for some heavy endpoints (like trace_*). It's a trade-off: it greatly reduces RAM usage (in some cases from 30GB to 30MB), but it produces invalid JSON if an error happens in the middle of streaming (because JSON is not a streaming-friendly format). We decided that the value of streaming is higher than handling the rare "error in the middle" corner case, but added a flag, --rpc.streaming.disable, for users who wish to pay for correctness or compatibility. |
@banteg @darkhorse-spb can you check in the current devel version and see if it helped? |
Is it Go Code? We ran into the same issue with TrueBlocks. We stream our data too. We were able to get around it using a If the program crashes, and a subroutine never returns, it doesn't work, but the program crashed, so something isn't working anyway. |
Then the user will not see the error message at all |
We attach the error as another field in the object in the |
@tjayrush it may even work in many client libs. do you have an open-source example? |
I'm almost embarrassed to show it. It's super hacky, but here's an example: https://github.com/TrueBlocks/trueblocks-core/blob/feature/new-unchained-index-2.0/src/apps/chifra/internal/chunks/handle_addresses.go#L66. The RenderFooter routine, which closes an array and an object (everything our API delivers has the same shape), gets called even if an error happens. We deliver the error on standard error many levels above this code, so it just closes the JSON object and returns the error (or nil if there is no error). |
tnx, will try tomorrow |
@AskAlexSharov do you want to keep this one around? |
It’s fixed: streaming is enabled. But we also need to add this approach: #4637 (comment) |
okay @nanevardanyan will take a look at the error handling then |
seems fixed on erigon's side, but clients would need to consider streaming too. one of the traces i reported yields a 66.5GB response. here is a small script which will show both compressed and uncompressed size of the response. https://gist.github.com/banteg/98dbccbf6e2a3f997199a1b16eb93c5a |
System information
Erigon version: erigon version 2022.07.1-alpha-09776394
OS & Version: Linux
Commit hash : 0977639
Expected behaviour
an rpc call returns a trace
Actual behaviour
erigon gobbles up 100gb+ of memory and gets killed by the system
Steps to reproduce the behaviour
run
debug_traceTransaction
against any of these txs:
Backtrace
not available, erigon gets killed by the system