API thread poor performance getting (old) block data #3483

Open
bladedoyle opened this issue Nov 2, 2020 · 7 comments

@bladedoyle
Contributor

bladedoyle commented Nov 2, 2020

Describe the bug
The API thread performs very poorly when fetching some (old) block data; a single call can take up to several hours to return. During this time the API thread is busy, and the node also appears to stop receiving block data over p2p.

To Reproduce
curl -ugrin:<secret> localhost:3413/v1/blocks/29
The query is against an archive node.
Blocks before and after this work fine.
More than one block causes this issue (it's hard to test because a node restart is needed before trying again).

Relevant Information
No error messages in the node log or on the screen.
Hundreds of thousands of these warning messages:

20201108 17:36:17.257 WARN grin_chain::txhashset::txhashset - rewind_single_block: fallback to legacy input bitmap for block 00000470f0f9 at 11811
20201108 17:36:17.259 WARN grin_chain::txhashset::txhashset - rewind_single_block: 10 output_pos entries missing for: 00000470f0f9 at 11811
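
With hundreds of thousands of these warnings, it may help to tally them per block to see how many distinct heights the rewind touches. The following is a minimal sketch that assumes the log line layout shown above; `parse_missing` and `tally` are hypothetical helper names, not part of grin.

```rust
use std::collections::HashMap;

/// Parse one log line and return (block hash prefix, height, missing count)
/// if it is an "output_pos entries missing" warning; otherwise None.
/// Assumes the line layout seen in the warnings quoted above.
fn parse_missing(line: &str) -> Option<(String, u64, u64)> {
    // Text after the marker looks like:
    // "10 output_pos entries missing for: 00000470f0f9 at 11811"
    let rest = line.split("rewind_single_block: ").nth(1)?;
    let (count, tail) = rest.split_once(" output_pos entries missing for: ")?;
    let (hash, height) = tail.split_once(" at ")?;
    Some((
        hash.trim().to_string(),
        height.trim().parse().ok()?,
        count.trim().parse().ok()?,
    ))
}

/// Count how many matching warnings were logged per (hash, height).
fn tally<'a, I: IntoIterator<Item = &'a str>>(lines: I) -> HashMap<(String, u64), usize> {
    let mut counts = HashMap::new();
    for line in lines {
        if let Some((hash, height, _missing)) = parse_missing(line) {
            *counts.entry((hash, height)).or_insert(0) += 1;
        }
    }
    counts
}
```

Feeding the node log through `tally` line by line and checking the number of keys would show whether a single request is rewinding through one block or through many.
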
@bladedoyle
Contributor Author

Note that this thread being stuck prevents shutdown from completing.

@antiochp
Member

antiochp commented Nov 4, 2020

This seems weird to me -

5318    mdb/libraries/liblmdb/mdb.c: No such file or directory.

@bladedoyle
Contributor Author

That's just the debugger trying to find the source code for the stack trace. If the source were there, the debugger could show the code for that line in the stack.

@bladedoyle
Contributor Author

bladedoyle commented Nov 4, 2020

Note: I'm no longer convinced that the node "hangs" forever. I think it may just be extremely slow. I will confirm.

Note also: I turned up the log level and now see thousands of warnings like the one below, which I believe are related to the hang/slowness issue:
20201104 12:14:01.433 WARN grin_chain::txhashset::txhashset - rewind_single_block: 9 output_pos entries missing for: 0001621b28b9 at 918572

@antiochp
Member

antiochp commented Nov 5, 2020

We tracked this down to the code that generates (optional) Merkle proofs for unspent coinbase outputs in these early blocks.
Still investigating if we can simply disable this Merkle proof generation.

Edit: Still not entirely sure why it's this slow though. For a single block, even with an expensive rewind, it should only take milliseconds, maybe a few seconds at most. Maybe something else is going on here as well.

@antiochp
Member

antiochp commented Nov 5, 2020

This is a v1 api endpoint. I had not noticed this earlier.
v1 is scheduled to be deprecated in 5.0.0.
Are you able to try this with the corresponding v2 api?
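
For anyone repeating the test against v2, the request is a JSON-RPC call rather than a REST path. Below is a minimal sketch of building the request body, assuming the `get_block` method takes positional parameters in the order height, hash, commit; `get_block_request` is a hypothetical helper name.

```rust
/// Build a JSON-RPC 2.0 request body that asks for a block by height.
/// Assumption: params are positional [height, hash, commit], with the
/// unused lookups passed as null.
fn get_block_request(height: u64, id: u32) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","method":"get_block","params":[{},null,null],"id":{}}}"#,
        height, id
    )
}
```

The resulting string could then be sent with something like `curl -ugrin:<secret> -X POST -d '<body>' localhost:3413/v2/foreign`, assuming the v2 foreign endpoint is served at that path on the node.
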

@bladedoyle
Contributor Author

The v2 API shows the same issue.
Quoting dberkett from Keybase discussion:

pub fn get_block(
    &self,
    height: Option<u64>,
    hash: Option<Hash>,
    commit: Option<String>
) -> Result<BlockPrintable, Error>
So it returns BlockPrintable, which has an option for no merkle proof: https://docs.rs/grin_api/4.1.1/grin_api/struct.BlockPrintable.html#method.from_block
It doesn't look like that was exposed to the foreign API though.
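
To illustrate what exposing that option might look like, here is a toy sketch of a boolean flag gating the expensive proof step. All types and names here (`Output`, `OutputPrintable`, `build_merkle_proof`, `to_printable`, `include_merkle_proof`) are hypothetical stand-ins, not the real grin_api definitions.

```rust
/// Hypothetical stand-in for a block output; not the real grin type.
struct Output {
    commit: String,
    is_coinbase: bool,
    spent: bool,
}

/// Hypothetical printable form, with the proof left optional.
struct OutputPrintable {
    commit: String,
    merkle_proof: Option<String>,
}

/// Placeholder for the expensive step (in the node this is where the
/// rewind would happen). Here it only formats a string.
fn build_merkle_proof(commit: &str) -> String {
    format!("proof-for-{}", commit)
}

/// Convert outputs for display. Proofs are built only when the caller
/// asks for them, and only for unspent coinbase outputs.
fn to_printable(outputs: &[Output], include_merkle_proof: bool) -> Vec<OutputPrintable> {
    outputs
        .iter()
        .map(|o| {
            let wants_proof = include_merkle_proof && o.is_coinbase && !o.spent;
            OutputPrintable {
                commit: o.commit.clone(),
                merkle_proof: if wants_proof {
                    Some(build_merkle_proof(&o.commit))
                } else {
                    None
                },
            }
        })
        .collect()
}
```

A caller that only needs block contents would pass `false` and skip the proof work entirely, which is the behaviour the API would need to expose for this to help.
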

bladedoyle changed the title from "API thread hangs attempting to get (old) block data from LMDB" to "API thread poor performance getting (old) block data" on Nov 8, 2020