Store the serialized block size in the database #5911

Closed · Tracked by #7366
teor2345 opened this issue Jan 4, 2023 · 2 comments
Labels
A-rpc (Area: Remote Procedure Call interfaces)
A-state (Area: State / database changes)
C-enhancement (Category: This is an improvement)
I-slow (Problems with performance or responsiveness)

Comments

teor2345 commented Jan 4, 2023

Motivation

RPCs like getblock return the serialized block size, but we haven't found any specific clients that need it yet.

If we do find clients that need it, we want to return this size efficiently, without reading the whole block from the database. Avoiding that read would substantially improve performance for clients like lightwalletd and zebra-checkpoints, which use other fields in this RPC.

Complex Code or Requirements

We could add the serialized block size to the hash_by_height index. It only needs 4 bytes:
https://github.com/ZcashFoundation/zebra/blob/main/book/src/dev/rfcs/0005-state-updates.md#rocksdb-data-structures
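
A minimal sketch of one possible value layout, assuming we simply append the size to the existing 32-byte hash (the function name and big-endian encoding are illustrative, not Zebra's actual serialization code):

```rust
/// Hypothetical sketch: build a `hash_by_height` value that carries the
/// 32-byte block hash plus a 4-byte serialized block size.
/// Zcash blocks are limited to 2 MB, so the size always fits in a `u32`.
fn hash_by_height_value(hash: [u8; 32], serialized_block_size: u32) -> Vec<u8> {
    let mut value = Vec::with_capacity(32 + 4);
    value.extend_from_slice(&hash);
    value.extend_from_slice(&serialized_block_size.to_be_bytes());
    value
}
```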

This ticket changes the database format on disk. If we increment the state version, we'll trigger some rarely-used CI jobs, and every user will have to rebuild their entire database from scratch.

Instead, we could implement the size lookup in a backwards-compatible way: check the length of the hash_by_height value and return None if the block size is not present in the database. Then old databases would still work, and we could remove the compatibility code the next time we increment the state version.
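
A minimal sketch of that backwards-compatible read, assuming the appended-size layout above (hypothetical names, not Zebra's actual database code):

```rust
/// Hypothetical sketch: parse the block size out of a `hash_by_height` value.
/// Legacy values are 32 bytes (hash only) and return `None`; new values carry
/// 4 extra bytes holding the serialized block size.
fn block_size_from_value(value: &[u8]) -> Option<u32> {
    let size_bytes: [u8; 4] = value.get(32..36)?.try_into().ok()?;
    Some(u32::from_be_bytes(size_bytes))
}
```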

Testing

We'll need tests for reading block sizes from the database.
Writing block sizes is covered by our existing sync and cached state tests.

If we use the backwards-compatible design, we'll also need specific tests for legacy state data (without the size) and new state data (with the size).

Our existing cached state tests will exercise both of these modes, but coverage isn't guaranteed: which blocks have sizes depends on how other tests and PRs use the cached state.
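
To guarantee coverage of both layouts, we could also add small unit tests like these (a sketch against the hypothetical helpers above, not our actual test harness):

```rust
#[cfg(test)]
mod tests {
    use super::*;

    /// New-format values should round-trip the serialized block size.
    #[test]
    fn new_value_returns_block_size() {
        let value = hash_by_height_value([0u8; 32], 1_500_000);
        assert_eq!(block_size_from_value(&value), Some(1_500_000));
    }

    /// Legacy values (hash only) should read back as `None`.
    #[test]
    fn legacy_value_returns_none() {
        let legacy_value = [0u8; 32];
        assert_eq!(block_size_from_value(&legacy_value), None);
    }
}
```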

Related Work

This is a performance fix for PR #5894; we're using a workaround for now, and its performance is acceptable for zebra-checkpoints.

teor2345 added the C-enhancement, S-needs-triage, P-Low ❄️, I-slow, and A-rpc labels on Jan 4, 2023
mpguerra (Contributor) commented
teor2345 (Contributor, Author) commented

@mpguerra I think this might be optional, since we solved #5894 another way. And the performance is acceptable.

teor2345 added the A-state label on Jan 30, 2023
mpguerra removed the S-needs-triage label on Mar 16, 2023
mpguerra closed this as not planned on Oct 14, 2024