Fix signet indexing #63
| pub file_no: u32, | ||
| pub height_first: u32, | ||
| pub height_last: u32, | ||
| pub data_len: Option<u32>, |
It's optional because not every parser path knows the logical blk file size. It's a fallback for when we get `None`: `None` means "fall back to the full file length".
Do you mean not every blk file has a logical size? It seems like it's always set to `Some()` in the method that creates the hints.
Yes, those are always `Some(info.size)`. The `Option` is only for the parser fallback path when no hints are provided: there we still allow `BlkFileHint::default()` / empty hints and do not know the logical size up front.
Check parser.rs line 47.
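To make the fallback discussed above concrete, here is a minimal sketch. The struct mirrors the fields shown in the diff, but the `effective_len` helper and the clamping to the on-disk length are assumptions for illustration, not code from this PR.

```rust
/// Hypothetical mirror of the hint struct under review.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct BlkFileHint {
    pub file_no: u32,
    pub height_first: u32,
    pub height_last: u32,
    /// Logical size recorded by Bitcoin Core; `None` when no hint is available.
    pub data_len: Option<u32>,
}

/// Number of bytes the parser should treat as block data.
/// (Hypothetical helper; the name is not taken from the PR.)
pub fn effective_len(hint: &BlkFileHint, file_len: u64) -> u64 {
    match hint.data_len {
        // Assumption: never trust a hint larger than the file on disk.
        Some(len) => u64::from(len).min(file_len),
        // No hint: fall back to the full file length.
        None => file_len,
    }
}
```

With a hint of `Some(1_000)` and a 16 MiB preallocated file, `effective_len` returns 1000; with `BlkFileHint::default()` it returns the full file length.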
| pub paths: dense::IndexPaths, | ||
| pub spk_db: SledScriptPubkeyDb, | ||
| /// Blk file layout hints: `(file_no, height_first, height_last)`, sorted by `file_no`. | ||
| /// Blk file layout hints derived from Bitcoin Core's block index. |
This seems like a regression. I would revert and just reference the new struct BlkFileHints
| pub file_no: u32, | ||
| pub height_first: u32, | ||
| pub height_last: u32, | ||
| pub data_len: Option<u32>, |
A rustdoc for this would be useful, especially since later in this PR it's reassigned to a variable called `used_len`?
bc1cindy
left a comment
good catch!
could we add a test that builds a synthetic blk file with garbage past data_len and asserts the parse stops cleanly?
that would pin down the signet regression against future changes
Yes to adding a test for this. It can live with the other blk file test fixtures. I am not sure how to actually generate this fixture.
maybe using one that already exists and appending zeros?
I don't think that will work. You would also need to update the leveldb index.
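One possible way around the leveldb dependency is to test at the parser level against an in-memory buffer and pass `data_len` directly. The sketch below is only an illustration: `MAGIC` is a placeholder to be replaced with the network's real value, `count_records` is a stand-in for the project's parser, and XOR obfuscation is left out. It would not exercise the path that reads sizes from the block index.

```rust
/// Placeholder network magic; substitute the real value for the network under test.
const MAGIC: [u8; 4] = [0x0a, 0x03, 0xcf, 0x40];

/// Append one record: magic, little-endian payload length, payload.
fn push_record(buf: &mut Vec<u8>, payload: &[u8]) {
    buf.extend_from_slice(&MAGIC);
    buf.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    buf.extend_from_slice(payload);
}

/// Stand-in parser: counts records within the first `data_len` bytes,
/// or within the whole buffer when `data_len` is `None`.
fn count_records(bytes: &[u8], data_len: Option<u32>) -> Result<usize, String> {
    let end = data_len.map_or(bytes.len(), |n| (n as usize).min(bytes.len()));
    let data = &bytes[..end];
    let mut pos = 0;
    let mut count = 0;
    while pos < data.len() {
        let header = data
            .get(pos..pos + 8)
            .ok_or("unexpected EOF in record header")?;
        if !header.starts_with(&MAGIC) {
            return Err(format!("bad magic at offset {}", pos));
        }
        let len = u32::from_le_bytes([header[4], header[5], header[6], header[7]]) as usize;
        let body_end = pos + 8 + len;
        if body_end > data.len() {
            return Err("unexpected EOF in record body".to_string());
        }
        pos = body_end;
        count += 1;
    }
    Ok(count)
}

/// Two records followed by a garbage tail that imitates the preallocated
/// region; returns the buffer and its logical length.
fn synthetic_blk() -> (Vec<u8>, u32) {
    let mut buf = Vec::new();
    push_record(&mut buf, b"block-one");
    push_record(&mut buf, b"block-two");
    let data_len = buf.len() as u32;
    buf.extend(std::iter::repeat(0xA5u8).take(64));
    (buf, data_len)
}
```

A test would then assert that parsing with `Some(data_len)` yields exactly two records, while parsing with `None` fails on the tail.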
Use Bitcoin Core's recorded blk file size instead of the raw filesystem length when parsing block files. This prevents the parser from reading into the preallocated tail of the active blk file, which can appear as garbage after XOR decoding and trigger unexpected EOF errors.
closes: #62
This PR fixes blk file parsing by using Bitcoin Core’s recorded logical file size (BlockFileInfo.size) instead of the raw filesystem length. That prevents the parser from reading into the preallocated tail of the active blk*.dat file, which can look like garbage after XOR decoding and cause unexpected EOF errors.
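As a rough illustration of bounding the read by the logical size, the sketch below uses the standard library's `Read::take`. The `read_logical` function is hypothetical and is not the PR's implementation; it only shows the idea of ignoring bytes past the recorded size.

```rust
use std::io::{self, Read};

/// Read at most `logical_size` bytes from `reader`, ignoring any
/// preallocated tail. With `None`, read everything (the fallback case).
/// (Hypothetical helper for illustration.)
pub fn read_logical<R: Read>(reader: R, logical_size: Option<u32>) -> io::Result<Vec<u8>> {
    let limit = logical_size.map_or(u64::MAX, u64::from);
    let mut out = Vec::new();
    // `take` caps the number of bytes the wrapped reader will yield.
    reader.take(limit).read_to_end(&mut out)?;
    Ok(out)
}
```

For an 8-byte input with a logical size of 4, only the first four bytes are returned; the remaining bytes are never handed to the parser.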
The reasoning for this approach is here: #62 (comment)
Message for reviewers:
After this fix, indexing works, with the results below. If a better approach is available, I would appreciate suggestions.