blockstorage: do not flush block to disk if it is already there #27039

Merged
merged 1 commit on Mar 20, 2024
24 changes: 12 additions & 12 deletions src/node/blockstorage.cpp
@@ -906,19 +906,19 @@ bool BlockManager::FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigne
     if (!fKnown) {
         LogPrint(BCLog::BLOCKSTORAGE, "Leaving block file %i: %s (onto %i) (height %i)\n",
             last_blockfile, m_blockfile_info[last_blockfile].ToString(), nFile, nHeight);
-    }
 
-    // Do not propagate the return code. The flush concerns a previous block
-    // and undo file that has already been written to. If a flush fails
-    // here, and we crash, there is no expected additional block data
-    // inconsistency arising from the flush failure here. However, the undo
-    // data may be inconsistent after a crash if the flush is called during
-    // a reindex. A flush error might also leave some of the data files
-    // untrimmed.
-    if (!FlushBlockFile(last_blockfile, !fKnown, finalize_undo)) {
-        LogPrintLevel(BCLog::BLOCKSTORAGE, BCLog::Level::Warning,
-                      "Failed to flush previous block file %05i (finalize=%i, finalize_undo=%i) before opening new block file %05i\n",
-                      last_blockfile, !fKnown, finalize_undo, nFile);
+        // Do not propagate the return code. The flush concerns a previous block
+        // and undo file that has already been written to. If a flush fails
+        // here, and we crash, there is no expected additional block data
+        // inconsistency arising from the flush failure here. However, the undo
+        // data may be inconsistent after a crash if the flush is called during
+        // a reindex. A flush error might also leave some of the data files
+        // untrimmed.
+        if (!FlushBlockFile(last_blockfile, !fKnown, finalize_undo)) {
+            LogPrintLevel(BCLog::BLOCKSTORAGE, BCLog::Level::Warning,
+                          "Failed to flush previous block file %05i (finalize=%i, finalize_undo=%i) before opening new block file %05i\n",
+                          last_blockfile, !fKnown, finalize_undo, nFile);
+        }
     }
     // No undo data yet in the new file, so reset our undo-height tracking.
     m_blockfile_cursors[chain_type] = BlockfileCursor{nFile};
9 changes: 5 additions & 4 deletions test/functional/feature_reindex_readonly.py
@@ -24,6 +24,7 @@ def reindex_readonly(self):
         opreturn = "6a"
         nulldata = fastprune_blockfile_size * "ff"
         self.generateblock(self.nodes[0], output=f"raw({opreturn}{nulldata})", transactions=[])
+        block_count = self.nodes[0].getblockcount()
         self.stop_node(0)
 
         assert (self.nodes[0].chain_path / "blocks" / "blk00000.dat").exists()
@@ -73,10 +74,10 @@ def reindex_readonly(self):
             pass
 
         if undo_immutable:
-            self.log.info("Attempt to restart and reindex the node with the unwritable block file")
-            with self.nodes[0].assert_debug_log(expected_msgs=['FlushStateToDisk', 'failed to open file'], unexpected_msgs=[]):
-                self.nodes[0].assert_start_raises_init_error(extra_args=['-reindex', '-fastprune'],
-                    expected_msg="Error: A fatal internal error occurred, see debug.log for details")
+            self.log.debug("Attempt to restart and reindex the node with the unwritable block file")
+            with self.nodes[0].wait_for_debug_log([b"Reindexing finished"]):
Member:
Any reason to change this to the aggressive busy-loop helper? The previous one should work just fine with the right timeout. Also, I think wait_for_debug_log should be renamed to busy_wait_for_debug_log.

Member Author:
I'm not sure I understand. The previous code expected a crash, so the test would know exactly when to check for the expected log message. In the new code, bitcoind just runs fine, so we have to poll the log for the success message.

Member:
wait_for_debug_log (the variant of the function that accepts raw bytes) does not have a sleep, so it will try to max out IO at 100%, no?
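
For illustration only (neither helper below exists in the test framework; the names and the plain-text log file are assumptions): a minimal sketch of the difference being discussed. The first variant re-reads the log in a tight loop and keeps CPU and disk busy until the line shows up; the second sleeps between polls.

import time

def busy_wait_for_line(log_path, needle, timeout=60):
    # Re-read the whole log as fast as possible until the line appears.
    # No sleep between iterations, so CPU/IO usage stays high the entire time.
    deadline = time.time() + timeout
    while time.time() < deadline:
        with open(log_path, "rb") as f:
            if needle in f.read():
                return
    raise AssertionError(f"{needle!r} not found within {timeout}s")

def sleepy_wait_for_line(log_path, needle, timeout=60, poll_interval=0.5):
    # Same check, but sleeping between polls keeps resource usage negligible.
    deadline = time.time() + timeout
    while time.time() < deadline:
        with open(log_path, "rb") as f:
            if needle in f.read():
                return
        time.sleep(poll_interval)
    raise AssertionError(f"{needle!r} not found within {timeout}s")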

Member Author:
OK, I see now, thanks. So I guess... start_node calls wait_for_rpc_connection. Is RPC available before reindexing is finished? If we can rely on that to prevent race conditions, then this could just be assert_debug_log.

Member:
> If we can rely on that to prevent race conditions then this could just be assert_debug_log.

assert_debug_log also syncs (it has a timeout argument), so no race should happen. My comment is only about the IO usage: assert_debug_log does not use a busy wait but a sleepy wait; otherwise they are exactly identical.

However, if assert_debug_log is used to sync, my personal preference is to provide the timeout argument, so that the code is self-documenting.

I think in this context my feedback is mostly of a stylistic nature, but I could imagine other scenarios where a more expensive reindex, in combination with the busy wait, might lead to test timeouts on slow storage.
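
As a sketch only, assuming assert_debug_log keeps accepting string messages and a timeout argument (both taken from the discussion above, not verified here), the suggestion would look roughly like this in the test; the 120-second value is an arbitrary placeholder:

# Hypothetical variant of the restart below, syncing on the debug log with an
# explicit, generous timeout instead of the busy-wait helper.
with self.nodes[0].assert_debug_log(expected_msgs=["Reindexing finished"], timeout=120):
    self.start_node(0, extra_args=['-reindex', '-fastprune'])
assert block_count == self.nodes[0].getblockcount()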

Member Author:
Got it, thanks. Worth opening a new PR?

Member:
If you want, yes, I am happy to review. Though, I won't create a pull myself.

+                self.start_node(0, extra_args=['-reindex', '-fastprune'])
+            assert block_count == self.nodes[0].getblockcount()
             undo_immutable()
 
         filename.chmod(0o777)