Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

db/store full (forked) blocks cleanup #2585

Closed
antiochp opened this issue Feb 16, 2019 · 10 comments

Comments

@antiochp
Copy link
Member

commented Feb 16, 2019

We compact/prune the blocks db periodically to remove historical full blocks from the db older than 1 week (unless running in archive mode).

But - we do this based on the current chain by traversing back from the head of the chain.

So if we see any short lived forks or reorgs we will never actually clean these forked blocks up and they will remain in the db indefinitely.

i.e. if we have the following blocks in the chain -

B1 <- B2 <- B3 etc.

We will never remove any forked blocks, say B2' for example.

@192-sean

This comment has been minimized.

Copy link

commented Mar 11, 2019

Where would I start if I were to tackle this.

@antiochp

This comment has been minimized.

Copy link
Member Author

commented Mar 11, 2019

Right now we traverse back through historical blocks and then go to the db for each one to be removed. But I think a better approach may be start at the db and simply iterate over the blocks in the db, removing any that are candidates for removal.

This might be a good introduction to the lower level db access stuff, so I'd start looking around our lmdb code to see how we might use an iterator over full blocks. We do something similar in the peers db (via an iterator).

@daogangtang

This comment has been minimized.

Copy link
Contributor

commented May 5, 2019

I would like to process this.

daogangtang added a commit to daogangtang/grin that referenced this issue May 6, 2019

:fix: try to fix issue mimblewimble#2585.
Signed-off-by: Mike Tang <daogangtang@gmail.com>
@daogangtang

This comment has been minimized.

Copy link
Contributor

commented May 7, 2019

Hi, @antiochp I have the following questions:

If we clean old blocks by iterating db directly, how we check which block is at a short-lived fork? or even a fork?

And, should we remove all fork blocks from db?

And I proposed a request on #2807 , does it walk the right way?

Thank you.

@hayeah

This comment has been minimized.

Copy link

commented May 7, 2019

My understanding is that instead of using batch.get_previous_header(&current) to iterate through the blocks, you should access the underlying store of the Batch type (which is a LMDB instance)

/// An atomic batch in which all changes can be committed all at once or
/// discarded on error.
pub struct Batch<'a> {
	db: store::Batch<'a>,
}

Create a LMDB iterator to access ALL the blocks, regardless of whether the block is in the chain or not.

Then we need to have an efficient way to check whether a particular block exists on the chain (if not, remove it).

To check whether a block is on chain, one naive way is to iterate through the chain (using get_previous_header), and save block hashes in a map. The CUT_THROUGH_HORIZON is ~10k blocks, so we'll have to keep 10k entries in memory for the compaction process.

daogangtang added a commit to daogangtang/grin that referenced this issue May 7, 2019

fix: try to fix bug mimblewimble#2585.
Signed-off-by: Mike Tang <daogangtang@gmail.com>
@antiochp

This comment has been minimized.

Copy link
Member Author

commented May 7, 2019

Then we need to have an efficient way to check whether a particular block exists on the chain (if not, remove it).

chain.is_on_current_chain()

This checks the header against the header at that height based on our current chain - if they match then we're good (header at height h is the one we think it is).

@daogangtang

This comment has been minimized.

Copy link
Contributor

commented May 8, 2019

hi, #2810 CI prompted:

running 5 tests
test hard_forks ... ok
test next_target_adjustment ... ok
test test_secondary_pow_ratio ... ok
test adjustment_scenarios ... test adjustment_scenarios has been running for over 60 seconds
test test_secondary_pow_scale ... test test_secondary_pow_scale has been running for over 60 seconds

No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received

What is the meaning of it? Because in my local device, there are no these timeout error. How should I resolve this consensus timeout?

@chisa0a

This comment has been minimized.

Copy link

commented May 9, 2019

@daogangtang had a similar issue on a resource constrained device. If you have the ability to increase the memory available on the CI box, that might help.

Another option is to refactor the tests to consume less resources, which is being partly addressed by #2813 (IIUC).

daogangtang added a commit to daogangtang/grin that referenced this issue May 9, 2019

fix: try to fix issue mimblewimble#2585 by adding block cleanup from …
…db directly.

Signed-off-by: Mike Tang <daogangtang@gmail.com>
@daogangtang

This comment has been minimized.

Copy link
Contributor

commented May 9, 2019

@chisa0a Yes, thank you, I repost a PR #2815 , and it pass now.

antiochp added a commit that referenced this issue May 13, 2019

fix: try to fix issue #2585 by adding block cleanup from db directly. (
…#2815)

* fix: try to fix issue #2585 by adding block cleanup from db directly.

Signed-off-by: Mike Tang <daogangtang@gmail.com>

* use another effective algorithm to do old block and short-lived block cleaup.

Signed-off-by: Mike Tang <daogangtang@gmail.com>

* 1. rename iter_lived_blocks to blocks_iter;
2. comments and iterator calling optimiztions.

Signed-off-by: Mike Tang <daogangtang@gmail.com>

* Fix locking bug when calling is_on_current_chain() in batch.blocks_iter just by removing it.
Because "we want to delete block older (i.e. lower height) than tail.height".

Signed-off-by: Mike Tang <daogangtang@gmail.com>
@antiochp

This comment has been minimized.

Copy link
Member Author

commented May 13, 2019

Resolved in #2815. 🚀

@antiochp antiochp closed this May 13, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.