Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow DB reopen with reduced options.num_levels #2740

Closed
wants to merge 5 commits into from

Conversation

yiwu-arbug
Copy link
Contributor

Summary:
Allow user to reduce number of levels in LSM by issue a full CompactRange() and put the result in a lower level, and then reopen DB with reduced options.num_levels. Previous this will fail on reopen on when recovery replaying the previous MANIFEST and found a historical file was on a higher level than the new options.num_levels. The workaround was after CompactRange(), reopen the DB with old num_levels, which will create a new MANIFEST, and then reopen the DB again with new num_levels.

This patch relax the check of levels during recovery. It allows DB to open if there was a historical file on level > options.num_levels, but was also deleted.

Test Plan:
See the new test. Also make sure existing tests pass.

@facebook-github-bot
Copy link
Contributor

@yiwu-arbug has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

return false;
}
// Don't need to check deleted files.
level_iter = levels_.erase(level_iter);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a big fan of deleting entries from a data structure while iterating it, no matter it is supported or not, because it creates a portable issue if someone tries to swap a data structure. I suggest we do it in two paths, or not removing these empty levels at all.

@yiwu-arbug
Copy link
Contributor Author

@siying tested with ldb manifest_dump --path=<PATH> > /dev/null with a MANIFEST of size 27M. Each run 3 times:
master: 0.408s 0.409s 0.449s
w/ the patch: 0.451s 0.457s 0.441s
Does it address your concern or do we need to improve the map?

@facebook-github-bot
Copy link
Contributor

@yiwu-arbug updated the pull request - view changes - changes since last import

@siying
Copy link
Contributor

siying commented Aug 17, 2017

Is 27MB the largest you can find? I think there must be much larger manifest files.
Talking about this regression, 10% slower? Is that noise? If it is really 10% slower, it's something we should try to avoid but I feel it hard to believe.

@facebook-github-bot
Copy link
Contributor

@yiwu-arbug updated the pull request - view changes - changes since last import

@facebook-github-bot
Copy link
Contributor

@yiwu-arbug has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@yiwu-arbug
Copy link
Contributor Author

Updated the code to use a separate map for invalid levels, to avoid regression. Tested with the same ldb command to dump a 39MB manifest (which is the largest among several prod servers I checked).
master: 0.711s 0.679s 0.623s
w/ the patch: 0.694s 0.691s 0.658s

@yiwu-arbug
Copy link
Contributor Author

ping

@yiwu-arbug
Copy link
Contributor Author

ping @siying

invalid_levels_[level].insert(number);
} else {
has_invalid_levels_ = true;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of complicating the logic to have invalid_levels_ and has_invalid_levels_ ? Can we have the verification logic all in the end?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invalid_levels_ is split from levels_ to avoid regression in case there isn't files above num_levels_. has_invalid_levels_ can be avoid, but it is handy to remember whether there are erase of non-existing files or addition of existing file, on invalid levels. Let me add more comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

@facebook-github-bot
Copy link
Contributor

@yiwu-arbug updated the pull request - view changes - changes since last import

@facebook-github-bot
Copy link
Contributor

@yiwu-arbug has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

invalid_levels_[level].insert(number);
} else {
has_invalid_levels_ = true;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

@facebook-github-bot
Copy link
Contributor

@yiwu-arbug updated the pull request - view changes - changes since last import

@facebook-github-bot
Copy link
Contributor

@yiwu-arbug has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@yiwu-arbug updated the pull request - view changes - changes since last import

@facebook-github-bot
Copy link
Contributor

@yiwu-arbug has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants