Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Utility to rebuild interleaved storage index files #1642

Closed
wants to merge 3 commits into from

Conversation

ivankelly
Copy link
Contributor

We came across a case where the a ledger had been deleted from
zookeeper accidently. It was possible to recover the ledger metadata
from the zookeeper journal and old snapshots, but the bookies had
deleted the indices by this time. However, even if the index is
deleted, the data still exists in the entrylog.

This utility scans the entrylog to rebuild the index, thereby making
the ledger available again.

We came across a case where the a ledger had been deleted from
zookeeper accidently. It was possible to recover the ledger metadata
from the zookeeper journal and old snapshots, but the bookies had
deleted the indices by this time. However, even if the index is
deleted, the data still exists in the entrylog.

This utility scans the entrylog to rebuild the index, thereby making
the ledger available again.
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.
We can cherry pick to 4.8 if you want

if (!ledgerCache.ledgerExists(ledgerId)) {
// set dummy master key and set fenced.
// master key is only used on write operations
ledgerCache.setMasterKey(ledgerId, new byte[0]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm. this would cause a problem, if we recover a ledger, but the writer comes in later and attempts to write entries, all the writes will be failing with auth failure. and the reader can't actually read them, right? do we have a test case to cover this negative case?

we might need to fix it or just add messages in the help messages of the tool, so people is aware of this situation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yes. I had been thinking of adding it as required a passed in option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add messages in the help messages of the tool

I think we need to pick up your effort to revamp the the tools, and 1) move them all into the same framework, 2) add tests & 3) add docs for each one.

Right now, bookie shell is a mess. There's even multiple tools that do exactly the same thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a little bit more hidden complexity here that I'm not sure how to resolve.

If a idx file has been deleted, but ledger metadata exists, when a client tries to open it and run recovery, it will see that it cannot read any entries, and close the ledger as empty.

I know DLog has a force option for this, but I think it would be better to provide another cli tool to allow users to mark the metadata from Closed -> In recovery, and then run the recovery again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we don't actually check the password on fencing, we just blindly accept what the client passes in :/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to pick up your effort to revamp the the tools, and 1) move them all into the same framework, 2) add tests & 3) add docs for each one.

let's pick up the tasks for #1000

If a idx file has been deleted, but ledger metadata exists, when a client tries to open it and run recovery, it will see that it cannot read any entries, and close the ledger as empty.

yeah. I think this is a rare corner case that we actuallly can do nothing for it.

I know DLog has a force option for this, but I think it would be better to provide another cli tool to allow users to mark the metadata from Closed -> In recovery, and then run the recovery again.

sgtm

It looks like we don't actually check the password on fencing, we just blindly accept what the client passes in :/

not sure we really need password for fencing, since it doesn't really append entries. and I think in general password should be removed eventually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yes, I'll push my other updates for now, and leave out the password stuff, since it has no effect.

@sijie
Copy link
Member

sijie commented Sep 4, 2018

We can cherry pick to 4.8 if you want

I don't think we need to cherry-pick this to 4.8. I think branch-4.8 is already cut and waiting for bug fixes only, so any feature-like issues should go to 4.9.

@ivankelly
Copy link
Contributor Author

rerun bookkeeper-server replication tests
rerun validation tests

@ivankelly
Copy link
Contributor Author

rerun pr validation

@ivankelly
Copy link
Contributor Author

rerun bookkeeper-server replication tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants