Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISSUE #1527: Make ExplicitLAC persistent #1532

Closed
wants to merge 4 commits into from

Conversation

reddycharan
Copy link
Contributor

Descriptions of the changes in this PR:

For persisting explicitLAC, we can follow the
same approach as followed in persisting
fencing information / stateBits (d69986c)
in FileInfo file and special marker entry
in Journal.

Motivation

ExplicitLAC is kept in the memory and it can be lost in bookie reboot.
Though it is an extreme corner case scenario, it can break one of the BK guarantees.
" If you read once you can always read it".
If all the bookies of the Write stripe were rebooted, it can loose its explicitLAC value and the client
which was able to read the entry before the reboot, can't read it anymore.

Master Issue: #1527

For persisting explicitLAC, we can follow the
same approach as followed in persisting
fencing information / stateBits (d69986c)
in FileInfo file and special marker entry
in Journal.
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach looks good to me.
We must add extensive testing of all moving parts.
We already have tests around calls from bookie to journal and around journal internals.

I feel we should have a minimal test which combines ExplicitLAC with every ledgerstorage, in particular DbLedgerManager which is not in use as SF.

We are treating Explicit LAC as a regular addentry, we have to implement/change stats in journal, otherwise they will appear as writes.

A thought: ExplicitLAC touches LedgerStorage, do we have to handle 'fencing', an option is to check flag and report it to the writer, so that the writer will fail?
I have thought about this and I think this is not needed because the writer will eventually fail at next write, and Explicit LAC as no way to report a failure to the writer.

@@ -1182,13 +1201,24 @@ public void recoveryAddEntry(ByteBuf entry, WriteCallback cb, Object ctx, byte[]
}
}

public void setExplicitLac(ByteBuf entry, Object ctx, byte[] masterKey)
static ByteBuf createExplicitLACEntry(long ledgerId, ByteBuf explicitLac) {
ByteBuf bb = Unpooled.buffer(8 + 8 + 4 + explicitLac.capacity());
Copy link
Contributor

@eolivelli eolivelli Jun 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use a pooled bytebuf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could change that place as well.
Journal will call "release" on the ByteBuf after writing the entry to the file

@reddycharan
Copy link
Contributor Author

reddycharan commented Jul 2, 2018

@eolivelli Thanks for going through WIProgress CR. I would like to get feedback on the approach. Yes I need to add tests and I'm working on it.

Regarding fenced/setexplicitlac concern you raised, if I understood you correctly you are suggesting that we should have similar check like https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Bookie.java#L1235

But I don't think that this check is needed for set explicitLac call, because ExplicitLacFlush is periodic delayed operation - https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ExplicitLacFlushPolicy.java#L82 so by the time write handle pushes set explicitLac request (or set explicitLac requests might be in flight), ledger might have been recover opened (fenced), but it is legally ok for bookie to receive setExplicitLac call and set explicitLac accordingly.

@eolivelli
Copy link
Contributor

@reddycharan I agree there is no need to check fence flag.

Let's move forward with current approch

- Functional testcases for validating persisted
explicitLAC
- minor fixes
@eolivelli
Copy link
Contributor

What about adding some test in this file, to narrow the scope of the test? This suite exercises and tests directly the interface between Bookie and Journal

https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/test/java/org/apache/bookkeeper/bookie/BookieWriteToJournalTest.java

@eolivelli
Copy link
Contributor

Could you also consider adding some test like these?
https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/test/java/org/apache/bookkeeper/proto/WriteEntryProcessorTest.java

I mean only unit tests around the 'Processor' for setExplicitLAC on bookie side.

- added testcase for compatibility check with previous
and new header versions of FileInfo.
@reddycharan reddycharan changed the title (WIP) ISSUE #1527: Make ExplicitLAC persistent ISSUE #1527: Make ExplicitLAC persistent Jul 10, 2018
@reddycharan
Copy link
Contributor Author

Hey @eolivelli , considering the importance/sensitivity of this fix, I added tests at multiple levels -

  1. ExplicitLacTest - end-to-end test from BookKeeper client to Bookie level, verified at Bookkeeper client level
  2. LedgerStorageTest - end-to-end scenario from BookKeeper client to Bookie level, but verified at Bookie API level (reading explicit lac using Bookie) to validate explicitlac entry added to journal and also verified by reading at FileInfo level to validate explicitlac persisted to FileInfo header.
  3. IndexPersistenceMgrTest - test scenario at IndexPersistenceMgr level, by doing read and write at IndexPersistenceMgr level, also old and new header versions of FileInfo are pre-created for compatibility (backward) validation.

So I guess, I added sufficient testing for this fix.

@reddycharan
Copy link
Contributor Author

reddycharan commented Jul 10, 2018

@jvrao @sijie PR is ready for formal review.

@sijie
Copy link
Member

sijie commented Jul 11, 2018

rerun bookkeeper-server bookie tests
rerun pr validation

} else {
throw new IOException("Invalid journal. Contains explicitLAC " + " but layout version ("
+ journalVersion + ") is too old to hold this");
}
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens in the case of rollback? The software won't understand the special entryId and falls into this section right? Do we handle it right in handle.addentry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this journalVersion is the version of journalfile but not "JournalChannel.CURRENT_JOURNAL_FORMAT_VERSION" (current code version), we would get to this 'else' case if we found a journal older than V5 but it is having entry where entryId METAENTRY_ID_LEDGER_EXPLICITLAC, which is not possible and expected in any case. Hence we are throwing IOException here just like https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Bookie.java#L783

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean any special entryIds are going to be treated as regular entryIds in this case. So my request is to create a mask to identify SPECIAL entries and filter them out. If there is anything special that the code doesn't know, flag an error and continue. I know this doesn't solve the previous release but future code changes should be good.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on having a mask as what JV suggested.

Copy link
Contributor Author

@reddycharan reddycharan Jul 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jvrao @sijie valid point. for masking purpose, I'm going to assume all negative entryids are SPECIAL entries and if the current code / binary doesn't recognize a special entry then it will log a warn line and proceed.

@@ -206,9 +218,11 @@ public synchronized void readHeader() throws IOException {
throw new IOException("Missing ledger signature while reading header for " + lf);
}
int version = bb.getInt();
if (version != HEADER_VERSION) {
if (version > CURRENT_HEADER_VERSION) {
throw new IOException("Incompatible ledger version " + version + " while reading header for " + lf);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this prevents software rollback period. isn't it? Should we consider coming up ? what are the downsides of that? yeah we may not read the explicitLAC other than that, is there anything else? I am worried if there are any situations where we make the cluster not usable.

Copy link
Contributor

@jvrao jvrao Jul 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code change is fine. I spoke to @sijie and @eolivelli this morning. We needed a bigger gate to identify ondisk changes. I will open an issue for that.

@reddycharan
Copy link
Contributor Author

@jvrao @sijie actually i missed considering Journal version bump for this change. It seems for other special entries in journal, JournalVersion is bumped

c65d6ad
d69986c

can you confirm if Journal version bump is required here.

@reddycharan
Copy link
Contributor Author

@sijie and @eolivelli can you please complete the review. Its been waiting for quite sometime.

@eolivelli
Copy link
Contributor

For me is quite okay but I want to merge it only after cutting 4.8 branch.
This feature is not enough mature to be delivered with 4.8

@@ -121,6 +123,7 @@
static final long METAENTRY_ID_LEDGER_KEY = -0x1000;
static final long METAENTRY_ID_FENCE_KEY = -0x2000;
public static final long METAENTRY_ID_FORCE_LEDGER = -0x4000;
static final long METAENTRY_ID_LEDGER_EXPLICITLAC = -0x8000;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to bump the journal version

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, bumping journal version

@@ -784,6 +787,23 @@ public void process(int journalVersion, long offset, ByteBuffer recBuff) throws
+ " but layout version (" + journalVersion
+ ") is too old to hold this");
}
} else if (entryId == METAENTRY_ID_LEDGER_EXPLICITLAC) {
if (journalVersion >= JournalChannel.V5) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be JournalChannel.V6

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, bumped journal version

} else {
throw new IOException("Invalid journal. Contains explicitLAC " + " but layout version ("
+ journalVersion + ") is too old to hold this");
}
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on having a mask as what JV suggested.

bb.putInt(masterKey.length);
bb.put(masterKey);
bb.putInt(stateBits);
if (this.headerVersion >= V1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be better to make fileInfo version configurable in ServerConfiguration, as how we handled in Journal.

so people can upgrade binary but still sets the write version to be v0. so the bookie can run with new binary but writing old version fileinfo. this allows rollback binary.

after people verify the binary work, then can change the write version to v1. at this point, it can still rollback to configuration to v0 and using the new binary. in this approach, it gives some options for rollback.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, adding config for this

@sijie
Copy link
Member

sijie commented Jul 17, 2018

@reddycharan I made a few comments regarding journal version and fileinfo version.

@eolivelli if those versions are configurable, it is okay to include this change as part of 4.8. and it is actually better, because you can have code ready in 4.8 to understand the new version, but allow the new version still functioning in old version. this would provide a smooth upgrade/rollback process for users.

@sijie
Copy link
Member

sijie commented Jul 17, 2018

also please add an entry "4.7.x to 4.8.0 upgrade" in this "Upgrade" guide and document that there are new journal version and file info version introduced.

http://bookkeeper.apache.org/docs/latest/admin/upgrade/#46x-to-470-upgrade

- bumped version of journal
- introduced fileInfoFormatVersionToWrite config
- testcases to validate combination of fileInfoFormatVersionToWrite and journalFormatVersionToWrite
@reddycharan
Copy link
Contributor Author

@sijie @jvrao addressed review comments. Can you please check the new iteration I sent.

regarding updating http://bookkeeper.apache.org/docs/latest/admin/upgrade/#46x-to-470-upgrade, I think it would be better and convenient to do it in different PR. Anyhow nothing is written so far for "4.7.x to 4.8.0 upgrade".

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me.
I left a nit about a test but nothing really important.

Great work @reddycharan and @sijie !

For me it is okay to have in merged now in 4.8.
Let's wait for green light from CI

* to persist explicitLac, journalFormatVersionToWrite should be atleast
* V6 and fileInfoFormatVersionToWrite should be atleast V1
*/
baseConf.setJournalFormatVersionToWrite(6);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reddycharan maybe it is better to use defined constants for these values (I mean '6' here and '1' below)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I thought about this. But "JournalChannel" is not public class (which contains defined constants) and similarly "FileInfo" is not public either.

I guess for the same reason actual values are used in other places aswell - https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/test/java/org/apache/bookkeeper/conf/TestBKConfiguration.java#L50

https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/conf/ServerConfiguration.java#L637

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, ok!
Thank you for your explanation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, ok!
Thank you for your explanation

@eolivelli
Copy link
Contributor

retest this please

@eolivelli
Copy link
Contributor

@sijie @reddycharan

It seems to me that the only remaining concern from Sijie is to upgrade the guide.
If you Charan create a new task and have time to finish the work I would be happy to merge this change, so that we can keep it on the branch for a few days (and run downstream projects tests) before cutting the release branch

@reddycharan
Copy link
Contributor Author

@eolivelli created issue for upgrade doc - #1565. Will follow up with that once this is closed.

@eolivelli
Copy link
Contributor

retest this please

@eolivelli
Copy link
Contributor

rerun bookie tests
rerun integration tests

@eolivelli
Copy link
Contributor

rerun bookkeeper-server bookie tests

@sijie
Copy link
Member

sijie commented Jul 23, 2018

@jvrao since you left a few comments here before, do you want to review this again?

Copy link
Contributor

@jvrao jvrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@reddycharan
Copy link
Contributor Author

Thanks all for your feedback on this work item. Gladly, tests have passed finally.

@sijie sijie added this to the 4.8.0 milestone Jul 23, 2018
@sijie sijie closed this in 21f125d Jul 23, 2018
reddycharan added a commit to reddycharan/bookkeeper that referenced this pull request Oct 17, 2018
Descriptions of the changes in this PR:

For persisting explicitLAC, we can follow the
same approach as followed in persisting
fencing information / stateBits (d69986c)
in FileInfo file and special marker entry
in Journal.

ExplicitLAC is kept in the memory and it can be lost in bookie reboot.
Though it is an extreme corner case scenario, it can break one of the BK guarantees.
" If you read once you can always read it".
If all the bookies of the Write stripe were rebooted, it can loose its explicitLAC value and the client
which was able to read the entry before the reboot, can't read it anymore.

Master Issue: apache#1527

Author: cguttapalem <cguttapalem@salesforce.com>

Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Sijie Guo <sijie@apache.org>, Venkateswararao Jujjuri (JV) <None>

This closes apache#1532 from reddycharan/storeexplicitlac, closes apache#1527
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants