Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make ExplicitLAC persistent #1527

Closed
jvrao opened this issue Jun 20, 2018 · 8 comments
Closed

Make ExplicitLAC persistent #1527

jvrao opened this issue Jun 20, 2018 · 8 comments

Comments

@jvrao
Copy link
Contributor

jvrao commented Jun 20, 2018

FEATURE REQUEST

ExplicitLAC is kept in the memory and it can be lost in bookie reboot.
Though it is an extreme corner case scenario, it can break one of the BK guarantees.
" If you read once you can always read it".
If all the bookies of the Write stripe were rebooted, it can loose its explicitLAC value and the client
which was able to read the entry before the reboot, can't read it anymore.

@eolivelli
Copy link
Contributor

This will be great.
I hope you already have some idea to share

@reddycharan
Copy link
Contributor

@jvrao @sijie @eolivelli

For persisting explicitLAC, we can follow the same approach as followed in persisting fencing information / stateBits (d69986c) in FileInfo file and special marker entry in Journal.

We can have a new marker METAENTRY_ID_LEDGER_EXPLICITLAC = -0x8000 and we can log a special entry in journal (just like fence entry) when we get setExplicitLAC call to the bookie. This special entry could be following

    static ByteBuf createExplicitLACEntry(long ledgerId, long entryId) {
        ByteBuf bb = Unpooled.buffer(8 + 8 + 8);
        bb.writeLong(ledgerId);
        bb.writeLong(METAENTRY_ID_LEDGER_EXPLICITLAC);
        bb.writeLong(entryId);
        return bb;
    }

The long explicitLAC value could be stored in the header of the FileInfo after 'stateBits' value, as there is sufficient space remaining in FileInfo allocated header (1024). For explicitLAC, we can follow the same semantics as fence/statebits logic for flushHeader/checkpoint and replay journal .

But there is some small intricacy here regarding 'HEADER_VERSION' of the FileInfo. Since zero is valid ExplicitLac value for a Ledger, for backward compatibility purpose we cannot just consider the read zero value from HEADER after 'statebits' in old FileInfo file, as persisted 'ExplicitLAC'. So we may need to bump up 'HEADER_VERSION' of FileInfo for implementing this feature and handle compatibility scenarios.

@jvrao
Copy link
Contributor Author

jvrao commented Jun 26, 2018

Yeah; we need to bump up the version number, and need to handle the case for backward compatibility.

@eolivelli
Copy link
Contributor

eolivelli commented Jun 26, 2018

Some thoughts (thinking out loud):

  • I think we should unify ExplicitLAC and PiggyBackLAC, for instance we should update this new field on FileInfo both on setExplicitLAC and on addEntry
  • We should change readLastAddConfirmed so that the reader will use this new value, and there will be no need to call readExplicitLAC, which is very confusing for the user
  • If this setExplicitLAC operation is logged to the journal and we wait for fsync, it can be a good to update writer-side LAC in case of DEFERRED_SYNC

@sijie
Copy link
Member

sijie commented Jun 26, 2018

I think we should unify ExplicitLAC and PiggyBackLAC, for instance we should update this new field on FileInfo both on setExplicitLAC and on addEntry

We should change readLastAddConfirmed so that the reader will use this new value, and there will be no need to call readExplicitLAC, which is very confusing for the user

I don't think we should update FileInfo on addEntry. lac is already stored (piggyback) at the entries, we shouldn't couple this with explicit LAC. from end-user perspective, you can merge the view for piggybacked lac and explicit lac and provide one way to read it; but I don't think we should merge how we store it. it makes things complicated.

@eolivelli
Copy link
Contributor

I don't think we should update FileInfo on addEntry.

Maybe we could save resources and do not write LAC on each entry. I don't know the original motivation of storing LAC for each entry. I thought it was only just to store data from network 'as-is' directly to the storage. @sijie you know the full history.
Anyway I am fine with keeping the two values separated at storage level.

To me it is really important that we have only one API for the read to access LAC, so I would like to drop readExplicitLAC

@sijie
Copy link
Member

sijie commented Jun 26, 2018

as I said, from end-user perspective it is good to have one API, however I don't see the need to merge these two different formats together.

@reddycharan
Copy link
Contributor

reddycharan commented Jun 27, 2018

Actually I missed a crucial aspect of explicitLAC in FileInfo. In FileInfo class, explicitLAC value is stored as ByteBuf but not simple long value. This ByteBuf contains Checksum/Digest information (variable length digestcode) as well, which would be needed to verify digest when explicitLAC is read back by the client. So to persist explicitLAC we received, we have to persist complete ByteBuf we received but not just long value.

So the special entry in Journal should be -

    static ByteBuf createExplicitLACEntry(long ledgerId, ByteBuf explicitLac) {
        ByteBuf bb = Unpooled.buffer(8 + 8 + 4 + explicitLac.capacity());
        bb.writeLong(ledgerId);
        bb.writeLong(METAENTRY_ID_LEDGER_EXPLICITLAC);
        bb.writeInt(explicitLac.capacity());
        bb.writeBytes(explicitLac);
        return bb;
    }

In FileInfo file to persist this variable length explicitLac in the header of FileInfo, we should write length of the explicitLac byteBuf first and then the actual explicitLac ByteBuf (just like how masterKey is persisted in the header of FileInfo). This length of explicitLac ByteBuf and actual ByteBuf should go to the Head of the FileInfo after statebits (this and need for changing version of HEADER is as described earlier and remains same).

reddycharan added a commit to reddycharan/bookkeeper that referenced this issue Jun 29, 2018
For persisting explicitLAC, we can follow the
same approach as followed in persisting
fencing information / stateBits (d69986c)
in FileInfo file and special marker entry
in Journal.
reddycharan added a commit to reddycharan/bookkeeper that referenced this issue Jul 6, 2018
- Functional testcases for validating persisted
explicitLAC
reddycharan added a commit to reddycharan/bookkeeper that referenced this issue Jul 6, 2018
- Functional testcases for validating persisted
explicitLAC
- minor fixes
reddycharan added a commit to reddycharan/bookkeeper that referenced this issue Jul 6, 2018
- Functional testcases for validating persisted
explicitLAC
- minor fixes
reddycharan added a commit to reddycharan/bookkeeper that referenced this issue Jul 10, 2018
- added testcase for compatibility check with previous
and new header versions of FileInfo.
reddycharan added a commit to reddycharan/bookkeeper that referenced this issue Jul 19, 2018
- bumped version of journal
- introduced fileInfoFormatVersionToWrite config
- testcases to validate combination of fileInfoFormatVersionToWrite and journalFormatVersionToWrite
@sijie sijie closed this as completed in 21f125d Jul 23, 2018
@eolivelli eolivelli added this to the 4.8.0 milestone Sep 6, 2018
reddycharan added a commit to reddycharan/bookkeeper that referenced this issue Sep 20, 2018
- As part of ISSUE apache#1527, new versions of 'journalFormatVersionToWrite'
and 'fileInfoFormatVersionToWrite' are introduced. Since 4.8 release version
is created, default values of these versions should be bumped.
sijie pushed a commit that referenced this issue Sep 21, 2018

Descriptions of the changes in this PR:


- As part of ISSUE #1527, new versions of 'journalFormatVersionToWrite'
and 'fileInfoFormatVersionToWrite' are introduced. Since 4.8 release version
is created, default values of these versions should be bumped.

Author: cguttapalem <cguttapalem@salesforce.com>

Reviewers: Sijie Guo <sijie@apache.org>, Enrico Olivelli <eolivelli@gmail.com>

This closes #1689 from reddycharan/bumpjournalversion
reddycharan added a commit to reddycharan/bookkeeper that referenced this issue Oct 17, 2018
Descriptions of the changes in this PR:

For persisting explicitLAC, we can follow the
same approach as followed in persisting
fencing information / stateBits (d69986c)
in FileInfo file and special marker entry
in Journal.

ExplicitLAC is kept in the memory and it can be lost in bookie reboot.
Though it is an extreme corner case scenario, it can break one of the BK guarantees.
" If you read once you can always read it".
If all the bookies of the Write stripe were rebooted, it can loose its explicitLAC value and the client
which was able to read the entry before the reboot, can't read it anymore.

Master Issue: apache#1527

Author: cguttapalem <cguttapalem@salesforce.com>

Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Sijie Guo <sijie@apache.org>, Venkateswararao Jujjuri (JV) <None>

This closes apache#1532 from reddycharan/storeexplicitlac, closes apache#1527
reddycharan added a commit to reddycharan/bookkeeper that referenced this issue Oct 17, 2018
Descriptions of the changes in this PR:

- As part of ISSUE apache#1527, new versions of 'journalFormatVersionToWrite'
and 'fileInfoFormatVersionToWrite' are introduced. Since 4.8 release version
is created, default values of these versions should be bumped.

Author: cguttapalem <cguttapalem@salesforce.com>

Reviewers: Sijie Guo <sijie@apache.org>, Enrico Olivelli <eolivelli@gmail.com>

This closes apache#1689 from reddycharan/bumpjournalversion

@Rev vjujjuri@
jiazhai pushed a commit that referenced this issue Nov 2, 2020
… (Header) of TransientLedgerInfo is not persisted (#2474)

Motivation
DbLedgerStorage does not persist "ExplicitLAC" on RocksDB, so when you restart a Bookie the information is lost.
This work can be considered a follow up of @reddycharan work at #1527

Changes
persist ExplicitLAC on DBLedgerStorage
save ExplicitLAC on RocksDB (it is an optional field, so this change is backward compatible)
enable testExplicitLACIsPersisted test even for DBLedgerStorage
on TransientLedgerInfo rewind the ByteBuf in order to be able to use it again while writing to RocksDB
use computeIfAbsent in getOrAddLedgerInfo (code clean up, but code looked strange, probably subject to some possible race condition)
Master Issue: #1533


* Issue 1533: DbLedgerStorage does not persist ExplicitLAC: Information (Header) of TransientLedgerInfo is not persisted

* final fix and fix checkstyle

* restore tests

Co-authored-by: Enrico Olivelli <eolivelli@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants