Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BP-50: Add reuse journal file feature to support Intel PMem disk #3194

Merged
merged 5 commits into from Jul 27, 2022

Conversation

hangc0276
Copy link
Contributor

Motivation

The implementation of #3193

@hangc0276 hangc0276 changed the title [WIP] PIP-50: Add reuse journal file to support PMem disk [WIP] PIP-50: Add reuse journal file feature to support PMem disk Apr 9, 2022
@hangc0276 hangc0276 changed the title [WIP] PIP-50: Add reuse journal file feature to support PMem disk [WIP] PIP-50: Add reuse journal file feature to support Intel PMem disk Apr 9, 2022
Copy link
Contributor

@dlg99 dlg99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add tests, at least to:

  • make sure journals aren't reused when feature is disabled.
  • cover recovery from journal/after crash with reused journal. is it possible to use wrong journal file in that case? or end up with corrupted one? What happens with last log mark and checkpointing? e.g. see BookieJournalTest.
  • Bookie's readJournal()/replay() scan journals sequentially because Journal.listJournalIds assumes they are. With this option the assumption changes to circular with unclear stop point.

@ArvinDevel
Copy link
Contributor

when the bookie is restarted with replaying the journal, how to handle the corner cases below:
1)bookie crash with reused journal enabled, but the restart bookie with the flag disabled;
2)bookie crash with reused journal disabled, but the restart bookie with the flag enabled.

@hangc0276 hangc0276 mentioned this pull request Jul 20, 2022
11 tasks
@hangc0276
Copy link
Contributor Author

when the bookie is restarted with replaying the journal, how to handle the corner cases below: 1)bookie crash with reused journal enabled, but the restart bookie with the flag disabled; 2)bookie crash with reused journal disabled, but the restart bookie with the flag enabled.

@ArvinDevel Thanks for your review.

Currently, we reuse the journal files by renaming the file name.
For example, the journal file pool size is 5, and current journal files are: [101, 102, 103, 104, 105], and the current mark deletion position is (103, 2000). We need a new journal file named 106, and if we enabled the reuse journal file, we will do the following check.

  1. The journal files list is: [101, 102, 103, 104, 105], and the mark deletion position point to file: 103, it is safe to reuse file: 101
  2. We pick up file: 101 and rename it to file: 106, after rename, the journal files list is: [102, 103, 104, 105, 106]

For the first case.

1)bookie crash with reused journal enabled, but the restart bookie with the flag disabled

Bookie crash with reused journal enabled, the last written journal file is 106, and the last mark delete position is (103, 2000). When the bookie restarts with the flag disabled, it will replay from file: 103 to file 106.

  1. bookie crash with reused journal disabled, but the restart bookie with the flag enabled

Bookie crash with reuse journal disabled, the last written journal file is 106, and the last mark delete position is (103, 2000). When the bookie restarts with the flag enabled, it will replay from file 103 to file 106 and for new file 107 to be written, it does the above check, renames file 101 to file 107, and open it for writing.

@hangc0276
Copy link
Contributor Author

Please add tests, at least to:

  • make sure journals aren't reused when feature is disabled.
  • cover recovery from journal/after crash with reused journal. is it possible to use wrong journal file in that case? or end up with corrupted one? What happens with last log mark and checkpointing? e.g. see BookieJournalTest.
  • Bookie's readJournal()/replay() scan journals sequentially because Journal.listJournalIds assumes they are. With this option the assumption changes to circular with unclear stop point.

@dlg99 Thanks for your review.
Please refer to: #3194 (comment)

For the test, we need a special FileChannel which records the current written position when rewriting an existing journal file. I'm thinking about how to add the test.

  • cover recovery from journal/after crash with reused journal. is it possible to use wrong journal file in that case? or end up with corrupted one? What happens with last log mark and checkpointing? e.g. see BookieJournalTest.

We rename the old journal file to the new one and keep the journal file name increasing as the current behavior. It can avoid complex cases such as using the wrong journal file and so on.

  • Bookie's readJournal()/replay() scan journals sequentially because Journal.listJournalIds assumes they are. With this option the assumption changes to circular with unclear stop point.

We rename the old journal file to the new one, we can file the stop point with the largest file name on the journal replay stage.

@hangc0276 hangc0276 requested a review from dlg99 July 20, 2022 07:26
@hangc0276 hangc0276 self-assigned this Jul 20, 2022
@hangc0276 hangc0276 added this to the 4.16.0 milestone Jul 20, 2022
@hangc0276 hangc0276 changed the title [WIP] PIP-50: Add reuse journal file feature to support Intel PMem disk PIP-50: Add reuse journal file feature to support Intel PMem disk Jul 20, 2022
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@eolivelli eolivelli changed the title PIP-50: Add reuse journal file feature to support Intel PMem disk BP-50: Add reuse journal file feature to support Intel PMem disk Jul 20, 2022
Copy link
Contributor

@ArvinDevel ArvinDevel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hangc0276 thanks for your contribution at first. I left one small comment, please check it. From the change, I can see that the main changes is that extract writeHeader and allow to writeHeader to existing file if it exists and skipHeader when scanning the journal file to replace current fileExists trigger, correct me if I'm wrong.


public void writeHeader() throws IOException {
try {
bc = bcBuilder.create(fc, writeBufferSize);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the bc is already created at line 193, right? if so, I think this line is redundant.

@hangc0276 hangc0276 force-pushed the chenhang/reuse_journal_file branch from 9cc48d5 to a1ed233 Compare July 25, 2022 15:02
@hangc0276
Copy link
Contributor Author

@hangc0276 thanks for your contribution at first. I left one small comment, please check it. From the change, I can see that the main changes is that extract writeHeader and allow to writeHeader to existing file if it exists and skipHeader when scanning the journal file to replace current fileExists trigger, correct me if I'm wrong.

@ArvinDevel Thanks for your review. Yes, you are right. The original logic is a little complex and it has bugs. I rewrite the logic, Please help take a look again. Thanks a lot.

Copy link
Contributor

@dlg99 dlg99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Copy link
Contributor

@ArvinDevel ArvinDevel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hangc0276 thanks for your clarification, please add the failure cases(which are not solved by previous version) in the test to ensure the refactored code works as expected.

if (!channel.fileExists(fn)) { // new file, write version
if (!fn.createNewFile()) {
if (!channel.fileExists(fn) || reuseFile) { // new file, write version
if (!reuseFile && !fn.createNewFile()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why mix the logic with the old one?
when the reuseFile is true, we have done renameJournalFile(toReplaceFile, fn), so we expect the channel.fileExists(fn), if not then throw exception, is this alert what you want to ensure?
If so I suggest to separate it and put more clear information to log, and please also add necessary comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArvinDevel Thanks for your review.

when the reuseFile is true, we have done renameJournalFile(toReplaceFile, fn), so we expect the channel.fileExists(fn), if not then throw exception, is this alert what you want to ensure?

No, there are two code paths to run into line 180.

  1. For normal create a new journal file, the channel.fileExists(fn) is false and reuseFile flag is false. It will call createNewFile and then write the header into the new journal file.
  2. For open an existed journal file to write, it will rename the journal file first. The channel.fileExists(fn) is true and reuseFile is true. It will run into line 181, due to reuseFile being true, so it won't call fn.createNewFile to create a new journal file. In the end, it will write the new header into the journal file to override existing data in the file.

For the test, we need a special FileChannel which records the current written position when rewriting an existing journal file. It is a little hard to run into a reuseFile case.

@@ -248,6 +257,13 @@ private JournalChannel(File journalDirectory, long logId,
}
}

public static void renameJournalFile(File source, File target) throws IOException {
if (!source.renameTo(target)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should check source's value like:
if (source == null) { throw new NullPointerException(); }

@StevenLuMT
Copy link
Contributor

rerun failure checks

Copy link
Contributor

@ArvinDevel ArvinDevel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's pretty clear now. Thanks for your excellent work. @hangc0276
LGTM!

@hangc0276 hangc0276 merged commit 505ee5f into apache:master Jul 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants