Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor part of IndexFileDeleter and ReplicaFileDeleter into a common utility class #12126

Merged
merged 9 commits into from
Mar 16, 2023

Conversation

zhaih
Copy link
Contributor

@zhaih zhaih commented Feb 3, 2023

Description

See #11885

Approach

I extracted mainly the ref counting part into the new FileDeleter so that both IndexFileDeleter and ReplicaFileDeleter will use it as a component. It does not provide any level of thread-safety since the IndexFileDeleter originally was implemented in a lock-free way, so the user of this new FileDeleter is responsible for the synchronization if necessary.

Test

I haven't written any specific test for this since I feel like the existing tests (mainly TestIndexFileDeleter) should already provide a good coverage

@zhaih zhaih linked an issue Feb 3, 2023 that may be closed by this pull request
@@ -206,7 +206,7 @@ private synchronized void _transferAndCancel(CopyJob prevJob) throws IOException
if (Node.VERBOSE_FILES) {
dest.message("remove partial file " + prevJob.current.tmpName);
}
dest.deleter.deleteNewFile(prevJob.current.tmpName);
dest.deleter.deleteIfNoRef(prevJob.current.tmpName);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like deleteIfNoRef is always safer than deleteNewFile. Do we still need the method deleteNewFile?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I actually want to do it the opposite way, I'm not 100% sure why we need a deleteNewFile (force delete) here rather than deleteIfNoRef but I don't want to introduce a (possibly) different behavior in this change so I kept it.
Altho the deleteNewFile is a bit misleading so I changed the name to forceDeleteFile

}
}

deleteFiles(toDelete);
fileDeleter.deleteFilesIfNoRef(toDelete);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we get toDelete from fileDeleter as well. should this part of code be encaptulated into fileDeleter as a whole. like call fileDeleter.clean() or sth.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was struggling with this as well. The reason I put it like this is that I feel like the IllegalStateException here is quite important so I don't want to throw it away, but if I put it into say fileDeleter.clean(), then it does not always make sense too.

Copy link
Member

@mikemccand mikemccand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this refactor @zhaih! It is very hairy code that was doing nearly the same thing in two different places. The rote refactor is badly needed.

I left a few small comments... I think this is very close.

lucene/CHANGES.txt Outdated Show resolved Hide resolved
@@ -154,7 +151,7 @@ public IndexFileDeleter(
|| fileName.startsWith(IndexFileNames.PENDING_SEGMENTS))) {

// Add this file to refCounts with initial count 0:
getRefCount(fileName);
fileDeleter.getRefCount(fileName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kinda weird that a method starting with get has this set-like side effect, sigh.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can maybe later / separately improve the naming, or add a dedicated initRefCount(fileName) method or so.

Copy link
Contributor Author

@zhaih zhaih Mar 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I don't like that part either, it's quite tricky, good thing is we're not abusing the old get so I just followed your suggestion and created a initRefCount, feels better now LOL.

@@ -610,76 +601,34 @@ public void checkpoint(SegmentInfos segmentInfos, boolean isCommit) throws IOExc
}
}

private void logInfo(FileDeleter.MsgType msgType, String msg) {
if (msgType == FileDeleter.MsgType.REF && VERBOSE_REF_COUNTS == false) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably you could have just moved the VERBOSE_REF_COUNTS down into FileDeleter, but since this is a rote refactor it's good to separate / do that later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could leave VERBOSE_REF_COUNTS as is, since FileDeleter is just responsible for throwing message out with it's type, and here VERBOSE_REF_COUNTS is the switch on the IFD side so I think it makes sense to let it stay there.

Copy link
Member

@mikemccand mikemccand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for refactoring such a hairy part of Lucene @zhaih!

@zhaih zhaih merged commit d3b6ef3 into apache:main Mar 16, 2023
zhaih added a commit that referenced this pull request Mar 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor and generalize file deleter
4 participants