[FLINK-27204][flink-runtime] Refract FileSystemJobResultStore to execute I/O operations on the ioExecutor #22341

Merged
merged 4 commits into from
Aug 23, 2023

WencongLiu
Contributor

What is the purpose of the change

At present, FileSystemJobResultStore executes I/O operations through FileSystem directly. We should refactor the JobResultStore interface so that I/O operations are executed asynchronously. This would move the responsibility for I/O operations from the Dispatcher into the JobResultStore.

Brief change log

  • Refactor the JobResultStore interface.
  • Refactor all code that calls JobResultStore.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not applicable)

@flinkbot
Collaborator

flinkbot commented Apr 4, 2023

CI report:

@XComp
Contributor

XComp commented May 10, 2023

Hi @WencongLiu, sorry for the late response. I found time to look into your proposal now. The initial intention of FLINK-27204 was to have the async functionality hidden in FileSystemBasedJobResultStore. The FileSystemBasedJobResultStore performs IO operations for most of the interface's methods, which is considered blocking and should not run in the Dispatcher's main thread (as it does right now). To achieve that, we would have to migrate all JobResultStore methods to become asynchronous:

| pre-FLINK-27204 JobResultStore | post-FLINK-27204 JobResultStore* |
| --- | --- |
| void createDirtyResult(JobResultEntry) throws ... | CompletableFuture createDirtyResultAsync(JobResultEntry) |
| void markResultAsClean(JobID) throws ... | CompletableFuture markResultAsCleanAsync(JobID) |
| boolean hasJobResultEntry(JobID) throws ... | CompletableFuture hasJobResultEntryAsync(JobID) |
| boolean hasDirtyJobResultEntry(JobID) throws ... | CompletableFuture hasDirtyJobResultEntryAsync(JobID) |
| boolean hasCleanJobResultEntry(JobID) throws ... | CompletableFuture hasCleanJobResultEntryAsync(JobID) |
| Set getDirtyResults(JobID) throws ... | CompletableFuture<Set> getDirtyResultsAsync() |

* The async calls in the runtime module usually have the format CompletableFuture<?> <method-name>Async()

The FileSystemBasedJobResultStore would get a constructor parameter ioExecutor, which would then be used to run the async calls. Your current proposal doesn't specify an executorService to run the CompletableFutures on. Additionally, we need to utilize the CompletableFutures wherever possible (instead of calling .get() right away). Calling .get() on the CompletableFuture makes the call blocking again (which is what we want to avoid).
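
To illustrate the shape suggested above, here is a minimal sketch rather than the code from this PR. It assumes a hypothetical class AsyncJobResultStoreSketch that uses String job IDs in place of JobID and in-memory sets in place of the file system. Only the method names and the ioExecutor constructor parameter are taken from the discussion; everything else is made up for illustration.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical, simplified stand-in for a file-system-backed job result store. */
class AsyncJobResultStoreSketch {

    private final Executor ioExecutor;

    // In-memory stand-ins for the dirty/clean entries a real store would keep as files.
    private final Set<String> dirtyEntries = ConcurrentHashMap.newKeySet();
    private final Set<String> cleanEntries = ConcurrentHashMap.newKeySet();

    AsyncJobResultStoreSketch(Executor ioExecutor) {
        this.ioExecutor = ioExecutor;
    }

    /** Runs the (simulated) write on the ioExecutor instead of the caller's thread. */
    CompletableFuture<Void> createDirtyResultAsync(String jobId) {
        return CompletableFuture.runAsync(() -> dirtyEntries.add(jobId), ioExecutor);
    }

    /** Moves an entry from dirty to clean, again on the ioExecutor. */
    CompletableFuture<Void> markResultAsCleanAsync(String jobId) {
        return CompletableFuture.runAsync(
                () -> {
                    if (dirtyEntries.remove(jobId)) {
                        cleanEntries.add(jobId);
                    }
                },
                ioExecutor);
    }

    /** Runs the (simulated) lookup on the ioExecutor. */
    CompletableFuture<Boolean> hasJobResultEntryAsync(String jobId) {
        return CompletableFuture.supplyAsync(
                () -> dirtyEntries.contains(jobId) || cleanEntries.contains(jobId),
                ioExecutor);
    }

    /** Chains two store calls without calling get() in between, so nothing blocks here. */
    static CompletableFuture<Boolean> createThenCheck(
            AsyncJobResultStoreSketch store, String jobId) {
        return store.createDirtyResultAsync(jobId)
                .thenCompose(ignored -> store.hasJobResultEntryAsync(jobId));
    }

    public static void main(String[] args) {
        ExecutorService ioExecutor = Executors.newSingleThreadExecutor();
        try {
            AsyncJobResultStoreSketch store = new AsyncJobResultStoreSketch(ioExecutor);
            // join() appears only at the outermost edge of the demo to print the result.
            boolean found = createThenCheck(store, "job-1").join();
            System.out.println("entry found: " + found);
        } finally {
            ioExecutor.shutdown();
        }
    }
}
```

The point of createThenCheck is that callers compose the returned futures; a blocking wait, if one is needed at all, is pushed to the outermost caller.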

Does that sound reasonable to you? Let me know if you have more questions.

@WencongLiu
Contributor Author

WencongLiu commented May 11, 2023

@XComp Thanks for your reply! 😀 I'll follow the suggestions.

@WencongLiu
Contributor Author

@XComp I have made a round of changes. Please take a look at it when you have time. 😃

Contributor

@XComp XComp left a comment

Thanks, @WencongLiu . I did a pass over the change and added a few comments. PTAL

Just letting you know: I'm on vacation for the next two weeks. Therefore, don't expect any responses in that time. ...just to help you coordinate your work. :-)

@WencongLiu WencongLiu force-pushed the dev_FLINK-27204 branch 2 times, most recently from 90d7f95 to 318edb4 Compare June 3, 2023 11:44
@WencongLiu
Contributor Author

@XComp I have made a round of changes. Please take a look at it when you have time. 😄

Contributor

@XComp XComp left a comment

Sorry for getting back to you so late, @WencongLiu . I hadn't found the time until now to go through the open pull requests. Anyway, I went over the changes once more. I added a few comments. PTAL

@WencongLiu
Contributor Author

Thanks for your careful review 😄 @XComp. Sorry that the first round of the pull request was a bit rough. Please take a look when you have time.

@WencongLiu WencongLiu force-pushed the dev_FLINK-27204 branch 4 times, most recently from 584c7a7 to 6ef6f2c Compare July 25, 2023 04:23
@WencongLiu WencongLiu changed the title [FLINK-27204] Refract FileSystemJobResultStore to execute I/O operations on the ioExecutor [FLINK-27204][flink-runtime] Refract FileSystemJobResultStore to execute I/O operations on the ioExecutor Jul 25, 2023
Contributor

@XComp XComp left a comment

Thanks @WencongLiu for addressing my comments. I went over the code once more. PTAL.

I might do another pass over it to address the issue in JobMasterServiceLeadershipRunner#verifyJobSchedulingStatusAndCreateJobMasterServiceProcess next week.

@WencongLiu WencongLiu force-pushed the dev_FLINK-27204 branch 2 times, most recently from 4eccd9b to 107b36a Compare August 9, 2023 09:56
Contributor

@XComp XComp left a comment

LGTM already. I'm gonna pass over the JobMasterServiceLeadershipRunner change tomorrow.

Contributor

@XComp XComp left a comment

Nice, we're almost there. 👍 I added a few (mostly cosmetic) comments. PTAL

@WencongLiu
Contributor Author

Thanks for your patient review. @XComp I have added a fixup commit. PTAL.

Contributor

@XComp XComp left a comment

Cool, thanks. We're getting closer. I have a few cosmetic comments. PTAL

Contributor

@XComp XComp left a comment

Thanks, @WencongLiu . I only have a few cosmetic proposals. But it looks good now. :-)

One additional thing: Could you fix the TODO in JobMasterServiceLeadershipRunner:171ff in a hotfix commit. I just noticed that we forgot to clean that up 😇

Contributor

@XComp XComp left a comment

Great. Thanks very much for your effort. One last thing is there to be fixed.

When you're done with that, you can re-organize/squash the commits properly and rebase the branch. I will do a final pass over it after this is done. 👍

@WencongLiu WencongLiu force-pushed the dev_FLINK-27204 branch 2 times, most recently from 705c6c9 to 249e0e9 Compare August 18, 2023 15:10
Contributor

@XComp XComp left a comment

Thanks for reorganizing the commits. In my final pass over the change, I realized that we missed one location where we should switch to the CompletableFuture handling instead of blocking the call. Sorry for missing that earlier. I marked the code snippet below. PTAL.

Contributor

@XComp XComp left a comment

Awesome. We're done here in my opinion. Good job. 👍 I pushed some minor fixes (see comparison). The changes can be merged to master as soon as the release-1.18 branch is created.

Contributor

@XComp XComp left a comment

CI is failing due to us not acknowledging the requirement that any state access in the Dispatcher has to happen in the main thread (to ensure sequential execution of any Dispatcher logic). Please see my comment below for further details.

return FutureUtils.completedExceptionally(
DuplicateJobSubmissionException.ofGloballyTerminated(
jobID));
} else if (jobManagerRunnerRegistry.isRegistered(jobID)
Contributor

Suggested change
} else if (jobManagerRunnerRegistry.isRegistered(jobID)
} else if (jobManagerRunnerRegistry.isRegistered(jobID)

Ok, that's a tricky one which I missed: The Dispatcher has one requirement: Any state access needs to happen in the main thread of the Dispatcher. There's a special implementation of JobManagerRunnerRegistry that ensures this invariant (see OnMainThreadJobManagerRunnerRegistry). Our change (with the thenCompose being chained with the future that's returned by isInGloballyTerminalState) goes against this requirement. Why?

isInGloballyTerminalState calls jobResultStore.hasJobResultEntryAsync which executes the logic on the ioExecutor (i.e. a thread for IO operations which is not the Dispatcher's main thread) internally. The returned future is linked to this executor, i.e. any chained CompletableFuture calls will run in the same thread. The thenCompose logic is, therefore, also executed in the ioExecutor instead of the main thread.

To work around this, we have to change the executor for the chained execution. This can be achieved by using thenComposeAsync instead. Here we would specify the main thread executor by calling getMainThreadExecutor. One example where it's done like that is Dispatcher:619: The cleanupAsync method is executed on the ioExecutor. But the error handling has to happen in the main thread again. That's where we use handleAsync with getMainThreadExecutor.
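
The thread hand-off described above can be reproduced outside Flink with plain CompletableFuture. The following is a minimal sketch, not Dispatcher code: the class MainThreadHandoffSketch, the thread names, and the single-threaded executor standing in for getMainThreadExecutor are all assumptions made for illustration. A latch keeps the I/O future incomplete until both continuations are registered, so in this setup the plain thenCompose callback is run by the thread that completes the future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical demo: which thread runs a chained callback? */
class MainThreadHandoffSketch {

    /** Returns {thread of the thenCompose callback, thread of the thenComposeAsync callback}. */
    static String[] run() {
        // Stand-ins for the ioExecutor and the Dispatcher's main thread executor.
        ExecutorService ioExecutor =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "io-thread"));
        ExecutorService mainThreadExecutor =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "dispatcher-main-thread"));
        CountDownLatch chained = new CountDownLatch(1);
        try {
            // Simulated I/O lookup that finishes only after both callbacks are chained.
            CompletableFuture<Boolean> ioResult =
                    CompletableFuture.supplyAsync(
                            () -> {
                                try {
                                    chained.await();
                                } catch (InterruptedException e) {
                                    Thread.currentThread().interrupt();
                                }
                                return false;
                            },
                            ioExecutor);

            // Plain chaining: here the callback runs on the thread that completes ioResult.
            CompletableFuture<String> plain =
                    ioResult.thenCompose(
                            ignored ->
                                    CompletableFuture.completedFuture(
                                            Thread.currentThread().getName()));

            // Async chaining with an explicit executor: the callback is handed to that executor.
            CompletableFuture<String> handedOff =
                    ioResult.thenComposeAsync(
                            ignored ->
                                    CompletableFuture.completedFuture(
                                            Thread.currentThread().getName()),
                            mainThreadExecutor);

            chained.countDown();
            return new String[] {plain.join(), handedOff.join()};
        } finally {
            ioExecutor.shutdown();
            mainThreadExecutor.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] threads = run();
        System.out.println("thenCompose ran on:      " + threads[0]);
        System.out.println("thenComposeAsync ran on: " + threads[1]);
    }
}
```

Note that a non-async callback chained onto an already completed future runs on the calling thread instead, which is one more reason to name the executor explicitly when state access has to stay on one thread.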

Contributor Author

Thanks for your detailed explanation! 😄 I've modified the thenCompose to thenComposeAsync.

Contributor

You didn't do a pull before adding the changes (I had force-pushed to include a few minor changes previously). These changes were reverted by your most recent push.

Contributor Author

😼 I have added the changes in the comparison. Really sorry for this.

Contributor

No worries. Let's see whether CI becomes 🟢 this time. 🤞

Contributor

@XComp XComp left a comment

CI is green. 👍 Thanks again :)

@XComp XComp merged commit c9fcb0c into apache:master Aug 23, 2023