Skip to content

Conversation

akalash
Copy link
Contributor

@akalash akalash commented Jun 1, 2022

What is the purpose of the change

Checkpoint trigger always from one thread

Brief change log

  • *Trigger non-periodic checkpoint in 'timer' thread *

Verifying this change

No extra tests

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@flinkbot
Copy link
Collaborator

flinkbot commented Jun 1, 2022

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @akalash , thanks for your contribution. Could you run 'mvn spotless:apply' to remove these useless code?

@akalash
Copy link
Contributor Author

akalash commented Jun 23, 2022

I think we can rethink triggerCheckpoint and use it only from one timer thread(main coordinator thread in the future). Right now, I introduced the second method triggerCheckpointFromCheckpointThread because it is simple and it doesn't require a lot of changes but theoretically, we can use only one method triggerCheckpoint and if it is called from the non-expected thread(check the name of the thread?) then we can automatically wrap it with required thread(the same logic as we use in mailbox). Or at least we should check the condition that the triggerCheckpoint is called from the expected thread but it requires many changes in tests since many tests call this method directly from main test thread right now.

Copy link
Member

@1996fanrui 1996fanrui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the update.

Copy link
Contributor

@rkhachatryan rkhachatryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix @akalash, in general it LGTM.
I've left a couple of comments regarding the interface though, PTAL.

Comment on lines 505 to 506
public CompletableFuture<CompletedCheckpoint> triggerCheckpointFromCheckpointThread(
boolean isPeriodic) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method should not accept boolean isPeriodic:
If periodic triggering is enabled via startCheckpointScheduler(), I couldn't imagine a use case for that - and there is no callers using true.
If it's disabled with stopCheckpointScheduler() then it doesn't make sense to call this method at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same point about tests. But perhaps, I can migrate these tests to another method

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point, there are a lot of existing usages of triggerCheckpoint(false) in tests, so please feel free to leave it as is.

Copy link
Contributor

@rkhachatryan rkhachatryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating the PR @akalash
LGTM

Please squash the commits and rebase, I suppose the failure is caused by FLINK-28269.

@HuangXingBo
Copy link
Contributor

Hi @akalash , could you rebase master and push again?

@HuangXingBo
Copy link
Contributor

@flinkbot run azure

liujiawinds pushed a commit to liujiawinds/flink that referenced this pull request Jul 22, 2022
huangxiaofeng10047 pushed a commit to huangxiaofeng10047/flink that referenced this pull request Nov 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants