Skip to content

[FLINK-38901][runtime-web] Introduce the Rescales/Configuration sub-page for streaming jobs with the adaptive scheduler enabled#27826

Open
och5351 wants to merge 2 commits intoapache:masterfrom
och5351:feature/FLINK-38901
Open

[FLINK-38901][runtime-web] Introduce the Rescales/Configuration sub-page for streaming jobs with the adaptive scheduler enabled#27826
och5351 wants to merge 2 commits intoapache:masterfrom
och5351:feature/FLINK-38901

Conversation

@och5351
Copy link

@och5351 och5351 commented Mar 25, 2026

What is the purpose of the change

  • [FLINK-38901][runtime-web] Introduce the Rescales/Configuration sub-page for streaming jobs with the adaptive scheduler enabled
  • The pr is not blocked by FLIP-495 completion and is independent sub-task in FLIP-487.

Brief change log

Adds the 'Rescale' tab and 'Configuration' subpage in relation to [FLINK-38897][Runtime/REST] Introduce /jobs/:jobid/rescales/config endpoint to REST API #27580.

image

Verifying this change

Environmental preparation

# Preparing the Flink Environment
./mvnw clean install -DskipTests -pl flink-runtime,flink-dist -am -P skip-webui-build -Dmaven.wagon.http.ssl.insecure=true -Dmaven.wagon.http.ssl.allowall=true

# Enable adaptive scheduler 
vi ./build-target/conf/config.yaml

# Run cluster
./build-target/bin/start-cluster.sh


# Prepare node packages
cd flink-runtime-web/web-dashboard
npm install
npm run proxy 

CASE 1. Streaming Job & Adaptive scheduler

image
# Run socket
nc -lk 9999

# Run streaming example
./build-target/bin/flink run -t remote -m localhost:8081 ./build-target/examples/streaming/SocketWindowWordCount.jar --port 9999
image

CASE 2. Batch Job

# Run batch example
./build-target/bin/flink run -t remote -m localhost:8081 ./build-target/examples/table/WordCountSQLExample.jar
image

CASE 3. Only Streaming job

image
# Stop cluster
./build-target/stop-cluster.sh

# Disable adaptive scheduler
vi ./build-target/conf/config.yaml

# Run cluster
./build-target/start-cluster.sh

# Run streaming example
./build-target/bin/flink run -t remote -m localhost:8081 ./build-target/examples/streaming/SocketWindowWordCount.jar --port 9999
image

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@flinkbot
Copy link
Collaborator

flinkbot commented Mar 25, 2026

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@och5351
Copy link
Author

och5351 commented Mar 25, 2026

Hi, @RocMarshal!
Would you mind taking a look at this PR if you have time?

Copy link
Contributor

@RocMarshal RocMarshal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job! @och5351
Left a few of comments. PTAL in your free time.

Comment on lines +137 to +145
const checkpointIndex = this.checkpointIndexOfNav();
if (data.plan.type == 'STREAMING' && checkpointIndex == -1) {
this.listOfNavigation.splice(this.checkpointIndexOfNavigation, 0, {
path: 'checkpoints',
title: 'Checkpoints'
});
} else if (data.plan.type == 'BATCH' && index > -1) {
this.listOfNavigation.splice(index, 1);
} else if (data.plan.type == 'BATCH' && checkpointIndex > -1) {
this.listOfNavigation.splice(checkpointIndex, 1);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change could be committed as a alone commit.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure how to split the commit message appropriately, so I created a sub-issue [FLINK-39325] under [FLIP-487] for the rescales tab logic and committed it separately with the new ticket number.

Copy link
Contributor

@RocMarshal RocMarshal Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@och5351 It should be sufficient to use a separate commit with the [hotfix] prefix here, since this is just a minor variable naming issue, right?

BTW, the change of the hotfix commit should contain the checkpointIndex->checkpointNavIndex related naming only.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @RocMarshal !
Thank you for your review.

I have applied the fix for[hotfix] Dynamically show/hide rescales tab.

Could you please take a look and review it again?

disabledInterval = 0x7fffffffffffffff;

public rescalesConfig?: RescalesConfig;
public jobDetail: JobDetailCorrect;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the JobDetailCorrect be JobDetail?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image Image

I followed the existing pattern from job-checkpoints.component.ts, which uses JobDetailCorrect for its jobDetail property. Since JobDetailCorrect extends JobDetail, the variable naming appears to follow the base type name for consistency across components.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @och5351 If I understand correctly, this class is used here only to obtain information such as the job ID, job type, and scheduler type.

However, JobDetailCorrect provides a much broader range of information than this scope. In addition, when it comes to displaying the detailed information for each rescale in the future, reusing JobDetailCorrect does not seem to be an appropriate choice.

Please let me know that's your opinon.

Copy link
Author

@och5351 och5351 Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @RocMarshal

You're right that currently we only use job ID, type, and scheduler type from the jobDetail. However, I used JobDetailCorrect because:

  1. [1]JobLocalService.jobDetailChanges() returns Observable, so using JobDetail would cause a type mismatch
  2. The job-checkpoints.component.ts follows the same pattern for consistency
  3. The jobDetail here serves as job metadata (scheduler, type, etc.), while future rescale-specific details would come from RescalesConfig

Would you prefer changing JobLocalService to return JobDetail instead, or keeping the current approach for consistency with other components?

[1] flink-runtime-web/web-dashboard/src/app/pages/job/job-local.service.ts

Copy link
Contributor

@RocMarshal RocMarshal Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @och5351 for the clarification.

I had a try on JobDetailCorrect -> JobDetail, It worked.

If the code logic does not rely on the extensions made to the plan property in JobDetailCorrect(i.e., it does not use plan.streamNodes, plan.streamLinks, or plan.nodesas NodesItemCorrect[]), then this usage is type-safe and appropriate.

Pls correct me if wrong.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @RocMarshal!

You're absolutely right. I tested the JobDetailCorrect → JobDetail change and it works correctly.

I really appreciate your code analysis - it helped me understand the type hierarchy better and I learned a lot from your insights.

I've made the changes and rebased the commits. Could you please take another look?

@och5351 och5351 force-pushed the feature/FLINK-38901 branch 3 times, most recently from e3c665e to 00e4b62 Compare March 26, 2026 00:17
@och5351 och5351 requested a review from RocMarshal March 26, 2026 00:48
@RocMarshal RocMarshal self-assigned this Mar 26, 2026
Comment on lines +137 to +138
const checkpointNavIndex = this.checkpointIndexOfNav();
if (data.plan.type == 'STREAMING' && checkpointNavIndex == -1) {
Copy link
Contributor

@RocMarshal RocMarshal Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the lack of clarity in my previous message.
What I meant is that the current changes are primarily aimed at adding the rescale navigation button and the rescales/configuration subpage.
However, it is now apparent that this also includes renaming checkpoint-related variables.
I prefer that each patch has changes that are as single-purpose or have a single responsibility as possible.
If you also prefer this approach, we should split the renaming of checkpoint-related variables into a separate hotfix like [hotfix][runtime-web] Polish the checkpoint related variables naming.
The other changes can remain in the commit for FLINK-38901.

disabledInterval = 0x7fffffffffffffff;

public rescalesConfig?: RescalesConfig;
public jobDetail: JobDetailCorrect;
Copy link
Contributor

@RocMarshal RocMarshal Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @och5351 for the clarification.

I had a try on JobDetailCorrect -> JobDetail, It worked.

If the code logic does not rely on the extensions made to the plan property in JobDetailCorrect(i.e., it does not use plan.streamNodes, plan.streamLinks, or plan.nodesas NodesItemCorrect[]), then this usage is type-safe and appropriate.

Pls correct me if wrong.

Comment on lines +88 to +90
<td *ngIf="rescalesConfig['rescaleHistoryMax'] === -1"></td>
<td *ngIf="rescalesConfig['rescaleHistoryMax'] !== -1">
{{ rescalesConfig['rescaleHistoryMax'] }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to show the raw value from the response~
Because there will be a default value for the rescaleHistoryMax.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I've applied this change in the last commit.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

@och5351 och5351 force-pushed the feature/FLINK-38901 branch 4 times, most recently from a571231 to 973a0ee Compare March 26, 2026 04:56
@och5351 och5351 requested a review from RocMarshal March 26, 2026 05:34
…age for streaming jobs with the adaptive scheduler enabled
@och5351 och5351 force-pushed the feature/FLINK-38901 branch from 9ef4694 to f96e323 Compare March 26, 2026 06:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants