Skip to content

feat(server): support single computer do schedule task#2676

Open
shirleyStorage wants to merge 5 commits into
apache:masterfrom
shirleyStorage:yaojingyi/computer_do_sche
Open

feat(server): support single computer do schedule task#2676
shirleyStorage wants to merge 5 commits into
apache:masterfrom
shirleyStorage:yaojingyi/computer_do_sche

Conversation

@shirleyStorage
Copy link
Copy Markdown
Contributor

@shirleyStorage shirleyStorage commented Oct 11, 2024

Purpose of the PR

Main Changes

Verifying these changes

  • Trivial rework / code cleanup without any test coverage. (No Need)
  • Already covered by existing tests, such as (please modify tests here).
  • Need tests and can be verified as follows:

Does this PR potentially affect the following parts?

Documentation Status

  • Doc - TODO
  • Doc - Done
  • Doc - No Need

shirley118146 added 3 commits October 12, 2024 00:31
Signed-off-by: shirley118146 <shirley118146@gmail.com>
Signed-off-by: shirley118146 <shirley118146@gmail.com>
Signed-off-by: shirley118146 <shirley118146@gmail.com>
@shirleyStorage shirleyStorage changed the title [DNM] Support computer do schedule task Support computer do schedule task Oct 11, 2024
@shirleyStorage shirleyStorage marked this pull request as ready for review October 11, 2024 17:45
@dosubot dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. feature New feature labels Oct 11, 2024
@shirleyStorage
Copy link
Copy Markdown
Contributor Author

@imbajin PTAL~

@shirleyStorage shirleyStorage changed the title Support computer do schedule task Support single computer do schedule task Oct 12, 2024
@imbajin imbajin requested a review from zyxxoo October 12, 2024 08:43
@shirleyStorage
Copy link
Copy Markdown
Contributor Author

@imbajin I updated the test case to the PR description. PTAL ~~

@shirleyStorage
Copy link
Copy Markdown
Contributor Author

@dosu how to rerun "HugeGraph-Server CI / build-server (hbase, 11)"

@JackyYangPassion
Copy link
Copy Markdown
Contributor

@dosu how to rerun "HugeGraph-Server CI / build-server (hbase, 11)"

2024-10-22.22.40.08.mov

Copy link
Copy Markdown
Contributor

@javeme javeme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution.
Can I know in what scenario you need to do this?


boolean singleNode = !hasWorkerNode||computerNodeCount==1;
if (singleNode != this.onlySingleNode) {
LOG.info("Switch only_single_node 02 to {}", singleNode);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure 02 means what

}
if (server.role().master()) {
continue;
}else if (server.role().computer()){
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a space after '}', or remove 'else'

return this.selfNodeRole() != null && this.selfNodeRole().master();
public boolean selfIsMasterOrSingleComputer() {
boolean isMaster=this.selfNodeRole() != null && this.selfNodeRole().master();
boolean isSingleComputer=isStandAloneComputer();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expect spaces " = "

return this.selfNodeRole() != null && this.selfNodeRole().computer();
}

public boolean isStandAloneComputer(){
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused method?

@imbajin imbajin changed the title Support single computer do schedule task feat: support single computer do schedule task Oct 28, 2024
@imbajin imbajin changed the title feat: support single computer do schedule task feat(server): support single computer do schedule task Oct 28, 2024
@Tsukilc
Copy link
Copy Markdown
Collaborator

Tsukilc commented Nov 10, 2025

Hello, I'm trying to disable the server-role option(https://github.com/hugegraph/hugegraph/pull/85). I'd like to ask what blocked this pr merge before?

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to let a standalone computer role node schedule and execute tasks in HugeGraph’s task scheduling flow.

Changes:

  • Allows master checks to pass for selfIsMasterOrSingleComputer().
  • Adds standalone-computer detection in ServerInfoManager.
  • Refreshes single-node state during scheduler heartbeat flow.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/task/TaskManager.java Updates scheduling gate to include standalone computer nodes.
hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/task/StandardTaskScheduler.java Allows standard task scheduling/cancel master checks for standalone computer nodes.
hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/task/ServerInfoManager.java Adds standalone-computer detection and recalculates single-node state.
hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/task/DistributedTaskScheduler.java Updates distributed scheduler requirement checks for standalone computer nodes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +175 to +186
public boolean selfIsMasterOrSingleComputer() {
boolean isMaster=this.selfNodeRole() != null && this.selfNodeRole().master();
boolean isSingleComputer=isStandAloneComputer();
return isMaster||isSingleComputer;
}

public boolean selfIsComputer() {
return this.selfNodeRole() != null && this.selfNodeRole().computer();
}

public boolean isStandAloneComputer(){
return this.onlySingleNode() && this.selfIsComputer();
Comment on lines +241 to +265
protected synchronized void updateIsSingleNode(){
Collection<HugeServerInfo> servers=this.allServerInfos();
boolean hasWorkerNode = false;
long now = DateUtil.now().getTime();
int computerNodeCount=0;

// Iterate servers to find suitable one
for (HugeServerInfo server : servers) {
if (!server.alive()) {
continue;
}
if (server.role().master()) {
continue;
}else if (server.role().computer()){
computerNodeCount++;
}
hasWorkerNode = true;
}

boolean singleNode = !hasWorkerNode||computerNodeCount==1;
if (singleNode != this.onlySingleNode) {
LOG.info("Switch only_single_node 02 to {}", singleNode);
this.onlySingleNode = singleNode;
}
}
Copy link
Copy Markdown
Member

@imbajin imbajin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found two remaining behavior issues around standalone computer scheduling. The existing current-head Copilot comments for checkstyle and the mixed worker/computer onlySingleNode case are valid, so I added +1 reactions there instead of duplicating them.


2 CI/status checks failed on the latest head; please check the failed jobs 🔗 https://github.com/apache/hugegraph/actions/runs/16592586432/job/46931607687.

// Update server heartbeat
serverManager.heartbeat();

serverManager.updateIsSingleNode();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

‼️ onlySingleNode is first refreshed here, but selfIsMasterOrSingleComputer() is already used by StandardTaskScheduler.checkOnMasterNode() before this periodic loop runs. Since TaskManager starts this loop with a 10 * SCHEDULE_PERIOD initial delay, a standalone server.role=computer node can still reject the first task request after startup with Can't schedule task on non-master server. Please initialize this state during server-info bootstrap, or compute the standalone-computer condition on demand instead of waiting for the background scheduler.

return;
}
if (this.selfIsMaster()) {
if (this.selfIsMasterOrSingleComputer()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

‼️ This changes the recovery behavior for standalone computer nodes when their own ServerInfo entry is missing. Before this patch, non-master nodes recreated the missing record locally via saveServerInfo(...); after this change, a standalone computer returns here and waits for re-init like a master. In standalone server.role=computer mode there is no master to recreate that row, so once the server-info vertex is lost the node cannot recover itself anymore. Please keep the local re-save path for standalone computers, or add a dedicated re-init path for them.

Copy link
Copy Markdown
Member

@imbajin imbajin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added +1 reactions to the existing Copilot comments that already cover the checkstyle breakage and the single-computer misclassification bug. One additional gap remains around missing coverage.


2 CI/status checks failed; please check the failed jobs: 🔗 https://github.com/apache/hugegraph/actions/runs/16592586432/job/46931607687

return !this.closed && this.graph.started() && this.graph.initialized();
}

protected synchronized void updateIsSingleNode(){
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This scheduler role change needs automated coverage. The PR changes when a computer node can pass master-only scheduling checks and recalculates onlySingleNode, but no test files are changed, so the standalone-computer path and the master + worker + single-computer case are unguarded. Please add tests that cover a single computer node scheduling a task and a mixed cluster where one computer node must not make the whole deployment behave as single-node.

Copy link
Copy Markdown
Member

@imbajin imbajin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One additional standalone-computer scheduling gap remains in the distributed scheduler path.


2 CI/status checks failed on the latest head; please check the failed jobs 🔗 https://github.com/apache/hugegraph/actions/runs/16592586432/job/46931607687.

@Override
public void checkRequirement(String op) {
if (!this.serverManager().selfIsMaster()) {
if (!this.serverManager().selfIsMasterOrSingleComputer()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DistributedTaskScheduler now also depends on selfIsMasterOrSingleComputer(), but this scheduler never refreshes the new onlySingleNode flag. TaskManager.scheduleOrExecuteJobForGraph() only calls updateIsSingleNode() inside the StandardTaskScheduler branch, while each scheduler owns its own ServerInfoManager initialized with onlySingleNode = false. In scheduler.type=distributed, a standalone server.role=computer node therefore still fails this requirement check permanently instead of passing as a standalone computer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature New feature size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Can not delete vertex labels when use standalone mode and set server.role=computer

6 participants