Skip to content

fix(cube-master): panic if http server fails to start#127

Open
yyyang91 wants to merge 3 commits into
TencentCloud:masterfrom
yyyang91:fix-cubemaster-port-check
Open

fix(cube-master): panic if http server fails to start#127
yyyang91 wants to merge 3 commits into
TencentCloud:masterfrom
yyyang91:fix-cubemaster-port-check

Conversation

@yyyang91
Copy link
Copy Markdown

  • Fix the conditional logic for port availability detection.
  • Add explicit success logs to print the listening address upon startup.
  • Add detailed error logs when the HTTP server fails to start.
  • Enhance service observability for better troubleshooting.

@chenhengqi
Copy link
Copy Markdown
Collaborator

What's the purpose of this change?

@yyyang91
Copy link
Copy Markdown
Author

yyyang91 commented Apr 30, 2026

What's the purpose of this change?

This modification mainly addresses the following issues.

When the port used by the cubemaster service is occupied, cubemaster does not correctly print error logs. It is difficult to quickly identify the problem based on the logs. Even when it starts successfully, there is no success log similar to that of the cube API.
scree

Comment thread CubeMaster/pkg/server/server.go Outdated
addr = ":http"
}

ln, err := net.Listen("tcp", addr)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this change work with cubemastercli?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is ported from the Go standard library (net/http/server.go, v1.24.8).

I migrated the original implementation here to gain more granular control over the startup sequence. Specifically, it allows us to inject the startup/failure logs and port check logic directly into the serving loop, which addresses the observability requirements for cube-master.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You did not answer my question.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it works

Comment thread CubeMaster/pkg/server/server.go Outdated

actualAddr := ln.Addr().String()
CubeLog.Infof("cube-master listening on on %s", actualAddr)
stdlog.Printf("cube-master listening on on %s", actualAddr)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need a stdlog?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because:

  1. CubeLog.Infof only writes logs to a specific business log file /data/log/CubeMaster/cubemaster-req.log
  2. start_with_pidfile fucniton just redirect stdout/stderr to /var/log/cube-sandbox-one-click/cubemaster.log, we cannot find errors in /var/log/cube-sandbox-one-click/cubemaster.log

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the logs under /data/log/CubeMaster/meet the requirements.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But, the installation logs directed me to the /var/log/cube-sandbox-one-click directory to check for errors, You can view the screenshot. #127 (comment)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean just keep stdlog when the server start error? If the server start successfully, just logs to /data/log/CubeMaster/

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about panic when the requested port is in use?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think a panic is more clearer. I didn't use it because I'm not sure about the project's policy on panicking. Additionally, I found out that CubeLog.Fatal doesn't exit the program.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about panic when the requested port is in use?

I think the root cause of the current issue is likely that CubeLog.Errorf does not trigger a panic or exit the program.😅

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fslongjin PTAL.

- Fix the conditional logic for port availability detection.
- Add explicit success logs to print the listening address upon startup.
- Add detailed error logs when the HTTP server fails to start.
- Enhance service observability for better troubleshooting.
@yyyang91 yyyang91 force-pushed the fix-cubemaster-port-check branch from 83c5952 to 9e7ebd3 Compare May 6, 2026 04:14
Comment thread CubeMaster/pkg/server/server.go Outdated
addr = ":http"
}

ln, err := net.Listen("tcp", addr)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the real issue is that the requested port is in use, we can simply panic here.

Copy link
Copy Markdown
Author

@yyyang91 yyyang91 May 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that a panic only terminates the program, while a non-panic means the process is still running. To speed up troubleshooting, it's necessary to have logs indicating whether the service started successfully. Should we write success and error messages to `/var/log/cube-sandbox-one-click/cubemaster.log? This is because when the installation script encounters an error, it prompts us to check the files in the directory.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Panic will log to stderr and should be captured, no?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Panic will log to stderr and can be captured. but it will print the stack. Is it ok to print the stack?

@yyyang91 yyyang91 requested a review from chenhengqi May 7, 2026 03:13
@yyyang91
Copy link
Copy Markdown
Author

yyyang91 commented May 8, 2026

I revert the preious commit, and add a commit just panic.

@yyyang91 yyyang91 changed the title fix(cube-master): improve port check logic and add startup/failure logs fix(cube-master): panic if http server fails to start May 8, 2026
@fslongjin
Copy link
Copy Markdown
Member

I believe a better solution here would be to introduce a synchronization point​ before the startup completes. If an issue is detected, trigger a graceful shutdown​ instead of panicking outright. We should already have a graceful exit mechanism in place. @yyyang91

@yyyang91
Copy link
Copy Markdown
Author

yyyang91 commented May 9, 2026

I believe a better solution here would be to introduce a synchronization point​ before the startup completes. If an issue is detected, trigger a graceful shutdown​ instead of panicking outright. We should already have a graceful exit mechanism in place. @yyyang91

On the master branch, when serverTmp.Run() occur errors would not triger a graceful exit mechanism.

recov.GoWithRecover(func() {
serverTmp.Run()
})

I revert the commit that can tigger graceful exit yesterday.

@yyyang91
Copy link
Copy Markdown
Author

When the port is already in use, cube-master hangs silently instead of exiting. The graceful shutdown mechanism is not triggered, so the process just sits there. At the same time, no error is written to /var/log/cube-sandbox-one-click/cubemaster.log — the directory the installation script points you to when something goes wrong.
So when the install fails and the ops team follows the script's prompt to check logs, there is no trace of the startup failure anywhere.

What is the project's expected behavior here? CubeLog alone cannot emit errors to the path the installation script expects, so should the process exit immediately, trigger a graceful shutdown, or is there another mechanism to make startup failures visible in the installation script's diagnostic directory?

@chenhengqi @fslongjin
Please confirm the direction and I'll implement accordingly.

Copy link
Copy Markdown
Collaborator

@chenhengqi chenhengqi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am OK with this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants