When the number of active scheduler is 0, the running peer will crash on its own #3158
Comments
Sorry, I found out that I am actually using version v2.1.10, not v2.1.0. Since which version has the second problem been fixed?
I have upgraded to the latest version, but the peer still crashes when I scale the scheduler to 0. I found this log in the peer:

    {"log":"2024-04-01T16:11:18.602Z\u0009WARN\u0009dependency/dependency.go:149\u0009receive signal: terminated\n","stream":"stderr","time":"2024-04-01T16:11:18.60252273Z"}

And the kubelet logs show the container was killed because the gRPC liveness probe failed multiple times:

    Liveness probe for "dragonfly-dfdaemon-m76dl_dragonfly-system(7b1c5b41-9bda-4ed9-98f8-57655c5fa0c1):dfdaemon" failed (failure): service unhealthy (responded with "NOT_SERVING")
    Killing unwanted container "dfdaemon" (id={"docker" "fc63d3f4032abca8e59bc1ea9cac3cfa4cd4f85e14c4331a431913b5d81f266b"}) for pod "dragonfly-dfdaemon-m76dl_dragonfly-system(7b1c5b41-9bda-4ed9-98f8-57655c5fa0c1)"
Without a scheduler, the dfdaemon health check will fail and Kubernetes will restart the daemon. If you do not want it to be restarted, you can delete the liveness probe.
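To make the restart loop concrete, here is a minimal Go sketch of the mechanism described above: the daemon's health endpoint reports `NOT_SERVING` once no scheduler is reachable, and kubelet restarts the container after a threshold of consecutive failed probes. The function names (`healthStatus`, `kubeletShouldRestart`) and the threshold logic are illustrative assumptions, not Dragonfly's or kubelet's actual code.

```go
package main

import "fmt"

// healthStatus models the daemon's gRPC health response: with zero active
// schedulers the daemon is considered unhealthy (hypothetical simplification).
func healthStatus(activeSchedulers int) string {
	if activeSchedulers == 0 {
		return "NOT_SERVING"
	}
	return "SERVING"
}

// kubeletShouldRestart models kubelet's failureThreshold behavior: after N
// consecutive failed liveness probes, the container is killed and restarted.
func kubeletShouldRestart(consecutiveFailures, failureThreshold int) bool {
	return consecutiveFailures >= failureThreshold
}

func main() {
	// With the scheduler scaled to 0, every probe fails, so the failure
	// counter reaches the threshold and the daemon is restarted in a loop.
	failures := 0
	for probe := 0; probe < 3; probe++ {
		if healthStatus(0) == "NOT_SERVING" {
			failures++
		}
	}
	fmt.Println(healthStatus(0))
	fmt.Println(kubeletShouldRestart(failures, 3))
}
```

Deleting the probe breaks this loop: the daemon may still be degraded without a scheduler, but kubelet no longer kills it, so the local Docker proxy keeps working.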
Thanks, I found the relevant feature: #2130
Bug report:
When the number of active schedulers becomes 0, the peer will continuously crash on its own, so machines using Docker mode cannot pull images (the docker proxy still exists, but the daemon crashes).
Here are the logs before and after peer restart
after-crash.log
before-crash.log
Expected behavior:
When the number of active schedulers is 0, the running peer should continue to run normally and back-source on its own.
How to reproduce it:
Scale the scheduler to 0 and wait up to 5 minutes (the Redis cache expiration time).
Environment: