Kubernetes healthcheck gives access denied #386
Comments
@pbwur Thanks, the change must then be in PR #380, #382, #384 or #385. What does the VerneMQ log say? @ashtonian does this ring a bell to you, from the changes to add optional listeners?
I don't see anything in the logging that points to a problem with the healthcheck. When the first pod (of 3) starts, there are a lot of log statements like: vmq_swc_store:handle_info/2:555: Replica meta4: Can't initialize AE exchange due to no peer available. After a while VerneMQ exits. But before that I'm able to execute the healthcheck using http://localhost:8888/health successfully. 2024-05-02T08:53:35.711676+00:00 [debug] <0.292.0> vmq_swc_store:handle_info/2:555: Replica meta9: Can't initialize AE exchange due to no peer available
Those "Replica" logs are normal when you have debug log level on.
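For readers hitting the same log noise: these per-replica AE-exchange messages only appear at debug level. A minimal sketch of raising the console log level through the official Docker image's environment-variable mapping; the DOCKER_VERNEMQ_ prefix convention is the image's documented mechanism, but verify the exact variable name against your image version:

    # The vernemq Docker image maps DOCKER_VERNEMQ_FOO__BAR env vars to
    # foo.bar entries in vernemq.conf; log.console.level=info hides the
    # per-replica "Can't initialize AE exchange" debug messages.
    docker run -e DOCKER_VERNEMQ_ACCEPT_EULA=yes \
      -e DOCKER_VERNEMQ_LOG__CONSOLE__LEVEL=info \
      vernemq/vernemq:2.0.0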
Probably need to add this back:
@ashtonian Thanks, I reverted this here: #387
@pbwur I have now uploaded 2.0.0 images with a tentative fix to Docker Hub. Can you test one of those to check whether the Kubernetes health check works now?
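One way to pull the tentative-fix image into an existing Helm deployment is sketched below; image.tag is assumed to be the chart's value path for the image tag, and the release and pod names are likewise assumptions, so check your values.yaml first:

    # Switch the running release to the freshly pushed 2.0.0 image and
    # watch the StatefulSet roll; then check the readiness probe result.
    helm upgrade vernemq vernemq/vernemq --reuse-values --set image.tag=2.0.0
    kubectl rollout status statefulset/vernemq
    kubectl describe pod vernemq-0 | grep -A3 Readiness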
@ioolkos, it seems to work now. All 3 nodes of the cluster are starting. Thanks for the great response! Although probably not related, I do get an error with the second node after the first node starts successfully. After I delete the PersistentVolumeClaim and start the cluster again, everything is OK. This is part of the logging:
2024-05-03T09:00:36.793105+00:00 [info] <0.686.0> vmq_diversity_app:start/2:85: enable auth script for postgres "./share/lua/auth/postgres.lua"
Runtime terminating during boot ({{badkey,{'VerneMQ@vernemq-1.vernemq-headless.mdtis-poc-mqtt.svc.cluster.local',<<34,100,99,27,209,16,239,117,147,202,59,36,181,234,60,253,91,83,95,77>>}},[{erlang,map_get,[{'VerneMQ@vernemq-1.vernemq-headless.mdtis-poc-mqtt.svc.cluster.local',<<34,100,99,27,209,16,239,117,147,202,59,36,181,234,60,253,91,83,95,77>>},#{}],[{error_info,#{module=>erl_erts_errors}}]},{vmq_swc_plugin,'-summary/1-lc$^1/1-1-',3,[{file,"/opt/vernemq/apps/vmq_swc/src/vmq_swc_plugin.erl"},{line,220}]},{vmq_swc_plugin,'-summary/1-lc$^1/1-1-',3,[{file,"/opt/vernemq/apps/vmq_swc/src/vmq_swc_plugin.erl"},{line,220}]},{vmq_swc_plugin,history,1,[{file,"/opt/vernemq/apps/vmq_swc/src/vmq_swc_plugin.erl"},{line,230}]},{vmq_swc_peer_service,attempt_join,1,[{file,"/opt/vernemq/apps/vmq_swc/src/vmq_swc_peer_service.erl"},{line,57}]},{vmq_server_cli,'-vmq_cluster_join_cmd/0-fun-1-',3,[{file,"/opt/vernemq/apps/vmq_server/src/vmq_server_cli.erl"},{line,516}]},{clique_command,run,1,[{file,"/opt/vernemq/_build/default/
Crash dump is being written to: /erl_crash.dump...
[os_mon] memory supervisor port (memsup): Erlang has closed
@pbwur I have the same issue as the one you describe in your last comment above: when restarting a pod of the VerneMQ StatefulSet, I get the exact same error; only after deleting the PVC (and the underlying PV) and restarting the pod does it come up again. This issue started with 2.0.0; I did not have it with 1.13. Did you, by any chance, resolve that issue on your side? If yes, I would be thankful to hear how :)
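Spelled out, the workaround both reports describe looks roughly as follows; the PVC and pod names assume the default StatefulSet naming (data-<pod-name>), so list them with kubectl get pvc,pods first:

    # WARNING: this deletes the node's persisted metadata. Only do it for
    # the pod that crash-loops with the badkey error on startup.
    kubectl delete pvc data-vernemq-1    # PVC name assumed: data-<pod-name>
    kubectl delete pod vernemq-1         # the StatefulSet recreates it with a fresh PV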
@pbwur @hsudbrock Currently looking into the PVC-related start error; it looks like some sort of regression. The following setting in …
Hi @hsudbrock and @ioolkos, apologies for the late response. That issue still happened here as well.
@pbwur (translate …
Thanks for the hint and the PR for fixing the issue! For me it looks good so far: disabling the non-empty join check has resulted in no errors when restarting my VerneMQ cluster.
Hi,
I'm using version 2.0.0 of VerneMQ with the Helm chart. Unfortunately, the pod in Kubernetes remains unhealthy. The error message is:
Readiness probe failed: Get "http://10.244.76.200:8888/health": dial tcp 10.244.76.200:8888: connect: connection refused
From within the pod, using curl with the URL http://localhost:8888/health, the response is as expected: {"status":"OK"}
It seems the IP address used is the problem.
Version 2.0.0-rc1 works fine, so I'm looking for the difference here.
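curl succeeding against localhost while the kubelet's probe to the pod IP is refused is the classic symptom of the HTTP listener binding to 127.0.0.1 only. A sketch of how to confirm this from inside the pod; the vernemq.conf line below uses the standard HTTP listener syntax, but the listener name in a given deployment may differ:

    # List listeners and the addresses they are bound to.
    kubectl exec vernemq-0 -- vmq-admin listener show
    # A health listener bound to 127.0.0.1 answers localhost curl but
    # refuses the kubelet, which connects to the pod IP (10.244.76.200 here).
    # Assumed fix, in vernemq.conf syntax (or the equivalent Docker env var):
    #   listener.http.default = 0.0.0.0:8888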