Replies: 67 comments 2 replies
-
This is indeed a very useful suggestion. I have also been thinking about how to do this for some time. Here are a couple of comments from my own experience.

First, I wouldn't advise using external tools for this. Instead, I propose implementing a simple healthcheck routine in the Portainer binary itself, which Docker can then invoke during healthchecks. In this case, Portainer can dial itself to request a status update and return the appropriate result and exit level: success if the HTTP code is 2XX, failure otherwise. Luckily, Portainer already implements a status API endpoint that can be leveraged for this proposal. Therefore we just need to implement a simple flag, e.g. `--healthcheck`.
With the above in place, healthchecks can then be enabled in a Portainer stack with the following:

```yaml
healthcheck:
  test: ['CMD', 'portainer', '--healthcheck']
```

For reference, this is also how the Kong API Gateway does its healthcheck. Moreover, this same approach can be implemented for the Portainer Agent binary. @itsconquest, if you and the Portainer team agree with this idea, I can work on it relatively quickly, as it doesn't involve working with UI elements and I can easily test it on my side.
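As a rough illustration of the proposed flag's behaviour (everything below is an assumption, since no implementation exists yet), the essential logic is just mapping the HTTP status class of the self-request to a process exit code:

```shell
# Hypothetical sketch of what `portainer --healthcheck` could do:
# probe the existing status endpoint and convert the HTTP status code
# into an exit level (0 = healthy, non-zero = unhealthy).
# `http_code_to_exit` is an illustrative helper, not real Portainer code.
http_code_to_exit() {
  case "$1" in
    2[0-9][0-9]) echo healthy;   return 0 ;;
    *)           echo unhealthy; return 1 ;;
  esac
}

# In the real flag, the code would come from a request to Portainer's
# own status endpoint, e.g. http://localhost:9000/api/system/status
http_code_to_exit 200           # prints "healthy", exits 0
http_code_to_exit 500 || true   # prints "unhealthy", exits 1
```

Docker treats any non-zero exit from the healthcheck command as unhealthy, so this mapping is all the glue that is needed.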
-
@ElleshaHackett @hhromic This would indeed be a nice way to go.
-
@Ornias1993 No, I have not started working on this :) @deviantony @itsconquest Now that I've become more familiar with the Portainer codebase, perhaps I can code a prototype and submit it as a PR for review?
-
@hhromic Ahh, okay... Happens to the best of us :) I read through most of the previous discussions about it. I think the fastest way to get feedback is indeed to throw in a prototype and work from there. 👍
-
Alright then, I'll put a prototype together this week and see how it goes!
-
Sounds like a good idea! I look forward to reviewing your work @hhromic :)
-
It could also be good to have control over the image's healthcheck, or even to disable it, per https://docs.docker.com/engine/reference/run/#healthcheck
-
@rhuanbarreto You can always override it in Docker, so that's a given.
-
Yes, but is it possible to do it in Portainer?
-
That's not the scope of this issue; there is another issue for handling healthchecks inside Portainer, though.
-
Actually, this was already implemented well before this issue... and got reverted just because it isn't compatible with the --ssl flag (which makes it unsuitable to add to the Dockerfile).
-
Hey guys, I just stumbled across this. Was there any movement on the --healthcheck flag? I understand there were a few issues with the previous solution. Thanks!
-
Maintainers are not interested, it seems.
-
Would really like this feature also. It's a little odd that a platform designed for managing and monitoring your Docker containers doesn't include the option to monitor itself. 🤷‍♂️
-
@hhromic Were there any updates on your end?
-
Thank you for that information. I will dig deeper. I did try the environment variable and still have the same issue. Definitely strange. And I never saw that environment line in the code sample Portainer provided us, since it's also run in the command section, per the Portainer Swarm setup.
-
Setting `AGENT_CLUSTER_ADDR: localhost` seemed to work. For some reason, DNS doesn't resolve properly in the healthcheck.
-
After a while, I had this issue on the agents. I think the agents got restarted but then couldn't start due to a DNS problem, and the UI reported this too.

I tried something similar, but it doesn't work in a Docker swarm, although for a single-node swarm or plain services it should be OK. What I'm looking at now is how to trigger all the Portainer containers to restart if one of the agents fails the healthcheck, maybe with a separate Docker container monitoring them, or by updating the healthcheck to trigger the parent Docker host to relaunch the containers. FYI, I've also updated the comment with a healthcheck API call so you know it's up and running for the UI.
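The "separate Docker container monitoring them" idea can be sketched as a small retry loop. Everything below is a placeholder sketch, not tested against a real swarm; the probe command, retry count, and service name are invented:

```shell
# Sketch of an external monitor. A real version would probe each agent,
# e.g. with `wget --spider` against its port, instead of the injected
# command used here for illustration.
probe_with_retries() {
  max=$1; shift
  n=0
  while [ "$n" -lt "$max" ]; do
    if "$@"; then echo healthy; return 0; fi
    n=$((n + 1))
  done
  echo unhealthy; return 1
}

probe_with_retries 3 true           # prints "healthy"
probe_with_retries 3 false || true  # prints "unhealthy" after 3 failed tries
# On "unhealthy", the monitor could force a redeploy, for example:
#   docker service update --force portainer_agent   # hypothetical service name
```

`docker service update --force` restarts a service's tasks even when nothing in its spec changed, which matches the "trigger the containers to relaunch" idea above.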
-
With the above it works. (Portainer still needs a proper health endpoint, though.)
-
@lonix1 I prefer to call the status endpoint. It's pretty much doing what a healthcheck endpoint would do, just giving more info about the status 🚀
-
@t0mtaylor I didn't consider the log. Good idea. So, to be complete, in a script I'd do something like this:

```shell
[ $(wget --quiet -O- --tries=1 http://localhost:9000/api/system/status \
    | sed -nE 's/.*Version":"([^"]*)".*/\1/p' | wc -l) = 1 ] \
  && echo up || echo down
```

That not only checks that the page exists, but also that it is returning the expected data (I've extracted the `Version` field). However, in a compose file I'd do something simpler:

```yaml
healthcheck:
  # ...
  test: wget --no-verbose --tries=1 --spider http://localhost:9000/api/system/status || exit 1
```
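The `sed` extraction in that script can be sanity-checked offline against a canned payload. The JSON below is a hand-written stand-in for the status response shape, not real Portainer output:

```shell
# Feed a sample payload through the same sed pipeline as above.
sample='{"Version":"2.0.0","InstanceID":"abcdef"}'   # assumed shape
version=$(echo "$sample" | sed -nE 's/.*Version":"([^"]*)".*/\1/p')
echo "$version"   # prints 2.0.0

# The `wc -l` trick: exactly one matched line means the endpoint
# returned something that looks like a version, i.e. "up".
count=$(echo "$sample" | sed -nE 's/.*Version":"([^"]*)".*/\1/p' | wc -l)
[ "$count" -eq 1 ] && echo up || echo down   # prints "up"
```

A payload without a `Version` key produces zero matched lines, so the same test reports "down" even when the HTTP request itself succeeded.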
-
@lonix1 Yeah, I would keep it simple for the healthcheck, as it's giving you enough to determine it's healthy. I do something similar, checking the version in a bash script that verifies services are running every 5 minutes and also checks how many containers are running per service, as Docker can still be a bit flaky and services vanish from the swarm!

I've updated the main comment #3572 (comment), as there's an issue with the healthcheck for agents when running in swarm mode. But running single-node on a Raspberry Pi, for example, both healthchecks for the UI and agents work, as @sgtcoder has confirmed on his setup 👍
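The "containers running per service" check can be sketched by comparing running vs. desired replicas. The input below is canned text in the shape of `docker service ls` output; the service names and counts are invented:

```shell
# In a real script the input would come from:
#   docker service ls --format '{{.Name}} {{.Replicas}}'
sample='portainer_agent 3/3
portainer_portainer 0/1'

# Split each "running/desired" pair and flag any shortfall.
echo "$sample" | while read -r name replicas; do
  running=${replicas%/*}
  desired=${replicas#*/}
  if [ "$running" = "$desired" ]; then
    echo "$name ok"
  else
    echo "$name degraded"
  fi
done
# prints "portainer_agent ok" then "portainer_portainer degraded"
```

A cron entry running this every 5 minutes, alerting (or forcing a redeploy) on any "degraded" line, approximates the script described above.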
-
Thank you guys for all the updates. I applied a bunch of the suggestions. I still had to use localhost on a single swarm node, but it seems to work aside from the TLS handshake log errors. I had issues in general using more than one Docker swarm node when trying to replicate storage, with both performance problems and overhead, so I just stick with one node for now. A start period of 5 seconds seems to be fine for me. I'm running on a dedicated HPE DL380 Gen9 server with the Docker VM configured with 32 GB RAM and 32 vCPUs. Here is what I have now:
-
@sgtcoder Try the wget for the agent healthcheck; that will remove the TLS handshake errors :)

These healthchecks work. As a workaround, I have a separate bash script checking with Docker that the agent containers are up and running on each server, and I've exposed port 9001 so I can wget that on each server as well. Not ideal, but a way forward until @tamarahenson and team improve the agent. Ideally, they would add a plain HTTP endpoint for this.
-
I tried the wget again, but for whatever reason that causes the check to fail, whereas the nc command works.
-
@sgtcoder Have you tried the wget via sh in the container while the agent is running? What's the output? Does it have an error?

Using `docker exec` returned a shell ready to use on the agent container. My output is this: it's an error 400, but that's good, as it hit the agent on port 9001.
Beta Was this translation helpful? Give feedback.
-
```shell
docker exec -it agent sh
/app # wget --no-check-certificate --no-verbose --tries=3 --spider --header='Content-Type:application/json' https://localho
```

That's the way it works: you need to specify httpS, not just http, and this way it spawns no extra SSL log warnings like "http: TLS handshake error from 172.24.0.1:54186: tls: first record does not look like a TLS handshake".
-
Is there a solution for Portainer only, without its agent? I am unable to attach to the container, as if it does not have bash or sh.
-
Hey, in case it wasn't obvious to anyone reading this thread: in order to use healthchecks for the portainer and portainer-agent containers, you'll need the Alpine image variants to be able to perform these checks inside the containers.
-
Describe the feature
Being able to see a "health status" of the Portainer Docker container.
Describe the solution you'd like
I would like support for the Docker healthcheck (which is also shown in Portainer.io's own dashboard and probably in other Docker management software).
Describe alternatives you've considered
The alternative is setting up something similar without using the tools that already exist within Docker.
Additional context
The Dockerfile could contain something like this:

```dockerfile
HEALTHCHECK --interval=60s --timeout=10s --retries=3 CMD curl -sS http://localhost:9000 || exit 1
```

For debugging and testing purposes you can use:

```shell
docker inspect --format "{{json .State.Health}}" containername
```
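Building on that debugging tip, the inspect output can be reduced to just the status string. The JSON below is a hand-written stand-in for what `docker inspect` returns, not captured output:

```shell
# Extract .Status from a sample {{json .State.Health}} payload.
# Against a live container you could skip the parsing entirely with:
#   docker inspect --format "{{.State.Health.Status}}" containername
health='{"Status":"healthy","FailingStreak":0,"Log":[]}'   # sample payload
status=$(echo "$health" | sed -nE 's/.*"Status":"([^"]*)".*/\1/p')
echo "$status"   # prints "healthy"
```

The `Status` field cycles through `starting`, `healthy`, and `unhealthy`, so scripting against this one string is usually enough for external monitoring.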