-
Notifications
You must be signed in to change notification settings - Fork 23.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Support of healthcheck in docker_container module #46772
Add Support of healthcheck in docker_container module #46772
Conversation
The test
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
Looks good. Besides all the little things I commented on, there's one thing which is not good: the option is not checked for idempotency.
If the healthcheck details can only be set on container creation, you need to check Container.has_different_configuration()
. If the healthcheck details can be changed during runtime, it must be in Container.has_different_resource_limits()
/ TaskParameters.update_parameters()
/ Container.update_limits()
(in that case, we should rename some of the functions).
type: dict | ||
description: | ||
- Configure a check that's run to determine whether container is healthy | ||
- 'I(test) - Test to perform to determine container health.' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should document the elements in the dict better. It is not clear from this short description that test
is a list of strings, which is the command which will be executed (and what CMD
in front of it does). Also, it is unclear to me how this command reports the health (return code? timeout?). If it is too complicated to describe it here, you should link to a document where this is explained in detail (docker documentation?) and say that it is similar to there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general: you should also document default values, if there are any.
description: | ||
- Configure a check that's run to determine whether container is healthy | ||
- 'I(test) - Test to perform to determine container health.' | ||
- 'I(interval) - The time to wait between check.' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here (or somewhere else) you should describe in which format time can be specified.
image: image_name | ||
state: started | ||
healthcheck: | ||
test: ["CMD", "curl", "-f", "http://localhost"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you should add a short comment explaining what this test achieves. I.e. something like Tries to make a HTTP GET to localhost. If this fails or timeouts, the healthcheck fails.
(if that's correct).
if self.healthcheck.get('test') is None: | ||
return None | ||
|
||
if self.healthcheck.get('interval') is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't you create a new dict which only contains the parameters you support?
parts = regex.match(time_str) | ||
|
||
if not parts: | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you can't parse it, you should fail instead of returning None
.
''' | ||
Return time duration in nanosecond | ||
''' | ||
regex = re.compile(r'((?P<hours>\d+?)hr)?((?P<minutes>\d+?)m)?((?P<seconds>\d+?)s)?') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't you allow fractions of a second? (Not that it makes too much sense, but after all, docker expects nanoseconds.)
@@ -753,6 +775,7 @@ def __init__(self, client): | |||
self.exposed_ports = None | |||
self.force_kill = None | |||
self.groups = None | |||
self.healthcheck = dict() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use dict()
here; use None
. It will be overwritten anyway.
@@ -845,6 +868,10 @@ def __init__(self, client): | |||
self.ulimits = self._parse_ulimits() | |||
self.sysctls = self._parse_sysctls() | |||
self.log_config = self._parse_log_config() | |||
|
|||
if self.healthcheck: | |||
self.healthcheck = self._parse_healthcheck_options() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the if
and make _parse_healthcheck_options()
handle self.healthcheck == None
.
|
||
return self.healthcheck | ||
|
||
def convert_duration_to_nanosecond(self, time_str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a private function, so its name should start with a _
.
2b881f1
to
04565c3
Compare
The test
The test
The test
The test
The test
The test
The test
The test
|
04565c3
to
2329d24
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idempotency question is still open. Also, there's something you have to think about: how can the user indicate that an existing healthcheck for a container should be removed? There must be three states:
- Healthcheck is explicitly added
- Healthcheck is explicitly removed (maybe indicate as empty dict?)
- Healthcheck is ignored (when parameter
healthcheck
isNone
/ not specified)
- 'Configure a check that is run to determine whether or not containers for this service are "healthy". | ||
See the docs for the L(HEALTHCHECK Dockerfile instruction,https://docs.docker.com/engine/reference/builder/#healthcheck) | ||
for details on how healthchecks work.' | ||
- 'I(test) - Command to run to check health. C(test) must be either string or list. If it is list first item must |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C(test) must be either *a* sting or *a* list. If it is *a* list, *the* first item must be *one of* C(NONE), ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad. I will update this. Thank you @felixfontein for pointing.
duration_options = ['interval', 'timeout', 'start_period'] | ||
|
||
for (key, value) in supported_healthcheck_options.items(): | ||
if not self.healthcheck.get(key): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should write if key not in self.healthcheck:
to avoid confusing falsish values with non-specified values.
# Convert provided durations into nanoseconds | ||
self.healthcheck[key] = self._convert_duration_to_nanosecond(self.healthcheck[key]) | ||
|
||
return self.healthcheck |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again: why don't you create a new dict which only contains the parameters you support?
@@ -931,6 +963,7 @@ def create_parameters(self): | |||
labels='labels', | |||
stop_signal='stop_signal', | |||
working_dir='working_dir', | |||
healthcheck='healthcheck', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
healthcheck
was only added in docker-py 2.0 (see changelog), while we support also older versions of docker-py. You have to make sure that this parameter is only added for docker-py 2.0 or newer (and inform the user that if healthcheck
is specified and docker-py < 2.0, that the value will be ignored -- see other examples in docker_container
, like init
).
eb216eb
to
f837373
Compare
time_params[name] = int(value) | ||
|
||
time_in_seconds = timedelta(**time_params).seconds | ||
time_in_nanoseconds = time_in_seconds * 1000000000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you ignore the fractional parts (milliseconds, microseconds). You need to add 1000 * timedelta(**time_params).microseconds
. (Also, you should store timedelta(**time_params)
in a variable, instead of time_in_seconds
, and work with that one.)
I.e. something like:
time = timedelta(**time_params)
time_in_nanoseconds = (time.seconds * 1000000 + time.microseconds) * 1000
69cf25d
to
c597b44
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! I think the only unsolved problem is: how to remove healthcheck from an existing container? Specifying an empty dict for healthcheck
does not remove it (since it is interpreted the same as when it is not specified).
Two more things:
As soon as you added a way to disable healthchecks, you should also add that to the test (i.e. the sequence should be enable, idempotency, change, disable) :) |
Regarding implementing disabling: maybe you should define Then you can add |
The error in rhel/7.5/2 seems to be a random fluke (the real test didn't even start); you can trigger a restart of the CI by closing and re-opening the PR. |
(It's also possible to restart individual CI tests, but that can only be done by very few people, and the chances finding someone right now aren't too good :) ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, up to two minor improvements (one bug if the healthcheck interval is 1 day or more, and one additional test).
time_params[name] = int(value) | ||
|
||
time = timedelta(**time_params) | ||
time_in_nanoseconds = (time.seconds * 1000000 + time.microseconds) * 1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, there's one thing we forgot: a timedelta
object also has days
, and seconds are up to one day. Fortunately, there's an easier way to do this all:
time_in_nanoseconds = (time.seconds * 1000000 + time.microseconds) * 1000 | |
time_in_nanoseconds = int(time.total_seconds() * 1000000000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Using ((time.days * 3600 * 24 + time.seconds) * 1000000 + time.microseconds) * 1000
is more exact, but that exactness is only relevant if the time interval is more than 270 years or so, which seems slightly unreasonable for docker healthchecks ;-) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with changes. Thanks @felixfontein for suggestion.
@@ -2381,6 +2512,9 @@ def _parse_comparisons(self): | |||
# Add implicit options | |||
comparisons['publish_all_ports'] = dict(type='value', comparison='strict', name='published_ports') | |||
comparisons['expected_ports'] = dict(type='dict', comparison=comparisons['published_ports']['comparison'], name='expected_ports') | |||
comparisons['disable_healthcheck'] = dict(type='value', | |||
comparison='ignore' if comparisons['healthcheck']['comparison'] == 'ignore' else 'strict', | |||
name='disable_healthcheck') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for splitting this :) I forgot to run flake8
manually...
retries: 3 | ||
stop_timeout: 1 | ||
register: healthcheck_3 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's one more test which could be relevant (since healthcheck parsing is pretty complex, it could happen that not specifying a healthcheck is treated as healthcheck changed or removed, which we want to avoid):
- name: healthcheck (no change) | |
docker_container: | |
image: alpine:3.8 | |
command: '/bin/sh -c "sleep 10m"' | |
name: "{{ cname }}" | |
state: started | |
stop_timeout: 1 | |
register: healthcheck_4 |
(The healthcheck_
numbers have to be adjusted below, and a healthcheck_4 is not changed
has to be added to the assert.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense. I will update healthcheck tests.
`timedelta` object has days too and seconds are up to one day. Therefore use `total_seconds()` to convert time into seconds. Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com>
This is to avoid the situation when healthcheck is not specified and treat this as healthcheck is changed or removed. Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com>
The failed test is completely unrelated to the PR itself. I've asked on IRC if someone can restart it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shipit
@kassiansun Can you also take a look at this one?
state: started | ||
healthcheck: | ||
# The "NONE" check needs to be specified | ||
test: ["NONE"] | ||
''' | ||
|
||
RETURN = ''' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is RETURN not impacted by this change at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it simply returns the equivalent of docker inspect
as returned by the API, so it already returned healthcheck information (even though the module didn't allow to modify/set it, in case the container was created by something else with a healthcheck, that information was already available here).
|
||
duration_options = ['interval', 'timeout', 'start_period'] | ||
|
||
for (key, value) in options.items(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use a list
as key, value
are all same? for option in options
seems make later logic clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the only reason would be if we'd rename something in the future. But yeah, a list is probably better.
for (key, value) in options.items(): | ||
if value in self.healthcheck: | ||
if value in duration_options: | ||
time = self._convert_duration_to_nanosecond(self.healthcheck.get(value)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we pass it into the result
if time
is zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I think so. If the user specified zero, it probably has a good reason (or should fail if docker doesn't like it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When specify healthcheck parameter to zero, eg. --health-interval=0s
in following example
docker run --name=nginx-proxy -d --health-cmd='curl --fail http://nginx.host.com || exit 1' --health-interval=0s nginx:1.13
Then check docker inspect <container id>
, It does not displays that parameter.
...
"Healthcheck": {
"Test": [
"CMD-SHELL",
"curl --fail http://nginx.host.com || exit 1"
]
},
Does it mean docker daemon ignore such parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this is golang
's omitempty
: https://github.com/moby/moby/blob/8e610b2b55bfd1bfa9436ab110d311f5e8a74dcb/api/types/container/config.go#L28
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, we'd also better drop it, otherwise we'll get idempotency problems...
result[key] = time | ||
elif self.healthcheck.get(value): | ||
result[key] = self.healthcheck.get(value) | ||
if key == 'test': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can reduce the level of indents:
elif option == 'test':
elif option == 'retries':
else:
result[option] = self.healthcheck.get(option)
And I don't thinks someone will be bored enough to specify a test: ""
to get into the elif self.healthcheck.get(value):
check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that could improve it a bit. It will be still a bit complex, though :)
(Also, never underestimate users which will specify something which doesn't make sense ;) )
elif key == 'retries': | ||
try: | ||
result[key] = int(result[key]) | ||
except Exception as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we detect such a stupid mistake from users xD?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. It could happen that someone templates something and gets the wrong value. If docker_container barfs with an exception, we get an Issue, if it returns a nice error, the user will hopefully solve the problem without filing an Issue ;)
self.fail('Cannot parse number of retries for healthcheck. ' | ||
'Expected an integer, got "{0}".'.format(result[key])) | ||
|
||
if result['test'] == ['NONE']: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the comparisons
can handle such kind of usage @felixfontein ? And per the documentation, NONE
seems only used in Dockerfile
for overriding the parent configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit more complicated. test: ['NONE']
is equivalent to not having a healthcheck, which is something comparsions
doesn't understand. We decided to use the test: ['NONE']
syntax to disable healthchecks because the alternatives are kind of strange. Accepting healthcheck: None
doesn't work (since that means "I don't care if the container has a healthcheck"), and healthcheck: { }
(which we had before) is also kind of strange (and still needs the extra complexity with disable_healthcheck
since the default comparison for dicts, allow_more_present
, won't detect disabling in this case).
Having the explicit test: ['NONE']
seems like the best way (and the most compatbile one with comparisons
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think healthcheck: None
and comparisons: heathcheck: strict
is enough for this?
Well, I prefer using existing functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because None
is the same as not specified, and thus None
always means that no comparison is done. We really need something which is explicitly not None
.
@@ -2434,11 +2568,17 @@ def __init__(self, **kwargs): | |||
if self.module.params.get("runtime") and not runtime_supported: | |||
self.fail('docker API version is %s. Minimum version required is 1.12 to set runtime option.' % (docker_api_version,)) | |||
|
|||
healthcheck_supported = LooseVersion(docker_version) >= LooseVersion('2.0') | |||
if self.module.params.get("healthcheck") and not healthcheck_supported: | |||
self.module.warn("docker or docker-py version is %s. Minimum version required is 2.0 to set healthcheck option. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should better fail here? It will be more consistent with other non-supported options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it should be fail
instead of warn
.
@@ -1584,6 +1701,8 @@ def has_different_configuration(self, image): | |||
volumes_from=host_config.get('VolumesFrom'), | |||
working_dir=config.get('WorkingDir'), | |||
publish_all_ports=host_config.get('PublishAllPorts'), | |||
expected_healthcheck=config.get('Healthcheck'), | |||
disable_healthcheck=(not config.get('Healthcheck') or config.get('Healthcheck').get('Test') == ['NONE']), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the disable_healthcheck
is redundant, we can leave the logic to comparisons
and users to handle it. Introducing this logic might also conflict with docker
's original logic, as docker run --health-cmd ANYTHING
will always generate a "CMD-SHELL" entry in HealthCheck.Test
(tested on CentOS with docker 1.13).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't really work that way, since docker indicates "no healthcheck" by returning None
(I think), which isn't compatible to what comparisons
does.
I don't understand why this logic (you mean disable_healthcheck
?) conflicts with docker
's original logic?
In any way, I think we have to convert the string syntax for test
to CMD-SHELL
, otherwise idempotency won't work. That's a good point! (We should also have a test for it.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, so I think we just need to strict
compare these two options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any way, I think we have to convert the string syntax for test to CMD-SHELL
Right. There could be easy fix, like:
if isinstance(result[key], (tuple, list)):
result[key] = [str(e) for e in result[key]]
else:
result[key] = ["CMD-SHELL", str(result[key])]
I will update code for this along with test for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that looks good. In particular, add an idempotency test for this one.
BTW, I think there're too many |
I guess the reason is that in Ansible, and docker_container in particular, there's a big difference between |
Also add another test case to check idempotency when healthcheck test is specified as string Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com>
This is to make more consistent with other non-supported options. Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com>
shipit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shipit
@akshay196 thanks a lot for working on this! This new feature will show up in Ansible 2.8. |
@felixfontein @kassiansun @gundalow Thank you very much. I really appreciate your help. :) |
* Add Support of healthcheck in docker_container module Fixes ansible#33622 Now container can be started with healthcheck enabled Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com> * Extend docker_container healthcheck (#1) * Allowing to disable healthcheck. * Added test for healthcheck. * Make sure correct types are used. * Healthcheck needs to be explicitly disabled with test: ['NONE']. * pep8 fixes Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com> * Fix bug if healthcheck interval is 1 day or more `timedelta` object has days too and seconds are up to one day. Therefore use `total_seconds()` to convert time into seconds. Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com> * Add test for healthcheck when healthcheck is not specified This is to avoid the situation when healthcheck is not specified and treat this as healthcheck is changed or removed. Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com> * Convert string syntax for healthcheck test to CMD-SHELL Also add another test case to check idempotency when healthcheck test is specified as string Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com> * Playbook fails if minimun docker version is not satisfy for healthcheck This is to make more consistent with other non-supported options. Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com>
Now container can be started with healthcheck enabled
SUMMARY
Fixes #33622
ISSUE TYPE
COMPONENT NAME
docker_container
ANSIBLE VERSION
ADDITIONAL INFORMATION
Container started with healtcheck enabled using docker_container module.