Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Support of healthcheck in docker_container module #46772

Merged
merged 7 commits into from
Oct 24, 2018

Conversation

akshay196
Copy link
Contributor

Now container can be started with healthcheck enabled

SUMMARY

Fixes #33622

ISSUE TYPE
  • Feature Pull Request
COMPONENT NAME

docker_container

ANSIBLE VERSION
ansible 2.8.0.dev0 (docker-healthcheck-fix-33622 71d323d402) last updated 2018/10/10 20:45:20 (GMT +550)
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/akshay/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/akshay/source/ansible/lib/ansible
  executable location = /home/akshay/source/ansible/bin/ansible
  python version = 3.6.6 (default, Jul 19 2018, 14:25:17) [GCC 8.1.1 20180712 (Red Hat 8.1.1-5)]
ADDITIONAL INFORMATION

Container started with healtcheck enabled using docker_container module.

- name: Run nginx docker container
  docker_container:
    name: nginx-example1
    image: nginx:1.13
    state: started
    healthcheck:
      test: ["CMD-SHELL", "curl --fail http://nginx.host.com || exit 1"]
      interval: 5m
      timeout: 3s
      retries: 3
      start_period: 0s
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS                            PORTS               NAMES
841853243dbe        nginx:1.13          "nginx -g 'daemon of…"   2 minutes ago       Up 2 minutes (health: starting)   80/tcp              nginx-example1

@ansibot
Copy link
Contributor

ansibot commented Oct 10, 2018

The test ansible-test sanity --test validate-modules [explain] failed with 1 error:

lib/ansible/modules/cloud/docker/docker_container.py:0:0: E309 version_added for new option (healthcheck) should be 2.8. Currently 0.0

click here for bot help

@ansibot
Copy link
Contributor

ansibot commented Oct 10, 2018

@ansibot ansibot added WIP This issue/PR is a work in progress. Nevertheless it was shared for getting input from peers. affects_2.8 This issue/PR affects Ansible v2.8 ci_verified Changes made in this PR are causing tests to fail. cloud docker feature This issue/PR relates to a feature request. module This issue/PR relates to a module. needs_triage Needs a first human triage before being processed. new_contributor This PR is the first contribution by a new community member. support:community This issue/PR relates to code supported by the Ansible community. labels Oct 10, 2018
Copy link
Contributor

@felixfontein felixfontein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

Looks good. Besides all the little things I commented on, there's one thing which is not good: the option is not checked for idempotency.

If the healthcheck details can only be set on container creation, you need to check Container.has_different_configuration(). If the healthcheck details can be changed during runtime, it must be in Container.has_different_resource_limits() / TaskParameters.update_parameters() / Container.update_limits() (in that case, we should rename some of the functions).

type: dict
description:
- Configure a check that's run to determine whether container is healthy
- 'I(test) - Test to perform to determine container health.'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should document the elements in the dict better. It is not clear from this short description that test is a list of strings, which is the command which will be executed (and what CMD in front of it does). Also, it is unclear to me how this command reports the health (return code? timeout?). If it is too complicated to describe it here, you should link to a document where this is explained in detail (docker documentation?) and say that it is similar to there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general: you should also document default values, if there are any.

description:
- Configure a check that's run to determine whether container is healthy
- 'I(test) - Test to perform to determine container health.'
- 'I(interval) - The time to wait between check.'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here (or somewhere else) you should describe in which format time can be specified.

image: image_name
state: started
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you should add a short comment explaining what this test achieves. I.e. something like Tries to make a HTTP GET to localhost. If this fails or timeouts, the healthcheck fails. (if that's correct).

if self.healthcheck.get('test') is None:
return None

if self.healthcheck.get('interval') is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't you create a new dict which only contains the parameters you support?

parts = regex.match(time_str)

if not parts:
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can't parse it, you should fail instead of returning None.

'''
Return time duration in nanosecond
'''
regex = re.compile(r'((?P<hours>\d+?)hr)?((?P<minutes>\d+?)m)?((?P<seconds>\d+?)s)?')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't you allow fractions of a second? (Not that it makes too much sense, but after all, docker expects nanoseconds.)

@@ -753,6 +775,7 @@ def __init__(self, client):
self.exposed_ports = None
self.force_kill = None
self.groups = None
self.healthcheck = dict()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use dict() here; use None. It will be overwritten anyway.

@@ -845,6 +868,10 @@ def __init__(self, client):
self.ulimits = self._parse_ulimits()
self.sysctls = self._parse_sysctls()
self.log_config = self._parse_log_config()

if self.healthcheck:
self.healthcheck = self._parse_healthcheck_options()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the if and make _parse_healthcheck_options() handle self.healthcheck == None.


return self.healthcheck

def convert_duration_to_nanosecond(self, time_str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a private function, so its name should start with a _.

@ansibot ansibot removed the needs_triage Needs a first human triage before being processed. label Oct 10, 2018
@akshay196 akshay196 force-pushed the docker-healthcheck-fix-33622 branch 2 times, most recently from 2b881f1 to 04565c3 Compare October 11, 2018 17:25
@ansibot
Copy link
Contributor

ansibot commented Oct 11, 2018

The test ansible-test sanity --test ansible-doc --python 2.6 [explain] failed with 1 error:

lib/ansible/modules/cloud/docker/docker_container.py:0:0: has a documentation error formatting or is missing documentation.

The test ansible-test sanity --test ansible-doc --python 2.7 [explain] failed with 1 error:

lib/ansible/modules/cloud/docker/docker_container.py:0:0: has a documentation error formatting or is missing documentation.

The test ansible-test sanity --test ansible-doc --python 3.5 [explain] failed with 1 error:

lib/ansible/modules/cloud/docker/docker_container.py:0:0: has a documentation error formatting or is missing documentation.

The test ansible-test sanity --test ansible-doc --python 3.6 [explain] failed with 1 error:

lib/ansible/modules/cloud/docker/docker_container.py:0:0: has a documentation error formatting or is missing documentation.

The test ansible-test sanity --test ansible-doc --python 3.7 [explain] failed with 1 error:

lib/ansible/modules/cloud/docker/docker_container.py:0:0: has a documentation error formatting or is missing documentation.

The test ansible-test sanity --test docs-build [explain] failed with the error:

Command "/usr/bin/python test/sanity/code-smell/docs-build.py" returned exit status 1.
>>> Standard Error
Command 'make singlehtmldocs' failed with status code: 2
--> Standard Output
cat _themes/srtd/static/css/theme.css | sed -e 's/^[ 	]*//g; s/[ 	]*$//g; s/\([:{;,]\) /\1/g; s/ {/{/g; s/\/\*.*\*\///g; /^$/d' | sed -e :a -e '$!N; s/\n\(.\)/\1/; ta' > _themes/srtd/static/css/theme.min.css
PYTHONPATH=../../lib ../bin/dump_config.py --template-file=../templates/config.rst.j2 --output-dir=rst/reference_appendices/ -d ../../lib/ansible/config/base.yml
mkdir -p rst/cli
PYTHONPATH=../../lib ../bin/generate_man.py --template-file=../templates/cli_rst.j2 --output-dir=rst/cli/ --output-format rst ../../lib/ansible/cli/*.py
PYTHONPATH=../../lib ../bin/dump_keywords.py --template-dir=../templates --output-dir=rst/reference_appendices/ -d ./keyword_desc.yml
PYTHONPATH=../../lib ../bin/plugin_formatter.py -t rst --template-dir=../templates --module-dir=../../lib/ansible/modules -o rst/modules/ 
Makefile:95: recipe for target 'modules' failed
--> Standard Error
Traceback (most recent call last):
  File "../bin/plugin_formatter.py", line 720, in <module>
    main()
  File "../bin/plugin_formatter.py", line 678, in main
    plugin_info, categories = get_plugin_info(options.module_dir, limit_to=options.limit_to, verbose=(options.verbosity > 0))
  File "../bin/plugin_formatter.py", line 269, in get_plugin_info
    doc, examples, returndocs, metadata = plugin_docs.get_docstring(module_path, fragment_loader, verbose=verbose)
  File "/root/ansible/lib/ansible/utils/plugin_docs.py", line 99, in get_docstring
    data = read_docstring(filename, verbose=verbose, ignore_errors=ignore_errors)
  File "/root/ansible/lib/ansible/parsing/plugin_docs.py", line 62, in read_docstring
    data[varkey] = AnsibleLoader(child.value.s, file_name=filename).get_single_data()
  File "/usr/local/lib/python3.6/dist-packages/yaml/constructor.py", line 35, in get_single_data
    node = self.get_single_node()
  File "/usr/local/lib/python3.6/dist-packages/yaml/composer.py", line 36, in get_single_node
    document = self.compose_document()
  File "/usr/local/lib/python3.6/dist-packages/yaml/composer.py", line 55, in compose_document
    node = self.compose_node(None, None)
  File "/usr/local/lib/python3.6/dist-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/usr/local/lib/python3.6/dist-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/usr/local/lib/python3.6/dist-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/usr/local/lib/python3.6/dist-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/usr/local/lib/python3.6/dist-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/usr/local/lib/python3.6/dist-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/usr/local/lib/python3.6/dist-packages/yaml/composer.py", line 82, in compose_node
    node = self.compose_sequence_node(anchor)
  File "/usr/local/lib/python3.6/dist-packages/yaml/composer.py", line 110, in compose_sequence_node
    while not self.check_event(SequenceEndEvent):
  File "/usr/local/lib/python3.6/dist-packages/yaml/parser.py", line 98, in check_event
    self.current_event = self.state()
  File "/usr/local/lib/python3.6/dist-packages/yaml/parser.py", line 393, in parse_block_sequence_entry
    "expected <block end>, but found %r" % token.id, token.start_mark)
yaml.parser.ParserError: while parsing a block collection
  in "<unicode string>", line 140, column 7:
          - Configure a check that's run t ... 
          ^
expected <block end>, but found '<scalar>'
  in "<unicode string>", line 143, column 97:
     ... be either string or list. If it's list first item must
                                         ^
make: *** [modules] Error 1

The test ansible-test sanity --test validate-modules [explain] failed with 33 errors:

lib/ansible/modules/cloud/docker/docker_container.py:0:0: E324 Value for "default" from the argument_spec ('0') for "memory" does not match the documentation (None)
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E324 Value for "default" from the argument_spec ('auto') for "api_version" does not match the documentation (None)
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E324 Value for "default" from the argument_spec ('localhost') for "tls_hostname" does not match the documentation (None)
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E324 Value for "default" from the argument_spec ('started') for "state" does not match the documentation (None)
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E324 Value for "default" from the argument_spec ('unix://var/run/docker.sock') for "docker_host" does not match the documentation (None)
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E324 Value for "default" from the argument_spec (60) for "timeout" does not match the documentation (None)
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E324 Value for "default" from the argument_spec (True) for "detach" does not match the documentation (False)
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E324 Value for "default" from the argument_spec (True) for "keep_volumes" does not match the documentation (False)
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "auto_remove" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "cleanup" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "debug" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "detach" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "force_kill" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "ignore_image" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "init" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "interactive" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "keep_volumes" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "oom_killer" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "output_logs" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "paused" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "privileged" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "pull" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "purge_networks" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "read_only" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "recreate" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "restart" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "tls" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "tls_verify" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "trust_image_content" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E325 argument_spec for "tty" defines type="bool" but documentation does not
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E326 Value for "choices" from the argument_spec (['absent', 'present', 'started', 'stopped']) for "state" does not match the documentation ([])
lib/ansible/modules/cloud/docker/docker_container.py:0:0: E326 Value for "choices" from the argument_spec (['no', 'on-failure', 'always', 'unless-stopped']) for "restart_policy" does not match the documentation ([])
lib/ansible/modules/cloud/docker/docker_container.py:157:97: E302 DOCUMENTATION is not valid YAML

The test ansible-test sanity --test yamllint [explain] failed with 1 error:

lib/ansible/modules/cloud/docker/docker_container.py:157:97: error DOCUMENTATION: syntax error: expected <block end>, but found '<scalar>'

click here for bot help

@ansibot ansibot removed the ci_verified Changes made in this PR are causing tests to fail. label Oct 11, 2018
Copy link
Contributor

@felixfontein felixfontein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idempotency question is still open. Also, there's something you have to think about: how can the user indicate that an existing healthcheck for a container should be removed? There must be three states:

  1. Healthcheck is explicitly added
  2. Healthcheck is explicitly removed (maybe indicate as empty dict?)
  3. Healthcheck is ignored (when parameter healthcheck is None / not specified)

- 'Configure a check that is run to determine whether or not containers for this service are "healthy".
See the docs for the L(HEALTHCHECK Dockerfile instruction,https://docs.docker.com/engine/reference/builder/#healthcheck)
for details on how healthchecks work.'
- 'I(test) - Command to run to check health. C(test) must be either string or list. If it is list first item must
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C(test) must be either *a* sting or *a* list. If it is *a* list, *the* first item must be *one of* C(NONE), ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad. I will update this. Thank you @felixfontein for pointing.

duration_options = ['interval', 'timeout', 'start_period']

for (key, value) in supported_healthcheck_options.items():
if not self.healthcheck.get(key):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should write if key not in self.healthcheck: to avoid confusing falsish values with non-specified values.

# Convert provided durations into nanoseconds
self.healthcheck[key] = self._convert_duration_to_nanosecond(self.healthcheck[key])

return self.healthcheck
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again: why don't you create a new dict which only contains the parameters you support?

@@ -931,6 +963,7 @@ def create_parameters(self):
labels='labels',
stop_signal='stop_signal',
working_dir='working_dir',
healthcheck='healthcheck',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

healthcheck was only added in docker-py 2.0 (see changelog), while we support also older versions of docker-py. You have to make sure that this parameter is only added for docker-py 2.0 or newer (and inform the user that if healthcheck is specified and docker-py < 2.0, that the value will be ignored -- see other examples in docker_container, like init).

@akshay196 akshay196 force-pushed the docker-healthcheck-fix-33622 branch 3 times, most recently from eb216eb to f837373 Compare October 14, 2018 17:57
time_params[name] = int(value)

time_in_seconds = timedelta(**time_params).seconds
time_in_nanoseconds = time_in_seconds * 1000000000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you ignore the fractional parts (milliseconds, microseconds). You need to add 1000 * timedelta(**time_params).microseconds. (Also, you should store timedelta(**time_params) in a variable, instead of time_in_seconds, and work with that one.)

I.e. something like:

time = timedelta(**time_params)
time_in_nanoseconds = (time.seconds * 1000000 + time.microseconds) * 1000

@akshay196 akshay196 force-pushed the docker-healthcheck-fix-33622 branch 2 times, most recently from 69cf25d to c597b44 Compare October 15, 2018 19:36
Copy link
Contributor

@felixfontein felixfontein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I think the only unsolved problem is: how to remove healthcheck from an existing container? Specifying an empty dict for healthcheck does not remove it (since it is interpreted the same as when it is not specified).

@felixfontein
Copy link
Contributor

Two more things:

  1. Can you add a small changelog fragment? You can take this one as a template: https://github.com/ansible/ansible/blob/devel/changelogs/fragments/44789-docker_container-comparisons.yaml
  2. Can you add a short integration test? Maybe similar to this one: https://github.com/ansible/ansible/blob/devel/test/integration/targets/docker_container/tasks/tests/options.yml#L1436-L1495

As soon as you added a way to disable healthchecks, you should also add that to the test (i.e. the sequence should be enable, idempotency, change, disable) :)

@felixfontein
Copy link
Contributor

Regarding implementing disabling: maybe you should define parameters.disable_healthcheck as a boolean which is set to True if self.healthcheck is not None but empty (before self.healthcheck = self._parse_healthcheck()), and to None otherwise.

Then you can add disable_healthcheck to config_mapping in has_different_configuration() and set it to True if config.get('Healthcheck') is None or empty. That way (since parameters.disable_healthcheck is either True or None), idempotency handling should work as expected.

@ansibot ansibot removed the ci_verified Changes made in this PR are causing tests to fail. label Oct 21, 2018
@felixfontein
Copy link
Contributor

The error in rhel/7.5/2 seems to be a random fluke (the real test didn't even start); you can trigger a restart of the CI by closing and re-opening the PR.

@felixfontein
Copy link
Contributor

(It's also possible to restart individual CI tests, but that can only be done by very few people, and the chances finding someone right now aren't too good :) )

@akshay196 akshay196 closed this Oct 21, 2018
@akshay196 akshay196 reopened this Oct 21, 2018
Copy link
Contributor

@felixfontein felixfontein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, up to two minor improvements (one bug if the healthcheck interval is 1 day or more, and one additional test).

time_params[name] = int(value)

time = timedelta(**time_params)
time_in_nanoseconds = (time.seconds * 1000000 + time.microseconds) * 1000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, there's one thing we forgot: a timedelta object also has days, and seconds are up to one day. Fortunately, there's an easier way to do this all:

Suggested change
time_in_nanoseconds = (time.seconds * 1000000 + time.microseconds) * 1000
time_in_nanoseconds = int(time.total_seconds() * 1000000000)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Using ((time.days * 3600 * 24 + time.seconds) * 1000000 + time.microseconds) * 1000 is more exact, but that exactness is only relevant if the time interval is more than 270 years or so, which seems slightly unreasonable for docker healthchecks ;-) )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with changes. Thanks @felixfontein for suggestion.

@@ -2381,6 +2512,9 @@ def _parse_comparisons(self):
# Add implicit options
comparisons['publish_all_ports'] = dict(type='value', comparison='strict', name='published_ports')
comparisons['expected_ports'] = dict(type='dict', comparison=comparisons['published_ports']['comparison'], name='expected_ports')
comparisons['disable_healthcheck'] = dict(type='value',
comparison='ignore' if comparisons['healthcheck']['comparison'] == 'ignore' else 'strict',
name='disable_healthcheck')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for splitting this :) I forgot to run flake8 manually...

retries: 3
stop_timeout: 1
register: healthcheck_3

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's one more test which could be relevant (since healthcheck parsing is pretty complex, it could happen that not specifying a healthcheck is treated as healthcheck changed or removed, which we want to avoid):

Suggested change
- name: healthcheck (no change)
docker_container:
image: alpine:3.8
command: '/bin/sh -c "sleep 10m"'
name: "{{ cname }}"
state: started
stop_timeout: 1
register: healthcheck_4

(The healthcheck_ numbers have to be adjusted below, and a healthcheck_4 is not changed has to be added to the assert.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense. I will update healthcheck tests.

`timedelta` object has days too and seconds are up to one day.
Therefore use `total_seconds()` to convert time into seconds.

Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com>
This is to avoid the situation when healthcheck is not specified and
treat this as healthcheck is changed or removed.

Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com>
@felixfontein
Copy link
Contributor

The failed test is completely unrelated to the PR itself. I've asked on IRC if someone can restart it.

Copy link
Contributor

@felixfontein felixfontein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shipit

@kassiansun Can you also take a look at this one?

@ansibot ansibot added community_review In order to be merged, this PR must follow the community review workflow. and removed needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. labels Oct 23, 2018
state: started
healthcheck:
# The "NONE" check needs to be specified
test: ["NONE"]
'''

RETURN = '''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is RETURN not impacted by this change at all?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it simply returns the equivalent of docker inspect as returned by the API, so it already returned healthcheck information (even though the module didn't allow to modify/set it, in case the container was created by something else with a healthcheck, that information was already available here).


duration_options = ['interval', 'timeout', 'start_period']

for (key, value) in options.items():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use a list as key, value are all same? for option in options seems make later logic clearer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the only reason would be if we'd rename something in the future. But yeah, a list is probably better.

for (key, value) in options.items():
if value in self.healthcheck:
if value in duration_options:
time = self._convert_duration_to_nanosecond(self.healthcheck.get(value))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we pass it into the result if time is zero?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think so. If the user specified zero, it probably has a good reason (or should fail if docker doesn't like it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When specify healthcheck parameter to zero, eg. --health-interval=0s in following example

docker run --name=nginx-proxy -d --health-cmd='curl --fail http://nginx.host.com || exit 1' --health-interval=0s   nginx:1.13

Then check docker inspect <container id>, It does not displays that parameter.

...
            "Healthcheck": {
                "Test": [
                    "CMD-SHELL",
                    "curl --fail http://nginx.host.com || exit 1"
                ]
            },

Does it mean docker daemon ignore such parameter?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, we'd also better drop it, otherwise we'll get idempotency problems...

result[key] = time
elif self.healthcheck.get(value):
result[key] = self.healthcheck.get(value)
if key == 'test':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can reduce the level of indents:

elif option == 'test':
elif option == 'retries':
else:
    result[option] = self.healthcheck.get(option)

And I don't thinks someone will be bored enough to specify a test: "" to get into the elif self.healthcheck.get(value): check

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that could improve it a bit. It will be still a bit complex, though :)

(Also, never underestimate users which will specify something which doesn't make sense ;) )

elif key == 'retries':
try:
result[key] = int(result[key])
except Exception as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we detect such a stupid mistake from users xD?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It could happen that someone templates something and gets the wrong value. If docker_container barfs with an exception, we get an Issue, if it returns a nice error, the user will hopefully solve the problem without filing an Issue ;)

self.fail('Cannot parse number of retries for healthcheck. '
'Expected an integer, got "{0}".'.format(result[key]))

if result['test'] == ['NONE']:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comparisons can handle such kind of usage @felixfontein ? And per the documentation, NONE seems only used in Dockerfile for overriding the parent configuration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit more complicated. test: ['NONE'] is equivalent to not having a healthcheck, which is something comparsions doesn't understand. We decided to use the test: ['NONE'] syntax to disable healthchecks because the alternatives are kind of strange. Accepting healthcheck: None doesn't work (since that means "I don't care if the container has a healthcheck"), and healthcheck: { } (which we had before) is also kind of strange (and still needs the extra complexity with disable_healthcheck since the default comparison for dicts, allow_more_present, won't detect disabling in this case).

Having the explicit test: ['NONE'] seems like the best way (and the most compatbile one with comparisons).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think healthcheck: None and comparisons: heathcheck: strict is enough for this?

Well, I prefer using existing functions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because None is the same as not specified, and thus None always means that no comparison is done. We really need something which is explicitly not None.

@@ -2434,11 +2568,17 @@ def __init__(self, **kwargs):
if self.module.params.get("runtime") and not runtime_supported:
self.fail('docker API version is %s. Minimum version required is 1.12 to set runtime option.' % (docker_api_version,))

healthcheck_supported = LooseVersion(docker_version) >= LooseVersion('2.0')
if self.module.params.get("healthcheck") and not healthcheck_supported:
self.module.warn("docker or docker-py version is %s. Minimum version required is 2.0 to set healthcheck option. "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should better fail here? It will be more consistent with other non-supported options.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should be fail instead of warn.

@@ -1584,6 +1701,8 @@ def has_different_configuration(self, image):
volumes_from=host_config.get('VolumesFrom'),
working_dir=config.get('WorkingDir'),
publish_all_ports=host_config.get('PublishAllPorts'),
expected_healthcheck=config.get('Healthcheck'),
disable_healthcheck=(not config.get('Healthcheck') or config.get('Healthcheck').get('Test') == ['NONE']),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the disable_healthcheck is redundant, we can leave the logic to comparisons and users to handle it. Introducing this logic might also conflict with docker's original logic, as docker run --health-cmd ANYTHING will always generate a "CMD-SHELL" entry in HealthCheck.Test(tested on CentOS with docker 1.13).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't really work that way, since docker indicates "no healthcheck" by returning None (I think), which isn't compatible to what comparisons does.

I don't understand why this logic (you mean disable_healthcheck?) conflicts with docker's original logic?

In any way, I think we have to convert the string syntax for test to CMD-SHELL, otherwise idempotency won't work. That's a good point! (We should also have a test for it.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so I think we just need to strict compare these two options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any way, I think we have to convert the string syntax for test to CMD-SHELL

Right. There could be easy fix, like:

if isinstance(result[key], (tuple, list)):
    result[key] = [str(e) for e in result[key]]
else:
    result[key] = ["CMD-SHELL", str(result[key])]

I will update code for this along with test for it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that looks good. In particular, add an idempotency test for this one.

@kassiansun
Copy link
Contributor

BTW, I think there're too many None or some empty thing checks, I thinks this is against python best practice? I'm not familiar with python, at least with ruby we almost never do such kind of detection.

@felixfontein
Copy link
Contributor

BTW, I think there're too many None or some empty thing checks, I thinks this is against python best practice? I'm not familiar with python, at least with ruby we almost never do such kind of detection.

I guess the reason is that in Ansible, and docker_container in particular, there's a big difference between None (which usually means "argument was not specified") and something else (which means something was specified). So there's always a need to distinguish between these cases.

Also add another test case to check idempotency when healthcheck test
is specified as string

Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com>
This is to make more consistent with other non-supported options.

Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com>
@kassiansun
Copy link
Contributor

shipit

Copy link
Contributor

@felixfontein felixfontein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shipit

@ansibot ansibot added shipit This PR is ready to be merged by Core and removed community_review In order to be merged, this PR must follow the community review workflow. labels Oct 24, 2018
@gundalow gundalow merged commit 20b95ad into ansible:devel Oct 24, 2018
@felixfontein
Copy link
Contributor

@akshay196 thanks a lot for working on this! This new feature will show up in Ansible 2.8.
@kassiansun thanks for reviewing!
@gundalow thanks for merging!

@akshay196
Copy link
Contributor Author

@felixfontein @kassiansun @gundalow Thank you very much. I really appreciate your help. :)

Tomorrow9 pushed a commit to Tomorrow9/ansible that referenced this pull request Dec 4, 2018
* Add Support of healthcheck in docker_container module

Fixes ansible#33622
Now container can be started with healthcheck enabled

Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com>

* Extend docker_container healthcheck (#1)

* Allowing to disable healthcheck.

* Added test for healthcheck.

* Make sure correct types are used.

* Healthcheck needs to be explicitly disabled with test: ['NONE'].

* pep8 fixes

Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com>

* Fix bug if healthcheck interval is 1 day or more

`timedelta` object has days too and seconds are up to one day.
Therefore use `total_seconds()` to convert time into seconds.

Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com>

* Add test for healthcheck when healthcheck is not specified

This is to avoid the situation when healthcheck is not specified and
treat this as healthcheck is changed or removed.

Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com>

* Convert string syntax for healthcheck test to CMD-SHELL

Also add another test case to check idempotency when healthcheck test
is specified as string

Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com>

* Playbook fails if minimun docker version is not satisfy for healthcheck

This is to make more consistent with other non-supported options.

Signed-off-by: Akshay Gaikwad <akgaikwad001@gmail.com>
@ansible ansible locked and limited conversation to collaborators Jul 22, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
affects_2.8 This issue/PR affects Ansible v2.8 cloud docker feature This issue/PR relates to a feature request. module This issue/PR relates to a module. new_contributor This PR is the first contribution by a new community member. shipit This PR is ready to be merged by Core support:community This issue/PR relates to code supported by the Ansible community.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support of healthcheck in docker_container module
6 participants