Quick Steps to Restarting Squid Proxy Container

Downloading and installing several hundred packages per host while testing provisioning of multiple Vagrant virtual machines can take several hours to perform over a 1-5 Mbps network connection. Even a single Vagrant can take around 45 minutes to fully provision after a vagrant destroy. Since this task may need to be done over and over again, even for just one system, the process becomes very tedious and time consuming.

To minimize the number of remote downloads, a local proxy can help immensely. The DIMS project utilizes a squid-deb-proxy running in a Docker container on VM host systems to allow all of the local VMs to take advantage of a single caching proxy on the host. This significantly improves performance (cutting the time down to just a few minutes), but it comes at the cost of occasional instability. The iptables firewall rules must contain a DOCKER chain for Docker, and the interaction between those rules and Docker's attempts to keep the squid-deb-proxy container running across reboots of the VM host can leave the container effectively "hanging" from time to time. This manifests as a random failure in an Ansible task that is trying to use the configured proxy (e.g., see the python-virtualenv build failure in Section :ref:`using_dims_functions_in_bats`).

A bats test exists to test the proxy:

$ test.runner integration/proxy
[+] Running test integration/proxy
 ✗ [S][EV] HTTP download test (using wget, w/proxy if configured)
   (in test file integration/proxy.bats, line 16)
     `[ ! -z "$(wget -q -O - | grep non-free/source/Release 2>/dev/null)" ]' failed
 ✗ [S][EV] HTTPS download test (using wget, w/proxy if configured)
   (in test file integration/proxy.bats, line 26)
     `[ ! -z "$(wget -q -O - | grep 0install 2>/dev/null)" ]' failed

2 tests, 2 failures

This error will manifest itself sometimes when doing development work on Vagrants, as can be seen here:

 $ cd /vm/run/purple
 $ make up && make DIMS_ANSIBLE_ARGS="--tags base" reprovision-local
 [+] Creating Vagrantfile
 . . .
 TASK [base : Only "update_cache=yes" if >3600s since last update (Debian)] ****
 Wednesday 16 August 2017  16:55:35 -0700 (0:00:01.968)       0:00:48.823 ******
 fatal: [purple.devops.local]: FAILED! => {
     "changed": false,
     "failed": true


 Failed to update apt cache.

 RUNNING HANDLER [base : update timezone] **************************************
 Wednesday 16 August 2017  16:56:18 -0700 (0:00:43.205)       0:01:32.028 ******

 PLAY RECAP ********************************************************************
 purple.devops.local        : ok=15   changed=7    unreachable=0    failed=1

 Wednesday 16 August 2017  16:56:18 -0700 (0:00:00.000)       0:01:32.029 ******
 base : Only "update_cache=yes" if >3600s since last update (Debian) ---- 43.21s
 . . .
 make[1]: *** [provision] Error 2
 make[1]: Leaving directory `/vm/run/purple'
 make: *** [reprovision-local] Error 2

When it fails like this, it usually means that iptables must be restarted, followed by restarting the docker service. That usually is enough to fix the problem. If not, it may be necessary to also restart the squid-deb-proxy container.


The cause of this is the recreation of the DOCKER chain, which removes the rules added by Docker, when only the iptables-persistent service is restarted, as can be seen here:

$ sudo iptables -nvL | grep "Chain DOCKER"
Chain DOCKER (2 references)
Chain DOCKER-ISOLATION (1 references)
$ sudo iptables-persistent restart
sudo: iptables-persistent: command not found
$ sudo service iptables-persistent restart
 * Loading iptables rules...
 *  IPv4...
 *  IPv6...
$ sudo iptables -nvL | grep "Chain DOCKER"
Chain DOCKER (0 references)

Restarting the docker service will restore the rules for containers that Docker is keeping running across restarts.

$ sudo service docker restart
docker stop/waiting
docker start/running, process 18276
$ sudo iptables -nvL | grep "Chain DOCKER"
Chain DOCKER (2 references)
Chain DOCKER-ISOLATION (1 references)
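
The before and after listings above suggest a simple check that could be scripted: if nothing references the DOCKER chain, the rules Docker added are probably gone. The following is a minimal sketch of that check, not part of the DIMS tooling. The function names are hypothetical, and it assumes the listing text was captured from ``sudo iptables -nvL`` in the format shown above.

```python
import re

# Matches user-defined chain headers such as "Chain DOCKER (2 references)".
# Built-in chains print "(policy ...)" instead and are deliberately skipped.
CHAIN_RE = re.compile(r"^Chain (\S+) \((\d+) references\)")


def chain_references(listing):
    """Map chain name -> reference count for user-defined chains."""
    refs = {}
    for line in listing.splitlines():
        match = CHAIN_RE.match(line)
        if match:
            refs[match.group(1)] = int(match.group(2))
    return refs


def docker_rules_missing(listing):
    """True when the DOCKER chain is absent or nothing jumps to it."""
    return chain_references(listing).get("DOCKER", 0) == 0


# Sample listings copied from the output shown above.
before = "Chain DOCKER (2 references)\nChain DOCKER-ISOLATION (1 references)\n"
after = "Chain DOCKER (0 references)\n"
print(docker_rules_missing(before), docker_rules_missing(after))
```

A zero reference count is only a heuristic for "Docker needs a restart"; it matches the symptom documented here but is not a general health check for Docker networking.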

The solution for this is to notify a special handler that conditionally restarts the docker service after restarting iptables in order to re-establish the proper firewall rules. The handler is shown here:

 - name: conditional restart docker
   service: name=docker state=restarted
   when: hostvars[inventory_hostname].ansible_docker0 is defined

Use of the handler (from roles/base/tasks/main.yml) is shown here:

 - name: iptables v4 rules (Debian)
   template:
     src: '{{ item }}'
     dest: /etc/iptables/rules.v4
     owner: '{{ root_user }}'
     group: '{{ root_group }}'
     mode: 0o600
     validate: '/sbin/iptables-restore --test %s'
   with_first_found:
     - files:
         - '{{ iptables_rules }}'
         - rules.v4.{{ inventory_hostname }}.j2
         - rules.v4.category-{{ category }}.j2
         - rules.v4.deployment-{{ deployment }}.j2
         - rules.v4.j2
       paths:
         - '{{ dims_private }}/roles/{{ role_name }}/templates/iptables/'
         - iptables/
   notify:
     - "restart iptables ({{ ansible_distribution }}/{{ ansible_distribution_release }})"
     - "conditional restart docker"
   become: yes
   when: ansible_os_family == "Debian"
   tags: [ base, config, iptables ]

A tag iptables exists to allow regeneration of the iptables rules and perform the proper restarting sequence, which should be used instead of just restarting the iptables-persistent service manually. Use ansible-playbook instead (e.g., run.playbook --tags iptables) after making changes to variables that affect iptables rules.

$ cd $GIT/dims-dockerfiles/dockerfiles/squid-deb-proxy

$ for S in iptables-persistent docker; do sudo service $S restart; done
 * Loading iptables rules...
 *  IPv4...
 *  IPv6...
docker stop/waiting
docker start/running, process 22065

$ make rm
docker stop dims.squid-deb-proxy
test.runner -dims.squid-deb-proxy
docker rm dims.squid-deb-proxy

$ make daemon
docker run \
          --name dims.squid-deb-proxy \
          --restart unless-stopped \
          -v /vm/cache/apt:/cachedir -p squid-deb-proxy:0.7 2>&1 >/dev/null &
2017/07/22 19:31:29| strtokFile: /etc/squid-deb-proxy/autogenerated/pkg-blacklist-regexp.acl not found
2017/07/22 19:31:29| Warning: empty ACL: acl blockedpkgs urlpath_regex "/etc/squid-deb-proxy/autogenerated/pkg-blacklist-regexp.acl"

The test should now succeed:

$ test.runner --level '*' --match proxy
[+] Running test integration/proxy
 ✓ [S][EV] HTTP download test (using wget, w/proxy if configured)
 ✓ [S][EV] HTTPS download test (using wget, w/proxy if configured)

2 tests, 0 failures

Recovering From Operating System Corruption

Part of the reason for using a Python virtual environment for development is to encapsulate the development Python and its libraries from the system Python and its libraries, in case a failed upgrade breaks Python. Since Python is a primary dependency of Ansible, a broken system Python is a Very Bad Thing™.

For example, the following change was made in an attempt to upgrade pip packages while applying the base role:

$ git diff
diff --git a/roles/base/tasks/main.yml b/roles/base/tasks/main.yml
index 3ce57d8..182e7d8 100644
--- a/roles/base/tasks/main.yml
+++ b/roles/base/tasks/main.yml
@@ -717,7 +717,7 @@
 - name: Ensure pip installed for system python
     name: '{{ item }}'
-    state: installed
+    state: latest
     - python-pip
   become: yes
@@ -725,7 +725,7 @@
   tags: [ base, config ]

 - name: Ensure required system python packages present
-  shell: 'pip install {{ item }}'
+  shell: 'pip install -U {{ item }}'
     - urllib3
     - pyOpenSSL

Applying the base role against two systems resulted in a series of error messages.

$ ansible-playbook master.yml --limit trident --tags base

. . .

PLAY [Configure host "purple.devops.local"] ***********************************

. . .

TASK [base : Ensure required system python packages present] ******************
Thursday 17 August 2017  10:36:13 -0700 (0:00:01.879)       0:02:22.637 *******
changed: [purple.devops.local] => (item=urllib3)
failed: [purple.devops.local] (item=pyOpenSSL) => {
    "changed": true,
    "cmd": "pip install -U pyOpenSSL",
    "delta": "0:00:07.516760",
    "end": "2017-08-17 10:36:24.256121",
    "failed": true,
    "item": "pyOpenSSL",
    "rc": 1,
    "start": "2017-08-17 10:36:16.739361"


Downloading/unpacking pyOpenSSL from
Downloading/unpacking six>=1.5.2 from
611d59a81a315dc8 (from pyOpenSSL)
  Downloading six-1.10.0-py2.py3-none-any.whl
Downloading/unpacking cryptography>=1.9 (from pyOpenSSL)
  Running (path:/tmp/pip-build-FCbUwT/cryptography/ egg_info for package cryptography

    no previously-included directories found matching 'docs/_build'
    warning: no previously-included files matching '*' found under directory 'vectors'
Downloading/unpacking idna>=2.1 (from cryptography>=1.9->pyOpenSSL)
Downloading/unpacking asn1crypto>=0.21.0 (from cryptography>=1.9->pyOpenSSL)
Downloading/unpacking enum34 (from cryptography>=1.9->pyOpenSSL)
  Downloading enum34-1.1.6-py2-none-any.whl
Downloading/unpacking ipaddress (from cryptography>=1.9->pyOpenSSL)
  Downloading ipaddress-1.0.18-py2-none-any.whl
Downloading/unpacking cffi>=1.7 (from cryptography>=1.9->pyOpenSSL)
  Running (path:/tmp/pip-build-FCbUwT/cffi/ egg_info for package cffi

Downloading/unpacking pycparser from
88136 (from cffi>=1.7->cryptography>=1.9->pyOpenSSL)
  Running (path:/tmp/pip-build-FCbUwT/pycparser/ egg_info for package pycparser

    warning: no previously-included files matching 'yacctab.*' found under directory 'tests'
    warning: no previously-included files matching 'lextab.*' found under directory 'tests'
    warning: no previously-included files matching 'yacctab.*' found under directory 'examples'
    warning: no previously-included files matching 'lextab.*' found under directory 'examples'
Installing collected packages: pyOpenSSL, six, cryptography, idna, asn1crypto, enum34, ipaddress, cffi, pycparser
  Found existing installation: pyOpenSSL 0.14
    Not uninstalling pyOpenSSL at /usr/lib/python2.7/dist-packages, owned by OS
  Found existing installation: six 1.8.0
    Not uninstalling six at /usr/lib/python2.7/dist-packages, owned by OS
  Found existing installation: cryptography 0.6.1
    Not uninstalling cryptography at /usr/lib/python2.7/dist-packages, owned by OS
  Running install for cryptography

    Installed /tmp/pip-build-FCbUwT/cryptography/cffi-1.10.0-py2.7-linux-x86_64.egg
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-FCbUwT/cryptography/", line 312, in <module>
      File "/usr/lib/python2.7/distutils/", line 111, in setup
        _setup_distribution = dist = klass(attrs)
      File "/usr/lib/python2.7/dist-packages/setuptools/", line 266, in __init__
      File "/usr/lib/python2.7/distutils/", line 287, in __init__
      File "/usr/lib/python2.7/dist-packages/setuptools/", line 301, in finalize_options
        ep.load()(self,, value)
      File "/usr/lib/python2.7/dist-packages/", line 2190, in load
    ImportError: No module named setuptools_ext
    Complete output from command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip-build-FCbUwT/cryptography/';exec(compile(getattr(tokenize, 'open', open)(__file__)
.read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-qKjzie-record/install-record.txt --single-version-externally-managed --compile:

Installed /tmp/pip-build-FCbUwT/cryptography/cffi-1.10.0-py2.7-linux-x86_64.egg

Traceback (most recent call last):

  File "<string>", line 1, in <module>

  File "/tmp/pip-build-FCbUwT/cryptography/", line 312, in <module>


  File "/usr/lib/python2.7/distutils/", line 111, in setup

    _setup_distribution = dist = klass(attrs)

  File "/usr/lib/python2.7/dist-packages/setuptools/", line 266, in __init__


  File "/usr/lib/python2.7/distutils/", line 287, in __init__


  File "/usr/lib/python2.7/dist-packages/setuptools/", line 301, in finalize_options

    ep.load()(self,, value)

  File "/usr/lib/python2.7/dist-packages/", line 2190, in load


ImportError: No module named setuptools_ext

  Can't roll back cryptography; was not uninstalled
Cleaning up...
Command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip-build-FCbUwT/cryptography/';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '
\n'), __file__, 'exec'))" install --record /tmp/pip-qKjzie-record/install-record.txt --single-version-externally-managed --compile failed with error code 1 in /tmp/pip-build-FCbUwT/cryptogra
Storing debug log for failure in /root/.pip/pip.log

. . .

PLAY RECAP ********************************************************************
purple.devops.local        : ok=60   changed=35   unreachable=0    failed=1

Thursday 17 August 2017  10:36:29 -0700 (0:00:00.001)       0:02:38.799 *******
base : Ensure required system python packages present ------------------ 16.16s
base : Ensure dims (system-level) subdirectories exist ----------------- 15.85s
base : Only "update_cache=yes" if >3600s since last update (Debian) ----- 5.65s
base : conditional restart docker --------------------------------------- 5.60s
base : Make sure required APT packages are present (Debian) ------------- 2.14s
base : Clean up dnsmasq build artifacts --------------------------------- 2.09s
base : Make sure blacklisted packages are absent (Debian) --------------- 2.03s
base : Check to see if https_proxy is working --------------------------- 1.99s
base : Log start of 'base' role ----------------------------------------- 1.95s
base : Make backports present for APT on Debian jessie ------------------ 1.89s
base : Ensure pip installed for system python --------------------------- 1.88s
base : Only "update_cache=yes" if >3600s since last update -------------- 1.85s
base : Make dbus-1 development libraries present ------------------------ 1.85s
base : iptables v4 rules (Debian) --------------------------------------- 1.84s
base : iptables v6 rules (Debian) --------------------------------------- 1.84s
base : Make full dnsmasq package present (Debian, not Trusty) ----------- 1.82s
base : Create base /etc/hosts file (Debian, RedHat, CoreOS) ------------- 1.64s
base : Make /etc/rsyslog.d/49-consolidation.conf present ---------------- 1.63s
base : Make dnsmasq configuration present on Debian --------------------- 1.60s
base : Ensure DIMS system shell init hook is present (Debian, CoreOS) --- 1.56s

The base role is supposed to ensure the operating system has the fundamental settings and pre-requisites necessary for all other DIMS roles, so applying that role should hopefully fix things, right?

$ ansible-playbook master.yml --limit trident --tags base

. . .

PLAY [Configure host "purple.devops.local"] ***********************************

. . .

TASK [base : Make sure blacklisted packages are absent (Debian)] **************
Thursday 17 August 2017  11:05:08 -0700 (0:00:01.049)       0:00:30.456 *******
An exception occurred during task execution. To see the full traceback, use
-vvv. The error was: AttributeError: 'FFI' object has no attribute 'new_allocator'
failed: [purple.devops.local] (item=[u'modemmanager', u'resolvconf', u'sendmail']) => {
    "failed": true,
    "item": [
    "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_ehzfMx/\", line 239, in <module>\n    from ansible.module_utils.urls import fetch_url\n
File \"/tmp/ansible_ehzfMx/\", line 153,
in <module>\n  File \"/usr/local/lib/python2.7/dist-packages/urllib3/contrib/\", line 46,
in <module>\n    import OpenSSL.SSL\n  File \"/usr/local/lib/python2.7/dist-packages/OpenSSL/\",
line 8, in <module>\n    from OpenSSL import rand, crypto, SSL\n  File \"/usr/local/lib/
python2.7/dist-packages/OpenSSL/\", line 10, in <module>\n    from OpenSSL._util
import (\n  File \"/usr/local/lib/python2.7/dist-packages/OpenSSL/\", line 18, in
<module>\n    no_zero_allocator = ffi.new_allocator(should_clear_after_alloc=False)\n
AttributeError: 'FFI' object has no attribute 'new_allocator'\n",
    "module_stdout": "",
    "rc": 1



TASK [base : Only "update_cache=yes" if >3600s since last update (Debian)] ****
Thursday 17 August 2017  11:05:10 -0700 (0:00:01.729)       0:00:32.186 *******
An exception occurred during task execution. To see the full traceback, use -vvv.
The error was: AttributeError: 'FFI' object has no attribute 'new_allocator'
fatal: [purple.devops.local]: FAILED! => {
    "changed": false,
    "failed": true,
    "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_ganqlZ/\", line 239, in <module>\n    from ansible.module_utils.urls import fetch_url\n
File \"/tmp/ansible_ganqlZ/\", line 153, in
<module>\n  File \"/usr/local/lib/python2.7/dist-packages/urllib3/contrib/\", line 46,
in <module>\n    import OpenSSL.SSL\n  File \"/usr/local/lib/python2.7/dist-packages/
OpenSSL/\", line 8, in <module>\n    from OpenSSL import rand, crypto, SSL\n
File \"/usr/local/lib/python2.7/dist-packages/OpenSSL/\", line 10, in <module>\n
from OpenSSL._util import (\n  File \"/usr/local/lib/python2.7/dist-packages/OpenSSL/\",
line 18, in <module>\n    no_zero_allocator = ffi.new_allocator(should_clear_after_alloc=False)\n
AttributeError: 'FFI' object has no attribute 'new_allocator'\n",
    "module_stdout": "",
    "rc": 1



RUNNING HANDLER [base : update timezone] **************************************
Thursday 17 August 2017  11:05:11 -0700 (0:00:01.530)       0:00:33.716 *******

PLAY RECAP ********************************************************************
purple.devops.local        : ok=14   changed=7    unreachable=0    failed=1

Thursday 17 August 2017  11:05:11 -0700 (0:00:00.001)       0:00:33.717 *******
base : Log start of 'base' role ----------------------------------------- 1.88s
base : Make sure blacklisted packages are absent (Debian) --------------- 1.73s
base : Create base /etc/hosts file (Debian, RedHat, CoreOS) ------------- 1.55s
base : Only "update_cache=yes" if >3600s since last update (Debian) ----- 1.53s
base : Set timezone variables (Debian) ---------------------------------- 1.53s
base : iptables v6 rules (Debian) --------------------------------------- 1.48s
base : iptables v4 rules (Debian) --------------------------------------- 1.48s
base : Ensure getaddrinfo configuration is present (Debian) ------------- 1.48s
base : Check to see if dims.logger exists yet --------------------------- 1.31s
base : Set domainname (Debian, CoreOS) ---------------------------------- 1.17s
base : Check to see if gpk-update-viewer is running on Ubuntu ----------- 1.16s
base : Set hostname (runtime) (Debian, CoreOS) -------------------------- 1.16s
base : Make /etc/hostname present (Debian, CoreOS) ---------------------- 1.16s
base : Disable IPv6 in kernel on non-CoreOS ----------------------------- 1.16s
debug : include --------------------------------------------------------- 1.07s
base : iptables v4 rules (CoreOS) --------------------------------------- 1.06s
base : iptables v6 rules (CoreOS) --------------------------------------- 1.06s
debug : debug ----------------------------------------------------------- 1.05s
debug : debug ----------------------------------------------------------- 1.05s
debug : debug ----------------------------------------------------------- 1.05s

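The Ansible modules used to manage packages on Debian (such as apt) are Python programs that run under the system Python on the target host, so they require a working Python to install packages. The system Python packages are now corrupted, so those modules fail on import, as the tracebacks above show. This creates a deadlock: the tool that would repair the packages depends on the packages that are broken. There is another way to install Python packages, however, and it can be invoked via Ansible ad-hoc mode: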

$ ansible -m shell --become -a 'easy_install -U cffi' trident
yellow.devops.local | SUCCESS | rc=0 >>
Searching for cffi
Best match: cffi 1.10.0
Processing cffi-1.10.0.tar.gz
Writing /tmp/easy_install-RmOJBU/cffi-1.10.0/setup.cfg
Running cffi-1.10.0/ -q bdist_egg --dist-dir /tmp/easy_install-RmOJBU/cffi-1.10.0/egg-dist-tmp-lNCOck
compiling '_configtest.c':
__thread int some_threadlocal_variable_42;

x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -c
 _configtest.c -o _configtest.o
removing: _configtest.c _configtest.o
compiling '_configtest.c':
int main(void) { __sync_synchronize(); return 0; }

x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -c
 _configtest.c -o _configtest.o
x86_64-linux-gnu-gcc -pthread _configtest.o -o _configtest
removing: _configtest.c _configtest.o _configtest
Adding cffi 1.10.0 to easy-install.pth file

Installed /usr/local/lib/python2.7/dist-packages/cffi-1.10.0-py2.7-linux-x86_64.egg
Processing dependencies for cffi
Finished processing dependencies for cffi

purple.devops.local | SUCCESS | rc=0 >>
Searching for cffi
Best match: cffi 1.10.0
Processing cffi-1.10.0.tar.gz
Writing /tmp/easy_install-fuS4hd/cffi-1.10.0/setup.cfg
Running cffi-1.10.0/ -q bdist_egg --dist-dir /tmp/easy_install-fuS4hd/cffi-1.10.0/egg-dist-tmp-nOgko4
compiling '_configtest.c':
__thread int some_threadlocal_variable_42;

x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -c
 _configtest.c -o _configtest.o
removing: _configtest.c _configtest.o
compiling '_configtest.c':
int main(void) { __sync_synchronize(); return 0; }

x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -c
 _configtest.c -o _configtest.o
x86_64-linux-gnu-gcc -pthread _configtest.o -o _configtest
removing: _configtest.c _configtest.o _configtest
Adding cffi 1.10.0 to easy-install.pth file

Installed /usr/local/lib/python2.7/dist-packages/cffi-1.10.0-py2.7-linux-x86_64.egg
Processing dependencies for cffi
Finished processing dependencies for cffi
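
One way to confirm that the repair took effect is to check, under the same Python interpreter, that the class named in the traceback now has the attribute that was missing. The following is a minimal sketch and not part of the DIMS playbooks. The helper name ``missing_attributes`` is hypothetical; the ``cffi``, ``FFI``, and ``new_allocator`` names come from the AttributeError shown earlier.

```python
import importlib


def missing_attributes(module_name, class_name, required):
    """Return the names in `required` that the class lacks.

    Returns None when the module itself cannot be imported, so the caller
    can tell "not installed" apart from "installed but too old".
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    cls = getattr(module, class_name, None)
    if cls is None:
        return list(required)
    return [name for name in required if not hasattr(cls, name)]


# An empty list means the installed cffi provides what pyOpenSSL expects.
print(missing_attributes("cffi", "FFI", ["new_allocator"]))
```

Run with the system interpreter (for example through the same ad-hoc ``shell`` mechanism), this reports whether the upgraded cffi is the one actually being imported.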

Now we can back out the addition of the -U flag that caused the corruption and apply the base role to the two hosts using the master.yml playbook.

$ ansible-playbook master.yml --limit trident --tags base

. . .

PLAY [Configure host "purple.devops.local"] ***********************************

. . .

PLAY [Configure host "yellow.devops.local"] ***********************************

. . .

PLAY RECAP ********************************************************************
purple.devops.local        : ok=136  changed=29   unreachable=0    failed=0
yellow.devops.local        : ok=139  changed=53   unreachable=0    failed=0

Thursday 17 August 2017  11:20:08 -0700 (0:00:01.175)       0:10:03.307 *******
base : Make defined bats tests present --------------------------------- 29.18s
base : Make defined bats tests present --------------------------------- 28.95s
base : Ensure dims (system-level) subdirectories exist ----------------- 15.89s
base : Ensure dims (system-level) subdirectories exist ----------------- 15.84s
base : Ensure required system python packages present ------------------- 8.81s
base : Make sure common (non-templated) BASH scripts are present -------- 8.79s
base : Make sure common (non-templated) BASH scripts are present -------- 8.74s
base : Ensure required system python packages present ------------------- 8.71s
base : Make subdirectories for test categories present ------------------ 6.84s
base : Make links to helper functions present --------------------------- 6.83s
base : Make subdirectories for test categories present ------------------ 6.83s
base : Make links to helper functions present --------------------------- 6.81s
base : Ensure bashrc additions are present ------------------------------ 4.63s
base : Ensure bashrc additions are present ------------------------------ 4.59s
base : Only "update_cache=yes" if >3600s since last update (Debian) ----- 4.45s
base : Make sure common (non-templated) Python scripts are present ------ 3.77s
base : Make sure common (non-templated) Python scripts are present ------ 3.77s
base : conditional restart docker --------------------------------------- 3.17s
base : Make sure common (templated) scripts are present ----------------- 2.96s
base : Make sure common (templated) scripts are present ----------------- 2.94s

In this case, the systems are now back to a functional state and the disruptive change has been backed out. Were these Vagrants, the problem of a broken system would be lessened, so testing should always be done first on throw-away VMs. But on those occasions where something goes wrong on "production" hosts, Ansible ad-hoc mode is a powerful debugging and corrective capability.

Advanced Ansible Tasks or Jinja Templating

This section includes some advanced uses of Ansible task declaration and/or Jinja templating that may be difficult to learn from Ansible documentation or other sources. Some useful resources that were identified during the DIMS Project are listed in Section :ref:`bestpractices`.

Multi-line fail or debug Output

There are times when it is necessary to produce a long message in a fail or debug task. An answer to the Stack Overflow question "In YAML, how do I break a string over multiple lines?" describes several ways to do this. Here is one of them in action in the virtualbox role:

.. literalinclude:: ../../roles/virtualbox/tasks/main.yml
   :language: yaml
   :emphasize-lines: 35-41

When this code is triggered, the output is now clean and clear about what to do.

TASK [virtualbox : fail] *******************************************************************
task path: /home/dittrich/dims/git/ansible-dims-playbooks/roles/virtualbox/tasks/main.yml:33
Wednesday 06 September 2017  12:45:38 -0700 (0:00:01.046)       0:00:51.117 ***
fatal: [dimsdemo1.devops.develop]: FAILED! => {
    "changed": false,
    "failed": true


Found 1 running Virtualbox VM.
Virtualbox cannot be updated while VMs are running.
Please halt or suspend this VM and apply this role again.

dittrich 15289 /usr/lib/virtualbox/VBoxHeadless --comment orange_default_1504485887221_79778 --startvm 62e20c31-7c2c-417a-a5ab-3a056aa81e2d --vrde config

Leveraging the Terraform State File

Terraform maintains state in a file named terraform.tfstate (and a backup file terraform.tfstate.backup) in the working directory where Terraform was initialized. While the terraform.tfstate file is a JSON object that can be manipulated using programs like jq, the proper way to exploit this state is to use terraform output --json.

Introduction to jq

To better understand how to manipulate the contents of the terraform.tfstate file with jq, we will start out by directly manipulating the file so we don't have to also struggle with defining Terraform output variables.


See Reshaping JSON with jq for examples of how to use jq.

Using the filter . with jq will show the entire structure. Here are the first 10 lines of a terraform.tfstate file:

$ jq -r '.' terraform.tfstate | head
{
  "version": 3,
  "terraform_version": "0.11.1",
  "serial": 7,
  "lineage": "755c781e-407c-41e2-9f10-edd0b80bcc9f",
  "modules": [
    {
      "path": [
        "root"
      ],

To more easily read the JSON, you can pipe the output through pygmentize to colorize it, then less -R to preserve the ANSI colorization codes. The command line to use is:

$ jq -r '.' terraform.tfstate | pygmentize | less -R

By choosing a specific field for the filter, jq will print just that field.

$ jq -r '.lineage' terraform.tfstate

Adding [] to a field that is an array produces a list, and piping filters with a | allows additional filtering to be applied to narrow the results. Functions like select() can be used to extract a specific field from a list element that is a dictionary, allowing selection of just specific members. In the next example, the nested structures named resources within the structure modules are evaluated, selecting only those where the type field is digitalocean_record (i.e., DNS records).

$ jq -r '.modules[] | .resources[] | select(.type | test("digitalocean_record"))' terraform.tfstate

The first record is highlighted in the output here. Within the record are two fields (.primary.attributes.fqdn and .primary.attributes.value) that are needed to help build /etc/hosts style DNS mappings, or to generate a YAML inventory file.

   "type": "digitalocean_record",
   "depends_on": [
   "primary": {
     "id": "XXXXXXXX",
     "attributes": {
       "domain": "",
       "fqdn": "",
       "id": "XXXXXXXX",
       "name": "blue",
       "port": "0",
       "priority": "0",
       "ttl": "360",
       "type": "A",
       "value": "XXX.XXX.XXX.XX",
       "weight": "0"
     "meta": {},
     "tainted": false
   "deposed": [],
   "provider": "provider.digitalocean"
   "type": "digitalocean_record",
   "depends_on": [
   "primary": {
     "id": "XXXXXXXX",
     "attributes": {
       "domain": "",
       "fqdn": "",
       "id": "XXXXXXXX",
       "name": "orange",
       "port": "0",
       "priority": "0",
       "ttl": "360",
       "type": "A",
       "value": "XXX.XXX.XXX.XXX",
       "weight": "0"
     "meta": {},
     "tainted": false
   "deposed": [],
   "provider": "provider.digitalocean"

Adding another pipe step that builds a list from just these two fields, along with the -c option, produces compact single-line JSON output:

$ jq -c '.modules[] | .resources[] | select(.type | test("digitalocean_record")) | [ .primary.attributes.fqdn, .primary.attributes.value ]' terraform.tfstate

These can be further converted into formats parseable by Unix shell programs like awk, etc., using the filters @csv or @sh:

$ jq -r '.modules[] | .resources[] | select(.type | test("digitalocean_record")) | [ .primary.attributes.name, .primary.attributes.fqdn, .primary.attributes.value ]| @csv' terraform.tfstate
$ jq -r '.modules[] | .resources[] | select(.type | test("digitalocean_record")) | [ .primary.attributes.name, .primary.attributes.fqdn, .primary.attributes.value ]| @sh' terraform.tfstate
'blue' '' 'XXX.XXX.XXX.XX'
'orange' '' 'XXX.XXX.XXX.XXX'
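
For readers more comfortable with Python than jq, the same selection can be sketched with the standard ``json`` and ``csv`` modules. This is an illustrative equivalent, not something the DIMS scripts use. The sample state is a trimmed, hypothetical stand-in (the ``example.com`` names and ``192.0.2.x`` addresses are invented), and it assumes ``resources`` may be either an object keyed by resource name or a plain array, since jq's ``.resources[]`` iterates both.

```python
import csv
import io
import json

# Hypothetical, heavily trimmed stand-in for terraform.tfstate.
SAMPLE_STATE = json.loads("""
{
  "modules": [
    {
      "resources": {
        "digitalocean_domain.default": {
          "type": "digitalocean_domain",
          "primary": {"attributes": {"name": "example.com"}}
        },
        "digitalocean_record.blue": {
          "type": "digitalocean_record",
          "primary": {"attributes": {"name": "blue",
                                     "fqdn": "blue.example.com",
                                     "value": "192.0.2.10"}}
        },
        "digitalocean_record.orange": {
          "type": "digitalocean_record",
          "primary": {"attributes": {"name": "orange",
                                     "fqdn": "orange.example.com",
                                     "value": "192.0.2.11"}}
        }
      }
    }
  ]
}
""")


def dns_records(state):
    """Return [name, fqdn, value] for every digitalocean_record resource."""
    rows = []
    for module in state.get("modules", []):
        resources = module.get("resources", {})
        # Mirror jq's `.resources[]`: values of an object, or items of an array.
        items = resources.values() if isinstance(resources, dict) else resources
        for resource in items:
            # jq's test() is a regex match; for this literal a substring
            # check selects the same resources.
            if "digitalocean_record" in resource.get("type", ""):
                attrs = resource["primary"]["attributes"]
                rows.append([attrs["name"], attrs["fqdn"], attrs["value"]])
    return rows


def to_csv(rows):
    """Quote every field, similar in spirit to jq's @csv for string values."""
    buffer = io.StringIO()
    csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


print(to_csv(dns_records(SAMPLE_STATE)), end="")
```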

Processing terraform output --json

While processing the terraform.tfstate file directly is possible, the proper way to use Terraform state is to create output variables and expose them using terraform output:

$ terraform output
blue = { = XXX.XX.XXX.XXX
orange = { = XXX.XX.XXX.XXX

This output could be processed with awk, but we want to use jq instead to more directly process the output using JSON. To get JSON output, add the --json flag:

$ terraform output --json
    "blue": {
        "sensitive": false,
        "type": "map",
        "value": {
            "": "XXX.XX.XXX.XXX"
    "orange": {
        "sensitive": false,
        "type": "map",
        "value": {
            "": "XXX.XX.XXX.XXX"

To get clean single-line, multi-column output, we need to use to_entries[] to turn the dictionaries into key/value pairs, nested two levels deep in this case.

$ terraform output --json | jq -r 'to_entries[] | [ .key, (.value.value|to_entries[]| .key, .value) ]|@sh'
'blue' '' 'XXX.XX.XXX.XXX'
'orange' '' 'XXX.XX.XXX.XXX'
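
The two nested ``to_entries[]`` steps can be easier to follow when written as explicit loops. The sketch below is an assumed Python equivalent of the jq filter above, using a hypothetical sample of ``terraform output --json`` (the ``example.com`` names and ``192.0.2.x`` addresses are invented for illustration).

```python
import json

# Hypothetical sample shaped like the `terraform output --json` result above.
SAMPLE_OUTPUT = json.loads("""
{
  "blue": {
    "sensitive": false,
    "type": "map",
    "value": {"blue.example.com": "192.0.2.10"}
  },
  "orange": {
    "sensitive": false,
    "type": "map",
    "value": {"orange.example.com": "192.0.2.11"}
  }
}
""")


def flatten_outputs(outputs):
    """Turn {name: {"value": {fqdn: address}}} into [name, fqdn, address] rows."""
    rows = []
    for name, entry in outputs.items():                # outer to_entries[]
        row = [name]
        for fqdn, address in entry["value"].items():   # inner to_entries[]
            row.extend([fqdn, address])                # emits .key, .value
        rows.append(row)
    return rows


for row in flatten_outputs(SAMPLE_OUTPUT):
    print(" ".join(row))
```

As in the jq version, a map with more than one entry would simply append further key/value pairs to the same row.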

Putting all of this together with a much simpler awk script, a YAML inventory file can be produced as shown in the script files/common-scripts/

.. literalinclude:: ../../files/common-scripts/
   :language: bash

# This is a generated inventory file produced by /Users/dittrich/dims/git/ansible-dims-playbooks/files/common-scripts/

      ansible_host: 'XXX.XXX.XXX.XX'
      ansible_fqdn: ''
      ansible_host: 'XXX.XXX.XXX.XXX'
      ansible_fqdn: ''
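
The final step, turning (name, fqdn, address) rows into a YAML inventory, is plain text generation. The sketch below shows one way that could look in Python; it is not the project's awk script. The ``group: hosts: host:`` layout and the ``do`` group name are assumptions inferred from the indentation of the sample above and from the ``ansible ... do`` command used with this inventory, and the function name and sample values are hypothetical.

```python
def render_inventory(rows, group="do"):
    """Render rows of (name, fqdn, address) as a YAML inventory string.

    Assumed layout: one group containing a `hosts` mapping. Values are
    wrapped in single quotes without escaping, so they must not contain
    single quotes themselves.
    """
    lines = ["---", f"{group}:", "  hosts:"]
    for name, fqdn, address in rows:
        lines.append(f"    '{name}':")
        lines.append(f"      ansible_host: '{address}'")
        lines.append(f"      ansible_fqdn: '{fqdn}'")
    return "\n".join(lines) + "\n"


# Invented example rows (documentation-only names and addresses).
SAMPLE_ROWS = [
    ("blue", "blue.example.com", "192.0.2.10"),
    ("orange", "orange.example.com", "192.0.2.11"),
]
print(render_inventory(SAMPLE_ROWS), end="")
```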

This inventory file can then be used by Ansible to perform ad-hoc tasks or run playbooks.

$ make ping
ansible -i ../../environments/do/inventory \
                -m ping do
orange | SUCCESS => {
    "changed": false,
    "failed": false,
    "ping": "pong"
blue | SUCCESS => {
    "changed": false,
    "failed": false,
    "ping": "pong"