[Ansible 6/2.13] apt module segmentation fault on package removal #79452

Open
1 task done
k-c-p opened this issue Nov 23, 2022 · 11 comments
Labels
affects_2.13; bug (This issue/PR relates to a bug.); module (This issue/PR relates to a module.); P3 (Priority 3 - Approved, No Time Limitation); verified (This issue has been verified/reproduced by maintainer)

Comments

@k-c-p

k-c-p commented Nov 23, 2022

Summary

When moving from Ansible 5/2.12 to 6/2.13 we encountered the apt module running into a segmentation fault when package state "absent" was used to remove packages that are no longer installed. The issue occurred on target hosts running on Debian 8 (Jessie), 9 (Stretch) and 10 (Buster).

We managed to re-create the issue on Vagrant and, after some testing, came to the conclusion that all of these conditions must be met to produce the error:

  • non-purged packages:
    The packages were not purged but only removed using "absent".
  • no repository source anymore:
    The apt sources list file that defined the repository the packages came from is no longer present on the system.
  • the packages have a dependency:
    The packages involved have a "depends" relationship between them.
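
A package that was removed but not purged typically stays in dpkg's "removed, configuration files remain" state, which dpkg-query abbreviates as `rc`. The following is a minimal sketch for spotting such packages on a host, assuming the text comes from `dpkg-query -W -f='${db:Status-Abbrev} ${Package}\n'`. The helper name and the sample data are hypothetical and not part of the original report.

```python
def residual_config_packages(dpkg_query_output):
    """Return packages whose dpkg status abbreviation starts with 'rc'.

    'rc' means: selected for removal, only configuration files remain,
    i.e. the package was removed but not purged.
    """
    packages = []
    for line in dpkg_query_output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0].startswith("rc"):
            packages.append(fields[1])
    return packages


# Hypothetical sample of the dpkg-query output described above.
sample = "ii  bash\nrc  zabbix-agent2\nii  python3-apt\n"
print(residual_config_packages(sample))
```

Hosts that report the affected package in this list would match the first condition above.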

The sample playbook below re-creates the aforementioned conditions using packages from the publicly available Zabbix repository. Originally, the issue occurred with company-internal packages, but once we figured out what might be going wrong, we managed to reproduce it :-).

Some other things we found while toying with this:

  • When the repository definition is kept in place, everything works fine.
  • When the packages are purged, everything works.

Issue Type

Bug Report

Component Name

apt

Ansible Version

$ ansible --version
ansible [core 2.13.6]
  config file = /home/charly/eclipseprojects/samsagent/test/ansible.cfg
  configured module search path = ['/home/charly/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /opt/ansible/lib/python3.10/site-packages/ansible
  ansible collection location = /home/charly/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.10.8 (main, Nov  4 2022, 09:21:25) [GCC 12.2.0]
  jinja version = 3.1.2
  libyaml = True

Configuration

$ ansible-config dump --only-changed -t all
CALLBACKS_ENABLED(/home/charly/eclipseprojects/samsagent/test/ansible.cfg) = ['profile_tasks']
DEFAULT_LOG_PATH(env: ANSIBLE_LOG_PATH) = /tmp/ansible.log
DEFAULT_ROLES_PATH(env: ANSIBLE_ROLES_PATH) = ['/home/charly/eclipseprojects/samsagent/test', '/home/charly/eclipseprojects']
DEFAULT_STDOUT_CALLBACK(/home/charly/eclipseprojects/samsagent/test/ansible.cfg) = yaml
HOST_KEY_CHECKING(/home/charly/eclipseprojects/samsagent/test/ansible.cfg) = False
RETRY_FILES_ENABLED(env: ANSIBLE_RETRY_FILES_ENABLED) = False

CONNECTION:
==========

paramiko_ssh:
____________
host_key_checking(/home/charly/eclipseprojects/samsagent/test/ansible.cfg) = False

ssh:
___
host_key_checking(/home/charly/eclipseprojects/samsagent/test/ansible.cfg) = False

OS / Environment

Debian 8 (Jessie)
Debian 9 (Stretch)
Debian 10 (Buster)

Steps to Reproduce

- hosts: all
  become: true
  gather_facts: true
  tasks:

    # Vagrant images tend to be outdated, an apt update may help to get going
    - name: update apt package lists
      raw: apt-get --quiet update
      changed_when: false

    # ================
    # Bootstrap Ansible
    # ================
    - name: bootstrap Ansible runtime environment
      raw: >
        apt-get --no-install-recommends --quiet --yes install
        acl
        aptitude
        ca-certificates
        dbus
        lsb-release
        sudo
      register: amh_rv_apt
      changed_when: "amh_rv_apt.stdout.find('0 upgraded, 0 newly installed') < 0"

    - name: define packages to install pre-Bullseye
      set_fact:
        amh_packages: >
          python
          python-apt
          python3
          python3-apt
      when: ansible_lsb.major_release is version('10', '<=')

    - name: define packages to install on Bullseye
      set_fact:
        amh_packages: >
          gpg-agent
          gnupg
          python3
          python3-apt
      when: ansible_lsb.major_release is version('11', '==')
    
    - name: install packages
      raw: >
        apt-get --no-install-recommends --quiet --yes install
        {{ amh_packages }}
      register: amh_rv_apt
      changed_when: "amh_rv_apt.stdout.find('0 upgraded, 0 newly installed') < 0"

    # ========================
    # The fun part starts here
    # ========================
    - name: add zabbix repository key
      apt_key:
        url: http://repo.zabbix.com/zabbix-official-repo.key
        id: A1848F5352D022B9471D83D0082AB56BA14FE591
        state: present

    - name: set zabbix facts
      set_fact:
        zabbix_package: zabbix-agent2
        zabbix_version: 6.0

    - name: add zabbix repository source
      apt_repository:
        repo: >
          deb http://repo.zabbix.com/zabbix/{{ zabbix_version }}/debian
          {{ ansible_lsb.codename }}
          main contrib non-free
        filename: zabbix

    - name: install zabbix package
      apt:
        name: "{{ zabbix_package }}"

    - name: show package deps
      command: "apt-cache depends {{ zabbix_package }}"

    - name: remove zabbix repository source
      apt_repository:
        repo: >
          deb http://repo.zabbix.com/zabbix/{{ zabbix_version }}/debian
          {{ ansible_lsb.codename }}
          main contrib non-free
        filename: zabbix
        state: absent

    - name: show state info before uninstall
      command: "apt show -a {{ zabbix_package }}"

    - name: uninstall zabbix package
      apt:
        name: "{{ zabbix_package }}"
        state: absent

    - name: show state info after uninstall
      command: "apt show -a {{ zabbix_package }}"

    - name: uninstall zabbix packages again
      apt:
        name: "{{ zabbix_package }}"
        state: absent

Expected Results

The final apt task should report "ok", as the package has already been removed.

Actual Results

Example output from a Buster Vagrant box:

TASK [uninstall zabbix packages again] *****************************************
task path: /home/charly/eclipseprojects/samsagent/test/test.yml:102
Wednesday 23 November 2022  13:11:09 +0100 (0:00:00.232)       0:00:11.613 ***** 
fatal: [samsagentvagrant10]: FAILED! => changed=false 
  module_stderr: |-
    Shared connection to 127.0.0.1 closed.
  module_stdout: |2-
  
    Segmentation fault
  msg: |-
    MODULE FAILURE
    See stdout/stderr for the exact error
  rc: 139
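
The `rc: 139` follows the common shell convention of reporting 128 plus the signal number when a process is killed by a signal: 139 - 128 = 11, which is SIGSEGV on Linux. A small sketch of that decoding follows; the helper name is hypothetical.

```python
import signal


def describe_rc(rc):
    """Return the signal name encoded in a shell-style exit status, or None."""
    if rc > 128:
        try:
            return signal.Signals(rc - 128).name
        except ValueError:
            # rc - 128 is not a known signal number on this platform.
            return None
    return None


print(describe_rc(139))  # SIGSEGV
```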


Syslog on the Vagrant host shows this:

Nov 23 12:11:09 buster ansible-ansible.legacy.command: Invoked with _raw_params=apt show -a zabbix-agent2 _uses_shell=False warn=False stdin_add_newline=True strip_empty_ends=True argv=None chdir=None executable=None creates=None removes=None stdin=None
Nov 23 12:11:10 buster ansible-apt: Invoked with name=zabbix-agent2 state=absent package=['zabbix-agent2'] update_cache_retries=5 update_cache_retry_max_delay=12 cache_valid_time=0 purge=False force=False upgrade=no dpkg_options=force-confdef,force-confold autoremove=False autoclean=False fail_on_autoremove=False only_upgrade=False force_apt_get=False clean=False allow_unauthenticated=False allow_downgrade=False allow_change_held_packages=False lock_timeout=60 update_cache=None deb=None default_release=None install_recommends=None policy_rc_d=None
Nov 23 12:11:10 buster kernel: [ 1317.879948] python3[6320]: segfault at 0 ip 00007f71df9ca434 sp 00007fff65262f58 error 4 in apt_pkg.cpython-37m-x86_64-linux-gnu.so[7f71df9bf000+1e000]
Nov 23 12:11:10 buster kernel: [ 1317.880007] Code: 15 48 8b 57 28 48 03 82 98 00 00 00 48 89 c7 74 05 e9 70 54 ff ff 48 8d 3d ee a5 01 00 e9 64 54 ff ff 0f 1f 40 00 48 8b 47 20 <8b> 00 85 c0 74 16 48 8b 57 28 48 03 82 98 00 00 00 48 89 c7 74 06
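
The kernel line can be read as follows: the faulting address is 0 (usually a NULL pointer dereference), `error 4` on x86 denotes a user-mode read of an unmapped page, and the instruction pointer lies inside the `apt_pkg` extension module. A minimal sketch that extracts these fields and computes the instruction offset relative to the mapping base shown in brackets; the regular expression and helper name are assumptions made for illustration.

```python
import re

SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<error>\d+) in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Parse a kernel 'segfault at ...' message into a small dict, or None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    return {
        "fault_address": int(m["addr"], 16),
        "error_code": int(m["error"]),
        "object": m["obj"],
        # Offset of the faulting instruction relative to the mapping base.
        "offset_in_mapping": ip - base,
    }


line = ("python3[6320]: segfault at 0 ip 00007f71df9ca434 sp 00007fff65262f58 "
        "error 4 in apt_pkg.cpython-37m-x86_64-linux-gnu.so[7f71df9bf000+1e000]")
info = parse_segfault(line)
print(info["object"], hex(info["offset_in_mapping"]))
# apt_pkg.cpython-37m-x86_64-linux-gnu.so 0xb434
```

Such an offset can help locate the crashing function in the shared object with a disassembler, provided the same build of python-apt is inspected.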

Code of Conduct

  • I agree to follow the Ansible Code of Conduct
@ansibot
Contributor

ansibot commented Nov 23, 2022

Files identified in the description:

If these files are incorrect, please update the component name section of the description or use the !component bot command.


@ansibot added the affects_2.13, bug, module, and needs_triage labels on Nov 23, 2022
@k-c-p
Author

k-c-p commented Nov 24, 2022

Tested Ansible 7/2.14 today. Same result:

ansible-playbook [core 2.14.0]
  config file = /home/charly/eclipseprojects/samsagent/test/ansible.cfg
  configured module search path = ['/home/charly/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /opt/ansible/lib/python3.10/site-packages/ansible
  ansible collection location = /home/charly/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible-playbook
  python version = 3.10.8 (main, Nov  4 2022, 09:21:25) [GCC 12.2.0] (/opt/ansible/bin/python3)
  jinja version = 3.1.2
  libyaml = True
...

TASK [uninstall zabbix packages again] *****************************************
task path: /home/charly/eclipseprojects/samsagent/test/test.yml:102
Thursday 24 November 2022  08:09:54 +0100 (0:00:00.210)       0:00:30.002 *** 
fatal: [samsagentvagrant10]: FAILED! => changed=false 
  module_stderr: |-
    Shared connection to 127.0.0.1 closed.
  module_stdout: |-
    Segmentation fault
  msg: |-
    MODULE FAILURE
    See stdout/stderr for the exact error
  rc: 139

@sivel
Member

sivel commented Nov 28, 2022

Ok, after like 4 hours of troubleshooting this, I can confirm that this was introduced by 4a62c4e

The lines that cause the failure are:

    if version_cmp == "=" and not fnmatch.fnmatch(pkgver.ver_str, version):
        # Even though we put in a pin policy, it can be ignored if there is no
        # possible candidate.
        return None
    return pkgver.ver_str

Effectively, policy.get_candidate_ver is not returning None, but the apt_pkg.Version object that is returned is unusable. I have no idea why, and I cannot see a way to probe anything on the returned object without creating a segfault. In any case, simply touching pkgver.ver_str causes the segfault.

I'm not going to look any deeper into why the apt_pkg.Version object isn't usable, someone else is welcome to do so.

The dir() for the object looks like this, but touching any public attr/method (except the MULTI attrs) causes a segfault:

['MULTI_ARCH_ALL', 'MULTI_ARCH_ALLOWED', 'MULTI_ARCH_ALL_ALLOWED', 'MULTI_ARCH_ALL_FOREIGN',
'MULTI_ARCH_FOREIGN', 'MULTI_ARCH_NO', 'MULTI_ARCH_NONE', 'MULTI_ARCH_SAME', '__class__', '__delattr__',
'__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__',
'__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__',
'__subclasshook__', 'arch', 'depends_list', 'depends_list_str', 'downloadable', 'file_list', 'hash', 'id', 'installed_size',
'multi_arch', 'parent_pkg', 'priority', 'priority_str', 'provides_list', 'section', 'size', 'translated_description', 'ver_str']
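
Because a segfault kills the interpreter outright, it cannot be caught with `try`/`except`. One way to probe a suspect access without losing the calling process is to run it in a child interpreter and inspect how the child exited. The sketch below is POSIX-only and uses a stand-in that sends SIGSEGV to itself instead of real python-apt calls; the helper name is hypothetical.

```python
import signal
import subprocess
import sys


def run_probe(code):
    """Run `code` in a fresh interpreter; return the killing signal's name, or None."""
    proc = subprocess.run([sys.executable, "-c", code], capture_output=True)
    if proc.returncode < 0:
        # On POSIX, a negative return code -N means the child died from signal N.
        return signal.Signals(-proc.returncode).name
    return None


# Stand-in for the crashing attribute access: the child sends itself SIGSEGV.
crashing = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
print(run_probe(crashing))       # SIGSEGV
print(run_probe("print('ok')"))  # None
```

In a real investigation the probe string would contain the python-apt access under test, so each attribute could be checked in isolation.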

At a minimum, this seems to resolve the problem:

diff --git a/lib/ansible/modules/apt.py b/lib/ansible/modules/apt.py
index ba0aed050d..22207ec160 100644
--- a/lib/ansible/modules/apt.py
+++ b/lib/ansible/modules/apt.py
@@ -568,6 +568,9 @@ def package_status(m, pkgname, version_cmp, version, default_release, cache, sta
             # assume older version of python-apt is installed
             package_is_installed = pkg.isInstalled
 
+    if not package_is_installed and state == 'remove':
+        return False, None, None, False
+
     version_best = package_best_match(pkgname, version_cmp, version, default_release, cache._cache)
     version_is_installed = False
     version_installable = None
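
The effect of the patch can be illustrated with a simplified stand-in for the control flow. This is not the real `package_status` from `apt.py`: the function name, the callable parameter, and the second return value are hypothetical placeholders. The point is only that the early return skips the candidate lookup that crashes.

```python
def package_status_sketch(package_is_installed, state, find_best_version):
    """Simplified stand-in for the patched control flow (not the real apt.py code)."""
    # Early return added by the patch: the package is not installed and
    # should be removed, so there is nothing to do and no lookup is needed.
    if not package_is_installed and state == "remove":
        return False, None, None, False
    # Only reached for other cases; in the real module this is where
    # package_best_match() is called.
    version_best = find_best_version()
    # Placeholder result shape; the real function performs further checks.
    return package_is_installed, version_best, None, False


def crashing_lookup():
    # Stand-in for the lookup that segfaults in the real module.
    raise RuntimeError("stand-in for the segfaulting candidate lookup")


print(package_status_sketch(False, "remove", crashing_lookup))
# (False, None, None, False)
```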

However, I haven't taken the time to verify any negative side effects of this change, nor do I plan to.

At this point, this is available for anyone else who wants to take it on, to evaluate a fix and to look into the potential cause in more depth than I already have.

@sivel added the verified label on Nov 28, 2022
@k-c-p
Author

k-c-p commented Nov 29, 2022

I have tried the proposed fix on some of our Jessie and Stretch hosts that were affected by the issue. Looks good so far.

@mkrizek added the P3 label and removed the needs_triage label on Nov 29, 2022
@usrflo

usrflo commented Mar 14, 2023

Thank you @sivel, I confirm this fix works for Debian buster hosts.

@nikosch86

Works for me as well @sivel
Thanks!

@schonma

schonma commented May 9, 2023

When moving from Ansible 5/2.12 to 6/2.13

I can confirm
ansible [core 2.12.4] via pip/venv on Debian Bullseye: runs fine

ansible [core 2.13.2] via pip on Debian Bullseye: segfault on package removal
ansible [core 2.14.5] via pip/venv on Debian Bullseye: segfault on package removal

The fix works for me also @sivel thank you :)

@reshippie

Is there anything holding up merging this fix?
I can confirm that I'm still seeing this behavior on ansible [core 2.15.5]

@schonma

schonma commented Oct 23, 2023

On ansible 2.15.0 and 2.15.3 (pip) with Debian bookworm, I am no longer able to reproduce this behavior: everything works fine.

@0xphk

0xphk commented Dec 7, 2023

I'm on Manjaro and running against a variety of Debian/Ubuntu hosts. The same issue can be observed on Ubuntu 18.04.

The solution from @sivel works, so I'll stick with it for now. Thank you.

Systemwide Ansible

$ ansible --version
ansible [core 2.16.0]
  config file = None
  configured module search path = ['/home/tt/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.11/site-packages/ansible
  ansible collection location = /home/tt/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.11.6 (main, Nov 14 2023, 09:36:21) [GCC 13.2.1 20230801] (/usr/bin/python)
  jinja version = 3.1.2
  libyaml = True

venv/pip

$ ansible --version
ansible [core 2.15.5]
  config file = None
  configured module search path = ['/home/tt/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/tt/work/python3_venv/lib/python3.11/site-packages/ansible
  ansible collection location = /home/tt/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/tt/work/python3_venv/bin/ansible
  python version = 3.11.6 (main, Nov 14 2023, 09:36:21) [GCC 13.2.1 20230801] (/home/tt/work/python3_venv/bin/python3)
  jinja version = 3.1.2
  libyaml = True

The error is the same on both versions:

fatal: [redacted]: FAILED! => changed=false
  module_stderr: |-
    Shared connection to [redacted] closed.
  module_stdout: |-
    Segmentation fault (core dumped)
  msg: |-
    MODULE FAILURE
    See stdout/stderr for the exact error
  rc: 139

@ohdearaugustin

ohdearaugustin commented Jan 10, 2024

I can also confirm the issue with:

ansible [core 2.16.2]
  config file = None
  configured module search path = ['/home/<user_reacted>/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.11/site-packages/ansible
  ansible collection location = /home/<user_reacted>/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.11.6 (main, Nov 14 2023, 09:36:21) [GCC 13.2.1 20230801] (/usr/bin/python)
  jinja version = 3.1.2
  libyaml = True

I applied the role to a Debian Buster host.

The task fails with:

 - name: Apt - Uninstall base system tools (debian stretch and beyond)
   ansible.builtin.apt:
     name:
       - rgndg-rdrand
       - linux-tools
     state: absent

Error:

TASK [base : Apt - Uninstall base system tools (debian stretch and beyond)] *******************
fatal: [host]: FAILED! => changed=false
  module_stderr: |-
    Segmentation fault
  module_stdout: ''
  msg: |-
    MODULE FAILURE
    See stdout/stderr for the exact error
  rc: 139

The patch of @sivel also works for me so far (=

Edit: Add host
