Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get_url does not handle CHECKSUMS files from URL unless the file contains a single checksum #48790

Closed
pcfe opened this issue Nov 16, 2018 · 8 comments · Fixed by #53685
Closed

get_url does not handle CHECKSUMS files from URL unless the file contains a single checksum #48790

pcfe opened this issue Nov 16, 2018 · 8 comments · Fixed by #53685

Comments

@pcfe
Copy link

@pcfe pcfe commented Nov 16, 2018

SUMMARY

get_url seems to only handles a downloaded checksums file if that file contains a single checksum.

ISSUE TYPE
  • Bug Report
COMPONENT NAME

get_url

ANSIBLE VERSION
ansible 2.7.1
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/remote/pernzer/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 3.7.1 (default, Nov  5 2018, 14:07:04) [GCC 8.2.1 20181011 (Red Hat 8.2.1-4)]

from ansible-doc get_url

        METADATA:
          status:
          - stableinterface
          supported_by: core
CONFIGURATION
ANSIBLE_NOCOWS(env: ANSIBLE_NOCOWS) = True
OS / ENVIRONMENT

Fedora 29

STEPS TO REPRODUCE
  1. be extremely happy that get_url allows checksums to avoid unnecessary downloads ( #27617)
  2. point at any projects checksums file which will
    1. most probably contain multiple line
    2. on security conscious projects be gpg signed as well
  3. run playbook

relevant variable

tower_bundle_checksum: "sha256:https://releases.ansible.com/ansible-tower/setup-bundle/ansible-tower-setup-bundle-CHECKSUM"

task

- name: download the tower bundle {{ tower_url }}
  get_url:
    url: "{{ tower_url }}"
    dest: "{{ tower_working_dir }}"
    checksum: "{{ tower_bundle_checksum }}"
EXPECTED RESULTS

Point at a project's CHECKSUMS file URL so that I avoid downloading the file if I already have a known good copy locally.

ACTUAL RESULTS

If the checksums file is gpg signed, I get

FAILED! => {"changed": false, "msg": "The checksum parameter has to be in format <algorithm>:<checksum>"}

If I put the checksums file on a ftp server of mine and remove the gpg parts, I get

FAILED! => {"changed": false, "module_stderr": "Shared connection to test.internal.pcfe.net closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n  File \"/home/ansible/.ansible/tmp/ansible-tmp-1542361761.7009695-40128962622448/AnsiballZ_get_url.py\", line 113, in <module>\r\n    _ansiballz_main()\r\n  File \"/home/ansible/.ansible/tmp/ansible-tmp-1542361761.7009695-40128962622448/AnsiballZ_get_url.py\", line 105, in _ansiballz_main\r\n    invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\r\n  File \"/home/ansible/.ansible/tmp/ansible-tmp-1542361761.7009695-40128962622448/AnsiballZ_get_url.py\", line 48, in invoke_module\r\n    imp.load_module('__main__', mod, module, MOD_DESC)\r\n  File \"/tmp/ansible_get_url_payload_oCXwt5/__main__.py\", line 607, in <module>\r\n  File \"/tmp/ansible_get_url_payload_oCXwt5/__main__.py\", line 465, in main\r\nTypeError: fail_json() takes exactly 1 argument (2 given)\r\n", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}

Only if I reduce the checksums file to a single line I get success.

Now obviously I normally have no control over an upstream checksums file and would really like to be able to use one like for example at https://releases.ansible.com/ansible-tower/setup-bundle/ansible-tower-setup-bundle-CHECKSUM as most checksums files I use when manually checking are in that format (gpg signed an with multiple checksums).

@ansibot

This comment has been minimized.

Copy link
Contributor

@ansibot ansibot commented Nov 16, 2018

Hi @pcfe, thank you for submitting this issue!

click here for bot help

@ansibot

This comment has been minimized.

Copy link
Contributor

@ansibot ansibot commented Nov 16, 2018

Files identified in the description:

If these files are inaccurate, please update the component name section of the description or use the !component bot command.

click here for bot help

@ansibot

This comment has been minimized.

Copy link
Contributor

@ansibot ansibot commented Nov 16, 2018

@pcfe

This comment has been minimized.

Copy link
Author

@pcfe pcfe commented Nov 16, 2018

Note: Alberto had a look and found this;

changing to: lines = dict(s.split(None, 1) for s in lines if len(s.split(None,1))>1) will be able to create the dictionary and won't jump to the exception

code is searching for lines with two words (filename, checksum) and the ansible's CHECKSUM file has lines with one word only

@sivel sivel removed the needs_triage label Nov 20, 2018
@abadger

This comment has been minimized.

Copy link
Member

@abadger abadger commented Nov 20, 2018

We're not going to add knowledge of formats for checksum files to get_url. get_url's parameter command is for passing in a checksum, not for extracting a checksum from a file.

The playbook author should have the knowledge of the checksum file's formats and can use jinja2 to extract the checksum from an upstream file if they wish. For example, extracting the checksum for the tower bundled installer can be done like this:

---
- hosts: localhost
  gather_facts: False

  tasks:
  - get_url:
      url: https://releases.ansible.com/ansible-tower/setup-bundle/ansible-tower-setup-bundle-CHECKSUM
      dest: ~/tower_checksum

  - get_url:
      url: https://releases.ansible.com/ansible-tower/setup-bundle/ansible-tower-setup-bundle-latest.el7.tar.gz
      dest: ~/ansible-tower-setup-bundle-latest.el7.tar.gz
      checksum: 'sha256sum:{{ tower_checksums[0].split()[0] }}'
    vars:
      tower_checksums: '{{ lookup("file", "~/tower_checksum").splitlines() | select("match", ".*ansible-tower-setup-bundle-latest.el7.tar.gz$") | list }}'
@abadger abadger closed this Nov 20, 2018
sivel added a commit to sivel/ansible that referenced this issue Mar 12, 2019
sivel added a commit to sivel/ansible that referenced this issue Mar 12, 2019
* Fix checksum file parsing. Fixes ansible#48790

* guard invalid int conversion

Co-Authored-By: sivel <matt@sivel.net>

* Remove extra newline.
(cherry picked from commit 77217fd)

Co-authored-by: Matt Martz <matt@sivel.net>
@sivel

This comment has been minimized.

Copy link
Member

@sivel sivel commented Mar 12, 2019

We did eventually find that the conclusion was incorrect about only supporting a single checksum in the file.

There was overzealous exception handling that was obscuring the actual issue.

As such, this has been fixed in #53685 and a backport for 2.7 has been submitted as well.

@likg

This comment has been minimized.

Copy link

@likg likg commented Mar 15, 2019

In my opinion checksum parsing should be implemented as a part of Ansible core for modules to reuse it. Checksum can be defined in-place(string/variable value), URL (http[s]?://,ftp://) or in file (file://) and has different format:

BSD-style checksum:
SHA256 (file1) = <checksum>
SHA256 (file2) = <checksum>

GNU-style:
<checksum>  file1
<checksum> *file2

Nowadays majority of popular distros provide sha256sum.txt file (Ubuntu/Debian/Centos to name some of them), same applies to cloud providers (Amazon VM imges).

Workflow "register/lookup/jinja2" looks extremely awful and error prone.
PS. checksum.go

abadger added a commit that referenced this issue Mar 18, 2019
* [stable-2.7] Fix checksum file parsing in get_url (#53685)

* Fix checksum file parsing. Fixes #48790

* guard invalid int conversion

Co-Authored-By: sivel <matt@sivel.net>

* Remove extra newline.
(cherry picked from commit 77217fd)

Co-authored-by: Matt Martz <matt@sivel.net>

* Remove use of undefined variable
@DanHam

This comment has been minimized.

Copy link

@DanHam DanHam commented Mar 21, 2019

@sivel Unfortunately get_url (even with the fix implemented in #53685) doesn't support the simplest case where a checksum file contains just the checksum. Clearly, in this instance, there is no need to parse out the correct checksum. Here the checksum makes up the entirety of the file contents and all that is needed is to simply read the file.

Simplest example that demonstrates this:

- name: Demo error with plain checksum file
  get_url:
    checksum: sha512:https://dl.k8s.io/v1.13.0/bin/linux/amd64/kube-scheduler.sha512
    dest: /usr/bin/kube-scheduler
    mode: 0755
    url: https://dl.k8s.io/v1.13.0/bin/linux/amd64/kube-scheduler.sha512

The error is:

"msg": "The checksum parameter has to be in format <algorithm>:<checksum>"

For reference, the contents of the checksum file is as follows:

f400eedaef4aa277ba9ffbc17d1937fe2e200f5b4886930ae2c34e5a1a4ee14aee5d26c45b3babf7e791ba292787c950f55f4b9a32294472c73ec7b62af45858

Clearly, it's fairly trivial to work around this as per #48790 (comment). However, if get_url can handle the more difficult case of parsing out a checksum from a file, it would be great if it is able to handle the simplest of cases as well.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
6 participants
You can’t perform that action at this time.