This repository has been archived by the owner on Oct 30, 2018. It is now read-only.

ios_config - binascii.Error: Incorrect padding #5308

Closed
salman1485 opened this issue Oct 19, 2016 · 16 comments

Comments

@salman1485

salman1485 commented Oct 19, 2016

ISSUE TYPE
  • Bug Report
COMPONENT NAME

ios_config

ANSIBLE VERSION
ansible 2.2.0 (devel 17c0f52c96) last updated 2016/09/28 16:38:32 (GMT +000)
  lib/ansible/modules/core: (detached HEAD 0f505378c3) last updated 2016/09/23 13:51:09 (GMT +000)
  lib/ansible/modules/extras: (detached HEAD 1ade801f65) last updated 2016/09/23 13:51:25 (GMT +000)
  config file = {path_clipped}/ansible.cfg
  configured module search path = ['{path_clipped}/ansible/library']

Note - I know I'm running a month-old version of Ansible, but I can confirm I was already seeing this issue a month ago. So unless it was fixed within the last month, updating is unlikely to help.

CONFIGURATION

Mention any settings you have changed/added/removed in ansible.cfg
(or using the ANSIBLE_* environment variables).

[defaults]
log_path={log_file_path}
host_key_checking = False
OS / ENVIRONMENT

Running from RHEL 6.8 with remote devices on Cisco IOS.

SUMMARY

When the ios_config module is used to back up the running config of a remote device to the local server that runs the Ansible playbook, a few devices fail with an incorrect padding error while the rest work fine. The error goes away when the same playbook is run again.

I Googled around and found this can be a problem with the known_hosts file, so I set host_key_checking = False in ansible.cfg, but the issue persists. I can also tell you that even with host_key_checking disabled, I see the known_hosts file being updated when the playbook runs against new devices.

Note - I have seen cases where a playbook run for device XYZ succeeds, and a second execution of the same playbook on the same device then fails with the incorrect padding error. So even though known_hosts was updated during the first successful run, something still triggered the padding error on the second run, which suggests the issue is not the known_hosts file itself.

Lastly, I only see this issue when I run my playbook against multiple devices: a few fail with the incorrect padding error while the rest succeed, and a second run marks the previously failed devices as successful as well.

STEPS TO REPRODUCE

The reproduction has been described in detail above. One execution of the playbook below may fail with the incorrect padding error for a few devices out of many, while the next execution marks those devices successful.

The role below is called from site.yml:

---
- name: Backing up running config
  ios_config:
    timeout: 60
    backup: yes
    authorize: yes
    provider: "{{ <provider name> }}"

EXPECTED RESULTS

Here is the successful output of the same playbook for device XYZ (name changed):

{date/ID/PID and other detailed removed} |  ok: [XYZ] => {
    "backup_path": "{path clipped}",
    "changed": false,
    "invocation": {
        "module_args": {
            "after": null,
            "auth_pass": null,
            "authorize": true,
            "backup": true,
            "before": null,
            "config": null,
            "defaults": false,
            "force": false,
            "host": "XYZ",
            "lines": null,
            "match": "line",
            "parents": null,
            "password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
            "port": null,
            "provider": {
                "authorize": true,
                "host": "XYZ",
                "password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
                "transport": "{name clipped}",
                "username": "{name clipped}"
            },
            "replace": "line",
            "save": false,
            "src": null,
            "ssh_keyfile": null,
            "timeout": 60,
            "transport": "{name clipped}",
            "use_ssl": true,
            "username": "{name clipped}",
            "validate_certs": true
        }
    },
    "warnings": []
}
ACTUAL RESULTS

Here is the output with -vvvvv:

{details like pid etc clipped} |  An exception occurred during task execution. The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_5EnQ71/ansible_module_ios_config.py", line 363, in <module>
    main()
  File "/tmp/ansible_5EnQ71/ansible_module_ios_config.py", line 350, in main
    result['__backup__'] = module.config.get_config()
  File "/tmp/ansible_5EnQ71/ansible_modlib.zip/ansible/module_utils/network.py", line 125, in config
  File "/tmp/ansible_5EnQ71/ansible_modlib.zip/ansible/module_utils/network.py", line 147, in connect
  File "/tmp/ansible_5EnQ71/ansible_modlib.zip/ansible/module_utils/ios.py", line 180, in connect
  File "/tmp/ansible_5EnQ71/ansible_modlib.zip/ansible/module_utils/shell.py", line 228, in connect
  File "/tmp/ansible_5EnQ71/ansible_modlib.zip/ansible/module_utils/shell.py", line 82, in open
  File "{path clipped}ansible/lib/paramiko/client.py", line 173, in load_host_keys
    self._host_keys.load(filename)
  File "{path clipped}ansible/lib/paramiko/hostkeys.py", line 155, in load
    e = HostKeyEntry.from_line(line)
  File "{path clipped}/ansible/lib/paramiko/hostkeys.py", line 67, in from_line
    key = RSAKey(data=base64.decodestring(key))
  File "/usr/lib64/python2.6/base64.py", line 321, in decodestring
    return binascii.a2b_base64(s)
binascii.Error: Incorrect padding

{details like pid user etc clipped} |  fatal: [XYZ]: FAILED! => {
    "changed": false,
    "failed": true,
    "invocation": {
        "module_args": {
            "authorize": true,
            "backup": true,
            "provider": {
                "authorize": true,
                "host": "XYZ",
                "password": " ",
                "transport": "{details clipped}",
                "username": "{details clipped}"
            },
            "timeout": 60
        },
        "module_name": "ios_config"
    },
    "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_5EnQ71/ansible_module_ios_config.py\", line 363, in <module>\n    main()\n  File \"/tmp/ansible_5EnQ71/ansible_module_ios_config.py\", line 350, in main\n    result['__backup__'] = module.config.get_config()\n  File \"/tmp/ansible_5EnQ71/ansible_modlib.zip/ansible/module_utils/network.py\", line 125, in config\n  File \"/tmp/ansible_5EnQ71/ansible_modlib.zip/ansible/module_utils/network.py\", line 147, in connect\n  File \"/tmp/ansible_5EnQ71/ansible_modlib.zip/ansible/module_utils/ios.py\", line 180, in connect\n  File \"/tmp/ansible_5EnQ71/ansible_modlib.zip/ansible/module_utils/shell.py\", line 228, in connect\n  File \"/tmp/ansible_5EnQ71/ansible_modlib.zip/ansible/module_utils/shell.py\", line 82, in open\n  File \"{details clipped}ansible/lib/paramiko/client.py\", line 173, in load_host_keys\n    self._host_keys.load(filename)\n  File \"{details clipped}ansible/lib/paramiko/hostkeys.py\", line 155, in load\n    e = HostKeyEntry.from_line(line)\n  File \"{details clipped}/ansible/lib/paramiko/hostkeys.py\", line 67, in from_line\n    key = RSAKey(data=base64.decodestring(key))\n  File \"/usr/lib64/python2.6/base64.py\", line 321, in decodestring\n    return binascii.a2b_base64(s)\nbinascii.Error: Incorrect padding\n",
    "module_stdout": "",
    "msg": "MODULE FAILURE"
}
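For what it's worth, the final error in the traceback is pure base64 behavior: decoding a blob whose number of data characters is not a multiple of four (e.g. a known_hosts key field whose write was cut short) raises exactly this exception. A self-contained reproduction, using made-up data rather than a real host key:

```python
import base64
import binascii

# A syntactically valid base64 blob, standing in for the key field of a
# known_hosts entry (hypothetical data, not a real host key).
blob = base64.b64encode(b"ssh-rsa hypothetical key material").decode()
assert base64.b64decode(blob)  # the intact blob decodes cleanly

# Simulate a write that was cut off mid-entry: the leftover length is no
# longer a multiple of 4, which is exactly what a2b_base64 rejects.
truncated = blob[:-2]
try:
    base64.b64decode(truncated)
    error = None
except binascii.Error as exc:
    error = str(exc)

print(error)  # → Incorrect padding
```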

@ansibot

ansibot commented Oct 19, 2016

@privateip, @gundalow, ping. This issue is waiting on your response.
click here for bot help

@salman1485
Author

Just to let you all know, I also added the two lines below to ansible.cfg (test environment), but the known_hosts file still gets updated when the playbook is executed.

[paramiko_connection]
record_host_keys=False

I see this issue in production only when the playbook is executed against multiple devices, so I cannot really apply the above setting in the production ansible.cfg right away. But I'll be more than happy to run some tests if it helps pin down why this fails randomly and then succeeds later.

@salman1485
Author

Just checking whether anyone has had a chance to look at this. We are using Ansible to update hundreds of IOS devices at a time, and this random failure means human intervention: we have to pick up the failed devices from the *.retry file and re-run the playbooks. It really is frustrating.

Any help would be appreciated.

@salman1485
Author

@bdowling @jean-christophe-manciot

It looks like you have both been using the IOS modules recently. Have you ever seen this issue?

@Qalthos
Contributor

Qalthos commented Oct 26, 2016

First off, networking modules behave a little differently from normal Ansible modules. One of the ways they differ is that they ignore the host key checking flags. That is a bug, but not really the one we're here for; I'm just letting you know why those settings don't seem to be doing anything.

This does seem like some issue with known_hosts being messed up somewhere along the way. I'm willing to believe that paramiko is at fault here, but I've not seen this particular issue before. Is there anything else that could be writing to known_hosts? Does it still fail if known_hosts is missing?
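One way to check for that corruption is to run the same base64 decode that paramiko's HostKeyEntry.from_line performs over every entry in the file. A minimal sketch (find_corrupt_entries is a hypothetical helper, not part of paramiko or Ansible):

```python
import base64
import binascii
import os

def find_corrupt_entries(path=os.path.expanduser("~/.ssh/known_hosts")):
    """Report (line number, error) pairs for entries whose key field fails
    the same base64 decode paramiko performs when loading known_hosts."""
    bad = []
    with open(path) as fh:
        for lineno, line in enumerate(fh, 1):
            fields = line.split()
            if len(fields) < 3:
                continue  # blank, comment, or obviously truncated line
            try:
                base64.b64decode(fields[2])  # field 3 is the key blob
            except binascii.Error as exc:
                bad.append((lineno, str(exc)))
    return bad
```

Running this right after a failed play should point at the entry (and the host) whose write was interrupted.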

@bdowling
Contributor

@AnsiNoob - I'm still in the testing phase with Ansible for network gear, but I just ran a test across a handful of devices (devel branch). I saw this trigger with 2 targets, and more frequently with more than 10.

As Qalthos pointed out, the networking modules appear to use the paramiko client directly instead of the paramiko_ssh wrapper that Ansible uses for most activities. I'm not clear on the history of this.

Essentially, the bug is in the paramiko client: the save_host_keys function performs no locking, nor even an atomic rename (it overwrites known_hosts directly), which leads to overlapping writes and corruption of the host keys file. At the end of a run against two hosts, for example, both may be missing from known_hosts.
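The usual fix for this class of bug is to write to a temporary file in the same directory and then rename it over the target; rename is atomic on POSIX filesystems, so readers never observe a half-written known_hosts. A minimal sketch of the pattern (not the actual paramiko patch; save_atomically is a hypothetical helper):

```python
import os
import tempfile

def save_atomically(path, lines):
    """Write a known_hosts-style file without ever exposing a partial file:
    write to a temp file in the same directory, then rename over the target."""
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname, prefix=".known_hosts.")
    try:
        with os.fdopen(fd, "w") as fh:
            fh.writelines(line if line.endswith("\n") else line + "\n"
                          for line in lines)
        os.replace(tmp, path)  # atomic swap; no half-written window
    except BaseException:
        os.unlink(tmp)  # clean up the temp file on any failure
        raise
```

Note that this only prevents torn reads: two concurrent writers can still silently drop each other's entries (last rename wins), so file locking would be needed on top of it to make parallel runs fully safe.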

@bdowling
Contributor

Added an upstream issue for reference: paramiko/paramiko#835

@salman1485
Author

salman1485 commented Oct 28, 2016

Thanks for pitching in, guys. Unfortunately, this bug makes the module not very usable in production, since Ansible in production is usually run against multiple devices. With a few failing randomly, every execution needs manual intervention, and that is not good.

Let's see when a fix can be found.

I'm going through the code myself, but I'm very new to all of this.

@Qalthos
Contributor

Qalthos commented Oct 28, 2016

A proper fix is in the works for 2.3, but in the meantime I have returned the key-checking behavior to what it was in 2.1: neither checking nor writing host key files. Let me know if this solves your problem.

@bdowling
Contributor

bdowling commented Oct 28, 2016

Are there any thoughts on implementing the equivalent of ssh-keyscan within Ansible, to allow and encourage people to do stronger verification of host keys in their Ansible deployments? E.g. so that known_hosts is only ever updated by an admin (or at least as a separate, auditable task).

Savvy users could certainly do this outside of Ansible, but given that Ansible manages the host list for playbooks, and the different people working with it know when their keys change, it seems like a good place to do it. I'm not sure if there is an Ansible command that could just dump the target hostnames of plays so they could be passed to ssh-keyscan -f or some such.

Not implementing key checking in a tool that automatically sends usernames and passwords to thousands of hosts at a time is a bit scary </tin-hat off>.

@gundalow
Contributor

@bdowling interesting idea. To avoid mixing bug reports with feature enhancements, could you please raise your idea on the Ansible mailing list? Thanks.

@ansibot

ansibot commented Nov 3, 2016

@privateip, @gundalow, ping. This issue is still waiting on your response.
click here for bot help

@salman1485
Author

I'm planning to test the change in few days. I'll let everyone know if it goes in fine.

@ansibot

ansibot commented Nov 18, 2016

@privateip, @gundalow, @Qalthos, ping. This issue is still waiting on your response.
click here for bot help

@ogenstad
Contributor

A workaround is to run the playbook the first time with ansible-playbook -f 1 to set the forks to 1. That way the ~/.ssh/known_hosts file is written correctly.

I also found another issue when the known_hosts file doesn't exist at all. Are any of these fixed in the devel branch now?
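Setting forks to 1 works because it serializes the writes to known_hosts. The same effect could be had in-process by taking an advisory lock around each write; a stdlib sketch of the idea (not Ansible's actual fix; append_known_host is a hypothetical helper, POSIX-only because of fcntl):

```python
import fcntl
import os

def append_known_host(path, entry):
    """Append a known_hosts entry under an exclusive flock, so concurrent
    forks serialize their writes instead of clobbering each other."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)   # blocks until other writers finish
        os.write(fd, (entry.rstrip("\n") + "\n").encode())
        fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)  # closing also releases the lock if still held
```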

@ansibot

ansibot commented Dec 7, 2016

This repository has been locked. All new issues and pull requests should be filed in https://github.com/ansible/ansible

Please read through the repomerge page in the dev guide. The guide contains links to tools which automatically move your issue or pull request to the ansible/ansible repo.

Qalthos closed this as completed Feb 15, 2017