"SSHException: Error reading SSH protocol banner" when using ProxyCommand #673

Closed
depado opened this issue Feb 3, 2016 · 33 comments

@depado

depado commented Feb 3, 2016

Hello,

It's been a few days and I'm still struggling with this. I think it's a fairly well-known issue, but I wasn't able to find a workaround.

Paramiko 1.16.0
Python 3.5.1
Operating system: Arch Linux

Below is a simplified version of my actual code that throws the same error:

import os
import paramiko

# Loading ssh configuration to get the IP and user of the desired host (here 'bastion')
cfg = paramiko.SSHConfig()
with open(os.path.expanduser("~/.ssh/config")) as f:
    cfg.parse(f)
host_cfg = cfg.lookup('bastion')
sock = paramiko.ProxyCommand("ssh -W %h:%p {}@{}".format(host_cfg.get('user', 'root'), host_cfg.get('hostname')))

# Client Setup
client = paramiko.SSHClient()
client.load_system_host_keys()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())

# Connect and execute command
client.connect("my.ip.ad.dr", username='root', sock=sock)
(stdin, stdout, stderr) = client.exec_command("echo 'Hello World !'")
for line in stdout.readlines():
    print(line)
client.close()

Note that the ssh config parsing is simplified here because I know this entry exists in my ssh config. (And yes, I'm sure the error doesn't come from that, because the generated ProxyCommand is correct.)

The error is raised when executing the client.connect line. The ProxyCommand is correct: I tested it multiple times and it works just fine in my ~/.ssh/config. When I use it from the command line, it creates an entry in the logs of my bastion. When I use it within paramiko, it doesn't generate an entry in the logs.

I also tested using the netcat approach like this :

sock = paramiko.ProxyCommand("ssh {}@{} nc %h %p".format(host_cfg.get('user', 'root'), host_cfg.get('hostname')))

This time it generates an entry in the logs of my bastion (even though it still raises this error), but the connection is closed immediately.

Is anyone having the same issue who could help me with that?

@ktbyers
Contributor

ktbyers commented Feb 3, 2016

@depado I am having a similar issue connecting to Cisco devices through a proxy. In other words using ProxyCommand and I am receiving "SSHException: Error reading SSH protocol banner".

Does it fix your problem if you do the following (to see if we are having the same issue):

In transport.py, at line 486, add a short delay (this is using paramiko 1.16.0):

        # delay starting thread for SSH proxies
        event.wait(0.2)                     # Added this delay
        self.start()

This fixes my issue.

@depado
Author

depado commented Feb 4, 2016

@ktbyers, I gave your solution a try but that doesn't seem to solve my problem. Thanks to pkapp on IRC I was able to debug a bit further what's going on.

I started by activating the debug logs, but unfortunately paramiko isn't very chatty about what it does under the hood.

import logging
logging.basicConfig(level=logging.DEBUG)

These are the only things paramiko sends me back before throwing the traceback at me.

DEBUG:paramiko.transport:starting thread (client mode): 0xb4c74668
DEBUG:paramiko.transport:Local version/idstring: SSH-2.0-paramiko_1.16.0
ERROR:paramiko.transport:Exception: Error reading SSH protocol banner
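
If the root-level basicConfig call is too noisy in a larger program, a narrower variant is possible. This is a minimal sketch that only uses the standard logging module and assumes paramiko's loggers live under the "paramiko" namespace, as the paramiko.transport lines above suggest. The helper name enable_paramiko_debug is hypothetical.

```python
import logging


def enable_paramiko_debug(stream=None):
    """Turn on DEBUG output for the 'paramiko' logger tree only.

    Hypothetical helper: it leaves the root logger and other libraries alone.
    """
    handler = logging.StreamHandler(stream)  # defaults to stderr
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    logger = logging.getLogger("paramiko")
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    return logger


# Usage: call once before creating the SSHClient.
# enable_paramiko_debug()
```

Child loggers such as paramiko.transport inherit the level from the parent, so they do not need to be configured one by one.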

I also learned that the %h and %p tokens won't be automatically expanded by paramiko when passed as a string to a ProxyCommand. (It does seem to work on my system, but that may be the problem.) Also, the nc approach looks like it works better than the OpenSSH -W flag. So my actual ProxyCommand then looked like this:

cmd = "ssh {}@{} nc {} 22".format(host_cfg.get('user'), host_cfg.get('hostname'), destination_ip)
# cmd is now "ssh root@jump_ip nc dest_ip 22" where jump_ip and dest_ip are valid IPs
sock = ProxyCommand(cmd)

Still getting the same error though, so the problem didn't come from there. I added a time.sleep right after the call to ProxyCommand, then checked my proxy's logs and the stdout and stderr returned by the subprocess like this:

sock = ProxyCommand(cmd)
print(sock.process.poll())
print(sock.process.stdout.read())
print(sock.process.stderr.read())

This code yields the following output:

None
b'SSH-2.0-OpenSSH_6.7p1 Debian-5+deb8u1\r\n'
b''

While the time.sleep or the reading of stdout/stderr is active, the connection on my proxy is kept open. (Of course I removed these before going any further, because I'm not supposed to read directly from the process.) What I don't understand is why the _check_banner function fails although the stdout of the socket clearly begins with SSH-...
On the other hand, as soon as client.connect(...) is called, the connection is immediately destroyed on my proxy. I now need a way to investigate why the connection fails this way.

For those who want more information, here is the line in paramiko that causes the error: transport.py:1858

(Thanks again pkapp for all the help on IRC o/)
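
On the %h / %p point in this comment: since the string handed to ProxyCommand is reported not to be expanded, one option is to substitute the tokens before building the command. This is a minimal sketch; expand_proxy_tokens is a hypothetical helper, it only handles %h and %p (not other OpenSSH tokens such as %r or an escaped %%), and the host names are placeholders.

```python
def expand_proxy_tokens(template, host, port=22):
    """Replace the %h and %p tokens in a ProxyCommand template.

    Hypothetical helper: only %h (target host) and %p (target port)
    are handled; any other token is left untouched.
    """
    return template.replace("%h", host).replace("%p", str(port))


# Usage (the resulting string is what would be passed to paramiko.ProxyCommand):
# cmd = expand_proxy_tokens("ssh -W %h:%p root@bastion", "10.0.0.5")
# sock = paramiko.ProxyCommand(cmd)
```

Expanding the tokens explicitly also makes it easy to print the final command and run it by hand when debugging.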

@radssh
Contributor

radssh commented Feb 5, 2016

Able to reproduce under Python 3 with a program that works fine under Python 2. I'm hacking around a bit with proxy.py to find out what the underlying issue really is, and will post findings if/when I can get some better details.

@ktbyers
Contributor

ktbyers commented Feb 5, 2016

@radssh @depado I saw similar behavior. I could get @depado code to work under Python2.7 (with minor modifications to the proxy command), but saw 'Error reading SSH protocol banner' message when testing with Python3.4.

@depado
Author

depado commented Feb 5, 2016

Well we're getting to the bottom of this. Hope you can sort this out ! Thanks for all the help on IRC @radssh !

@radssh
Contributor

radssh commented Feb 7, 2016

PR #681 submitted.

With Python 3's switch to io.BufferedReader, the select call here doesn't indicate that buffered data is ready to be read; it only reports when new data has arrived. Changing to unbuffered pipes wound up breaking under Python 2, since the method's own buffering code was a bit wonky to begin with.
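
The buffering behaviour described here can be reproduced without paramiko. The sketch below is not the code from the PR; it is a standalone illustration that assumes a POSIX system (select on pipes) and uses a made-up banner string. It wraps the read end of a pipe in a buffered reader, reads one byte, and then asks select whether more data is ready.

```python
import os
import select


def buffered_select_demo():
    """Show that select() cannot see bytes already held in a BufferedReader."""
    read_fd, write_fd = os.pipe()
    reader = os.fdopen(read_fd, "rb")  # an io.BufferedReader in Python 3
    try:
        os.write(write_fd, b"SSH-2.0-Example\r\n")
        # Asking for one byte makes the reader pull everything that is
        # available out of the pipe and into its internal buffer.
        first = reader.read(1)
        # The pipe itself is now empty, so select() reports nothing ready.
        ready, _, _ = select.select([reader], [], [], 0)
        # The rest of the banner is still sitting in the reader's buffer.
        leftover = reader.peek()
        return first, ready, leftover
    finally:
        reader.close()
        os.close(write_fd)


# Usage:
# first, ready, leftover = buffered_select_demo()
```

A loop that reads a little and then waits on select before reading again can therefore stall even though the whole banner has already arrived, which matches the symptom in this issue.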

@depado
Author

depado commented Feb 8, 2016

Very nice ! Thanks a lot @radssh !

@kryptek

kryptek commented Feb 17, 2016

I'm having an issue with the SSH banner as well. I'm using port forwarding in my ProxyCommand and paramiko doesn't seem to like it: it kills the connection with the same exception. The connection opens just fine, and if I set the banner_timeout to a large value, I can connect to localhost:PORT and do whatever I need to do until the SSHException is raised.

Is there any hope for me?

@depado
Author

depado commented Feb 18, 2016

Hi @kryptek maybe you can try the fix provided by @radssh ? I didn't have the time to test it yet, but it looks like it fixed the problem I had which was caused by Python3.

@bitprophet
Member

We have a whole passel of other issues relating to this (common and often covering unrelated problems, YAY PROGRAMMING!) error, FTR. I don't have time right now to go dig them all up, but if others want to do so & link them here, that'd be super appreciated. Would love to either merge some dusty PRs or otherwise have someone sleuth up a better way to surface these.

@depado
Author

depado commented Feb 20, 2016

@bitprophet, @radssh opened #681
Looks like it is related to a Python 3.x problem.

@akulakhan

@radssh's patch (manually inserted) fixed my issue with ncclient, where I am using ProxyCommand and was receiving "paramiko.ssh_exception.SSHException: Error reading SSH protocol banner" (with debug-level logging enabled) and "ncclient.transport.errors.SSHError: Negotiation failed".

@lindycoder

lindycoder commented Jun 1, 2016

@radssh +1, the patch was successful here too!

But it only works for Python 3.3 and 3.4, which is why the checks fail in the pull request. If I manage to cook up a cross-version compatible one, I'll bring it back here.

@bitprophet
Member

Have just merged the related PR, will release momentarily (1.16+).

@depado
Author

depado commented Jul 26, 2016

Amazing, thanks :)

@TomCos

TomCos commented Nov 7, 2018

For what it's worth, I've had increasing success increasing the banner_timeout when using a proxy, though I'm not 100% convinced that's the issue with this one, just wanted it written down for people also having this issue. 15 seconds is a long time, so I'm not sold, but ya, try increasing your banner_timeout.
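
For readers who want to try this, banner_timeout can be passed as a keyword argument to SSHClient.connect. The sketch below is only an illustration under assumptions: the helper names and the 60 second value are hypothetical, and paramiko is imported lazily so the first function can be used on its own.

```python
def build_connect_kwargs(username, sock=None, banner_timeout=60.0):
    """Collect keyword arguments for SSHClient.connect.

    Hypothetical helper; the 60 second default is an arbitrary example value.
    """
    kwargs = {"username": username, "banner_timeout": banner_timeout}
    if sock is not None:
        kwargs["sock"] = sock
    return kwargs


def connect_via_proxy(hostname, proxy_command, username):
    """Open an SSHClient through a proxy command with a longer banner timeout."""
    import paramiko  # imported here so build_connect_kwargs has no dependency

    client = paramiko.SSHClient()
    client.load_system_host_keys()
    sock = paramiko.ProxyCommand(proxy_command)
    client.connect(hostname, **build_connect_kwargs(username, sock=sock))
    return client


# Usage (placeholder host names):
# client = connect_via_proxy("10.0.0.5", "ssh -W 10.0.0.5:22 root@bastion", "root")
```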

@abhiypathak

@ktbyers thank you for your help; your fix worked for me. I am using paramiko with corkscrew to tunnel out via a squid proxy to a remote SFTP server and was facing this issue. Putting the delay before lines 576 and 582 of transport.py resolved my issue.

rwalton-arm added a commit to PelionIoT/mbl-cli that referenced this issue Mar 8, 2019
* Attempt workaround for paramiko/paramiko#673
Add a delay and attempt to connect again when an SSHException occurs
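
The "delay and retry" idea in that commit message can be sketched generically. This is not the code from the referenced commit; call_with_retry is a hypothetical helper, and the attempt count and delay are arbitrary example values.

```python
import time


def call_with_retry(action, retry_on, attempts=3, delay=1.0):
    """Call action(); if it raises retry_on, wait and try again.

    Hypothetical helper: re-raises the last error once attempts run out.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)  # give the proxy a moment before retrying
    raise last_error


# Usage with paramiko (a fresh ProxyCommand is assumed to be needed per attempt):
# call_with_retry(
#     lambda: client.connect(host, username="root", sock=paramiko.ProxyCommand(cmd)),
#     paramiko.SSHException,
# )
```

Retrying hides the symptom rather than fixing the cause, so it is best treated as a stopgap.
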
@bronek123

@ktbyers, @radssh, I read your information here about problem #673. I have the same problem with Python 3.5 and a FileZilla server. I've added all the delays, but the data is still not read from the buffer. How should I use io.BufferedReader?

@bir87

bir87 commented Apr 27, 2020

I have a similar issue. Is a fix already in place? Any workarounds?
"Error reading SSH protocol banner" + str(e)
paramiko.ssh_exception.SSHException: Error reading SSH protocol banner

@ktbyers @abhiypathak what was the solution?

@ktbyers
Contributor

ktbyers commented Apr 27, 2020

@bir87 If it is a network device, increasing the banner_timeout will frequently help with this issue.

@RanFirstTry

@ktbyers I've been having a similar problem with ProxyCommand while trying to SSH into a Cisco IOS device through a bastion host. No matter what I try, I get an "Error reading SSH protocol banner" when running an Ansible playbook, and I can't get paramiko to honor the ProxyCommand / ssh configs. Running ssh -o ProxyCommand="ssh -W %h:%p @ -p " <device_username>@<device_ip> works just fine, but paramiko doesn't seem to honor any ssh.cfg settings or the ansible_ssh_common_args set in host files.

@sreekaanth

@ktbyers I am also having this issue intermittently when connecting to Cisco devices through a proxy, i.e. a Linux server in front of the device. I have increased the banner timeout to 100, yet I am still getting these errors intermittently.

@ktbyers
Contributor

ktbyers commented Apr 29, 2020

@sreekaanth Yes, the fix in Netmiko is almost always banner_timeout (in the last year or so). If you are using an SSH proxy and increasing the banner timeout doesn't work, then I don't know the answer there.

@vparames86

@ktbyers I am getting the same issue while trying to run an Ansible playbook via a bastion server to a Cisco CSR device. I am using Ansible version 2.9.7.
fatal: [csr1]: FAILED! => {
"changed": false,
"msg": "Error reading SSH protocol banner"
}

@FloLaco

FloLaco commented Jul 20, 2020

@vparames86 @ktbyers Same issue with ansible playbook, bastion and Juniper (PyEZ)

@ktbyers
Contributor

ktbyers commented Jul 20, 2020

@FloLaco Are you increasing the banner_timeout?

@FloLaco

FloLaco commented Jul 21, 2020

@ktbyers My bad, I found my issue last night. The private key didn't have the proper permissions (chmod 400 vs 644), so the connection timed out. Unfortunately, the log message (Error reading SSH protocol banner) was not obvious for what was a simple connection timeout caused by an issue with the RSA key.
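
A quick way to rule out this cause before digging into paramiko is to check the key file's mode. This is a minimal sketch for POSIX systems; key_is_private is a hypothetical helper and the key path in the usage comment is a placeholder.

```python
import os
import stat


def key_is_private(path):
    """Return True if the file grants no permissions to group or others.

    Hypothetical helper: mode 400 or 600 passes, mode 644 does not.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


# Usage:
# if not key_is_private(os.path.expanduser("~/.ssh/id_rsa")):
#     print("private key is readable by group/others; try chmod 400")
```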

@venu-devannagari

venu-devannagari commented Jul 22, 2020

Hi @FloLaco, if the playbook executes for you, can you please show what your hosts file, proxy command, and playbook look like, with some sample values? I am not able to reach a network device through ProxyCommand. Thanks in advance.

@FloLaco

FloLaco commented Jul 22, 2020

Playbook is basic, no changes here :

---
- name: Get equipments states
  hosts: physical_eqt
  connection: local
  gather_facts: no
  roles:
    - Juniper.junos
  tasks:
    - name: Get Config with a read-only user
      juniper_junos_command:
        provider: "{{CREDENTIALS}}"
        command: "show configuration | display set"
        dest: "./config/{{ inventory_hostname }}.config.output"

ansible.cfg :

[defaults]
local_tmp = /tmp/autobackup
remote_tmp = /tmp/autobackup
action_warnings = False
inventory = hosts
roles_path = /etc/ansible/roles:./:/root/.ansible/roles/
deprecation_warnings = False
host_key_checking = False
interpreter_python = auto_silent

.ssh/config :

# Direct connection to the bastion.
Host bastion
 Hostname 1.2.3.4
 User LINUX_USER
 IdentityFile /root/.ssh/id_rsa

# For all machines in the private zone:
Host 10.240.*
# Proxy the connection through the bastion.
 ProxyCommand ssh -q -W %h:%p bastion
 Port 22
 User LOGIN

Don't forget to set the permissions on your private key (chmod 400).
With this config, Ansible automatically uses the ssh command, and because the default behaviour of the ssh command is changed (via the .ssh/config file), Ansible goes through the jump host without knowing it.

@venu-devannagari

Hi @FloLaco, thank you so much for the reply.

I have tried the above but am still getting a timeout error.

rpc_\nansible.module_utils.connection.ConnectionError: timed out\n",

My question is whether physical_eqt is a Junos device or a Linux bastion server in the subnet you mentioned in /.ssh/config.

Can you please show what the host file looks like, with some sample values?

Thanks in advance.

@FloLaco

FloLaco commented Jul 24, 2020

The host file is very basic. In the host file, you should have only network equipment. Ansible DOES NOT know any bastion and does not care. Ansible is using SSH application, which is configured to use a bastion :

[physical_eqt]
EQT1                   ansible_host=10.x.x.x

group_vars/physical_eqt

---
CREDENTIALS:
  host: "{{ansible_host}}"
  username: "USERNAME"
  passwd: "PWD"
  timeout: 20

You can try to set ANSIBLE_DEBUG=true and see the traceback to see what is the problem.

@venu-devannagari

venu-devannagari commented Jul 24, 2020

Hi @FloLaco, thanks for the reply.

I have tried the same way but am getting a timeout error.

.ssh/config :

Host bastion
Hostname xx.xx.xx.xx
User xxxx
IdentityFile /root/.ssh/id_rsa

Host xx.xx.xx.*
ProxyCommand ssh -q -W %h:%p bastion
Port 22
User xxxxx

[asa]
ansible_host=xx.xx.xx.xx

group_vars/asa.yml

CREDENTIALS:
host: xx.xx.xx.xx
username: xxxxx
password: xxxxx
authorize: xxxx
auth_pass: xxxx

Playbook:

  - name: show version
    hosts: asa
    connection: local
    gather_facts: no

    tasks:

    - name: show version
      ansible_command:
      provider: "{{CREDENTIALS}}"
      commands:
      - show version

ansible.cfg :
[defaults]
local_tmp = /tmp/autobackup
remote_tmp = /tmp/autobackup
action_warnings = False
inventory = etc/ansible/hosts
deprecation_warnings = False
host_key_checking = False
interpreter_python = auto_silent

Can you please help if something is required or if I am missing something?

Thanks in advance.

@FloLaco

FloLaco commented Jul 30, 2020

Did you try to set ANSIBLE_DEBUG=true and -vvvv and see what's going on ?
