Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/crypto/ssh: Client does not wait long enough for reply from server and sends EOF #29954

Closed
sbezverk opened this issue Jan 27, 2019 · 9 comments
Closed
Milestone

Comments

@sbezverk
Copy link

@sbezverk sbezverk commented Jan 27, 2019

What version of Go are you using (go version)?

$ go version
go version go1.11.4 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env


GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/sbezverk/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/sbezverk/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/8r/m13mcmf15z3d0dq7d412rnph0000gn/T/go-build111235926=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

I am using "golang.org/x/crypto/ssh" library to run remotely commands against ssh server which is a cisco router. Router's debug shows that the client sends successfully command, but then the client send EOF without waiting for the result.


  | RP/0/0/CPU0:Jan   27 08:00:16.345 EST: SSHD_[65740]: 0000 - 62 00 00 00 01 00 00 00-04 65 78 65   63 01 00 00   b........exec...
-- | --
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: 0010 - 00 0d 73 68 6f 77 20 70-6c 61 74 66 6f 72 6d      ..show platform
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: sshd_interactive_shell: message code 0x62 |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: sshd_interactive_shell - Channel request exec ->  for chan 1 want_reply = 1
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: Executing remote command: show platform |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: SHM lock acquired |   |   |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: (sshd_increment_exec_count) exec_count before (0)
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: SHM Lock released |   |   |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: (open_master_file) command added show platform
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: (open_master_file:585) clearing alarm. Remaining 309 secs
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: (open_master_file:633) setting alarm to 309 secs |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: Spawned exec session |   |   |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: (update_sshd_session_details) addrem: 3 channel_id: 1,   tty:vty0, session_state: 10 channel_state:1 user:root connection_type: 2   session_id: 3, session_pid:1568972
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: %SECURITY-SSHD-6-INFO_SUCCESS_COMMAND : Successfully   authenticated user 'root' from '10.117.94.148' on 'vty0'(cipher   'aes128-gcm@openssh.com', mac 'aes-gcm', command 'show platform')
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: Sending Channel request response msg to 0 |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: Transmission:Original: len = 5 |   |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: 0000 - 63 00 00 00 00                                      c.... |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: Transmitting data in bin mode |   |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: ssh_encrypt: Outdata PTR is 1300f24c |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]:  The outdataptr at   transmit is 20 outmaclen: 16 |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]:  The outdata ptr (3) is   1300f24c |   |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: Transmission:Final: len = 36 |   |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: 0000 - 00 00 00 10 f2 f9 fa 52-59 2d 59 38 eb 46 bc 28   .......RY-Y8.F.(
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: 0010 - 74 86 7a 1f e6 2d 4a cc-6f 02 20 e7 5e d3 ac 49   t.z..-J.o. .^..I
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: 0020 - fc e6 86 76                                         ...v |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: writing bytes to sockfd |   |   |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: done writing bytes to sockfd (rc:0) |   |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: (sshd_interactive_shell:5864) clearing alarm. Remaining 309   secs
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: END INTERACTIVE SHELL |   |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: sshd_interactive_shell - ptyfd = 50 |   |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: event_contex_init done |   |   |  
  | RP/0/0/CPU0:Jan 27 08:00:16.345 EST:   SSHD_[65740]: Inside (ssh_receive) GCM packklen zero |  
  | RP/0/0/CPU0:Jan 27 08:00:16.355 EST:   SSHD_[65740]: GCM mode Receive:Original: len = 36 |  
  | RP/0/0/CPU0:Jan 27 08:00:16.355 EST:   SSHD_[65740]: 0000 - 00 00 00 10 da 76 63 a9-be b6 25 14 5c 9d eb bf   .....vc...%.\...
  | RP/0/0/CPU0:Jan 27 08:00:16.355 EST:   SSHD_[65740]: 0010 - 6f 18 4d c8 16 d8 cb 81-b2 08 fb ac 46 3c 02 d5   o.M.........F<..
  | RP/0/0/CPU0:Jan 27 08:00:16.355 EST:   SSHD_[65740]: 0020 - d9 50 1d 94                                         .P.. |  
  | RP/0/0/CPU0:Jan 27 08:00:16.355 EST:   SSHD_[65740]: ssh_decrypt: incipher.keylen:16, outcipher.keylen:16   name:aes128-gcm@openssh.com
  | RP/0/0/CPU0:Jan 27 08:00:16.355 EST:   SSHD_[65740]: ssh_decrypt: lenToDecrypt:16 |   |  
  | RP/0/0/CPU0:Jan 27 08:00:16.355 EST:   SSHD_[65740]: Printing the plain text and size:16 |   |  
  | RP/0/0/CPU0:Jan 27 08:00:16.355 EST:   SSHD_[65740]: 0000 - 0a 60 00 00 00 01 41 b5-be f9 7a 9c f2 14 e6 31   .`....A...z....1
  | RP/0/0/CPU0:Jan 27 08:00:16.355 EST:   SSHD_[65740]:  The packlen embeded is   :4106 |   |  
  | RP/0/0/CPU0:Jan 27 08:00:16.355 EST:   SSHD_[65740]: Receive:Final: tcpbuflen = 36, len = 5 |   |  
  | RP/0/0/CPU0:Jan 27 08:00:16.355 EST:   SSHD_[65740]: 0000 - 60 00 00 00 01                                      `.... |  
  | RP/0/0/CPU0:Jan 27 08:00:16.355 EST:   SSHD_[65740]: sshd_nettopty: EOF received. Disconnecting session

What did you expect to see?

Client should receive server's reply

What did you see instead?

Client does not receive data from the server.

The workaround I found was adding a 1 second delay at this line:
https://github.com/golang/crypto/blob/master/ssh/channel.go#L591

@hanwen
Copy link
Contributor

@hanwen hanwen commented Jan 28, 2019

can you send a client trace (set debugMux=true) of the client?

You should probably wait to receive exit-status before closing the channel.

@odeke-em odeke-em changed the title client does not wait long enough reply from ssh server and sends EOF x/ssh: Client does not wait long enough for reply from server and sends EOF Jan 28, 2019
@gopherbot gopherbot added this to the Unreleased milestone Jan 28, 2019
@odeke-em odeke-em changed the title x/ssh: Client does not wait long enough for reply from server and sends EOF x/crypto/ssh: Client does not wait long enough for reply from server and sends EOF Jan 28, 2019
@sbezverk
Copy link
Author

@sbezverk sbezverk commented Jan 28, 2019

@hanwen Here is the trace, I added also a bit of prints to better understand the logic.

time="2019-01-28T08:45:57-05:00" level=info msg="router name: 192.168.218.132"
time="2019-01-28T08:45:57-05:00" level=info msg="Callback is called with hostname: 192.168.218.132:22 remote address: 192.168.218.132:22"
time="2019-01-28T08:45:58-05:00" level=debug msg="Successfully dialed router: 192.168.218.132:22"
2019/01/28 08:45:58 send global(1): ssh.channelOpenMsg{ChanType:"session", PeersID:0x1, PeersWindow:0x200000, MaxPacketSize:0x8000, TypeSpecificData:[]uint8(nil)}
2019/01/28 08:45:58 decoding(1): 91 &ssh.channelOpenConfirmMsg{PeersID:0x1, MyID:0x1, MyWindow:0x200000, MaxPacketSize:0x8000, TypeSpecificData:[]uint8{}} - 17 bytes
><SB> message: &{PeersID:1 MyID:1 MyWindow:2097152 MaxPacketSize:32768 TypeSpecificData:[]}
time="2019-01-28T08:45:58-05:00" level=debug msg="Running command: show platform against router: 192.168.218.132:22"
2019/01/28 08:45:58 send(1): ssh.channelRequestMsg{PeersID:0x1, Request:"exec", WantReply:true, RequestSpecificData:[]uint8{0x0, 0x0, 0x0, 0xd, 0x73, 0x68, 0x6f, 0x77, 0x20, 0x70, 0x6c, 0x61, 0x74, 0x66, 0x6f, 0x72, 0x6d}}
2019/01/28 08:45:58 decoding(1): 99 &ssh.channelRequestSuccessMsg{PeersID:0x1} - 5 bytes
><SB> message: &{PeersID:1}
!
!  The problem is here, as with 600 millisecond delay, here I could see  the code handles 3 packets with server's data. 
!
><SB>  m: &{PeersID:1} ok: true
><SB> channelRequestSuccessMsg 
><SB> Sending EOF to &{chanType:session extraData:[] localId:1 remoteId:1 maxIncomingPayload:32768 maxRemotePayload:32768 mux:0xc00000e150 decided:true direction:1 msg:0xc000084b40 sentRequestMu:{state:0 sema:0} incomingRequests:0xc000084ae0 sentEOF:true remoteWin:{Cond:0xc00001c6c0 win:2097152 writeWaiters:0 closed:false} pending:0xc00000b100 extPending:0xc00000b140 windowMu:{state:0 sema:0} myWindow:2097152 writeMu:{state:0 sema:0} sentClose:false packetPool:map[]}
2019/01/28 08:45:58 send(1): ssh.channelEOFMsg{PeersID:0x1}
2019/01/28 08:45:58 decoding(1): 98 &ssh.channelRequestMsg{PeersID:0x1, Request:"exit-status", WantReply:false, RequestSpecificData:[]uint8{0x0, 0x0, 0x0, 0x0}} - 25 bytes
><SB> message: &{PeersID:1 Request:exit-status WantReply:false RequestSpecificData:[0 0 0 0]}
2019/01/28 08:45:58 decoding(1): 96 &ssh.channelEOFMsg{PeersID:0x1} - 5 bytes
><SB> handlePacket calls  msgChannelEOF
2019/01/28 08:45:58 decoding(1): 97 &ssh.channelCloseMsg{PeersID:0x1} - 5 bytes
><SB> handlePacket calls  msgChannelClose
2019/01/28 08:45:58 send(1): ssh.channelCloseMsg{PeersID:0x1}
><SB> Wait's exit status <nil>
2019/01/28 08:45:58 send(1): ssh.channelCloseMsg{PeersID:0x1}
time="2019-01-28T08:45:58-05:00" level=debug msg="getInventory(): Received reply from 192.168.218.132:22 of 0 bytes, Reply:  "
time="2019-01-28T08:45:58-05:00" level=debug msg="getInventory(): parseReply result of 0 bytes, Result: [] "
time="2019-01-28T08:45:58-05:00" level=error msg="Failed to collect router: 192.168.218.132:22  inventory with error: Failed to parse inventory for router: 192.168.218.132:22 with error: no inventory data found"
2019/01/28 08:45:58 loop exit read tcp 192.168.218.1:61355->192.168.218.132:22: use of closed network connection
@hanwen
Copy link
Contributor

@hanwen hanwen commented Jan 28, 2019

", but then the client send EOF without waiting for the result."

you are in control of the client. Why are you sending EOF if it looks like the server can't handle it?

@sbezverk
Copy link
Author

@sbezverk sbezverk commented Jan 28, 2019

Sorry I do not get your comment, I am not sending EOF, it is all done within ssh client code, in my code I just call: reply, err := session.Output(cmd) . As per server not being able to handle EOF, if I use this command:
ssh root@192.168.218.132 "show version"
returns data as expected, makes me doubt that the server cannot handle EOF.

It seems to me the client does not get blocked on a signal or something indicating that all data has already been received and ready to be consumed.

@hanwen
Copy link
Contributor

@hanwen hanwen commented Jan 28, 2019

2019/01/28 08:45:58 send(1): ssh.channelEOFMsg{PeersID:0x1}
2019/01/28 08:45:58 decoding(1): 98 &ssh.channelRequestMsg{PeersID:0x1, Request:"exit-status", WantReply:false, RequestSpecificData:[]uint8{0x0, 0x0, 0x0, 0x0}} - 25 bytes

message: &{PeersID:1 Request:exit-status WantReply:false RequestSpecificData:[0 0 0 0]}
2019/01/28 08:45:58 decoding(1): 96 &ssh.channelEOFMsg{PeersID:0x1} - 5 bytes
handlePacket calls msgChannelEOF
2019/01/28 08:45:58 decoding(1): 97 &ssh.channelCloseMsg{PeersID:0x1} - 5 bytes
handlePacket calls msgChannelClose
2019/01/28 08:45:58 send(1): ssh.channelCloseMsg{PeersID:0x1}
Wait's exit status

according to this trace, it's all looking fine. client sends EOF. Server sends command exit-status back (status 0 = success). Then its stops sending data (EOF), and tears down the channel.

ie. I don't understand the problem. What should have happened that you are not seeing?

@sbezverk
Copy link
Author

@sbezverk sbezverk commented Jan 28, 2019

Here is the trace with 600 millisecond sleep added here:
https://github.com/golang/crypto/blob/master/ssh/channel.go#L589

As you can see from the trace, there are 3 packets coming from the server, sizes 13, 38 and 206 bytes. Which are missing when there is no delay. What is interesting if the delay is 200 millisecond then I get only 1 packet instead of 3. I think this supports my theory that ssh client does not either wait enough or getting buffer ready signal prematurely. If you are interested, I can easily demonstrate it.

time="2019-01-28T09:56:31-05:00" level=info msg="router name: 192.168.218.132"
time="2019-01-28T09:56:31-05:00" level=info msg="Callback is called with hostname: 192.168.218.132:22 remote address: 192.168.218.132:22"
time="2019-01-28T09:56:32-05:00" level=debug msg="Successfully dialed router: 192.168.218.132:22"
2019/01/28 09:56:32 send global(1): ssh.channelOpenMsg{ChanType:"session", PeersID:0x1, PeersWindow:0x200000, MaxPacketSize:0x8000, TypeSpecificData:[]uint8(nil)}
2019/01/28 09:56:32 decoding(1): 91 &ssh.channelOpenConfirmMsg{PeersID:0x1, MyID:0x1, MyWindow:0x200000, MaxPacketSize:0x8000, TypeSpecificData:[]uint8{}} - 17 bytes
><SB> message: &{PeersID:1 MyID:1 MyWindow:2097152 MaxPacketSize:32768 TypeSpecificData:[]}
time="2019-01-28T09:56:32-05:00" level=debug msg="Running command: show platform against router: 192.168.218.132:22"
2019/01/28 09:56:32 send(1): ssh.channelRequestMsg{PeersID:0x1, Request:"exec", WantReply:true, RequestSpecificData:[]uint8{0x0, 0x0, 0x0, 0xd, 0x73, 0x68, 0x6f, 0x77, 0x20, 0x70, 0x6c, 0x61, 0x74, 0x66, 0x6f, 0x72, 0x6d}}
2019/01/28 09:56:32 decoding(1): 99 &ssh.channelRequestSuccessMsg{PeersID:0x1} - 5 bytes
><SB> message: &{PeersID:1}
2019/01/28 09:56:32 decoding(1): data packet - 13 bytes
><SB> handlePacket calls  handleData
2019/01/28 09:56:32 decoding(1): data packet - 38 bytes
><SB> handlePacket calls  handleData
2019/01/28 09:56:32 decoding(1): data packet - 206 bytes
><SB> handlePacket calls  handleData
2019/01/28 09:56:32 decoding(1): 98 &ssh.channelRequestMsg{PeersID:0x1, Request:"exit-status", WantReply:false, RequestSpecificData:[]uint8{0x0, 0x0, 0x0, 0x0}} - 25 bytes
><SB> message: &{PeersID:1 Request:exit-status WantReply:false RequestSpecificData:[0 0 0 0]}
2019/01/28 09:56:32 decoding(1): 96 &ssh.channelEOFMsg{PeersID:0x1} - 5 bytes
><SB> handlePacket calls  msgChannelEOF
2019/01/28 09:56:32 decoding(1): 97 &ssh.channelCloseMsg{PeersID:0x1} - 5 bytes
><SB> handlePacket calls  msgChannelClose
2019/01/28 09:56:32 send(1): ssh.channelCloseMsg{PeersID:0x1}
><SB> Closing channel &{chanType:session extraData:[] localId:1 remoteId:1 maxIncomingPayload:32768 maxRemotePayload:32768 mux:0xc00000e150 decided:true direction:1 msg:0xc000096b40 sentRequestMu:{state:1 sema:0} incomingRequests:0xc000096ae0 sentEOF:false remoteWin:{Cond:0xc00001c6c0 win:2097152 writeWaiters:0 closed:false} pending:0xc00000b0c0 extPending:0xc00000b100 windowMu:{state:0 sema:0} myWindow:2096922 writeMu:{state:0 sema:0} sentClose:true packetPool:map[]}
2019/01/28 09:56:32 loop exit EOF
><SB>  m: &{PeersID:1} ok: true
><SB> channelRequestSuccessMsg 
><SB> Wait's exit status <nil>
><SB> Sending EOF to &{chanType:session extraData:[] localId:1 remoteId:1 maxIncomingPayload:32768 maxRemotePayload:32768 mux:0xc00000e150 decided:true direction:1 msg:0xc000096b40 sentRequestMu:{state:0 sema:0} incomingRequests:0xc000096ae0 sentEOF:true remoteWin:{Cond:0xc00001c6c0 win:2097152 writeWaiters:0 closed:true} pending:0xc00000b0c0 extPending:0xc00000b100 windowMu:{state:0 sema:0} myWindow:2096922 writeMu:{state:0 sema:0} sentClose:true packetPool:map[]}
2019/01/28 09:56:32 send(1): ssh.channelEOFMsg{PeersID:0x1}
2019/01/28 09:56:32 send(1): ssh.windowAdjustMsg{PeersID:0x0, AdditionalBytes:0xe6}
2019/01/28 09:56:32 send(1): ssh.channelCloseMsg{PeersID:0x1}
@hanwen
Copy link
Contributor

@hanwen hanwen commented Jan 28, 2019

The original trace looked sane.

As far as I can see, something is broken in the server. If the server somehow closes & tears down the channel before it sends stderr/stdout of the process, that is a bug in the server implementation.

I suggest to use https://godoc.org/golang.org/x/crypto/ssh#Session.Run instead, where you can insert sleep to your heart's desire.

(for the record, what is the type of cisco device, and what version of the firmware is it running?)

@sbezverk
Copy link
Author

@sbezverk sbezverk commented Jan 28, 2019

It is funny, first I opened a bug for the server side, but it was confirmed from debugs that it is the client initiates EOF before getting data. See debug trace from the router:

Jan 18 14:05:22.633 EST: SSHD_[67344]: 0000 - 0a 60 00 00 00 01 5b cf-e9 52 25 81 cd 1c 80 d9   .`....[..R%.....
RP/0/RP0/CPU0:Jan 18 14:05:22.633 EST: SSHD_[67344]:  The packlen embeded is :4106
RP/0/RP0/CPU0:Jan 18 14:05:22.633 EST: SSHD_[67344]: Receive:Final: tcpbuflen = 36, len = 5
RP/0/RP0/CPU0:Jan 18 14:05:22.633 EST: SSHD_[67344]: 0000 - 60 00 00 00 01                                    `....
RP/0/RP0/CPU0:Jan 18 14:05:22.633 EST: SSHD_[67344]: sshd_nettopty: EOF received. Disconnecting session

The issue is experienced with any routers running IOS-XR code. I tried on multiple platforms, CRS. ASR9k, xrvr and xrv9k with codes 6.1.3, 6.4.2, 7.0.1.

@hanwen
Copy link
Contributor

@hanwen hanwen commented Jan 28, 2019

client sending EOF means that the client is not going to send more data. It does not mean the channel is to be torn down.

See https://tools.ietf.org/html/rfc4254#section-5.3

" Note that the channel remains open after this message, and
more data may still be sent in the other direction. "

@hanwen hanwen closed this Jan 28, 2019
@golang golang locked and limited conversation to collaborators Jan 28, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants
You can’t perform that action at this time.