Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSch SFTP connections getting stuck #38

Open
manishbansal8843 opened this issue Jul 18, 2021 · 1 comment
Open

JSch SFTP connections getting stuck #38

manishbansal8843 opened this issue Jul 18, 2021 · 1 comment

Comments

@manishbansal8843
Copy link

Problem is described on here on stackoverflow.

We have deployed some fix in our production which seems to be working fine. Will raise PR here so that it can be verified by experts and help others in need. Thanks.

@manishbansal8843
Copy link
Author

Pasting the problem here also for clarity.

We have a java 7 code base where we are using Apache commons vfs2 v2.2 which is using JSch-0.1.54 as sftp provider.

Now, the use case is to transfer files to a remote host via sftp. But, every now and then, the file upload process gets stuck. After taking the thread dump of the application, we found both the threads (t1, which sends the data to remote sftp & t2, which receives data from sftp) are in wait state forever. Below is the thread dump snapshot.

JSch session thread:

"Connect thread remote.ftp.com session" daemon prio=10 tid=0x00007f99cc243000 nid=0x144 in Object.wait() [0x00007f9985606000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.io.PipedInputStream.awaitSpace(PipedInputStream.java:273)
    at java.io.PipedInputStream.receive(PipedInputStream.java:231)
    - locked <0x000000043eda02d8> (a com.jcraft.jsch.Channel$MyPipedInputStream)
    at java.io.PipedOutputStream.write(PipedOutputStream.java:149)
    at com.jcraft.jsch.IO.put(IO.java:64)
    at com.jcraft.jsch.Channel.write(Channel.java:438)
    at com.jcraft.jsch.Session.run(Session.java:1459)
    at java.lang.Thread.run(Thread.java:748)

Application thread used to upload file data.

"akka.actor.default-dispatcher-19" prio=10 tid=0x00007f99d4012000 nid=0xea in Object.wait() [0x00007f9988785000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at com.jcraft.jsch.Session.write(Session.java:1269)
    - locked <0x0000000440a61b48> (a com.jcraft.jsch.ChannelSftp)
    at com.jcraft.jsch.ChannelSftp.sendWRITE(ChannelSftp.java:2646)
    at com.jcraft.jsch.ChannelSftp.access$100(ChannelSftp.java:36)
    at com.jcraft.jsch.ChannelSftp$1.write(ChannelSftp.java:806)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
    - locked <0x0000000440aab240> (a org.apache.commons.vfs2.provider.sftp.SftpFileObject$SftpOutputStream)
    at org.apache.commons.vfs2.util.MonitorOutputStream.write(MonitorOutputStream.java:104)
    - locked <0x0000000440aab240> (a org.apache.commons.vfs2.provider.sftp.SftpFileObject$SftpOutputStream)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
    - locked <0x0000000440aac278> (a org.apache.commons.vfs2.provider.DefaultFileContent$FileContentOutputStream)
    at org.apache.commons.vfs2.util.MonitorOutputStream.write(MonitorOutputStream.java:104)
    - locked <0x0000000440aac278> (a org.apache.commons.vfs2.provider.DefaultFileContent$FileContentOutputStream)
    at org.apache.commons.vfs2.provider.DefaultFileContent.write(DefaultFileContent.java:741)
    at org.apache.commons.vfs2.provider.DefaultFileContent.write(DefaultFileContent.java:720)
    at org.apache.commons.vfs2.provider.DefaultFileContent.write(DefaultFileContent.java:691)
    at org.apache.commons.vfs2.provider.DefaultFileContent.write(DefaultFileContent.java:707)
    at org.apache.commons.vfs2.FileUtil.copyContent(FileUtil.java:78)
    at org.apache.commons.vfs2.provider.AbstractFileObject.copyFrom(AbstractFileObject.java:289)

After looking at the codebase of Jsch library, Here is what i feel is happening.

  1. Application thread is uploading file data in the chunks of 4KB.
  2. After every chunk write, application reads the input socket for any acknowledgements until input socket buffer is empty.
    During chunk writing, it checks for the ssh window size. If it is less than the payload size, we wait until remote server resizes it.(This is a place where my app thread is waiting forever) This resizing message is listened by the ssh session thread and same is updated at the channel object after which application thread proceeds for writing.
  3. In a separate thread, session is listening for the incoming data from the remote server. Depending on the message it receives, it takes the relevant action like resizing the channel window, pass the acknowledgement message to the channel (read application) for consumption.
  4. Now, when a message arrives for channel consumption, it is written to the PipedOutStream which is linked to the PipedInputStream. This input stream is read by the application thread for acknowledgement messages. In case, the application thread fails to read any message, the buffer of PipedOutputstream gets full and hence it goes into wait state until application reads some data. (This is the place where session thread is waiting forever)
  5. Now, both the threads are depending on each other. Hence, this is kind of a deadlock.

Further, i have checked the linux machine on which this app is running, RecQ of the socket is kept on building up. Which means, socket is still alive and remote server is keep on sending 32KB packets every now and then.

sudo netstat -anpt | grep 19321
tcp6       0      0 10.14.233.97:59594      64.233.167.130:19321    TIME_WAIT   -
tcp6   58256      0 10.14.233.97:58214      64.233.167.130:19321    ESTABLISHED 460144/java
tcp6  499888      0 10.14.233.97:58422      64.233.167.130:19321    ESTABLISHED 460144/java
tcp6       0      0 10.14.233.97:59622      64.233.167.130:19321    ESTABLISHED 460144/java
tcp6       0      0 10.14.233.97:59608      64.233.167.130:19321    TIME_WAIT   -
tcp6   74672      0 10.14.233.97:56656      64.233.167.130:19321    ESTABLISHED 460144/java
tcp6   92688      0 10.14.233.97:56842      64.233.167.130:19321    ESTABLISHED 460144/java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant