SCP data transfer speed comparison with OpenSSH #688

Open
simhro opened this issue May 14, 2021 · 25 comments

@simhro

simhro commented May 14, 2021

I was testing the performance of the sshj library's SCP implementation for uploading a file from my machine to a remote one, and I find it significantly slower than OpenSSH's scp command, which I ran from my CentOS machine.

All the tests were performed in the same environment (a local network with two CentOS machines), and each one transferred a 1 GB file to the remote machine.

  • The scp command gives me about 180 MB/s upload speed:
    scp testfile-1gb username@1.2.3.4:/tmp

  • When I used sshj's SCP to upload the same file, I got only about 40 MB/s. Here is the simple Java code I'm using:

import net.schmizz.sshj.SSHClient;
import net.schmizz.sshj.transport.verification.PromiscuousVerifier;
import java.io.IOException;

public class TestSshj {
    public static void main(String[] args) throws IOException {
        SSHClient ssh = new SSHClient();
        try {
            // Accepts any host key; acceptable for a benchmark, not for production.
            ssh.addHostKeyVerifier(new PromiscuousVerifier());
            ssh.connect("1.2.3.4");
            ssh.authPassword("username", "password");
            // Upload the local file to /tmp on the remote host over SCP.
            ssh.newSCPFileTransfer().upload("test-1gb", "/tmp");
        }
        finally {
            ssh.close();
        }
    }
}

I also tried uploading the data over SFTP to see if there was any difference, but it was about as slow.
I also tried changing the encryption algorithms to match the ones OpenSSH was using, but that didn't help either.

Is this kind of performance difference expected or can it be improved somehow?

@stjava

stjava commented May 17, 2021

I also have this problem

@fire2

fire2 commented Jun 9, 2021

I transferred a 5.4 GB file via SFTP from server A to server B using three clients: the OpenSSH SFTP client in a shell script, a small Java application using the JSCH library, and this SSHJ library. The results, with compression on unless noted, are:

  • native OpenSSH SFTP client in shell script:
    • with compression: 1 min 38 seconds.
    • without compression: 3 minutes 4 seconds.
  • JSCH library with compression: 3 minutes 42 seconds.
  • SSHJ library with compression: 13 minutes 52 seconds.
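
For comparison with the MB/s figures earlier in the thread, the timings above can be converted to average throughput. This is a minimal sketch that assumes 1 GB = 1000 MB (the comment does not say which unit was meant); the class and method names are illustrative only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ThroughputFromTimings {
    // Assumption: 5.4 GB is taken as 5400 MB (decimal units).
    static final double FILE_SIZE_MB = 5400.0;

    /** Average throughput in MB/s for a transfer that took the given wall-clock time. */
    static double mbPerSecond(int minutes, int seconds) {
        return FILE_SIZE_MB / (minutes * 60 + seconds);
    }

    public static void main(String[] args) {
        // Timings copied from the list above.
        Map<String, Double> results = new LinkedHashMap<>();
        results.put("OpenSSH, with compression", mbPerSecond(1, 38));
        results.put("OpenSSH, without compression", mbPerSecond(3, 4));
        results.put("JSCH, with compression", mbPerSecond(3, 42));
        results.put("SSHJ, with compression", mbPerSecond(13, 52));

        results.forEach((name, rate) ->
                System.out.printf("%-30s %6.1f MB/s%n", name, rate));
    }
}
```

Under that assumption the four runs work out to roughly 55, 29, 24, and 6.5 MB/s respectively.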

Why is it so slow? Any way to improve performance for large files?

@hierynomus
Owner

Are you perhaps not using buffered streams? A performance test is only as good as its implementation.

@hierynomus
Owner

Otherwise, you would need to use a profiler to find the hotspot.

@fire2

fire2 commented Jun 9, 2021

Here are the sshj-relevant lines of my implementation; no buffered streams:

this.sshClient = new SSHClient();
this.sshClient.addHostKeyVerifier(new PromiscuousVerifier());
this.sshClient.useCompression();
this.sshClient.connect(host, port);
this.sshClient.authPassword(user, password);
this.sftpClient = this.sshClient.newSFTPClient();
this.sftpClient.get(ftpFile, localFile);

@hierynomus
Owner

And without compression?

@simhro
Author

simhro commented Jun 9, 2021

@hierynomus what buffered stream do you mean? If you check my code example above you will see that I'm using the methods from the sshj library. Am I doing something wrong?

@fire2 I have similar code for testing SFTP. I don't use compression because it makes the transfer even slower. I suggest you disable compression; you should then see some improvement, though it still won't be as fast as the OpenSSH implementation.

@hierynomus
Owner

If you use FileSystemFile, it uses FileInputStream and FileOutputStream directly. It might be worthwhile to wrap those in a Buffered(Input|Output)Stream before passing them.
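
To illustrate why wrapping a stream can matter, here is a minimal sketch using only the JDK, with no sshj involved. It counts how many read calls reach the underlying stream when a consumer reads in small chunks, with and without a BufferedInputStream in between. The chunk size (512 bytes) and buffer size (64 KiB) are hypothetical, and the thread does not establish what read sizes sshj actually issues, so this shows the general effect rather than sshj's behavior.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class BufferedReadDemo {
    /** Counts how many read calls reach the wrapped stream. */
    static class CountingInputStream extends FilterInputStream {
        int calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /**
     * Reads all of data in chunkSize-byte pieces and returns the number of read
     * calls that reached the underlying stream. bufferSize <= 0 means unbuffered.
     */
    static int underlyingReads(byte[] data, int chunkSize, int bufferSize) {
        CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(data));
        InputStream in = bufferSize > 0 ? new BufferedInputStream(source, bufferSize) : source;
        byte[] chunk = new byte[chunkSize];
        try {
            while (in.read(chunk) != -1) {
                // Discard the bytes; only the call count matters here.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1 << 20]; // 1 MiB of zeros stands in for a file
        System.out.println("unbuffered reads: " + underlyingReads(data, 512, 0));
        System.out.println("buffered reads:   " + underlyingReads(data, 512, 1 << 16));
    }
}
```

With a real FileInputStream each underlying read is a system call, so fewer calls can mean less overhead. If the consumer already reads in large chunks, the buffer adds little.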

@fire2

fire2 commented Jun 9, 2021

> And without compression?

I just tried, and it took 7 minutes 22 seconds. I can't understand why it is faster without compression than with it (the opposite of the OpenSSH client), and it is still far too slow.

@hierynomus
Owner

I can have a look, I think the compression speed itself is suboptimal, thus a limiting factor.

@simhro
Author

simhro commented Jun 9, 2021

@hierynomus how do I use the buffered stream? The upload method from the sshj lib requires a FileSystemFile as a parameter. I think something needs to be changed internally in the library.

@fire2 If I try to use compression I see a decrease in speed in both implementations - sshj and OpenSSH.

@fire2

fire2 commented Jun 9, 2021

> @fire2 If I try to use compression I see a decrease in speed in both implementations - sshj and OpenSSH.

For me, the OpenSSH implementation takes 3m 04s without compression and 1m 38s with compression (-C flag), so it is always considerably faster with compression. It's using zlib compression.

@hierynomus
Owner

@fire2 That means that most probably your bandwidth is the limiting factor in your test, not the CPU speed.
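
The bandwidth-versus-CPU point can be made concrete with a simple bottleneck model. This is a sketch under stated assumptions, not a measurement: it assumes compression and sending overlap, so the pipeline runs at the speed of its slowest stage, and every number in it is hypothetical.

```java
public class CompressionBottleneck {
    /**
     * Effective rate, in uncompressed MB/s, of a compress-then-send pipeline.
     *
     * @param compressMBps how fast the CPU can compress input
     * @param linkMBps     how fast the network can carry bytes
     * @param ratio        compressed size divided by original size (0 < ratio <= 1)
     */
    static double effectiveRate(double compressMBps, double linkMBps, double ratio) {
        // The link carries compressed bytes, so it moves 1/ratio times as much
        // original data per second; the slower of the two stages sets the pace.
        return Math.min(compressMBps, linkMBps / ratio);
    }

    public static void main(String[] args) {
        double link = 30.0;  // hypothetical link speed in MB/s
        double ratio = 0.5;  // hypothetical: data shrinks to half its size

        // Fast compressor: the link is the bottleneck, so compression helps.
        System.out.println("fast compressor: " + effectiveRate(200.0, link, ratio) + " MB/s");
        // Slow compressor: the CPU is the bottleneck, so compression hurts.
        System.out.println("slow compressor: " + effectiveRate(10.0, link, ratio) + " MB/s");
        System.out.println("no compression:  " + link + " MB/s");
    }
}
```

In this model the same link gets faster with a fast compressor and slower with a slow one, which is one possible reading of why compression helps one client and hurts another on the same machine.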

@fire2

fire2 commented Jun 9, 2021

Hmm, I don't understand. Why is the library then using less bandwidth? I am running all tests on the same machine.

@simhro
Author

simhro commented Jun 9, 2021

Regardless of the compression, I think some kind of improvement has to be done inside the sshj library. As @hierynomus mentioned, the use of buffered streams might be the answer here but if that's true then it probably needs to be implemented in the library.

@yantom

yantom commented Oct 2, 2021

In my case, native SCP/rsync takes 6.5 ms/file and SFTP over SSHJ takes 8.4 ms/file after the buffered input stream optimization mentioned above (without the optimization it was 18.7 ms/file). Compression also made it worse. I agree that it would be worthwhile to implement buffered streams directly in the library.

@simhro
Author

simhro commented Dec 9, 2021

@yantom how did you implement the buffered input stream? Can you paste here a code snippet for uploading the file?

@yantom

yantom commented Dec 9, 2021

// `transfer` is the sshj file-transfer object, created elsewhere (not shown in this snippet).
File file = Paths.get("sample.pdf").toFile();
try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file))) {
    transfer.upload(new InputStreamSource(bis, file.getName()), "/samplelocation/sample.pdf");
}

It uses this class:

    // Exposes an arbitrary InputStream as an sshj source file.
    private static class InputStreamSource extends InMemorySourceFile {
        private final String name;
        private final InputStream inputStream;

        public InputStreamSource(InputStream is, String name) {
            this.inputStream = is;
            this.name = name;
        }

        @Override
        public String getName() {
            return name;
        }

        @Override
        public long getLength() {
            return -1; // length is reported as unknown
        }

        @Override
        public InputStream getInputStream() throws IOException {
            return inputStream;
        }
    }

@simhro
Author

simhro commented Dec 16, 2021

I've tried with the buffered stream as you've shown but I'm not getting any speed improvement. I was testing it by uploading a 5GB file.

@corporate-gadfly

@simhro 0.33.0 was recently released and I'm curious if #778 had a positive impact on performance.

@simhro
Author

simhro commented May 30, 2022

I tested with version 0.33.0, but unfortunately it doesn't improve the speed.
Based on the comments in #778, that change is only supposed to fix a bug introduced in #769, whereas the performance issues were already present in earlier versions (sshj-0.31.0).

@vladimirlagunov
Contributor

@simhro just a random guess: does it help if you upload a file via remoteFile.new ReadAheadRemoteFileInputStream(Integer.MAX_VALUE)?

@Hollerweger

Hollerweger commented Feb 3, 2023

We made a JFR recording and we see many socket read I/O pauses while actually writing to an SFTP server (only one write is visible, because all the others were below the 20 ms threshold).
In a 30-second recording taken while writing, about 29 seconds appear to be read I/O time alone.

@2211898719

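Following the suggestion in the comment above, this implementation can nearly saturate my server's bandwidth. Adjust the buffer size and the number of requests as appropriate.

ServletOutputStream outputStream = response.getOutputStream();

SFTPClient sftp = getSftp(id);
RemoteFile readFile = sftp.open(remotePath);

// Read-ahead stream; 15 is the limit on outstanding read requests (tune as needed).
RemoteFile.ReadAheadRemoteFileInputStream readAheadRemoteFileInputStream =
        readFile.new ReadAheadRemoteFileInputStream(15);

// Add a 1 MiB local buffer on top of the read-ahead stream.
BufferedInputStream inputStream = new BufferedInputStream(readAheadRemoteFileInputStream, 1024 * 1024);
inputStream.transferTo(outputStream);

IoUtil.close(inputStream);
IoUtil.close(outputStream);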

@2211898719

Using ReadAheadRemoteFileInputStream gives a large improvement, and so does BufferedInputStream. sshClient.useCompression() helps for some files.
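
One way to reason about why the number of outstanding requests matters is the usual in-flight bound: if at most N requests of S bytes can be unacknowledged at once and a round trip takes T seconds, throughput cannot exceed N × S / T. The sketch below assumes the read-ahead argument caps outstanding read requests; the 32 KiB request size and 1 ms round-trip time are hypothetical, not values taken from sshj or from this thread.

```java
public class InFlightBound {
    /**
     * Upper bound on throughput, in bytes per second, when at most maxInFlight
     * requests of requestBytes each can be outstanding during one round trip.
     */
    static double maxBytesPerSecond(int maxInFlight, int requestBytes, double rttSeconds) {
        return maxInFlight * (double) requestBytes / rttSeconds;
    }

    public static void main(String[] args) {
        int requestBytes = 32 * 1024; // hypothetical request size
        double rtt = 0.001;           // hypothetical 1 ms round trip

        for (int inFlight : new int[] {1, 15, 64}) {
            double mbps = maxBytesPerSecond(inFlight, requestBytes, rtt) / 1_000_000;
            System.out.printf("%2d in flight -> at most %.1f MB/s%n", inFlight, mbps);
        }
    }
}
```

With a single request in flight, every chunk waits a full round trip before the next one starts, which would match a recording dominated by socket read time. Raising the limit lifts the ceiling until the link or the CPU becomes the bottleneck instead.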
