Allow configurable read and write packet length for SFTP #1320
Conversation
I'm not fond of this kind of change, as it's not safe to just use whatever values you want. The read and write limit behavior is copied from OpenSSH, so if OpenSSH exhibits significantly faster speeds with the same cryptographic algorithms and same protocol (e.g. ensuring SFTP is actually used and not SCP or something), then perhaps the existing logic should be tweaked instead.
I hear you, but I'm not sure what other options there are - a 0.3 MB/s to 3 MB/s transfer rate difference is the difference between a job that takes 2 hours and one that takes 20 hours for me. Others also implement this as an option, while recognizing that it's unsafe. Unless you can think of a better way to address the slow transfer rate problem with a readable stream?
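The 2-hour vs. 20-hour claim above is just the quoted rates applied to a fixed file size; a quick sanity check (the ~21.6 GB size is implied by those numbers, not stated in the thread):

```javascript
// Wall-clock time for a transfer is size divided by rate.
const hoursFor = (sizeMB, rateMBps) => sizeMB / rateMBps / 3600;

// A job that moves data for 2 hours at 3 MB/s has transferred:
const sizeMB = 3 * 3600 * 2; // 21600 MB (~21.6 GB)

console.log(hoursFor(sizeMB, 3));   // 2 hours at 3 MB/s
console.log(hoursFor(sizeMB, 0.3)); // ~20 hours at 0.3 MB/s
```

A 10x throughput difference is always a 10x wall-clock difference at a fixed size, which is why the packet-length limit matters so much here.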
If using …
Unfortunately, the server I'm downloading from is SFTP only. It's about 200 GB of healthcare files that need to be parsed and processed - so the download speed is a big deal. ssh2/sftp is the perfect solution because I can use streams to parse/filter the files "in-flight" without needing to store to disk and then re-read. Honestly, fastGet would be fine and I could just use the step function if I could have a data delivery order guarantee. The data is in ndjson, so I need to split on lines and then parse JSON. Out-of-order chunks make that impossible with just the step function and fastXfer. If fastXfer were able to "slot fill" a continuous buffer and call "step" once the full buffer was filled, it would make life brilliant. At that point, it could just be used as a really fast readable stream too, I guess.
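The "slot fill" idea described above can be sketched independently of ssh2: park each out-of-order chunk by its byte offset and release the longest contiguous run to the consumer. This is purely illustrative - the `(chunk, offset)` callback signature is an assumption, not ssh2's actual `step` API (which reports progress integers rather than delivering data):

```javascript
// Hypothetical helper: accepts chunks in any order (each tagged with its
// byte offset) and invokes onData with them strictly in offset order.
function makeInOrderSink(onData) {
  const parked = new Map(); // byte offset -> Buffer not yet deliverable
  let nextOffset = 0;       // offset the consumer expects next
  return (chunk, offset) => {
    parked.set(offset, chunk);
    // Flush every chunk that is now contiguous with what was delivered.
    while (parked.has(nextOffset)) {
      const buf = parked.get(nextOffset);
      parked.delete(nextOffset);
      onData(buf);
      nextOffset += buf.length;
    }
  };
}

// Chunks arriving as 2nd, 1st, 3rd are delivered as 1st, 2nd, 3rd.
const out = [];
const sink = makeInOrderSink((buf) => out.push(buf.toString()));
sink(Buffer.from('bbb'), 3); // parked: offset 3 isn't expected yet
sink(Buffer.from('aaa'), 0); // flushes 'aaa', then the parked 'bbb'
sink(Buffer.from('cc'), 6);  // flushes 'cc'
// out is now ['aaa', 'bbb', 'cc']
```

With a wrapper like this, concurrent requests could stay in flight for speed while downstream ndjson line-splitting still sees bytes in order.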
Have you benchmarked against using OpenSSH's SFTP client to see how it compares to …
I'm also seeing a very slow transfer speed on SFTP, so I tried benchmarking the speed of ssh2 against the native clients. Here is the script I've used, in case someone else wants to try (adjust the variables at the top to match your target):

```javascript
// ADJUST
const file = '</path/to/file/on/server>';
const keyFile = '</path/to/local/privatekey>';
const user = '<remote-user>';
const host = '<remote-hostname>';
// END ADJUST

const { spawn } = require('child_process');
const { createWriteStream, readFileSync } = require('fs');
const { join } = require('path');
const { Client } = require('ssh2');
const { pipeline } = require('stream');

const runNativeExec = () => new Promise((resolve) => {
  console.time('native exec');
  const process = spawn('/usr/bin/ssh', [`${user}@${host}`, 'cat', `"${file}"`], { shell: true });
  process.stderr.on('data', (data) => console.log(data.toString()));
  pipeline(process.stdout, createWriteStream(join(__dirname, 'ssh-exec-native')), () => {
    console.timeEnd('native exec');
    resolve();
  });
});

const runNativeSftp = () => new Promise((resolve) => {
  console.time('native sftp');
  const process = spawn('/usr/bin/sftp', ['-q', '-R 1', '-s sftp', `${user}@${host}:${file}`, 'ssh-sftp-native'], { shell: true });
  process.stderr.on('data', (data) => console.log(data.toString()));
  process.on('close', () => {
    console.timeEnd('native sftp');
    resolve();
  });
});

const runSsh2Exec = () => new Promise((resolve) => {
  console.time('ssh2 exec');
  const client = new Client();
  client.on('ready', () => {
    client.exec(`cat "${file}"`, { pty: true }, (err, process) => {
      pipeline(process, createWriteStream(join(__dirname, 'ssh-ssh2-exec')), () => {
        console.timeEnd('ssh2 exec');
        client.end();
        resolve();
      });
    });
  });
  client.connect({
    host,
    username: user,
    privateKey: readFileSync(keyFile)
  });
});

const runSsh2Sftp = () => new Promise((resolve) => {
  console.time('ssh2 sftp');
  const client = new Client();
  client.on('ready', () => {
    client.sftp((err, sftp) => {
      pipeline(sftp.createReadStream(file), createWriteStream(join(__dirname, 'ssh-ssh2-sftp')), () => {
        console.timeEnd('ssh2 sftp');
        client.end();
        resolve();
      });
    });
  });
  client.connect({
    host,
    username: user,
    privateKey: readFileSync(keyFile)
  });
});

(async () => {
  await runNativeExec();
  await runSsh2Exec();
  await runNativeSftp();
  await runSsh2Sftp();
})();
```

Note: it also writes the transferred file to the folder of your test script, so you can verify the size.
I found that this Stack Overflow answer gave me some peace of mind by explaining these huge differences in transfer speeds: https://stackoverflow.com/a/21097076
While it's not a complete solution, this helps with poor throughput issues when using createReadStream().