Bash script utility for running secure copy of a file in chunks in parallel.
From your favorite terminal, run:
curl -s https://raw.githubusercontent.com/yinonavraham/pscp/master/bin/setup/install.sh | bash
pscp [OPTIONS] FILE DEST
In order to copy a local file named myfile to path /path/to/ in a remote host named some-other-host, authenticating as myuser (based on SSH key), use:
pscp myfile myuser@some-other-host:/path/to/
--help
Show usage help. Use this to inspect all options the utility supports.
--threads=<value>
Specify the number of chunks to split the file into. These chunks are then sent in parallel.
--verbose
Print verbose output of the actions done by the script.
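As an illustration of how options in the `--name=value` style shown above can be handled in a shell script, here is a minimal sketch. It is not the actual pscp implementation: the function name `parse_args_sketch` and the variable names are hypothetical, and the default of 10 threads is taken from the experiment described later in this document.

```bash
# Hypothetical sketch of parsing pscp-style options; not pscp's own code.
parse_args_sketch() {
  THREADS=10      # assumed default, per the experiment section
  VERBOSE=false
  VERIFY=true
  CLEANUP=true
  FILE=""
  DEST=""
  local arg
  for arg in "$@"; do
    case "$arg" in
      --threads=*)  THREADS="${arg#--threads=}" ;;   # strip the "--threads=" prefix
      --verbose)    VERBOSE=true ;;
      --no-verify)  VERIFY=false ;;
      --no-cleanup) CLEANUP=false ;;
      --*)          echo "unknown option: $arg" >&2; return 1 ;;
      *)
        # The first positional argument is FILE, the second is DEST.
        if [ -z "$FILE" ]; then FILE="$arg"
        elif [ -z "$DEST" ]; then DEST="$arg"
        else echo "too many arguments" >&2; return 1
        fi ;;
    esac
  done
  # Reject empty, non-numeric, or zero thread counts.
  case "$THREADS" in
    ''|*[!0-9]*|0) echo "--threads expects a positive integer" >&2; return 1 ;;
  esac
  [ -n "$FILE" ] && [ -n "$DEST" ] || { echo "usage: pscp [OPTIONS] FILE DEST" >&2; return 1; }
}
```

After calling `parse_args_sketch --threads=4 --verbose myfile myuser@some-other-host:/path/to/`, the variables `THREADS`, `VERBOSE`, `FILE` and `DEST` hold the parsed values.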
The script performs the following main operations:
- Create a temporary local transaction directory. This directory is used to save all temporary files created during the transfer.
- Split the source file into chunks (according to the threads value). These chunk files are saved in the local transaction directory.
- Create a temporary transaction directory in the remote host.
- Start background processes to transfer each chunk file (using scp). Each such process appends a line to a control file (saved in the local transaction directory).
- Wait for all background processes to finish. This is done by polling on the control file.
- Assemble the file in the remote host from all the chunk files.
- Verify that the checksums of the source file and the remote file are the same.
- Clean up the transaction directories (local and remote).
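The operations above can be sketched roughly as follows. This is a simplified, local-only illustration and not the pscp source: `cp` stands in for `scp`, a local directory stands in for the remote host (so the sketch runs without SSH access), and all function and variable names are hypothetical.

```bash
# Sketch only: `cp` stands in for `scp`, and DEST_DIR is a local directory
# standing in for the remote host. Names are illustrative, not taken from pscp.

sha1_of() {
  # Print the SHA-1 of a file, using whichever tool is available.
  if command -v sha1sum >/dev/null 2>&1; then
    sha1sum "$1"
  else
    shasum -a 1 "$1"
  fi | awk '{ print $1 }'
}

parallel_copy_sketch() {
  local src="$1" dest_dir="$2" threads="${3:-10}"
  [ -s "$src" ] || { echo "source is missing or empty: $src" >&2; return 1; }

  # Create the local and the "remote" transaction directories.
  local local_tx remote_tx
  local_tx=$(mktemp -d) || return 1
  remote_tx=$(mktemp -d "$dest_dir/tx.XXXXXX") || return 1

  # Split the source into at most $threads chunks (ceiling division).
  local size chunk_size
  size=$(wc -c < "$src")
  chunk_size=$(( (size + threads - 1) / threads ))
  split -b "$chunk_size" "$src" "$local_tx/chunk." || return 1

  # Transfer each chunk in a background process; each process appends
  # one status line to the control file when it finishes.
  local control="$local_tx/control" total=0 chunk
  : > "$control"
  for chunk in "$local_tx"/chunk.*; do
    (
      if cp "$chunk" "$remote_tx/"; then echo "ok"; else echo "failed"; fi >> "$control"
    ) &
    total=$((total + 1))
  done

  # Wait for all transfers by polling the control file.
  while [ "$(wc -l < "$control")" -lt "$total" ]; do
    sleep 0.1
  done
  wait

  local status=0 target
  target="$dest_dir/$(basename "$src")"
  if grep -q failed "$control"; then
    echo "a chunk transfer failed" >&2
    status=1
  else
    # Assemble the chunks in order (the glob sorts the split suffixes).
    cat "$remote_tx"/chunk.* > "$target"
    # Verify that source and destination checksums match.
    if [ "$(sha1_of "$src")" != "$(sha1_of "$target")" ]; then
      echo "checksum mismatch" >&2
      status=1
    fi
  fi

  # Clean up both transaction directories.
  rm -rf "$local_tx" "$remote_tx"
  return "$status"
}
```

For example, `parallel_copy_sketch myfile /tmp/dest 4` would copy myfile into an existing /tmp/dest directory using up to 4 chunks. In a real remote transfer, the assembly, checksum and cleanup steps on the destination side would have to run on the remote host (for instance over `ssh`), which this sketch does not attempt.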
Update
To update pscp to its latest version, use:
pscp --setup=update
Uninstall
To remove the pscp installation, use:
pscp --setup=uninstall
This script started as an experiment.
As part of the experiment, files of various sizes were copied between AWS instances, from Frankfurt DE to Oregon US.
The same files were copied 10 times using scp (i.e. a single process) and 10 times using pscp with the default threads value (i.e. 10 parallel scp processes, each transferring a single chunk).
The results show that the transfer time in parallel (using pscp) is approximately 30%-40% of the transfer time using scp.
There are several overheads which are not included in the calculations above:
- Splitting the file in the source and re-assembling it in the destination.
  This is a mandatory overhead, but it usually takes no more than 2-3 seconds.
- In order to verify that the file was transferred successfully, the script performs a checksum verification.
  It calculates the checksum (SHA-1) of the file in the source, the checksum of the file in the destination and compares them.
  This verification also takes ~2 seconds, but can be disabled using the --no-verify flag.
- The script cleans up temporary files (mainly the chunks and a control file) from both the source and the destination.
  This cleanup also takes ~2 seconds, but can be disabled using the --no-cleanup flag.
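To get a feel for how these overheads affect the total time, the following sketch combines the approximate figures quoted in this document: a transfer-time ratio of 30%-40% and roughly 7 seconds of fixed overhead (2-3 s for split and assembly, ~2 s for verification, ~2 s for cleanup). The function names and the 100-second scp time are hypothetical, and the output is a rough estimate rather than a measurement.

```bash
# Rough estimate only: the ratio and overhead values come from the
# approximate figures in the text, not from new measurements.

estimate_pscp_seconds() {
  # $1 = plain scp transfer time in seconds
  # $2 = pscp/scp transfer-time ratio (about 0.3 to 0.4 in the experiment)
  # $3 = fixed overhead in seconds (split/assemble + verify + cleanup)
  awk -v t="$1" -v r="$2" -v o="$3" 'BEGIN { printf "%.1f\n", t * r + o }'
}

break_even_seconds() {
  # pscp is faster overall when t * r + o < t, i.e. when t > o / (1 - r).
  # $1 = ratio, $2 = fixed overhead in seconds
  awk -v r="$1" -v o="$2" 'BEGIN { printf "%.1f\n", o / (1 - r) }'
}

estimate_pscp_seconds 100 0.4 7   # prints 47.0
break_even_seconds 0.4 7          # prints 11.7
```

Under these assumptions, a file that takes 100 seconds with scp would take about 47 seconds end to end with pscp, and pscp only pays off for transfers that take more than roughly 12 seconds with scp.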
Contributions of all kinds are more than welcome (bug fixes, new features, documentation, etc.).
Feel free to create an issue, or even better - open a pull request.