Gtransfer is a wrapper script for tgftp (which itself wraps globus-url-copy) and also uses functionality of uberftp. Gtransfer provides an advanced command line interface for performing GridFTP data transfers. The primary aim of gtransfer is to make GridFTP data transfers on the command line as easy as possible for the user. Therefore a user only has to provide the source and the destination to perform a data transfer:
$ gt -s <SOURCE> -d <DESTINATION>
Gtransfer can transfer files along predefined paths by using transit sites and can therefore bridge different network domains.
Example:
$ gt -s host1:/files/* -d host3:/files/
NOTICE: This example uses two host aliases, host1: and host3:, which can point to ordinary host addresses like gsiftp://host1.domain.tld:2811.
The host host1 is located in a private network, host3 is located on the Internet, and host2 has connections to both networks. To transfer files from host1 to host3, gtransfer copies the files to the transit host host2 (first step) and afterwards from host2 to host3 (second step). After the transfer has finished, temporary files are removed from host2. See dpath(5) for details.
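The two-step procedure can be pictured with a short Python sketch. It is not gtransfer's implementation: the function name plan_steps, the temporary directory /tmp/transit/, and the representation of a path as a list of host aliases are assumptions made for illustration only. The real path definitions are described in dpath(5).

```python
def plan_steps(path, src_dir, dst_dir, tmp_dir="/tmp/transit/"):
    """Return one (source, destination) pair per hop along `path`.

    `path` is a list of host aliases, e.g. ["host1:", "host2:", "host3:"].
    Transit hosts receive the files in `tmp_dir`, a hypothetical location
    chosen for this sketch.
    """
    if len(path) < 2:
        raise ValueError("a path needs at least a source and a destination host")
    steps = []
    last_hop = len(path) - 2
    for i in range(len(path) - 1):
        # The first hop reads from the real source directory,
        # later hops read from the temporary directory on the transit host.
        src = path[i] + (src_dir if i == 0 else tmp_dir)
        # The last hop writes to the real destination directory,
        # earlier hops write to the temporary directory on the transit host.
        dst = path[i + 1] + (dst_dir if i == last_hop else tmp_dir)
        steps.append((src, dst))
    return steps


steps = plan_steps(["host1:", "host2:", "host3:"], "/files/", "/files/")
for number, (src, dst) in enumerate(steps, start=1):
    print(f"step {number}: {src} -> {dst}")
```

With the three hosts from the example this yields two steps, one from host1 to host2 and one from host2 to host3. Removing the temporary files on the transit host afterwards is left out of the sketch.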
Gtransfer can distribute a data transfer over multiple paths. This way users can benefit from the combined bandwidth of multiple paths.
Example:
$ gt -s host1:/file/* -d host3:/files/ -m all
The host host1 has connections to both the Internet and a private network. The bandwidth of the Internet connection is limited to 1 Gb/s, but the connection to the private network has a bandwidth of 10 Gb/s. The host host2 has a bandwidth of 10 Gb/s on connections to both the Internet and the private network. In effect there are two paths available from host1 to host3: one direct path and one indirect path using host2 as transit site. With multipathing, both paths can be used instead of only one, which combines the available bandwidth. To distribute a data transfer over those two paths, gtransfer splits the list of files to be transferred into two lists according to the bandwidth proportions, taking the file sizes into account. That is, the connection with the greater bandwidth transfers a greater share of the total data volume than the other connection.
NOTICE: Because the second path uses a transit site and needs two transfer steps to complete, the effective bandwidth is lower than the bandwidth of the used connections.
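One way to picture a size-aware split by bandwidth proportions is the greedy Python sketch below. It is an illustration under assumptions, not the algorithm gtransfer actually uses: the function name split_by_bandwidth, the path names "direct" and "transit", and the file sizes are all made up for this example. Each file goes to the path that would finish earliest after taking it, so the data volume per path ends up roughly proportional to the bandwidth.

```python
def split_by_bandwidth(files, bandwidths):
    """Split `files` into one list per path.

    files      -- list of (name, size) tuples, sizes in any common unit
    bandwidths -- dict mapping a path name to its bandwidth
    Returns (lists, assigned): file names per path and total size per path.
    """
    lists = {path: [] for path in bandwidths}
    assigned = {path: 0 for path in bandwidths}
    # Place large files first so the small ones can even out the remainder.
    for name, size in sorted(files, key=lambda f: f[1], reverse=True):
        # Pick the path with the lowest estimated finish time after adding this file.
        best = min(bandwidths, key=lambda p: (assigned[p] + size) / bandwidths[p])
        lists[best].append(name)
        assigned[best] += size
    return lists, assigned


# Hypothetical input: six files and the 1 Gb/s and 10 Gb/s paths from the example.
files = [("a", 40), ("b", 30), ("c", 20), ("d", 8), ("e", 6), ("f", 6)]
lists, assigned = split_by_bandwidth(files, {"direct": 1, "transit": 10})
print(lists)
print(assigned)
```

For this input the direct path receives file d (8 units) and the transit path receives the remaining files (102 units), close to the 1:10 bandwidth ratio. Because a path through a transit site needs two steps, passing its lower effective bandwidth instead of the nominal value would shift more data to the direct path.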
Another aim of gtransfer is to allow well-performing data transfers without detailed knowledge about the underlying facilities. Therefore gtransfer supports usage of pre-optimized data transfer parameters for specific connections. See dparam(5) for details. In addition gtransfer can also automatically optimize a data transfer depending on the size of the files.
Gtransfer supports interruption and continuation of transfers. You can interrupt a transfer by hitting CTRL+C. To continue an interrupted transfer, simply issue the very same command again; gtransfer will then continue the transfer where it was interrupted. The same procedure also works for a failed transfer.
Gtransfer supports automatic retries of failed transfer steps. The number of retries is configurable. See gtransfer(1) for details.
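The general idea of retrying a failed step a configurable number of times can be sketched in Python as follows. This is a generic retry loop, not gtransfer's code: the function name run_with_retries, its parameters, and the simulated transfer step are assumptions for illustration. The actual option that controls the number of retries is documented in gtransfer(1).

```python
import time


def run_with_retries(step, max_retries=3, delay_seconds=0.0):
    """Call `step()` until it succeeds or `max_retries` retries are used up.

    `step` is any callable that returns True on success and False on failure.
    Returns the number of attempts made; raises RuntimeError if all fail.
    """
    attempts = 0
    while attempts <= max_retries:
        attempts += 1
        if step():
            return attempts
        if attempts <= max_retries:
            # Wait before the next retry (no wait after the final attempt).
            time.sleep(delay_seconds)
    raise RuntimeError(f"transfer step failed after {attempts} attempts")


# Simulated transfer step that fails twice before it succeeds.
outcomes = iter([False, False, True])
attempts_needed = run_with_retries(lambda: next(outcomes), max_retries=3)
print(f"succeeded after {attempts_needed} attempts")
```

In the sketch, max_retries counts retries in addition to the first attempt, so a value of 3 allows up to four attempts in total.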
Gtransfer makes use of bash completion to ease usage. This supports completion of options and URLs. URL completion also expands (remote) paths directly on the command line. Just hit the TAB key to see what's possible.
Gtransfer can use host aliases as alternatives to host addresses. E.g. a user can use myGridFTP: and gsiftp://host1.domain.tld:2811 interchangeably. See host aliases for more details.
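Conceptually, an alias is a prefix that is replaced by the address it stands for. The Python sketch below shows that idea only. The alias table, the function name expand_alias, and the file path are hypothetical; how gtransfer stores and resolves aliases is covered in the host aliases documentation.

```python
# Hypothetical alias table for this sketch, using the alias from the text above.
ALIASES = {
    "myGridFTP:": "gsiftp://host1.domain.tld:2811",
}


def expand_alias(url, aliases=ALIASES):
    """Replace a leading host alias in `url` with the address it stands for."""
    for alias, address in aliases.items():
        if url.startswith(alias):
            return address + url[len(alias):]
    # No alias matched: the URL is returned unchanged.
    return url


expanded = expand_alias("myGridFTP:/files/data.bin")
print(expanded)
```

A URL that already starts with a full host address matches no alias and is passed through as it is.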
Gtransfer can use persistent identifiers (PIDs) as used by EUDAT and provided by EPIC as source of a data transfer. See persistent identifiers for more details.
As stated above, the primary aim of gtransfer is to make GridFTP data transfers on the command line as easy as possible for the user. Therefore the simple example in the description should already be suitable for most users.
You can find more detailed examples in the gtransfer wiki on GitHub. Additional examples will be made available occasionally.
This is a list of HPC centers in Europe that use gtransfer in production:
Höchstleistungsrechenzentrum Stuttgart (HLRS - Germany)
CSC - IT Center for Science (CSC - Finland)
Leibniz-Rechenzentrum (LRZ) der Bayerischen Akademie der Wissenschaften (LRZ - Germany)
Irish Centre for High-End Computing (ICHEC - Ireland)
Centro di supercalcolo, Consorzio di università (CINECA - Italy)
SURFsara (SURFsara - The Netherlands)
Centre Informatique National de l’Enseignement Supérieur (CINES - France)
IT4Innovations national supercomputing center (IT4Innovations - Czech Republic)
Karlsruhe Institute of Technology (KIT - Germany)
(GPLv3)
Copyright (C) 2010, 2011, 2013-2017 Frank Scheiner, HLRS, Universitaet Stuttgart
Copyright (C) 2011, 2012, 2013 Frank Scheiner
The software is distributed under the terms of the GNU General Public License
This software is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.