## Transferring files to and from a remote machine

Transfer files between the local filesystem and a remote host's filesystem using Rsync over SSH.

### Procedure

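Transfer a file located on the remote host's filesystem at `/home/ubuntu/remote_unprocessed_file.png` to the local filesystem at `unprocessed_file.png` using Rsync via SSH, process it, and transfer the result back to the remote host at `/home/ubuntu/remote_processed_file.png`.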

1. Define an `Rsync` strategy with the remote host and private key path to be used for SSH:

In [1]:
import covalent as ct
from pathlib import Path
from typing import List, Tuple
from skimage import io, color

private_key = "/path/to/private/key"
host_address = "123.remote.host.address.com"
username = "ubuntu"

unprocessed_filename = "unprocessed_file.png"
processed_filename = "processed_file.png"

unprocessed_filepath = str(Path(unprocessed_filename).resolve())
processed_filepath = str(Path(processed_filename).resolve())

remote_source_path = f"/home/{username}/remote_{unprocessed_filename}"
remote_dest_path = f"/home/{username}/remote_{processed_filename}"

rsync_strategy = ct.fs_strategies.Rsync(user=username, host=host_address, private_key_path=private_key)
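
To make the role of the three strategy arguments concrete, the sketch below builds the kind of `rsync`-over-SSH command line that a download like this conceptually corresponds to. It is an illustration only, not Covalent's implementation: the helper name `build_rsync_pull_command` is hypothetical, and the exact flags that the `Rsync` strategy passes are an assumption. The only `rsync` feature relied on is the standard `-e` option for choosing the remote shell.

```python
import shlex
from typing import List


def build_rsync_pull_command(
    user: str,
    host: str,
    private_key_path: str,
    remote_path: str,
    local_path: str,
) -> List[str]:
    """Build an rsync command that copies remote_path on host to local_path.

    Hypothetical helper for illustration; Covalent may assemble its command differently.
    """
    # `-e` tells rsync which remote shell to use; here, ssh with an identity file.
    ssh_command = f"ssh -i {shlex.quote(private_key_path)}"
    return ["rsync", "-e", ssh_command, f"{user}@{host}:{remote_path}", local_path]


command = build_rsync_pull_command(
    user="ubuntu",
    host="123.remote.host.address.com",
    private_key_path="/path/to/private/key",
    remote_path="/home/ubuntu/remote_unprocessed_file.png",
    local_path="unprocessed_file.png",
)

# Print a shell-safe rendering of the command without running it.
print(shlex.join(command))
```

Running the printed command by hand is one way to confirm that the key, username, and host address work before they are handed to the strategy.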

2. Generate the `FileTransfer` objects using the `TransferFromRemote` and `TransferToRemote` convenience factories:

In [2]:
ft_1 = ct.fs.TransferFromRemote(remote_source_path, unprocessed_filepath, strategy=rsync_strategy)
ft_2 = ct.fs.TransferToRemote(remote_dest_path, processed_filepath, strategy=rsync_strategy)

The `Transfer*` factories also set the order of each transfer relative to the task. A `TransferFromRemote` transfer runs before the task executes, so that the task can process the downloaded file; a `TransferToRemote` transfer runs after the task finishes, so that it can upload the file the task produced.

Note that `TransferToRemote` is the only factory that takes the destination path first and the source path second. The `FileTransfer` object it generates still follows the `(<source_file_path>, <dest_file_path>)` convention.
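The argument order and the resulting `(source, dest)` pairs can be modeled with a small stand-in. This is a sketch under stated assumptions, not Covalent code: `TransferSketch`, `from_remote_sketch`, and `to_remote_sketch` are hypothetical names, and the `"before"`/`"after"` strings stand in for whatever ordering mechanism the library uses internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferSketch:
    """Hypothetical stand-in for a file transfer: always stored as (source, dest)."""

    source: str
    dest: str
    order: str  # "before" or "after" the task runs


def from_remote_sketch(remote_source: str, local_dest: str) -> TransferSketch:
    # Download: arguments are given as (source, dest) and run before the task.
    return TransferSketch(source=remote_source, dest=local_dest, order="before")


def to_remote_sketch(remote_dest: str, local_source: str) -> TransferSketch:
    # Upload: arguments are given as (dest, source), but stored as (source, dest).
    return TransferSketch(source=local_source, dest=remote_dest, order="after")


download = from_remote_sketch(
    "/home/ubuntu/remote_unprocessed_file.png", "unprocessed_file.png"
)
upload = to_remote_sketch(
    "/home/ubuntu/remote_processed_file.png", "processed_file.png"
)

print(download)
print(upload)
```

In both cases the remote path is the first argument, which is why the upload factory ends up with its destination first.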

3. Define an electron, passing the Covalent `FileTransfer` objects to the `files` keyword argument in the decorator:

In [3]:
@ct.electron(files=[ft_1, ft_2]) # ft_1 is done before the electron is executed and ft_2 after.
def to_grayscale(files: List[Tuple[str, str]] = None):

    # Get the downloaded file's path
    image_path = files[0][1] # destination filepath of first file transfer, which has been downloaded
    
    # Convert the image to grayscale
    img = io.imread(image_path)[:, :, :3] # limiting image to 3 channels
    gray_img = color.rgb2gray(img)

    # Save the grayscale image to the to-be-uploaded file's path
    gray_image_path = files[1][0] # source filepath of second file transfer, which will be uploaded
    io.imsave(gray_image_path, gray_img)
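
The indexing `files[0][1]` and `files[1][0]` is easy to misread, so the sketch below shows the same selection with tuple unpacking on plain data. It assumes, as the electron above does, that `files` holds one `(source, dest)` pair per transfer, in the order they were listed in the decorator. The helper name `select_local_paths` and the `/tmp` paths are hypothetical placeholders.

```python
from typing import List, Tuple


def select_local_paths(files: List[Tuple[str, str]]) -> Tuple[str, str]:
    """Return (downloaded_path, path_to_upload) from two (source, dest) pairs."""
    # First pair: remote source -> local dest (the downloaded file).
    # Second pair: local source -> remote dest (the file that will be uploaded).
    (_, downloaded_path), (path_to_upload, _) = files
    return downloaded_path, path_to_upload


# Placeholder pairs shaped like the ones the electron receives.
example_files = [
    ("/home/ubuntu/remote_unprocessed_file.png", "/tmp/unprocessed_file.png"),
    ("/tmp/processed_file.png", "/home/ubuntu/remote_processed_file.png"),
]

image_path, gray_image_path = select_local_paths(example_files)
print(image_path)
print(gray_image_path)
```

Both selected paths are local: the task reads from the destination of the download and writes to the source of the upload.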


4. Define a lattice that calls the electron, then dispatch the workflow:

In [4]:
@ct.lattice
def process_remote_data():
    return to_grayscale()

dispatch_id = ct.dispatch(process_remote_data)()
status = ct.get_result(dispatch_id, wait=True).status
print(status)

COMPLETED


The unprocessed file on the remote machine is transferred to the specified local path, processed, and the result is transferred back to the remote machine. Both transfers are performed with `rsync`. In practice, this pattern can be used to fetch input data for a workflow and to move data that the workflow generates.
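After a `COMPLETED` status, a quick sanity check is to confirm that the local copies exist. The sketch below assumes the electron ran on the local machine, so both the downloaded file and the processed file should be present locally; the helper name `find_missing` is hypothetical, and the demonstration uses a temporary directory rather than the real workflow paths.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def find_missing(paths: Iterable[str]) -> List[str]:
    """Return the paths that do not point to an existing regular file."""
    return [path for path in paths if not Path(path).is_file()]


# Demonstration with a temporary directory instead of the real workflow paths.
with tempfile.TemporaryDirectory() as tmp_dir:
    present = Path(tmp_dir) / "unprocessed_file.png"
    absent = Path(tmp_dir) / "processed_file.png"
    present.write_bytes(b"")  # create an empty placeholder file

    missing = find_missing([str(present), str(absent)])
    missing_names = [Path(path).name for path in missing]

print(missing_names)
```

In the workflow above, the same check could be applied to `unprocessed_filepath` and `processed_filepath`; an empty result means both files are in place.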

### See Also

[Transferring Local Files During Workflows](./file_transfers_for_workflows_local.ipynb)

[Transferring Files to and from an S3 Bucket](./file_transfers_for_workflows_to_from_s3.ipynb)