## Transferring Files to and from an S3 Bucket

Retrieve files from an S3 bucket before executing a task, and upload files to an S3 bucket after a task, using the `boto3` library.

### Procedure

Transfer a file between an S3 bucket and the local filesystem using Covalent `FileTransfer` objects with the `S3` strategy, which relies on the `boto3` library.

In the following example, an image file (PNG) is downloaded from an S3 bucket before the electron executes. The electron processes the image, and the processed file is then uploaded back to the same S3 bucket.

1. Define two Covalent `FileTransfer` objects and a Covalent `S3` strategy object:

In [1]:
import covalent as ct
from typing import List, Tuple
from pathlib import Path
from skimage import io, color

strategy = ct.fs_strategies.S3()

unprocessed_filename = "unprocessed_file.png"
processed_filename = "processed_file.png"

unprocessed_filepath = str(Path(unprocessed_filename).resolve())
processed_filepath = str(Path(processed_filename).resolve())


s3_source_path = f"s3://covalent-howto-tmp/remote_{unprocessed_filename}"
s3_dest_path = f"s3://covalent-howto-tmp/remote_{processed_filename}"

ft_1 = ct.fs.FileTransfer(s3_source_path, unprocessed_filepath, strategy=strategy)
ft_2 = ct.fs.FileTransfer(processed_filepath, s3_dest_path, strategy=strategy, order=ct.fs.Order.AFTER)


2. Define an electron to:
    1. Download the unprocessed file from S3
    2. Perform some processing on the contents
    3. Upload the processed file to S3

To access the file paths inside the electron, as the electron below does, add a `files` keyword argument to the electron function. Covalent injects the source and destination file paths of the `FileTransfer` objects passed to the electron into the `files` argument. In this case, the `files` variable looks something like this:
```python
[('/remote_unprocessed_file.png', '/path/to/current/dir/unprocessed_file.png'), ('/path/to/current/dir/processed_file.png', '/remote_processed_file.png')]
```
The format of each tuple is `(<source-path>, <destination-path>)`. The name of the S3 bucket is omitted from the paths inside the electron for two reasons: (a) security, since bucket names are globally unique; and (b) the bucket was already specified when the `FileTransfer` objects were defined, so Covalent handles it without further involvement from the user.
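
To make the relationship between the full S3 URIs and the bucket-less paths concrete, the following minimal sketch splits an S3 URI into its bucket and path parts using only the Python standard library. The helper name `split_s3_uri` is hypothetical and is not part of Covalent; this illustrates the URI structure only and is not a description of how Covalent processes the paths internally.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Hypothetical helper (not a Covalent API): split an S3 URI into (bucket, path)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # netloc holds the bucket name; path holds the rest, with a leading slash
    return parsed.netloc, parsed.path


bucket, path = split_s3_uri("s3://covalent-howto-tmp/remote_unprocessed_file.png")
print(bucket)  # covalent-howto-tmp
print(path)    # /remote_unprocessed_file.png
```

The path part matches the form of the remote paths in the `files` list above: the bucket name is gone and only the slash-prefixed path remains.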

In [2]:
@ct.electron(files=[ft_1, ft_2])  # ft_1 runs before the electron executes, ft_2 after
def to_grayscale(files: List[Tuple[str, str]] = None):

    # Get the downloaded file's path
    image_path = files[0][1] # destination filepath of first file transfer, which has been downloaded
    
    # Convert the image to grayscale
    img = io.imread(image_path)[:, :, :3] # limiting image to 3 channels
    gray_img = color.rgb2gray(img)

    # Save the grayscale image to the to-be-uploaded file's path
    gray_image_path = files[1][0] # source filepath of second file transfer, which will be uploaded
    io.imsave(gray_image_path, gray_img)
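
The index pairs `files[0][1]` and `files[1][0]` can be easy to misread. As an optional alternative, the sketch below uses tuple unpacking to give the two paths names. It runs on its own, outside Covalent, and assumes `files` has the same shape as the sample value shown earlier (one download transfer followed by one upload transfer); the variable names `download_dest` and `upload_src` are illustrative only.

```python
from typing import List, Tuple

# Sample value of `files`, copied from the example earlier in this guide.
files: List[Tuple[str, str]] = [
    ("/remote_unprocessed_file.png", "/path/to/current/dir/unprocessed_file.png"),
    ("/path/to/current/dir/processed_file.png", "/remote_processed_file.png"),
]

# Each tuple is (source, destination). For the download, the local file is the
# destination; for the upload, the local file is the source.
(_, download_dest), (upload_src, _) = files

print(download_dest)  # /path/to/current/dir/unprocessed_file.png
print(upload_src)     # /path/to/current/dir/processed_file.png
```

The unpacking fails with a `ValueError` if `files` does not contain exactly two pairs, which can serve as an early check that the expected transfers were attached to the electron.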


3. Create and dispatch a lattice to run the electron:

In [3]:
@ct.lattice
def process_s3_data():
    return to_grayscale()

dispatch_id = ct.dispatch(process_s3_data)()
result = ct.get_result(dispatch_id, wait=True)
print(result)


Lattice Result
status: COMPLETED
result: None
input args: []
input kwargs: {}
error: None

start_time: 2023-02-22 07:23:14.389188
end_time: 2023-02-22 07:23:16.403313

results_dir: /home/neptune/.local/share/covalent/data
dispatch_id: 232331bb-f693-401a-b860-ccca8c946f99

Node Outputs
------------
to_grayscale(0): None



Notes:
- This example illustrates a typical pattern in which files are downloaded from remote storage, are processed, and the results are uploaded to the same remote storage. Other scenarios can of course be implemented with the Covalent components illustrated here (`FileTransfer`, `FileTransferStrategy`, `@electron`).
- The example puts everything in one electron (file download, processing, file upload). For a real-world scenario of any complexity, a better practice would be to break the task into small sub-tasks, each in its own electron.

### See Also

[Transferring Local Files During Workflows](./file_transfers_for_workflows_local.ipynb)

[Transferring Remote Files During Workflows](./file_transfers_to_remote.ipynb)