## Transferring Files to and from an S3 Bucket

Retrieve files from an S3 bucket before executing a task, and upload files to an S3 bucket after a task, using the `boto3` library.

### Prerequisites

1. Define the read (source) file path. 
2. Create a source file to transfer.

In [1]:
import covalent as ct 

from pathlib import Path

# Define the source filepath
source_filepath = Path('./my_source_file').resolve()

# Create an example file
source_filepath.touch()

### Procedure

Transfer files between an S3 bucket and the local filesystem using the `boto3` library.

In the following example, a zip file is downloaded from an S3 bucket before the electron executes. The electron processes the file's contents, and the processed files are then uploaded back to the S3 bucket after the electron finishes.

1. Define two Covalent `FileTransfer` objects and a Covalent `S3` strategy object:

In [2]:
import covalent as ct
import zipfile
import os

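strategy = ct.fs_strategies.S3()

# ft_1: download from S3 to the local filesystem (no order given, so the default applies)
ft_1 = ct.fs.FileTransfer(
    's3://covalent-tmp/test_vids.zip',
    '/home/ubuntu/tmp-dir/test_vids.zip',
    strategy=strategy,
)
# ft_2: upload from the local filesystem to S3 after the electron runs
ft_2 = ct.fs.FileTransfer(
    '/home/ubuntu/tmp-dir/images.zip',
    's3://covalent-tmp/images.zip',
    strategy=strategy,
    order=ct.fs.Order.AFTER,
)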


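Note: The recorded output of this cell is `NoCredentialsError: Unable to locate credentials`, which `boto3` raises when it cannot find AWS credentials. Configure AWS credentials in the environment before running the cell.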
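
The `NoCredentialsError` above means that no AWS credentials were found. As a rough pre-flight check, the sketch below looks in two common credential locations: the `AWS_ACCESS_KEY_ID`/`AWS_SECRET_ACCESS_KEY` environment variables and the shared credentials file. The helper name `find_aws_credentials_source` is hypothetical, and the check is deliberately incomplete: `boto3` can also obtain credentials from other sources (for example, an instance role), so a `None` result here is a hint rather than proof that a transfer will fail.

```python
import os
from typing import Mapping, Optional


def find_aws_credentials_source(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a description of where credentials were found, or None.

    Hypothetical helper: it checks only environment variables and the
    shared credentials file, not every source that boto3 supports.
    """
    env = os.environ if env is None else env

    # 1. Static credentials supplied through environment variables
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment variables"

    # 2. Shared credentials file (default location can be overridden)
    cred_file = os.path.expanduser(
        env.get("AWS_SHARED_CREDENTIALS_FILE", "~/.aws/credentials")
    )
    if os.path.isfile(cred_file):
        return "shared credentials file"

    return None


source = find_aws_credentials_source()
print(source or "No credentials found in the locations checked")
```

Passing a plain dictionary as `env` makes it possible to exercise the check without touching the real environment.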

2. Define an electron that, together with its attached file transfers, does the following:
    1. Download the zip file from S3;
    2. Unzip the file;
    3. Perform some processing on the contents;
    4. Zip the files; and
    5. Upload the zip file to S3.

In [None]:
@ct.electron(files=[ft_1, ft_2])  # ft_1 runs before the electron executes and ft_2 after.
def unzip_zip(files=[]):
    path = "/home/ubuntu/tmp-dir"
    # Unzip the downloaded data
    with zipfile.ZipFile(path + "/test_vids.zip", 'r') as zip_ref:
        zip_ref.extractall(path)

    # Perform operations on the files.
    # (In a real-world workflow, these operations would be the purpose of this electron.
    # They are omitted for the file transfer demonstration.)

    # Zip files to upload. The loop variables are named `filenames`/`filename`
    # so that they do not shadow the `files` argument.
    with zipfile.ZipFile(path + "/images.zip", 'w', zipfile.ZIP_DEFLATED) as ziph:
        for root, dirs, filenames in os.walk(path + '/test_vids'):
            for filename in filenames:
                file_path = os.path.join(root, filename)
                # Store each entry under a name relative to the parent of `path`
                ziph.write(file_path,
                           os.path.relpath(file_path, os.path.join(path, '..')))
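
The second argument to `ziph.write` above sets the name each file receives inside the archive. Because the name is computed relative to the parent of `path`, every entry keeps the working directory's own name as a prefix. The sketch below reproduces that computation in a temporary directory so it can be run without S3 or Covalent; the file name `clip.txt` and the temporary layout are placeholders, not part of the original example.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as base:
    # Mirror the layout used in the electron: <path>/test_vids/<files>
    path = os.path.join(base, "tmp-dir")
    os.makedirs(os.path.join(path, "test_vids"))
    with open(os.path.join(path, "test_vids", "clip.txt"), "w") as f:
        f.write("placeholder content")

    archive = os.path.join(path, "images.zip")
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as ziph:
        for root, _dirs, filenames in os.walk(os.path.join(path, "test_vids")):
            for filename in filenames:
                file_path = os.path.join(root, filename)
                # Archive name is relative to the parent of `path`
                arcname = os.path.relpath(file_path, os.path.join(path, ".."))
                ziph.write(file_path, arcname)

    # Read the entry names back before the temporary directory is removed
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()

print(names)  # ['tmp-dir/test_vids/clip.txt']
```

If the archive entries should start at `test_vids/` instead, computing the name relative to `path` itself would drop the `tmp-dir/` prefix.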

3. Create and dispatch a lattice to run the electron:

In [None]:
@ct.lattice
def process_s3_data():
    return unzip_zip()

dispatch_id = ct.dispatch(process_s3_data)()

Notes:
- This example illustrates a typical pattern in which files are downloaded from remote storage and processed, and the results are uploaded to the same remote storage. Other scenarios can be implemented with the Covalent components illustrated here (`FileTransfer`, `FileTransferStrategy`, `@electron`).
- The example puts everything (file download, processing, file upload) in one electron. For a real-world scenario of any complexity, a better practice is to break the task into small sub-tasks, each in its own electron.
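
One way to picture the decomposition suggested in the last note is to split the work into three small functions. The sketch below uses plain Python and the standard library so that it runs without a Covalent server; in a workflow, each function could be decorated with `@ct.electron` and called from a lattice. The function names and the upper-casing "processing" step are hypothetical stand-ins, and the sketch assumes that all steps can see the same filesystem.

```python
import shutil
import tempfile
from pathlib import Path


def unzip_step(zip_path: str, work_dir: str) -> str:
    """Extract the archive into work_dir and return that directory."""
    shutil.unpack_archive(zip_path, work_dir, "zip")
    return work_dir


def process_step(work_dir: str) -> str:
    """Hypothetical processing: upper-case every .txt file in place."""
    for txt in Path(work_dir).rglob("*.txt"):
        txt.write_text(txt.read_text().upper())
    return work_dir


def zip_step(work_dir: str, archive_base: str) -> str:
    """Zip work_dir and return the path of the archive that was created."""
    return shutil.make_archive(archive_base, "zip", root_dir=work_dir)


with tempfile.TemporaryDirectory() as base:
    # Build a small input archive to stand in for the downloaded file
    src = Path(base, "src")
    src.mkdir()
    (src / "note.txt").write_text("hello")
    input_zip = shutil.make_archive(str(Path(base, "input")), "zip", root_dir=str(src))

    # Chain the three steps, as a lattice would chain three electrons
    work = unzip_step(input_zip, str(Path(base, "work")))
    output_zip = zip_step(process_step(work), str(Path(base, "output")))

    # Unpack the result to confirm that the processing step was applied
    check = Path(base, "check")
    shutil.unpack_archive(output_zip, str(check), "zip")
    result_text = (check / "note.txt").read_text()

print(result_text)  # HELLO
```

Keeping each step separate makes it easier to rerun or replace one stage, at the cost of passing paths between steps.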

### See Also

[Transferring Local Files During Workflows](./file_transfers_for_workflows_local.ipynb)

[Transferring Remote Files During Workflows](./file_transfers_to_remote.ipynb)