# Using cloudknot to batch write files to an S3 bucket

This example uses cloudknot to write files to an Amazon S3 bucket. Cloudknot usually returns output to the user, using Amazon S3 as an intermediary. Sometimes, though, you might want to leave your data on S3 without transferring it back to your local machine. In that case, use the following example as a guide.

In [1]:
import cloudknot as ck

First, we write the Python function that we want to run on AWS Batch. Note that we import the necessary Python packages within the function `write_to_bucket`, because the function runs remotely. You should change `bucket_name` to the S3 bucket that you would like to write to. Because the function returns `None`, it minimizes the content added to your cloudknot S3 bucket.

In [2]:
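def write_to_bucket(index):
    # Import inside the function so the packages are available on the remote host
    import boto3
    import platform

    # boto3.resource returns a high-level S3 resource, not a low-level client
    s3 = boto3.resource('s3')

    # Name of the machine that runs this job
    host = platform.node()

    # Write a small text file whose name encodes the job index
    fn = 'temp_{i:03d}.txt'.format(i=int(index))
    with open(fn, 'w') as f:
        f.write("Hello World from index {i:s} on host {host:s}!".format(
            i=str(index), host=host))

    # Change this to the S3 bucket that you want to write to
    bucket_name = 'escience.washington.edu.public'
    bucket = s3.Bucket(bucket_name)

    # Upload the local file, using the file name as the S3 key
    bucket.upload_file(fn, fn)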
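
Before submitting jobs, it can help to check the file name and message formatting locally. The following is a minimal sketch and not part of cloudknot: `preview_output` is a hypothetical helper that mirrors the two `format` calls in `write_to_bucket` without touching S3, and the default `host` value is a placeholder.

```python
def preview_output(index, host="example-host"):
    """Return the (file name, message) pair that write_to_bucket would produce.

    Hypothetical helper for local checks only; it does not contact AWS.
    """
    # Same zero-padded file name pattern as in write_to_bucket
    fn = 'temp_{i:03d}.txt'.format(i=int(index))
    # Same message template as in write_to_bucket
    msg = "Hello World from index {i:s} on host {host:s}!".format(
        i=str(index), host=host)
    return fn, msg


fn, msg = preview_output(7)
print(fn)   # temp_007.txt
print(msg)  # Hello World from index 7 on host example-host!
```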

By default, cloudknot does not attach any additional policies to its IAM roles. But since we are writing to an S3 bucket, we want our IAM roles to have S3 access. So we add that to the roles in the PARS that our knot is based on.

Create a knot using the `write_to_bucket` function and the "AmazonS3FullAccess" policy.

In [3]:
knot = ck.Knot(name='write-to-s3-bucket-2',
               func=write_to_bucket,
               base_image="python:3.7",
               pars_policies=('AmazonS3FullAccess',))

Submit 10 batch jobs to the knot.

In [4]:
result_futures = knot.map(range(10))

We can query the jobs associated with this knot by calling `knot.view_jobs()`, which prints job information and provides a concise summary of job statuses.

In [5]:
# Rerun this cell as often as you like to update your job status info
knot.view_jobs()

Job ID                                Name                    Status
-----------------------------------------------------------------------
c49bdc96-7b49-45d5-886c-8c22dbfad38d  write-to-s3-bucket-2-0  SUBMITTED

To check the results, you can log in to the S3 console at https://s3.console.aws.amazon.com/s3/home. Verify that the jobs created ten text files, `temp_000.txt` through `temp_009.txt`, in your S3 bucket.
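
The same check can be done from Python instead of the console. The sketch below is an assumption-laden illustration, not part of cloudknot: `expected_keys`, `missing_keys`, and `list_bucket_keys` are hypothetical helper names, and the key pattern is taken from `write_to_bucket` above. Only `list_bucket_keys` contacts AWS, so it needs valid credentials and read access to the bucket.

```python
def expected_keys(n_jobs):
    """File names that write_to_bucket produces for indices 0..n_jobs-1."""
    return ['temp_{i:03d}.txt'.format(i=i) for i in range(n_jobs)]


def missing_keys(found_keys, n_jobs=10):
    """Return the expected keys that are absent from found_keys."""
    found = set(found_keys)
    return [key for key in expected_keys(n_jobs) if key not in found]


def list_bucket_keys(bucket_name, prefix='temp_'):
    """List object keys in the bucket that start with prefix.

    Requires AWS credentials; boto3 is imported here so the
    helpers above stay usable without it.
    """
    import boto3

    bucket = boto3.resource('s3').Bucket(bucket_name)
    return [obj.key for obj in bucket.objects.filter(Prefix=prefix)]


# Example usage, once the jobs have finished (replace the bucket name with yours):
# print(missing_keys(list_bucket_keys('escience.washington.edu.public')))
```

An empty list from `missing_keys` means every expected file is present in the bucket.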

Once you're all done, clobber the knot, including the underlying PARS, the local Docker image, and the remote repo.

In [6]:
knot.clobber(clobber_pars=True, clobber_repo=True, clobber_image=True)