# File operations on AWS S3 with Python

There are several ways to access files on Amazon Web Services (AWS) S3, where the FlyLight imagery, color depth MIPs, and templates are stored. A simple way is to use the [boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) library:

In [None]:
import boto3

S3 stores files (called *objects*) in *buckets*. With boto3, it's easy to access an S3 bucket:

In [None]:
S3_RESOURCE = boto3.resource('s3')
bucket = S3_RESOURCE.Bucket("janelia-flylight-templates")

The resource created above is boto3's high-level, object-oriented interface to S3. From the resource, we create a bucket object that refers to the named bucket.
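
By default, boto3 signs its requests with whatever AWS credentials are configured in your environment. If you have none, and if the bucket permits anonymous reads (an assumption here, not something stated above), boto3 can be asked to send unsigned requests instead. A minimal sketch; the helper name `make_anonymous_s3_resource` is hypothetical:

```python
def make_anonymous_s3_resource():
    """Return a boto3 S3 resource that sends unsigned (anonymous) requests.

    Assumption: the target bucket allows public, unauthenticated reads.
    """
    # Imported inside the function so the sketch can be defined
    # even where boto3 is not installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    return boto3.resource("s3", config=Config(signature_version=UNSIGNED))
```

It would be used in place of the plain resource, for example `S3_RESOURCE = make_anonymous_s3_resource()`.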

## Listing the contents of a bucket

In [None]:
for obj in bucket.objects.all():
    print(obj.key)

In the listing, you'll see what look like folders or directories, with files inside them. S3 has no real directories, though: storage in a bucket is flat rather than hierarchical, and the apparent folders are just slash-separated prefixes within each key (more information [here](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/using-folders.html)).
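
To make the flat layout concrete, here is a small sketch that treats keys as plain strings and counts how many fall under each apparent top-level folder. The function name and the example keys are hypothetical, chosen only for illustration:

```python
from collections import Counter


def count_by_top_level_prefix(keys, delimiter="/"):
    """Count keys under each apparent top-level folder.

    The "folder" is simply the text before the first delimiter in the key.
    """
    counts = Counter()
    for key in keys:
        prefix, sep, _rest = key.partition(delimiter)
        # Keys without a delimiter sit directly at the top level of the bucket.
        counts[prefix + sep if sep else "(top level)"] += 1
    return dict(counts)


# Hypothetical keys, for illustration only
example_keys = [
    "templates/brain.nrrd",
    "templates/vnc.nrrd",
    "docs/readme.txt",
    "index.html",
]
print(count_by_top_level_prefix(example_keys))
```

With a real bucket, the same function could be fed `(obj.key for obj in bucket.objects.all())`. To list only the keys under one prefix, boto3 also offers `bucket.objects.filter(Prefix=...)`.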

## Accessing individual files

Let's look at one of the objects from the FlyLight_Gen1_GAL4 color depth MIP collection - a thumbnail of the original image. Objects are referred to with *keys*:

In [None]:
bucket = S3_RESOURCE.Bucket("janelia-flylight-color-depth-thumbnails")
KEY = "JRC2018_Unisex_20x_HR/FlyLight_Gen1_GAL4/R10C09-20090919_08_fA01b_20090919091328496-GAL4-f-20x-brain-JRC2018_Unisex_20x_HR-CDM_1.jpg"
obj = bucket.Object(KEY)
print(obj)

As you can see above, the object (an [S3 Object](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Object)) stores the bucket name and key. Let's get a few attributes:

In [None]:
print(f"File type is {obj.content_type} and the size is {obj.content_length} bytes")
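
The size attribute is a plain byte count, which gets hard to read for large images. A minimal sketch of a formatter using binary units; `format_size` is a hypothetical helper, not part of boto3:

```python
def format_size(num_bytes):
    """Format a byte count using binary units (1 KiB = 1024 bytes)."""
    size = float(num_bytes)
    for unit in ("bytes", "KiB", "MiB", "GiB"):
        # Stop at the first unit that keeps the number below 1024,
        # or at the largest unit we support.
        if size < 1024 or unit == "GiB":
            break
        size /= 1024
    if unit == "bytes":
        return f"{int(size)} {unit}"
    return f"{size:.1f} {unit}"


print(format_size(512))
print(format_size(1536))
```

For the object above, this could be called as `format_size(obj.content_length)`.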

## Downloading a file

We can easily download a file from a bucket using the [download_file](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Bucket.download_file) method:

In [None]:
bucket.download_file(KEY, '/tmp/demo_file.jpg')

The image was downloaded to */tmp/demo_file.jpg*. Let's display it:

In [None]:
from IPython.display import Image
Image(filename='/tmp/demo_file.jpg')
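
The download above writes to a fixed file name. When fetching several objects, it can be convenient to derive the local name from the key instead. A minimal sketch, assuming `bucket` is a boto3 bucket object like the one used above; `download_to_directory` is a hypothetical helper:

```python
import posixpath
from pathlib import Path


def download_to_directory(bucket, key, directory):
    """Download `key` from `bucket` into `directory`, keeping the key's base name.

    Returns the local path of the downloaded file.
    """
    # S3 keys always use "/" as the separator, so posixpath works on any OS.
    target = Path(directory) / posixpath.basename(key)
    # Create the destination directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    bucket.download_file(key, str(target))
    return target
```

For example, `download_to_directory(bucket, KEY, "/tmp")` would save the thumbnail under */tmp* using the last component of the key as the file name.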

# Using Quilt

Quilt is a versioned data portal for AWS. Among the projects hosted on Quilt is Janelia's FlyLight imagery.
There is a simple API for accessing Quilt data:

In [None]:
import quilt3
quilt3.config("https://open.quiltdata.com")

After importing the quilt3 module, we're simply configuring it to use the Quilt site for Open Data. Let's initialize a bucket:

In [None]:
bucket = quilt3.Bucket("s3://janelia-flylight-templates")

...and then list the contents:

In [None]:
bucket.ls()

Note that by default ls only shows one level of prefixes and objects. If you want everything, specify recursive=True:

In [None]:
bucket.ls(recursive=True)

If we want to look at all of the keys in a bucket, we can do that:

In [None]:
bucket.keys()

If we're interested in listing only one prefix, we can specify it:

In [None]:
bucket.ls(path="JFRC2017_VNC_FEMALE_SYMMETRIC")

Let's download a file from the above prefix:

In [None]:
bucket.fetch("JFRC2017_VNC_FEMALE_SYMMETRIC/20x_flyVNCtemplate_Female_symmetric_16bit.nrrd", "/tmp/female.nrrd")