# Tutorial for Interacting with OpenStack Swift from Python

### Purpose

This notebook shows how to interact with an instance of OpenStack Swift from Python. It uses `python-swiftclient`, the official OpenStack Python client for Swift, and has been tested with Anaconda Python 3. To keep the focus on the essential mechanics of the client library in a small amount of code, not all error-checking best practices are implemented in this tutorial.

### Reference documents

* Source of the Python bindings to the OpenStack object storage API: [https://github.com/openstack/python-swiftclient](https://github.com/openstack/python-swiftclient)
* `python-swiftclient` on PyPI: [https://pypi.python.org/pypi/python-swiftclient](https://pypi.python.org/pypi/python-swiftclient)
* `swiftclient` documentation: [http://docs.openstack.org/developer/python-swiftclient/swiftclient.html#swiftclient.client.head_object](http://docs.openstack.org/developer/python-swiftclient/swiftclient.html#swiftclient.client.head_object)

### Connect to the OpenStack Swift service
We use the OpenStack service credentials extracted from the following environment variables:

* `OS_AUTH_URL`: authentication end point (Keystone)
* `OS_USERNAME`: user name
* `OS_TENANT_NAME`: tenant name
* `OS_PASSWORD`: password (passed to the client as `key`)

You can specify your credentials directly in this notebook, if you wish. The values of these credentials must be provided to you by the administrator of the OpenStack Swift instance you need to interact with.

In [135]:
import os
import datetime
import tempfile
import hashlib

import swiftclient as swift
import requests

In [136]:
authurl = os.getenv('OS_AUTH_URL', 'your_auth_endpoint')
user = os.getenv('OS_USERNAME', 'your_username')
tenant_name = os.getenv('OS_TENANT_NAME', 'your_tenant_name')
key = os.getenv('OS_PASSWORD', 'your_password')

conn = swift.Connection(authurl=authurl, user=user, key=key, tenant_name=tenant_name, auth_version='2')

We now have a connection object `conn` through which we will interact with Swift.
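
Because each `os.getenv` call above falls back to a placeholder string, a missing variable goes unnoticed until the first request is sent. A minimal sketch of an up-front check, assuming the four variable names listed in this section; the helper name `missing_credentials` is hypothetical and not part of `swiftclient`:

```python
import os

# Variable names taken from the list in this section
REQUIRED_VARS = ("OS_AUTH_URL", "OS_USERNAME", "OS_TENANT_NAME", "OS_PASSWORD")

def missing_credentials(environ=None):
    """Return the names of the required variables that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]

missing = missing_credentials()
if missing:
    print("Missing credentials:", ", ".join(missing))
```

Passing a plain dictionary instead of `os.environ` makes the check easy to try out without touching the real environment.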

### Create a new container
Let's create a new container for our tests, which we will destroy later. The name of the container is composed of the prefix `butlerswift`, followed by the user name and the creation timestamp.

Creating a container is an idempotent operation: if the container already exists, the request is a no-op.

In [147]:
container_prefix = 'butlerswift-{0}-'.format(user)
container_name = container_prefix + '{:%Y%m%d%H%M%S}'.format(datetime.datetime.now())
conn.put_container(container_name)
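
To make the naming scheme concrete, here is a small sketch that builds the name from a fixed timestamp instead of `datetime.datetime.now()`. The helper `make_container_name` is hypothetical; the user name and timestamp are chosen to match the container name shown in the outputs of this notebook.

```python
import datetime

def make_container_name(user, when, prefix="butlerswift"):
    """Build '<prefix>-<user>-<YYYYmmddHHMMSS>' from a user name and a datetime."""
    return "{0}-{1}-{2:%Y%m%d%H%M%S}".format(prefix, user, when)

name = make_container_name("fabio", datetime.datetime(2016, 9, 2, 15, 29, 5))
print(name)  # butlerswift-fabio-20160902152905
```

Since the timestamp has a resolution of one second, two runs within the same second produce the same name; because container creation is idempotent, the second `put_container` call is harmless.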

### Check the existence of the container
To check the existence of the container, we issue an `HTTP HEAD` request. If the container exists, the `HTTP` headers associated with it are returned as a dictionary. We use the value of one of them, `x-container-object-count`, which tells us the number of objects in the container.

In [148]:
try:
    headers = conn.head_container(container_name)
    num_objects = headers['x-container-object-count']
    print("Container \'{0}\' does exist and contains {1} objects".format(container_name, num_objects))
except swift.ClientException as err:
    print("Container \'{0}\' not found: {1}".format(container_name, err))

Container 'butlerswift-fabio-20160902152905' does exist and contains 0 objects
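
The `except` branch above reports every `ClientException` as "not found", although the same exception type can also signal other failures, such as rejected credentials or a server error. A hedged sketch that treats only HTTP 404 as absence, relying on the `http_status` attribute carried by `swiftclient.ClientException`; the helper `container_exists` and its `client_error` parameter are hypothetical, the latter being there only to keep the sketch free of third-party imports:

```python
def container_exists(conn, name, client_error=Exception):
    """Return True if the container exists, False on HTTP 404.

    Any other error is re-raised so that it is not mistaken for absence.
    `conn` is expected to offer head_container() like swiftclient.Connection.
    """
    try:
        conn.head_container(name)
        return True
    except client_error as err:
        # Errors without an HTTP response have no usable status and are re-raised
        if getattr(err, "http_status", None) == 404:
            return False
        raise
```

With the connection of this notebook it would be called as `container_exists(conn, container_name, swift.ClientException)`.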


### Upload a new object into the container
Download a sample FITS file and store it on the local disk. We will then upload that file to Swift.

In [149]:
def download_file(url):
    """ Download a file from the argument url and store its contents in a local file
    The name of the local file is built from the last component of the url path.
    The file is downloaded only if there is no local file with the same name. Therefore, if this
    function is called several times with the same url the file will be downloaded only once.
    Returns the file name on disk.
    """
    file_name = url.split('/')[-1]
    if os.path.exists(file_name):
        return file_name
    
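    req = requests.get(url, stream=True)
    req.raise_for_status()  # fail early instead of saving an error page to disk
    with open(file_name, 'wb') as f:
        # Stream the response body in 1 MiB chunks
        for chunk in req.iter_content(chunk_size=1024*1024):
            if chunk:
                f.write(chunk)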
    return file_name
    
# Download a sample FITS file from http://fits.gsfc.nasa.gov/fits_samples.html
url = "http://fits.gsfc.nasa.gov/samples/WFPC2u5780205r_c0fx.fits"
file_name = download_file(url)

Now upload the local FITS file to Swift:

In [150]:
def get_file_size(file_name):
    """Return the size in bytes of a file on disk."""
    return os.stat(file_name).st_size

object_size = get_file_size(file_name)

# This is the object key: in Swift, an object is uniquely identified by the container it resides in and its object key.
# The slash (/) characters in the key have no meaning for Swift.
object_key = "fits/" + file_name

with open(file_name, 'rb') as f:
    # put_object returns the object's etag
    etag = conn.put_object(container=container_name, obj=object_key, contents=f, content_length=object_size)
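
For an object uploaded in a single request like this one, Swift normally reports the MD5 hex digest of the content as the ETag, so the value returned by `put_object` can be checked against the local file. A minimal sketch under that assumption (it does not hold for segmented large objects, and individual deployments may behave differently); the helper names `md5_hex` and `etag_matches` are hypothetical:

```python
import hashlib

def md5_hex(path, block_size=64 * 1024):
    """Return the MD5 hex digest of a file, read block by block."""
    hasher = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            hasher.update(block)
    return hasher.hexdigest()

def etag_matches(etag, path):
    """Compare an ETag with the MD5 digest of a local file."""
    # Some responses wrap the ETag in double quotes; strip them before comparing
    return etag.strip('"').lower() == md5_hex(path)
```

After the upload above, `etag_matches(etag, file_name)` would be expected to return `True` if the object arrived intact.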

### Check for the existence of a particular object
To check the existence of an object within a container the SDK will issue an `HTTP HEAD` request. The Swift service responds with a dictionary of headers. We use the value of the header `content-length` to retrieve the size in bytes of the object.

In [151]:
try:
    headers = conn.head_object(container_name, object_key)
    count = headers['content-length']
    print("Object \'{0}\' does exist and contains {1} bytes".format(object_key, count))
except swift.ClientException as err:
    print("Object \'{0}\' not found: {1}".format(object_key, err))

Object 'fits/WFPC2u5780205r_c0fx.fits' does exist and contains 699840 bytes


### List all the objects of a container


In [152]:
try:
    headers, objects = conn.get_container(container_name)
    
    # Show some information about this container
    num_objects = int(headers['x-container-object-count'])
    print("Container \'{0}\':".format(container_name))
    print("   number of bytes used:", headers['x-container-bytes-used'])
    print("   number of objects:", num_objects)
    
    # Show some details of the objects of this container
    if num_objects > 0:
        print("\nObject details:")
        for o in objects:
            print("   ", o['bytes'], o['name'])
        
except swift.ClientException as err:
    print("Container \'{0}\' not found: {1}".format(container_name, err))

Container 'butlerswift-fabio-20160902152905':
   number of bytes used: 699840
   number of objects: 1

Object details:
    699840 fits/WFPC2u5780205r_c0fx.fits
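
Header values arrive as strings, which is why the cell above converts the object count with `int`. The sketch below applies the same conversion to the byte count and compares it with the sizes found in the listing. `summarize_listing` is a hypothetical helper, and the sample data is copied from the output above. Note that Swift updates container statistics asynchronously, so a mismatch shortly after an upload is not necessarily an error.

```python
def summarize_listing(headers, objects):
    """Compare the byte count reported in the headers with the listed object sizes."""
    reported = int(headers["x-container-bytes-used"])  # header values are strings
    listed = sum(o["bytes"] for o in objects)
    return {
        "objects": len(objects),
        "listed_bytes": listed,
        "reported_bytes": reported,
        "consistent": listed == reported,
    }

# Sample values taken from the output shown in this section
sample_headers = {"x-container-object-count": "1", "x-container-bytes-used": "699840"}
sample_objects = [{"name": "fits/WFPC2u5780205r_c0fx.fits", "bytes": 699840}]
summary = summarize_listing(sample_headers, sample_objects)
print(summary["consistent"])  # True
```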


### Download an object
Here we download the contents of a Swift object to a disk file and compare them against the contents of the original file uploaded above.

In [153]:
def get_md5_digest(file_name):
    """Computes and returns the MD5 digest of a disk file
    """
    hasher = hashlib.md5()
    with open(file_name, 'rb') as f:
        block_size = 64 * 1024
        buffer = f.read(block_size)
        while len(buffer) > 0:
            hasher.update(buffer)
            buffer = f.read(block_size)
            
    return hasher.hexdigest()

In [154]:
# Download a Swift object and compare its contents to the contents of the original file
copy_file_name = 'copy-' + file_name
try:
    headers, contents = conn.get_object(container_name, object_key)
    with open(copy_file_name, 'wb') as f:
        f.write(contents)
        
    copy_md5 = get_md5_digest(copy_file_name)
    original_md5 = get_md5_digest(file_name)
    if copy_md5 != original_md5:
        print("the contents of the uploaded file and the downloaded file do not match")
except swift.ClientException as err:
    print("Could not download object \'{0}/{1}\': {2}".format(container_name, object_key, err))
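
`get_object` as called above loads the whole object into memory before it is written to disk. `swiftclient` also accepts a `resp_chunk_size` argument, in which case the returned body is an iterator over chunks rather than a single byte string. A minimal sketch of consuming such an iterator while computing the MD5 digest in the same pass; `save_chunks` is a hypothetical helper, and the call shown in the usage note has not been run against a live service here:

```python
import hashlib

def save_chunks(chunks, path):
    """Write an iterable of byte chunks to `path`.

    Returns the number of bytes written and the MD5 hex digest of the data.
    """
    hasher = hashlib.md5()
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            hasher.update(chunk)
            total += len(chunk)
    return total, hasher.hexdigest()
```

Assuming the `conn`, `container_name` and `object_key` of this notebook, it could be fed with `headers, body = conn.get_object(container_name, object_key, resp_chunk_size=64 * 1024)` followed by `save_chunks(body, copy_file_name)`, which avoids reading the copy back from disk to hash it.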

### Delete my containers
Delete the containers created by the execution of this notebook, that is, the containers whose names start with the prefix "`butlerswift-user-`", where `user` is the user name.

In [155]:
def delete_container(conn, container):
    try:
        # Delete all the objects in this container
        headers, objects = conn.get_container(container)
        for o in objects:
            conn.delete_object(container, o['name'])
        
        # Delete the container itself
        conn.delete_container(container)
        
    except swift.ClientException as err:
        print("Error deleting container \'{0}\': {1}".format(container, err))

In [156]:
# Delete all my containers, i.e. all containers starting with prefix "butlerswift-user-"
resp_headers, containers = conn.get_account()
for c in containers:
    name = c['name']
    if name.startswith(container_prefix):
        print("Deleting container \'{}\'".format(name))
        delete_container(conn, name)

Deleting container 'butlerswift-fabio-20160902152905'
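
A single `get_container` call returns at most one page of the listing (10,000 entries by default in Swift), and Swift refuses to delete a container that still holds objects. For larger containers, a variant of the `delete_container` helper above could ask `swiftclient` to follow the pagination with `full_listing=True`. This is a sketch, not tested against a live cluster, and `delete_container_fully` is a hypothetical name:

```python
def delete_container_fully(conn, container):
    """Delete every object in `container`, then the container itself.

    `conn` is expected to behave like swiftclient.Connection.
    Returns the number of objects deleted.
    """
    # full_listing=True asks the client to fetch all pages of the listing
    headers, objects = conn.get_container(container, full_listing=True)
    for o in objects:
        conn.delete_object(container, o['name'])

    # The container can be removed only once it is empty
    conn.delete_container(container)
    return len(objects)
```

Error handling is left to the caller here, so the `try`/`except swift.ClientException` pattern used elsewhere in this notebook still applies.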
