# The Agave API

In this notebook we explore the Agave API and use it to store and retrieve files. We will use the requests library to make all requests. For full documentation on the Agave API, see: http://developer.agaveapi.co/

To get started, we need to generate a set of Agave developer client keys (OAuth credentials). Generating OAuth clients uses HTTP Basic Authentication (https://tools.ietf.org/html/rfc2617) with your TACC username and password.

In [4]:
# import the requests library
import requests

# import getpass to prompt for a password
from getpass import getpass

In [5]:
# the base URL for interacting with the Agave API
base_url = 'https://api.tacc.utexas.edu'

In [6]:
# Set up your TACC credentials. Modify the username appropriately
username = 'jstubbs'
password = getpass(prompt='Hello {}. Please enter your TACC password: '.format(username))

Hello jstubbs. Please enter your TACC password: ········


## OAuth Client And Access Token Generation

With the username and password in place, we are ready to interact with Agave's OAuth server to generate an OAuth client and then an access token. Agave has a full OAuth provider server and supports 4 major grant types: password, authorization_code, refresh_token and implicit. For more details on OAuth see the spec (https://tools.ietf.org/html/rfc6749).

### OAuth Client Generation

We can use Agave's clients service to generate and manage our OAuth clients. First let's make a GET request to the clients service to see what clients we have. The request uses HTTP Basic Authentication, which requests provides through its HTTPBasicAuth class; passing a `(username, password)` tuple as the `auth` argument is a convenient shortcut for it:

In [16]:
rsp = requests.get(url='{}/clients/v2'.format(base_url), auth=(username, password))
rsp.status_code

200
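For reference, the `auth=(username, password)` shortcut results in an `Authorization` header in which the two values are joined with a colon and base64-encoded. The sketch below reproduces that encoding with the standard library only, assuming ASCII credentials. `basic_auth_header` is a hypothetical helper name used for illustration; it is not part of requests or Agave.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header that HTTP Basic Authentication sends."""
    # join the credentials with a colon, then base64-encode the bytes
    raw = '{}:{}'.format(username, password).encode('utf-8')
    token = base64.b64encode(raw).decode('ascii')
    return {'Authorization': 'Basic {}'.format(token)}


# example credentials taken from RFC 2617, not a real account
example = basic_auth_header('Aladdin', 'open sesame')
print(example)
```

Base64 is an encoding, not encryption, so the credentials are only protected because the request goes over HTTPS.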

In [17]:
# the clients service, like all Agave services, returns us JSON:
rsp.json()

{'message': 'Clients retrieved successfully.',
 'result': [{'_links': {'self': {'href': 'https://api.tacc.utexas.edu/clients/v2/DefaultApplication'},
    'subscriber': {'href': 'https://api.tacc.utexas.edu/profiles/v2/jstubbs'},
    'subscriptions': {'href': 'https://api.tacc.utexas.edu/clients/v2/DefaultApplication/subscriptions/'}},
   'callbackUrl': None,
   'consumerKey': 'VzN6Cxj9GirfMsINqUVQxbhrQSQa',
   'description': None,
   'name': 'DefaultApplication',
   'tier': 'Unlimited'},
  {'_links': {'self': {'href': 'https://api.tacc.utexas.edu/clients/v2/postman'},
    'subscriber': {'href': 'https://api.tacc.utexas.edu/profiles/v2/jstubbs'},
    'subscriptions': {'href': 'https://api.tacc.utexas.edu/clients/v2/postman/subscriptions/'}},
   'callbackUrl': 'https://www.getpostman.com/oauth2/callback',
   'consumerKey': 'fYEbfnaxo4LbqShSebng4f5LD18a',
   'description': '',
   'name': 'postman',
   'tier': 'Unlimited'},
  {'_links': {'self': {'href': 'https://api.tacc.utexas.edu/client

To create a simple OAuth client that has access to all basic Agave APIs, we need to make a POST request to the clients service. The only required field we need to pass in is `clientName` to give a name to our client. Each client we create must have a unique name.

In [18]:
# Pick a name for your client; this name will have to be different every time you run this cell. Otherwise, you
# will try to recreate a client with the same name and you will get an error.
client_name = 'cic_institute'

# make a POST request to the clients service, passing only that field.
# Note that the parameter name uses camel case
data = {'clientName': client_name}
rsp = requests.post(url='{}/clients/v2'.format(base_url), data=data, auth=(username, password))
rsp.status_code

201

Note that Agave returned a 201 status code to indicate that the resource was successfully created.

Let's explore the response.

In [19]:
rsp.json()

{'message': 'Client created successfully.',
 'result': {'_links': {'self': {'href': 'https://api.tacc.utexas.edu/clients/v2/cic_institute'},
   'subscriber': {'href': 'https://api.tacc.utexas.edu/profiles/v2/jstubbs'},
   'subscriptions': {'href': 'https://api.tacc.utexas.edu/clients/v2/cic_institute/subscriptions/'}},
  'callbackUrl': '',
  'consumerKey': 'f_9vaEvm3oxCnNe9Z4VoxRXZGwca',
  'consumerSecret': 'MTXYtMiwOmxNI2mRLrwOwG_9Vpwa',
  'description': '',
  'name': 'cic_institute',
  'tier': 'Unlimited'},
 'status': 'success',
 'version': '2.0.0-SNAPSHOT-rc3fad'}

Two important string fields are returned, the `consumerKey` and the `consumerSecret`. We will need these fields to interact with the Agave OAuth token service and generate an access token. Let's take note of those variables.

In [21]:
key = rsp.json()['result']['consumerKey']

In [22]:
secret = rsp.json()['result']['consumerSecret']
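The client and profile responses in this notebook share an envelope with `message`, `result`, `status` and `version` fields. A small helper can check `status` before reading `result`. This is a minimal sketch: `unwrap_result` is a hypothetical name, and treating any status other than `'success'` as a failure is an assumption, since only successful responses appear here.

```python
def unwrap_result(body):
    """Return the 'result' field of an Agave-style JSON response body.

    Assumption: a 'status' other than 'success' signals failure. The
    responses shown in this notebook only demonstrate the success case.
    """
    if body.get('status') != 'success':
        raise RuntimeError(body.get('message', 'request failed'))
    return body['result']


# sample body trimmed from the client-creation response above
sample = {'message': 'Client created successfully.',
          'result': {'consumerKey': 'f_9vaEvm3oxCnNe9Z4VoxRXZGwca',
                     'consumerSecret': 'MTXYtMiwOmxNI2mRLrwOwG_9Vpwa',
                     'name': 'cic_institute'},
          'status': 'success',
          'version': '2.0.0-SNAPSHOT-rc3fad'}

result = unwrap_result(sample)
key_from_sample = result['consumerKey']
secret_from_sample = result['consumerSecret']
```

In the notebook this would be called as `unwrap_result(rsp.json())`. The token service response has no `status` field, so the helper does not apply to it.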

### Generating Access and Refresh Tokens

We're now ready to generate an OAuth token. This token will represent both the user and client application. In this case, they are owned by the same individual, but in general they will not be.

To generate an OAuth token, we make a POST request to Agave's token service. We have to pass in several fields to make the request:

In [23]:
# POST payload for generating a token using the password grant:
# scope will always be PRODUCTION.
data = {'username': username,
       'password': password,
       'grant_type': 'password',
       'scope': 'PRODUCTION'}
# note that authentication is technically HTTPBasicAuth with the OAuth client key and secret
rsp = requests.post('{}/token'.format(base_url), data=data, auth=(key, secret))
rsp.status_code

200

In [24]:
# check the response message:
rsp.json()

{'access_token': '6f8f0b94c413c8a7b5ba428bd50d4',
 'expires_in': 14400,
 'refresh_token': 'dafb6ab783b8edd2cfceda40a7f134',
 'scope': 'default',
 'token_type': 'bearer'}

We see that Agave generated an access token and a refresh token. The access token is good for 4 hours (14400 seconds), but we can use the refresh token to get a new access token at any point. To do so, we would use the refresh_token grant type with a modified payload. For now, let's just grab the tokens from the response.

In [26]:
access_token = rsp.json()['access_token']
refresh_token = rsp.json()['refresh_token']
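As a sketch of the refresh step mentioned above, the payload below follows the field names defined for the refresh_token grant in the OAuth spec (RFC 6749, section 6). `build_refresh_payload` is a hypothetical helper, and reusing the `PRODUCTION` scope is an assumption that mirrors the password grant request rather than something confirmed here.

```python
def build_refresh_payload(refresh_token, scope='PRODUCTION'):
    """POST payload for the refresh_token grant (RFC 6749, section 6)."""
    return {'grant_type': 'refresh_token',
            'refresh_token': refresh_token,
            # assumption: same scope value as the password grant above
            'scope': scope}


# refresh token value copied from the sample response above
payload = build_refresh_payload('dafb6ab783b8edd2cfceda40a7f134')
print(payload)

# Sending it would mirror the password-grant request (not executed here):
# rsp = requests.post('{}/token'.format(base_url), data=payload, auth=(key, secret))
```

As with the password grant, the request would authenticate with the OAuth client key and secret rather than the user's password.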

Once we have an access token we are ready to interact with the rest of the Agave services. All requests to Agave using this access token will be done on behalf of the user whose credentials were used to retrieve the token (as well as the OAuth client that was used).

In order to make a request to Agave using the access token, we need to pass the token in the Authorization header of the request. The value of the header must be formatted like so: `"Bearer <access_token>"`

As a simple check, we'll use the Agave Profiles service to pull the "profile" associated with this token. The Agave Profiles service maintains some details about registered users.

In [27]:
# build the Authorization header in a headers dictionary
headers = {'Authorization': 'Bearer {}'.format(access_token)}

# make a request to the profiles service; "me" is a special reserved word in Agave that indicates
# we want information about the user associated with the token.
rsp = requests.get(url='{}/profiles/v2/me'.format(base_url), headers=headers)
rsp.status_code

200

In [28]:
rsp.json()

{'message': 'User details retrieved successfully.',
 'result': {'create_time': '20140515180254Z',
  'email': 'jstubbs@tacc.utexas.edu',
  'first_name': 'Joe',
  'full_name': 'jstubbs',
  'last_name': 'Stubbs',
  'mobile_phone': '',
  'phone': '',
  'status': '',
  'username': 'jstubbs'},
 'status': 'success',
 'version': '2.0.0-SNAPSHOT-rc3fad'}

Indeed, the profile belongs to me. We are now ready to interact with Agave's cloud storage.

## Working with Agave Cloud Storage

One of the strengths of the Agave API is its ability to interact with remote storage systems. Agave allows users to register and share (virtual) storage systems with other users, and interact with the associated files and folders on such systems.

In preparation for this class, we set up a file server in JetStream and used Agave's systems API to register it and share it with each of you. We will not go into all the details of Agave's Systems service, but we will point out that it has a fine-grained permissions model which allows us to give each of you access to a different part of the file system. For more information on Agave's systems service, see: http://developer.agaveapi.co/#systems

Today, we will focus on Agave's files service, which allows us to interact with the files and folders that we have access to on the Agave system. There are three activities we will explore:
  * Listing files and folders
  * Uploading and downloading files
  * Creating directories


### Working with the Files Service

All actions through the files service are in reference to a specific Agave storage system. One specifies the system one wishes to interact with by providing the system's id. We will be exclusively using the storage system we set up for this class, and its id is provided below.

# first, set the system id we will be working with. The id below is the id of the storage system we registered
# for the class
system_id = 'cic.storage'

Beyond the system id, in order to list files we also need to provide a path on the system to list. Paths are relative to the storage system's home directory.

In [33]:
# let's start by using the files service to list the files in our home directory, which is given by our username.
rsp = requests.get('{}/files/v2/listings/system/{}/{}'.format(base_url, system_id, username), headers=headers)
rsp.status_code

200

In [36]:
rsp.json()['result']

[{'_links': {'history': {'href': 'https://api.tacc.utexas.edu/files/v2/history/system/cic.storage//home/jstubbs'},
   'metadata': {'href': 'https://api.tacc.utexas.edu/meta/v2/data?q=%7B%22associationIds%22%3A%227129190038640988647-242ac113-0001-002%22%7D'},
   'self': {'href': 'https://api.tacc.utexas.edu/files/v2/media/system/cic.storage//home/jstubbs'},
   'system': {'href': 'https://api.tacc.utexas.edu/systems/v2/cic.storage'}},
  'format': 'folder',
  'lastModified': '2017-07-20T13:28:30.000-05:00',
  'length': 17,
  'mimeType': 'text/directory',
  'name': '.',
  'path': '/home/jstubbs',
  'permissions': 'ALL',
  'system': 'cic.storage',
  'type': 'dir'},
 {'_links': {'self': {'href': 'https://api.tacc.utexas.edu/files/v2/media/system/cic.storage//home/jstubbs/foo'},
   'system': {'href': 'https://api.tacc.utexas.edu/systems/v2/cic.storage'}},
  'format': 'raw',
  'lastModified': '2017-07-20T13:28:30.000-05:00',
  'length': 5,
  'mimeType': 'application/octet-stream',
  'name': 'f

In [39]:
# let's create a directory called 'test' inside our home directory. To do this, we make a PUT request 
# to the files service and we pass a specific payload
# note as well that we use the 'media' endpoint instead of the listings endpoint.
data = {'action': 'mkdir', 'path': 'test'}
rsp = requests.put(url='{}/files/v2/media/system/{}/{}'.format(base_url, system_id, username), data=data, headers=headers)
rsp.status_code

201

In [40]:
# check the response
rsp.json()['result']

{'_links': {'history': {'href': 'https://api.tacc.utexas.edu/files/v2/history/system/cic.storage//home/jstubbs/test'},
  'profile': {'href': 'https://api.tacc.utexas.edu/profiles/v2/jstubbs'},
  'self': {'href': 'https://api.tacc.utexas.edu/files/v2/media/system/cic.storage//home/jstubbs/test'},
  'system': {'href': 'https://api.tacc.utexas.edu/systems/v2/cic.storage'}},
 'internalUsername': None,
 'lastModified': '2017-07-20T18:28:02.510-05:00',
 'name': 'test',
 'nativeFormat': 'dir',
 'owner': 'jstubbs',
 'path': 'jstubbs/test',
 'source': None,
 'status': 'TRANSFORMING_COMPLETED',
 'systemId': 'cic.storage',
 'uuid': '1371613682016129511-242ac113-0001-002'}

In [41]:
# now, let's list our home directory again and check that the directory is there
rsp = requests.get('{}/files/v2/listings/system/{}/{}'.format(base_url, system_id, username), headers=headers)
rsp.status_code

200

In [42]:
rsp.json()['result']

[{'_links': {'history': {'href': 'https://api.tacc.utexas.edu/files/v2/history/system/cic.storage//home/jstubbs'},
   'metadata': {'href': 'https://api.tacc.utexas.edu/meta/v2/data?q=%7B%22associationIds%22%3A%227129190038640988647-242ac113-0001-002%22%7D'},
   'self': {'href': 'https://api.tacc.utexas.edu/files/v2/media/system/cic.storage//home/jstubbs'},
   'system': {'href': 'https://api.tacc.utexas.edu/systems/v2/cic.storage'}},
  'format': 'folder',
  'lastModified': '2017-07-20T18:28:02.000-05:00',
  'length': 29,
  'mimeType': 'text/directory',
  'name': '.',
  'path': '/home/jstubbs',
  'permissions': 'ALL',
  'system': 'cic.storage',
  'type': 'dir'},
 {'_links': {'self': {'href': 'https://api.tacc.utexas.edu/files/v2/media/system/cic.storage//home/jstubbs/foo'},
   'system': {'href': 'https://api.tacc.utexas.edu/systems/v2/cic.storage'}},
  'format': 'raw',
  'lastModified': '2017-07-20T13:28:30.000-05:00',
  'length': 5,
  'mimeType': 'application/octet-stream',
  'name': 'f

In [43]:
# we can also list its contents directly by appending it to the path:
rsp = requests.get('{}/files/v2/listings/system/{}/{}/test'.format(base_url, system_id, username), headers=headers)
rsp.json()['result']

[{'_links': {'history': {'href': 'https://api.tacc.utexas.edu/files/v2/history/system/cic.storage//home/jstubbs/test'},
   'metadata': {'href': 'https://api.tacc.utexas.edu/meta/v2/data?q=%7B%22associationIds%22%3A%221371613682016129511-242ac113-0001-002%22%7D'},
   'self': {'href': 'https://api.tacc.utexas.edu/files/v2/media/system/cic.storage//home/jstubbs/test'},
   'system': {'href': 'https://api.tacc.utexas.edu/systems/v2/cic.storage'}},
  'format': 'folder',
  'lastModified': '2017-07-20T18:28:02.000-05:00',
  'length': 6,
  'mimeType': 'text/directory',
  'name': '.',
  'path': '/home/jstubbs/test',
  'permissions': 'ALL',
  'system': 'cic.storage',
  'type': 'dir'}]
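The raw listings are verbose, so it can help to reduce each entry to a few fields. The sketch below keeps only `name`, `type` and `length`, which appear in every listing entry shown above. `summarize_listing` is a hypothetical helper name.

```python
def summarize_listing(entries):
    """Reduce a files listing to (name, type, length) tuples."""
    return [(e['name'], e['type'], e['length']) for e in entries]


# entry trimmed from the listing of the test directory above
sample_entries = [{'format': 'folder',
                   'length': 6,
                   'name': '.',
                   'path': '/home/jstubbs/test',
                   'type': 'dir'}]

for name, kind, length in summarize_listing(sample_entries):
    print('{:<10} {:<5} {:>8}'.format(name, kind, length))
```

In the notebook it would be called as `summarize_listing(rsp.json()['result'])`.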

Finally, let's upload and download a file into our new directory. These actions are also done with the media endpoint.

In [54]:
# first, let's upload a file called foo.txt to our test directory. We'll create the file locally first
with open("foo.txt", "w") as f:
    f.write("This is a test.")

In [55]:
# check that our file is there:
! ls -l

total 156
-rw-r--r-- 1 root root 25703 Jul 20 19:36 Agave API.ipynb
-rw-r--r-- 1 root root 45279 Jul 20 11:38 Analysis with Dask Distributed.ipynb
-rw-r--r-- 1 root root 11075 Jul 19 19:45 dask.ipynb
-rw-r--r-- 1 root root 51034 Jul 19 19:48 dask_small.ipynb
-rw-r--r-- 1 root root    15 Jul 20 19:36 foo.txt
-rw-r--r-- 1 root root 10429 Jul 20 16:46 REST APIs - the github API.ipynb


In [56]:
# now let's upload the file to the test directory:
rsp = requests.post('{}/files/v2/media/system/{}/{}/test'.format(base_url, system_id, username), 
                    files={'fileToUpload': open('foo.txt', 'rb')}, 
                    headers=headers)
rsp.json()['result']

{'_links': {'history': {'href': 'https://api.tacc.utexas.edu/files/v2/history/system/cic.storage//home/jstubbs/test/foo.txt'},
  'notification': [],
  'profile': {'href': 'https://api.tacc.utexas.edu/profiles/v2/jstubbs'},
  'self': {'href': 'https://api.tacc.utexas.edu/files/v2/media/system/cic.storage//home/jstubbs/test/foo.txt'},
  'system': {'href': 'https://api.tacc.utexas.edu/systems/v2/cic.storage'}},
 'internalUsername': None,
 'lastModified': '2017-07-20T18:38:08.353-05:00',
 'name': 'foo.txt',
 'nativeFormat': 'raw',
 'owner': 'jstubbs',
 'path': 'jstubbs/test/foo.txt',
 'source': 'http://129.114.97.130/foo.txt',
 'status': 'STAGING_QUEUED',
 'systemId': 'cic.storage',
 'uuid': '8947689907714003431-242ac113-0001-002'}

Note that the upload was QUEUED; in other words, our file won't be there instantly. Agave collects the data in the file and queues the transfer to the remote system. Usually, this transfer happens pretty quickly, but on days when Agave is doing a large number of transfers, it can sometimes take a while.
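Because the transfer is queued, a script that needs the file may have to poll the listing until the file appears. The sketch below shows one way to do that. `wait_for_file` is a hypothetical helper, the listing call is passed in as a callable so the example runs without network access, and the default attempt count and delay are arbitrary choices.

```python
import time


def wait_for_file(list_dir, name, attempts=10, delay=2.0):
    """Poll a directory listing until an entry called `name` shows up.

    `list_dir` is any zero-argument callable returning the listing's
    'result' list. Returns True if the file appeared, False otherwise.
    """
    for attempt in range(attempts):
        if any(entry.get('name') == name for entry in list_dir()):
            return True
        # sleep between attempts, but not after the last one
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# stand-in for a real listing call: the file shows up on the third poll
_fake_listings = iter([[{'name': '.'}],
                       [{'name': '.'}],
                       [{'name': '.'}, {'name': 'foo.txt'}]])
found = wait_for_file(lambda: next(_fake_listings), 'foo.txt', attempts=5, delay=0)
print(found)
```

With a live token, `list_dir` could be a lambda that wraps the listings request used in this notebook and returns `rsp.json()['result']`.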

Let's check to see if our file is there.

In [57]:
# list the contents of the test directory again to check for the uploaded file:
rsp = requests.get('{}/files/v2/listings/system/{}/{}/test'.format(base_url, system_id, username), headers=headers)
rsp.json()['result']

[{'_links': {'history': {'href': 'https://api.tacc.utexas.edu/files/v2/history/system/cic.storage//home/jstubbs/test'},
   'metadata': {'href': 'https://api.tacc.utexas.edu/meta/v2/data?q=%7B%22associationIds%22%3A%221371613682016129511-242ac113-0001-002%22%7D'},
   'self': {'href': 'https://api.tacc.utexas.edu/files/v2/media/system/cic.storage//home/jstubbs/test'},
   'system': {'href': 'https://api.tacc.utexas.edu/systems/v2/cic.storage'}},
  'format': 'folder',
  'lastModified': '2017-07-20T18:38:08.000-05:00',
  'length': 21,
  'mimeType': 'text/directory',
  'name': '.',
  'path': '/home/jstubbs/test',
  'permissions': 'ALL',
  'system': 'cic.storage',
  'type': 'dir'},
 {'_links': {'self': {'href': 'https://api.tacc.utexas.edu/files/v2/media/system/cic.storage//home/jstubbs/test/foo.txt'},
   'system': {'href': 'https://api.tacc.utexas.edu/systems/v2/cic.storage'}},
  'format': 'raw',
  'lastModified': '2017-07-20T18:38:09.000-05:00',
  'length': 15,
  'mimeType': 'text/plain',
 

In [59]:
# finally, let's download our file again into a new directory called temp. We'll make that directory first:
! mkdir temp

In [60]:
! ls -l 

total 164
-rw-r--r-- 1 root root 29180 Jul 20 19:40 Agave API.ipynb
-rw-r--r-- 1 root root 45279 Jul 20 11:38 Analysis with Dask Distributed.ipynb
-rw-r--r-- 1 root root 11075 Jul 19 19:45 dask.ipynb
-rw-r--r-- 1 root root 51034 Jul 19 19:48 dask_small.ipynb
-rw-r--r-- 1 root root    15 Jul 20 19:36 foo.txt
-rw-r--r-- 1 root root 10429 Jul 20 16:46 REST APIs - the github API.ipynb
drwxr-xr-x 2 root root  4096 Jul 20 19:41 temp


In [61]:
# use a GET request to the media endpoint to download the file
# the file comes to us in raw bytes, so we are responsible for writing it to disk.
with open('temp/foo.txt', 'wb') as f:
    rsp = requests.get('{}/files/v2/media/system/{}/{}/test/foo.txt'.format(base_url, system_id, username), headers=headers)
    for block in rsp.iter_content(1024):
        if not block:
            break
        f.write(block)    


In [62]:
! ls -l temp/

total 4
-rw-r--r-- 1 root root 15 Jul 20 19:52 foo.txt


In [63]:
! cat temp/foo.txt

This is a test.
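Printing the file works for a short text file. For larger or binary files, one option is to compare digests of the original and the downloaded copy. The sketch below uses SHA-256 from the standard library. `file_sha256` and `same_contents` are hypothetical helper names, and the two throwaway files stand in for `foo.txt` and `temp/foo.txt`.

```python
import hashlib
import os
import tempfile


def file_sha256(path, chunk_size=1024):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        # read until f.read() returns an empty bytes object
        for block in iter(lambda: f.read(chunk_size), b''):
            digest.update(block)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True when both files hash to the same digest."""
    return file_sha256(path_a) == file_sha256(path_b)


# demonstration with two throwaway files holding the same text
workdir = tempfile.mkdtemp()
original = os.path.join(workdir, 'original.txt')
downloaded = os.path.join(workdir, 'downloaded.txt')
for path in (original, downloaded):
    with open(path, 'w') as f:
        f.write('This is a test.')

match = same_contents(original, downloaded)
print(match)
```

In the notebook the check would be `same_contents('foo.txt', 'temp/foo.txt')`.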