Interacting with histories in the Galaxy API
============================================

We are going to use the [requests](http://python-requests.org/) Python library to communicate via HTTP with the Galaxy server. To start, let's define the connection parameters.

**You need to insert the API key for your Galaxy server in the cell below**: open the Galaxy server in another browser tab, click on "User" on the top menu, then "API Keys". Generate an API key if needed, then copy the alphanumeric string and paste it below.

In [43]:
from __future__ import print_function
import json
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2
import requests

server = 'https://usegalaxy.org/'
api_key = ''
base_url = urljoin(server, 'api')
base_url

'https://usegalaxy.org/api'

We now make a GET request to retrieve all histories owned by a user:

In [44]:
params = {'key': api_key}
r = requests.get(base_url + '/histories', params)
print(r.text)
hists = r.json()
hists

[{"name": "New history", "tags": [], "deleted": false, "purged": false, "annotation": null, "url": "/api/histories/effec70bec8ba12c", "published": false, "model_class": "History", "id": "effec70bec8ba12c"}, {"name": "Unnamed history", "tags": [], "deleted": false, "purged": false, "annotation": null, "url": "/api/histories/49e446c3d6585583", "published": false, "model_class": "History", "id": "49e446c3d6585583"}]


[{u'annotation': None,
  u'deleted': False,
  u'id': u'effec70bec8ba12c',
  u'model_class': u'History',
  u'name': u'New history',
  u'published': False,
  u'purged': False,
  u'tags': [],
  u'url': u'/api/histories/effec70bec8ba12c'},
 {u'annotation': None,
  u'deleted': False,
  u'id': u'49e446c3d6585583',
  u'model_class': u'History',
  u'name': u'Unnamed history',
  u'published': False,
  u'purged': False,
  u'tags': [],
  u'url': u'/api/histories/49e446c3d6585583'}]

As you can see, GET requests in the Galaxy API return JSON strings, which need to be **deserialized** into Python data structures. In particular, GETting a resource collection returns a list of dictionaries.
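
To make the deserialization step concrete, here is a minimal sketch that uses only the standard `json` module. The `raw` string is a shortened copy of the response text printed above, and `hists_from_text` is a name introduced just for this illustration. Calling `r.json()` on a response does roughly the same thing as `json.loads()` on its body.

```python
import json

# Shortened copy of the JSON text returned by the /histories request above.
raw = ('[{"name": "New history", "deleted": false, '
       '"annotation": null, "id": "effec70bec8ba12c"}]')

hists_from_text = json.loads(raw)

# JSON arrays become lists and JSON objects become dictionaries;
# false becomes False and null becomes None.
print(type(hists_from_text).__name__)    # list
print(hists_from_text[0]['deleted'])     # False
print(hists_from_text[0]['annotation'])  # None
```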

There is no readily-available filtering capability, but it's not difficult to filter histories **by name**:

In [45]:
[_ for _ in hists if _['name'] == 'Unnamed history']

[{u'annotation': None,
  u'deleted': False,
  u'id': u'49e446c3d6585583',
  u'model_class': u'History',
  u'name': u'Unnamed history',
  u'published': False,
  u'purged': False,
  u'tags': [],
  u'url': u'/api/histories/49e446c3d6585583'}]
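
The same filter can be wrapped in a small function so that it can be reused for other resource lists. This is only a sketch: `filter_by_name` and `sample_hists` are names made up for this example, and the sample data is a trimmed copy of the listing above. Names are not guaranteed to be unique, so the function returns a list rather than a single dictionary.

```python
def filter_by_name(resources, name):
    """Return the dictionaries in `resources` whose 'name' equals `name`."""
    return [res for res in resources if res.get('name') == name]


# Trimmed copy of the history listing shown above.
sample_hists = [
    {'id': 'effec70bec8ba12c', 'name': 'New history'},
    {'id': '49e446c3d6585583', 'name': 'Unnamed history'},
]

matches = filter_by_name(sample_hists, 'Unnamed history')
print([h['id'] for h in matches])  # ['49e446c3d6585583']
```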

The dictionaries returned when GETting a resource collection give basic info about each resource, e.g. for a history you have:
- `id`: the unique **identifier** of the history, needed for all specific requests about this resource
- `name`: the name of this history as given by the user
- `deleted`: whether the history has been deleted
- `url`: the relative URL to get all info about this resource.
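
Since `url` is relative, it has to be combined with the server address before it can be requested. A minimal sketch, assuming the same `server` value as in the first cell (`relative_url` and `full_url` are names introduced here):

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2

server = 'https://usegalaxy.org/'
# Value of the 'url' entry of the first history in the listing above.
relative_url = '/api/histories/effec70bec8ba12c'

full_url = urljoin(server, relative_url)
print(full_url)  # https://usegalaxy.org/api/histories/effec70bec8ba12c
```

Note that a relative URL starting with `/` is resolved against the root of the host, so this sketch assumes that Galaxy is served from the top level of `server` and not from a sub-path.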

If you are interested in more **details** about a given resource, you just need to append its `id` to the previous collection request, e.g. to get more info about a history:

In [46]:
hist0_id = hists[0]['id']
print(hist0_id)
params = {'key': api_key}
r = requests.get(base_url + '/histories/' + hist0_id, params)
r.json()

effec70bec8ba12c


{u'annotation': None,
 u'contents_url': u'/api/histories/effec70bec8ba12c/contents',
 u'create_time': u'2015-07-02T11:04:17.100787',
 u'deleted': False,
 u'empty': False,
 u'genome_build': None,
 u'id': u'effec70bec8ba12c',
 u'importable': False,
 u'model_class': u'History',
 u'name': u'New history',
 u'published': False,
 u'purged': False,
 u'size': 48,
 u'slug': None,
 u'state': u'ok',
 u'state_details': {u'discarded': 0,
  u'empty': 0,
  u'error': 0,
  u'failed_metadata': 0,
  u'new': 0,
  u'ok': 3,
  u'paused': 0,
  u'queued': 0,
  u'running': 0,
  u'setting_metadata': 0,
  u'upload': 0},
 u'state_ids': {u'discarded': [],
  u'empty': [],
  u'error': [],
  u'failed_metadata': [],
  u'new': [],
  u'ok': [u'bbd44e69cb8906b550f62a0227a2da04',
   u'bbd44e69cb8906b544479115d78d7a93',
   u'bbd44e69cb8906b50e45a3912d4c471e'],
  u'paused': [],
  u'queued': [],
  u'running': [],
  u'setting_metadata': [],
  u'upload': []},
 u'tags': [],
 u'update_time': u'2015-07-02T15:35:38.834815',
 u'url'

As you can see, there are many more entries in the returned dictionary, e.g.:
- `create_time`
- `size`: disk space used by the history datasets
- `state_ids`: ids of datasets in each possible state.
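
The `state_ids` entry can be used, for example, to find which datasets are in which state. The following is a sketch only: `nonempty_states` is a hypothetical helper and `sample_history` is a trimmed copy of the dictionary shown above, reduced to four states.

```python
def nonempty_states(history):
    """Map each state that has at least one dataset to its list of dataset ids."""
    return {state: ids
            for state, ids in history['state_ids'].items()
            if ids}


# Trimmed copy of the history details shown above.
sample_history = {
    'state_details': {'error': 0, 'ok': 3, 'queued': 0, 'running': 0},
    'state_ids': {
        'error': [],
        'ok': ['bbd44e69cb8906b550f62a0227a2da04',
               'bbd44e69cb8906b544479115d78d7a93',
               'bbd44e69cb8906b50e45a3912d4c471e'],
        'queued': [],
        'running': [],
    },
}

by_state = nonempty_states(sample_history)
print(sorted(by_state))     # ['ok']
print(len(by_state['ok']))  # 3
```

In this sample the length of each id list equals the corresponding count in `state_details`.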

To get the list of **datasets contained** in a history, simply append `/contents` to the previous resource request.

In [47]:
params = {'key': api_key}
r = requests.get(base_url + '/histories/' + hist0_id + '/contents', params)
hdas = r.json()
hdas

[{u'dataset_id': u'bbd44e69cb8906b5ab999dca04b8ece0',
  u'deleted': False,
  u'extension': u'txt',
  u'hid': 1,
  u'history_content_type': u'dataset',
  u'history_id': u'effec70bec8ba12c',
  u'id': u'bbd44e69cb8906b550f62a0227a2da04',
  u'name': u'1.txt',
  u'purged': False,
  u'state': u'ok',
  u'type': u'file',
  u'type_id': u'dataset-bbd44e69cb8906b550f62a0227a2da04',
  u'url': u'/api/histories/effec70bec8ba12c/contents/bbd44e69cb8906b550f62a0227a2da04',
  u'visible': True},
 {u'dataset_id': u'bbd44e69cb8906b5333417f7e7c6ca91',
  u'deleted': False,
  u'extension': u'txt',
  u'hid': 2,
  u'history_content_type': u'dataset',
  u'history_id': u'effec70bec8ba12c',
  u'id': u'bbd44e69cb8906b544479115d78d7a93',
  u'name': u'1.txt',
  u'purged': False,
  u'state': u'ok',
  u'type': u'file',
  u'type_id': u'dataset-bbd44e69cb8906b544479115d78d7a93',
  u'url': u'/api/histories/effec70bec8ba12c/contents/bbd44e69cb8906b544479115d78d7a93',
  u'visible': True},
 {u'dataset_id': u'bbd44e69cb8906b

The dictionaries returned when GETting the history content give basic info about each dataset, e.g.: `id`, `name`, `deleted`, `state`, `url`...
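
These basic fields are enough to select the datasets worth looking at in more detail. A minimal sketch, assuming that "usable" means visible, not deleted and in the `ok` state. The helper `usable_datasets` is hypothetical, and the second sample entry has its `deleted` flag altered for illustration.

```python
def usable_datasets(contents):
    """Keep entries that are visible, not deleted and in the 'ok' state."""
    return [c for c in contents
            if c.get('visible')
            and not c.get('deleted')
            and c.get('state') == 'ok']


sample_contents = [
    # Trimmed copy of the first entry in the listing above.
    {'id': 'bbd44e69cb8906b550f62a0227a2da04', 'name': '1.txt',
     'deleted': False, 'visible': True, 'state': 'ok'},
    # Illustrative entry: 'deleted' has been changed to True for this example.
    {'id': 'bbd44e69cb8906b544479115d78d7a93', 'name': '1.txt',
     'deleted': True, 'visible': True, 'state': 'ok'},
]

kept = usable_datasets(sample_contents)
print([c['id'] for c in kept])  # ['bbd44e69cb8906b550f62a0227a2da04']
```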

To get the details about one dataset, you can either append the dataset `id` to the previous request URL:

In [48]:
hda0_id = hdas[0]['id']
print(hda0_id)
params = {'key': api_key}
r = requests.get(base_url + '/histories/' + hist0_id + '/contents/' + hda0_id, params)
r.json()

bbd44e69cb8906b550f62a0227a2da04


{u'accessible': True,
 u'annotation': None,
 u'api_type': u'file',
 u'create_time': u'2015-07-02T11:08:56.653404',
 u'creating_job': u'faa39c69e6841f30',
 u'data_type': u'galaxy.datatypes.data.Text',
 u'dataset_id': u'bbd44e69cb8906b5ab999dca04b8ece0',
 u'deleted': False,
 u'display_apps': [],
 u'display_types': [],
 u'download_url': u'/api/histories/effec70bec8ba12c/contents/bbd44e69cb8906b550f62a0227a2da04/display',
 u'extension': u'txt',
 u'file_ext': u'txt',
 u'file_size': 16,
 u'genome_build': u'?',
 u'hda_ldda': u'hda',
 u'hid': 1,
 u'history_content_type': u'dataset',
 u'history_id': u'effec70bec8ba12c',
 u'id': u'bbd44e69cb8906b550f62a0227a2da04',
 u'meta_files': [],
 u'metadata_data_lines': 4,
 u'metadata_dbkey': u'?',
 u'misc_blurb': u'4 lines',
 u'misc_info': u'uploaded txt file',
 u'model_class': u'HistoryDatasetAssociation',
 u'name': u'1.txt',
 u'peek': u'<table cellspacing="0" cellpadding="3"><tr><td>1 a</td></tr><tr><td>2 b</td></tr><tr><td>3 c</td></tr><tr><td>4 d</td>

Or directly use the `datasets` controller, without having to specify the history id:

In [49]:
params = {'key': api_key}
r = requests.get(base_url + '/datasets/' + hda0_id, params)
r.json()

{u'accessible': True,
 u'annotation': None,
 u'api_type': u'file',
 u'create_time': u'2015-07-02T11:08:56.653404',
 u'creating_job': u'faa39c69e6841f30',
 u'data_type': u'galaxy.datatypes.data.Text',
 u'dataset_id': u'bbd44e69cb8906b5ab999dca04b8ece0',
 u'deleted': False,
 u'display_apps': [],
 u'display_types': [],
 u'download_url': u'/api/histories/effec70bec8ba12c/contents/bbd44e69cb8906b550f62a0227a2da04/display',
 u'extension': u'txt',
 u'file_ext': u'txt',
 u'file_size': 16,
 u'genome_build': u'?',
 u'hda_ldda': u'hda',
 u'hid': 1,
 u'history_content_type': u'dataset',
 u'history_id': u'effec70bec8ba12c',
 u'id': u'bbd44e69cb8906b550f62a0227a2da04',
 u'meta_files': [],
 u'metadata_data_lines': 4,
 u'metadata_dbkey': u'?',
 u'misc_blurb': u'4 lines',
 u'misc_info': u'uploaded txt file',
 u'model_class': u'HistoryDatasetAssociation',
 u'name': u'1.txt',
 u'peek': u'<table cellspacing="0" cellpadding="3"><tr><td>1 a</td></tr><tr><td>2 b</td></tr><tr><td>3 c</td></tr><tr><td>4 d</td>

Some of the interesting additional dictionary entries are:
- `create_time`
- `creating_job`: id of the job which created this dataset
- `download_url`: URL to download the dataset
- `file_ext`: the Galaxy data type of this dataset
- `file_size`
- `genome_build`: the dbkey.

**New resources** are created with POST requests. The uploaded **data needs to be serialized** into a JSON string. For example, to create a new history:

In [50]:
params = {'key': api_key}
data = {'name': 'New history'}
r = requests.post(base_url + '/histories', json.dumps(data), params=params, headers={'Content-Type': 'application/json'})
new_hist = r.json()
new_hist

{u'annotation': None,
 u'contents_url': u'/api/histories/09e9b859888fc439/contents',
 u'create_time': u'2015-07-03T17:21:19.676537',
 u'deleted': False,
 u'empty': True,
 u'genome_build': None,
 u'id': u'09e9b859888fc439',
 u'importable': False,
 u'model_class': u'History',
 u'name': u'New history',
 u'published': False,
 u'purged': False,
 u'size': 0,
 u'slug': None,
 u'state': u'new',
 u'state_details': {u'discarded': 0,
  u'empty': 0,
  u'error': 0,
  u'failed_metadata': 0,
  u'new': 0,
  u'ok': 0,
  u'paused': 0,
  u'queued': 0,
  u'running': 0,
  u'setting_metadata': 0,
  u'upload': 0},
 u'state_ids': {u'discarded': [],
  u'empty': [],
  u'error': [],
  u'failed_metadata': [],
  u'new': [],
  u'ok': [],
  u'paused': [],
  u'queued': [],
  u'running': [],
  u'setting_metadata': [],
  u'upload': []},
 u'tags': [],
 u'update_time': u'2015-07-03T17:21:19.676561',
 u'url': u'/api/histories/09e9b859888fc439',
 u'user_id': u'1c510fef372551ec',
 u'username_and_slug': None}

The return value of a POST request is a dictionary with detailed info about the created resource.

To **update** a resource, make a PUT request, e.g. to change the history name:

In [51]:
params = {'key': api_key}
data = {'name': 'Updated history'}
r = requests.put(base_url + '/histories/' + new_hist['id'], json.dumps(data), params=params, headers={'Content-Type': 'application/json'})
print(r.status_code)
r.json()

200


{u'annotation': None,
 u'contents_url': u'/api/histories/09e9b859888fc439/contents',
 u'create_time': u'2015-07-03T17:21:19.676537',
 u'deleted': False,
 u'empty': True,
 u'genome_build': None,
 u'id': u'09e9b859888fc439',
 u'importable': False,
 u'model_class': u'History',
 u'name': u'Updated history',
 u'published': False,
 u'purged': False,
 u'size': 0,
 u'slug': None,
 u'state': u'new',
 u'state_details': {u'discarded': 0,
  u'empty': 0,
  u'error': 0,
  u'failed_metadata': 0,
  u'new': 0,
  u'ok': 0,
  u'paused': 0,
  u'queued': 0,
  u'running': 0,
  u'setting_metadata': 0,
  u'upload': 0},
 u'state_ids': {u'discarded': [],
  u'empty': [],
  u'error': [],
  u'failed_metadata': [],
  u'new': [],
  u'ok': [],
  u'paused': [],
  u'queued': [],
  u'running': [],
  u'setting_metadata': [],
  u'upload': []},
 u'tags': [],
 u'update_time': u'2015-07-03T17:21:20.631406',
 u'url': u'/api/histories/09e9b859888fc439',
 u'user_id': u'1c510fef372551ec',
 u'username_and_slug': None}

The return value of a PUT request is usually a dictionary with detailed info about the updated resource.

Finally, to **delete** a resource, make a DELETE request, e.g.:

In [52]:
params = {'key': api_key}
r = requests.delete(base_url + '/histories/' + new_hist['id'], params=params)
print(r.status_code)

200
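
The four kinds of request used in this notebook (GET, POST, PUT, DELETE) follow the same pattern, so they can be grouped in a small helper. The class below is a hypothetical sketch, not part of Galaxy or of `requests`: `HistoryClient` and its method names are made up for this example. It expects a `session` object with a `requests`-style `request()` method, such as `requests.Session()`, and it calls `raise_for_status()` so that a failed request raises an exception instead of passing unnoticed.

```python
import json


class HistoryClient(object):
    """Hypothetical helper grouping the history requests shown above."""

    def __init__(self, base_url, api_key, session):
        # `session` is any object with a requests-style request() method,
        # for example requests.Session().
        self.histories_url = base_url.rstrip('/') + '/histories'
        self.params = {'key': api_key}
        self.session = session

    def _request(self, method, url, data=None):
        kwargs = {'params': self.params}
        if data is not None:
            # POST and PUT bodies are sent as JSON strings.
            kwargs['data'] = json.dumps(data)
            kwargs['headers'] = {'Content-Type': 'application/json'}
        r = self.session.request(method, url, **kwargs)
        r.raise_for_status()  # raise an exception on 4xx/5xx responses
        return r

    def list_histories(self):
        return self._request('GET', self.histories_url).json()

    def show_history(self, history_id):
        url = self.histories_url + '/' + history_id
        return self._request('GET', url).json()

    def create_history(self, name):
        return self._request('POST', self.histories_url, {'name': name}).json()

    def rename_history(self, history_id, name):
        url = self.histories_url + '/' + history_id
        return self._request('PUT', url, {'name': name}).json()

    def delete_history(self, history_id):
        url = self.histories_url + '/' + history_id
        # Only the status code is returned, as in the DELETE cell above.
        return self._request('DELETE', url).status_code
```

With `requests` imported as in the first cell, it could be used as `client = HistoryClient(base_url, api_key, requests.Session())` followed by, for instance, `client.list_histories()`.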
