Interacting with histories in the Galaxy API
============================================

We are going to use the [requests](http://python-requests.org/) Python library to communicate via HTTP with the Galaxy server. To start, let's define the connection parameters.

**You need to insert the API key for your Galaxy server in the cell below**: 
1. Open the Galaxy server in another browser tab
2. Click on "User" on the top menu, then "Preferences"
3. Click on "Manage API key"
4. Generate an API key if needed, then copy the alphanumeric string and paste it as the value of the `api_key` variable below.

In [43]:
from pprint import pprint

import json

import requests
from urllib.parse import urljoin

server = 'https://usegalaxy.eu/'
api_key = ''
base_url = urljoin(server, 'api')
base_url

'https://usegalaxy.eu/api'
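
Note that `urljoin` is sensitive to the trailing slash of its first argument, which matters if a Galaxy server is served under a path prefix. The sketch below illustrates this. `https://example.org/galaxy` is a made-up address and `api_base` is a hypothetical helper, both used only for illustration.

```python
from urllib.parse import urljoin

# The server used in this tutorial is served from the root path.
print(urljoin('https://usegalaxy.eu/', 'api'))

# Made-up server under a '/galaxy' prefix: without a trailing slash,
# urljoin replaces the last path segment instead of appending to it.
print(urljoin('https://example.org/galaxy', 'api'))
print(urljoin('https://example.org/galaxy/', 'api'))


def api_base(server_url):
    """Return the API base URL, tolerating a missing trailing slash."""
    return urljoin(server_url.rstrip('/') + '/', 'api')


print(api_base('https://example.org/galaxy'))
```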

We now make a GET request to retrieve all histories owned by a user:

In [44]:
params = {'key': api_key}
r = requests.get(base_url + '/histories', params)
print(r.text)
hists = r.json()
pprint(hists)

[{"url": "/api/histories/effec70bec8ba12c", "published": false, "model_class": "History", "name": "New history", "purged": false, "annotation": null, "deleted": false, "tags": [], "id": "effec70bec8ba12c"}, {"url": "/api/histories/49e446c3d6585583", "published": false, "model_class": "History", "name": "Unnamed history", "purged": false, "annotation": null, "deleted": false, "tags": [], "id": "49e446c3d6585583"}]
[{'annotation': None,
  'deleted': False,
  'id': 'effec70bec8ba12c',
  'model_class': 'History',
  'name': 'New history',
  'published': False,
  'purged': False,
  'tags': [],
  'url': '/api/histories/effec70bec8ba12c'},
 {'annotation': None,
  'deleted': False,
  'id': '49e446c3d6585583',
  'model_class': 'History',
  'name': 'Unnamed history',
  'published': False,
  'purged': False,
  'tags': [],
  'url': '/api/histories/49e446c3d6585583'}]


As you can see, GET requests in the Galaxy API return JSON strings, which need to be **deserialized** into Python data structures. In particular, GETting a resource collection returns a list of dictionaries.
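
To see the deserialization step in isolation, here is a small sketch that parses a shortened copy of the JSON text printed above with the standard `json` module, which is roughly what `r.json()` does for you. The variable names `text` and `hists_sample` are only for illustration.

```python
import json

# A shortened copy of the JSON text returned by the histories request.
text = ('[{"name": "New history", "deleted": false, "annotation": null, '
        '"tags": [], "id": "effec70bec8ba12c"}]')

hists_sample = json.loads(text)

print(type(hists_sample))     # the collection becomes a list ...
print(type(hists_sample[0]))  # ... of dictionaries
# JSON false/null are translated to Python False/None.
print(hists_sample[0]['deleted'], hists_sample[0]['annotation'])
```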

Each dictionary returned when GETting a resource collection gives basic info about a resource, e.g. for a history you have:
- `id`: the unique **identifier** of the history, needed for all specific requests about this resource
- `name`: the name of this history as given by the user
- `deleted`: whether the history has been deleted
- `url`: the relative URL to get all info about this resource.
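
Since `url` is relative to the server root, it has to be combined with the server address before it can be requested. A minimal sketch, assuming the same `server` value as above and an entry copied from the listing:

```python
from urllib.parse import urljoin

server = 'https://usegalaxy.eu/'
# Entry copied from the listing above (only the fields used here).
hist = {'id': 'effec70bec8ba12c', 'url': '/api/histories/effec70bec8ba12c'}

# The leading '/' makes urljoin resolve the path against the server root.
full_url = urljoin(server, hist['url'])
print(full_url)
```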

There is no readily-available filtering capability, but it's not difficult to filter histories **by name**:

In [45]:
pprint([h for h in hists if h['name'] == 'Unnamed history'])

[{'annotation': None,
  'deleted': False,
  'id': '49e446c3d6585583',
  'model_class': 'History',
  'name': 'Unnamed history',
  'published': False,
  'purged': False,
  'tags': [],
  'url': '/api/histories/49e446c3d6585583'}]
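
If you filter often, the list comprehension can be wrapped in a small function. The sketch below is one possible way to do it. `find_histories` is a hypothetical helper (not part of the Galaxy API), and the third sample entry is made up to show how deleted histories are skipped.

```python
def find_histories(histories, name, include_deleted=False):
    """Return the histories whose name matches exactly."""
    return [
        h for h in histories
        if h['name'] == name and (include_deleted or not h['deleted'])
    ]


# Sample data trimmed from the output above, plus one made-up deleted entry.
sample = [
    {'id': 'effec70bec8ba12c', 'name': 'New history', 'deleted': False},
    {'id': '49e446c3d6585583', 'name': 'Unnamed history', 'deleted': False},
    {'id': 'made-up-id', 'name': 'Unnamed history', 'deleted': True},
]

print([h['id'] for h in find_histories(sample, 'Unnamed history')])
```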


If you are interested in more **details** about a given resource, you just need to append its `id` to the previous collection request, e.g. to get more info about a history:

In [19]:
hist0_id = hists[0]['id']
print(hist0_id)
params = {'key': api_key}
r = requests.get(base_url + '/histories/' + hist0_id, params)
pprint(r.json())

effec70bec8ba12c
{'annotation': None,
 'contents_url': '/api/histories/effec70bec8ba12c/contents',
 'create_time': '2015-07-02T11:04:17.100787',
 'deleted': False,
 'empty': False,
 'genome_build': 'hg38',
 'id': 'effec70bec8ba12c',
 'importable': False,
 'model_class': 'History',
 'name': 'New history',
 'published': False,
 'purged': False,
 'size': 54092,
 'slug': None,
 'state': 'ok',
 'state_details': {'discarded': 0,
                   'empty': 0,
                   'error': 0,
                   'failed_metadata': 0,
                   'new': 0,
                   'ok': 3,
                   'paused': 0,
                   'queued': 0,
                   'running': 0,
                   'setting_metadata': 0,
                   'upload': 0},
 'state_ids': {'discarded': [],
               'empty': [],
               'error': [],
               'failed_metadata': [],
               'new': [],
               'ok': ['bbd44e69cb8906b550f62a0227a2da04',
                      'bbd44e69

As you can see, there are many more entries in the returned dictionary, e.g.:
- `create_time`
- `size`: total disk space used by the history
- `state_ids`: ids of history datasets for each possible state.
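
The per-state counts in `state_details` can be used to summarize a history. The following sketch uses the values shown above. Both helpers are hypothetical, and the choice of which states count as "still busy" is an assumption made for this example, not something defined by the API.

```python
def nonzero_states(state_details):
    """Return only the states that have at least one dataset."""
    return {state: n for state, n in state_details.items() if n > 0}


def is_finished(state_details):
    """Assumed heuristic: no dataset is waiting, running or uploading."""
    busy = ('new', 'queued', 'running', 'upload', 'setting_metadata')
    return all(state_details.get(state, 0) == 0 for state in busy)


# 'state_details' copied from the history shown above.
state_details = {'discarded': 0, 'empty': 0, 'error': 0, 'failed_metadata': 0,
                 'new': 0, 'ok': 3, 'paused': 0, 'queued': 0, 'running': 0,
                 'setting_metadata': 0, 'upload': 0}

print(nonzero_states(state_details))
print(is_finished(state_details))
```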

To get the list of **datasets contained** in a history, simply append `/contents` to the previous resource request.

In [47]:
params = {'key': api_key}
r = requests.get(base_url + '/histories/' + hist0_id + '/contents', params)
hdas = r.json()
pprint(hdas)

[{'create_time': '2015-07-02T11:08:56.653404',
  'dataset_id': 'bbd44e69cb8906b5ab999dca04b8ece0',
  'deleted': False,
  'extension': 'txt',
  'hid': 1,
  'history_content_type': 'dataset',
  'history_id': 'effec70bec8ba12c',
  'id': 'bbd44e69cb8906b550f62a0227a2da04',
  'name': '1.txt',
  'purged': False,
  'state': 'ok',
  'tags': [],
  'type': 'file',
  'type_id': 'dataset-bbd44e69cb8906b550f62a0227a2da04',
  'update_time': '2015-07-02T11:09:26.632285',
  'url': '/api/histories/effec70bec8ba12c/contents/bbd44e69cb8906b550f62a0227a2da04',
  'visible': True},
 {'create_time': '2015-07-02T11:10:57.626465',
  'dataset_id': 'bbd44e69cb8906b5333417f7e7c6ca91',
  'deleted': False,
  'extension': 'txt',
  'hid': 2,
  'history_content_type': 'dataset',
  'history_id': 'effec70bec8ba12c',
  'id': 'bbd44e69cb8906b544479115d78d7a93',
  'name': '1.txt',
  'purged': False,
  'state': 'ok',
  'tags': [],
  'type': 'file',
  'type_id': 'dataset-bbd44e69cb8906b544479115d78d7a93',
  'update_time': '2

The dictionaries returned when GETting the history content give basic info about each dataset, e.g.: `id`, `name`, `deleted`, `state`, `url`...
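
These basic fields are enough to select the datasets you care about without further requests. A minimal sketch, assuming you only want visible, non-deleted datasets in the `ok` state. `usable_datasets` is a hypothetical helper, and the second sample entry is made up.

```python
def usable_datasets(contents):
    """Keep visible, non-deleted datasets that are in the 'ok' state."""
    return [
        item for item in contents
        if item['history_content_type'] == 'dataset'
        and item['state'] == 'ok'
        and item['visible']
        and not item['deleted']
    ]


# First entry trimmed from the output above; the second one is made up
# to show an entry being filtered out.
sample_contents = [
    {'hid': 1, 'name': '1.txt', 'history_content_type': 'dataset',
     'state': 'ok', 'visible': True, 'deleted': False},
    {'hid': 99, 'name': 'made-up.txt', 'history_content_type': 'dataset',
     'state': 'error', 'visible': True, 'deleted': False},
]

for item in usable_datasets(sample_contents):
    print(item['hid'], item['name'])
```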

To get the details about a specific dataset, you can use the `datasets` controller:

In [49]:
hda0_id = hdas[0]['id']
print(hda0_id)
params = {'key': api_key}
r = requests.get(base_url + '/datasets/' + hda0_id, params)
pprint(r.json())

bbd44e69cb8906b550f62a0227a2da04
{'accessible': True,
 'annotation': None,
 'api_type': 'file',
 'create_time': '2015-07-02T11:08:56.653404',
 'created_from_basename': None,
 'creating_job': 'faa39c69e6841f30',
 'data_type': 'galaxy.datatypes.data.Text',
 'dataset_id': 'bbd44e69cb8906b5ab999dca04b8ece0',
 'deleted': False,
 'display_apps': [],
 'display_types': [],
 'download_url': '/api/histories/effec70bec8ba12c/contents/bbd44e69cb8906b550f62a0227a2da04/display',
 'extension': 'txt',
 'file_ext': 'txt',
 'file_size': 16,
 'genome_build': '?',
 'hda_ldda': 'hda',
 'hid': 1,
 'history_content_type': 'dataset',
 'history_id': 'effec70bec8ba12c',
 'id': 'bbd44e69cb8906b550f62a0227a2da04',
 'meta_files': [],
 'metadata_data_lines': 4,
 'metadata_dbkey': '?',
 'misc_blurb': '4 lines',
 'misc_info': 'uploaded txt file',
 'model_class': 'HistoryDatasetAssociation',
 'name': '1.txt',
 'peek': '<table cellspacing="0" cellpadding="3"><tr><td>1 '
         'a</td></tr><tr><td>2 b</td></tr><tr><td

Some of the interesting additional dictionary entries are:
- `create_time`
- `creating_job`: id of the job which created this dataset
- `download_url`: URL to download the dataset
- `file_ext`: the Galaxy data type of this dataset
- `file_size`
- `genome_build`: the genome build (dbkey) associated with this dataset.

**New resources** are created with POST requests. The uploaded **data needs to be serialized** as a JSON string. For example, to create a new history:

In [50]:
params = {'key': api_key}
data = {'name': 'New history'}
r = requests.post(base_url + '/histories', data=json.dumps(data), params=params, headers={'Content-Type': 'application/json'})
new_hist = r.json()
pprint(new_hist)

{'annotation': None,
 'contents_url': '/api/histories/1d45bdeb654111cb/contents',
 'create_time': '2020-07-15T23:23:00.692409',
 'deleted': False,
 'empty': True,
 'genome_build': None,
 'id': '1d45bdeb654111cb',
 'importable': False,
 'model_class': 'History',
 'name': 'New history',
 'published': False,
 'purged': False,
 'size': 0,
 'slug': None,
 'state': 'new',
 'state_details': {'discarded': 0,
                   'empty': 0,
                   'error': 0,
                   'failed_metadata': 0,
                   'new': 0,
                   'ok': 0,
                   'paused': 0,
                   'queued': 0,
                   'running': 0,
                   'setting_metadata': 0,
                   'upload': 0},
 'state_ids': {'discarded': [],
               'empty': [],
               'error': [],
               'failed_metadata': [],
               'new': [],
               'ok': [],
               'paused': [],
               'queued': [],
               'running': [],

The return value of a POST request is a dictionary with detailed info about the created resource.
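
Serialization is the mirror image of the deserialization seen for GET requests. The sketch below shows the round trip with the standard `json` module, using the same payload as in the POST request.

```python
import json

data = {'name': 'New history'}

# Serialize: Python dictionary -> JSON string (this is the request body).
body = json.dumps(data)
print(body)  # {"name": "New history"}

# Deserialize: JSON string -> Python dictionary.
print(json.loads(body) == data)  # True

# Python-specific values are translated to their JSON counterparts.
print(json.dumps({'published': False, 'annotation': None}))
```

`requests` can also do the serialization itself: passing `json=data` instead of `data=json.dumps(data)` serializes the dictionary and sets the `Content-Type: application/json` header for you.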

To **update** a resource, make a PUT request, e.g. to change the history name:

In [51]:
params = {'key': api_key}
data = {'name': 'Updated history'}
r = requests.put(base_url + '/histories/' + new_hist['id'], data=json.dumps(data), params=params, headers={'Content-Type': 'application/json'})
print(r.status_code)
pprint(r.json())

200
{'annotation': None,
 'contents_url': '/api/histories/1d45bdeb654111cb/contents',
 'create_time': '2020-07-15T23:23:00.692409',
 'deleted': False,
 'empty': True,
 'genome_build': None,
 'id': '1d45bdeb654111cb',
 'importable': False,
 'model_class': 'History',
 'name': 'Updated history',
 'published': False,
 'purged': False,
 'size': 0,
 'slug': None,
 'state': 'new',
 'state_details': {'discarded': 0,
                   'empty': 0,
                   'error': 0,
                   'failed_metadata': 0,
                   'new': 0,
                   'ok': 0,
                   'paused': 0,
                   'queued': 0,
                   'running': 0,
                   'setting_metadata': 0,
                   'upload': 0},
 'state_ids': {'discarded': [],
               'empty': [],
               'error': [],
               'failed_metadata': [],
               'new': [],
               'ok': [],
               'paused': [],
               'queued': [],
               'runni

The return value of a PUT request is usually a dictionary with detailed info about the updated resource.

Finally, to **delete** a resource, make a DELETE request, e.g.:

In [52]:
params = {'key': api_key}
r = requests.delete(base_url + '/histories/' + new_hist['id'], params=params)
print(r.status_code)

200
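
Some of the examples above print `r.status_code` but none of them act on it. One possible way to fail early on errors is sketched below. `checked_json` is a hypothetical helper that relies only on the `raise_for_status()` and `json()` methods of a `requests` response. `FakeResponse` is a stand-in so the sketch runs without a server, and it raises `RuntimeError` where `requests` would raise its own `HTTPError`.

```python
def checked_json(response):
    """Raise on an HTTP error status, otherwise return the decoded JSON body."""
    response.raise_for_status()
    return response.json()


class FakeResponse:
    """Stand-in for a requests response, for illustration only."""

    def __init__(self, status_code, payload):
        self.status_code = status_code
        self._payload = payload

    def raise_for_status(self):
        # requests raises requests.exceptions.HTTPError for 4xx/5xx codes;
        # this stand-in uses RuntimeError instead.
        if self.status_code >= 400:
            raise RuntimeError('HTTP error %d' % self.status_code)

    def json(self):
        return self._payload


print(checked_json(FakeResponse(200, {'name': 'Updated history'})))

try:
    checked_json(FakeResponse(403, {'err_msg': 'made-up error'}))
except RuntimeError as exc:
    print('request failed:', exc)
```

With a real response you would call it as, for example, `hists = checked_json(requests.get(base_url + '/histories', params))`.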
