Interacting with histories in BioBlend
======================================

**You need to insert the API key for your Galaxy server in the cell below**: 
1. Open the Galaxy server in another browser tab
2. Click on "User" on the top menu, then "Preferences"
3. Click on "Manage API key"
4. Generate an API key if needed, then copy the alphanumeric string and paste it as the value of the `api_key` variable below.

The user interacts with a Galaxy server through a `GalaxyInstance` object:

In [1]:
from pprint import pprint

import bioblend.galaxy

server = 'https://usegalaxy.eu/'
api_key = ''  # paste your Galaxy API key here
gi = bioblend.galaxy.GalaxyInstance(url=server, key=api_key)

The `GalaxyInstance` object gives you access to the various controllers, i.e. the resources you are dealing with, like `histories`, `tools` and `workflows`.
Therefore, method calls will have the format `gi.controller.method()`. For example, the call to retrieve all histories owned by the current user is:

In [2]:
pprint(gi.histories.get_histories())

[{'annotation': None,
  'deleted': False,
  'id': 'effec70bec8ba12c',
  'model_class': 'History',
  'name': 'New history',
  'published': False,
  'purged': False,
  'tags': [],
  'url': '/api/histories/effec70bec8ba12c'},
 {'annotation': None,
  'deleted': False,
  'id': '49e446c3d6585583',
  'model_class': 'History',
  'name': 'Unnamed history',
  'published': False,
  'purged': False,
  'tags': [],
  'url': '/api/histories/49e446c3d6585583'}]


As you can see, methods in BioBlend do not return JSON strings, but **deserialize** them into Python data structures. In particular, `get_` methods return a list of dictionaries.

Each dictionary gives basic info about a resource, e.g. for a history you have:
- `id`: the unique **identifier** of the history, needed for all specific requests about this resource
- `name`: the name of this history as given by the user
- `deleted`: whether the history has been deleted
- `url`: the relative URL to get all info about this resource.
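
Because the result is a plain list of dictionaries, ordinary Python is enough to work with it. The following is a minimal sketch that uses a hard-coded copy of part of the output above in place of a live `gi.histories.get_histories()` call; the variable names `histories` and `name_to_id` are only illustrative.

```python
# Sample data copied from the output above; on a live server this would be
# the return value of gi.histories.get_histories().
histories = [
    {'id': 'effec70bec8ba12c', 'name': 'New history', 'deleted': False,
     'url': '/api/histories/effec70bec8ba12c'},
    {'id': '49e446c3d6585583', 'name': 'Unnamed history', 'deleted': False,
     'url': '/api/histories/49e446c3d6585583'},
]

# Map each history name to its id, skipping deleted histories.
# If two histories share a name, the later one overwrites the earlier one.
name_to_id = {h['name']: h['id'] for h in histories if not h['deleted']}
print(name_to_id)
```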

**New resources** are created with `create_` methods, e.g. the call to create a new history is:

In [15]:
new_hist = gi.histories.create_history(name='BioBlend test')
pprint(new_hist)

{'annotation': None,
 'contents_url': '/api/histories/81dbf1b348986271/contents',
 'create_time': '2020-07-15T23:58:51.454856',
 'deleted': False,
 'empty': True,
 'genome_build': None,
 'id': '81dbf1b348986271',
 'importable': False,
 'model_class': 'History',
 'name': 'BioBlend test',
 'published': False,
 'purged': False,
 'size': 0,
 'slug': None,
 'state': 'new',
 'state_details': {'discarded': 0,
                   'empty': 0,
                   'error': 0,
                   'failed_metadata': 0,
                   'new': 0,
                   'ok': 0,
                   'paused': 0,
                   'queued': 0,
                   'running': 0,
                   'setting_metadata': 0,
                   'upload': 0},
 'state_ids': {'discarded': [],
               'empty': [],
               'error': [],
               'failed_metadata': [],
               'new': [],
               'ok': [],
               'paused': [],
               'queued': [],
               'running': [

As you can see, to make POST requests in BioBlend it is **not necessary to serialize data**: you just pass the values explicitly as parameters. The return value is a dictionary with detailed info about the created resource.

`get_` methods usually have **filtering** capabilities, e.g. it is possible to filter histories **by name**:

In [17]:
pprint(gi.histories.get_histories(name='BioBlend test'))

[{'annotation': None,
  'deleted': False,
  'id': '81dbf1b348986271',
  'model_class': 'History',
  'name': 'BioBlend test',
  'published': False,
  'purged': False,
  'tags': [],
  'url': '/api/histories/81dbf1b348986271'}]


It is also possible to specify the unique **id** of the resource to retrieve, e.g. to get back the history we created before:

In [13]:
hist_id = new_hist['id']
pprint(gi.histories.get_histories(history_id=hist_id))

[{'annotation': None,
  'deleted': False,
  'id': '81dbf1b348986271',
  'model_class': 'History',
  'name': 'BioBlend test',
  'published': False,
  'purged': False,
  'tags': [],
  'url': '/api/histories/81dbf1b348986271'}]


Please note that a `get_` method always returns a list, regardless of which parameters are passed to it, even when only one resource matches.
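
Since the result is always a list, code that expects exactly one match has to unpack it explicitly. The helper below is a hypothetical convenience function (it is not part of BioBlend) that makes the "exactly one result" assumption explicit instead of silently taking the first element. It is run here on sample data shaped like the output above.

```python
def single_result(results):
    """Return the only element of a list, or raise ValueError otherwise."""
    if len(results) != 1:
        raise ValueError(f"expected exactly 1 result, got {len(results)}")
    return results[0]


# Sample data standing in for gi.histories.get_histories(name='BioBlend test').
matches = [{'id': '81dbf1b348986271', 'name': 'BioBlend test'}]
hist = single_result(matches)
print(hist['id'])
```

Failing loudly is a deliberate choice in this sketch: a name filter may match zero histories or several, and both cases are easy to miss with a bare `[0]`.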

To **upload** files to the new history, run the special upload tool by calling the `upload_file` method of the `tools` controller:

In [22]:
pprint(gi.tools.upload_file('test-data/1.txt', hist_id))

{'implicit_collections': [],
 'jobs': [{'create_time': '2020-07-16T00:03:08.777733',
           'exit_code': None,
           'galaxy_version': '20.05',
           'history_id': '81dbf1b348986271',
           'id': 'bbd44e69cb8906b548906faa192062c9',
           'model_class': 'Job',
           'state': 'new',
           'tool_id': 'upload1',
           'update_time': '2020-07-16T00:03:08.945527'}],
 'output_collections': [],
 'outputs': [{'create_time': '2020-07-16T00:03:08.679131',
              'data_type': 'galaxy.datatypes.data.Data',
              'deleted': False,
              'file_ext': 'auto',
              'file_size': 0,
              'genome_build': '?',
              'hda_ldda': 'hda',
              'hid': 1,
              'history_content_type': 'dataset',
              'history_id': '81dbf1b348986271',
              'id': 'bbd44e69cb8906b5bdf8ca821a4e31dc',
              'metadata_dbkey': '?',
              'misc_blurb': None,
              'misc_info': None,
          

If you are interested in more **details** about a given resource, you can use the corresponding `show_` method. For example, to get more info about the history we have just populated:

In [18]:
pprint(gi.histories.show_history(history_id=hist_id))

{'annotation': None,
 'contents_url': '/api/histories/81dbf1b348986271/contents',
 'create_time': '2020-07-15T23:58:51.454856',
 'deleted': False,
 'empty': False,
 'genome_build': None,
 'id': '81dbf1b348986271',
 'importable': False,
 'model_class': 'History',
 'name': 'BioBlend test',
 'published': False,
 'purged': False,
 'size': 0,
 'slug': None,
 'state': 'queued',
 'state_details': {'discarded': 0,
                   'empty': 0,
                   'error': 0,
                   'failed_metadata': 0,
                   'new': 0,
                   'ok': 0,
                   'paused': 0,
                   'queued': 1,
                   'running': 0,
                   'setting_metadata': 0,
                   'upload': 0},
 'state_ids': {'discarded': [],
               'empty': [],
               'error': [],
               'failed_metadata': [],
               'new': [],
               'ok': [],
               'paused': [],
               'queued': ['bbd44e69cb8906b5bdf8ca821

As you can see, there are many more entries in the returned dictionary, e.g.:
- `create_time`
- `size`: total disk space used by the history
- `state_ids`: ids of history datasets for each possible state.
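
The `state_details` entry of the same dictionary holds a count per state, which is convenient for a quick summary. The sketch below works on a copy of the `state_details` value printed above. Treating a history as finished only when every dataset is `ok` is a simplifying assumption of this example, not a rule taken from Galaxy.

```python
# Sample data: the 'state_details' entry copied from the output above.
state_details = {
    'discarded': 0, 'empty': 0, 'error': 0, 'failed_metadata': 0,
    'new': 0, 'ok': 0, 'paused': 0, 'queued': 1, 'running': 0,
    'setting_metadata': 0, 'upload': 0,
}

# Total number of datasets counted across all states.
total = sum(state_details.values())

# Keep only the states that actually contain datasets.
non_empty = {state: n for state, n in state_details.items() if n > 0}

# Assumption of this sketch: "finished" means every dataset is 'ok'.
all_ok = total > 0 and state_details['ok'] == total

print(total, non_empty, all_ok)
```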

To get the list of **datasets contained** in a history, simply add `contents=True` to the previous call.

In [20]:
hdas = gi.histories.show_history(history_id=hist_id, contents=True)
pprint(hdas)

[{'create_time': '2020-07-16T00:03:08.679131',
  'dataset_id': 'bbd44e69cb8906b5bd38c5a17c7062a1',
  'deleted': False,
  'extension': 'auto',
  'hid': 1,
  'history_content_type': 'dataset',
  'history_id': '81dbf1b348986271',
  'id': 'bbd44e69cb8906b5bdf8ca821a4e31dc',
  'name': '1.txt',
  'purged': False,
  'state': 'queued',
  'tags': [],
  'type': 'file',
  'type_id': 'dataset-bbd44e69cb8906b5bdf8ca821a4e31dc',
  'update_time': '2020-07-16T00:03:08.750043',
  'url': '/api/histories/81dbf1b348986271/contents/bbd44e69cb8906b5bdf8ca821a4e31dc',
  'visible': True}]


The dictionaries returned when showing the history content give basic info about each dataset, e.g.: `id`, `name`, `deleted`, `state`, `url`...
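
These basic entries are often enough for simple bookkeeping. As an illustration, the sketch below groups dataset names by `state`, using a trimmed copy of the output above in place of a live `show_history(..., contents=True)` call; the name `names_by_state` is only illustrative.

```python
from collections import defaultdict

# Sample data: a trimmed copy of the history contents printed above.
hdas = [
    {'hid': 1, 'id': 'bbd44e69cb8906b5bdf8ca821a4e31dc', 'name': '1.txt',
     'state': 'queued', 'deleted': False, 'visible': True},
]

# Collect the names of non-deleted datasets under their current state.
names_by_state = defaultdict(list)
for hda in hdas:
    if not hda['deleted']:
        names_by_state[hda['state']].append(hda['name'])

print(dict(names_by_state))
```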

To get the details about a specific dataset, you can use the `datasets` controller:

In [25]:
hda0_id = hdas[0]['id']
print(hda0_id)
pprint(gi.datasets.show_dataset(hda0_id))

bbd44e69cb8906b5bdf8ca821a4e31dc
{'accessible': True,
 'annotation': None,
 'api_type': 'file',
 'create_time': '2020-07-16T00:03:08.679131',
 'created_from_basename': None,
 'creating_job': 'bbd44e69cb8906b548906faa192062c9',
 'data_type': 'galaxy.datatypes.data.Data',
 'dataset_id': 'bbd44e69cb8906b5bd38c5a17c7062a1',
 'deleted': False,
 'display_apps': [],
 'display_types': [],
 'download_url': '/api/histories/81dbf1b348986271/contents/bbd44e69cb8906b5bdf8ca821a4e31dc/display',
 'extension': 'auto',
 'file_ext': 'auto',
 'file_size': 0,
 'genome_build': '?',
 'hda_ldda': 'hda',
 'hid': 1,
 'history_content_type': 'dataset',
 'history_id': '81dbf1b348986271',
 'id': 'bbd44e69cb8906b5bdf8ca821a4e31dc',
 'meta_files': [],
 'metadata_dbkey': '?',
 'misc_blurb': None,
 'misc_info': None,
 'model_class': 'HistoryDatasetAssociation',
 'name': '1.txt',
 'peek': None,
 'purged': False,
 'rerunnable': False,
 'resubmitted': False,
 'state': 'queued',
 'tags': [],
 'type': 'file',
 'type_id': 

Some of the interesting additional dictionary entries are:
- `create_time`
- `creating_job`: id of the job which created this dataset
- `download_url`: the relative URL to download the dataset
- `file_ext`: the Galaxy data type of this dataset
- `file_size`
- `genome_build`: the genome build (dbkey) associated with this dataset.
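
Note that `download_url`, like `url`, is relative to the server address. A minimal sketch of turning it into an absolute URL with the standard library follows. It assumes that Galaxy is served from the root of the host, as with the `server` value used at the top of this notebook.

```python
from urllib.parse import urljoin

# Values copied from earlier in this notebook.
server = 'https://usegalaxy.eu/'
download_url = ('/api/histories/81dbf1b348986271/contents/'
                'bbd44e69cb8906b5bdf8ca821a4e31dc/display')

# Resolve the server-relative path against the server address.
full_url = urljoin(server, download_url)
print(full_url)
```

A request to this URL will normally need to be authenticated. For actually fetching the data, the `download_dataset` method of the `datasets` controller is likely the more convenient route.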

To **update** a resource, use the `update_` method, e.g. to change the name of the new history:

In [26]:
pprint(gi.histories.update_history(new_hist['id'], name='Updated history'))

{'annotation': None,
 'contents_url': '/api/histories/81dbf1b348986271/contents',
 'create_time': '2020-07-15T23:58:51.454856',
 'deleted': False,
 'empty': False,
 'genome_build': None,
 'id': '81dbf1b348986271',
 'importable': False,
 'model_class': 'History',
 'name': 'Updated history',
 'published': False,
 'purged': False,
 'size': 0,
 'slug': None,
 'state': 'queued',
 'state_details': {'discarded': 0,
                   'empty': 0,
                   'error': 0,
                   'failed_metadata': 0,
                   'new': 0,
                   'ok': 0,
                   'paused': 0,
                   'queued': 1,
                   'running': 0,
                   'setting_metadata': 0,
                   'upload': 0},
 'state_ids': {'discarded': [],
               'empty': [],
               'error': [],
               'failed_metadata': [],
               'new': [],
               'ok': [],
               'paused': [],
               'queued': ['bbd44e69cb8906b5bdf8ca8

The return value of `update_` methods is usually a dictionary with detailed info about the updated resource.
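
To check what an update actually changed, the returned dictionary can be compared with one obtained earlier. The helper below, `changed_fields`, is a hypothetical function written for this example (it is not part of BioBlend). It is applied to trimmed copies of the dictionaries shown above.

```python
def changed_fields(before, after):
    """Return {key: (old, new)} for every key whose value differs.

    A key missing from one of the dictionaries is reported with None
    on that side.
    """
    keys = before.keys() | after.keys()
    return {
        k: (before.get(k), after.get(k))
        for k in keys
        if before.get(k) != after.get(k)
    }


# Sample data: trimmed copies of the history before and after the update.
before = {'id': '81dbf1b348986271', 'name': 'BioBlend test', 'deleted': False}
after = {'id': '81dbf1b348986271', 'name': 'Updated history', 'deleted': False}

print(changed_fields(before, after))
```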

Finally, to **delete** a resource, use the `delete_` method, e.g.:

In [27]:
pprint(gi.histories.delete_history(new_hist['id']))

{'annotation': None,
 'contents_url': '/api/histories/81dbf1b348986271/contents',
 'create_time': '2020-07-15T23:58:51.454856',
 'deleted': True,
 'empty': False,
 'genome_build': None,
 'id': '81dbf1b348986271',
 'importable': False,
 'model_class': 'History',
 'name': 'Updated history',
 'published': False,
 'purged': False,
 'size': 0,
 'slug': None,
 'state': 'queued',
 'state_details': {'discarded': 0,
                   'empty': 0,
                   'error': 0,
                   'failed_metadata': 0,
                   'new': 0,
                   'ok': 0,
                   'paused': 0,
                   'queued': 1,
                   'running': 0,
                   'setting_metadata': 0,
                   'upload': 0},
 'state_ids': {'discarded': [],
               'empty': [],
               'error': [],
               'failed_metadata': [],
               'new': [],
               'ok': [],
               'paused': [],
               'queued': ['bbd44e69cb8906b5bdf8ca82