# Research Object Composer tutorial

This is a [Jupyter Notebook](https://jupyter.org/) demonstrating how a client can use the [Research Object Composer](https://github.com/researchobject/research-object-composer) REST API.

For requirements to run this notebook interactively, see the [README](https://github.com/ResearchObject/research-object-composer/blob/master/README.md). 

The [RO Composer API](https://researchobject.github.io/research-object-composer/api/) is documented using [Swagger OpenAPI](https://swagger.io/docs/specification/about/) 2.0, so the REST API can be integrated into many programming languages. This notebook uses plain [Python](https://www.python.org/) so that the HTTP details stay visible.

To execute this notebook, select each cell in order, then click the **▶️Run** button above.

## Python requirements

For the examples below we'll use the Python library [requests](https://pypi.org/project/requests/) to show the HTTP interactions. The examples assume a basic knowledge of [REST services](https://en.wikipedia.org/wiki/Representational_state_transfer).

If the `import` below does not work, run `pip install requests` on the command line where you started Jupyter Notebook.

In [1]:
import requests
true,false = (True,False) # for JSON example

RO Composer is meant to be installed on local infrastructure or as a cloud service. The examples below use a demo service hosted by The University of Manchester, which is not supported and may become unavailable in the future.

If you are testing the service locally using _Docker Compose_ (see [README](https://github.com/ResearchObject/research-object-composer/blob/master/README.md)), change the value below to `http://localhost:8080`, or use the equivalent server name if you are hosting it as a cloud service.

In [2]:
host = "http://openphacts.cs.man.ac.uk:8080"

## Profiles

The RO Composer supports creating Research Objects for multiple **profiles**. Each profile is [defined internally](https://github.com/ResearchObject/research-object-composer/tree/master/src/main/resources/public/schemas) using [JSON Schema](https://json-schema.org/) and describes a different kind of Research Object. We can query the `/profiles` service to see which profiles are installed:


In [3]:
r = requests.get(host + "/profiles")
r.status_code

200

HTTP status code `200` means **OK**, so let's see what the _content type_ of the result is:

In [4]:
r.headers["Content-Type"]

'application/hal+json;charset=UTF-8'

The API results from RO Composer are JSON that follows the Hypertext Application Language ([HAL](http://stateless.co/hal_specification.html)) patterns for RESTful services. Let's look at the content:

In [5]:
r.json()

{'_embedded': {'researchObjectProfileList': [{'id': 1,
    'name': 'data_bundle',
    'fields': ['data', '_metadata'],
    '_links': {'self': {'href': 'http://openphacts.cs.man.ac.uk:8080/profiles/data_bundle'},
     'schema': {'href': 'http://openphacts.cs.man.ac.uk:8080/schemas/data_bundle.schema.json'},
     'researchObjects': {'href': 'http://openphacts.cs.man.ac.uk:8080/profiles/data_bundle/research_objects'}}},
   {'id': 2,
    'name': 'draft_task',
    'fields': ['input', 'workflow', 'workflow_params', '_metadata'],
    '_links': {'self': {'href': 'http://openphacts.cs.man.ac.uk:8080/profiles/draft_task'},
     'schema': {'href': 'http://openphacts.cs.man.ac.uk:8080/schemas/draft_task.schema.json'},
     'researchObjects': {'href': 'http://openphacts.cs.man.ac.uk:8080/profiles/draft_task/research_objects'}}}]},
 '_links': {'self': {'href': 'http://openphacts.cs.man.ac.uk:8080/profiles'}}}

### Profile links

In HAL, the `_links` section contains links to related REST resources, in this case only `self`, which refers back to the HTTP resource we just requested. Hyperlinks are given with `href`, as in HTML.

The `_embedded` section contains additional REST resources, whose properties are partially embedded. Within `researchObjectProfileList` we therefore find the different profiles supported by this service. Let's look at their `name` fields:

In [6]:
profiles = r.json()["_embedded"]['researchObjectProfileList']
[p["name"] for p in profiles]

['data_bundle', 'draft_task']

In this installation, the profile `data_bundle` is for Research Objects containing arbitrary datasets, while `draft_task` is for more specific ROs describing workflow executions. We'll look at the first in detail. 

We see this profile only expects the fields `data` and `_metadata`:

In [7]:
bundle_profile = profiles[0]
bundle_profile["fields"]

['data', '_metadata']

For the profile, the HAL `_links` section includes several entries: `self`, `schema` and `researchObjects`.

In [11]:
links = bundle_profile["_links"]
links

{'self': {'href': 'http://openphacts.cs.man.ac.uk:8080/profiles/data_bundle'},
 'schema': {'href': 'http://openphacts.cs.man.ac.uk:8080/schemas/data_bundle.schema.json'},
 'researchObjects': {'href': 'http://openphacts.cs.man.ac.uk:8080/profiles/data_bundle/research_objects'}}
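Because every HAL resource exposes its relations in the same way, a small helper can look up a link by its relation name instead of indexing nested dictionaries by hand. The following is a minimal sketch: `hal_href` is a hypothetical helper name (not part of RO Composer or requests), and the sample dictionary is copied from the `data_bundle` profile shown above.

```python
def hal_href(resource, rel):
    """Return the href of the HAL link named `rel` in a parsed JSON resource.

    `hal_href` is an illustrative helper, not part of any library.
    """
    links = resource.get("_links", {})
    if rel not in links:
        raise KeyError(f"No HAL link {rel!r}; available: {sorted(links)}")
    return links[rel]["href"]


# Sample copied from the data_bundle profile above
profile = {
    "id": 1,
    "name": "data_bundle",
    "fields": ["data", "_metadata"],
    "_links": {
        "self": {"href": "http://openphacts.cs.man.ac.uk:8080/profiles/data_bundle"},
        "schema": {"href": "http://openphacts.cs.man.ac.uk:8080/schemas/data_bundle.schema.json"},
        "researchObjects": {"href": "http://openphacts.cs.man.ac.uk:8080/profiles/data_bundle/research_objects"},
    },
}

schema_url = hal_href(profile, "schema")
print(schema_url)
```

Raising a `KeyError` that lists the available relations makes it easier to spot a misspelled link name than a bare dictionary lookup would.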

### JSON Schema

We can request the underlying [JSON Schema](https://json-schema.org/) to see details of these fields at `/schemas/{name}` - linked to from `schema` above.

In [12]:
schema = links["schema"]
schema

{'href': 'http://openphacts.cs.man.ac.uk:8080/schemas/data_bundle.schema.json'}

In [13]:
schema_response = requests.get(schema["href"])
schema_response.json()

{'$schema': 'http://json-schema.org/draft-07/schema',
 'type': 'object',
 '$baggable': {'data': '/'},
 'properties': {'_metadata': {'$ref': '/schemas/_base.schema.json#/definitions/Metadata'},
  'data': {'type': 'array',
   'items': {'$ref': '/schemas/_base.schema.json#/definitions/RemoteItem'}}}}

You may notice that the JSON Schema defines the `_metadata` and `data` keys by referencing a separate [base schema](https://github.com/ResearchObject/research-object-composer/blob/master/src/main/resources/public/schemas/_base.schema.json) that is common to all Research Objects. However, we do not need to learn the details of the profile's JSON Schema, as the RO Composer will make individual REST resources for each field.
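If you do want to follow such a `$ref` yourself, it can be resolved against the URL the schema was fetched from. The sketch below uses only the standard library; `resolve_pointer` is a hypothetical helper that handles object keys only (not array indexes), and `stand_in` is a made-up placeholder document, not the real base schema.

```python
from urllib.parse import urldefrag, urljoin

schema_url = "http://openphacts.cs.man.ac.uk:8080/schemas/data_bundle.schema.json"
ref = "/schemas/_base.schema.json#/definitions/RemoteItem"

# Resolve the reference relative to the schema's own URL,
# then split off the JSON Pointer fragment.
absolute = urljoin(schema_url, ref)
document_url, pointer = urldefrag(absolute)
print(document_url)
print(pointer)


def resolve_pointer(document, pointer):
    """Walk a JSON Pointer such as '/definitions/RemoteItem' through nested dicts."""
    node = document
    for part in pointer.lstrip("/").split("/"):
        if part:
            # Undo JSON Pointer escaping: ~1 means '/', ~0 means '~'
            node = node[part.replace("~1", "/").replace("~0", "~")]
    return node


# Placeholder document for illustration only (NOT the real base schema)
stand_in = {"definitions": {"RemoteItem": {"type": "object"}}}
print(resolve_pointer(stand_in, pointer))
```

In practice you would fetch `document_url` with `requests.get(...).json()` and pass the result to `resolve_pointer`.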

## Creating a Research Object

The REST resource that collects [Research Objects for the given profile](https://researchobject.github.io/research-object-composer/api/#operation/listResearchObjectsForProfile) is at `/profiles/{name}/research_objects` and is linked to from the `researchObjects` link:



In [14]:
researchObjects = links["researchObjects"]
researchObjects

{'href': 'http://openphacts.cs.man.ac.uk:8080/profiles/data_bundle/research_objects'}

As this resource is a collection it supports RO creation using [POST](https://researchobject.github.io/research-object-composer/api/#operation/createResearchObject).

In [15]:
created = requests.post(researchObjects["href"])
created

<Response [201]>

In HTTP, **201 Created** means a new HTTP resource was made. We can find out _where_ from the `Location` header:

In [16]:
ro_uri = created.headers["Location"]
ro_uri

'http://openphacts.cs.man.ac.uk:8080/research_objects/20'

### Completing the Research Object

The response also includes a preview of the created Research Object resource (we don't need to `GET` it), where we'll find the same URI under the `self` link.

In [17]:
created.json()

{'id': 20,
 'content': {'data': [], '_metadata': None},
 'contentSha256': None,
 'profileName': 'data_bundle',
 '_links': {'self': {'href': 'http://openphacts.cs.man.ac.uk:8080/research_objects/20'},
  'profile': {'href': 'http://openphacts.cs.man.ac.uk:8080/profiles/data_bundle'},
  'content': {'href': 'http://openphacts.cs.man.ac.uk:8080/research_objects/20/content'},
  'data': {'href': 'http://openphacts.cs.man.ac.uk:8080/research_objects/20/content/data'},
  '_metadata': {'href': 'http://openphacts.cs.man.ac.uk:8080/research_objects/20/content/_metadata'}}}

Remember before we had the fields `data` and `_metadata`? We now see them under `content`, however they are not yet populated:

In [18]:
created.json()["content"]

{'data': [], '_metadata': None}

### Research Object links

We have a corresponding REST resource to populate each, which we find under `_links`. 

In [19]:
links = created.json()["_links"]
links

{'self': {'href': 'http://openphacts.cs.man.ac.uk:8080/research_objects/20'},
 'profile': {'href': 'http://openphacts.cs.man.ac.uk:8080/profiles/data_bundle'},
 'content': {'href': 'http://openphacts.cs.man.ac.uk:8080/research_objects/20/content'},
 'data': {'href': 'http://openphacts.cs.man.ac.uk:8080/research_objects/20/content/data'},
 '_metadata': {'href': 'http://openphacts.cs.man.ac.uk:8080/research_objects/20/content/_metadata'}}

### Adding data to a Research Object

Let's fill `data` first. We saw in `content` that it was an array `[]`, which we also saw as `type: array` in the JSON Schema. RO Composer exposes this as a REST collection where we can `POST` to add items.

In [20]:
data = requests.post(links["data"]["href"], json={})
data

<Response [400]>

Oops, **400 Bad Request**. Perhaps `{}` was not sufficient? Maybe we should have read that JSON Schema after all... Let's look at the returned schema violations:

In [21]:
data.json()


{'pointerToViolation': '#',
 'message': '#: 4 schema violations found',
 'causingExceptions': [{'keyword': 'required',
   'pointerToViolation': '#',
   'message': '#: required key [length] not found',
   'causingExceptions': [],
   'schemaLocation': '/schemas/_base.schema.json#/definitions/RemoteItem'},
  {'keyword': 'required',
   'pointerToViolation': '#',
   'message': '#: required key [filename] not found',
   'causingExceptions': [],
   'schemaLocation': '/schemas/_base.schema.json#/definitions/RemoteItem'},
  {'keyword': 'required',
   'pointerToViolation': '#',
   'message': '#: required key [url] not found',
   'causingExceptions': [],
   'schemaLocation': '/schemas/_base.schema.json#/definitions/RemoteItem'},
  {'keyword': 'required',
   'pointerToViolation': '#',
   'message': '#: required key [checksums] not found',
   'causingExceptions': [],
   'schemaLocation': '/schemas/_base.schema.json#/definitions/RemoteItem'}],
 'schemaLocation': '/schemas/_base.schema.json#/definiti

We see here that we are missing 4 fields from the [RemoteItem](https://github.com/ResearchObject/research-object-composer/blob/master/src/main/resources/public/schemas/_base.schema.json#L4) type, `length`, `filename`, `url` and `checksums`; these properties are used in `data` to reference remote files.
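A client can pull the missing keys out of such a report programmatically by walking the nested `causingExceptions`. This is only a sketch: `missing_keys` is a hypothetical helper, the sample report is an abridged copy of the response above, and parsing the `message` text assumes the validator keeps the wording seen here.

```python
import re


def missing_keys(report):
    """Collect (pointer, key) pairs for every 'required' violation in a report."""
    found = []
    if report.get("keyword") == "required":
        # Assumes messages look like "#: required key [length] not found"
        match = re.search(r"required key \[(.+?)\] not found", report.get("message", ""))
        if match:
            found.append((report["pointerToViolation"], match.group(1)))
    # Nested violations are reported recursively under causingExceptions
    for cause in report.get("causingExceptions", []):
        found.extend(missing_keys(cause))
    return found


# Abridged copy of the 400 response above (two of the four violations)
report = {
    "pointerToViolation": "#",
    "message": "#: 4 schema violations found",
    "causingExceptions": [
        {"keyword": "required",
         "pointerToViolation": "#",
         "message": "#: required key [length] not found",
         "causingExceptions": []},
        {"keyword": "required",
         "pointerToViolation": "#",
         "message": "#: required key [filename] not found",
         "causingExceptions": []},
    ],
}

print(missing_keys(report))
```

The same function also works on a report that is itself a single violation, since it checks the top-level `keyword` before descending.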

### Adding remote resources

For the purpose of this demonstration we'll create a [simple dataset](https://github.com/ResearchObject/ro-lite/tree/master/examples/simple-dataset-0.1.0/data) containing a [TSV file](https://github.com/ResearchObject/ro-lite/blob/master/examples/simple-dataset-0.1.0/data/repository-sizes.tsv) and a [PNG image](https://github.com/ResearchObject/ro-lite/blob/master/examples/simple-dataset-0.1.0/data/repository-sizes-chart.png). We have already calculated the `length` and `checksums`.

In [22]:
tsv = { "url": "https://raw.githubusercontent.com/ResearchObject/research-object-composer/master/examples/repository-sizes.tsv",
        "length": 1982,
        "filename": "repository-sizes.tsv",
        "checksums": [{"type": "sha256", 
                       "checksum": "c2160e931a6ddb8cddb451190816196fc667c5f25020a89a356a69e75ec8dc0a"}]
      } 
png = { "url": "https://raw.githubusercontent.com/ResearchObject/research-object-composer/master/examples/repository-sizes-chart.png",
        "length": 23803,
        "filename": "repository-sizes-chart.png",
        "checksums": [{"type": "sha256", 
                       "checksum": "c2160e931a6ddb8cddb451190816196fc667c5f25020a89a356a69e75ec8dc0a"}]
      } 
tsv_uploaded = requests.post(links["data"]["href"], json=tsv)
tsv_uploaded.status_code

200

In [23]:
png_uploaded = requests.post(links["data"]["href"], json=png)
png_uploaded.status_code

200

**200 OK** here means we complied with the JSON Schema. You may get an error if you get any of the keys wrong, or if the checksum value is of the incorrect length (supported `checksums`: `md5`, `sha1`, `sha256`, `sha512`).
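If you have the file content at hand, the `length` and `checksums` values can be computed with Python's `hashlib`. The following is a sketch under assumptions: `remote_item` is a hypothetical helper, the `example.org` URL is a placeholder, and the expected hex lengths are simply the digest sizes of each algorithm, which we assume is what the service checks.

```python
import hashlib

# Hex digest lengths for the checksum types listed above
HEX_LENGTHS = {"md5": 32, "sha1": 40, "sha256": 64, "sha512": 128}


def remote_item(url, filename, content, algorithm="sha256"):
    """Build a RemoteItem-style dict for `content` (bytes) hosted at `url`."""
    if algorithm not in HEX_LENGTHS:
        raise ValueError(f"Unsupported checksum type: {algorithm}")
    digest = hashlib.new(algorithm, content).hexdigest()
    return {
        "url": url,
        "length": len(content),
        "filename": filename,
        "checksums": [{"type": algorithm, "checksum": digest}],
    }


# Placeholder URL and content, for illustration only
item = remote_item("https://example.org/hello.txt", "hello.txt", b"abc")
print(item)
```

For large files you would read the content in chunks and call `update()` on the hash object rather than loading everything into memory.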

### Inspecting the Research Object


Now let's reload the RO and see that we have populated `content` with the two items.

In [24]:
ro = requests.get(ro_uri)
ro.json()

{'id': 20,
 'content': {'data': [{'url': 'https://raw.githubusercontent.com/ResearchObject/research-object-composer/master/examples/repository-sizes.tsv',
    'length': 1982,
    'filename': 'repository-sizes.tsv',
    'checksums': [{'type': 'sha256',
      'checksum': 'c2160e931a6ddb8cddb451190816196fc667c5f25020a89a356a69e75ec8dc0a'}]},
   {'url': 'https://raw.githubusercontent.com/ResearchObject/research-object-composer/master/examples/repository-sizes-chart.png',
    'length': 23803,
    'filename': 'repository-sizes-chart.png',
    'checksums': [{'type': 'sha256',
      'checksum': 'c2160e931a6ddb8cddb451190816196fc667c5f25020a89a356a69e75ec8dc0a'}]}],
  '_metadata': None},
 'contentSha256': None,
 'profileName': 'data_bundle',
 '_links': {'self': {'href': 'http://openphacts.cs.man.ac.uk:8080/research_objects/20'},
  'profile': {'href': 'http://openphacts.cs.man.ac.uk:8080/profiles/data_bundle'},
  'content': {'href': 'http://openphacts.cs.man.ac.uk:8080/research_objects/20/conten

### Download Research Object

Now we can download the Research Object as a [BagIt archive](https://tools.ietf.org/html/rfc8493) (RFC8493) from the `/research_objects/{id}/bag` resource (currently this REST resource is not listed under `_links`).

As this is a binary file (a ZIP archive), we'll use a slightly different Python method to save it to a file.

In [101]:
import shutil
with requests.post(ro_uri + "/bag", stream=True) as bag:
    bag.raise_for_status()  # check this response, not the earlier `r`
    with open("bag.zip", "wb") as zip_out:
        # Stream the raw response body straight into the file
        shutil.copyfileobj(bag.raw, zip_out)

Let's check the content of the downloaded zip file.

In [114]:
import zipfile
zip = zipfile.ZipFile('bag.zip')
files = zip.namelist()
files.sort() # alphabetical order
files

['data_bundle-20/bag-info.txt',
 'data_bundle-20/bagit.txt',
 'data_bundle-20/data/content.json',
 'data_bundle-20/fetch.txt',
 'data_bundle-20/manifest-md5.txt',
 'data_bundle-20/manifest-sha256.txt',
 'data_bundle-20/manifest-sha512.txt',
 'data_bundle-20/metadata/manifest.json',
 'data_bundle-20/tagmanifest-md5.txt',
 'data_bundle-20/tagmanifest-sha256.txt',
 'data_bundle-20/tagmanifest-sha512.txt']

These paths follow the [BagIt structure](https://tools.ietf.org/html/rfc8493#section-2), where the [Research Object manifest](https://github.com/ResearchObject/bagit-ro) is under `metadata/manifest.json` and the _payload_ is under `data`.

You may notice that the two remote files we added are **not** present in the zip file; they are referenced from `fetch.txt` and `manifest-sha256.txt`. This means that even if large files are added to the Research Object, its downloaded ZIP remains small (until the bag is _completed_ using BagIt tools like [BDBag](http://bd2k.ini.usc.edu/tools/bdbag/)).
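To see which files a bag still needs to fetch, `fetch.txt` can be read directly from the archive. In RFC 8493 each line holds a URL, a length (or `-` if unknown) and a file path, separated by whitespace. The sketch below builds a tiny stand-in bag in memory so that it runs on its own; `read_fetch_entries` is a hypothetical helper, and the `fetch.txt` line is invented for illustration, so the real file from RO Composer may differ.

```python
import io
import zipfile


def read_fetch_entries(archive, bag_root):
    """Parse fetch.txt of a zipped bag into a list of dicts."""
    text = archive.read(bag_root + "/fetch.txt").decode("utf-8")
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split into at most three parts so the path may contain spaces
        url, length, path = line.split(None, 2)
        entries.append({
            "url": url,
            "length": None if length == "-" else int(length),
            "path": path,
        })
    return entries


# Stand-in bag with an invented fetch.txt line (not the real download)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr(
        "data_bundle-20/fetch.txt",
        "https://example.org/repository-sizes.tsv 1982 data/repository-sizes.tsv\n",
    )

with zipfile.ZipFile(buffer) as demo:
    entries = read_fetch_entries(demo, "data_bundle-20")
print(entries)
```

With the real download you would open `bag.zip` instead of the in-memory buffer.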


### Research Object manifest

Let's have a look at `metadata/manifest.json` - the manifest of the Research Object.

In [106]:
import json
with zip.open("data_bundle-20/metadata/manifest.json") as f:
    manifest = json.load(f)
manifest

{'@context': ['https://w3id.org/bundle/context'],
 'id': '../',
 'manifest': ['manifest.json'],
 'createdOn': '2019-04-24T12:49:45.569Z',
 'aggregates': [{'uri': 'https://raw.githubusercontent.com/ResearchObject/research-object-composer/master/examples/repository-sizes.tsv',
   'bundledAs': {'uri': 'urn:uuid:cd0f8912-7a6f-42eb-a96c-b2ad67e93824',
    'folder': '../data/',
    'filename': 'repository-sizes.tsv'}},
  {'uri': 'https://raw.githubusercontent.com/ResearchObject/research-object-composer/master/examples/repository-sizes-chart.png',
   'bundledAs': {'uri': 'urn:uuid:7f94bf78-c7a9-4caa-9b71-cd7f8b656cca',
    'folder': '../data/',
    'filename': 'repository-sizes-chart.png'}}]}

In the [Research Object manifest](https://github.com/ResearchObject/bagit-ro) we see the two external files have been aggregated and given local paths and identifiers. Future work may explore propagating additional metadata about individual files (e.g. _creator_) into the RO manifest.
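The local path of each aggregated file can be derived from its `bundledAs` entry. This sketch assumes that `folder` is relative to the location of `metadata/manifest.json`, which matches the `'id': '../'` seen above but is our interpretation rather than something RO Composer states. `bundled_paths` is a hypothetical helper and the sample is an abridged copy of the manifest above.

```python
import posixpath


def bundled_paths(manifest, manifest_dir="metadata"):
    """Map each aggregated URI to its path inside the bag."""
    paths = {}
    for entry in manifest.get("aggregates", []):
        bundled = entry.get("bundledAs")
        if not bundled:
            continue  # aggregated but not given a local path
        # Resolve '../data/' relative to the manifest's own folder
        joined = posixpath.join(manifest_dir, bundled["folder"], bundled["filename"])
        paths[entry["uri"]] = posixpath.normpath(joined)
    return paths


# Abridged copy of the manifest shown above
manifest = {
    "id": "../",
    "aggregates": [
        {"uri": "https://raw.githubusercontent.com/ResearchObject/research-object-composer/master/examples/repository-sizes.tsv",
         "bundledAs": {"folder": "../data/", "filename": "repository-sizes.tsv"}},
        {"uri": "https://raw.githubusercontent.com/ResearchObject/research-object-composer/master/examples/repository-sizes-chart.png",
         "bundledAs": {"folder": "../data/", "filename": "repository-sizes-chart.png"}},
    ],
}

for uri, path in bundled_paths(manifest).items():
    print(path, "<-", uri)
```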

### Filling in metadata


Next we'll add sufficient metadata so that the RO can be published and assigned a DOI. A `title` is a good start. As this is a single resource (indicated by `None`) we use `PUT` to replace its value.

In [48]:
metadata = requests.put(links["_metadata"]["href"], json={"title": "A good start"})
metadata

<Response [400]>

In [49]:
metadata.json()

{'pointerToViolation': '#',
 'message': '#: 2 schema violations found',
 'causingExceptions': [{'keyword': 'required',
   'pointerToViolation': '#',
   'message': '#: required key [description] not found',
   'causingExceptions': [],
   'schemaLocation': '/schemas/_base.schema.json#/definitions/Metadata'},
  {'keyword': 'required',
   'pointerToViolation': '#',
   'message': '#: required key [creators] not found',
   'causingExceptions': [],
   'schemaLocation': '/schemas/_base.schema.json#/definitions/Metadata'}],
 'schemaLocation': '/schemas/_base.schema.json#/definitions/Metadata'}

The attributes correspond to fields in the [DataCite schema](https://schema.datacite.org/), but in JSON instead of XML. We see we also need `description` and `creators`. From the name we guess that `creators` is an array, but we don't know its fields yet.

In [58]:
metadata = requests.put(links["_metadata"]["href"], json={"title": "A good start", 
                                                          "description": "A test dataset of not much interest",
                                                          "creators": [{}] })
metadata.json()

{'keyword': 'required',
 'pointerToViolation': '#/creators/0',
 'message': '#/creators/0: required key [name] not found',
 'causingExceptions': [],
 'schemaLocation': '/schemas/data_bundle.schema.json#/definitions/Author'}

This highlights one of the advantages of this staged approach to filling the Research Object: you trigger any errors as the item is being set, rather than finding them buried deep inside a large schema validation report.

Now for the completed metadata, also including [ORCID](http://orcid.org/) to identify the creator.

In [75]:
metadata = requests.put(links["_metadata"]["href"], json=
                        { "title": "A good start", 
                          "description": "A test dataset of not much interest",
                          "access_right": "open",
                          "creators": [{"name": "Alice W Land", 
                                        "orcid": "https://orcid.org/0000-0002-1825-0097"}] })
metadata

<Response [200]>

As a RESTful resource we can use `PUT` multiple times in case we change our mind, e.g. if the user was editing equivalent forms in the UI. [Other metadata fields](https://developers.zenodo.org/#representation) from Zenodo can also be added to `_metadata` to be passed on to Zenodo, e.g. `keywords` or `related_identifiers`.

## Publishing the Research Object

Now we can **publish** the Research Object to [Zenodo](https://zenodo.org/) to assign a DOI (**Note**: the demo server is configured to use [https://sandbox.zenodo.org/](https://sandbox.zenodo.org/), which does not actually issue DOIs).

To deposit the RO into the archive we `POST` to the `/research_objects/{id}/deposit` resource.

In [76]:
published = requests.post(ro_uri + "/deposit")
published

<Response [200]>

Any errors in `metadata` above that caused issues in the [Zenodo API](https://developers.zenodo.org/#quickstart-upload) would have caused an error, but we got **200 OK**, so let's have a look.

In [82]:
print(published.text)

https://sandbox.zenodo.org/api/deposit/depositions/275055


### Inspecting the deposited RO

In a browser that is logged in to [https://sandbox.zenodo.org/](https://sandbox.zenodo.org/) you can access the above API resource, and will find something similar to:

In [99]:
{"conceptdoi":"10.5072/zenodo.275054","conceptrecid":"275054","created":"2019-04-23T19:07:43.524952+00:00","doi":"10.5072/zenodo.275055","doi_url":"https://doi.org/10.5072/zenodo.275055","files":[{"checksum":"1f555cdc3d5e5d5e50b5fb4dfef4b99e","filename":"data_bundle-20.zip","filesize":4000,"id":"e67803c5-95be-443a-b059-40039bfa9daf","links":{"download":"https://sandbox.zenodo.org/api/files/86ece719-5338-47af-b23f-ff4854e6df9d/data_bundle-20.zip","self":"https://sandbox.zenodo.org/api/deposit/depositions/274851/files/e67803c5-95be-443a-b059-40039bfa9daf"}}],"id":275055,"links":{"badge":"https://sandbox.zenodo.org/badge/doi/10.5072/zenodo.275055.svg","bucket":"https://sandbox.zenodo.org/api/files/86ece719-5338-47af-b23f-ff4854e6df9d","conceptbadge":"https://sandbox.zenodo.org/badge/doi/10.5072/zenodo.275054.svg","conceptdoi":"https://doi.org/10.5072/zenodo.275054","discard":"https://sandbox.zenodo.org/api/deposit/depositions/275055/actions/discard","doi":"https://doi.org/10.5072/zenodo.275055","edit":"https://sandbox.zenodo.org/api/deposit/depositions/275055/actions/edit","files":"https://sandbox.zenodo.org/api/deposit/depositions/275055/files","html":"https://sandbox.zenodo.org/deposit/275055","latest":"https://sandbox.zenodo.org/api/records/275055","latest_html":"https://sandbox.zenodo.org/record/275055","newversion":"https://sandbox.zenodo.org/api/deposit/depositions/275055/actions/newversion","publish":"https://sandbox.zenodo.org/api/deposit/depositions/275055/actions/publish","record":"https://sandbox.zenodo.org/api/records/275055","record_html":"https://sandbox.zenodo.org/record/275055","registerconceptdoi":"https://sandbox.zenodo.org/api/deposit/depositions/275055/actions/registerconceptdoi","self":"https://sandbox.zenodo.org/api/deposit/depositions/275055"},"metadata":{"access_right":"open","communities":[{"identifier":"zenodo"}],"creators":[{"name":"Alice W Land","orcid":"0000-0002-1825-0097"}],"description":"A test dataset of not much interest","doi":"10.5072/zenodo.275055","license":"CC0-1.0","prereserve_doi":{"doi":"10.5072/zenodo.275055","recid":275055},"publication_date":"2019-04-23","title":"A good start","upload_type":"dataset","version":"7484274FFDD99BC6822AF2F1CF805A5ED5F614C504D57FB6B960A2AD16575931"},"modified":"2019-04-23T19:07:45.060077+00:00","owner":25426,"record_id":275055,"state":"done","submitted":true,"title":"A good start"}

{'conceptdoi': '10.5072/zenodo.275054',
 'conceptrecid': '275054',
 'created': '2019-04-23T19:07:43.524952+00:00',
 'doi': '10.5072/zenodo.275055',
 'doi_url': 'https://doi.org/10.5072/zenodo.275055',
 'files': [{'checksum': '1f555cdc3d5e5d5e50b5fb4dfef4b99e',
   'filename': 'data_bundle-20.zip',
   'filesize': 4000,
   'id': 'e67803c5-95be-443a-b059-40039bfa9daf',
   'links': {'download': 'https://sandbox.zenodo.org/api/files/86ece719-5338-47af-b23f-ff4854e6df9d/data_bundle-20.zip',
    'self': 'https://sandbox.zenodo.org/api/deposit/depositions/274851/files/e67803c5-95be-443a-b059-40039bfa9daf'}}],
 'id': 275055,
 'links': {'badge': 'https://sandbox.zenodo.org/badge/doi/10.5072/zenodo.275055.svg',
  'bucket': 'https://sandbox.zenodo.org/api/files/86ece719-5338-47af-b23f-ff4854e6df9d',
  'conceptbadge': 'https://sandbox.zenodo.org/badge/doi/10.5072/zenodo.275054.svg',
  'conceptdoi': 'https://doi.org/10.5072/zenodo.275054',
  'discard': 'https://sandbox.zenodo.org/api/deposit/depositi

The `latest_html` key under `links` gives the more human-readable Zenodo record for browsers, e.g. [https://sandbox.zenodo.org/record/275055](https://sandbox.zenodo.org/record/275055), while `doi` gives the DOI (which would work on the production Zenodo).

We recognize our `metadata` properties, which have been augmented to indicate a `dataset` and the `publication_date`.

We can also see that the Research Object appears in the [most recent datasets](https://sandbox.zenodo.org/search?page=1&size=20&type=dataset&sort=mostrecent) on Zenodo.
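A client that retrieves this deposition JSON could reduce it to the few values discussed here. The following sketch assumes the structure shown in the response above; `summarize_deposition` is a hypothetical helper, and the sample dictionary is an abridged copy of that response.

```python
def summarize_deposition(deposition):
    """Pick out the DOI, record page and key metadata from a deposition dict."""
    links = deposition.get("links", {})
    metadata = deposition.get("metadata", {})
    return {
        "doi": deposition.get("doi"),
        "record_page": links.get("latest_html"),
        "upload_type": metadata.get("upload_type"),
        "publication_date": metadata.get("publication_date"),
        "submitted": deposition.get("submitted", False),
    }


# Abridged copy of the deposition shown above
deposition = {
    "doi": "10.5072/zenodo.275055",
    "links": {"latest_html": "https://sandbox.zenodo.org/record/275055"},
    "metadata": {"title": "A good start",
                 "upload_type": "dataset",
                 "publication_date": "2019-04-23"},
    "submitted": True,
}

summary = summarize_deposition(deposition)
print(summary)
```

Using `.get()` throughout keeps the helper tolerant of fields that a different archive or an unpublished deposition might omit.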

## Depositing in other archives

It is possible to change the [deposition configuration](https://github.com/ResearchObject/research-object-composer/blob/deposition/src/main/resources/depositor.properties) of the RO Composer to support depositing to other archives, e.g. [Mendeley Data](https://data.mendeley.com/), although a corresponding [implementation](https://github.com/ResearchObject/research-object-composer/tree/deposition/src/main/java/uk/org/esciencelab/researchobjectservice/deposition) must be added to the code.

Current depositors include Zenodo and plain HTTP `POST`, but we are also planning a [SWORD](http://swordapp.org/) depositor to support multiple repositories. A remaining challenge here is how to unify the minimum metadata across repositories.

# Workflow Research Objects

Let's now look at the more detailed Research Object profile for capturing [scientific workflows](http://slides.com/soilandreyes/2019-03-11-reproducibility-pistoia). 

In [8]:
wf_profile = profiles[1]
wf_profile

{'id': 2,
 'name': 'draft_task',
 'fields': ['input', 'workflow', 'workflow_params', '_metadata'],
 '_links': {'self': {'href': 'http://openphacts.cs.man.ac.uk:8080/profiles/draft_task'},
  'schema': {'href': 'http://openphacts.cs.man.ac.uk:8080/schemas/draft_task.schema.json'},
  'researchObjects': {'href': 'http://openphacts.cs.man.ac.uk:8080/profiles/draft_task/research_objects'}}}

Corresponding to a (potential) workflow run, we see these research objects have fields `workflow` for the workflow definition, `input` for the workflow input data, and `workflow_params` for the workflow configuration.

For the purpose of this demonstration, assume we are going to describe an execution of 