# Demo DERIVA Ingest Action Provider Flow
This notebook demonstrates using the DERIVA Ingest Action Provider via a premade Flow.

To run this notebook, you must have the `globus-automate-client` (to run the Flow) and `deriva-client` (to see the catalog) installed. The clients can be installed with `pip`:

`pip install globus-automate-client deriva-client`

In [1]:
import globus_automate_client
from deriva.core.ermrest_catalog import ErmrestCatalog
# Prerequisite IDs and data
native_app_id = "417301b1-5101-456a-8a27-423e71a2ae26"  # Premade native app ID
flow_id = "f172de09-b75b-4b83-9b97-90877b42c774"  # ID of the Flow to run (can be found through .list_flows())
backup_url = "https://s3-us-west-2.amazonaws.com/demo.derivacloud.org/cfde/cfde-backup.zip"  # Data for the Action
example_sample_id = "GTEX-OIZI-0008-SM-2XV77"  # Sample ID for one entry in the data to ingest

The `FlowsClient` is instantiated with the client ID of a native app registered with Globus (see the [Globus Developers](https://developers.globus.org/) website).

In [2]:
flows_client = globus_automate_client.create_flows_client(native_app_id)

The definition of a Flow can be retrieved using the Flow's ID. The definition includes the Flow's scope (`globus_auth_scope`), which is needed to run the Flow.

In [3]:
flow_def = flows_client.get_flow(flow_id)
flow_def.data

{'administered_by': [],
 'api_version': '1.0',
 'definition': {'Comment': 'Run the Demo Deriva Ingest Action',
  'StartAt': 'RunDeriva',
  'States': {'RunDeriva': {'ActionScope': 'https://auth.globus.org/scopes/21017803-059f-4a9b-b64c-051ab7c1d05d/demo',
    'ActionUrl': 'https://demo-api.fair-research.org/',
    'End': True,
    'InputPath': '$.DerivaInput',
    'ResultPath': '$.DerivaResult',
    'Type': 'Action',
    'WaitTime': 86400}}},
 'description': '',
 'globus_auth_scope': 'https://auth.globus.org/scopes/f172de09-b75b-4b83-9b97-90877b42c774/flow_f172de09_b75b_4b83_9b97_90877b42c774',
 'globus_auth_username': 'f172de09-b75b-4b83-9b97-90877b42c774@clients.auth.globus.org',
 'id': 'f172de09-b75b-4b83-9b97-90877b42c774',
 'keywords': [],
 'log_supported': True,
 'principal_urn': 'urn:globus:auth:identity:f172de09-b75b-4b83-9b97-90877b42c774',
 'runnable_by': [],
 'subtitle': '',
 'synchronous': False,
 'title': 'Deriva Demo Flow',
 'types': ['Action'],
 'visible_to': []}

In [4]:
flow_scope = flow_def["globus_auth_scope"]
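
The definition above also records, for each state, which key of the Flow input is handed to the Action and which key receives the result. As a minimal sketch, the helper below summarizes that from an abbreviated copy of the printed definition. The function names `top_level_key` and `summarize_action_states` are hypothetical, and the sketch assumes simple paths of the form `$.Key` like the ones shown.

```python
# Abbreviated copy of the "definition" entry printed above.
flow_definition = {
    "StartAt": "RunDeriva",
    "States": {
        "RunDeriva": {
            "ActionUrl": "https://demo-api.fair-research.org/",
            "InputPath": "$.DerivaInput",
            "ResultPath": "$.DerivaResult",
            "Type": "Action",
            "End": True,
        }
    },
}


def top_level_key(path):
    """Turn a simple path such as '$.DerivaInput' into the key 'DerivaInput'."""
    if not path.startswith("$."):
        raise ValueError("unsupported path: {!r}".format(path))
    return path[2:]


def summarize_action_states(definition):
    """Map each Action state name to its URL, input key, and result key."""
    summary = {}
    for name, state in definition["States"].items():
        if state.get("Type") != "Action":
            continue  # Only Action states call an Action Provider
        summary[name] = {
            "url": state["ActionUrl"],
            "input_key": top_level_key(state["InputPath"]),
            "result_key": top_level_key(state["ResultPath"]),
        }
    return summary


action_states = summarize_action_states(flow_definition)
print(action_states)
```

In the notebook itself, the same helper could be applied to `flow_def["definition"]` instead of the copied dictionary.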

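The Flow input is keyed on the name given by the state's `InputPath` (`$.DerivaInput` in the `flow_def` above), not on the state name. The value under that key is passed to the Action Provider directly.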

In [5]:
# Input for restoring a catalog
restore_flow_input = {
    "DerivaInput": {
        "restore_url": backup_url  # Defined with the prerequisite IDs and data above
    }
}
# Input for creating and ingesting into a new catalog
ingest_flow_input = {
    "DerivaInput": {
        "ingest_url": "https://examples.fair-research.org/public/CFDE/metadata/CFDE-GTEx-v7.2effcf3.C2M2.bdbag.tgz"
    }
}

In [6]:
# We're using the ingest flow
flow_input = ingest_flow_input
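
Both input dictionaries above have the same shape: the body meant for the Action is nested under the key named by the state's `InputPath`. A small sketch of that wrapping step follows. The helper name `build_flow_input` is hypothetical, and it only handles simple dotted paths such as `$.DerivaInput`.

```python
def build_flow_input(action_body, input_path="$.DerivaInput"):
    """Nest action_body under the keys named by a simple '$.a.b' path."""
    if not input_path.startswith("$."):
        raise ValueError("unsupported path: {!r}".format(input_path))
    keys = input_path[2:].split(".")
    wrapped = action_body
    # Wrap from the innermost key outward so the first key ends up on top.
    for key in reversed(keys):
        wrapped = {key: wrapped}
    return wrapped


example_input = build_flow_input(
    {"ingest_url": "https://examples.fair-research.org/public/CFDE/metadata/CFDE-GTEx-v7.2effcf3.C2M2.bdbag.tgz"}
)
print(example_input)
```

The result has the same structure as `ingest_flow_input`, so the wrapping key stays in one place if the Flow definition ever changes its `InputPath`.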

Running the Flow takes the Flow ID, scope, and input from the previous cells in a single call. It returns the initial state of the Flow run, including the `action_id` that identifies this instance of the Flow.

In [7]:
flow_res = flows_client.run_flow(flow_id, flow_scope, flow_input)
flow_res.data

{'action_id': 'bfe46f09-1629-4261-9033-43ff778988cb',
 'completion_time': 'None',
 'created_by': 'urn:globus:auth:identity:117e8833-68f5-4cb2-afb3-05b25db69be1',
 'details': {'code': 'ActionStarted',
  'description': 'State RunDeriva of type Action started',
  'details': {'input': {'DerivaInput': {'ingest_url': 'https://examples.fair-research.org/public/CFDE/metadata/CFDE-GTEx-v7.2effcf3.C2M2.bdbag.tgz'}},
   'state_name': 'RunDeriva',
   'state_type': 'Action'},
  'time': '2019-09-25T16:17:13.006000+00:00'},
 'start_time': '2019-09-25T16:17:12.980000+00:00',
 'status': 'ACTIVE'}

In [8]:
instance_id = flow_res["action_id"]

After the Flow is started, the status can be queried with the `instance_id`.

In [9]:
flows_client.flow_action_status(flow_id, flow_scope, instance_id).data

{'action_id': 'bfe46f09-1629-4261-9033-43ff778988cb',
 'completion_time': 'None',
 'created_by': 'urn:globus:auth:identity:117e8833-68f5-4cb2-afb3-05b25db69be1',
 'details': {'action_url': 'https://demo-api.fair-research.org/',
  'code': 'ActionPolled',
  'description': 'Polling for completion of action state RunDeriva',
  'details': {'state_name': 'RunDeriva'},
  'time': '2019-09-25T16:17:13.447000+00:00'},
 'start_time': '2019-09-25T16:17:12.980000+00:00',
 'status': 'ACTIVE'}
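
Rather than re-running the status cell by hand, the check can be wrapped in a polling loop. The following is a minimal sketch, not part of the client library: `wait_for_flow` is a hypothetical helper, and treating every status other than `ACTIVE` as finished is an assumption, since only `ACTIVE` appears in the output above. A canned iterator stands in for the real status call so the sketch runs without credentials.

```python
import time


def wait_for_flow(get_status, poll_interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll get_status() until the status is no longer ACTIVE, or raise TimeoutError.

    get_status is any zero-argument callable returning a status dict like the
    ones printed above. Treating every non-ACTIVE status as "finished" is an
    assumption of this sketch.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status.get("status") != "ACTIVE":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("Flow still ACTIVE after {} seconds".format(timeout))
        sleep(poll_interval)


# Stand-in for the real status call.
# "FINISHED" is a placeholder value, not a documented Flow status.
canned = iter([{"status": "ACTIVE"}, {"status": "ACTIVE"}, {"status": "FINISHED"}])
final_status = wait_for_flow(lambda: next(canned), poll_interval=0.0)
print(final_status["status"])
```

In this notebook, `get_status` could be `lambda: flows_client.flow_action_status(flow_id, flow_scope, instance_id).data`, which is the same call used in the cell above.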

Eventually, the Flow completes (in this case successfully), and the values of interest, here the new catalog's ID and web link, can be pulled out of the response.

In [10]:
response = flows_client.flow_action_status(flow_id, flow_scope, instance_id).data
response

{'action_id': 'bfe46f09-1629-4261-9033-43ff778988cb',
 'completion_time': '2019-09-25T16:19:48.387000+00:00',
 'created_by': 'urn:globus:auth:identity:117e8833-68f5-4cb2-afb3-05b25db69be1',
 'details': {'output': {'DerivaInput': {'ingest_url': 'https://examples.fair-research.org/public/CFDE/metadata/CFDE-GTEx-v7.2effcf3.C2M2.bdbag.tgz'},
   'DerivaResult': {'action_id': '5d8b9309c9e0d5da0c40d372',
    'creator_id': 'urn:globus:auth:identity:117e8833-68f5-4cb2-afb3-05b25db69be1',
    'details': {'deriva_id': 142,
     'deriva_link': 'https://demo.derivacloud.org/chaise/recordset/#142/CFDE:Dataset',
     'message': 'DERIVA ingest successful'},
    'manage_by': ['urn:globus:auth:identity:96801dc2-95d7-44f0-9383-d3e747be8ab6',
     'urn:globus:auth:identity:117e8833-68f5-4cb2-afb3-05b25db69be1'],
    'monitor_by': ['urn:globus:auth:identity:96801dc2-95d7-44f0-9383-d3e747be8ab6',
     'urn:globus:auth:identity:117e8833-68f5-4cb2-afb3-05b25db69be1'],
    'release_after': 'P30D',
    'request

In [11]:
deriva_catalog_id = response["details"]["output"]["DerivaResult"]["details"]["deriva_id"]
print("The web link for this catalog is:\n", response["details"]["output"]["DerivaResult"]["details"]["deriva_link"], sep="")

The web link for this catalog is:
https://demo.derivacloud.org/chaise/recordset/#142/CFDE:Dataset
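
The nested lookup above only works once the Flow has finished: while the Flow is still `ACTIVE`, the response has no `output` entry and the lookup fails with a bare `KeyError`. One possible guard, sketched below with a hypothetical helper name and abbreviated copies of the responses shown earlier, turns that into a clearer error.

```python
def get_deriva_details(response):
    """Return the DerivaResult 'details' dict, or raise ValueError if it is absent."""
    try:
        return response["details"]["output"]["DerivaResult"]["details"]
    except (KeyError, TypeError) as err:
        raise ValueError(
            "response has no DerivaResult details (is the Flow still running?)"
        ) from err


# Abbreviated copy of the completed response printed above.
completed = {
    "details": {
        "output": {
            "DerivaResult": {
                "details": {
                    "deriva_id": 142,
                    "deriva_link": "https://demo.derivacloud.org/chaise/recordset/#142/CFDE:Dataset",
                    "message": "DERIVA ingest successful",
                }
            }
        }
    }
}
# Abbreviated copy of a response from a Flow that is still running.
still_active = {"details": {"code": "ActionPolled"}, "status": "ACTIVE"}

details = get_deriva_details(completed)
print(details["deriva_id"], details["deriva_link"])

try:
    get_deriva_details(still_active)
except ValueError as err:
    print("Not ready:", err)
```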


Now, we can look at the catalog with the DERIVA client.

In [12]:
catalog = ErmrestCatalog("https", "demo.derivacloud.org", str(deriva_catalog_id))
catalog.get("/").json()

{'snaptime': '2S6-RR4W-2YV0',
 'annotations': {'tag:isrd.isi.edu,2019:chaise-config': {'SystemColumnsDisplayCompact': []}},
 'rights': {'create': False, 'owner': False},
 'id': '142'}
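
The scheme, hostname, and catalog ID passed to `ErmrestCatalog` above also appear in the `deriva_link` returned by the Flow. As an illustrative sketch, they can be recovered from the link with the standard library. This assumes the link always has the form shown in the output (`.../chaise/recordset/#<catalog>/<schema>:<table>`), and `parse_chaise_link` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def parse_chaise_link(link):
    """Split a Chaise recordset link into (scheme, host, catalog_id, table)."""
    parts = urlsplit(link)
    # The fragment looks like "142/CFDE:Dataset": catalog ID, then schema:table.
    catalog_id, _, table = parts.fragment.partition("/")
    if not catalog_id:
        raise ValueError("link has no catalog fragment: {!r}".format(link))
    return parts.scheme, parts.netloc, catalog_id, table


link_parts = parse_chaise_link(
    "https://demo.derivacloud.org/chaise/recordset/#142/CFDE:Dataset"
)
print(link_parts)
```

The first three values match the arguments given to `ErmrestCatalog` in the cell above.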

In [13]:
# catalog.get("/entity/Samples/SAMPID={}".format(example_sample_id)).json()
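
The commented-out query above builds an ERMrest entity path by string formatting. If a filter value could contain characters such as `/` or spaces, it would need to be URL-encoded first. The sketch below shows one way to do that with the standard library. The helper name `entity_path` is hypothetical, and the table `Samples` and column `SAMPID` are taken from the commented-out cell without confirming that they exist in this catalog.

```python
from urllib.parse import quote


def entity_path(table, column, value):
    """Build an '/entity/<table>/<column>=<value>' path with URL-encoded parts."""
    return "/entity/{}/{}={}".format(
        quote(table, safe=""),
        quote(column, safe=""),
        quote(str(value), safe=""),
    )


sample_path = entity_path("Samples", "SAMPID", "GTEX-OIZI-0008-SM-2XV77")
print(sample_path)
```

The resulting string could then be passed to `catalog.get(...)` in place of the hand-formatted path.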