<img src="img/automation_using_flows.png">

In this notebook we show how the Globus Flows service can be used to automate data management at scale. The example flow automates a common design pattern: moving data from one system to another and making the data accessible to collaborators. This pattern is often needed to manage data coming from instruments. For example, image files can be moved from local storage attached to a microscope to a high-performance storage system where all members of the research project can access them.

We will walk through the following tasks:
1. Authenticate with Globus and get tokens for accessing various services
1. Define and register a flow with Globus
1. Execute a flow using configurable inputs for the endpoint and access permissions

The Globus flow is illustrated below.

<img src="img/transfer_set_permissions_flow.png" alt="Transfer and set permissions flow" align="CENTER" style="width: 90%;"/>

In [None]:
import sys
import os
import time
import json
import uuid
import pickle
import base64
import pprint

import globus_sdk
from globus_automate_client import FlowsClient

client_id = 'f794186b-f330-4595-b6c6-9c9d3e903e47'

# Feel free to replace the endpoint UUIDs below with those of your own endpoints
tutorial_endpoint = "ddb59aef-6d04-11e5-ba46-22000b92c6ec"  # endpoint "Globus Tutorial Endpoint 1"
petrel_endpoint = "e56c36e4-1063-11e6-a747-22000bf2d559"  # endpoint "Petrel Testbed"
tutorial_users_group = "50b6a29c-63ac-11e4-8062-22000ab68755"  # group "Tutorial Users"

# A. Authentication and Authorization

All interactions between users and services on the Globus automation platform are governed by the Globus Auth service. In particular, this means that the user must give consent for each interaction that takes place on their behalf, including in this notebook.

The first time you interact with a service, such as the Flows service or even an individual flow, you will be given a link to start the consent flow. Click the link, which opens in a new tab, and complete the consent. When finished, copy the code string, return to the notebook, and paste the code into the input box shown below the link to continue.

We will encounter authorization steps in a couple of places:
1. When deploying a new flow on the Globus Flows service; deploying a flow requires (a) an identity that is associated with a Globus subscription, and (b) access to the Flows service scope.
1. When executing a flow.

Access to the Flows service is already granted to you by virtue of authenticating to the JupyterHub running this notebook. Note: If you're running this notebook in your own environment, you will need to manually log into Globus Auth and get tokens using a native app authorization flow (see the `Platform_Introduction_Native_App_Auth` notebook for an example of how to initiate this flow).

In [None]:
# Get Globus Auth token data from the JupyterHub environment
tokens = pickle.loads(base64.b64decode(os.getenv('GLOBUS_DATA')))['tokens']

# Create a variable for storing flow scope tokens. Each newly deployed flow has its own scope, which
# must be authorized separately and has its own set of tokens. Save each of these tokens by scope.
saved_flow_scopes = {}

# Add a callback to the flows client for fetching scopes. It will draw scopes from the `saved_flow_scopes` variable
# above.
def get_flow_authorizer(flow_url, flow_scope, client_id):
    return globus_sdk.AccessTokenAuthorizer(access_token=saved_flow_scopes[flow_scope]['access_token'])

# Set up the flows client, using tokens from our JupyterHub login to access the Flows service, and
# setting the `get_flow_authorizer` callback for any new flows we authorize.
flows_authorizer = globus_sdk.AccessTokenAuthorizer(access_token=tokens['flows.globus.org']['access_token'])
flows_client = FlowsClient.new_client(
    client_id, get_flow_authorizer, flows_authorizer
)


## Fetch User Identity for ACL Permissions

After transferring files, the second step of the flow adds an access rule (ACL) for a user. The cell below fetches your identity ID so that it can be granted read access.

In [None]:

# Create an Auth client so we can look up identities
auth_authorizer = globus_sdk.AccessTokenAuthorizer(access_token=tokens['auth.globus.org']['access_token'])
ac = globus_sdk.AuthClient(authorizer=auth_authorizer)
primary_identity = ac.oauth2_userinfo()
identity_id = primary_identity['sub']

print(f"Setting permissions for user: {primary_identity['preferred_username']}")
print(f"Notifications will be sent to: {primary_identity['email']}")

# B. Flow Registration

## Define a flow

* Flows are composed of *Action* invocations.
* Each Action invocation reads from and contributes back to the *Flow State* which is referenced in Flow steps using the `InputPath` and `ResultPath` properties of an Action.
* Actions specify the service endpoint that will be called using the `ActionUrl` property, and the Globus Auth scope that's required for the specified action using the `ActionScope` property.
* Actions are linked via their `Next` property; the last action in a flow sets the `End` property to true.

Our simple flow defines just two *Actions*, `MoveFiles` and `SetPermission`.

In [None]:


flow_definition = {
  "Comment": "Move files to guest collection and set access permissions",
  "StartAt": "MoveFiles",
  "States": {
    "MoveFiles": {
      "Comment": "Transfer from Globus Tutorial Endpoint 1 to a guest collection on Petrel",
      "Type": "Action",
      "ActionUrl": "https://actions.automate.globus.org/transfer/transfer",
      "ActionScope": "https://auth.globus.org/scopes/actions.globus.org/transfer/transfer",
      "Parameters": {
        "source_endpoint_id.$": "$.input.source_endpoint_id", 
        "destination_endpoint_id.$": "$.input.destination_endpoint_id",
        "sync_level": "exists",
        "transfer_items": [
              {
                "source_path.$": "$.input.source_path",
                "destination_path.$": "$.input.destination_path",
                "recursive": True
              }
        ],
      },
      "ResultPath": "$.MoveFiles",
      "WaitTime": 3600,
      "Next": "SetPermission"
    }, 
    "SetPermission": {
      "Comment": "Grant read permission on the data to the principal given in the flow input",
      "Type": "Action",
      "ActionUrl": "https://actions.automate.globus.org/transfer/set_permission",
      "ActionScope": "https://auth.globus.org/scopes/actions.globus.org/transfer/set_permission",
      "Parameters": {
        "endpoint_id.$": "$.input.destination_endpoint_id",
        "path.$": "$.input.destination_path",
        "permissions": "r",  # read-only access
        "principal.$": "$.input.principal",
        "principal_type.$": "$.input.principal_type",
        "operation": "CREATE",
      },
      "ResultPath": "$.SetPermission",
      "End": True
    }
  }
}
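
As a quick sanity check before deployment, the `StartAt`, `Next`, and `End` links described above can be followed in plain Python. The sketch below is only an illustration: `state_order` and `example_definition` are hypothetical names, the definition is a trimmed-down copy of the flow with its `Parameters` omitted, and the walker assumes a strictly linear flow (no branching states).

```python
def state_order(definition):
    """Follow the Next links from StartAt until a state sets End to True."""
    states = definition["States"]
    order = []
    name = definition["StartAt"]
    while True:
        if name not in states:
            raise ValueError(f"State {name!r} is referenced but not defined")
        if name in order:
            raise ValueError(f"State {name!r} is visited twice (cycle)")
        order.append(name)
        state = states[name]
        if state.get("End") is True:
            return order
        if "Next" not in state:
            raise ValueError(f"State {name!r} has neither Next nor End")
        name = state["Next"]


# Trimmed-down copy of the two-state flow (Parameters omitted for brevity).
example_definition = {
    "StartAt": "MoveFiles",
    "States": {
        "MoveFiles": {"Type": "Action", "Next": "SetPermission"},
        "SetPermission": {"Type": "Action", "End": True},
    },
}

print(state_order(example_definition))
```

Calling `state_order(flow_definition)` on the full definition should work the same way, since only the `StartAt`, `States`, `Next`, and `End` keys are inspected.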


## Deploy a flow

Before you can run a flow, it must be deployed on the Globus Flows service. In addition to the flow definition we created above, you must provide a unique title for your flow when you deploy it. If deployment succeeds, Globus returns an ID as a handle to the flow resource.

In [None]:
# Deploy the flow
flow_title = f"Tutorial-Flow-{str(uuid.uuid4())}"   # generate a unique title
# To update a flow that is already deployed, use this call instead of deploy_flow:
# flow = flows_client.update_flow(flow_id, flow_definition)
flow = flows_client.deploy_flow(flow_definition, title=flow_title)
flow_id = flow['id']
flow_scope = flow['globus_auth_scope']

print(f"Successfully deployed flow (ID: {flow_id})")
print(f"Flow scope: {flow_scope}")

# C. Flow Execution

## Define flow input(s)

If your flow includes parameterized input properties, you must provide a value for each of them when running the flow. Like the flow definition, flow inputs are defined as a JSON document. Input properties are the ones referenced by paths starting with `$.`, such as `$.input.source_path` (see the flow definition above).

For the `MoveFiles` action we must specify source and destination collection IDs and source and destination paths. For the `SetPermission` action we must specify the type of entity to which we're granting permission and the entity's ID. The collection ID and path for `SetPermission` are taken from the transfer destination, and the permission is fixed to read-only (`"r"`) in the flow definition.

In [None]:
flow_input = {
    "input": {
        # Transfer input
        "source_path": "/share/godata",
        "destination_path": "/disthome-automate/",
        "source_endpoint_id": tutorial_endpoint,
        "destination_endpoint_id": petrel_endpoint,
        
        # Grant access to the following person
        "principal": identity_id,
        "principal_type": "identity",
    }
}
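
Because every `$.input.*` reference in the flow definition needs a matching value, it can help to check the input document before running the flow. The following is a minimal sketch, not part of the Globus tooling: `referenced_inputs` and `missing_inputs` are hypothetical helpers, and the sample data is a copy of the `MoveFiles` parameters with a deliberately incomplete input and a placeholder endpoint ID.

```python
def referenced_inputs(node, prefix="$.input."):
    """Collect the names referenced as '$.input.<name>' under keys ending in '.$'."""
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key.endswith(".$") and isinstance(value, str) and value.startswith(prefix):
                found.add(value[len(prefix):].split(".")[0])
            else:
                found |= referenced_inputs(value, prefix)
    elif isinstance(node, list):
        for item in node:
            found |= referenced_inputs(item, prefix)
    return found


def missing_inputs(definition, flow_input):
    """Return the referenced input names that flow_input['input'] does not provide."""
    provided = set(flow_input.get("input", {}))
    return sorted(referenced_inputs(definition) - provided)


# Copy of the MoveFiles parameters from the flow definition.
example_parameters = {
    "source_endpoint_id.$": "$.input.source_endpoint_id",
    "destination_endpoint_id.$": "$.input.destination_endpoint_id",
    "sync_level": "exists",
    "transfer_items": [
        {
            "source_path.$": "$.input.source_path",
            "destination_path.$": "$.input.destination_path",
            "recursive": True,
        }
    ],
}

# Deliberately incomplete input; the endpoint ID is a placeholder, not a real UUID.
example_input = {
    "input": {
        "source_path": "/share/godata",
        "source_endpoint_id": "SOURCE-ENDPOINT-UUID",
    }
}

print(missing_inputs(example_parameters, example_input))
```

Passing the full `flow_definition` and `flow_input` from this notebook to `missing_inputs` should return an empty list when every referenced property has a value.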

## Authorize the newly deployed flow

The new flow has been deployed, but it still needs to be authorized.

In [None]:
# If the flow scope is already saved, we don't need a new one.
if flow_scope not in saved_flow_scopes:
    # Do a native auth flow to login with the newly deployed flow scope
    native_auth_client = globus_sdk.NativeAppAuthClient(client_id)
    native_auth_client.oauth2_start_flow(requested_scopes=flow_scope)
    print(f"Login Here:\n\n{native_auth_client.oauth2_get_authorize_url()}")
    auth_code = input('Auth Code> ')
    token_response = native_auth_client.oauth2_exchange_code_for_tokens(auth_code)
    
    # Save the new token in a place where the flows client can retrieve it.
    saved_flow_scopes[flow_scope] = token_response.by_scopes[flow_scope]

## Run the flow

We're finally ready to run the flow. Note that you will be required to consent again.

In [None]:
flow_action = flows_client.run_flow(flow_id, flow_scope, flow_input)
flow_action_id = flow_action['action_id']
flow_status = flow_action['status']
print(f'Flow action started with id: {flow_action_id}')
while flow_status == 'ACTIVE':
    time.sleep(2)
    flow_action = flows_client.flow_action_status(flow_id, flow_scope, flow_action_id)
    flow_status = flow_action['status']
    print(f'Flow status: {flow_status}')
pprint.pprint(flow_action.data)
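
The polling loop above runs for as long as the flow reports `ACTIVE`. For long transfers, a bounded wait with a growing poll interval may be preferable. The sketch below is one possible shape for that, under stated assumptions: `wait_while_active` is a hypothetical helper, the status source is any zero-argument callable, and the status sequence in the demo is made up for illustration.

```python
import time


def wait_while_active(get_status, poll_interval=2.0, max_interval=30.0,
                      timeout=3600.0, sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it stops returning 'ACTIVE' or the timeout expires."""
    deadline = clock() + timeout
    interval = poll_interval
    status = get_status()
    while status == "ACTIVE":
        if clock() >= deadline:
            raise TimeoutError(f"Still ACTIVE after {timeout} seconds")
        sleep(interval)
        # Double the wait between polls, up to max_interval.
        interval = min(interval * 2, max_interval)
        status = get_status()
    return status


# Demo with a made-up status sequence instead of a real Flows client.
fake_statuses = iter(["ACTIVE", "ACTIVE", "SUCCEEDED"])
final_status = wait_while_active(lambda: next(fake_statuses), poll_interval=0.01)
print(final_status)
```

With the client used in this notebook, `get_status` could be `lambda: flows_client.flow_action_status(flow_id, flow_scope, flow_action_id)['status']`. Any status other than `ACTIVE` ends the wait, so the caller still needs to check whether the returned status indicates success.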

### Remove Access Rule

You can remove the access rule directly with the Globus SDK's Transfer client, without using a flow.

In [None]:
access_rule_id = flow_action['details']['output']['SetPermission']['details']['access_id']

transfer_authorizer = globus_sdk.AccessTokenAuthorizer(tokens['transfer.api.globus.org']['access_token'])
tc = globus_sdk.TransferClient(authorizer=transfer_authorizer)

response = tc.delete_endpoint_acl_rule(petrel_endpoint, access_rule_id)
print(response)
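
The lookup of `access_id` above assumes the flow reached the `SetPermission` step. If the flow failed earlier, that nested key will not exist and the lookup raises a `KeyError`. A defensive lookup could look like the sketch below; `dig` is a hypothetical helper, and `example_result` is an invented structure that only mirrors the key path used above.

```python
def dig(mapping, *keys, default=None):
    """Walk nested dicts key by key, returning default if any level is missing."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Invented result for a flow that failed before SetPermission ran.
example_result = {
    "status": "FAILED",
    "details": {"output": {"MoveFiles": {"status": "FAILED"}}},
}

rule_id = dig(example_result, "details", "output", "SetPermission", "details", "access_id")
if rule_id is None:
    print("No access rule was created, nothing to remove")
```

Since `dig` only walks plain dictionaries, it would need to be given `flow_action.data` rather than the response object itself.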