<img src="img/platform_introduction_native_app_header.png">

# Globus SDK
- Source code: https://github.com/globus/globus-sdk-python
- Documentation: https://globus-sdk-python.readthedocs.io/en/stable/


# Requirements
- Membership in the [Tutorial Users Group](https://app.globus.org/groups/50b6a29c-63ac-11e4-8062-22000ab68755) for sharing.
- Installed Globus Python SDK

In [None]:
import json  # just so we can pretty-print response data

# Feel free to replace the endpoint UUIDs below with those of your own endpoints
tutorial_endpoint_1 = "ddb59aef-6d04-11e5-ba46-22000b92c6ec"  # endpoint "Globus Tutorial Endpoint 1"
tutorial_endpoint_2 = "ddb59af0-6d04-11e5-ba46-22000b92c6ec"  # endpoint "Globus Tutorial Endpoint 2"
tutorial_users_group = "50b6a29c-63ac-11e4-8062-22000ab68755"  # group "Tutorial Users"

# Authentication and Transfer Client Configuration

The Globus Python SDK makes transfer functionality available via a TransferClient class.

We want to configure a TransferClient with an OAuth2 access token, to authenticate its connections with Globus.

In order to do this, we must first do a Native App Grant OAuth2 flow, in which we will:
1. Log into Globus using a one-time, generated URL
2. Consent to allow this Jupyter Notebook to access Globus Transfer on our behalf
3. Return to the notebook with an Authorization Code (the result of step 2)
4. Exchange the Authorization Code for an Access Token
5. Create a TransferClient object using that Access Token as its authorization method

In [None]:
import globus_sdk

CLIENT_ID = "3b1925c0-a87b-452b-a492-2c9921d3bd14"  # client ID of the Jupyter Demo App in Globus Auth

# Start the Native App Grant and print the URL where users log in as part of the flow (step 1 above)
# First, create a client object that tracks state as we do this flow
native_auth_client = globus_sdk.NativeAppAuthClient(CLIENT_ID)

# Explicitly start the flow (some clients may support multiple flows)
native_auth_client.oauth2_start_flow()
print(f"Login Here:\n\n{native_auth_client.oauth2_get_authorize_url()}")
print("\nIMPORTANT NOTE: the link above can only be used once!")
print("If login or a later step in the flow fails, you must execute this cell again to generate a new link.")

### Come Back With Your Code

You'll now need to return with the copy-paste-able code that you got from login at the Authorize URL.
Insert it into the code block below, and it can be exchanged for a set of Access Tokens (and, optionally, Refresh Tokens, which are absent in this example).

In [None]:
# Add the code that you got from Globus below
auth_code = "2JKx5eE1t1EML1WleK6Z5gTaMUYPvm"

# Exchange code for access token(s)
token_response = native_auth_client.oauth2_exchange_code_for_tokens(auth_code)

# Extract the access token for the Globus Transfer service, known as "transfer.api.globus.org"
transfer_access_token = token_response.by_resource_server['transfer.api.globus.org']['access_token']

# Wrap the token in an object that implements the globus_sdk.GlobusAuthorizer interface
# In this case, an AccessTokenAuthorizer, which takes an access token and produces Bearer Auth headers
transfer_authorizer = globus_sdk.AccessTokenAuthorizer(transfer_access_token)

# Create a TransferClient object which Authorizes its calls using that GlobusAuthorizer
tc = globus_sdk.TransferClient(authorizer=transfer_authorizer)

## Help

Test that our transfer client works by requesting help on the `get_endpoint` method. You can use this to get help on any method.

In [None]:
help(tc.get_endpoint)

## Using the client

The transfer client makes REST resources available via easy-to-use methods. The response from these methods wraps the HTTP response status, content type, text, and JSON response body.

In [None]:
response = tc.get_endpoint(tutorial_endpoint_1)
print(f"HTTP Status Code: {response.http_status}")
print(f"Content Type: {response.content_type}")
print(f"Endpoint Display Name: {response['display_name']}")  # shortcut for response.data['display_name']
print(f"Data: {response}")

Helper methods for APIs that return lists have iterable responses and automatically take care of paging where required:

In [None]:
response = tc.endpoint_search(filter_scope="my-endpoints")
for ep in response:
    print(f"{ep['display_name'] or ep['canonical_name']} ({ep['id']})")

If a helper method is not yet available for the desired API call, or if you need finer control, you can use the low-level interface. Note that the low-level interface does not page automatically:

In [None]:
response = tc.get("/endpoint_search", params=dict(filter_fulltext="Globus Tutorial Endpoint", limit=1))
if response['DATA']:
    print(f"Endpoint ID: {response['DATA'][0]['id']} (Owner: {response['DATA'][0]['owner_string']})")
    print(f"More matches? {response['has_next_page']}")
else:
    print("No results")

## Handling errors

If the API returns an error response (HTTP status code 4xx or 5xx), it will be translated to a Python exception and raised:

In [None]:
try:
    response = tc.get_endpoint("dcb2e10e-de27-4b99-8722-1a69aa3fc467")
except globus_sdk.GlobusAPIError as error:
    print(f"HTTP Status Code: {error.http_status}")
    print(f"Error Code      : {error.code}")
    print(f"Error Message   : {error.message}")

There are five basic classes of errors:

1. Bad request - there is something wrong with the request from the client, like a misspelled parameter name or missing required data. These errors have a code that starts with ``BadRequest`` or ``ClientError.BadRequest``.
2. State conflict - this is a very broad category that covers all the errors that can happen during normal operation and that neither the client nor the server could have anticipated or avoided. Examples: local filesystem permissions not allowing the requested path on a remote GridFTP endpoint, endpoint not found (could have been deleted concurrently by another client). This also includes network errors communicating with GridFTP endpoints and other external services. These errors typically have a code containing ``PermissionDenied``, ``Conflict``, or ``ExternalError``.
3. Network error - network failure between the REST client and the REST API server. These errors will result in a ``globus_sdk.NetworkError`` being raised by the SDK.
4. Planned downtime - code ``ServiceUnavailable``.
5. Server error - caused by a bug in the REST API server (code ``ServerError.InternalError``). We log such errors and incorporate fixes into our next release, but developers are still encouraged to submit details to the mailing list when they encounter them. Note that these errors are sometimes really bad-request errors in disguise: the server bug is that it did not anticipate the exact kind of bad data and so reported the wrong error code, and the problem can still be resolved by a change to the client.
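The categories above can be turned into a small dispatch helper. The following is a minimal sketch that relies only on the code prefixes and substrings listed above; the function name and the category labels are illustrative, not part of the SDK. Network errors are not covered because they surface as a separate exception rather than as an error code.

```python
def classify_error_code(code):
    """Map an API error code string to one of the broad categories listed above."""
    if code.startswith(("BadRequest", "ClientError.BadRequest")):
        return "bad_request"
    if code == "ServiceUnavailable":
        return "planned_downtime"
    if code.startswith("ServerError"):
        return "server_error"
    if any(marker in code for marker in ("PermissionDenied", "Conflict", "ExternalError")):
        return "state_conflict"
    # Anything not described in the list above is left unclassified.
    return "unknown"


for sample_code in ("ClientError.BadRequest", "PermissionDenied",
                    "ServiceUnavailable", "ServerError.InternalError"):
    print(sample_code, "->", classify_error_code(sample_code))
```

Inside an `except globus_sdk.GlobusAPIError as error:` block, the helper could be called as `classify_error_code(error.code)` to decide whether a retry, a fix to the request, or a bug report is the appropriate reaction.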

# Endpoint management

## Endpoint search

Globus has tens of thousands of registered endpoints. To find endpoints of interest you can access powerful search capabilities via the SDK. For example, to search for a given string across the descriptive fields of endpoints (names, description, keywords):

In [None]:
search_str = "Globus Tutorial Endpoint"
endpoints = tc.endpoint_search(search_str)
print(f"==== Displaying endpoints that match '{search_str}' ===")
for ep in endpoints:
    print(f"{ep['display_name'] or ep['canonical_name']} ({ep['id']})")

## Restricting search scope with filters

There are also a number of default filters that restrict the search scope ('my-endpoints', 'my-gcp-endpoints', 'recently-used', 'in-use', 'shared-by-me', 'shared-with-me'):

In [None]:
search_str = None
endpoints = tc.endpoint_search(filter_fulltext=search_str, filter_scope="recently-used")
for ep in endpoints:
    print(f"{ep['display_name'] or ep['canonical_name']} ({ep['id']})")

## Endpoint details

You can also retrieve complete information about an endpoint, including name, owner, and configuration details. 

In [None]:
endpoint = tc.get_endpoint(tutorial_endpoint_1)
print(f"Display Name: {endpoint['display_name']}")
print(f"Owner       : {endpoint['owner_string']}")
print(f"ID          : {endpoint['id']}")
print(f"Network Use - Concurrency = {endpoint['preferred_concurrency']}")
print(f"Network Use - Parallelism = {endpoint['preferred_parallelism']}")

# File operations

## Autoactivate Endpoints
Globus endpoints must be "activated" before they can be used, which means associating a credential with the endpoint that is used to log in to that endpoint. Before performing operations against an endpoint, you should "autoactivate" it. On Globus Connect Personal endpoints and shared collections, autoactivation will automatically create the necessary credentials to access the endpoint. For endpoints that require activation (i.e., those with mapped collections) you can activate those endpoints via the Globus website. Here we autoactivate a Globus tutorial endpoint using its endpoint ID.

In [None]:
# help(tc.endpoint_autoactivate)
response = tc.endpoint_autoactivate(tutorial_endpoint_1)
print(f"Response: {response['code']}\n{response['message']}")

## Get a directory listing

Having activated an endpoint, you can now perform operations on it. For example, you can run an `ls` operation to retrieve directory contents.

In [None]:
# help(tc.operation_ls)
endpoint_id = tutorial_endpoint_1
endpoint_path = "/share/godata/"
response = tc.operation_ls(endpoint_id, path=endpoint_path)
print(f"==== 'ls' for {endpoint_path} on endpoint {endpoint_id} ====")
for item in response:
    print(f"{item['type']}: {item['name']} [{item['size']}]")

## Make directory

You can create a new directory.

In [None]:
# help(tc.operation_mkdir)
try:
    new_path = "/~/tutorial_dir"
    mkdir_result = tc.operation_mkdir(endpoint_id, path=new_path)
    print(mkdir_result['message'])
except globus_sdk.GlobusAPIError as error:
    print(f"Error code: {error.code}\nError message: {error.message}")

## Rename

You can rename files and directories on your endpoints. 

In [None]:
# help(tc.operation_rename)
try:
    response = tc.operation_rename(endpoint_id, oldpath="/~/tutorial_dir", newpath="/~/tutorial_dir_renamed")
    print(response['message'])
except globus_sdk.GlobusAPIError as error:
    print(f"Error code: {error.code}\nError message: {error.message}")

# Task submission and management

The Globus task interface allows you to create and manage asynchronous file transfer and deletion tasks. 

## Transfer

Creating a transfer is a two stage process. First you must create a description of the data you want to transfer (which also creates a unique submission_id), and then you can submit the request to Globus to transfer that data. 

If the submit_transfer fails, you can safely resubmit the same transfer_data again. The submission_id will ensure that this transfer request will be submitted once and only once.

In [None]:
# help(tc.submit_transfer)
source_endpoint_id = tutorial_endpoint_1
source_path = "/share/godata/"

dest_endpoint_id = tutorial_endpoint_2
dest_path = "/~/"

label = "My tutorial transfer"

# TransferData() automatically gets a submission_id for once-and-only-once submission
tdata = globus_sdk.TransferData(tc, source_endpoint_id, dest_endpoint_id, label=label)

# Recursively transfer source path contents
tdata.add_item(source_path, dest_path, recursive=True)

# Alternatively, transfer a specific file
# tdata.add_item("/source/path/file.txt", "/dest/path/file.txt")

# Ensure endpoints are activated
tc.endpoint_autoactivate(source_endpoint_id)
tc.endpoint_autoactivate(dest_endpoint_id)

submit_result = tc.submit_transfer(tdata)
print(f"Task ID: {submit_result['task_id']}")
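To see why resubmitting with the same submission_id is safe, it can help to look at a toy model of the idea. The class below is an in-memory stand-in written for this illustration; it is not how the Globus service is implemented, and the class, its methods, and the `"Accepted"` / `"Duplicate"` result codes are assumptions chosen for the sketch.

```python
import uuid


class ToyTaskService:
    """In-memory stand-in for a service that deduplicates on submission_id."""

    def __init__(self):
        self._task_by_submission = {}

    def get_submission_id(self):
        # The client asks for an ID first, then attaches it to the request.
        return str(uuid.uuid4())

    def submit(self, submission_id, payload):
        # A repeated submission_id returns the task created the first time.
        if submission_id in self._task_by_submission:
            return {"code": "Duplicate", "task_id": self._task_by_submission[submission_id]}
        task_id = str(uuid.uuid4())
        self._task_by_submission[submission_id] = task_id
        return {"code": "Accepted", "task_id": task_id}


service = ToyTaskService()
submission_id = service.get_submission_id()
first = service.submit(submission_id, {"label": "My tutorial transfer"})
# Simulate a retry after, say, a lost response: same submission_id, same payload.
retry = service.submit(submission_id, {"label": "My tutorial transfer"})
print(first["task_id"] == retry["task_id"])
```

The retry yields the same task rather than a second one, which is the once-and-only-once property described above.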

## Get Task By ID

While the task is running, or after completion, you can get information that describes the transfer task. 

In [None]:
response = tc.get_task(submit_result['task_id'])
print(f"Label: {response['label']}")
print(f"Status: {response['status']}")
print(f"Transfer: {response['source_endpoint_display_name']} -> {response['destination_endpoint_display_name']}")
    
if response.data["status"] == "SUCCEEDED":
    print(f"Bytes transferred: {response['bytes_transferred']}")
    print(f"Files transferred: {response['files_transferred']}")
    print(f"Transfer rate: {response['effective_bytes_per_second']} Bps")

## Check destination endpoint

After the transfer has finished, you can list the contents of the destination endpoint.

In [None]:
ls_iter = tc.operation_ls(dest_endpoint_id, path=dest_path)
print(f"==== 'ls' for {dest_path} on endpoint {dest_endpoint_id} ====")
for item in ls_iter:
    print(f"{item['type']}: {item['name']} [{item['size']}]")

## Get task list

You can get a list of past or current tasks with the following call. Note that only `TRANSFER` tasks are returned by default - the type filter is necessary to get `DELETE` tasks as well. This is a remnant of a legacy backward compatibility concern, and will likely be changed in the future to both being included by default.

In [None]:
# help(tc.task_list)
response = tc.task_list(num_results=10, filter="type:TRANSFER,DELETE")
for item in response:
    print(item['status'],
          item['task_id'], 
          item['type'],
          item['source_endpoint_display_name'],
          item['destination_endpoint_display_name'],
          item['label'])

## Filter task list

Retrieve only active tasks:

In [None]:
response = tc.task_list(num_results=10, filter="type:TRANSFER,DELETE/status:ACTIVE")
for item in response:
    print(item['status'],
          item['task_id'], 
          item['type'],
          item['source_endpoint_display_name'],
          item['destination_endpoint_display_name'],
          item['label'])

See [Common Query Parameters](https://docs.globus.org/api/transfer/task/#common_query_parameters) for a description of the `filter` parameter, and [Task List filters](https://docs.globus.org/api/transfer/task/#filter_and_order_by_options) for details of what is supported by task list.
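The filter strings used in the two cells above follow a visible pattern: each field is written as `name:value1,value2`, and fields are joined with `/`. A small helper can assemble such strings from keyword arguments. This is a sketch based only on that pattern; `build_task_filter` is a hypothetical name, and the linked documentation remains the authority on which fields and operators are supported.

```python
def build_task_filter(**fields):
    """Join field filters as "name:v1,v2/name2:v3", the form used in the cells above."""
    parts = []
    for name, values in fields.items():
        # Accept either a single string or a list of strings per field.
        if isinstance(values, str):
            values = [values]
        parts.append(f"{name}:{','.join(values)}")
    return "/".join(parts)


task_filter = build_task_filter(type=["TRANSFER", "DELETE"], status="ACTIVE")
print(task_filter)
```

The resulting string could then be passed as the `filter` argument in the same way as the literal strings above.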

## Cancel task

You can also cancel a running task. 

In [None]:
# help(tc.cancel_task)
response = tc.cancel_task(submit_result['task_id'])
print(f"{response['code']}: {response['message']}")

## Get event list for task

Every task stores periodic event markers (e.g., errors, performance markers, etc.). You can retrieve and filter this list as follows. 

In [None]:
# help(tc.task_event_list)
response = tc.task_event_list(submit_result['task_id'], num_results=10)
for event in response:
    print(event['time'], event['code'], event['is_error'], event['details'])


## Delete files task

File deletion is also an asynchronous task that is submitted and monitored much like a transfer task.

In [None]:
# help(tc.submit_delete)
# Create a folder, delete it, wait for completion
endpoint_id = tutorial_endpoint_1
path = "/~/tutorial_delete_example"
try:
    tc.operation_mkdir(endpoint_id, path=path)
except globus_sdk.GlobusAPIError as error:
    if "Exists" in error.code:
        print("Directory already exists, ignoring error")
    else:
        raise

label = "My tutorial delete"

# DeleteData() automatically gets a submission_id for once-and-only-once submission
# Note that recursive is a top level option for delete, not a per-path option like
# it is for transfers.
ddata = globus_sdk.DeleteData(tc, endpoint_id, label=label, recursive=True)

# Recursively delete path contents (because of recursive flag set above)
ddata.add_item(path)

# Ensure endpoint is activated
tc.endpoint_autoactivate(endpoint_id)

submit_result = tc.submit_delete(ddata)
print(f"Task ID: {submit_result['task_id']}")

## Wait for task to complete

Transfer and delete tasks are asynchronous operations, and depending on their size may take a long time to complete. If you wish to wait for a task to complete, the TransferClient provides a task_wait helper method:

In [None]:
# Wait for a task to finish for 10 minutes, polling every 15 seconds.
completed = tc.task_wait(submit_result['task_id'], timeout=600, polling_interval=15)
if completed:
    print("Task finished!")
else:
    print("Task still running after timeout reached.")
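Waiting with a timeout and a polling interval is a general pattern that can be sketched independently of the SDK. The function below is not how `task_wait` is implemented; it is a generic loop under the assumption that some callable, here the hypothetical `is_done`, reports whether the task has finished. The `sleep` and `clock` parameters are injected only so the example can run instantly.

```python
import time


def wait_until_done(is_done, timeout=600, polling_interval=15,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll is_done() until it returns True or the timeout elapses.

    Returns True if the work finished in time, False otherwise.
    """
    deadline = clock() + timeout
    while True:
        if is_done():
            return True
        if clock() >= deadline:
            return False
        sleep(polling_interval)


# Illustrative status sequence standing in for repeated status lookups.
statuses = iter(["ACTIVE", "ACTIVE", "SUCCEEDED"])
finished = wait_until_done(
    lambda: next(statuses) != "ACTIVE",
    timeout=5,
    polling_interval=0,
    sleep=lambda seconds: None,  # skip real sleeping in this demo
)
print(finished)
```

With a real client, `is_done` might compare the `status` field returned by `tc.get_task` against `"ACTIVE"`.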


# Bookmarks

Bookmarks allow you to keep a list of frequently used endpoints and paths. Full management capabilities (create, retrieve, update, delete) are supported on bookmarks. Note that the REST API itself does not directly support bookmarks when performing operations. It is the responsibility of the client to let users choose bookmarks and then translate them to endpoint IDs and paths in order to perform ls operations and submit transfers. In particular, the www.globus.org website has full support for bookmarks.

## Create a Bookmark

In [None]:
bookmark_name = "My Tutorial Bookmark"
endpoint_id = tutorial_endpoint_1
endpoint_path = "/share/godata/"
response = tc.create_bookmark({"endpoint_id": endpoint_id, "path": endpoint_path,"name": bookmark_name})
bookmark_id = response['id']
print(response)

## Get a list of bookmarks

In [None]:
response = tc.bookmark_list()
for b in response:
    print(b['name'], b['path'], b['id'])
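As noted at the start of this section, translating a bookmark into an endpoint ID and path is the client's job. A minimal sketch of that lookup is shown below, assuming each bookmark record carries the `name`, `endpoint_id`, and `path` fields used when the bookmark was created. `resolve_bookmark` is a hypothetical helper, and the sample record (including its `id`) is made up for illustration.

```python
def resolve_bookmark(bookmarks, name):
    """Return (endpoint_id, path) for the bookmark with the given name."""
    for bookmark in bookmarks:
        if bookmark["name"] == name:
            return bookmark["endpoint_id"], bookmark["path"]
    raise KeyError(f"No bookmark named {name!r}")


# Illustrative record shaped like the bookmark created above.
sample_bookmarks = [
    {
        "id": "hypothetical-bookmark-id",
        "name": "My Tutorial Bookmark",
        "endpoint_id": "ddb59aef-6d04-11e5-ba46-22000b92c6ec",
        "path": "/share/godata/",
    },
]

resolved_endpoint_id, resolved_path = resolve_bookmark(sample_bookmarks, "My Tutorial Bookmark")
print(resolved_endpoint_id, resolved_path)
```

The resolved pair could then be handed to `tc.operation_ls` or used when adding items to a transfer.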

## Update a bookmark


In [None]:
bookmark_data = {
    'name': 'My Updated Tutorial Bookmark'
}
response = tc.update_bookmark(bookmark_id, bookmark_data)
print(response)

## Delete a Bookmark

In [None]:
response = tc.delete_bookmark(bookmark_id)
print(response)

# Shared endpoints

Shared endpoints are virtual endpoints that refer to a particular "host endpoint" and path, which allows Globus to manage access to that shared endpoint. Folders on the shared endpoint can be easily shared with other Globus users and groups via access control rules.

## Create a shared endpoint

In [None]:
# Create a dir to share
host_endpoint_id = tutorial_endpoint_1
host_endpoint_path = "/~/shared_dir2"
try:
    response = tc.operation_mkdir(host_endpoint_id, path=host_endpoint_path)
except globus_sdk.GlobusAPIError as error:
    # Ignore the error if the directory already exists, otherwise raise
    if "Exists" not in error.code:
        raise

# Define the shared endpoint 
shared_ep = {"DATA_TYPE": "shared_endpoint",
             "host_endpoint": host_endpoint_id,
             "host_path": host_endpoint_path,
             "display_name":"My Tutorial Shared Endpoint2",
             # optionally specify additional endpoint fields
             "description": "Test creating a share from globus-jupyter-notebook"
             }

response = tc.create_shared_endpoint(shared_ep)
print(f"{response['code']}: {response['message']}")
print(f"Endpoint ID: {response['id']}")
shared_endpoint_id = response['id']

## Get endpoint information

In [None]:
response = tc.get_endpoint(shared_endpoint_id)
print(f"Display name: {response['display_name']}")
print(f"Owner: {response['owner_string']}")
print(f"Host Endpoint ID: {response['host_endpoint_id']}")

## Get a list of shared endpoints

In [None]:
endpoints = tc.endpoint_search(filter_scope="shared-by-me")
print("==== Displaying shared endpoints ===")
for ep in endpoints:
    print(f"{ep['display_name']} ({ep['id']})")


## Add a new access control rule

You can share access to different paths within your shared endpoint with users, groups, or publicly. The principal_type can be one of 'identity', 'group', 'all_authenticated_users', or 'anonymous'.  Each access rule is given a unique access_rule_id, which can be used to manage that access rule.

Here is an example of sharing with the tutorial users group. 

In [None]:
rule_data = {
    'DATA_TYPE': 'access',
    'permissions': 'rw',
    'principal' : tutorial_users_group,  # use this if sharing with a group of users
    'principal_type' : 'group',  # use this if sharing with a group of users
    #'principal': 'IDENTITY_ID',  # use this if sharing with a single user (identity)
    #'principal_type': 'identity',  # use this if sharing with a single user (identity)
    'path': '/'
}

try:
    response = tc.add_endpoint_acl_rule(shared_endpoint_id, rule_data)
    access_rule_id = response['access_id']
    print(response)
except globus_sdk.GlobusAPIError as error:
    if "Exists" in error.code:
        print("ACL already exists, ignoring error")
    else:
        raise
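Since the rule document is a plain dictionary, a typo in `principal_type` is only discovered when the API rejects the request. The sketch below builds the same structure with an early check against the four principal types named above. `make_access_rule` is a hypothetical helper, and the permissions check covers only the two values that appear in this notebook (`'r'` and `'rw'`).

```python
VALID_PRINCIPAL_TYPES = {"identity", "group", "all_authenticated_users", "anonymous"}
KNOWN_PERMISSIONS = {"r", "rw"}  # the values used in this notebook


def make_access_rule(principal_type, principal="", path="/", permissions="r"):
    """Build an access rule document, failing early on unrecognized values."""
    if principal_type not in VALID_PRINCIPAL_TYPES:
        raise ValueError(f"Unknown principal_type: {principal_type!r}")
    if permissions not in KNOWN_PERMISSIONS:
        raise ValueError(f"Unknown permissions: {permissions!r}")
    return {
        "DATA_TYPE": "access",
        "principal_type": principal_type,
        "principal": principal,
        "path": path,
        "permissions": permissions,
    }


group_rule = make_access_rule(
    "group",
    principal="50b6a29c-63ac-11e4-8062-22000ab68755",  # the Tutorial Users group
    permissions="rw",
)
print(group_rule)
```

The returned dictionary has the same shape as `rule_data` above and could be passed to `tc.add_endpoint_acl_rule` in its place.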

## Get list of access rules

In [None]:
response = tc.endpoint_acl_list(shared_endpoint_id)
for rule in response:
    print(rule['id'], rule['principal_type'], rule['principal'], rule['permissions'], rule['path'])

## Get access rule by id

Get the access rule details using its access_rule_id.

In [None]:
response = tc.get_endpoint_acl_rule(shared_endpoint_id, access_rule_id)
print(response)

## Update access rule

Update an access rule using its access_rule_id.

In [None]:
rule_update = {
    'DATA_TYPE': 'access',
    'permissions': 'r',
}
response = tc.update_endpoint_acl_rule(shared_endpoint_id, access_rule_id, rule_update)
print(response)

## Delete access rule

Delete an access rule using its access_rule_id.

In [None]:
response = tc.delete_endpoint_acl_rule(shared_endpoint_id, access_rule_id)
print(response)

# Low-level SDK interface

The helper methods are all built on top of the low level interface. If a helper method is not yet available for the API resource you wish to use, the low level interface can be used directly.

Note that the examples in this section use the endpoint management API resources, which DO have helper methods, but they still serve as good examples for how to use the low level interface.

## POST request

Create an endpoint using the low level API:

In [None]:
endpoint_data = {
    "DATA_TYPE": "endpoint",
    "display_name": "Tutorial Create Example",
    "public": False,
    "DATA": [
        {
            "DATA_TYPE": "server",
            "hostname": "gridftp.example.org",
        }
    ]
}
response = tc.post("/endpoint", json_body=endpoint_data)
endpoint_id = response['id']
print(response)

## GET request

Do a GET on the newly created endpoint:

In [None]:
response = tc.get(f"/endpoint/{endpoint_id}",
           params=dict(fields="id,display_name,description"))
print(response)

## PUT request

Update the description on the newly created endpoint:

In [None]:
endpoint_update = {
    "description": "Test updating description using low level API"
}
response = tc.put(f"/endpoint/{endpoint_id}", json_body=endpoint_update)
print(response)

## DELETE request

Now delete the endpoint:

In [None]:
response = tc.delete(f"/endpoint/{endpoint_id}")
print(response)