<img src="img/jupyterhub_integration_header.png">

This tutorial demonstrates how to (a) pull Globus Auth tokens from the Jupyter Notebook Server environment, and (b) use those tokens to interact with different REST APIs secured with Globus Auth. The notebook implements a simple data flow. 

You launched this notebook using the Globus-enabled JupyterHub environment, so the following have already happened:
1. You have established your identity by authenticating with an institutional credential, ORCID, or similar
1. You have granted consent to the issuance of tokens with certain scopes
1. A notebook has been created, with access to those tokens

<img src="img/jupyterhub_tokens.png" alt="Steps followed prior to starting this notebook" align="CENTER" style="width: 85%;"/>

<strong>Notes</strong>
1. You need to join the [Tutorial Users Group](https://app.globus.org/groups/50b6a29c-63ac-11e4-8062-22000ab68755) in order to access the shared endpoint at the end of the tutorial.
1. Our code here is deliberately verbose, for clarity; much could be encapsulated in Python packages to simplify the notebook.
1. Tokens are issued and stored in the JupyterHub database at login. They typically expire in 24 hours. As we do not provide a mechanism to handle refresh tokens, the simplest way to get new ones is to:
 - Stop your server (see the Control Panel)
 - Log out
 - Log back in
 - Start your server
 - Launch the notebook

# Import Packages

In [None]:
# used to get the tokens
import os
import pickle
import base64

# much of what we deal with is JSON
import json

# we're going to make explicit HTTPS calls
import requests

# required to work with our example data
import csv
import datetime
import matplotlib.pyplot as plt
from io import StringIO

# Get Tokens

The Globus-enabled JupyterHub passes the tokens into the notebook environment as a pickled Python dictionary, `base64` encoded and assigned to the `GLOBUS_DATA` environment variable. We'll grab the variable and unpack it.

In [None]:
# get Globus Auth token data
globus_token_data = os.getenv('GLOBUS_DATA')

In [None]:
# now extract the pickled tokens
pickled_tokens = base64.b64decode(globus_token_data)

# Unpickle and get the dictionary
tokens = pickle.loads(pickled_tokens)

# Minimal sanity check, did we get the data type we expected?
isinstance(tokens, dict)
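
The two cells above can be folded into one helper that fails loudly when the variable is missing. This is only a minimal sketch: `load_globus_tokens` is a hypothetical name, and the sample dictionary is made up so the round trip can be shown without real credentials.

```python
import base64
import os
import pickle


def load_globus_tokens(encoded=None):
    """Decode a base64-encoded, pickled token dictionary.

    If `encoded` is None, fall back to the GLOBUS_DATA environment variable.
    """
    if encoded is None:
        encoded = os.getenv('GLOBUS_DATA')
    if not encoded:
        raise RuntimeError('GLOBUS_DATA is not set or is empty')
    # Only unpickle data from a source you trust: unpickling can run code.
    data = pickle.loads(base64.b64decode(encoded))
    if not isinstance(data, dict):
        raise TypeError('expected a dict, got ' + type(data).__name__)
    return data


# Round-trip a made-up dictionary through the same encoding.
sample = {'tokens': {'example.org': {'access_token': 'not-a-real-token'}}}
encoded_sample = base64.b64encode(pickle.dumps(sample)).decode('ascii')
print(load_globus_tokens(encoded_sample) == sample)
```

In the notebook itself, calling `load_globus_tokens()` with no argument would read `GLOBUS_DATA` from the environment.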

## Introspect Tokens

The number of tokens depends on the JupyterHub configuration. For this tutorial, our identity token is a __[JSON Web Token (JWT)](https://tools.ietf.org/html/rfc7519)__. We also have tokens for different Resource Servers and scopes, including tokens for retrieving our profile from Globus Auth, accessing the Petrel HTTPS server, and accessing the Globus Transfer service.
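
Because a JWT is three base64url segments joined by dots, its claims can be read with the standard library alone. The sketch below decodes the payload of a made-up, unsigned token; it does not verify the signature, so it is suitable for inspection only. Where the identity token sits in the `tokens` dictionary is not shown here and depends on the hub configuration, so `decode_jwt_payload` (a hypothetical helper) simply takes the token string.

```python
import base64
import json


def decode_jwt_payload(jwt):
    """Return the claims of a JWT as a dict, WITHOUT verifying the signature."""
    header_b64, payload_b64, signature_b64 = jwt.split('.')
    # base64url omits the trailing '=' padding, so restore it before decoding
    padded = payload_b64 + '=' * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url(obj):
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(obj).encode('utf-8')
    return base64.urlsafe_b64encode(raw).rstrip(b'=').decode('ascii')


# Build a made-up token with an empty signature segment.
sample_jwt = '.'.join([_b64url({'alg': 'none'}),
                       _b64url({'sub': 'hypothetical-user-id'}),
                       ''])
claims = decode_jwt_payload(sample_jwt)
print(claims['sub'])
```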

In [None]:
print(json.dumps(tokens, indent=4, sort_keys=True))
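
Printing the whole dictionary also prints the access tokens themselves, which is worth avoiding in a notebook that may be shared. A hedged alternative is to summarize each entry instead. The sketch assumes each entry under `tokens['tokens']` carries `scope` and `expires_at_seconds` fields; those field names are assumptions, so check them against the output above. `summarize_tokens` is a hypothetical helper and the sample data is made up.

```python
import time


def summarize_tokens(token_data, now=None):
    """Return one summary dict per resource server, leaving out the secrets."""
    now = time.time() if now is None else now
    summary = []
    for resource_server, entry in sorted(token_data.get('tokens', {}).items()):
        expires_at = entry.get('expires_at_seconds')  # assumed field name
        summary.append({
            'resource_server': resource_server,
            'scope': entry.get('scope'),  # assumed field name
            'seconds_left': None if expires_at is None else int(expires_at - now),
        })
    return summary


# Made-up data in the layout used by this notebook.
sample_tokens = {'tokens': {
    'auth.globus.org': {'access_token': 'not-a-real-token',
                        'scope': 'openid profile',
                        'expires_at_seconds': 1000},
}}
for row in summarize_tokens(sample_tokens, now=400):
    print(row)
```

A negative `seconds_left` would indicate an expired token, in which case the log-out and log-in steps from the notes at the top apply.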

# B. Use Tokens

Now we can talk to different servers. In this tutorial, we show how tokens can be passed as HTTP headers. Much of this can also be done with the __[Globus Python SDK](http://globus-sdk-python.readthedocs.io/en/stable/)__.

## Get User Information

First, let's use our `auth.globus.org` token to get our __[OAuth2 user information](https://docs.globus.org/api/auth/reference/#get_or_post_v2_oauth2_userinfo_resource)__. We assemble the header with the appropriate access token, and do an HTTP `GET` on the resource.

In [None]:
# base URL for Globus Auth API
auth_base_url = 'https://auth.globus.org/v2'

# Create the header
headers = {'Authorization':'Bearer '+ tokens['tokens']['auth.globus.org']['access_token']}

# Get the user info as JSON
# (auth_base_url already ends in /v2, so it must not be repeated in the path)
user_info = requests.get(auth_base_url + '/oauth2/userinfo', headers=headers).json()

# Look at the response
print(json.dumps(user_info, indent=4, sort_keys=True))
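
The same header pattern recurs for every service in this notebook, so it can be wrapped in a small function. This is a sketch only: `bearer_headers` is a hypothetical helper, and the sample dictionary is made up but follows the `tokens['tokens'][resource_server]['access_token']` layout used above.

```python
def bearer_headers(token_data, resource_server, json_body=False):
    """Build HTTP headers carrying the access token for one resource server."""
    try:
        access_token = token_data['tokens'][resource_server]['access_token']
    except KeyError:
        raise KeyError('no access token for ' + repr(resource_server)) from None
    headers = {'Authorization': 'Bearer ' + access_token}
    if json_body:
        # Only needed when the request body is JSON.
        headers['Content-Type'] = 'application/json'
    return headers


# Made-up token data for illustration.
sample_tokens = {'tokens': {'auth.globus.org': {'access_token': 'not-a-real-token'}}}
print(bearer_headers(sample_tokens, 'auth.globus.org'))
```

With the real dictionary, `bearer_headers(tokens, 'auth.globus.org')` would produce the same headers as the cell above.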

## Get Identity Information

Using the __[Globus Auth API resource for identities](https://docs.globus.org/api/auth/reference/#v2_api_identities_resources)__, we perform a `GET` on a specific identity, our own, to examine its properties:

In [None]:
identity = requests.get(auth_base_url + '/api/identities/' + user_info['sub'], headers=headers).json()
print(json.dumps(identity, indent=4, sort_keys=True))

----
# C. Implement a Simple Data Flow

Our example below implements a simple, but common, data flow which includes the following steps, as illustrated below:

1. Fetch some data from a remote location, in this case the __[Petrel](http://petrel.alcf.anl.gov)__ data server at Argonne National Lab
1. Plot the retrieved data
1. Save the plot to a remote web server
1. Share a link to data on the web server
<img src="img/graph_plot_flow.png" alt="Steps in the simple workflow" align="LEFT" style="width: 85%;"/>

## 1. Get CSV Data via HTTPS

We replicate here some of the flow from the __[Modern Research Data Portal](https://mrdp.globus.org)__ design pattern and tutorial. In particular, we retrieve some weather data in a CSV file that has column names in the first row.

In [None]:
# GET the CSV from the publicly accessible HTTPS GCS endpoint
resp = requests.get('https://a4969.36fe.dn.glob.us/portal/catalog/dataset_las/1952.csv').text
csv_rows = csv.DictReader(StringIO(resp))

# inspect the header line
print(','.join(csv_rows.fieldnames))

In [None]:
# pull CSV data into lists for plotting
dates = [] # from column DATE
max_temps = [] # from column TMAX
min_temps = [] # from column TMIN

for row in csv_rows:
    # DATE is formatted YYYYMMDD
    dates.append(datetime.date(int(row['DATE'][:4]), int(row['DATE'][4:6]), int(row['DATE'][6:])))
    max_temps.append(int(row['TMAX']))
    min_temps.append(int(row['TMIN']))
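
As an alternative to slicing the `DATE` string by hand, `datetime.strptime` can parse the `YYYYMMDD` format directly. The sketch below runs on a two-row inline CSV with made-up values, so it does not touch the variables used by the notebook; `parse_row` is a hypothetical helper.

```python
import csv
import datetime
from io import StringIO

# Made-up sample with the same column names as the tutorial data.
sample_csv = "DATE,TMAX,TMIN\n19520101,52,30\n19520102,55,33\n"


def parse_row(row):
    """Convert one CSV row into (date, max_temp, min_temp)."""
    day = datetime.datetime.strptime(row['DATE'], '%Y%m%d').date()
    return day, int(row['TMAX']), int(row['TMIN'])


parsed = [parse_row(r) for r in csv.DictReader(StringIO(sample_csv))]
# Transpose the list of tuples into three parallel sequences for plotting.
sample_dates, sample_max, sample_min = zip(*parsed)
print(sample_dates[0], sample_max[0], sample_min[0])
```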

## 2. Plot CSV Data

In [None]:
# we will save the generated plot to this file
plot_filename = "temp_plot.png"

# generate the plot
plt.figure(figsize=(16,8))
plt.plot(dates, min_temps, label = "Min Temp")
plt.plot(dates, max_temps, label = "Max Temp")
plt.xlabel('Date YYYY-MM')
plt.ylabel('Temperature')
plt.title('Maximum and Minimum Temperatures: ' + str(dates[0])[:4])
plt.grid(True)
plt.savefig(plot_filename)
plt.show()

## 3. Save Plot on Globus Endpoint

For these Globus Transfer API interactions, we use the Globus Transfer token. As the Globus Transfer API does not support __[`application/x-www-form-urlencoded`](https://docs.globus.org/api/transfer/overview/#document_formats)__ data, we are explicit about the JSON we pass.

At this point, we will:

1. Activate the endpoint we're using for the tutorial
1. Create a directory for our file
1. `PUT` our plot there
1. Generate a link to our plot and view it

Note, the calls to Transfer can be made via the Globus SDK, with argument validation, etc. For this tutorial we're using a direct HTTP request for pedagogical purposes.

In [None]:
# base URL for the Globus Transfer API
transfer_base_url = 'https://transfer.api.globus.org/v0.10'

# ID of the endpoint that we're using for the tutorial
endpoint_uuid = 'e56c36e4-1063-11e6-a747-22000bf2d559'
endpoint_url = 'https://testbed.petrel.host'
endpoint_data_path = '/globus/tutorials/jupyterhub_data_flow_images/'

# define headers for HTTP request
headers = {'Authorization':'Bearer '+ tokens['tokens']['transfer.api.globus.org']['access_token'],
          "Content-Type" : "application/json"}

In [None]:
# get our username from JupyterHub (which includes a hash to avoid collision)
username = os.getenv('JUPYTERHUB_USER')
print("My user name is " + username)

In [None]:
# autoactivate the endpoint
resp = requests.post(
  transfer_base_url + '/endpoint/' + endpoint_uuid + '/autoactivate',
  headers=headers)
print(resp.status_code)
print(resp.text)

In [None]:
# call the Transfer API to make a directory
mkdir_payload = {"DATA_TYPE": "mkdir", "path": endpoint_data_path + username}
resp = requests.post(
  transfer_base_url + '/endpoint/' + endpoint_uuid + '/mkdir',
  headers=headers,
  json=mkdir_payload)
if resp.status_code != 502: # a 502 is treated as "directory already exists" and ignored
  print(resp.status_code)
  print(resp.text)
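
Checking only the status code can hide other failures that share the same status. A slightly more defensive sketch inspects the `code` field of the JSON response body as well. It assumes that the Transfer API reports an existing directory with an error code ending in `.Exists`; that suffix, the sample bodies, and the helper name `mkdir_outcome` are assumptions to verify against the actual `resp.text`.

```python
def mkdir_outcome(status_code, body):
    """Classify a mkdir response as 'created', 'exists', or 'error'."""
    code = str(body.get('code', ''))
    if 200 <= status_code < 300:
        return 'created'
    if code.endswith('.Exists'):  # assumed error-code suffix
        return 'exists'
    return 'error'


# Made-up response bodies for illustration.
print(mkdir_outcome(202, {'code': 'DirectoryCreated'}))
print(mkdir_outcome(502, {'code': 'ExternalError.MkdirFailed.Exists'}))
print(mkdir_outcome(502, {'code': 'ExternalError.MkdirFailed.PermissionDenied'}))
```

In the notebook this could be called as `mkdir_outcome(resp.status_code, resp.json())`, printing the response only when the outcome is `'error'`.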

## 4. Upload Plot to a Shared Endpoint
We now upload the image file to the Petrel data server at Argonne National Laboratory. This Globus endpoint supports HTTPS access, so we can `PUT` the plot image file there directly. If this were a large file (or many files), we might want to use a Globus Transfer request instead.

In [None]:
# define headers for HTTP request
headers = {'Authorization':'Bearer '+ tokens['tokens'][endpoint_uuid]['access_token']}

# open the file in a context manager so the handle is closed after the PUT
with open(plot_filename, 'rb') as image_data:
    # PUT the file into our directory on the shared endpoint
    # (note the '/' between the directory name and the file name)
    resp = requests.put(endpoint_url + endpoint_data_path + username + '/' + plot_filename,
                        headers=headers, data=image_data, allow_redirects=False)
print(resp.status_code)

## 5. Share a Link to the Plot File

Let's look at a link to the file. Clicking the link will require you to authenticate to the Globus endpoint since your browser is a different client than this notebook server.

In [None]:
print(endpoint_url + endpoint_data_path + username + '/' + plot_filename)
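
Joining URL pieces with `+` makes it easy to drop or double a slash, and a user name may contain characters that need escaping. A minimal sketch of a safer join follows; `object_url` is a hypothetical helper and `hypothetical user` is a made-up user name.

```python
from urllib.parse import quote


def object_url(base_url, *path_parts):
    """Join path parts onto base_url with single slashes, escaping each segment."""
    segments = []
    for part in path_parts:
        # Split on '/' and drop empty pieces so stray slashes do not matter.
        segments.extend(s for s in part.split('/') if s)
    return base_url.rstrip('/') + '/' + '/'.join(quote(s, safe='') for s in segments)


print(object_url('https://testbed.petrel.host',
                 '/globus/tutorials/jupyterhub_data_flow_images/',
                 'hypothetical user',
                 'temp_plot.png'))
```

With the notebook's variables, the call would be `object_url(endpoint_url, endpoint_data_path, username, plot_filename)`.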

We can also look at the folders and permissions on the __[shared endpoint](https://app.globus.org/file-manager/collections/e56c36e4-1063-11e6-a747-22000bf2d559/sharing?back=endpoints)__.