# JupyterHub Integration Demonstration
----

This tutorial demonstrates how to (A) pull Globus Auth tokens from the Jupyter Notebook Server environment and use those tokens to (B) interact with different REST APIs secured with Globus Auth, including (C) a simple workflow. The code here is deliberately explicit, for clarity; much of it could be encapsulated in Python packages to simplify the notebook.

Since you used the Globus-enabled JupyterHub environment to launch this notebook, the following have already happened:
1. You have established your identity by authenticating with an institutional credential, ORCID, or similar
1. You have granted consent to the issuance of tokens with certain scopes
1. A notebook has been created, with access to those tokens

<img src="img/jupyterhub_tokens.png" alt="Steps followed prior to starting this notebook" align="CENTER" style="width: 85%;"/>

<em>Note 1: Tokens are issued and stored in the JupyterHub database at login. They typically expire in 24 hours. As we do not provide a mechanism to handle refresh tokens, the simplest way to get new ones is to:

* Stop your server (see the Control Panel)
* Log out
* Log back in
* Start your server
* Launch the notebook

Note 2: You need to join the tutorial group at https://www.globus.org/app/groups/50b6a29c-63ac-11e4-8062-22000ab68755/about to be able to access the shared endpoint at the end.</em>

## Import Packages

In [None]:
# These are to get the tokens
import os
import pickle
import base64

# Much of what we're dealing with is JSON
import json

# We're going to be making explicit HTTPS calls
import requests

# This is to work with data for our example
import csv
import datetime
import matplotlib.pyplot as plt
from io import StringIO

## A. Get Tokens

The Globus-enabled JupyterHub passes the tokens into the notebook environment as a pickled Python dictionary, `base64`-encoded and assigned to the `GLOBUS_DATA` environment variable. We'll grab the variable and unpack it.

In [None]:
# Get the content
globus_env_data = os.getenv('GLOBUS_DATA')

In [None]:
# Now we have the pickled tokens
pickled_tokens = base64.b64decode(globus_env_data)

# Unpickle and get the dictionary
tokens = pickle.loads(pickled_tokens)

# Minimal sanity check, did we get the data type we expected?
isinstance(tokens, dict)
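
The decoding steps above can be wrapped in one small function that fails with a clear message when the variable is missing or holds something other than a dictionary. This is only a sketch: `decode_globus_data` and the fake environment value are hypothetical names introduced for illustration, and the example encodes its own stand-in dictionary rather than reading real tokens. Because `pickle.loads` can execute arbitrary code, it should only be applied to data from a trusted source such as the hub itself.

```python
import base64
import pickle


def decode_globus_data(env_value):
    """Decode a base64-encoded, pickled dictionary (hypothetical helper)."""
    if not env_value:
        raise RuntimeError("GLOBUS_DATA is not set; was this notebook launched from the hub?")
    # Reverse the two encoding layers: base64 first, then pickle
    decoded = pickle.loads(base64.b64decode(env_value))
    if not isinstance(decoded, dict):
        raise TypeError("expected a dict, got " + type(decoded).__name__)
    return decoded


# Hypothetical stand-in for the value the hub would place in GLOBUS_DATA
fake_env_value = base64.b64encode(
    pickle.dumps({"tokens": {"auth.globus.org": {"access_token": "not-a-real-token"}}})
).decode("ascii")

decoded_example = decode_globus_data(fake_env_value)
print(sorted(decoded_example["tokens"]))
```

In the notebook itself, the call would presumably be `decode_globus_data(os.getenv('GLOBUS_DATA'))`.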

### Look inside the tokens

Depending on the JupyterHub configuration, the number of tokens will vary. For this tutorial, our identity token is a __[JSON Web Token (JWT)](https://tools.ietf.org/html/rfc7519)__. We also have tokens for different Resource Servers and scopes, including tokens for retrieving our profile from Globus Auth, accessing the Petrel HTTPS server, and accessing the Globus Transfer service.

In [None]:
print(json.dumps(tokens, indent=4, sort_keys=True))
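
Printing the whole dictionary leaves the access tokens in the notebook output. A gentler alternative is to list only the resource servers and how long each token has left. The sketch below assumes each entry may carry an `expires_at_seconds` field (a Unix timestamp); that field name is an assumption and is not confirmed by this tutorial, so entries without it are reported as `None`. The `summarize_tokens` helper and the example dictionary are hypothetical.

```python
import datetime


def summarize_tokens(token_data, now=None):
    """Return (resource_server, seconds_remaining or None) pairs, sorted by name."""
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc).timestamp()
    summary = []
    for resource_server, entry in sorted(token_data.get("tokens", {}).items()):
        # Assumed field name; absent fields yield None rather than an error
        expires_at = entry.get("expires_at_seconds")
        remaining = None if expires_at is None else int(expires_at - now)
        summary.append((resource_server, remaining))
    return summary


# Hypothetical data shaped like tokens['tokens'][resource_server]
example_tokens = {
    "tokens": {
        "auth.globus.org": {"access_token": "x", "expires_at_seconds": 1000},
        "transfer.api.globus.org": {"access_token": "y"},
    }
}

for resource_server, remaining in summarize_tokens(example_tokens, now=400):
    print(resource_server, remaining)
```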

## B. Use the tokens

Now we can talk to different servers. In this tutorial, we show how tokens can be passed as HTTP headers. Much of this can also be done with the __[Globus Python SDK](http://globus-sdk-python.readthedocs.io/en/stable/)__.

### Access user information

First, let's use our `auth.globus.org` token to get our __[OAuth2 user information](https://docs.globus.org/api/auth/reference/#get_or_post_v2_oauth2_userinfo_resource)__. We assemble the header with the appropriate access token and do an HTTP `GET` on the resource.

In [None]:
# Create the header
headers = {'Authorization':'Bearer '+ tokens['tokens']['auth.globus.org']['access_token']}

# Get the user info as JSON
user_info = requests.get('https://auth.globus.org/v2/oauth2/userinfo',headers=headers).json()

# Look at the response
print(json.dumps(user_info, indent=4, sort_keys=True))
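
Since the same header pattern recurs for every resource server in this notebook, it could be factored into a small helper. This is a sketch, not part of the original tutorial: `bearer_headers` is a hypothetical name, and the token dictionary below is fake data shaped like `tokens['tokens'][resource_server]['access_token']`.

```python
def bearer_headers(token_data, resource_server, content_type=None):
    """Build an Authorization header for one resource server (hypothetical helper)."""
    access_token = token_data["tokens"][resource_server]["access_token"]
    headers = {"Authorization": "Bearer " + access_token}
    if content_type:
        # Only set Content-Type when the caller sends a body, e.g. JSON
        headers["Content-Type"] = content_type
    return headers


# Fake data for illustration only
fake_tokens = {"tokens": {"auth.globus.org": {"access_token": "abc123"}}}

print(bearer_headers(fake_tokens, "auth.globus.org"))
```

A missing resource server raises `KeyError`, which points directly at a configuration problem rather than producing a confusing HTTP error later.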

### Identities

Using the __[Globus Auth API resource for identities](https://docs.globus.org/api/auth/reference/#v2_api_identities_resources)__, we perform a `GET` on a specific identity, our own, to examine its properties:

In [None]:
identity = requests.get('https://auth.globus.org/v2/api/identities/' + user_info['sub'],headers=headers).json()
print(json.dumps(identity, indent=4, sort_keys=True))

----
## C. Implement a Simple Workflow

We next implement a simple workflow by following these steps (see the figure):

1. Fetch some data from a remote location, in this case the __[Petrel](http://petrel.alcf.anl.gov)__ data server at Argonne National Lab
1. Plot the retrieved data
1. Save the plot to a remote web server
1. Share a link to data on the web server
<img src="img/graph_plot_flow.png" alt="Steps in the simple workflow" align="LEFT" style="width: 85%;"/>

### 1. Pull Down a CSV File

We replicate here some of the flow from the __[Modern Research Data Portal](https://mrdp.globus.org)__ design pattern and tutorial. In particular, we retrieve the climate data for Las Vegas from 1952. This is a CSV file with column names in the first row.

In [None]:
# GET the CSV from the publicly accessible HTTPS GCS endpoint
vegas_climate_csv = requests.get('https://tutorial-https-endpoint.globus.org/portal/catalog/dataset_las/1952.csv').text

vegas_rows = csv.DictReader(StringIO(vegas_climate_csv))

# Look at the header line
print(','.join(vegas_rows.fieldnames))

In [None]:
# Pull the data from the CSV text
vegas_day = [] # DATE
vegas_tmax = [] # TMAX
vegas_tmin = [] # TMIN

for row in vegas_rows:
    day = datetime.date(int(row['DATE'][:4]), int(row['DATE'][4:6]), int(row['DATE'][6:]))
    vegas_day.append(day)
    vegas_tmin.append(int(row['TMIN']))
    vegas_tmax.append(int(row['TMAX']))
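
The manual slicing of `DATE` could also be written with `datetime.datetime.strptime`, which validates the format as it parses. The sketch below is an alternative to the loop above, not the tutorial's own code. It assumes the `YYYYMMDD` layout implied by the slicing, uses made-up sample values rather than the real 1952 data, and skips rows with missing or malformed fields (whether the real file contains any is not known here). `parse_climate_rows` is a hypothetical name.

```python
import csv
import datetime
from io import StringIO


def parse_climate_rows(csv_text):
    """Return (days, tmins, tmaxs) lists, skipping rows that fail to parse."""
    days, tmins, tmaxs = [], [], []
    for row in csv.DictReader(StringIO(csv_text)):
        try:
            day = datetime.datetime.strptime(row["DATE"], "%Y%m%d").date()
            tmin = int(row["TMIN"])
            tmax = int(row["TMAX"])
        except (KeyError, TypeError, ValueError):
            # Missing column, missing field, or non-numeric value: skip the row
            continue
        days.append(day)
        tmins.append(tmin)
        tmaxs.append(tmax)
    return days, tmins, tmaxs


# Made-up sample rows; the second row has an empty TMAX and is skipped
sample_csv = "DATE,TMAX,TMIN\n19520101,122,11\n19520102,,22\n"

sample_days, sample_tmins, sample_tmaxs = parse_climate_rows(sample_csv)
print(sample_days, sample_tmins, sample_tmaxs)
```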

### 2. Plot the Data

In [None]:
# Plot the data
plt.figure(figsize=(16,8))
plt.plot(vegas_day, vegas_tmin, label = "Min Temp")
plt.plot(vegas_day, vegas_tmax, label = "Max Temp")
plt.xlabel('Date YYYY-MM')
plt.ylabel('Temperature')
plt.title('Las Vegas Airport Temperature Min & Max')
plt.grid(True)
plt.savefig("vegas.png")
plt.show()

### Save the plot in our Jupyter environment

For these Globus Transfer API interactions, we use the Globus Transfer token. As the Globus Transfer API does not support __[`application/x-www-form-urlencoded`](https://docs.globus.org/api/transfer/overview/#document_formats)__ data, we are explicit about the JSON we pass.

At this point, we will:

1. Activate the endpoint we're using for the tutorial
1. Create a directory for our file
1. `PUT` our plot there
1. Generate a link to our plot and view it

Note that the calls to Transfer can also be made via the Globus SDK, which adds argument validation and other conveniences. For this tutorial we use direct HTTP requests for pedagogical purposes.

In [None]:
# Base URL for the Globus Transfer API
base_url = 'https://transfer.api.globus.org/v0.10'
# ID of the endpoint that we're using for the tutorial
endpoint_uuid = 'e56c36e4-1063-11e6-a747-22000bf2d559'

# Create the header
headers = {'Authorization':'Bearer '+ tokens['tokens']['transfer.api.globus.org']['access_token'],
          "Content-Type" : "application/json"}

In [None]:
# Grab our username (which includes a hash to avoid collision)
username = os.getenv('JUPYTERHUB_USER')
print("My user name is " + username)

In [None]:
# Autoactivate the endpoint
resp = requests.post(base_url + '/endpoint/' + endpoint_uuid + '/autoactivate',
                    headers=headers)
print(resp.status_code)
print(resp.text)

In [None]:
# Call the Transfer API to make the directory
# Note, this will return a 502 if the directory already exists.
# So don't panic if that happens when you rerun it.
# Later this can be done directly via Collections
mkdir_payload = { "DATA_TYPE": "mkdir",
                  "path": "/test/jhtutorial/users/" + username }

resp = requests.post(base_url + '/endpoint/' + endpoint_uuid + '/mkdir',
                    headers=headers, json=mkdir_payload)
print(resp.status_code)
print(resp.text)
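
To make reruns less alarming, the response could be classified instead of printed raw. The sketch below treats a 502 whose JSON error `code` ends in `Exists` as "already exists". The exact error code string used in the example (`ExternalError.MkdirFailed.Exists`) is an assumption about the Transfer API's error document and should be checked against a real response; `mkdir_outcome` is a hypothetical helper.

```python
import json


def mkdir_outcome(status_code, body_text):
    """Classify a mkdir response as 'created', 'already exists', or 'failed'."""
    if 200 <= status_code < 300:
        return "created"
    try:
        error_code = str(json.loads(body_text).get("code", ""))
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object
        error_code = ""
    # Assumed convention: the error code for an existing directory ends in 'Exists'
    if status_code == 502 and error_code.endswith("Exists"):
        return "already exists"
    return "failed"


print(mkdir_outcome(502, '{"code": "ExternalError.MkdirFailed.Exists"}'))
```

In the notebook, this would presumably be called as `mkdir_outcome(resp.status_code, resp.text)`.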

### 3. Put the plot on a shared endpoint on the Petrel data server at Argonne
This Globus Connect Server endpoint supports HTTPS, so we can `PUT` the plot image file there directly. If this were a large file (or many files), we might want to use a Globus Transfer request instead.

In [None]:
# Create the header
headers = {'Authorization':'Bearer '+ tokens['tokens']['petrel_https_server']['access_token']}

# Open the image in binary mode and stream it as the body of the PUT;
# the with-block closes the file once the request completes
with open('vegas.png', 'rb') as image_data:
    resp = requests.put('https://testbed.petrel.host/test/jhtutorial/users/' + username + '/vegas.png',
                        headers=headers, data=image_data, allow_redirects=False)
print(resp.status_code)

### 4. Share a link to the plot file

Let's look at a link to the file. This will require you to authenticate to the GCS endpoint since your browser is a different client than this notebook server.

In [None]:
print('https://testbed.petrel.host/test/jhtutorial/users/' + username + '/vegas.png')
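
If a username contains characters that are not URL-safe (for example `@` or a space), the link may need percent-encoding before it is shared. The sketch below is one way to build it with the standard library. `plot_url` is a hypothetical helper, the sample username is made up, and whether real JupyterHub usernames ever need this is an assumption.

```python
from urllib.parse import quote

# Base path taken from the tutorial's PUT request
USERS_BASE_URL = "https://testbed.petrel.host/test/jhtutorial/users/"


def plot_url(username, filename="vegas.png"):
    """Build a shareable link, percent-encoding each path segment."""
    return USERS_BASE_URL + quote(username, safe="") + "/" + quote(filename, safe="")


# Made-up username for illustration
print(plot_url("user name@example.org"))
```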

We can also look at the folders and permissions on the __[shared endpoint](https://www.globus.org/app/transfer?origin_id=e56c36e4-1063-11e6-a747-22000bf2d559&origin_path=%2Ftest%2Fjhtutorial%2F)__.