# This Notebook Supports the SGF2020 Presentation:
## 4278_TenMinutesToYourFirstHelloWorld_REST-APIs_SGF2020
### ** include link when available**

## Objective:

By executing the following playbook, you will create a standard Analytical Base Table (ABT) and promote the ABT so that it can be used in other SAS Viya applications:
- Process and prepare the Water_Cluster dataset from the samples CAS library
 - The Water_Cluster dataset contains data on home water consumption and is used in example [Visual Analytics reports](https://go.documentation.sas.com/api/docsets/viyaov/3.5/content/viyaov.pdf) 
- Run SAS DataStep to add an indicator variable by calling the RunCode action
- Aggregate the dataset with the equivalent of proc means, then promote the ABT to be used by other SAS Viya applications
 - This playbook is used to demonstrate a generally accepted practice in order to prepare an Analytical Base Table (ABT)
 - Analysts typically denormalize the data and create statistically relevant columns when creating an ABT
 - The resulting ABT can then be used for subsequent analysis and modeling in SAS Viya
 
#### Disclaimer:
The steps suggested in this playbook are not an exhaustive guide to creating an ABT; your individual requirements may call for more or less work.

## Usage:  

Execute the following cells, making any changes as you see fit.

## Prerequisites:
- [SAS Viya 3.4 or 3.5](https://www.sas.com/en_us/software/viya.html)
- [Python 3.x](https://www.python.org/download/releases/3.0/); note that all development was performed with 3.7.4
- [Jupyter](https://jupyter.org) {Hub | Lab | Notebook} is recommended but not required
- [Python requests package](https://requests.readthedocs.io/en/master/)
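- [Python json module](https://docs.python.org/3/library/json.html) (part of the Python standard library, no separate install needed)
- [Python pandas package](https://pandas.pydata.org/)
- [Python getpass module](https://docs.python.org/3/library/getpass.html) (part of the Python standard library, no separate install needed)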
- Git, in order to [clone the project](https://github.com/sascommunities/sas-global-forum-2020/tree/master/papers/4278-2020-Bouts)

## Supporting Documentation:
- Information supporting CAS REST APIs can be found at [developer.sas.com](https://developer.sas.com/guides/restapis/cas-rest.html)
- Information supporting the CAS Action Sets behind the CAS REST APIs can be found in the following two locations:
 - The [SAS® Viya® 3.5: System Programming Guide](https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=caspg&docsetTarget=titlepage.htm&locale=en)
 - The [SAS® Viya® 3.5 Actions and Action Sets by Name and Product](https://go.documentation.sas.com/?cdcId=pgmcdc&cdcVersion=8.11&docsetId=allprodsactions&docsetTarget=titlepage.htm&locale=en)

## Essential Preparation:
- Reference [this blog for instructions on how to configure the client secrets](https://blogs.sas.com/content/sgf/2019/01/25/authentication-to-sas-viya/)
- Work with your SAS Environment Administrator to set up the appropriate client secret(s)

In [None]:
# The requests module allows us to make HTTP calls; some have said it has similarities to PROC HTTP
import requests as rq
# The json module allows us to work with JSON objects / events
import json
# The pandas module allows us to work with data objects in various formats, like DataFrames
import pandas as pd
# The getpass module allows for a password prompt with variable assignment for the session
import getpass
#base_url = 'https://<URL>'
# where the URL is probably in a form like: <host.subDomain.topLevelDomain>
base_url = 'https://host.subDomain.topLevelDomain.sas.com'
# Reference the Essential Preparation to configure secret(s)
oAuthCliendId = 'your-oAuthCliendId'
oAuthClientSecret = 'your-oAuthClientSecret'
print("If no errors, then the imports and variable assignment completed")

## 1.2: AUTHENTICATE
- Let's Authenticate to Retrieve an OAuth Token and a Session from SAS Viya (CAS)

In [None]:
# prompt for credentials
user = input("Enter your username that you use to access SAS Viya:")
pw = getpass.getpass("Enter password - which will be reset to None:")
print('Requesting security token...')

# build OAuth API URL, then pass credentials to API to get OAuth Token
url = base_url + '/SASLogon/oauth/token'
headers = { 
    'content-type': 'application/x-www-form-urlencoded' 
    }
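# pass the credentials as a dict so that requests form-encodes them;
# this escapes special characters (such as & or =) in the username or password
payload = {'grant_type': 'password', 'username': user, 'password': pw}
r = rq.post(url, data=payload, headers=headers, auth=(oAuthCliendId, oAuthClientSecret), verify=False)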
responseObj = json.loads(r.text)
oAuthAccessToken = responseObj['access_token']

# immediately reassign the pw variable to None since you are done with it
pw = None

# build Session API URL, then pass OAuth Token to API to get Session
url = base_url + '/cas-shared-default-http/cas/sessions'
headers = { 
    'Authorization': 'bearer ' + oAuthAccessToken 
}
r = rq.put(url, data=None, headers=headers, verify=False)
responseObj = json.loads(r.text)
casSessionId = responseObj['session']

# print the responses
print('\nThe token is:')
# If you'd like to hide your access token or session, please adjust the commented lines accordingly
# print('hidden')
print(oAuthAccessToken)
print('\nThe session is:')
# print('hidden')
print(casSessionId)
print('\nContinue if you received your OAuth Token and Session, otherwise you probably will see a key error due to unsuccessful authentication')
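As the message above notes, a failed login usually surfaces as a `KeyError` when the token is read from the response. The following is a minimal sketch of a more descriptive check. The function name `extract_access_token` is hypothetical, and it assumes the error response carries the standard OAuth 2.0 `error` and `error_description` fields.

```python
def extract_access_token(response_obj):
    """Return the access token, or raise a descriptive error if it is absent."""
    token = response_obj.get('access_token')
    if not token:
        # Assumption: OAuth 2.0 error responses carry 'error' and 'error_description'
        detail = (response_obj.get('error_description')
                  or response_obj.get('error')
                  or 'no details returned')
        raise RuntimeError('Authentication failed: ' + str(detail))
    return token


# Example with hand-written response objects (not real server output)
print(extract_access_token({'access_token': 'abc123'}))
```

In the authentication cell, this could stand in for the direct lookup, for example `oAuthAccessToken = extract_access_token(responseObj)`.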

## 1.3: VARIABLE ASSIGNMENT

We'll use the 'samples' CAS library to load the 'WATER_CLUSTER.sashdat' dataset that comes predefined in SAS Viya

We'll work with this data in our predefined 'casuser' CAS library

In [None]:
# Establish data & library variables
sourceCasLib = 'samples'
sourceDataPathTable = 'WATER_CLUSTER.sashdat'
destinationCasLib = 'casuser'
destinationDataPathTable = 'WATER_CLUSTER_AB'
destinationDataAbt = 'WATER_CLUSTER_AB_ABT'

print("If no errors, then variable assignment worked")

## 1.4: HELPER FUNCTIONS

In [None]:
# Define a simple helper function to print out the data variables that are loaded
def printLoadVrbls():
    print('The variables were defined as...')
    print(' -The source caslib is:', sourceCasLib)
    print(' -The source data table is:', sourceDataPathTable)
    print(' -The destination caslib is:', destinationCasLib)
    print(' -The destination data table is:', destinationDataPathTable)
    print(' -The destination data ABT is:', destinationDataAbt)

# Define a simple helper function to call API endpoint
def callEndpoint():
    # pass headers to requests
    headers = {
        'Authorization': 'bearer ' + oAuthAccessToken,
        'Content-Type': 'application/json'
        }
    # call the API endpoint
    r = rq.post(url, headers=headers, json=payload, verify=False)
    # declare global responseObj
    global responseObj
    # receive the responseObj
    responseObj = json.loads(r.text)

# Define a simple helper function to print the API payload
def printPayload():
    print(' -For debugging, the API payload is:',payload)
    
# Define a simple helper function to print the API responses
def printResponse():
    if responseObj.get('status') == 0:
        print(' -The API response "status" is "0", indicating the action was successful.\n')
    else:
        print(' -The API response "status" is "non-zero",  indicating an error occurred')
        print(' -The API "log" is:',responseObj.get('log'))        

# Define a simple helper function to print the API responseObj.log
def printResponseLog():
    print(' -The API "log" is:',responseObj.get('log'))
    
# Define a simple helper function to print the API response objects
def printResponseObj():
    print('\n -The full API response object is:',responseObj)
    
# Define a DataFrame table output
def printTable():
    tableColumns = pd.DataFrame(responseObj.get('results').get('ColumnInfo').get('rows'))
    print('\n -The table columns are:\n')
    display(tableColumns)

print("If no errors, then the helper functions were defined")
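The helper `callEndpoint()` above reads `url` and `payload` from global variables and writes its result to the global `responseObj`. As an alternative sketch, not the notebook's own approach, the same call can be written as a function that takes its inputs as arguments and returns the parsed response. The name `call_action` and the injected `post` parameter are hypothetical; passing the HTTP function in makes the helper easy to try without a server.

```python
import json


def call_action(post, url, token, payload, verify=False):
    """POST a CAS action payload and return the parsed JSON response.

    post   -- a callable with the same keyword interface as requests.post
    verify -- forwarded unchanged; False mirrors the notebook's current setting
    """
    headers = {
        'Authorization': 'bearer ' + token,
        'Content-Type': 'application/json'
    }
    r = post(url, headers=headers, json=payload, verify=verify)
    return json.loads(r.text)
```

With the notebook's variables, usage would look like `responseObj = call_action(rq.post, url, oAuthAccessToken, payload)`.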

## 2.1: LOAD A DATASET INTO CAS

We'll load the 'samples.WATER_CLUSTER' dataset that comes predefined in Viya

The [table.loadTable action documentation](https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=caspg&docsetTarget=cas-table-loadtable.htm&locale=en)


In [None]:
# Call helper function to print out the variables
printLoadVrbls()
print('\nLoading table into CAS memory...')

# Build full API URL -- perform the API action set -> /actions/table.loadTable
url = base_url + '/cas-shared-default-http/cas/sessions/' + casSessionId + '/actions/table.loadTable'

# Build the API payload
payload = {
    'casLib': sourceCasLib,
    'path': sourceDataPathTable,
    'casout': {
        'caslib': destinationCasLib,
        'name': destinationDataPathTable
        }
    }

# Call the API endpoint
callEndpoint()
# Receive the response log and print it out
printResponse()
# Verify the payload
printPayload()
# Optionally call printResponseLog() and/or printResponseObj() for debugging
# printResponseLog()
# printResponseObj()
print('\nIf no errors, then table has been loaded into CAS memory')
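Every action call in this notebook builds its URL from the same three parts: the base URL, the CAS session ID, and the action name. A small sketch of that pattern as a function follows; the name `build_action_url` is hypothetical, and the `/cas-shared-default-http` path segment is taken from the cells in this notebook and may differ in your deployment.

```python
def build_action_url(base_url, session_id, action):
    """Return the REST URL for running a CAS action in an existing session."""
    return (base_url.rstrip('/')
            + '/cas-shared-default-http/cas/sessions/'
            + session_id
            + '/actions/'
            + action)


# Example with placeholder values
print(build_action_url('https://viya.example.com', 'my-session-id', 'table.loadTable'))
```

With the notebook's variables, `url = build_action_url(base_url, casSessionId, 'table.loadTable')` yields the same string as the concatenation above.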

## 2.2: OPTIONAL – VERIFY THE DATASET WAS LOADED INTO CAS

We'll verify that 'casuser.WATER_CLUSTER_AB' is ready for use

The [table.tableInfo action documentation](https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=caspg&docsetTarget=cas-table-tableinfo.htm&locale=en)


In [None]:
# Call helper function to print out the variables
printLoadVrbls()
print("\nQuery destination data table info...")

# Build full API URL -- perform the API action set -> /actions/table.tableInfo
url = base_url + '/cas-shared-default-http/cas/sessions/' + casSessionId + '/actions/table.tableInfo'

# Build the API payload
payload = {
    'casLib': destinationCasLib,
    'name': destinationDataPathTable
    }

# Call the API endpoint
callEndpoint()
# Receive the response log and print it out
printResponse()
# Verify the payload
printPayload()
# Optionally call printResponseLog() and/or printResponseObj() for debugging
# printResponseLog()
# printResponseObj()
print('\nIf no errors, then this table is available in CAS memory')

## 2.3: RUN SAS DATASTEP CODE

Operations on numbers are typically more efficient than operations on unstructured text, so converting text to a numeric indicator is a common pre-processing step when preparing data for analytics. We'll create a new column (double) based on simple boolean logic.

The [datastep.runCode action documentation](https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=caspg&docsetTarget=cas-datastep-runcode.htm&locale=en)

In [None]:
# Call helper function to print out the variables
# This will be commented out to reduce repetition, but can easily be turned back on as desired
# printLoadVrbls()
print("\nCall the run code action...")

# Build full API URL -- perform the API action set -> /actions/dataStep.runCode
url = base_url + '/cas-shared-default-http/cas/sessions/' + casSessionId + '/actions/dataStep.runCode'

# Build the API payload
payload = {
    'code': "data casuser.water_cluster_ab (replace=yes); set casuser.water_cluster_ab; if missing('US Holiday'n) then US_Holiday_Ind = 0; else US_Holiday_Ind = 1; run;"
    }

# Here is the easier to read SAS DataStep code:
# (missing() tests for a missing value; "null" is not a DATA step keyword and would be read as a variable name)
# data casuser.water_cluster_ab (replace=yes);
#  set casuser.water_cluster_ab;
#  if missing('US Holiday'n) then US_Holiday_Ind = 0;
#  else US_Holiday_Ind = 1;
# run;

# Call the API endpoint
callEndpoint()
# Receive the response log and print it out
printResponse()
# Verify the payload
printPayload()
# Optionally call printResponseLog() and/or printResponseObj() for debugging
printResponseLog()
# printResponseObj()
print('\nIf no errors, then this runCode worked as expected')
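The DataStep code in the cell above hardcodes the table name, although the notebook already holds it in `destinationCasLib` and `destinationDataPathTable`. A rough sketch of building the code string from those values is shown here. The function name `build_indicator_datastep` is hypothetical, and the sketch uses the SAS `missing()` function for the missing-value test, which is my assumption about the intended logic rather than a quote from the author.

```python
def build_indicator_datastep(caslib, table):
    """Return single-line DATA step code that adds the US_Holiday_Ind column."""
    target = caslib + '.' + table
    lines = [
        'data ' + target + ' (replace=yes);',
        'set ' + target + ';',
        # Assumption: missing() is used to test for a missing 'US Holiday' value
        "if missing('US Holiday'n) then US_Holiday_Ind = 0;",
        'else US_Holiday_Ind = 1;',
        'run;'
    ]
    # The runCode action receives the program as one string
    return ' '.join(lines)


print(build_indicator_datastep('casuser', 'WATER_CLUSTER_AB'))
```

The result would then be passed as `payload = {'code': build_indicator_datastep(destinationCasLib, destinationDataPathTable)}`.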

## 2.4: OPTIONAL – VERIFY THE DATASTEP WORKED AS EXPECTED

Verify the new column(s)

The [table.columnInfo action documentation](https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=caspg&docsetTarget=cas-table-columninfo.htm&locale=en)

In [None]:
# Call helper function to print out the data variables that are loaded
# This will be commented out to reduce repetition, but can easily be turned back on as desired
# printLoadVrbls()
print("\nQuery destination data table column information...")

# Build full API URL -- perform the API action set -> /actions/table.columnInfo
url = base_url + '/cas-shared-default-http/cas/sessions/' + casSessionId + '/actions/table.columnInfo'

# Build the API payload
payload = {
        'table': {
            'caslib': destinationCasLib,
            'name': destinationDataPathTable
        }
    }

# Call the API endpoint
callEndpoint()
# Receive the response log and print it out
printResponse()
# Verify the payload
printPayload()
# Call printTable() for verification
printTable()
# Optionally call printResponseLog() and/or printResponseObj() for debugging
# printResponseLog()
# printResponseObj()
print('\nIf no errors, then this table is available in CAS memory')

## 2.5: CREATE AN ANALYTICAL BASE TABLE (ABT)

Create an Analytical Base Table (ABT) with the equivalent of proc means by using the aggregation.aggregate action. As previously mentioned:
- This playbook is used to demonstrate a generally accepted practice in order to prepare an Analytical Base Table (ABT)
- Analysts typically denormalize the data and create statistically relevant columns when creating an ABT
- The resulting ABT can then be used for subsequent analysis and modeling in SAS Viya

The [aggregation.aggregate action documentation](https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=casanpg&docsetTarget=cas-aggregation-aggregate.htm&locale=en)

* Note that once a table is promoted, it must be dropped before it can be reloaded; refer to the [table.dropTable action](https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=caspg&docsetTarget=cas-table-droptable.htm&locale=en)
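To illustrate the note about dropping a promoted table, here is a minimal sketch of how a table.dropTable request might be assembled in the same style as the other cells. The function name `build_drop_table_request` is hypothetical, and the `quiet` parameter is included on the assumption that it suppresses the error when the table does not exist; please confirm it against the table.dropTable documentation linked above.

```python
def build_drop_table_request(base_url, session_id, caslib, table):
    """Return the (url, payload) pair for a table.dropTable call."""
    url = (base_url
           + '/cas-shared-default-http/cas/sessions/'
           + session_id
           + '/actions/table.dropTable')
    payload = {
        'caslib': caslib,
        'name': table,
        # Assumption: quiet=True avoids an error if the table is not loaded
        'quiet': True
    }
    return url, payload


# Example with placeholder values
url, payload = build_drop_table_request(
    'https://viya.example.com', 'my-session-id', 'casuser', 'WATER_CLUSTER_AB_ABT')
print(url)
print(payload)
```

In the notebook, the returned `url` and `payload` could be assigned to the global variables of the same names before calling `callEndpoint()`.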

In [None]:
# Call helper function to print out the data variables that are loaded
# This will be commented out to reduce repetition, but can easily be turned back on as desired
# printLoadVrbls()
print("\nCall the aggregation.aggregate action...")

# Build full API URL -- perform the API action set -> /actions/aggregation.aggregate
url = base_url + '/cas-shared-default-http/cas/sessions/' + casSessionId + '/actions/aggregation.aggregate'

# Build the API payload
# This payload contains more parameters similar to the parameters that proc means would take
payload = {
        'table': {
            'caslib': destinationCasLib,
            'name': destinationDataPathTable,
            'vars': 'Daily_W_C_M3',
            'groupBy': [
                {'name': 'Year'},
                {'name': 'Month'},
                {'name': 'City'},
                {'name': 'Clli'},
                {'name': 'Zip'},
                {'name': 'Meter_Location'},
                {'name': 'US_Holiday_Ind'}
            ]
        },
        'saveGroupbyFormat': 'False',
        'varSpecs': [{
            'name': 'Daily_W_C_M3',
            'summarySubset': ["MEAN", "STD", "MIN", "MAX"]
        }],
        'casout': {
            'caslib': destinationCasLib,
            'name': destinationDataAbt,
            'promote': 'True'
        }   
    }

# Here is equivalent SAS Proc Means code:
# proc means data=casuser.water_cluster_ab noprint;
#  var Daily_W_C_M3;
#  by Year Month City Clli Zip Meter_Location US_Holiday_Ind;
#  output out=casuser.water_cluster_ab_ABT (drop=_type_) mean= std= min= max= /autoname;
# run;

# Call the API endpoint
callEndpoint()
# Receive the response log and print it out
printResponse()
# Verify the payload
printPayload()
# Optionally call printResponseLog() and/or printResponseObj() for debugging
# printResponseLog()
# printResponseObj()
print('\nIf no errors, then this table is available in CAS memory')
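The `groupBy` entry in the payload above is a list of one-key dictionaries, one per grouping column. As a small convenience sketch, that list can be generated from plain column names; the function name `build_group_by` is hypothetical.

```python
def build_group_by(column_names):
    """Turn a list of column names into the list-of-dicts form used by groupBy."""
    return [{'name': name} for name in column_names]


group_by = build_group_by(
    ['Year', 'Month', 'City', 'Clli', 'Zip', 'Meter_Location', 'US_Holiday_Ind'])
print(group_by)
```

This keeps the grouping columns in one easily edited list, which may help when adapting the payload to a different table.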

## 2.6: OPTIONAL – VERIFY THE AGGREGATION WORKED AS EXPECTED

Verify the new ABT table structure

The [table.columnInfo action documentation](https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=caspg&docsetTarget=cas-table-columninfo.htm&locale=en)

In [None]:
# Call helper function to print out the data variables that are loaded
# This will be commented out to reduce repetition, but can easily be turned back on as desired
# printLoadVrbls()
print("\nQuery destination data table column information...")

# Build full API URL -- perform the API action set -> /actions/table.columnInfo
url = base_url + '/cas-shared-default-http/cas/sessions/' + casSessionId + '/actions/table.columnInfo'

# Build the API payload
payload = {
        'table': {
            'caslib': destinationCasLib,
            'name': destinationDataAbt
        }
    }

# Call the API endpoint
callEndpoint()
# Receive the response log and print it out
printResponse()
# Verify the payload
printPayload()
# Call printTable() for verification
printTable()
# Optionally call printResponseLog() and/or printResponseObj() for debugging
# printResponseLog()
# printResponseObj()
print('\nIf no errors, then this table is available in CAS memory')

## 2.7: POTENTIAL NEXT STEPS – FETCH A SUBSET OF THE ABT DATA

A possible next step in this process is to use the [table.fetch action](https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=caspg&docsetTarget=cas-table-fetch.htm&locale=en) to return a number of rows from your ABT.

If I were going to build a fetch API call, I would start with the code from 2.6 above, update the url, and then potentially update the payload by adding a maximum number of rows to return, for example:

`payload = {
        'table': {
            'caslib': destinationCasLib,
            'name': destinationDataAbt
        },
        'maxRows': 15,
    }`
    
To view the data I would suggest creating another helper function similar to printTable(), but updated to parse the table.fetch response object.
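One possible shape for such a helper is sketched below. It is written under two assumptions that you should check against a real response: that the fetched table appears under a `Fetch` key inside `results`, and that the table object carries a `schema` list of column descriptions alongside `rows`. The function name `rows_to_records` is hypothetical. If no usable schema is present, the sketch falls back to positional keys.

```python
def rows_to_records(response_obj, result_key):
    """Convert a CAS result table into a list of dicts, one per row."""
    table = (response_obj.get('results') or {}).get(result_key) or {}
    rows = table.get('rows') or []
    # Assumption: 'schema' is a list of dicts that each have a 'name' entry
    names = [col.get('name') for col in (table.get('schema') or [])]
    records = []
    for row in rows:
        if len(names) == len(row):
            records.append(dict(zip(names, row)))
        else:
            # No matching schema: key each value by its position instead
            records.append(dict(enumerate(row)))
    return records


# Example with a hand-written response object (not real server output)
sample = {'results': {'Fetch': {
    'schema': [{'name': 'City'}, {'name': 'Year'}],
    'rows': [['CityA', 2019], ['CityB', 2020]]}}}
print(rows_to_records(sample, 'Fetch'))
```

The records can then be displayed as a table with `pd.DataFrame(rows_to_records(responseObj, 'Fetch'))`.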

# The End --> Summary

If everything worked as expected, you have just created an Analytical Base Table (ABT) and promoted it to be used in other SAS Viya applications by:
- Starting with the Water_Cluster dataset from the samples CAS library
- Running SAS DataStep using the RunCode action
- Aggregating the dataset with the equivalent of proc means and promoting it to be used by other SAS Viya applications

# Cleanup

It's suggested that you run the following cell to clean up your CAS session and log out.

In [None]:
print('\nCleaning up session...\n')
# Build full API URL
url = base_url + '/cas-shared-default-http/cas/sessions/' + casSessionId
# Build API headers, there is no payload for this call
headers = {
    'Authorization': 'bearer ' + oAuthAccessToken, 
    'content-type': 'application/json'
    }

# receive the response and print it out
r = rq.delete(url, headers=headers, verify=False)
if r.status_code == 200:
    print("Successfully cleaned up (status code = ",r.status_code,")")
    print('\n**I ran out of time getting Anaconda to trust the CA, so I had to pass a no-verify parameter above**')
    print('\n**This should be fixed (eventually) ... but for now it\'s acceptable since we\'re using a proper cert**')
else:
    print("Something else seems to have happened (status code = ",r.status_code,")")
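The cells in this notebook pass `verify=False`, which turns off TLS certificate verification. As a hedged sketch of one way to move away from that, the `verify` argument of requests also accepts a path to a CA bundle file, and requests honors the `REQUESTS_CA_BUNDLE` environment variable. The function name `resolve_verify` is hypothetical; it only chooses the value to pass.

```python
import os


def resolve_verify(ca_bundle_path=None):
    """Return a value suitable for the 'verify' argument of a requests call.

    Prefers an explicit CA bundle path, then the REQUESTS_CA_BUNDLE
    environment variable, and otherwise returns True (default verification).
    """
    candidate = ca_bundle_path or os.environ.get('REQUESTS_CA_BUNDLE')
    if candidate and os.path.isfile(candidate):
        return candidate
    return True
```

For example, `rq.delete(url, headers=headers, verify=resolve_verify('/path/to/your/ca-bundle.pem'))`, where the path is a placeholder for the CA certificate file your administrator provides.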

# Comments
Your comments and questions are valued and encouraged; [contact the author](https://www.linkedin.com/in/andybouts/)