![Top <](./images/watsonxdata.png "watsonxdata")

# Accessing watsonx.data with RESTful Calls
Representational state transfer (REST) is a software architectural style that defines a set of constraints to be used for creating Web services. Web services that conform to the REST architectural style, called RESTful Web services, provide interoperability between computer systems on the internet. RESTful Web services allow the requesting systems to access and manipulate textual representations of Web resources by using a uniform and predefined set of stateless operations.

Presto provides [System APIs](https://prestodb.io/docs/current/rest.html) and [Query APIs](https://prestodb.io/docs/current/develop/client-protocol.html). This notebook will use two RESTful calls that check the status of the server and run a query to retrieve a result set.

The Python `requests` library provides a simple interface for making RESTful calls. The `base64` library is required to convert the userid and password into the format that is required in the REST header.

In [None]:
import requests
import base64

### Retrieve Presto Credentials
The credentials to access the Presto engine are different from the watsonx.data credentials. The two commands below extract the Presto administration userid and password from the `ibm-lh-presto` container. These values are then used in subsequent RESTful calls.

In [None]:
presto_userid     = %system docker exec ibm-lh-presto printenv PRESTO_USER
presto_userid     = presto_userid[0]
presto_password   = %system docker exec ibm-lh-presto printenv LH_INSTANCE_SECRET
presto_password   = presto_password[0]
print(f"Presto user: {presto_userid} Presto password: {presto_password}")

The userid and password need to be combined into a single string that is converted to base64.

In [None]:
credentials_bytes = f"{presto_userid}:{presto_password}".encode("ascii") 
credentials       = base64.b64encode(credentials_bytes).decode("ascii")
print(f"Credentials Base64: {credentials}")

### Host Settings
We need to add the specific information of the server before attempting to connect with the REST calls. The API version is v1 which we need to pass to the service. If the REST API changes in the future, a new API version would be used, but older programs will still be able to access the older APIs using the v1 level.

We need to provide a certificate to the RESTful service to validate the server that we are connecting to. This certificate needs to be local to the application that is making the RESTful call.

In [None]:
host              = "https://watsonxdata"
port              = 8443
api               = "/v1"
certfile          = "/certs/lh-ssl-ts.crt"
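
The settings above are concatenated into a full URL on every call. As a minimal sketch, the helper below joins the parts and tolerates stray slashes. The name `build_url` is an assumption made for illustration and is not part of the original notebook.

```python
def build_url(host: str, port: int, api: str, service: str) -> str:
    """Join the connection settings into a full endpoint URL."""
    # Strip stray slashes so that "/v1/" followed by "/info" does not produce "//".
    return f"{host.rstrip('/')}:{port}/{api.strip('/')}/{service.strip('/')}"


print(build_url("https://watsonxdata", 8443, "/v1", "/info"))
# prints https://watsonxdata:8443/v1/info
```

If a later API version were introduced, only the `api` argument would need to change.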

The auth_header is passed to the RESTful service and contains the authorization values (userid/password) required to connect. The `X-Presto-User` must be present in the header for the SQL RESTful calls and contains the name of the session user. This could be the same as the watsonx.data userid, but it is only for query tracking and not for authentication.

In [None]:
auth_header = {
    "Content-Type"  : "text/javascript",
    "Authorization" : f"Basic {credentials}",
    "X-Presto-User" : presto_userid
}
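
The encoding and header steps can also be combined into one reusable function. The sketch below is illustrative only: `build_auth_header` is a hypothetical name, the userid and password are placeholders, and only the two headers discussed above are included. Decoding the token again shows that base64 is a reversible encoding rather than encryption, which is why the call should always go over HTTPS.

```python
import base64


def build_auth_header(userid: str, password: str) -> dict:
    """Return the Basic authorization header plus the Presto session user."""
    token = base64.b64encode(f"{userid}:{password}".encode("utf-8")).decode("ascii")
    return {
        "Authorization": f"Basic {token}",
        "X-Presto-User": userid,
    }


header = build_auth_header("user", "pass")  # placeholder credentials
print(header["Authorization"])
# prints Basic dXNlcjpwYXNz

# Anyone who sees the header can recover the original userid:password pair.
scheme, token = header["Authorization"].split(" ", 1)
print(base64.b64decode(token).decode("utf-8"))
# prints user:pass
```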

## Check the Status of the Presto Engine
At this point we have sufficient information to make a RESTful request to the Presto engine, asking for the current status of the engine. The service name is `/info` and does not require any additional parameters. If the engine is running normally, the `reason` field of the response should be `OK`.

The `info` RESTful service uses GET to send a request to the server. The RESTful call requires the following information.
* URL - host, port, api and service
* Service - `info` to get details of the server
* Headers - The authentication settings
* Verify - The certificate file

In [None]:
service = "/info"
r = requests.get(f"{host}:{port}{api}{service}", headers=auth_header, verify=certfile)
r.reason

The request call returns a number of fields in the variable `r`. The `reason` field tells us whether or not the call was successful. To see the numeric HTTP status code instead, use the following call.

In [None]:
r.status_code
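
The `reason` text and the numeric `status_code` are two views of the same HTTP status. As a rough illustration, the sketch below uses Python's standard `http.HTTPStatus` table to show the standard phrase for a few common codes. The function name `describe_status` is hypothetical, and a server may send a different reason phrase than the standard one.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return the code together with its standard HTTP reason phrase."""
    try:
        return f"{code} {HTTPStatus(code).phrase}"
    except ValueError:
        # HTTPStatus only knows the registered status codes.
        return f"{code} (non-standard status code)"


for code in (200, 401, 404, 503):
    print(describe_status(code))
# prints:
# 200 OK
# 401 Unauthorized
# 404 Not Found
# 503 Service Unavailable
```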

### Extract Details
The payload (results) of a RESTful call is returned by `r.json()`. Because `json` is a method rather than a field, you must call it with parentheses to view the contents.

In [None]:
r.json()

The value returned by `r.json()` is a Python dictionary, so individual fields can be accessed by key. To return the `starting` flag, use the following syntax.

In [None]:
r.json()['starting']
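
A program would normally test this flag before sending queries. The sketch below shows one way to do that. The name `engine_is_ready` is hypothetical, and the sample dictionaries are made-up stand-ins for the `r.json()` payload that contain only the `starting` field discussed above.

```python
def engine_is_ready(info: dict) -> bool:
    """Return True only when the payload explicitly reports starting == False."""
    # A missing flag is treated as "not ready" so that a malformed payload
    # is never mistaken for a healthy engine.
    return info.get("starting", True) is False


print(engine_is_ready({"starting": False}))  # prints True
print(engine_is_ready({"starting": True}))   # prints False
print(engine_is_ready({}))                   # prints False
```

With a live connection, the call would be `engine_is_ready(r.json())`.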

## Querying Presto with a RESTful Call
You can query data in Presto by using a RESTful call. There are a number of steps involved when retrieving answer sets from Presto. First of all, a single RESTful call may not result in an answer set immediately. What this means is that the program must "poll" the server to determine when to retrieve results. 

The initial RESTful call returns one of several possible states:

* WAITING_FOR_PREREQUISITES - Presto is checking that resources are available to run your query
* QUEUED - Your SQL is queued for execution
* RUNNING - The SQL is running
* FINISHED - The SQL has finished
* FAILED - An error was found in your SQL

The RESTful service uses POST to send a request to the server. The RESTful call requires the following information.
* The host, port, api and service
* Service - `statement` which indicates this is an SQL statement
* Headers - The authentication settings
* Data - SQL statement you want executed
* Verify - The certificate file

The program below initiates a POST request with the connection details and the SQL statement. The returned message is retrieved with `r.json()`. A field called `stats` contains another field called `state`, which indicates the current state of the query. Based on that state, the code continues looping, looking for intermediate results or the final results.

Every RESTful call (after the initial one) may send data back to the client. This data needs to be appended after each RESTful call. The program may need to make several RESTful calls to retrieve the entire answer set. The payload returned by `r.json()` contains a `nextUri` field with the URL that should be used to get the next block of rows using a `GET` request. Once the answer set is exhausted, the final block will have a FINISHED status and no `nextUri`.

To avoid overwhelming the Presto service, a short delay is added between calls.

In [None]:
def restfulSQL(host,port,api,auth_header,certfile,sql):
    
    from time import sleep
    import pandas as pd

    service = "/statement"
    data    = []
    columns = []
    error   = False
    
    URI = f"{host}:{port}{api}{service}"
    r = requests.post(URI, headers=auth_header, data=sql, verify=certfile)
    if (r.ok == False):
        print(r.reason)
        return None

    while r.ok == True: 
        results = r.json()
        collect = False
        stats = results.get('stats',None)
        state = stats['state']
        print(state)
        if (state in ["FINISHED","RUNNING"]):
            collect = True
        elif (state == "FAILED"):
            errormsg = results.get('error',None)
            if (errormsg != None):
                print(f"Error: {errormsg.get('message')}")
            error = True
            break
        else:
            collect = False
    
        if (collect == True):
            columns = results.get('columns',None)
            result  = results.get('data',None)
            if (result not in [None]):
                data.append(result)
    
        URI = results.get('nextUri',None)
        if (URI != None):    
            sleep(.1)
            r = requests.get(URI, headers=auth_header, verify=certfile)
        else:
            break

    if (error == True):
        return None
  
    column_names = []
    for col in columns:
        column_names.append(col.get("name"))
    
    # Flatten every block of rows that was collected, not just the first one
    data_values = []
    for block in data:
        data_values.extend(block)
    
    df = pd.DataFrame(data=data_values, columns=column_names)
    return df


In [None]:
restfulSQL(host,port,api,auth_header,certfile,'select * from "tpch"."tiny"."customer" limit 10')

### Example of Invalid SQL
If you send invalid SQL to the engine you will receive a FAILED state back from the RESTful call.

In [None]:
restfulSQL(host,port,api,auth_header,certfile,'select * from "tpch"."tiny"."xcustomer" limit 10')
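
The polling logic described earlier can also be tried without a running server. The sketch below is an illustration under stated assumptions, not the notebook's implementation. The names `collect_rows`, `QueryFailed`, and `fetch_next` are hypothetical, the pages are simulated dictionaries that use only the fields the text mentions (`stats`, `state`, `columns`, `data`, `nextUri`, `error`), and the delay between calls is omitted. It follows `nextUri` until none is returned, appends the rows from every block, and raises an exception on FAILED instead of printing.

```python
class QueryFailed(Exception):
    """Raised when a page reports the FAILED state."""


def collect_rows(first_page, fetch_next):
    """Follow nextUri links and gather column names and rows from all pages."""
    columns, rows = [], []
    page = first_page
    while True:
        state = page.get("stats", {}).get("state")
        if state == "FAILED":
            message = (page.get("error") or {}).get("message", "unknown error")
            raise QueryFailed(message)
        if page.get("columns"):
            columns = [col["name"] for col in page["columns"]]
        # A page may carry no data at all (for example while QUEUED).
        rows.extend(page.get("data") or [])
        next_uri = page.get("nextUri")
        if next_uri is None:
            return columns, rows
        page = fetch_next(next_uri)


# Simulated server responses, keyed by made-up URIs.
pages = {
    "page/1": {"stats": {"state": "RUNNING"}, "columns": [{"name": "id"}],
               "data": [[1], [2]], "nextUri": "page/2"},
    "page/2": {"stats": {"state": "FINISHED"}, "columns": [{"name": "id"}],
               "data": [[3]]},
}
first = {"stats": {"state": "QUEUED"}, "nextUri": "page/1"}

columns, rows = collect_rows(first, pages.__getitem__)
print(columns, rows)
# prints ['id'] [[1], [2], [3]]
```

Against a live engine, `fetch_next` could be a small function that calls `requests.get(uri, headers=auth_header, verify=certfile).json()` after a short sleep.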

#### Credits: IBM 2024, George Baklarz [baklarz@ca.ibm.com]