# Using the IBM Db2 Augmented Data Explorer REST API

This notebook contains basic Python examples that use the Db2 Augmented Data Explorer REST API.

## Required libraries

To run the examples in this notebook, you need to install the following libraries:

- `requests`: For making HTTP calls.
- `matplotlib`: For visualizing the results.

We also use the `pandas` library. This library should already be installed if you are able to use this notebook.


## Setting up

We are going to import the required libraries and create a variable that stores the base URL for the REST API. Update this variable to a value that matches your environment.

In [1]:
from IPython.display import display, HTML
import matplotlib
import pandas as pd
import requests

base_url = "http://localhost:5000/api/v1"

## Getting authenticated

All API endpoints are protected, so you must authenticate as a user with the required privileges before calling them. Here we authenticate as the default user, who has the correct privileges. The API responds with an access token.

In [None]:
# Authentication credentials

account_username = 'tester'
account_password = 'testing123'

In [2]:
auth_json = {"username": account_username, "password": account_password}

auth_resp = requests.post(f"{base_url}/sessions", json=auth_json)

print(auth_resp.json())

{'token': 'eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJ1c2VybmFtZSI6InRlc3RlciIsImV4cGlyYXRpb25fZGF0ZSI6IjIwMTktMTAtMTcgMTY6Mzg6NDkuNTQ3MTExIiwic2Vzc2lvbi1pZCI6IjFkOTcxODE2LWE2YWItNDYzZS1iMjc1LWFmNTA3MjExYTZlNyJ9.EHt51D9XQQDVn9lthU5Lby-lXe-dpIaYqo-8lAFCBnX7yYor_VOwKqBv7of2kDljtY1w01ytdsShB4iYK_JporzrfwSZlQrZGEx_T2-0As_NjCGikliZnWDZEYsqdN4kxIAeCnY3F7pBCGJh_rvtvPNNsbT_YXvatDl3i3u5Z2cdoT0PN0tXog5rL36zmfRml9i6GlzCp5HRMuoegqlquv_c3uJLVscZjVELBajDgnj9oIW6UDdStkKo5ZsOS648PVCB01uv9R4NH1Cc7N5LAef7mSD6hfwjlAVdXnLiXn5UdqtLQL6naySG55Ubi36AXhvbPtuPKCy8ebdTWKt6IA'}


To make subsequent calls with the proper authorization easier, we create a header variable that contains the token.

In [3]:
header = {"Authorization": f"Bearer {auth_resp.json()['token']}"}
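
The cell above assumes that the login succeeded and that the response contains a `token` field. As a minimal sketch, the hypothetical helper below (`make_auth_header` is not part of the API or of this notebook) builds the same header from a parsed `/sessions` response body and fails with a clear message when no token is present. The shape of a failed login response is not shown in this notebook, so the error handling is an assumption.

```python
def make_auth_header(auth_payload: dict) -> dict:
    """Build the Authorization header from a parsed /sessions response body."""
    token = auth_payload.get("token")
    if not token:
        # Assumption: a failed login returns a body without a 'token' field.
        # The actual error format is not documented here, so show the payload.
        raise ValueError(f"No token in authentication response: {auth_payload!r}")
    return {"Authorization": f"Bearer {token}"}


# Stand-in payload for illustration; in the notebook you would pass auth_resp.json().
header_example = make_auth_header({"token": "example-token"})
print(header_example)
```

In the notebook, `header = make_auth_header(auth_resp.json())` would replace the direct dictionary lookup.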

## Setting up connections

Db2 Augmented Data Explorer works with Db2 databases. If you haven't already set up a connection using the browser interface, you must create one now. The following cell shows a sample connection. Before running it, update the variables at the beginning of the cell with values that match your connection.

In [4]:
hostname = "your_hostname"
username = "your_username"
password = "your_password"
schema_name = "your_schema_name"

connection_json = {
  "connection_type_guid": "your_connection_type_guid",
  "configuration": {
    "hostname": hostname,
    "database": "your_database",
    "port": "50000",
    "protocol": "TCPIP",
    "user": username,
    "password": password,
    "autocommit": False
  },
  "schemas_to_crawl": [
    {
      "schema": schema_name,
      "crawl_all_tables": True,
      "selected_tables": []
    }
  ]
}

connection_resp = requests.post(f"{base_url}/connections", json=connection_json, headers=header)

print(connection_resp.json())

{'message': 'Successfully created connection.', 'guid': '263bff70-4e34-50f0-8440-1fbfad774e7a'}
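
The connection values in the cell above are typed directly into the notebook. If you prefer to keep credentials out of the notebook file, one option is to read them from environment variables. The sketch below is only an illustration: the helper `load_connection_settings` and the variable names `ADE_DB_HOSTNAME`, `ADE_DB_USERNAME`, `ADE_DB_PASSWORD`, and `ADE_DB_SCHEMA` are hypothetical and are not defined by Db2 Augmented Data Explorer.

```python
import os


def load_connection_settings(env=os.environ):
    """Read connection values from environment variables.

    The variable names are hypothetical; the placeholders used in this
    notebook serve as fallbacks when a variable is not set.
    """
    return {
        "hostname": env.get("ADE_DB_HOSTNAME", "your_hostname"),
        "username": env.get("ADE_DB_USERNAME", "your_username"),
        "password": env.get("ADE_DB_PASSWORD", "your_password"),
        "schema_name": env.get("ADE_DB_SCHEMA", "your_schema_name"),
    }


settings = load_connection_settings()
# Avoid printing the password; show only which host would be used.
print(settings["hostname"])
```

The returned values can then be assigned to `hostname`, `username`, `password`, and `schema_name` before building `connection_json`.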


## Crawling

Before you can search or get results from Db2 Augmented Data Explorer, you need to crawl the connection that you set up previously. The crawling process connects to the database and stores information about the data in an Elasticsearch index. This index

In [15]:
crawl_json = {
    "status": "crawling"
}

crawl_resp = requests.put(f"{base_url}/status", json=crawl_json, headers=header)

print(crawl_resp.json())

{'message': 'Crawling started. Use GET /api/v1/status to check the crawling status.'}


As indicated by the response, you can continue to poll the `/status` endpoint to check the progress of the crawling process.

In [18]:
# A GET request only reads the status, so no JSON body is needed.
status_resp = requests.get(f"{base_url}/status", headers=header)

print(status_resp.json())

{'crawling': False, 'crawling_since': None, 'crawl_health': 'green', 'crawl_health_last_updated': '2019-10-17T14:55:40.182454', 'crawl_last_message': 'Crawling complete', 'crawl_pct_complete': 100, 'cache_health': 'green', 'cache_health_last_updated': '2019-10-17T14:55:41.235453', 'cache_last_message': '', 'ds_health': 'green', 'ds_health_last_updated': '2019-10-17T14:55:41.229087', 'ds_last_message': '', 'crawl_last_updated': '2019-10-17T14:54:16.700824'}


When `'crawling'` is `False`, crawling is complete and we can start searching.
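
Rather than rerunning the status cell by hand, you can poll in a loop. The following is a minimal sketch, not part of the API: `wait_for_crawl` is a hypothetical helper that takes any zero-argument callable returning the parsed `/status` body and waits until its `crawling` field is `False`. The polling interval and timeout values are arbitrary assumptions.

```python
import time


def wait_for_crawl(fetch_status, interval_seconds=5.0, timeout_seconds=600.0):
    """Call fetch_status() until its 'crawling' field is False, then return the status.

    fetch_status is any zero-argument callable that returns the parsed
    /status response body as a dictionary.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if not status["crawling"]:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("Crawling did not finish before the timeout.")
        time.sleep(interval_seconds)
```

In this notebook, the callable could be `lambda: requests.get(f"{base_url}/status", headers=header).json()`. Passing the fetch step in as a callable keeps the loop independent of the HTTP library and easy to try out with canned responses.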

## Storyline

We look at the `ADEDEMO` schema and analyze the `Customers` table, with the following objectives:
1. Get insights for `Average Money Spent`.
2. Determine whether `Average Money Spent` is related to `Gender`, `Education`, `Age`, or `Marital Status`.
3. Determine how `Average Money Spent` relates to `Number of Purchases` and `Days Since Last Purchase`.

## Searching

Searching in Db2 Augmented Data Explorer involves sending search text and getting back query suggestions for this search text. These suggestions are valid query objects, which you can use to get data.

In [23]:
search_json = {
    "search_text": "avg amount by",
    "plain_text": True
}

search_resp = requests.post(f"{base_url}/suggestions", json=search_json, headers=header)

In [24]:
print(search_resp.json())

{'suggestions': []}
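
The response above contains an empty `suggestions` list, in which case indexing it with `[0]` raises an `IndexError`. As a small sketch, the hypothetical helper below (`first_suggestion` is not part of the API) returns the first suggestion from a parsed `/suggestions` body, or `None` when there are none, so you can check the result before requesting data.

```python
def first_suggestion(search_payload: dict):
    """Return the first suggestion from a parsed /suggestions body, or None if empty."""
    suggestions = search_payload.get("suggestions", [])
    return suggestions[0] if suggestions else None


# Stand-in payload for illustration; in the notebook you would pass search_resp.json().
query = first_suggestion({"suggestions": []})
if query is None:
    print("No suggestions returned; try different search text.")
```

If no suggestions come back, adjusting `search_text` and repeating the search is the natural next step.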


## Getting data and natural language insights

To get data and natural language insights, use the `/results` endpoint. This endpoint requires a valid query object, such as one of the suggestions returned by the `/suggestions` endpoint.

In [None]:
results_response = requests.post(f"{base_url}/results", json=search_resp.json()['suggestions'][0], headers=header)

In [None]:
results_response.json()

Let's display this data and the related insights.

In [None]:
dataframe = pd.read_json(results_response.json()['data'])
dataframe

In [None]:
display(HTML(results_response.json()['nlg']))

In [None]:
dataframe.plot(x='MONTH', kind='bar')