# Consume Data from the SAP Datasphere OData API

## Purpose
This notebook helps you to get started browsing and working with assets exposed via the OData API. It will:
1. Connect to the tenant you specify using the username, password, and OAuth client credentials that you provide in the [dsp_secrets.json](./dsp_secrets.json) file.
2. Make requests against the OData APIs to:
   - [List the spaces you have access to](#list-spaces)
   - [List the assets you have access to](#list-assets)
   - [List the assets in a specified space](#list-space-assets)
   - [Return the metadata for a specified asset](#metadata)
   - [Return unaggregated data from a view](#view-data)
   - [Aggregate view data in pandas](#aggregate-view-data)
   - [Return unaggregated data from an analytic model](#unaggregated-data)
   - [Aggregate and plot analytic model data in a chart](#aggregate-plot)

This script has been tested and reviewed by SAP but, in case of errors or other problems, SAP is not liable to offer fixes nor any kind of support and maintenance. It is recommended that you test the script first, ideally in a test environment. You can also edit, enhance, copy or otherwise use the script in your own projects.

## Prerequisites
You must:
- Have an SAP Datasphere user for the specified tenant and be a member of one or more spaces exposing data.
- [Obtain the credentials for an OAuth client for the specified tenant](https://help.sap.com/docs/SAP_DATASPHERE/9f804b8efa8043539289f42f372c4862/3f92b46fe0314e8ba60720e409c219fc.html) and enter them in [dsp_secrets.json](./dsp_secrets.json).
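
The connection cell in this notebook reads six keys from `dsp_secrets.json`: `client_id`, `client_secret`, `authorization_url`, `token_url`, `email_address`, and `password`. As a minimal sketch, a check like the following could catch a missing or empty key before any request is made. The function name `load_secrets` and the `REQUIRED_KEYS` constant are illustrative and not part of the notebook.

```python
import json

# Keys that the connection cell reads from dsp_secrets.json
REQUIRED_KEYS = (
    "client_id",
    "client_secret",
    "authorization_url",
    "token_url",
    "email_address",
    "password",
)


def load_secrets(path="dsp_secrets.json"):
    """Load the secrets file and fail early if a required key is missing or empty."""
    with open(path, "r", encoding="utf-8") as file:
        secrets = json.load(file)
    # Treat absent keys and empty values the same way
    missing = [key for key in REQUIRED_KEYS if not secrets.get(key)]
    if missing:
        raise KeyError(f"dsp_secrets.json is missing: {', '.join(missing)}")
    return secrets
```

Calling `load_secrets()` before the connection cell turns a later, harder-to-read failure into an immediate message naming the missing keys.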

## Storing Credentials
For simplicity, the SAP Datasphere OAuth credentials used in this script are stored in a plain-text file. When adapting this script, use your organization's credential store and follow its other security recommendations.
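
As one possible alternative to the plain-text file, the sketch below reads the same six values from environment variables. Environment variables only stand in here for whatever credential store your organization uses. The `DSP_` prefix, the variable names, and the function `secrets_from_env` are assumptions made for illustration.

```python
import os

# The same keys that the notebook reads from dsp_secrets.json
SECRET_KEYS = (
    "client_id",
    "client_secret",
    "authorization_url",
    "token_url",
    "email_address",
    "password",
)


def secrets_from_env(prefix="DSP_", environ=None):
    """Build the secrets dictionary from variables such as DSP_CLIENT_ID (hypothetical names)."""
    environ = os.environ if environ is None else environ
    secrets = {}
    for key in SECRET_KEYS:
        name = prefix + key.upper()
        if name not in environ:
            raise KeyError(f"Environment variable {name} is not set")
        secrets[key] = environ[name]
    return secrets
```

The returned dictionary has the same shape as the parsed JSON file, so `client = secrets_from_env()` could replace the file read without changing the rest of the cell.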

## More Information
For detailed information about working with the OData APIs, see [Consume Data via the OData API](https://help.sap.com/docs/SAP_DATASPHERE/43509d67b8b84e66a30851e832f66911/7a453609c8694b029493e7d87e0de60a.html) in the SAP Datasphere documentation.

In [None]:
import requests
from requests_oauthlib import OAuth2Session
from oauthlib.oauth2 import WebApplicationClient
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
import re
import getpass
import time
import json
import pandas as pd
import plotly

# Specify your tenant host name (without "https://") for use in OData requests
tenant_url = "<TENANT_URL>"

# Get OAuth client details from the "dsp_secrets.json" file
with open('dsp_secrets.json', 'r') as file:
    client = json.load(file)
    print(f"Connecting with client {client['client_id']}.")

client_id = client['client_id']
client_secret = client['client_secret']
authorization_base_url = client['authorization_url']
token_url = client['token_url']
email_address = client['email_address']
password = client['password']
redirect_uri="https://localhost:8080"


# Create an OAuth session
oauth = OAuth2Session(client_id)

# Start the authorization process
authorization_url, state = oauth.authorization_url(authorization_base_url)

# Use Selenium headless Chrome to log on and capture the authorization code
options = Options()
options.add_argument("--headless=new")
driver = webdriver.Chrome(options=options)
try:
    driver.get(authorization_url)
    email_box = driver.find_element(by=By.ID, value="j_username")
    password_box = driver.find_element(by=By.ID, value="j_password")
    login_button = driver.find_element(by=By.ID, value="logOnFormSubmit")
    email_box.send_keys(email_address)
    password_box.send_keys(password)
    login_button.click()
    # Poll briefly because the redirect that carries the code may not be immediate
    for _ in range(20):
        code_match = re.search(r"code=(\w+)", driver.current_url)
        if code_match:
            break
        time.sleep(0.5)
finally:
    # Always shut down the browser, even if the logon fails
    driver.quit()

if not code_match:
    raise RuntimeError("Cannot retrieve the authorization code; check the credentials in dsp_secrets.json")
code = code_match.group(1)
print("Connected")

# Exchange the authorization code for an access token
token = oauth.fetch_token(token_url, code=code, client_secret=client_secret)

def odata_to_df(base_url, page_size=1000):
    '''
    Call an OData endpoint and return all pages of the response as one dataframe.
    page_size is set, by default, to 1,000 records.
    '''
    page_num = 0
    df_list = []
    # Use "&" if base_url already carries query options such as $select
    separator = '&' if '?' in base_url else '?'

    while True:
        odata_url = f'https://{tenant_url}/api/v1/dwc/{base_url}{separator}$skip={page_num*page_size}&$top={page_size}'
        print(f"Fetching data from {odata_url}.")
        response = oauth.get(odata_url)
        # If the request fails, stop
        if response.status_code != 200:
            print('Failed to make OData request:', response.status_code)
            break
        data = response.json()
        df = pd.json_normalize(data, "value")
        df_list.append(df)

        # If fewer than page_size records are returned, this was the last page
        if len(df) < page_size:
            break

        page_num += 1

        # If required, add a delay between requests to respect rate limits
        time.sleep(1)

    # Return an empty dataframe if no page could be fetched
    if not df_list:
        return pd.DataFrame()

    # Combine all dataframes
    return pd.concat(df_list, ignore_index=True)

def odata_to_json(url):
    '''Call an OData endpoint and return the JSON response flattened into a dataframe'''
    odata_url = f'https://{tenant_url}/api/v1/dwc/{url}'
    print(f"Fetching data from {odata_url}.")
    response = oauth.get(odata_url)
    # Check the response
    if response.status_code == 200:
        data = response.json()
        # Return directly so that no local variable shadows the imported json module
        return pd.json_normalize(data)
    print('Failed to make OData request:', response.status_code)
    return None

<a id='list-spaces'></a>
## List Spaces
Ready to go. No modification necessary.

In [None]:
my_spaces = odata_to_df('catalog/spaces')
my_spaces

<a id='list-assets'></a>
## List Assets
Ready to go. No modification necessary.

In [None]:
my_assets = odata_to_df('catalog/assets')
my_assets_simplified = my_assets[['name', 'assetRelationalDataUrl', 'assetAnalyticalDataUrl', 'hasParameters']]
print(my_assets_simplified)

<a id='list-space-assets'></a>
## List Assets in a Space
Using the syntax: `catalog/spaces('<space_id>')/assets`

In [None]:
space_assets = odata_to_df("catalog/spaces('<space_id>')/assets")
space_assets_simplified = space_assets[['name', 'assetRelationalDataUrl', 'assetAnalyticalDataUrl', 'hasParameters']]
print(space_assets_simplified)

<a id='metadata'></a>
## Get Asset Metadata
Using the syntax: `catalog/spaces('<space_id>')/assets('<asset_id>')`
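
Both catalog paths used in this notebook wrap the IDs in single quotes. The sketch below builds them from plain strings, assuming the standard OData convention that a single quote inside a string key is written as two single quotes. The helper names `odata_key`, `space_assets_path`, and `asset_path` are hypothetical. IDs made up of letters, digits, and underscores pass through unchanged.

```python
from urllib.parse import quote


def odata_key(value):
    """Format a string as a quoted OData key: double single quotes, then percent-encode."""
    escaped = value.replace("'", "''")
    return "'" + quote(escaped, safe="'") + "'"


def space_assets_path(space_id):
    """Path that lists the assets in one space."""
    return f"catalog/spaces({odata_key(space_id)})/assets"


def asset_path(space_id, asset_id):
    """Path that returns the metadata for one asset."""
    return f"catalog/spaces({odata_key(space_id)})/assets({odata_key(asset_id)})"
```

The results can be passed to the notebook's helpers, for example `odata_to_json(asset_path(space_id, asset_id))` with your own IDs.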

In [None]:
my_asset = odata_to_json("catalog/spaces('<space_id>')/assets('<asset_id>')")
my_asset_simplified = my_asset[['name', 'assetRelationalDataUrl', 'assetAnalyticalDataUrl', 'hasParameters']]
print(my_asset_simplified)

<a id='view-data'></a>
## View: Get Data
Using the syntax: `consumption/relational/<space_id>/<asset_id>/<asset_id>[<params>]`
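
Note that the asset ID appears twice in the consumption path. A small helper, sketched below, makes that repetition harder to get wrong and covers both the relational and the analytical endpoint used in this notebook. The function name `consumption_path` is an assumption, and the optional `<params>` part is deliberately left out.

```python
def consumption_path(kind, space_id, asset_id):
    """Build consumption/<kind>/<space_id>/<asset_id>/<asset_id> without parameters."""
    if kind not in ("relational", "analytical"):
        raise ValueError("kind must be 'relational' or 'analytical'")
    # The asset ID is repeated, as in the syntax shown in this notebook
    return f"consumption/{kind}/{space_id}/{asset_id}/{asset_id}"
```

With real IDs, `odata_to_df(consumption_path("relational", space_id, asset_id))` is equivalent to writing the path by hand.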

In [None]:
my_view_data = odata_to_df("consumption/relational/<space_id>/<asset_id>/<asset_id>")
print(my_view_data.head())

# Get info for df
print(my_view_data.info())

<a id='aggregate-view-data'></a>
## View: Aggregate Data in Pandas
Using the syntax: `my_subset.groupby('<attribute>')['<measure>'].sum()`
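
To illustrate what the aggregation does before you run it on real view data, here is a small sketch on made-up values. The column names `Region` and `Revenue` and all figures are invented for this example and do not come from any SAP Datasphere asset.

```python
import pandas as pd

# Invented sample rows standing in for view data
sample = pd.DataFrame({
    "Region": ["EMEA", "APJ", "EMEA", "AMER", "APJ"],
    "Revenue": [100, 50, 25, 80, 10],
})

# Group by the attribute, sum the measure, and turn the result back into a dataframe
totals = sample.groupby("Region")["Revenue"].sum().reset_index()
print(totals)
```

The result has one row per region, sorted by region name: AMER 80, APJ 60, EMEA 125.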

In [None]:
my_subset = my_view_data[['<attribute>', '<measure>']] # select columns to extract
my_aggregation = my_subset.groupby('<attribute>')['<measure>'].sum() # group by attribute and sum measure
my_aggregation = my_aggregation.reset_index()
my_aggregation


<a id='unaggregated-data'></a>
## Analytic Model: Get Unaggregated Data
Using the syntax: `consumption/analytical/<space_id>/<asset_id>/<asset_id>[<params>]`

In [None]:
my_am_data = odata_to_df("consumption/analytical/<space_id>/<asset_id>/<asset_id>")
print(my_am_data.head())

# Get info for df
print(my_am_data.info())

<a id='aggregate-plot'></a>
## Analytic Model: Aggregate and Plot Data
Specify one or more measures to aggregate and one or more attributes to group by, using the syntax: `consumption/analytical/<space_id>/<asset_id>/<asset_id>?$select=<attribute>,<measure>`
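
When several attributes and measures are involved, the `$select` list can be assembled from a Python list instead of being typed by hand. The following is a minimal sketch. The helper name `with_select` is hypothetical, and it assumes that column names need no URL escaping.

```python
def with_select(path, columns):
    """Append a $select option that lists the given attributes and measures."""
    if not columns:
        raise ValueError("Select at least one attribute or measure")
    # Use "&" if the path already has a query string
    separator = "&" if "?" in path else "?"
    return f"{path}{separator}$select={','.join(columns)}"
```

For example, `with_select(path, ["<attribute>", "<measure>"])` yields the same string as the syntax above.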

In [None]:
pd.options.plotting.backend = "plotly"
my_am_aggregation = odata_to_df("consumption/analytical/<space_id>/<asset_id>/<asset_id>?$select=<attribute>,<measure>")

# Plot the aggregated data
my_am_aggregation.plot(kind='bar', x='<attribute>', y='<measure>')