# DEVASC Unstructured vs. Structured Data Demo

## Summary

Programmatic interaction with infrastructure automation end systems and controllers typically involves a few common steps:

1. Requesting information from a system or systems via API or RPC calls.
2. Receiving a response to the request.
3. Extracting or parsing the specific data you need from the response.
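
As a rough illustration of these three steps, the sketch below uses a hypothetical `request_info` function that returns a canned response body instead of calling a real system. The function name, field names, and values are assumptions made up for this example.

```python
import json


def request_info() -> str:
    """Hypothetical stand-in for steps 1 and 2: return a canned response body."""
    return '{"hostname": "leaf-101", "serial": "ABC123", "uptime": "42 days"}'


# Steps 1 and 2: send the request and receive the raw response body (a string)
raw_body = request_info()

# Step 3: parse the body and extract only the field we care about
parsed = json.loads(raw_body)
hostname = parsed["hostname"]
print(hostname)
```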

---


## Overview

Often, the responses to API/RPC requests are lengthy and contain a lot of data that, while important, isn't relevant to your automation use case.  So how do we extract the data we need from a lengthy and potentially unformatted string of text in a way that is simple, predictable, and repeatable?  The short answer is that we need to work with the response in a structured data format.  The most common formats in the infrastructure automation world are (in alphabetical order):

1. JavaScript Object Notation (JSON)
2. eXtensible Markup Language (XML)
3. YAML Ain't Markup Language (YAML)

This walkthrough first demonstrates how we would have to parse and extract meaningful data _without_ a structured data format, and then demonstrates how a structured data format makes the entire process much simpler.
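
To give a feel for what "structured" means in practice, here is a minimal sketch that reads the same made-up interface record from a JSON document and from an XML document, using only the Python standard library. The record and its field names are assumptions for illustration. YAML is left out because the standard library does not include a YAML parser.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical record, encoded two different ways
json_text = '{"interface": {"name": "eth1/1", "state": "up"}}'
xml_text = '<interface><name>eth1/1</name><state>up</state></interface>'

# JSON decodes into nested Python dictionaries
json_data = json.loads(json_text)
json_state = json_data["interface"]["state"]

# XML decodes into a tree of elements that we can search by tag name
xml_root = ET.fromstring(xml_text)
xml_state = xml_root.find("state").text

print(json_state, xml_state)
```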

---


## Example programmatic API authentication with a Cisco APIC

The Cisco Application Policy Infrastructure Controller (APIC) provides, among other things, the management plane for Cisco Application Centric Infrastructure (ACI), a software-defined data center network infrastructure platform.

The REST API authentication process with an APIC involves:

1. Sending credentials to the APIC via HTTP POST.
2. Extracting the **token** from the response body.

After authentication is complete, you can use the resulting **token** to interact with and manage the configuration and policy endpoints in the APIC API.  For more information about the Cisco APIC REST API authentication process, [click here](https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/4-x/rest-api-config/Cisco-APIC-REST-API-Configuration-Guide-401/Cisco-APIC-REST-API-Configuration-Guide-401_chapter_00.html#concept_D16AC6DC9CCD4351A4A40287487F061A "Cisco APIC REST API: Authenticating and Maintaining an API Session").

---

---

### Part 1: Perform Cisco ACI authentication via the APIC API

The target APIC for this demo is the [Cisco DevNet Always-On Sandbox](https://devnetsandbox.cisco.com/RM/Diagram/Index/5a229a7c-95d5-4cfd-a651-5ee9bc1b30e2?diagramType=Topology "Cisco DevNet Always-On Sandbox").  This sandbox is available to anyone with a [Cisco DevNet account](https://developer.cisco.com "Cisco DevNet") and provides read _and_ write access to a live, functional Cisco APIC. 

**Note:** Don't worry if you aren't familiar with all of the Python code below.  The point of this walkthrough is to show you an example of the difference between working with **unstructured** and **structured** data, and is _not_ intended as a Python lesson for the APIC REST API.

---


The Python [`requests`](https://docs.python-requests.org/en/master/ "Python Requests Module Documentation") module provides a friendly way to interact with REST-based APIs.

##### Step 1: Import the requests module and disable certificate warnings

---


In [1]:
import requests
requests.packages.urllib3.disable_warnings()

##### Step 2: Create variables for the APIC REST API URL and credentials

---


In [2]:
url = 'https://sandboxapicdc.cisco.com/api/aaaLogin.json'
name = 'admin'
pwd = 'ciscopsdt'

##### Step 3: Create the [Cisco-documented](https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/4-x/rest-api-config/Cisco-APIC-REST-API-Configuration-Guide-401/Cisco-APIC-REST-API-Configuration-Guide-401_chapter_00.html#concept_D16AC6DC9CCD4351A4A40287487F061A "Cisco APIC API Authentication Documentation") APIC authentication body

---


In [3]:
json = {
    'aaaUser': {
        'attributes': {
            'name': name,
            'pwd': pwd
        }
    }
}

##### Step 4: Request an authorization token from the APIC

---


In [4]:
login_response = requests.post(  # Create an HTTP POST request
    url=url,                     # Set the target URL argument (url) to the value of our 'url' variable
    json=json,                   # Set the JSON body argument (json) to the value of our 'json' variable
    verify=False,                # Disable certificate verification checks (this APIC has a self-signed certificate)
    timeout=5                    # Set a timeout in seconds (by default, requests waits indefinitely)
)

##### Step 5: Confirm that the HTTP response code and reason values are '200' and 'OK', respectively

**Note:** Occasionally, this ACI sandbox experiences congestion which can cause API requests to fail.  If you do not receive a `200 OK` response from the APIC, wait a few minutes and try again.
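
If you would rather not retry by hand, one possible approach is a small retry helper. The sketch below is only an illustration: `post_with_retry` is a hypothetical name, and the canned responses are stand-ins built with `SimpleNamespace` so that the example runs without contacting the sandbox.

```python
import time
from types import SimpleNamespace


def post_with_retry(send, attempts=3, delay_seconds=0.0):
    """Call send() until it returns a 200 response or the attempts run out."""
    response = None
    for attempt in range(1, attempts + 1):
        response = send()
        if response.status_code == 200:
            break
        if attempt < attempts:
            time.sleep(delay_seconds)  # wait before the next attempt
    return response


# Hypothetical stand-in for the APIC: fails twice, then succeeds
canned_responses = iter([
    SimpleNamespace(status_code=503, reason="Service Unavailable"),
    SimpleNamespace(status_code=503, reason="Service Unavailable"),
    SimpleNamespace(status_code=200, reason="OK"),
])

result = post_with_retry(lambda: next(canned_responses))
print(f"{result.status_code} {result.reason}")
```

With the real sandbox, `send` could be a function that wraps the `requests.post` call from Step 4, with a longer `delay_seconds`. Note that this sketch only checks the status code; it does not catch exceptions such as timeouts.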

---


In [5]:
print(f'{login_response.status_code} {login_response.reason}')

200 OK


##### Step 6: Display the APIC login response body as text

---


In [6]:
print(login_response.text)

{"totalCount":"1","imdata":[{"aaaLogin":{"attributes":{"token":"aAAAAAAAAAAAAAAAAAAAAJhNZSsUcKARKKGovgUSdyFpEnTsu8lDsYT4Gcz8bo7d2q9xYPtZ8t0OO8BJWGpXrHwqDjRRWw1mbOKvs3t8V1F+4nLQCsq5PLqOH3c5Z07OLXXLRIvvTZABcr19XVRS1RKSyKRqgVFXHlSO3COlxfBl/nAXnPZq6Rd2ACvyps+0/3MG+RDux90hugdH+fSaNw==","siteFingerprint":"EQOmXY+ByqJiFvug","refreshTimeoutSeconds":"600","maximumLifetimeSeconds":"86400","guiIdleTimeoutSeconds":"1200","restTimeoutSeconds":"90","creationTime":"1624530631","firstLoginTime":"1624530631","userName":"admin","remoteUser":"false","unixUserId":"15374","sessionId":"pYBirbEASIeyL9WnbT39EA==","lastName":"","firstName":"","changePassword":"no","version":"4.1(1k)","buildTime":"Mon May 13 16:27:03 PDT 2019","node":"topology/pod-1/node-1"},"children":[{"aaaUserDomain":{"attributes":{"name":"all","rolesR":"admin","rolesW":"admin"},"children":[{"aaaReadRoles":{"attributes":{}}},{"aaaWriteRoles":{"attributes":{},"children":[{"role":{"attributes":{"name":"admin"}}}]}}]}},{"DnDomainMapEntry":{"att

##### Step 7: Use the `type` function to display the object type for the text APIC login response body

---


In [7]:
print(type(login_response.text))

<class 'str'>


---

### Part 1 Summary

We successfully authenticated with the APIC although the response from the APIC looks like a complete mess.  The APIC response contains a `token` value, which is what we need for future API requests.

```python
# Example APIC token
"token":"awIAAAAAAAAAAAAAAAAAAHpd0Ybc6/p7+qlQ2jyONq9fSYu8keKDUs5fV0FMABxnBb+GuRlyNKZsausshy4FiKeDDR+FTCYqYI9BAN4hMC3Zll7E6LTAanW40p+dCUOf9BkPu7Nd2kEudETJNBTFnvZ2a2n2731YOB9AP/yVU8pEy0VZ5zWqwJqrxX9AOQmmh/yq/fAdV6YCF6QswuE9sA=="
```

However, **Step 7** tells us that the entire APIC login response is a string (`str`), which means Python does not see the login response as a data structure that we can easily parse.
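
To make that limitation concrete, the sketch below uses a shortened, made-up response body (the token value is a placeholder). Looking up a key on a string fails, and string operations such as `find` only tell us where some text starts, not what the value is.

```python
# A shortened, hypothetical response body held as a plain string
body_text = '{"totalCount":"1","imdata":[{"aaaLogin":{"attributes":{"token":"example-token"}}}]}'

# A string cannot be indexed by key the way a dictionary can
try:
    body_text["imdata"]
    key_lookup_worked = True
except TypeError as error:
    key_lookup_worked = False
    print(f"Key lookup failed: {error}")

# String methods only report a character position within the text
position = body_text.find('"token"')
print(position)
```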

---

---

### Part 2: Parse the token from an unstructured (`string`) response body

There are many techniques to parse and extract specific data from strings.  One of the more common methods is to use regular expressions to find very specific substrings in a string.

---

##### Step 1: Import the regular expression (`re`) module

---


In [8]:
import re

##### Step 2: Create a regular expression match pattern to parse the token from the APIC login response body

Assign the compiled match pattern to the variable `token_pattern`.

---


In [9]:
token_pattern = re.compile(r'''
    ("token":")   # Match group #1: the literal text "token":"
    (.+?)         # Match group #2: the shortest run of characters between groups #1 & #3 (the token)
    (","?)        # Match group #3: a " followed by a comma, then an optional second "
    ''',
    re.VERBOSE
)

##### Step 3: Use the regular expression match pattern (`token_pattern`) to search the APIC login response for the token value

Pass the `search` method an argument (`login_response.text`) that represents the string to search with the `token_pattern` match pattern.

Assign the results of the search to the variable `token_search`.

---


In [10]:
token_search = token_pattern.search(login_response.text)

##### Step 4: Determine if any values match the regular expression for group \#2

Display the object `token_search.group(2)` and review the data.

**Note:** If the pattern does not match, `search` returns `None`, and calling `token_search.group(2)` raises an `AttributeError`.
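
One way to guard against a failed match is to check for `None` before asking for a group. The sketch below is an illustration only: it uses a simplified, single-group pattern rather than the three-group pattern from this walkthrough, and `extract_token` and the sample strings are hypothetical.

```python
import re

# Simplified pattern with a single capture group (an assumption for this sketch)
simple_token_pattern = re.compile(r'"token":"(.+?)"')


def extract_token(text):
    """Return the token value, or None when the pattern does not match."""
    match = simple_token_pattern.search(text)
    if match is None:
        return None
    return match.group(1)


found = extract_token('{"token":"example-token","userName":"admin"}')
missing = extract_token('{"error":"login failed"}')
print(found)
print(missing)
```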

---


In [11]:
print(token_search.group(2))

aAAAAAAAAAAAAAAAAAAAAJhNZSsUcKARKKGovgUSdyFpEnTsu8lDsYT4Gcz8bo7d2q9xYPtZ8t0OO8BJWGpXrHwqDjRRWw1mbOKvs3t8V1F+4nLQCsq5PLqOH3c5Z07OLXXLRIvvTZABcr19XVRS1RKSyKRqgVFXHlSO3COlxfBl/nAXnPZq6Rd2ACvyps+0/3MG+RDux90hugdH+fSaNw==


##### Step 5: Assign the search result to a variable named, 'token' and display the 'token' variable value

---


In [12]:
token = token_search.group(2)
print(token)

aAAAAAAAAAAAAAAAAAAAAJhNZSsUcKARKKGovgUSdyFpEnTsu8lDsYT4Gcz8bo7d2q9xYPtZ8t0OO8BJWGpXrHwqDjRRWw1mbOKvs3t8V1F+4nLQCsq5PLqOH3c5Z07OLXXLRIvvTZABcr19XVRS1RKSyKRqgVFXHlSO3COlxfBl/nAXnPZq6Rd2ACvyps+0/3MG+RDux90hugdH+fSaNw==


### Part 2 Summary

We successfully parsed/extracted the `token` value from the unstructured data object (`login_response.text`) although the process is somewhat of a complicated mess and the code to parse the `token` value is not something we can easily reuse with a different automation platform.
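
A short sketch of why this kind of parser is hard to reuse: the two made-up strings below carry the same data, but the second has a space after each colon and comma, as some systems format their output. The simplified pattern used here (an assumption, not the exact pattern from this walkthrough) matches only the first.

```python
import re

# Simplified pattern that expects no whitespace between the key and the value
strict_pattern = re.compile(r'"token":"(.+?)"')

compact = '{"token":"example-token","userName":"admin"}'
spaced = '{"token": "example-token", "userName": "admin"}'

compact_match = strict_pattern.search(compact)
spaced_match = strict_pattern.search(spaced)

print(compact_match is not None)
print(spaced_match is not None)
```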

---

---

### Part 3: Parse the token from a structured version of the APIC response body

While the previous text parsing example is a bit clumsy, we can easily put the APIC login response in a structured data format and, along the way, make our code much cleaner and more reusable.

---


##### Step 1: Assign a JSON-decoded version of the response body to a variable named, `json_response`

The `requests` library provides an easy-to-use method which transforms a JSON `string` into structured Python data.  Not every REST API supports JSON encoding although the APIC allows us to choose between JSON and XML encoding and, in this example, we told the APIC to send us a JSON-formatted response to our authentication request.

**Note:** Specifically for a Cisco APIC, we define the API response encoding format when we set the `url` variable to `https://sandboxapicdc.cisco.com/api/aaaLogin.json`.  Replacing `.json` with `.xml` would change the response encoding format to XML.

Use the `json` method on the variable which stores the response to our APIC authentication request (`login_response`) and store the resulting value in the new variable, `json_response`.
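
Conceptually, the `json` method does roughly what the standard library's `json.loads` function does when applied to the response body text. The sketch below shows that conversion on a shortened, made-up body so that it runs without a live APIC.

```python
import json

# A shortened, hypothetical response body (the token is a placeholder)
response_text = '{"totalCount":"1","imdata":[{"aaaLogin":{"attributes":{"token":"example-token"}}}]}'

# Decode the JSON text into native Python objects
decoded = json.loads(response_text)

print(type(response_text))
print(type(decoded))
print(decoded["totalCount"])
```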

---


In [13]:
json_response = login_response.json()

##### Step 2: Use the `type` function to display the object type for the JSON version of the APIC login response body (`json_response`)

---


In [14]:
print(type(json_response))

<class 'dict'>


##### Step 3: Display the Python dictionary APIC login response body

Now that the response object from the APIC is a Python dictionary, we can navigate the structured data more programmatically, to extract the `token` value that we need.

---


In [15]:
print(json_response)

{'totalCount': '1', 'imdata': [{'aaaLogin': {'attributes': {'token': 'aAAAAAAAAAAAAAAAAAAAAJhNZSsUcKARKKGovgUSdyFpEnTsu8lDsYT4Gcz8bo7d2q9xYPtZ8t0OO8BJWGpXrHwqDjRRWw1mbOKvs3t8V1F+4nLQCsq5PLqOH3c5Z07OLXXLRIvvTZABcr19XVRS1RKSyKRqgVFXHlSO3COlxfBl/nAXnPZq6Rd2ACvyps+0/3MG+RDux90hugdH+fSaNw==', 'siteFingerprint': 'EQOmXY+ByqJiFvug', 'refreshTimeoutSeconds': '600', 'maximumLifetimeSeconds': '86400', 'guiIdleTimeoutSeconds': '1200', 'restTimeoutSeconds': '90', 'creationTime': '1624530631', 'firstLoginTime': '1624530631', 'userName': 'admin', 'remoteUser': 'false', 'unixUserId': '15374', 'sessionId': 'pYBirbEASIeyL9WnbT39EA==', 'lastName': '', 'firstName': '', 'changePassword': 'no', 'version': '4.1(1k)', 'buildTime': 'Mon May 13 16:27:03 PDT 2019', 'node': 'topology/pod-1/node-1'}, 'children': [{'aaaUserDomain': {'attributes': {'name': 'all', 'rolesR': 'admin', 'rolesW': 'admin'}, 'children': [{'aaaReadRoles': {'attributes': {}}}, {'aaaWriteRoles': {'attributes': {}, 'children': [{'role': {'att

##### Step 4: Use the _Pretty Print_ module to display the structured data in a more readable format

You may be wondering, "Doesn't the structured response from the APIC look just as difficult to read as the unstructured response?"  The answer is yes.  However, since we now have structured data available, we can use the Python _Pretty Print_ (`pprint`) module to make the response a bit easier to read.
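
As an aside, `pprint` is not the only option. A possible alternative, sketched here on a small made-up dictionary, is to re-serialize the data with `json.dumps` and an `indent` value, which produces indented JSON text.

```python
import json

# A small, hypothetical slice of a decoded response
sample = {"imdata": [{"aaaLogin": {"attributes": {"userName": "admin", "version": "4.1(1k)"}}}]}

# indent=2 puts each nested level on its own indented line
formatted = json.dumps(sample, indent=2, sort_keys=True)
print(formatted)
```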

---


In [16]:
from pprint import pprint
pprint(json_response)

{'imdata': [{'aaaLogin': {'attributes': {'buildTime': 'Mon May 13 16:27:03 PDT '
                                                      '2019',
                                         'changePassword': 'no',
                                         'creationTime': '1624530631',
                                         'firstLoginTime': '1624530631',
                                         'firstName': '',
                                         'guiIdleTimeoutSeconds': '1200',
                                         'lastName': '',
                                         'maximumLifetimeSeconds': '86400',
                                         'node': 'topology/pod-1/node-1',
                                         'refreshTimeoutSeconds': '600',
                                         'remoteUser': 'false',
                                         'restTimeoutSeconds': '90',
                                         'sessionId': 'pYBirbEASIeyL9WnbT39EA==',
                     

##### Step 5: Assign the `token` value from the Python dictionary APIC login response body to a variable named, `token`

In the case of the Cisco APIC, the `token` value is a bit complex to parse/extract, because the `token` itself sits several levels below the root of the data structure.  We have to know how to navigate this data structure and parse/extract very specific information.

The next code block gets right to the solution, although **Part 4** provides a walkthrough of the parsing process for this example.  Here's a quick description of the parsing process for the API login response:

```python
json_response                                 # the complete Python object decoded from the JSON response body
['imdata']                                    # the top-level dictionary key
[0]                                           # the first (and only) item in the list stored under the `imdata` key
['aaaLogin']                                  # the next relevant dictionary key in the APIC response
['attributes']                                # the next relevant dictionary key in the APIC response
['token']                                     # the dictionary key which contains the actual APIC token
```
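
A chained lookup like this raises an exception if any level is missing, for example when a login fails. The sketch below shows one defensive way to wrap it. `get_token` is a hypothetical helper, and the shape of the failed-login data is an assumption made for illustration.

```python
def get_token(response_data):
    """Return the token from a decoded login response, or None if it is absent."""
    try:
        return response_data["imdata"][0]["aaaLogin"]["attributes"]["token"]
    except (KeyError, IndexError, TypeError):
        # A missing key, an empty list, or an unexpected type at any level
        return None


# Hypothetical decoded responses (the token and error text are placeholders)
good = {"imdata": [{"aaaLogin": {"attributes": {"token": "example-token"}}}]}
bad = {"imdata": [{"error": {"attributes": {"code": "401", "text": "example failure"}}}]}

print(get_token(good))
print(get_token(bad))
```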

---


In [17]:
token = json_response['imdata'][0]['aaaLogin']['attributes']['token']

##### Step 6: Display the value of the `token` variable

---


In [18]:
print(token)

aAAAAAAAAAAAAAAAAAAAAJhNZSsUcKARKKGovgUSdyFpEnTsu8lDsYT4Gcz8bo7d2q9xYPtZ8t0OO8BJWGpXrHwqDjRRWw1mbOKvs3t8V1F+4nLQCsq5PLqOH3c5Z07OLXXLRIvvTZABcr19XVRS1RKSyKRqgVFXHlSO3COlxfBl/nAXnPZq6Rd2ACvyps+0/3MG+RDux90hugdH+fSaNw==


### Part 3 Summary

We successfully parsed/extracted the `token` value from the structured data object (`json_response`) in a way that doesn't require us to write a custom regular expression text parser.  The code we just wrote is highly reusable because, although it requires a basic understanding of Python objects, it does _not_ require experience with text searching/parsing.

---

---

### Part 4 (optional): Review the process to extract the `token` value from the structured, Python Dictionary APIC response

Here is a quick walkthrough of a methodology to extract nested data from a multi-layer Python object.

**Note:** To make the APIC response easier to visualize as a Python dictionary, several of the subsequent steps use the _Pretty Print_ (`pprint`) function instead of the standard `print` function.

---


##### Step 1: Display the top-level `json_response` dictionary keys, as a starting point

Use the `keys` method of the `json_response` dictionary object to reveal the available top-level dictionary keys.

---


In [19]:
print(json_response.keys())

dict_keys(['totalCount', 'imdata'])


##### Step 2: Determine and display the object type for the `json_response['imdata']` dictionary key

A quick look at the values for both the `totalCount` and `imdata` keys shows us that there isn't much data available in the `totalCount` value.  APIC documentation describes that the `imdata` key will contain the majority of the data in any APIC response.

In order to examine the `imdata` object, we should first determine which type of Python object it is.

---


In [20]:
print(type(json_response['imdata']))

<class 'list'>


##### Step 3: Display the size of the `json_response['imdata']` `list` object

We just learned that the `imdata` object is a Python `list`, so we should determine how long the list is in order to understand whether we need to loop/iterate over several list items.
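
For comparison, here is a minimal sketch of what iterating would look like if a list held more than one item. The `exampleObject` key and the names in this sample are made up for illustration and do not come from a real APIC response.

```python
# Hypothetical decoded response whose list holds two items
sample = {
    "totalCount": "2",
    "imdata": [
        {"exampleObject": {"attributes": {"name": "first"}}},
        {"exampleObject": {"attributes": {"name": "second"}}},
    ],
}

# Check the length first, then loop over every item in the list
item_count = len(sample["imdata"])
names = [item["exampleObject"]["attributes"]["name"] for item in sample["imdata"]]

print(item_count)
print(names)
```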

---


In [21]:
pprint(len(json_response['imdata']))

1


##### Step 4: Display the first item (`json_response['imdata'][0]`) in the `json_response['imdata']` `list` object

We can see from the previous step that the `imdata` list has only one element.  As such we can directly access that element by displaying index 0 of the `imdata` object.

---


In [22]:
pprint(json_response['imdata'][0])

{'aaaLogin': {'attributes': {'buildTime': 'Mon May 13 16:27:03 PDT 2019',
                             'changePassword': 'no',
                             'creationTime': '1624530631',
                             'firstLoginTime': '1624530631',
                             'firstName': '',
                             'guiIdleTimeoutSeconds': '1200',
                             'lastName': '',
                             'maximumLifetimeSeconds': '86400',
                             'node': 'topology/pod-1/node-1',
                             'refreshTimeoutSeconds': '600',
                             'remoteUser': 'false',
                             'restTimeoutSeconds': '90',
                             'sessionId': 'pYBirbEASIeyL9WnbT39EA==',
                             'siteFingerprint': 'EQOmXY+ByqJiFvug',
                             'token': 'aAAAAAAAAAAAAAAAAAAAAJhNZSsUcKARKKGovgUSdyFpEnTsu8lDsYT4Gcz8bo7d2q9xYPtZ8t0OO8BJWGpXrHwqDjRRWw1mbOKvs3t8V1F+4nLQCsq5PLqOH3c5Z07

##### Step 5: Display the value for the `json_response['imdata'][0]['aaaLogin']` dictionary key

From the output in the previous step, we can see the next dictionary key to inspect is the `aaaLogin` key.

---


In [23]:
pprint(json_response['imdata'][0]['aaaLogin'])

{'attributes': {'buildTime': 'Mon May 13 16:27:03 PDT 2019',
                'changePassword': 'no',
                'creationTime': '1624530631',
                'firstLoginTime': '1624530631',
                'firstName': '',
                'guiIdleTimeoutSeconds': '1200',
                'lastName': '',
                'maximumLifetimeSeconds': '86400',
                'node': 'topology/pod-1/node-1',
                'refreshTimeoutSeconds': '600',
                'remoteUser': 'false',
                'restTimeoutSeconds': '90',
                'sessionId': 'pYBirbEASIeyL9WnbT39EA==',
                'siteFingerprint': 'EQOmXY+ByqJiFvug',
                'token': 'aAAAAAAAAAAAAAAAAAAAAJhNZSsUcKARKKGovgUSdyFpEnTsu8lDsYT4Gcz8bo7d2q9xYPtZ8t0OO8BJWGpXrHwqDjRRWw1mbOKvs3t8V1F+4nLQCsq5PLqOH3c5Z07OLXXLRIvvTZABcr19XVRS1RKSyKRqgVFXHlSO3COlxfBl/nAXnPZq6Rd2ACvyps+0/3MG+RDux90hugdH+fSaNw==',
                'unixUserId': '15374',
                'userName': 'admin',
                'version': 

##### Step 6: Display the value for the `json_response['imdata'][0]['aaaLogin']['attributes']` dictionary key

From the output in the previous step, we can see the next dictionary key to inspect is the `attributes` key.

---


In [24]:
pprint(json_response['imdata'][0]['aaaLogin']['attributes'])

{'buildTime': 'Mon May 13 16:27:03 PDT 2019',
 'changePassword': 'no',
 'creationTime': '1624530631',
 'firstLoginTime': '1624530631',
 'firstName': '',
 'guiIdleTimeoutSeconds': '1200',
 'lastName': '',
 'maximumLifetimeSeconds': '86400',
 'node': 'topology/pod-1/node-1',
 'refreshTimeoutSeconds': '600',
 'remoteUser': 'false',
 'restTimeoutSeconds': '90',
 'sessionId': 'pYBirbEASIeyL9WnbT39EA==',
 'siteFingerprint': 'EQOmXY+ByqJiFvug',
 'token': 'aAAAAAAAAAAAAAAAAAAAAJhNZSsUcKARKKGovgUSdyFpEnTsu8lDsYT4Gcz8bo7d2q9xYPtZ8t0OO8BJWGpXrHwqDjRRWw1mbOKvs3t8V1F+4nLQCsq5PLqOH3c5Z07OLXXLRIvvTZABcr19XVRS1RKSyKRqgVFXHlSO3COlxfBl/nAXnPZq6Rd2ACvyps+0/3MG+RDux90hugdH+fSaNw==',
 'unixUserId': '15374',
 'userName': 'admin',
 'version': '4.1(1k)'}


##### Step 7: Display the value for the `json_response['imdata'][0]['aaaLogin']['attributes']['token']` dictionary key

We are down to the last level of this dictionary and we can display and/or use the `token` value from the APIC.

---


In [25]:
pprint(json_response['imdata'][0]['aaaLogin']['attributes']['token'])

'aAAAAAAAAAAAAAAAAAAAAJhNZSsUcKARKKGovgUSdyFpEnTsu8lDsYT4Gcz8bo7d2q9xYPtZ8t0OO8BJWGpXrHwqDjRRWw1mbOKvs3t8V1F+4nLQCsq5PLqOH3c5Z07OLXXLRIvvTZABcr19XVRS1RKSyKRqgVFXHlSO3COlxfBl/nAXnPZq6Rd2ACvyps+0/3MG+RDux90hugdH+fSaNw=='


### Part 4 Summary

Often, during the code development process, it is necessary to step through complex objects, as we just did, in order to find the right syntax to extract the specific data you need.  Fortunately, once you have gone through this process, you can assign the value of a long lookup expression (like `json_response['imdata'][0]['aaaLogin']['attributes']['token']`) to a variable with a much shorter name (like `token`) in order to easily recall and reuse specific, significant data.
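
The manual exploration can also be partly automated. The sketch below defines a hypothetical `describe` helper that reports the type at each level of a nested object, looking only at the first item of any list. The sample data is a shortened, made-up response with a placeholder token.

```python
def describe(obj, depth=0, max_depth=6):
    """Return one line per level describing the type, keys, or length found."""
    indent = "  " * depth
    lines = []
    if isinstance(obj, dict):
        lines.append(f"{indent}dict with keys {list(obj.keys())}")
        if depth < max_depth:
            for value in obj.values():
                lines.extend(describe(value, depth + 1, max_depth))
    elif isinstance(obj, list):
        lines.append(f"{indent}list of {len(obj)} item(s)")
        if depth < max_depth and obj:
            # Only the first item is inspected, to keep the output short
            lines.extend(describe(obj[0], depth + 1, max_depth))
    else:
        lines.append(f"{indent}{type(obj).__name__}")
    return lines


sample = {"totalCount": "1", "imdata": [{"aaaLogin": {"attributes": {"token": "example-token"}}}]}
summary = describe(sample)
print("\n".join(summary))
```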

---
