# DataModelDict Class Demonstration

The DataModelDict class is a utility for working with structured data models.  It handles conversions between equivalent JSON, XML, and Python dictionary representations.  It also has a few methods for checking the data model type and recursively retrieving elements from the model.

Simple examples are presented here to demonstrate the basic functionality of the class.  Full documentation can be found on the [Documentation webpage](Documentation.html).

#### Library imports

In [1]:
#standard Python libraries
import random
import os

#DataModelDict class
#Can be installed to Python with "pip install DataModelDict"
#or downloaded from "https://github.com/usnistgov/DataModelDict"
from DataModelDict import DataModelDict

## 1. Class Basics 

The DataModelDict is a child class of OrderedDict.  As such, it has all the functionality of OrderedDict and more.

Here, we construct a multi-level demonstration data model using lists and DataModelDicts.

In [2]:
#Create an empty DataModelDict
model = DataModelDict()

#Create tiered dictionary for demonstration purposes
model['my-data-model'] = DataModelDict()

model['my-data-model']['name'] = 'Demo'

model['my-data-model']['author'] = 'Me'

model['my-data-model']['process'] = DataModelDict()
model['my-data-model']['process']['Instrument'] = DataModelDict()
model['my-data-model']['process']['Instrument']['Name'] = 'Shiny Thing'
model['my-data-model']['process']['Instrument']['Model'] = 'Newest Most Expensive'
model['my-data-model']['process']['method'] = 'By the book'

#Assign random data for each temperature measurement
model['my-data-model']['measurement'] = []
for temperature in range(0, 2000, 200):
    measurement = DataModelDict([('temperature', DataModelDict([('value', temperature),
                                                                ('unit', 'K')])),
                                 ('length',      DataModelDict([('value', temperature*random.random()/50),
                                                                ('unit', 'm')]))])
    model['my-data-model']['measurement'].append(measurement) 

## 2. Output Conversion

DataModelDict has methods json() and xml() that return the data model as a string in the corresponding format.  Passing a file-like object as fp writes the content to that file instead.

In [3]:
#Save DataModelDict as json file by setting fp = file-like object
with open('model.json', 'w') as f:
    model.json(fp=f)

#Print the DataModelDict as a json string. 
#Setting an indent value adds newlines and line indentations
print(model.json(indent=2))

{
  "my-data-model": {
    "name": "Demo", 
    "author": "Me", 
    "process": {
      "Instrument": {
        "Name": "Shiny Thing", 
        "Model": "Newest Most Expensive"
      }, 
      "method": "By the book"
    }, 
    "measurement": [
      {
        "temperature": {
          "value": 0, 
          "unit": "K"
        }, 
        "length": {
          "value": 0.0, 
          "unit": "m"
        }
      }, 
      {
        "temperature": {
          "value": 200, 
          "unit": "K"
        }, 
        "length": {
          "value": 1.3982209483386643, 
          "unit": "m"
        }
      }, 
      {
        "temperature": {
          "value": 400, 
          "unit": "K"
        }, 
        "length": {
          "value": 2.533088681465685, 
          "unit": "m"
        }
      }, 
      {
        "temperature": {
          "value": 600, 
          "unit": "K"
        }, 
        "length": {
          "value": 8.699617823553178, 
          "unit": "m"
        }
      },
      ...

In [4]:
#Save DataModelDict as xml file by setting fp = file-like object
with open('model.xml', 'w') as f:
    model.xml(fp=f)

#Print the DataModelDict as an xml string. 
#Setting an indent value adds newlines and line indentations
print(model.xml(indent=4))

<?xml version="1.0" encoding="utf-8"?>
<my-data-model>
    <name>Demo</name>
    <author>Me</author>
    <process>
        <Instrument>
            <Name>Shiny Thing</Name>
            <Model>Newest Most Expensive</Model>
        </Instrument>
        <method>By the book</method>
    </process>
    <measurement>
        <temperature>
            <value>0</value>
            <unit>K</unit>
        </temperature>
        <length>
            <value>0.0</value>
            <unit>m</unit>
        </length>
    </measurement>
    <measurement>
        <temperature>
            <value>200</value>
            <unit>K</unit>
        </temperature>
        <length>
            <value>1.3982209483386643</value>
            <unit>m</unit>
        </length>
    </measurement>
    <measurement>
        <temperature>
            <value>400</value>
            <unit>K</unit>
        </temperature>
        <length>
            <value>2.533088681465685</value>
            <unit>m</unit>
        </length>
    </measurement>
    ...

## 3. Loading Data Models

DataModelDict has a load() method that reads in a string or file-like object in either XML or JSON format.  The class initializer also calls load() if the argument is a string or file-like object.

In [5]:
#load from xml file by initializing new model
with open('model.xml') as f:
    model2 = DataModelDict(f)

#test that models are equivalent
print(model.json() == model2.json() and model.xml() == model2.xml())

True


In [6]:
#load from json file by using load()
model2 = DataModelDict()
with open('model.json') as f:
    model2.load(f)
    
#test that models are equivalent
print(model.json() == model2.json() and model.xml() == model2.xml())

True


In [7]:
#load from json string by initializing new model
json_string = model.json()
model2 = DataModelDict(json_string)

#test that models are equivalent
print(model.json() == model2.json() and model.xml() == model2.xml())

True


In [8]:
#load from xml string by using load()
xml_string = model.xml()
model2 = DataModelDict()       
model2.load(xml_string)

#test that models are equivalent
print(model.json() == model2.json() and model.xml() == model2.xml())

True
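The round-trip test used in the cells above can be reproduced without DataModelDict. The following is only a sketch using the standard library json module and OrderedDict; the small `original` dictionary is a stand-in invented for illustration, and nothing here reflects how load() is implemented internally.

```python
import json
from collections import OrderedDict

# Plain-Python stand-in for a small data model (not a DataModelDict)
original = OrderedDict([
    ('my-data-model', OrderedDict([('name', 'Demo'), ('author', 'Me')]))
])

# Serialize to a JSON string, then parse it back while preserving key order
json_string = json.dumps(original)
restored = json.loads(json_string, object_pairs_hook=OrderedDict)

# Same style of equivalence test as above: serialize both and compare
print(json.dumps(restored) == json_string)
```

Comparing the serialized strings, rather than the objects themselves, checks that both content and key order survive the round trip.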


## 4. Finding and Accessing Elements

A number of methods have been added to DataModelDict to assist in finding, accessing, and modifying the various elements and subelements of a data model.

### 4.1 Index with path lists

Normally, accessing or setting a value in a data model built from tiered dictionaries and lists requires chaining one index per level.  This can be tedious and requires that the programmer hard-code the absolute path of any elements of interest.  To improve upon this, values contained in a DataModelDict can be accessed using a _path list_: a single list of indices whose terms can be either dictionary keys or list indices.

In [9]:
#Use indexing to retrieve the instrument name in the standard way
print(model['my-data-model']['process']['Instrument']['Name'])

#Use path list indexing to retrieve the instrument name
path = ['my-data-model', 'process', 'Instrument', 'Name']
print(model[path])

Shiny Thing
Shiny Thing
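To make the idea of a path list concrete, here is a minimal sketch of the same lookup on plain nested dictionaries and lists. The helper name `get_by_path` and the `demo` data are assumptions made for this illustration; this is not DataModelDict's own implementation.

```python
from functools import reduce
import operator

def get_by_path(data, path):
    """Follow each dictionary key or list index in path, starting from data."""
    return reduce(operator.getitem, path, data)

# Small plain-Python stand-in for the demonstration model
demo = {
    'my-data-model': {
        'measurement': [
            {'temperature': {'value': 0, 'unit': 'K'}},
            {'temperature': {'value': 200, 'unit': 'K'}},
        ]
    }
}

# Mixed path: dictionary keys and a list index in one list
print(get_by_path(demo, ['my-data-model', 'measurement', 1, 'temperature', 'value']))
```

Because every step is an ordinary `[]` lookup, one path list can mix dictionary keys and list positions freely.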


### 4.2 Find value(s) with key

If you know the key for an element you are interested in but don't know where it is located in the data model, you can access the element's value using the find() and finds() methods.  find() returns a single value if the search produces a unique result, and raises an error if no match or multiple matches are found.  finds() returns a list of all values that satisfy the search conditions.

In [10]:
#I know the instrument name is the only element with the key Name
print(model.find('Name'))

Shiny Thing


In [11]:
#I want a list of all the measurement elements
measurements = model.finds('measurement')

print("%d measurements found, with first measurement being:" % len(measurements))
print(measurements[0].json(indent=2))

10 measurements found, with first measurement being:
{
  "temperature": {
    "value": 0, 
    "unit": "K"
  }, 
  "length": {
    "value": 0.0, 
    "unit": "m"
  }
}


Both find() and finds() allow for additional search conditions using dictionary arguments _yes_ and _no_.  Any key-value pairs listed in _yes_ must be found in the element in order for it to be considered a match.  If any key-value pairs listed in _no_ are found in the element, then it is rejected.  

In [12]:
#I want only the length from the measurement with temperature equal to 1600
temp = DataModelDict([('value', 1600), ('unit', 'K')])

print(model.find('measurement', yes={'temperature': temp})['length']['value'])

#I want all measurements except for when temperature equals 800
temp = DataModelDict([('value', 800), ('unit', 'K')])

measurements = model.finds('measurement', no={'temperature': temp})
print("%d measurements found that don't have temperature = 800" % len(measurements))

24.0126928722
9 measurements found that don't have temperature = 800
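The search behavior described above can be sketched on plain dictionaries and lists. The helpers `is_match`, `find_all`, and `find_one` below are hypothetical names written for this illustration, and the length values in `demo` are made up; the sketch mirrors the documented behavior of finds() and find() but is not the library's code.

```python
def is_match(element, yes, no):
    """True if element holds every pair in yes and none of the pairs in no."""
    pairs = element if isinstance(element, dict) else {}
    has_all_yes = all(k in pairs and pairs[k] == v for k, v in yes.items())
    has_any_no = any(k in pairs and pairs[k] == v for k, v in no.items())
    return has_all_yes and not has_any_no

def find_all(data, key, yes=None, no=None):
    """Recursively collect every value stored under key."""
    yes, no = yes or {}, no or {}
    matches = []
    if isinstance(data, dict):
        for k, value in data.items():
            if k == key:
                # A list under the key counts as several candidate elements
                candidates = value if isinstance(value, list) else [value]
                matches.extend(c for c in candidates if is_match(c, yes, no))
            # Keep searching deeper regardless of whether this key matched
            matches.extend(find_all(value, key, yes, no))
    elif isinstance(data, list):
        for item in data:
            matches.extend(find_all(item, key, yes, no))
    return matches

def find_one(data, key, yes=None, no=None):
    """Return the single match, or raise if there are zero or several."""
    matches = find_all(data, key, yes, no)
    if len(matches) != 1:
        raise ValueError('expected exactly one match, found %d' % len(matches))
    return matches[0]

# Stand-in data with made-up length values
demo = {'my-data-model': {'name': 'Demo', 'measurement': [
    {'temperature': {'value': 0, 'unit': 'K'}, 'length': {'value': 0.0, 'unit': 'm'}},
    {'temperature': {'value': 200, 'unit': 'K'}, 'length': {'value': 1.5, 'unit': 'm'}},
    {'temperature': {'value': 400, 'unit': 'K'}, 'length': {'value': 2.5, 'unit': 'm'}},
]}}

temp = {'value': 400, 'unit': 'K'}
print(find_one(demo, 'measurement', yes={'temperature': temp})['length']['value'])
print(len(find_all(demo, 'measurement', no={'temperature': temp})))
```

Treating each item of a list as its own candidate is what lets a condition such as `yes={'temperature': temp}` select one measurement out of many.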


### 4.3 Find path(s) with key

Alternatively, if you want to learn the full path to any elements in unknown locations, you can use the path() and paths() methods.  These behave similarly to find() and finds(), but return path lists instead of values.

In [13]:
#I know the instrument name is the only element with the key Name
path = model.path('Name')
print(path)
print(model[path])

['my-data-model', 'process', 'Instrument', 'Name']
Shiny Thing


The path() and paths() methods also accept the _yes_ and _no_ dictionaries as arguments.

In [14]:
#Reuse temp (temperature = 800) as defined in In [12]
measurement_paths = model.paths('measurement', no={'temperature': temp})
for path in measurement_paths:
    print(path)

['my-data-model', 'measurement', 0]
['my-data-model', 'measurement', 1]
['my-data-model', 'measurement', 2]
['my-data-model', 'measurement', 3]
['my-data-model', 'measurement', 5]
['my-data-model', 'measurement', 6]
['my-data-model', 'measurement', 7]
['my-data-model', 'measurement', 8]
['my-data-model', 'measurement', 9]
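A path search can be sketched in the same spirit as a value search: walk the structure and record the keys and indices visited along the way. The generator `find_paths` and the `demo` data below are assumptions for illustration only, and the sketch omits the yes and no conditions for brevity.

```python
def find_paths(data, key, path=()):
    """Yield the path list of every element stored under key."""
    if isinstance(data, dict):
        for k, value in data.items():
            here = path + (k,)
            if k == key:
                if isinstance(value, list):
                    # One path per list item, ending in the item's index
                    for i in range(len(value)):
                        yield list(here + (i,))
                else:
                    yield list(here)
            # Continue searching below this key
            for found in find_paths(value, key, here):
                yield found
    elif isinstance(data, list):
        for i, item in enumerate(data):
            for found in find_paths(item, key, path + (i,)):
                yield found

# Small plain-Python stand-in for the demonstration model
demo = {'my-data-model': {
    'process': {'Instrument': {'Name': 'Shiny Thing'}},
    'measurement': [{'temperature': 0}, {'temperature': 200}],
}}

for found in find_paths(demo, 'measurement'):
    print(found)
```

Each yielded path is an ordinary list, so it can be used directly as a path list index as shown in section 4.1.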


## 5. Treatment of Unbounded Sequences

When converting from XML there is some ambiguity associated with sequences, because a sequence containing one element looks the same as a single value.  The normal parsing method will convert sequences with one element to single values, and sequences with multiple elements to lists.  To help with this, DataModelDict has a couple of methods that allow for the handling of elements that may or may not be lists.
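The ambiguity is easy to reproduce with a toy converter. The sketch below uses the standard library xml.etree.ElementTree module and a hypothetical helper, `element_to_value`, written only for this illustration; it is not the parser that DataModelDict uses.

```python
import xml.etree.ElementTree as ET

def element_to_value(element):
    """Convert an element to its text (leaf) or to a dict of its children."""
    children = list(element)
    if not children:
        return element.text
    result = {}
    for child in children:
        value = element_to_value(child)
        if child.tag not in result:
            result[child.tag] = value                        # first occurrence: single value
        elif isinstance(result[child.tag], list):
            result[child.tag].append(value)                  # third or later occurrence
        else:
            result[child.tag] = [result[child.tag], value]   # second occurrence: becomes a list
    return result

one = element_to_value(ET.fromstring(
    '<model><ordinal>first</ordinal></model>'))
two = element_to_value(ET.fromstring(
    '<model><ordinal>first</ordinal><ordinal>second</ordinal></model>'))

print(one)  # {'ordinal': 'first'}
print(two)  # {'ordinal': ['first', 'second']}
```

The same tag therefore yields a string in one document and a list in another, which is why code that reads the parsed model needs a uniform way to treat both cases.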

The append() method allows for a key-value pair to be added to the DataModelDict.  If the key doesn't already exist, then it is assigned like a regular dictionary.  If the key does exist, the current value is converted into a list if it isn't one and the new value is appended.

The aslist() method returns the value(s) associated with a dictionary key as a list, even if it isn't one.

In [15]:
#Helper that prints the current value and aslist() result for 'ordinal'
def show_ordinal():
    print("model['my-data-model'].get('ordinal', None) -> %s" % model['my-data-model'].get('ordinal', None))
    print("model['my-data-model'].aslist('ordinal') ->    %s" % model['my-data-model'].aslist('ordinal'))
    print('')

#Check element value and aslist before key is assigned
show_ordinal()

#Append values one at a time and check again after each
for value in ['first', 'second', 'third']:
    print("model['my-data-model'].append('ordinal', '%s')" % value)
    model['my-data-model'].append('ordinal', value)
    show_ordinal()

model['my-data-model'].get('ordinal', None) -> None
model['my-data-model'].aslist('ordinal') ->    []

model['my-data-model'].append('ordinal', 'first')
model['my-data-model'].get('ordinal', None) -> first
model['my-data-model'].aslist('ordinal') ->    ['first']

model['my-data-model'].append('ordinal', 'second')
model['my-data-model'].get('ordinal', None) -> ['first', 'second']
model['my-data-model'].aslist('ordinal') ->    ['first', 'second']

model['my-data-model'].append('ordinal', 'third')
model['my-data-model'].get('ordinal', None) -> ['first', 'second', 'third']
model['my-data-model'].aslist('ordinal') ->    ['first', 'second', 'third']
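The behavior shown in the output above can be summarized with two small functions on a plain dictionary. The names `append_value` and `as_list` are hypothetical and chosen for this sketch; they follow the rules described for append() and aslist() but are not the class's own code.

```python
def append_value(data, key, value):
    """Add value under key, promoting an existing single value to a list."""
    if key not in data:
        data[key] = value                   # new key: plain assignment
    elif isinstance(data[key], list):
        data[key].append(value)             # already a list: extend it
    else:
        data[key] = [data[key], value]      # single value: convert to a list

def as_list(data, key):
    """Return the value(s) under key as a list, even if missing or single."""
    if key not in data:
        return []
    value = data[key]
    return value if isinstance(value, list) else [value]

record = {}
print(as_list(record, 'ordinal'))
for word in ['first', 'second', 'third']:
    append_value(record, 'ordinal', word)
    print(as_list(record, 'ordinal'))
```

Reading through `as_list` means the calling code can always loop over the result, whether the key holds zero, one, or many values.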


#### File removal to keep Notebook directory clean.

In [16]:
os.remove('model.json')
os.remove('model.xml')