## Performance MIP examples

### These examples assume that you have [pip installed pyesdoc](https://pypi.python.org/pypi/pyesdoc/) 

### This example just shows how we can build a standalone set of performance descriptions using esdoc. The real thing will use the server ...

## Setup

First import the pyesdoc library, including the classes for the documents that we want to create. 

In [1]:
import glob
import os
import uuid

import pyesdoc
from pyesdoc.ontologies.cim.v2 import DocReference
from pyesdoc.ontologies.cim.v2 import Machine
from pyesdoc.ontologies.cim.v2 import Model
from pyesdoc.ontologies.cim.v2 import Party
from pyesdoc.ontologies.cim.v2 import Performance

Set up empty [CIM](https://github.com/ES-DOC/esdoc-cim/tree/master/v2/schema) documents for the Performance and the Machine. The Model is not created here; it is only referenced by name further down.

In [2]:
performance_of_hadgem3_on_archer = p = \
    pyesdoc.create(Performance, project='cmip6', sub_projects=['cpmip'], institute='ipsl', source='ipython')

archer = a = \
    pyesdoc.create(Machine, project='cmip6', sub_projects=['cpmip'], institute='ipsl', source='ipython')

## Setting details

Add details to the Machine and Performance documents.

In [3]:
a.name         = 'Archer'
a.description  = 'The UK national UK academic HPC platform'
a.model_number = 'XC-30'

In [4]:
p.name                           = 'Our first performance example'
p.resolution                     = 1  # TODO convert resolution to float: 1.8E8
p.complexity                     = 66
p.simulated_years_per_day        = 1.0
p.core_hours_per_simulated_year  = 6504.0
p.coupling_cost                  = 0.15
p.actual_simulated_years_per_day = 0.57
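
For orientation, the throughput and cost figures above are related by simple arithmetic. The sketch below assumes the usual CPMIP definition, in which core hours per simulated year equal the number of cores multiplied by the wall-clock hours needed to simulate one year. The helper names and the derived core count are illustrative assumptions and are not part of pyesdoc.

```python
# Illustrative helpers only; these functions are not part of pyesdoc.

HOURS_PER_DAY = 24.0


def core_hours_per_simulated_year(cores, sypd):
    """Cores multiplied by the wall-clock hours needed for one simulated year."""
    return cores * HOURS_PER_DAY / sypd


def implied_cores(chsy, sypd):
    """Invert the relation above to recover the core count."""
    return chsy * sypd / HOURS_PER_DAY


# Values taken from the performance document above.
sypd = 1.0
chsy = 6504.0
actual_sypd = 0.57

cores = implied_cores(chsy, sypd)
print(cores)  # 271.0

# Share of the ideal throughput that is actually achieved
# (assumed here to reflect queueing and interruptions).
throughput_ratio = actual_sypd / sypd
print(throughput_ratio)
```

Under that assumption the example values correspond to a run on 271 cores that achieves a little over half of its ideal throughput.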

## Setting associations

At this point the performance, model and machine documents are not yet linked to one another.

The next cell creates reference objects that provide those links. A reference created by name must carry the same type and name as its target (here the HadGEM3-GC2 model), so that when the links are eventually resolved the binder can find a document of the right type (Model.type_key) with the right name.

In [5]:
a.vendor = pyesdoc.associate_by_name(a, Party, 'Cray')
p.model = pyesdoc.associate_by_name(p, Model, 'HadGEM3-GC2')
p.platform = pyesdoc.associate(p, a)

In [6]:
assert isinstance(a.vendor, DocReference)
assert isinstance(p.model, DocReference)
assert isinstance(p.platform, DocReference)
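
One way to picture what the binder has to do later is a lookup keyed on whatever the reference carries: an identifier when the target document was in hand, or a type and name when it was not. This is a minimal sketch under that assumption, not pyesdoc's implementation; `Doc`, `Ref` and `resolve` are hypothetical names, and the identifiers are made up.

```python
from collections import namedtuple

# Hypothetical stand-ins for documents and references; not pyesdoc classes.
Doc = namedtuple("Doc", ["id", "type_key", "name"])
Ref = namedtuple("Ref", ["id", "type_key", "name"])


def resolve(ref, docs):
    """Return the document a reference points at, or None if it is not found."""
    for doc in docs:
        # A reference built from a document in hand carries that document's id.
        if ref.id is not None and doc.id == ref.id:
            return doc
        # A reference built by name carries only the target's type and name.
        if ref.id is None and (doc.type_key, doc.name) == (ref.type_key, ref.name):
            return doc
    return None


docs = [
    Doc("doc-1", "cim.2.science.Model", "HadGEM3-GC2"),
    Doc("doc-2", "cim.2.platform.Machine", "Archer"),
]

by_name = Ref(None, "cim.2.science.Model", "HadGEM3-GC2")
by_id = Ref("doc-2", None, None)

print(resolve(by_name, docs).name)  # HadGEM3-GC2
print(resolve(by_id, docs).name)    # Archer
```

A by-name reference whose type or name does not match any known document simply resolves to nothing, which is why the reference has to mirror the target's properties exactly.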

## Validating

In [7]:
if not pyesdoc.is_valid(a):
    for err in pyesdoc.validate(a):
        print(err)

doc.compute_pools --> is an empty list
doc.institution --> is null
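
The two messages above follow a "field --> problem" pattern. As a rough illustration of how such messages could be produced, here is a sketch of a required-field check over a plain dictionary. The function `check_required` and the list of required fields are assumptions made for this example; pyesdoc's actual validation rules may differ.

```python
# Hypothetical checker; the required-field list is an assumption for illustration.
def check_required(doc, required):
    """Return one message per required field that is null or an empty list."""
    errors = []
    for field in required:
        value = doc.get(field)
        if value is None:
            errors.append("doc.{} --> is null".format(field))
        elif isinstance(value, list) and not value:
            errors.append("doc.{} --> is an empty list".format(field))
    return errors


machine = {"name": "Archer", "compute_pools": [], "institution": None}
for err in check_required(machine, ["name", "compute_pools", "institution"]):
    print(err)
```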


In [8]:
if not pyesdoc.is_valid(p):
    for err in pyesdoc.validate(p):
        print(err)

## Serializing

#### JSON

In [9]:
# Encode.
p_as_json = pyesdoc.encode(p, 'json')
print(p_as_json)

{"name": "Our first performance example", "platform": {"version": 0, "meta": {"type": "cim.2.shared.DocReference"}, "id": "5c87e75a-9e4c-413d-9ec7-9b24884fb33d"}, "coreHoursPerSimulatedYear": 6504.0, "actualSimulatedYearsPerDay": 0.57, "simulatedYearsPerDay": 1.0, "complexity": 66, "meta": {"institute": "ipsl", "createDate": "2016-11-24 11:17:15.377929", "subProjects": ["cpmip"], "project": "cmip6", "source": "ipython", "version": 0, "type": "cim.2.platform.Performance", "id": "763e3c5f-b46c-43d9-8de1-227bae35c314"}, "couplingCost": 0.15, "model": {"meta": {"type": "cim.2.shared.DocReference"}, "type": "cim.2.science.Model", "name": "HadGEM3-GC2"}, "resolution": 1}


In [10]:
# Decode.
assert isinstance(pyesdoc.decode(p_as_json, 'json'), Performance)
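
The encoded keys above are camelCase versions of the snake_case Python attribute names (`core_hours_per_simulated_year` becomes `coreHoursPerSimulatedYear`). A minimal sketch of that renaming, assuming a plain split-and-capitalise rule; `to_camel_case` is a hypothetical helper, not a pyesdoc function:

```python
import json


def to_camel_case(name):
    """Convert a snake_case attribute name to a camelCase key."""
    parts = name.split("_")
    return parts[0] + "".join(part.capitalize() for part in parts[1:])


# A few of the attributes set on the performance document earlier.
attributes = {
    "core_hours_per_simulated_year": 6504.0,
    "actual_simulated_years_per_day": 0.57,
    "coupling_cost": 0.15,
}

encoded = json.dumps(
    {to_camel_case(key): value for key, value in attributes.items()},
    sort_keys=True,
)
print(encoded)
```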

#### XML

In [11]:
# Encode.
p_as_xml = pyesdoc.encode(p, 'xml')
print(p_as_xml)

<performance><actualSimulatedYearsPerDay>0.57</actualSimulatedYearsPerDay><complexity>66</complexity><coreHoursPerSimulatedYear>6504.0</coreHoursPerSimulatedYear><couplingCost>0.15</couplingCost><meta><createDate>2016-11-24 11:17:15.377929</createDate><id>763e3c5f-b46c-43d9-8de1-227bae35c314</id><institute>ipsl</institute><project>cmip6</project><source>ipython</source><subProjects><subProject>cpmip</subProject></subProjects><type>cim.2.platform.Performance</type></meta><model><meta><type>cim.2.shared.DocReference</type></meta><name>HadGEM3-GC2</name><type>cim.2.science.Model</type></model><name>Our first performance example</name><platform><id>5c87e75a-9e4c-413d-9ec7-9b24884fb33d</id><meta><type>cim.2.shared.DocReference</type></meta></platform><resolution>1</resolution><simulatedYearsPerDay>1.0</simulatedYearsPerDay></performance>


In [12]:
# Decode.
assert isinstance(pyesdoc.decode(p_as_xml, 'xml'), Performance)

#### HTML

In [13]:
# HTML encoding is not supported yet (it is planned), so encode currently returns None.
assert pyesdoc.encode(p, 'html') is None

## I/O - Write

Write the Performance and Machine documents to the local file system in JSON format. pyesdoc needs to be given a target directory.

In [14]:
# Initialise I/O directory.
io_dir = os.path.join(os.path.expanduser('~'), 'tmp', 'esdoc')
if not os.path.isdir(io_dir):
    os.makedirs(io_dir)  # also creates ~/tmp if it does not exist yet

# Delete previously created files.
for fpath in glob.glob("{}/*".format(io_dir)):
    os.remove(fpath)

In [15]:
# Write performance document to local disk.
p_fpath = pyesdoc.write(p, io_dir)
assert os.path.exists(p_fpath)

# Write machine document to local disk.
a_fpath = pyesdoc.write(a, io_dir)
assert os.path.exists(a_fpath)

Have a look at the performance document on local disk:

In [16]:
!cat $p_fpath

{"name": "Our first performance example", "platform": {"version": 0, "meta": {"type": "cim.2.shared.DocReference"}, "id": "5c87e75a-9e4c-413d-9ec7-9b24884fb33d"}, "coreHoursPerSimulatedYear": 6504.0, "actualSimulatedYearsPerDay": 0.57, "simulatedYearsPerDay": 1.0, "complexity": 66, "meta": {"institute": "ipsl", "createDate": "2016-11-24 11:17:15.377929", "subProjects": ["cpmip"], "project": "cmip6", "source": "ipython", "version": 0, "type": "cim.2.platform.Performance", "id": "763e3c5f-b46c-43d9-8de1-227bae35c314"}, "couplingCost": 0.15, "model": {"meta": {"type": "cim.2.shared.DocReference"}, "type": "cim.2.science.Model", "name": "HadGEM3-GC2"}, "resolution": 1}

Have a look at the machine document on local disk:

In [17]:
!cat $a_fpath

{"modelNumber": "XC-30", "meta": {"institute": "ipsl", "createDate": "2016-11-24 11:17:15.378146", "subProjects": ["cpmip"], "project": "cmip6", "source": "ipython", "version": 0, "type": "cim.2.platform.Machine", "id": "5c87e75a-9e4c-413d-9ec7-9b24884fb33d"}, "vendor": {"meta": {"type": "cim.2.shared.DocReference"}, "type": "cim.2.shared.Party", "name": "Cray"}, "description": "The UK national UK academic HPC platform", "name": "Archer"}

## I/O - Read

In [18]:
doc = pyesdoc.read(p_fpath)
assert isinstance(doc, Performance)

In [19]:
doc = pyesdoc.read(a_fpath)
assert isinstance(doc, Machine)

## I/O - Seek

#### Simulate updating the documents

In [20]:
# Store identifier/version for later.
p_id = p.meta.id
p_version = p.meta.version

# Simulate adding another 10 performance documents with different versions. 
for i in range(10):
    p.meta.version += 1
    pyesdoc.write(p, io_dir)

# Simulate adding another 10 performance documents with different identifiers. 
p.meta.version = 0
for i in range(10):
    p.meta.id = unicode(uuid.uuid4())
    pyesdoc.write(p, io_dir)

#### Seek local disk

In [21]:
# Get documents - all.
docs = pyesdoc.seek(io_dir)
assert len(docs) == 22

In [22]:
# Get documents - latest.
docs = pyesdoc.seek(io_dir, latest=True)
assert len(docs) == 12

In [23]:
# Get documents by type - all.
docs = pyesdoc.seek(io_dir, Performance)
assert len(docs) == 21
for doc in docs:
    assert isinstance(doc, Performance)

In [24]:
# Get documents by type - latest.
docs = pyesdoc.seek(io_dir, Performance, latest=True)
assert len(docs) == 11
for doc in docs:
    assert isinstance(doc, Performance)

In [25]:
# Get a document - all versions.
docs = pyesdoc.seek(io_dir, p_id)
assert len(docs) == 11
for doc in docs:
    assert doc.meta.id == p_id

In [26]:
# Get a document - latest version.
doc = pyesdoc.seek(io_dir, p_id, latest=True)
assert doc.meta.id == p_id
assert doc.meta.version == 10

In [27]:
# Get a document - specific version (note tuple usage).
doc = pyesdoc.seek(io_dir, (p_id, p_version))
assert doc is not None
assert doc.meta.id == p_id
assert doc.meta.version == p_version
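
The counts asserted in the seek cells follow from what was written to disk: one machine document, versions 0 to 10 of the original performance document, and ten further performance documents at version 0. The sketch below reproduces that bookkeeping with an in-memory list. `Entry` and `latest_only` are hypothetical stand-ins for illustration and say nothing about how pyesdoc stores or scans files.

```python
from collections import namedtuple

# Hypothetical in-memory stand-in for the files on disk; not pyesdoc's storage.
Entry = namedtuple("Entry", ["doc_type", "doc_id", "version"])


def latest_only(entries):
    """Keep the highest version of each document identifier."""
    newest = {}
    for entry in entries:
        current = newest.get(entry.doc_id)
        if current is None or entry.version > current.version:
            newest[entry.doc_id] = entry
    return list(newest.values())


# One machine, versions 0 to 10 of the original performance document,
# and ten further performance documents at version 0.
entries = [Entry("Machine", "machine", 0)]
entries += [Entry("Performance", "p", version) for version in range(11)]
entries += [Entry("Performance", "p-copy-{}".format(i), 0) for i in range(10)]

performances = [e for e in entries if e.doc_type == "Performance"]

print(len(entries))                    # 22
print(len(latest_only(entries)))       # 12
print(len(performances))               # 21
print(len(latest_only(performances)))  # 11
```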