## Quick start: using the cimxml parser

In [6]:
import cim_plugin   # Required to register the cimxml parser
from rdflib import Literal
from cim_plugin.utilities import collect_cimxml_to_dataset
from pathlib import Path


### Parse the files and print the identifier for each graph

The graphs in the files are collected in a Python object called a Dataset, where each graph is a separate named graph. The namespaces are collected into the Dataset's namespace_manager, but each graph also keeps its own namespace_manager. Namespace conflicts are normalized, so it is recommended not to parse multiple graphs that share a prefix bound to different namespaces, or share a namespace bound to different prefixes.

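Each identifier is taken from either md:FullModel or dcat:Dataset in the graph itself. The last identifier in the output, urn:x-rdflib:default, belongs to the default graph that rdflib adds to every Dataset.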
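The recommendation about prefixes and namespaces can be checked before parsing. The following is a minimal sketch using only the standard library; `read_prefixes` and `find_conflicts` are hypothetical helpers, not part of cim_plugin, and the two sample documents are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict


def read_prefixes(source):
    """Return {prefix: namespace} as declared in an XML file or file-like object."""
    # With events=("start-ns",), iterparse yields ("start-ns", (prefix, uri)).
    return {prefix: uri for _, (prefix, uri) in ET.iterparse(source, events=("start-ns",))}


def find_conflicts(prefix_maps):
    """Find prefixes bound to several namespaces and namespaces bound to several prefixes."""
    uris_by_prefix = defaultdict(set)
    prefixes_by_uri = defaultdict(set)
    for mapping in prefix_maps:
        for prefix, uri in mapping.items():
            uris_by_prefix[prefix].add(uri)
            prefixes_by_uri[uri].add(prefix)
    clashing_prefixes = {p: sorted(u) for p, u in uris_by_prefix.items() if len(u) > 1}
    clashing_namespaces = {u: sorted(p) for u, p in prefixes_by_uri.items() if len(p) > 1}
    return clashing_prefixes, clashing_namespaces


# Made-up sample documents: the same prefix is bound to two different namespaces.
doc_a = b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:cim="http://iec.ch/TC57/CIM100#"/>'
doc_b = b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:cim="https://cim.ucaiug.io/ns#"/>'

prefix_report, namespace_report = find_conflicts(
    [read_prefixes(io.BytesIO(doc_a)), read_prefixes(io.BytesIO(doc_b))]
)
print(prefix_report)
print(namespace_report)
```

`read_prefixes` also accepts a file path, so the same check could be run on the files parsed in this notebook before passing them to `collect_cimxml_to_dataset`.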

In [7]:
file = "../Nordic44/instances/Grid/cimxml/Nordic44-HV_EQ.xml"
file2 = "../Nordic44/instances/Grid/cimxml/Nordic44-HV_GL.xml"
linkmlfile = "../CoreEquipment.linkml.yaml"

ds = collect_cimxml_to_dataset([file, file2], linkmlfile)

for g in ds.graphs():
    print("Identifier: ", g.identifier)

# The log messages below warn that the dc namespace in the linkML file differs from the one linkML stores as default for that prefix.
# The namespace in the file takes precedence over the default.

dc namespace is already mapped to http://purl.org/dc/terms/ - Overriding with mapping to http://purl.org/dc/elements/1.1/
dc namespace is already mapped to http://purl.org/dc/terms/ - Overriding with mapping to http://purl.org/dc/elements/1.1/


Identifier:  urn:uuid:e710212f-f6b2-8d4c-9dc0-365398d8b59c
Identifier:  urn:uuid:167b4832-27c5-ff4f-bd26-6ce3bff1bdb7
Identifier:  urn:x-rdflib:default


### Handling individual graphs

Extract a specific graph from the dataset using its identifier and print the first five triples the iterator returns (the order is arbitrary). For Literal objects, the datatype is printed as well.

In [8]:
g1 = ds.graph("urn:uuid:e710212f-f6b2-8d4c-9dc0-365398d8b59c")

count = 0
for s, p, o in g1:
    print(s, p, o)
    if isinstance(o, Literal):
        print(o.datatype)

    count += 1
    if count == 5:
        break

urn:uuid:f1769cba-9aeb-11e5-91da-b8763fd99c5f https://cim.ucaiug.io/ns#OperationalLimitSet.Terminal urn:uuid:2dd90402-bdfb-11e5-94fa-c8f73332c8f4
urn:uuid:f1769bcc-9aeb-11e5-91da-b8763fd99c5f https://cim.ucaiug.io/ns#IdentifiedObject.mRID f1769bcc-9aeb-11e5-91da-b8763fd99c5f
http://www.w3.org/2001/XMLSchema#string
urn:uuid:f1769c7f-9aeb-11e5-91da-b8763fd99c5f http://www.w3.org/1999/02/22-rdf-syntax-ns#type https://cim.ucaiug.io/ns#CurrentLimit
urn:uuid:f1769722-9aeb-11e5-91da-b8763fd99c5f https://cim.ucaiug.io/ns#EnergyConsumer.LoadResponse urn:uuid:f1769759-9aeb-11e5-91da-b8763fd99c5f
urn:uuid:f1769a74-9aeb-11e5-91da-b8763fd99c5f https://cim.ucaiug.io/ns#IdentifiedObject.name Limits <Default>
http://www.w3.org/2001/XMLSchema#string


### Parsing without a model

If no linkML file is provided, the graph file is parsed as if it were a plain RDF/XML file.

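Notice how these triples differ from the ones above: subjects and objects are file-based URIs ending in #_<id> instead of urn:uuid identifiers, the CIM namespace is http://iec.ch/TC57/CIM100# instead of https://cim.ucaiug.io/ns#, and literals carry no datatype (None is printed).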

In [9]:
wt = collect_cimxml_to_dataset([file])

count = 0
for s, p, o in list(wt.graphs())[0]:
    print(s, p, o)
    if isinstance(o, Literal):
        print(o.datatype)

    count += 1
    if count == 5:
        break

Cannot perform post processing without the model. Data parsed as RDF/XML.


file:///home/christha/Nordic44/instances/Grid/cimxml/Nordic44-HV_EQ.xml#_f1769b99-9aeb-11e5-91da-b8763fd99c5f http://iec.ch/TC57/CIM100#OperationalLimit.OperationalLimitSet file:///home/christha/Nordic44/instances/Grid/cimxml/Nordic44-HV_EQ.xml#_f1769b98-9aeb-11e5-91da-b8763fd99c5f
file:///home/christha/Nordic44/instances/Grid/cimxml/Nordic44-HV_EQ.xml#_f1769e97-9aeb-11e5-91da-b8763fd99c5f http://iec.ch/TC57/CIM100#OperationalLimit.OperationalLimitType file:///home/christha/Nordic44/instances/Grid/cimxml/Nordic44-HV_EQ.xml#_f1769a42-9aeb-11e5-91da-b8763fd99c5f
file:///home/christha/Nordic44/instances/Grid/cimxml/Nordic44-HV_EQ.xml#_f1769dd4-9aeb-11e5-91da-b8763fd99c5f http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://iec.ch/TC57/CIM100#CurrentLimit
file:///home/christha/Nordic44/instances/Grid/cimxml/Nordic44-HV_EQ.xml#_f1769cda-9aeb-11e5-91da-b8763fd99c5f http://iec.ch/TC57/CIM100#IdentifiedObject.name RATEB
None
file:///home/christha/Nordic44/instances/Grid/cimxml/Nordic44-HV_EQ
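To make the difference in identifiers concrete, the sketch below shows how a file-based URI of the form seen in this output relates to the urn:uuid form seen when a model is supplied. This only illustrates the two shapes; `fragment_to_urn` is a hypothetical helper, and nothing here is taken from how cim_plugin actually performs its post processing.

```python
import uuid
from urllib.parse import urldefrag


def fragment_to_urn(uri):
    """Map '<base>#_<uuid>' to 'urn:uuid:<uuid>'; return other URIs unchanged."""
    _, fragment = urldefrag(uri)
    if not fragment.startswith("_"):
        return uri
    candidate = fragment[1:]
    try:
        uuid.UUID(candidate)  # only rewrite fragments that really are UUIDs
    except ValueError:
        return uri
    return "urn:uuid:" + candidate


subject = (
    "file:///home/christha/Nordic44/instances/Grid/cimxml/Nordic44-HV_EQ.xml"
    "#_f1769b99-9aeb-11e5-91da-b8763fd99c5f"
)
print(fragment_to_urn(subject))
print(fragment_to_urn("http://iec.ch/TC57/CIM100#CurrentLimit"))
```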

### Serializing

To write to file, use the standard rdflib serializer. To serialize a single graph to an RDF/XML file, use format='xml'. To serialize all the graphs in the Dataset as separate named graphs in one file, use format='trig', which produces a TriG file.

In [10]:
output_file = Path.cwd().parent / "test_graphs.trig"
ds.serialize(destination=str(output_file), format="trig")


output_file = Path.cwd().parent / "test_graph.xml"
g1.serialize(destination=str(output_file), format="xml")


<Graph identifier=urn:uuid:e710212f-f6b2-8d4c-9dc0-365398d8b59c (<class 'rdflib.graph.Graph'>)>