# Serialization Builder

After an ontology model has been developed, we need a way to load data that conforms to it.

This load process is usually fed by flat or tree-like data files, such as CSV, XML or JSON documents.

A serialization helps marshal structured data of this kind into the graph format defined by the ontology.

The serialization has a name, and contains multiple mappings. 

Each mapping links named attributes in the input file to their equivalent ontological markup. Some mappings describe entities of a given type, others describe relations between those entities, and others describe simple data properties that those entities may have.
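To make the three kinds of mapping concrete, here is a minimal sketch that sorts mapping entries by their shape. The field names (`label`, `domain`, `range`, `target`) follow the sample configuration shown later in this notebook; the `classify_mapping` helper and its rule are assumptions for illustration and are not part of `serialization_builder`.

```python
SAMPLE = "http://www.semanticweb.org/tomk/ontologies/2023/6/sample#"


def classify_mapping(mapping, object_properties, data_properties):
    """Hypothetical helper: name the kind of one mapping entry."""
    if "label" in mapping:
        # An entry with a 'label' names the input column that holds entities.
        return "entity"
    if "domain" in mapping and "range" in mapping:
        # Entries with a domain and a range link two input columns.
        if mapping["target"] in data_properties:
            return "data-property"
        if mapping["target"] in object_properties:
            return "relation"
    return "other"


mappings = [
    {"mapping_name": "Class_Mapping", "label": "Class",
     "target": SAMPLE + "someClass"},
    {"mapping_name": "Property_Mapping", "domain": "Class",
     "range": "Property", "target": SAMPLE + "someProperty"},
    {"mapping_name": "DataProperty_Mapping", "domain": "Class",
     "range": "DataProperty", "target": SAMPLE + "someDataProperty"},
]

kinds = {
    m["mapping_name"]: classify_mapping(
        m,
        object_properties={SAMPLE + "someProperty"},
        data_properties={SAMPLE + "someDataProperty"},
    )
    for m in mappings
}
print(kinds)
```

Deciding between a relation and a data property by looking up the target in a declared list is only one option; the real builder may consult the ontology directly.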



In [1]:

# Reset and start again from here.
# Use the JSON-sourced data file to construct the mapping details:
# 1. Load the config from the JSON serialisation file.
# 2. Validate sections of the serialisation file against the ontology.
# 3. Build the serialisation RDF file.



import jsonschema
import json

#import owlready2 as owlr

import xml.etree.ElementTree as ET
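Step 2 above could begin with a cheap consistency check before the ontology is consulted at all. The sketch below is an assumption about what such a check might look like: it only compares each mapping against the lists declared in the same config file (key names taken from the sample config shown in a later cell), it does not load the ontology, and `check_config` is a hypothetical name. The `example.org` IRIs are made up.

```python
def check_config(config):
    """Hypothetical helper: list problems found in a serialization config."""
    problems = []

    # Every IRI the config declares as a valid mapping target.
    declared = set()
    for key in ("targetClasses", "targetProperties",
                "targetDataProperties", "targetStaticProperties"):
        declared.update(config.get(key, []))

    headers = set(config.get("source_headers", []))

    for mapping in config.get("serialization_mappings", []):
        name = mapping.get("mapping_name", "<unnamed>")
        if mapping.get("target") not in declared:
            problems.append(f"{name}: target is not declared")
        # Fields that should name a column of the input file.
        for field in ("label", "parent_label", "domain", "range"):
            value = mapping.get(field)
            if value is not None and value not in headers:
                problems.append(
                    f"{name}: {field} '{value}' is not a source header")
    return problems


config = {
    "targetClasses": ["http://example.org/onto#someClass"],
    "source_headers": ["Class"],
    "serialization_mappings": [
        {"mapping_name": "Class_Mapping", "label": "Class",
         "target": "http://example.org/onto#someClass"},
        {"mapping_name": "Broken_Mapping", "label": "Missing",
         "target": "http://example.org/onto#other"},
    ],
}

problems = check_config(config)
for problem in problems:
    print(problem)
```

Returning a list of messages rather than raising on the first problem lets all issues in a config file be reported in one pass.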



In [2]:
import os
import sys

# Make the parent directory importable so serialization_builder can be found.
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)

import serialization_builder as s_b



In [3]:
schemafilename = '/home/tomk/Documents/Coding/gitHub/datamodels/sample_ser.json'

t = s_b.process_json_serialization(schemafilename)

[('subclasses', {'a': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#SubClassA', 'b': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#SubClassB', 'c': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#SubClassC'}), ('testcontents', {'1': 'X', '2': 'Y', '3': 'Z'})]
('subclasses', {'a': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#SubClassA', 'b': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#SubClassB', 'c': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#SubClassC'})
////////////////////
{'id': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#Sample_Serialisationa37eebd65031413b8757e0739b12f179', 'type': 'http://www.semanticweb.org/tomk/ontologies/2022/11/serialization#TranslationMapping', 'label': 'subclasses', 'kvpairs': [('http://www.semanticweb.org/tomk/ontologies/2023/6/sample#Sample_Serialisation1d67489dc7884a4288f98a45020340bd', 'http://www.semanticweb.org/tomk/ontologies/2022/11/serialization#MappingKVPair

In [4]:
j_data = {
    '$schema': 'serialisation_schema.json',
    'serialization_iri': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#Sample_Serialisation',
    'serialization_label': 'Sample Serialisation',
    'targetOntology': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample',
    'targetClasses': [
        'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#someClass',
        'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#SubClassA',
        'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#SubClassB',
        'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#SubClassC',
    ],
    'targetProperties': ['http://www.semanticweb.org/tomk/ontologies/2023/6/sample#someProperty'],
    'targetDataProperties': ['http://www.semanticweb.org/tomk/ontologies/2023/6/sample#someDataProperty'],
    'targetStaticProperties': ['http://www.w3.org/1999/02/22-rdf-syntax-ns#type'],
    'source_headers': ['ParentClass', 'Class', 'Property', 'DataProperty', 'SubClassPointer'],
    'translation_mappings': {
        'subclasses': {
            'key0': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#SubClassA',
            'key1': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#SubClassB',
            'key2': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#SubClassC',
        },
    },
    'serialization_mappings': [
        {'mapping_name': 'Parent_Class_Mapping', 'label': 'ParentClass',
         'target': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#someClass'},
        {'mapping_name': 'Class_Mapping', 'label': 'Class',
         'target': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#someClass',
         'parent_label': 'ParentClass'},
        {'mapping_name': 'SubClassProperty_Mapping', 'domain': 'Class',
         'target': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
         'range': 'SubClassPointer', 'translationMapping': 'subclasses'},
        {'mapping_name': 'Property_Mapping',
         'target': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#someProperty',
         'domain': 'Class', 'range': 'Property'},
        {'mapping_name': 'DataProperty_Mapping',
         'target': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#someDataProperty',
         'domain': 'Class', 'range': 'DataProperty'},
    ],
}
# Print each named translation mapping with its key-to-IRI table.
for k, v in j_data["translation_mappings"].items():
    print(k, v)

subclasses {'key0': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#SubClassA', 'key1': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#SubClassB', 'key2': 'http://www.semanticweb.org/tomk/ontologies/2023/6/sample#SubClassC'}
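One plausible reading of a translation mapping is a lookup table: a raw value found in the input column is replaced by the IRI it stands for before the triple is emitted. The sketch below illustrates that reading for the `SubClassProperty_Mapping` entry. The `translate` helper and the input row are made up for illustration, and the real builder may resolve these differently.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SAMPLE = "http://www.semanticweb.org/tomk/ontologies/2023/6/sample#"

translation_mappings = {
    "subclasses": {
        "key0": SAMPLE + "SubClassA",
        "key1": SAMPLE + "SubClassB",
        "key2": SAMPLE + "SubClassC",
    }
}

mapping = {
    "mapping_name": "SubClassProperty_Mapping",
    "domain": "Class",
    "target": RDF_TYPE,
    "range": "SubClassPointer",
    "translationMapping": "subclasses",
}


def translate(mapping, row, translation_mappings):
    """Hypothetical helper: build one (subject, predicate, object) tuple."""
    raw = row[mapping["range"]]
    # Use the named lookup table if the mapping refers to one.
    table = translation_mappings.get(mapping.get("translationMapping"), {})
    # Values without a table entry pass through unchanged.
    return (row[mapping["domain"]], mapping["target"], table.get(raw, raw))


row = {"Class": "thing-1", "SubClassPointer": "key1"}  # made-up input row
triple = translate(mapping, row, translation_mappings)
print(triple)
```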


In [5]:
# t is a single XML string, so write it in one call.
with open("../sample_ser.rdf", "w", encoding="utf-8") as f:
    f.write(t)
    
print(t)

<?xml version='1.0' encoding='utf-8'?>
<!--
        Sample Schema Name
        -->
<rdf:RDF xmlns="http://www.w3.org/2002/07/owl#" xml:base="http://www.w3.org/2002/07/owl" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:ser="http://www.semanticweb.org/tomk/ontologies/2022/11/serialization#">
	<!--   
    ///////////////////////////////////////////////////////////////////////////////////////
    //
    // Serialization - define the named Serialization Object and assign the set of 
    // mappings that belong to that object.
    //
    ///////////////////////////////////////////////////////////////////////////////////////-->
	<NamedIndividual rdf:about="http://www.semanticweb.org/tomk/ontologies/2023/6/sample#Sample_Serialisation">
		<rdf:type rdf:resource="http://www.semanticweb.org/tomk/ontologies/2022/11/serialization#Serialization" />
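For readers who want to see how an element like the `NamedIndividual` above can be produced, here is a minimal sketch using `xml.etree.ElementTree`, which this notebook already imports. It is not the code inside `serialization_builder`; the variable names are assumptions, and the output uses an `owl:` prefix where the file above uses a default namespace.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
SER = "http://www.semanticweb.org/tomk/ontologies/2022/11/serialization#"
SAMPLE = "http://www.semanticweb.org/tomk/ontologies/2023/6/sample#"

# Ask ElementTree to use readable prefixes instead of ns0, ns1, ...
ET.register_namespace("rdf", RDF)
ET.register_namespace("owl", OWL)

root = ET.Element(f"{{{RDF}}}RDF")

# The named serialization object itself.
individual = ET.SubElement(
    root,
    f"{{{OWL}}}NamedIndividual",
    {f"{{{RDF}}}about": SAMPLE + "Sample_Serialisation"},
)

# Declare the individual to be of type ser:Serialization.
ET.SubElement(
    individual,
    f"{{{RDF}}}type",
    {f"{{{RDF}}}resource": SER + "Serialization"},
)

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```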

In [6]:
from rdflib import URIRef, Literal, Graph, Namespace

# Expand the "ser:" prefix by string replacement on the N3 form of the term.
ns_tuple = ("ser", "http://www.semanticweb.org/tomk/ontologies/2022/11/serialization#")
URIRef("ser:test").n3().replace(ns_tuple[0] + ":", ns_tuple[1])

'<http://www.semanticweb.org/tomk/ontologies/2022/11/serialization#test>'
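The string replacement above would rewrite every occurrence of `ser:` in the term, not only the leading prefix. A minimal plain-Python sketch that splits on the first colon instead is shown below; `expand_curie` is a hypothetical helper, not an rdflib function.

```python
def expand_curie(curie, prefixes):
    """Hypothetical helper: expand 'prefix:local' using a prefix table."""
    prefix, sep, local = curie.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    # Unknown prefixes (including full IRIs such as 'http://...') pass through.
    return curie


prefixes = {
    "ser": "http://www.semanticweb.org/tomk/ontologies/2022/11/serialization#",
}

expanded = expand_curie("ser:test", prefixes)
print(expanded)
```

Because only the text before the first colon is treated as a prefix, a local name that itself contains `ser:` is left alone.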

In [7]:
schemafilename = '/home/tomk/Documents/Coding/gitHub/datamodels/DMEAR_ser.json'

t = s_b.process_json_serialization(schemafilename)

# t is expected to be a single XML string, so write it in one call.
with open("../DMEAR_ser.rdf", "w", encoding="utf-8") as f:
    f.write(t)
    
print(t)

[]
[<Element <function Comment at 0x7fb7784693f0> at 0x7fb77814a610>, <Element 'NamedIndividual' at 0x7fb77814a700>, <Element <function Comment at 0x7fb7784693f0> at 0x7fb778112fc0>, <Element 'Class' at 0x7fb778138220>, <Element 'Class' at 0x7fb778138270>, <Element 'Class' at 0x7fb7781382c0>, <Element 'Class' at 0x7fb778138310>, <Element 'Class' at 0x7fb778138360>, <Element 'Class' at 0x7fb7781383b0>, <Element 'Class' at 0x7fb778138400>, <Element 'Class' at 0x7fb778138450>, <Element <function Comment at 0x7fb7784693f0> at 0x7fb7781384a0>, <Element 'AnnotationProperty' at 0x7fb778138810>, <Element 'AnnotationProperty' at 0x7fb778138860>, <Element 'AnnotationProperty' at 0x7fb7781388b0>, <Element 'AnnotationProperty' at 0x7fb778138900>, <Element 'AnnotationProperty' at 0x7fb778138950>, <Element 'AnnotationProperty' at 0x7fb7781389a0>, <Element 'AnnotationProperty' at 0x7fb7781389f0>, <Element 'AnnotationProperty' at 0x7fb778138a40>, <Element 'AnnotationProperty' at 0x7fb778138a90>, <Elem