# Harmonizer Demo: converting JSON files into RDF files  
### Data from Helexia - Energis

### Import libraries

In [1]:
import os
import json
import pathlib
import glob

## Table of Contents

1. [Aim](#1.-Aim)  

2. [Arguments of the module](#2.-Arguments-of-the-module)  

3. [Examples of use](#3.-Examples-of-use)  
 

## 1. Aim  
  
The aim of this module is to convert files containing human-readable data into a machine-readable file that can be used to build an ontology.
The main idea is to transform a **JSON** file into an **RDF/Turtle** file by using a previously created mapping file (**RML**).
A second module can then modify or insert data in the generated **RDF** file by running **SPARQL queries** against the resulting graph.  

![scheme](documentation/scheme_harmonizer.png)
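
To make the JSON-to-Turtle idea concrete, here is a hand-written sketch of what the conversion produces for a single building. It is only an illustration: the real tool drives the conversion from an RML mapping file, not from a Python function, and the input keys `id` and `name` are hypothetical. The predicates `ns1:buildingIdFromOrganization` and `ns1:buildingName` are the ones visible in the example output of section 3.

```python
import json

BASE = "http://bigg-project.eu/"  # namespace used in the demo output


def building_to_turtle(record: dict) -> str:
    """Render one building record as Turtle (illustration only, not RML)."""
    subject = f"<{BASE}instances/building_{record['id']}>"
    return "\n".join([
        f"@prefix ns1: <{BASE}> .",
        "",
        f"{subject} a ns1:Building ;",
        # json.dumps adds the double quotes and escapes special characters
        f"    ns1:buildingIdFromOrganization {json.dumps(str(record['id']))} ;",
        f"    ns1:buildingName {json.dumps(record['name'])} .",
    ])


# Hypothetical input record; the real JSON files have their own structure
record = json.loads('{"id": 58, "name": "Interamerican"}')
print(building_to_turtle(record))
```

An RML mapping expresses the same subject and predicate templates declaratively, so the mapping can change without touching the tool's code.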

## 2. Arguments of the module  

The harmonizer tool contains two modules:  
* The conversion  
* The SPARQL stage  

To activate a module, specify its arguments when calling the tool.
The basic command line to run the Python tool is:  

&nbsp; **python** &nbsp; $\color{blue}{harmonizer.py}$ &nbsp; $\color{red}{--input}$ &nbsp; $\color{red}{inputFile}$ &nbsp; $\color{orange}{[--mapping}$ &nbsp; $\color{orange}{RMLFile]}$ &nbsp; $\color{green}{[--sparql}$ &nbsp; $\color{green}{SparqlFiles]}$ &nbsp; $\color{purple}{[--output}$ &nbsp; $\color{purple}{outputFilename]}$    
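
The bracket notation above suggests that `--input` is mandatory and the other three arguments are optional. As a rough sketch, such a command line could be declared with `argparse` as follows. This is an assumption about the interface, not the actual code of `harmonizer.py`; in particular, accepting several files after `--sparql` is only a guess based on the plural placeholder `SparqlFiles`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented command line."""
    parser = argparse.ArgumentParser(prog="harmonizer.py")
    # --input is the only argument shown without brackets, so it is required here
    parser.add_argument("--input", required=True, help="file to convert")
    # Bracketed arguments are optional; supplying one switches its stage on
    parser.add_argument("--mapping", help="RML mapping file (conversion stage)")
    parser.add_argument("--sparql", nargs="+", help="SPARQL query file(s) (SPARQL stage)")
    parser.add_argument("--output", help="name of the output file")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere
args = build_parser().parse_args(
    ["--input", "data.json", "--mapping", "data.rml", "--output", "data.ttl"]
)
print(args.input, args.mapping, args.sparql, args.output)
# prints: data.json data.rml None data.ttl
```

With this layout, an omitted optional argument is simply `None`, which gives the tool an easy way to decide which stage to skip.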



## 3. Examples of use

The following example uses the data from Helexia-Energis: 5 **JSON** files, 5 **RML** files and some **SPARQL** queries.  
You will find them in the **data/Demo_Helexia-Energis** folder: 

In [2]:
# Build the path from its components so the cell also runs outside Windows
os.listdir(os.path.join(os.getcwd(), 'data', 'Demo_Helexia-Energis'))

['1bis_interamerican_asset_details.json',
 '1bis_interamerican_asset_details.rml',
 '1bis_interamerican_asset_details.ttl',
 '1bis_interamerican_asset_details_updated.rml',
 '1_bigg_project_full_asset_tree_with_tag_ids.json',
 '1_bigg_project_full_asset_tree_with_tag_ids.rml',
 '1_bigg_project_full_asset_tree_with_tag_ids.ttl',
 '1_bigg_project_full_asset_tree_with_tag_ids_test.ttl',
 '1_bigg_project_full_asset_tree_with_tag_ids_updated.rml',
 '1_bigg_project_full_asset_tree_with_tag_ids_withoutOrphan.ttl',
 '2bis_interamerican_electricity_consumption_metric_details.json',
 '2bis_interamerican_electricity_consumption_metric_details.rml',
 '2bis_interamerican_electricity_consumption_metric_details.ttl',
 '2bis_interamerican_electricity_consumption_metric_details_updated.rml',
 '2_interamerican_electricity_consumption_timeseries.json',
 '2_interamerican_electricity_consumption_timeseries.rml',
 '2_interamerican_electricity_consumption_timeseries.ttl',
 '2_interamerican_electricity_consum

### Convert JSON files with the RML mapping files

In [3]:
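for jsonFile in glob.glob(os.path.join('data', 'Demo_Helexia-Energis', '*.json')):
    # The mapping and output files share the JSON file's base name
    RMLFile = jsonFile.replace('.json', '.rml')
    OutputFile = jsonFile.replace('.json', '.ttl')

    print(f'---- JSON File : {jsonFile} ')
    # os.system returns the exit status of the command; 0 means success
    status = os.system(f'python harmonizer.py --input "{jsonFile}" --mapping "{RMLFile}" --output "{OutputFile}"')
    # Report success only when harmonizer.py actually exited cleanly
    print('-- Conversion TTL : OK ' if status == 0 else f'-- Conversion TTL : FAILED (status {status}) ')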

---- JSON File : data\Demo_Helexia-Energis\1bis_interamerican_asset_details.json 
-- Conversion TTL : OK 
---- JSON File : data\Demo_Helexia-Energis\1_bigg_project_full_asset_tree_with_tag_ids.json 
-- Conversion TTL : OK 
---- JSON File : data\Demo_Helexia-Energis\2bis_interamerican_electricity_consumption_metric_details.json 
-- Conversion TTL : OK 
---- JSON File : data\Demo_Helexia-Energis\2_interamerican_electricity_consumption_timeseries.json 
-- Conversion TTL : OK 
---- JSON File : data\Demo_Helexia-Energis\3_interamerican_outdoor_temperature_timeseries.json 
-- Conversion TTL : OK 
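
The loop above shells out through `os.system`. A possible alternative, sketched below, swaps in `subprocess.run` with an argument list, which avoids quoting problems with unusual file names, and records per file whether the call succeeded. The helper names `build_command` and `convert_folder` are hypothetical and are not part of the harmonizer tool; only the `--input`, `--mapping` and `--output` options come from the command line documented in section 2.

```python
from __future__ import annotations

import subprocess
import sys
from pathlib import Path


def build_command(json_path: Path) -> list[str]:
    """Return the harmonizer call for one JSON file and its sibling RML file."""
    return [
        sys.executable, "harmonizer.py",
        "--input", str(json_path),
        "--mapping", str(json_path.with_suffix(".rml")),
        "--output", str(json_path.with_suffix(".ttl")),
    ]


def convert_folder(folder: Path) -> dict[str, bool]:
    """Convert every *.json file in folder; map file name -> success flag."""
    results = {}
    for json_path in sorted(folder.glob("*.json")):
        if not json_path.with_suffix(".rml").exists():
            # No mapping file next to the JSON file: nothing to run
            results[json_path.name] = False
            continue
        completed = subprocess.run(build_command(json_path))
        # A zero return code is taken to mean the conversion succeeded
        results[json_path.name] = completed.returncode == 0
    return results
```

Calling `convert_folder(Path("data") / "Demo_Helexia-Energis")` from the notebook's working directory would then return one success flag per JSON file.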


### Example of a TTL output file

In [4]:
# Path components are joined by pathlib, so the cell is not tied to Windows separators
ttl = pathlib.Path("data", "Demo_Helexia-Energis", "1bis_interamerican_asset_details.ttl").read_text()
print(ttl)

@prefix ns1: <http://bigg-project.eu/> .

<http://bigg-project.eu/instances/building_58> a ns1:Building ;
    ns1:buildingIdFromOrganization "58" ;
    ns1:buildingName "Interamerican" ;
    ns1:hasLocationInfo <http://bigg-project.eu/instances/locationInfo_58> .

<http://bigg-project.eu/instances/addressCountry_58> a ns1:AddressCountry ;
    ns1:addressCountryCode "GR" .

<http://bigg-project.eu/instances/locationInfo_58> a ns1:LocationInfo ;
    ns1:addressLatitude "37.9607804" ;
    ns1:addressLongitude "23.72126837" ;
    ns1:addressPostalCode "11745" ;
    ns1:addressStreetName "Syggrou 124 avenue" ;
    ns1:addressStreetNumber "" ;
    ns1:hasAddressCountry <http://bigg-project.eu/instances/addressCountry_58> .




### Example of use of the SPARQL stage

In the following example, we remove orphan devices, i.e. devices that are not related to a space. The SPARQL query is stored in the file 'remove_orphan_devices.txt', and we apply it to the first **JSON** file, which contains orphan devices.  

In [5]:
# Same folder name (and capitalisation) as in the previous cells
demoDir = os.path.join('data', 'Demo_Helexia-Energis')
jsonFile = os.path.join(demoDir, '1_bigg_project_full_asset_tree_with_tag_ids.json')
RMLFile = os.path.join(demoDir, '1_bigg_project_full_asset_tree_with_tag_ids.rml')
SparqlFile = os.path.join(demoDir, 'remove_orphan_devices.txt')
OutputFile = os.path.join(demoDir, '1_bigg_project_full_asset_tree_with_tag_ids_withoutOrphan.ttl')

os.system(f'python harmonizer.py --input {jsonFile} --mapping {RMLFile} --sparql {SparqlFile} --output {OutputFile}')

0

The resulting **TTL** file no longer contains the 5 orphan devices (101, 192, 196, 197, 95).
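
One way to check this claim programmatically is to scan the Turtle text for device IRIs and intersect them with the orphan IDs. The sketch below assumes that orphan devices would use the same `deviceMeter_<id>` IRI pattern as the devices that remain in the output; the function names are hypothetical.

```python
import re

# IDs of the orphan devices quoted in the text above
ORPHAN_IDS = {"101", "192", "196", "197", "95"}


def device_ids(turtle_text: str) -> set:
    """Collect the numeric suffix of every deviceMeter IRI found in the text."""
    # The closing '>' makes sure "95" does not match inside e.g. "950"
    return set(re.findall(r"instances/deviceMeter_(\d+)>", turtle_text))


def remaining_orphans(turtle_text: str) -> set:
    """Return the orphan IDs that still appear in the Turtle text."""
    return device_ids(turtle_text) & ORPHAN_IDS


sample = """
<http://bigg-project.eu/instances/buildingSpace_58> a ns1:BuildingSpace ;
    ns1:containsElement <http://bigg-project.eu/instances/deviceMeter_59>,
        <http://bigg-project.eu/instances/deviceMeter_60> .
"""
print(remaining_orphans(sample))  # prints: set()
```

Passing the text of the generated file to `remaining_orphans` should likewise give an empty set if the SPARQL stage removed every orphan device.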

In [6]:
# Same folder name (and capitalisation) as in the previous cells
ttl = pathlib.Path("data", "Demo_Helexia-Energis", "1_bigg_project_full_asset_tree_with_tag_ids_withoutOrphan.ttl").read_text()
print(ttl)

@prefix ns1: <http://bigg-project.eu/> .

<http://bigg-project.eu/instances/project_158> a ns1:Project ;
    ns1:affectsBuilding <http://bigg-project.eu/instances/building_58> ;
    ns1:projectIdFromOrganization "158" ;
    ns1:projectName "Cordia" .

<http://bigg-project.eu/instances/buildingSpace_58> a ns1:BuildingSpace ;
    ns1:containsElement <http://bigg-project.eu/instances/deviceMeter_59>,
        <http://bigg-project.eu/instances/deviceMeter_60> .

<http://bigg-project.eu/instances/building_58> a ns1:Building ;
    ns1:buildingIdFromOrganization "58" ;
    ns1:buildingName "Interamerican" ;
    ns1:hasSpace <http://bigg-project.eu/instances/buildingSpace_58> ;
    ns1:pertainsToOrganization <http://bigg-project.eu/instances/organisation_58> .

<http://bigg-project.eu/instances/deviceMeter_59> a ns1:Device ;
    ns1:deviceIdFromOrganization "59" ;
    ns1:name "Main Switch Meter 1" .

<http://bigg-project.eu/instances/deviceMeter_60> a ns1:Device ;
    ns1:deviceIdFromOrganizat