Waleed Shabbir edited this page Mar 17, 2023 · 5 revisions

The IDS Vocabulary Provider is a special connector that offers vocabularies (ontologies, reference data models, metadata elements) that can be used to annotate and describe datasets, connectors, apps, services, and resources, thereby enhancing interoperability. It helps to exchange metadata for any assets registered in the Metadata Broker, Marketplace, and App Store. This document provides architectural and functional information about the IDS Vocabulary Provider. In addition, a Vocabulary Provider implementation has been developed and published on GitHub at https://github.com/International-Data-Spaces-Association/IDS-VocabularyProvider; it manages several ontologies that help in understanding dataset and connector metadata.

All IDS components need to follow the IDS Information Model [1] for IDS messaging and communication. Participants of the IDS use the RDF vocabulary provided by the Information Model as their common language within the IDS. To ensure the correct usage and understanding of the vocabulary, validation structures are provided in the W3C Shapes Constraint Language (SHACL). These SHACL shape graphs can be used to validate self-generated RDF statements against the Information Model and to check whether:

- a Connector's self-description is valid,
- a Resource is described using the correct metadata terms,
- an HTTP multipart message exchanged between IDS components provides the necessary information.

The SHACL shapes can be found in the testing subdirectory of the IDS Information Model[2].

1 IDS Vocabulary Provider

The IDS Vocabulary Provider plays an essential role within the IDS reference architecture. It is the middleware and registry that hosts data and links the Data Governance, Security, Privacy and Sovereignty layer (based on the IDS reference architecture) with the Interoperability layer.

As mentioned above, the IDS Vocabulary Provider is a central IDS component that manages all the vocabularies (ontologies, reference data models, metadata elements) that can be used to annotate and describe datasets and data analytics tools. It is a special connector that exchanges the requested information securely, in other words, through IDS messages. For instance, Company A and Company B can send SPARQL queries from their connectors to the Vocabulary Provider through IDS messages and use it to annotate and understand the corresponding data assets (datasets/services). Furthermore, the IDS Vocabulary Provider supports not only secure machine-readable communication through IDS messages but also human-readable data representation through Graphical User Interfaces (GUIs) and plugins, where users can manage vocabularies (upload/upgrade/delete), search for specific terms, visualize the vocabularies in a network graph, and execute SPARQL queries.

The design and implementation of the developed IDS Vocabulary Provider are explained in the following sections.

At its center, the IDS Vocabulary Provider uses VoCol [3], open-source software for managing (uploading/modifying/removing) ontologies using version control systems such as Git and repository hosting platforms like GitHub. For the reverse proxy, a multipart endpoint was developed using Apache Tomcat [4]. Apache Tomcat is a free and open-source implementation of the Jakarta Servlet, Jakarta Expression Language, and WebSocket technologies [5]. It provides a "pure Java" HTTP web server environment in which Java code can run. For persistent storage, Apache Jena Fuseki [6] was used. Apache Jena Fuseki is a SPARQL server that can run as an operating system service, as a Java web application (WAR file), or as a standalone server. Fuseki comes in two forms: "webapp", a single system combined with a UI for administration and querying, and "main", a server suitable for running as part of a larger deployment, including with Docker or embedded. In this case, the latter was used and runs in a Docker container. Fuseki provides the SPARQL 1.1 protocols for query and update as well as the SPARQL Graph Store protocol. It is tightly integrated with TDB to provide a robust, transactional persistent storage layer and incorporates Jena text query.
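As a rough illustration of the SPARQL 1.1 Protocol that Fuseki provides, the sketch below builds (but does not send) a query request using only the Python standard library. The base URL, the dataset name `vocab`, and the `/sparql` service path are assumptions made for illustration; the actual values depend on how the Fuseki container is configured.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed values for illustration only; adjust to the actual Fuseki setup.
FUSEKI_BASE = "http://localhost:3030"
DATASET = "vocab"  # hypothetical dataset name


def build_sparql_request(query: str, base: str = FUSEKI_BASE, dataset: str = DATASET) -> Request:
    """Build a SPARQL 1.1 Protocol query request (query sent via HTTP GET)."""
    url = f"{base}/{dataset}/sparql?{urlencode({'query': query})}"
    # Ask the server for results in the SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"}, method="GET")


request = build_sparql_request("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10")
print(request.full_url)
```

With a running server, the request object could be passed to `urllib.request.urlopen` to retrieve the results.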

1.1 Monitoring Feature within IDS Vocabulary Provider

Another important feature developed within the IDS project (T68) is a monitoring feature based on Grafana. Because of the high number of interactions between the IDS Vocabulary Provider and the broker or the connectors, for instance in the Mobility Data Space, APIs for monitoring and logging are needed. Technical operators use monitoring tools such as Grafana or Graylog to visualize these interactions, including component failures and updates. Professional operators rely heavily on standard monitoring capabilities and a dashboard to record events and interactions between the IDS Vocabulary Provider and the connector/broker, which helps them understand the behavior of the IDS components and improve the services. Therefore, the IDS Vocabulary Provider should be able to interact with such monitoring/logging systems through logs that provide information about updates, configuration changes, component failures, etc.

To implement this, two tools were used: Prometheus and Grafana. Prometheus is responsible for exposing application-specific metrics, which can then be used as a data source in Grafana. After integrating Prometheus into the Node.js application, it is used as a data source in Grafana, and a Node.js dashboard visualizes the metrics. prom-client is the most popular Prometheus client library for Node.js. It provides the building blocks to export metrics to Prometheus via the pull and push methods and supports all Prometheus metric types: histograms, summaries, gauges, and counters. Prometheus is essential here: without data, Grafana has nothing to display, and Prometheus is the component that makes the application-related metrics available.

The Prometheus client library is used in the Node.js application to export metrics such as CPU usage, restarts, and API routes. Prometheus serves as the data source for Grafana and scrapes its targets at an interval of 5 seconds, which can be configured as needed.

Grafana is responsible for visualizing data sources as graphs or charts. The metrics displayed in Grafana come from a data source (Prometheus in this case). Both Prometheus and Grafana run in Docker containers. Once the metrics are available, a dashboard has to be configured in Grafana to view them. The prebuilt Node.js dashboard can be imported into Grafana from this file: /docker/grafana/NodeJS-Dashboard.json

The dashboard monitors metrics for Node.js, the Express router status, and the total number of IDS DescriptionRequestMessage messages sent to the Vocabulary Provider.
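To make the data flow more concrete, the sketch below renders a counter in the Prometheus text exposition format, which is what a scrape of a metrics endpoint returns. It is written in Python with the standard library only and does not reproduce the Node.js code of the Vocabulary Provider; the `Counter` class and the metric name `ids_description_requests_total` are hypothetical.

```python
class Counter:
    """Minimal monotonically increasing counter (illustrative only)."""

    def __init__(self, name: str, help_text: str) -> None:
        self.name = name
        self.help_text = help_text
        self.value = 0.0

    def inc(self, amount: float = 1.0) -> None:
        """Increase the counter; counters must never decrease."""
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount

    def render(self) -> str:
        """Render the metric in the Prometheus text exposition format."""
        return (
            f"# HELP {self.name} {self.help_text}\n"
            f"# TYPE {self.name} counter\n"
            f"{self.name} {self.value:g}\n"
        )


# Hypothetical metric name; the real dashboard may use a different one.
requests_total = Counter("ids_description_requests_total", "DescriptionRequestMessages received")
requests_total.inc()
requests_total.inc()
print(requests_total.render())
```

In the actual application this bookkeeping is handled by prom-client; the sketch only shows the shape of the data that Prometheus collects and Grafana later visualizes.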

1.2 Interaction With IDS Vocabulary Provider

Apart from the GUI, the IDS Vocabulary Provider supports machine-to-machine communication, which allows the different ontologies to be queried directly through an IDS connector. To do this, a REST API service was built on top of VoCol to allow the exchange of information between VoCol, Fuseki, and the IDS Core Container. The IDS Core Container was developed using Spring Boot. A number of IDS messages have been implemented. Some of them allow IDS connectors from the pilots to communicate with the IDS Vocabulary Provider; others allow the IDS Vocabulary Provider to communicate with other components of the IDS ecosystem, such as the Broker. The following sections explain each message in more detail, and the complete text of the messages is included in Appendix A.

1.2.1 DescriptionRequestMessage

This message allows a connector to call the IDS Vocabulary Provider and obtain generic information from it, specifically by requesting a property through its URI. It is a multipart POST message that contains a header and a payload. The header carries the content of the message; in this case, the payload is empty.

URL to call: SERVER-URL:8080/api/ids/data

URL to call: SERVER-URL/infrastructure
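The request described in this section could be assembled roughly as follows. This is a minimal sketch using only the Python standard library: the helper `build_multipart` is hypothetical, the header is abbreviated from the example in Appendix A, and the part content types are assumptions rather than requirements confirmed by the implementation.

```python
import json
import uuid


def build_multipart(header: dict, payload: str = "") -> tuple[bytes, str]:
    """Encode an IDS message as multipart/form-data with 'header' and 'payload' parts.

    Returns the request body and the matching Content-Type header value.
    The payload part is omitted when the payload is empty.
    """
    boundary = uuid.uuid4().hex
    parts = [("header", "application/ld+json", json.dumps(header))]
    if payload:
        parts.append(("payload", "text/plain", payload))
    lines = []
    for name, content_type, content in parts:
        lines += [
            f"--{boundary}",
            f'Content-Disposition: form-data; name="{name}"',
            f"Content-Type: {content_type}",
            "",
            content,
        ]
    lines.append(f"--{boundary}--")
    body = "\r\n".join(lines).encode("utf-8") + b"\r\n"
    return body, f"multipart/form-data; boundary={boundary}"


# Abbreviated header based on Appendix A; "{{dat}}" stands for a real DAPS token.
header = {
    "@context": {"ids": "https://w3id.org/idsa/core/", "idsc": "https://w3id.org/idsa/code/"},
    "@type": "ids:DescriptionRequestMessage",
    "ids:modelVersion": "4.0.0",
    "ids:securityToken": {
        "@type": "ids:DynamicAttributeToken",
        "ids:tokenFormat": {"@id": "https://w3id.org/idsa/code/JWT"},
        "ids:tokenValue": "{{dat}}",
    },
    "ids:requestedElement": {"@id": "http://w3id.org/mds#DataCategory"},
}
body, content_type = build_multipart(header)
print(content_type)
print(body.decode("utf-8"))
```

The resulting body and content type would then be sent with an HTTP POST to one of the URLs above.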

1.2.2 DescriptionResponseMessage

This message is the reply to the DescriptionRequestMessage and carries the information about the requested element.
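A client receiving such a reply needs to split the multipart body into its header and payload parts. The sketch below does this with the `email` package from the Python standard library. The function name `parse_ids_response` is hypothetical, and the code assumes that both parts contain JSON, as in the DescriptionResponseMessage example in Appendix A.

```python
import json
from email import message_from_bytes


def parse_ids_response(body: bytes, content_type: str) -> dict:
    """Split a multipart IDS response into its named parts, each decoded as JSON."""
    # The email parser needs the Content-Type header to locate the boundary.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = message_from_bytes(raw)
    parts = {}
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition")
        parts[name] = json.loads(part.get_payload(decode=True))
    return parts


# Shortened sample response, for illustration only.
sample = (
    b"--msgpart\r\n"
    b'Content-Disposition: form-data; name="header"\r\n'
    b"Content-Type: application/ld+json\r\n\r\n"
    b'{"@type": "ids:DescriptionResponseMessage"}\r\n'
    b"--msgpart\r\n"
    b'Content-Disposition: form-data; name="payload"\r\n'
    b"Content-Type: application/ld+json\r\n\r\n"
    b'{"@id": "http://purl.org/db/nosql#Cypher"}\r\n'
    b"--msgpart--\r\n"
)
parts = parse_ids_response(sample, "multipart/form-data; boundary=msgpart")
print(parts["header"]["@type"])
print(parts["payload"]["@id"])
```

In a real exchange, the content type passed to the function would come from the HTTP response headers.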

1.2.3 QueryMessage

This message allows an IDS connector to send SPARQL queries directly to the IDS Vocabulary Provider. It is a multipart POST message that contains a header and a payload. The header contains the content of the message, and the payload contains the corresponding SPARQL query.

URL to call: SERVER-URL:8080/api/ids/data

URL to call: SERVER-URL:8080/infrastructure

As mentioned, the header contains the content of the message. This message contains a series of fields defined by the IDS reference architecture, which will vary depending on the type of message. The specific fields are the type of message, context, token, and information model used. 
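Before sending a message, a client may want to check that these fields are present. The sketch below is a simple presence check in Python, not a validation against the Information Model (for which the SHACL shapes are intended). The mapping of the four fields to the JSON-LD keys is taken from the examples in Appendix A; the function name is hypothetical.

```python
# JSON-LD keys for: message type, context, token, and Information Model version.
REQUIRED_FIELDS = ("@context", "@type", "ids:securityToken", "ids:modelVersion")


def missing_header_fields(header: dict) -> list[str]:
    """Return the required header keys that are absent, in a fixed order."""
    return [field for field in REQUIRED_FIELDS if field not in header]


# Abbreviated QueryMessage header; "{{dat}}" stands for a real DAPS token.
query_header = {
    "@context": {"ids": "https://w3id.org/idsa/core/", "idsc": "https://w3id.org/idsa/code/"},
    "@type": "ids:QueryMessage",
    "ids:securityToken": {"@type": "ids:DynamicAttributeToken", "ids:tokenValue": "{{dat}}"},
    "ids:modelVersion": "4.0.0",
    "ids:queryLanguage": {"@id": "idsc:SPARQL"},
}
print(missing_header_fields(query_header))
```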

If everything has gone correctly, a ResponseMessage will be returned with the results of the query specified in the payload field as shown in the following Figure:

2 Deployment of IDS Vocabulary Provider

The code of the IDS Vocabulary Provider is hosted in a GitHub repository: https://github.com/International-Data-Spaces-Association/IDS-VocabularyProvider

The development environment of the IDS Vocabulary Provider consists of four main components:

- Nginx reverse proxy.
- VoCol: an open-source vocabulary manager.
- Apache Jena Fuseki and TDB: a SPARQL server and database for storing and accessing the different vocabularies.
- IDS Vocabulary Provider: the core component that enables communication between the connectors and the vocabulary provider.

2.1 Prerequisites

Prerequisites for running the IDS Vocabulary Provider:

- Docker
- Docker Compose
- Java
- Maven
- OpenSSL

2.2 Structure of IDS Vocabulary Provider

Normally, there is one Vocabulary Provider for each ecosystem type. A docker-compose file is in charge of building the corresponding images, creating the containers, and running them. All the containers can communicate with each other internally.

Apart from the docker-compose file, there are four Dockerfiles, one for each of the components mentioned above:

- Nginx-reverseProxy: creates the image for the Nginx reverse proxy, which handles the routing to each component.
- VoCol: creates the image for the VoCol manager, downloads all the requirements (Node.js, npm, etc.), and executes the application.
- Fuseki: creates the image for the Fuseki server and launches the server.
- IDS Vocabulary Provider: creates the image for the core component and launches the corresponding server, a Java environment running the Maven package of the code. The IDS Vocabulary Provider is a connector and therefore needs certificates to implement secure communication between the different components. Hence, it communicates with a DAPS, which gives the connector a valid token each time a message is sent. This component also communicates with the Fuseki component to execute the requested queries and obtain the results.

2.3 Creation of SSL Certificates

A valid X.509 certificate, signed by a trusted certification authority, is strongly recommended to avoid warnings about insecure HTTPS connections. The certificate needs to be in .crt format and must be named server.crt. If your certificate is in .pem format, it can be converted with the following commands, which require OpenSSL to be installed:

openssl x509 -in mycert.pem -out server.crt

openssl rsa -in mycert.pem -out server.key

mkdir cert

mv server.crt cert/

mv server.key cert/

2.4 Configuring the Docker-compose File

The docker-compose file is responsible for launching the four service containers needed for the Vocabulary Provider to run properly. It also enables communication between the components, as they are all attached to the same network.

Each component is exposed on a specific port. If a port is already occupied by another component of the ecosystem, it can be changed in this file.

The port can be changed for the VoCol component and for the IDS Vocabulary Provider if needed, as these components are independent. To do that, edit the docker-compose file and change the corresponding port:

vocab-vocol:
  image: registry.gitlab.cc-asp.fraunhofer.de/vocoreg/open-src/vocol
  expose:
    - "8888"
  ports:
    - "8888:8888"
  command: [ "npm", "start", "8888", "3030", "http://vocab-fuseki" ]
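Before settling on a host port, it can help to check whether that port is already taken. The sketch below is one way to do so with the Python standard library; the helper `is_port_free` is hypothetical, and the port numbers are the ones that appear in this document. It only reports the situation at the moment of the check and says nothing about ports that other containers will claim later.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Binding fails when another socket already holds the port.
            return False
    return True


for port in (8888, 3030, 8080):
    print(port, "free" if is_port_free(port) else "in use")
```

If a port is reported as in use, the host side of the mapping (the first number in "8888:8888") is the value to change.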

Another crucial part of adapting the configuration is to provide the correct location of the X.509 certificate in the IDS messages service. Assuming the certificate is located in "/VP-root/idsvocabularyprovider/docker/reverseproxy-vocol/cert", the corresponding configuration would be:

volumes:

In addition, when building the Docker image of Nginx, the same certificates must be added to the container and referenced in nginx.conf.

[…]

1.2.4.1 Run Application

Build images: To build the required Docker images, navigate to the idsvocabularyprovider/docker folder and run the command 'sh buildImagesLocally.sh'. This creates the 4 images necessary to run the IDS Vocabulary Provider.

To start up the IDS Vocabulary Provider, run the following command inside the directory (composeFiles) that contains the docker-compose.yml file:

docker-compose up -d

This process can take a few seconds to complete. You can test whether the IDS Vocabulary Provider has started successfully by opening the following URLs:

- https://localhost/ or http://localhost:8080/: the main page of the Vocabulary Provider with its self-description.
- http://localhost:8888/ or https://localhost/vocob/: the main page of the VoCol service. The VoCol UI appears, listing the existing ontologies (if this is the first time VoCol is run, an instance has to be created).
- http://localhost:3030/ or https://localhost/fuseki/: if the Fuseki server is running properly, the main page of the Fuseki manager is shown. This page is only used for maintenance purposes. The server status dot on the right side must be green.
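The same check can be scripted. The following sketch probes the plain-HTTP URLs from the list above with the Python standard library and reports which services answer with HTTP 200. The function names are hypothetical, and treating only status 200 as "up" is an assumption; a service that redirects or requires authentication would need a different criterion.

```python
from typing import Callable
from urllib.error import URLError
from urllib.request import urlopen

# Plain-HTTP URLs from the list above.
ENDPOINTS = {
    "vocabulary provider": "http://localhost:8080/",
    "vocol": "http://localhost:8888/",
    "fuseki": "http://localhost:3030/",
}


def http_status(url: str) -> int:
    """Return the HTTP status code for url (raises on connection errors)."""
    with urlopen(url, timeout=5) as response:
        return response.status


def check_endpoints(
    endpoints: dict[str, str], fetch: Callable[[str], int] = http_status
) -> dict[str, bool]:
    """Map each service name to True if its URL answered with HTTP 200."""
    results = {}
    for name, url in endpoints.items():
        try:
            results[name] = fetch(url) == 200
        except (URLError, OSError):
            # Connection refused, timeout, or an HTTP error status.
            results[name] = False
    return results


if __name__ == "__main__":
    for name, ok in check_endpoints(ENDPOINTS).items():
        print(f"{name}: {'up' if ok else 'down'}")
```

The `fetch` parameter exists so that the check can be exercised without a running deployment, for example by passing a stub function.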

To enter a running container, you can use:

docker exec -it container_name /bin/bash : to get a bash shell in the container

exit : to leave the container

To see the logs of the containers, go to the directory containing the Docker files and run:

docker-compose logs

1.2.4.2 Update

Alternatively, one can restart the entire service by running:

docker-compose down : To stop all the containers

docker-compose up -d

Appendix A: IDS Vocabulary Provider – IDS messages

DescriptionRequestMessage

Header:

{
  "@context" : {
    "ids" : "https://w3id.org/idsa/core/",
    "idsc" : "https://w3id.org/idsa/code/",
    "xsd" : "http://www.w3.org/2001/XMLSchema#"
  },
  "@type" : "ids:DescriptionRequestMessage",
  "@id" : "http://industrialdataspace.org/1a421b8c-3407-44a8-aeb9-253f145c869a",
  "ids:issued" : {
    "@value" : "2021-05-25T15:35:34.589Z",
    "@type" : "xsd:dateTimeStamp"
  },
  "ids:modelVersion" : "4.0.0",
  "ids:senderAgent" : { "@id" : "https://localhost/agent" },
  "ids:issuerConnector" : { "@id" : "https://localhost/59a68243" },
  "ids:securityToken" : {
    "@type" : "ids:DynamicAttributeToken",
    "@id" : "https://w3id.org/idsa/autogen/dynamicAttributeToken/2bd53efc-5995-d75590476820",
    "ids:tokenFormat" : { "@id" : "https://w3id.org/idsa/code/JWT" },
    "ids:tokenValue" : "{{dat}}"
  },
  "ids:requestedElement" : { "@id" : "http://w3id.org/mds#DataCategory" }
}

Payload:

Empty

 

ResponseMessage

--u0aGPDffsSDXXeGNrF93gmvHWNa4ulde
Content-Disposition: form-data; name="header"
Content-Type: application/ld+json
Content-Length: 2477

{   "@context" : {     "ids" : "https://w3id.org/idsa/core/",     "idsc" : "https://w3id.org/idsa/code/"   },   "@type" : "ids:DescriptionResponseMessage",   "@id" : "https://w3id.org/idsa/autogen/descriptionResponseMessage/b4448dc5-17d1-416a-986c-32a0fe292317",   "ids:issued" : {     "@value" : "2022-05-23T16:25:43.844Z",     "@type" : "http://www.w3.org/2001/XMLSchema#dateTimeStamp"   },   "ids:correlationMessage" : {     "@id" : "http://industrialdataspace.org/1a421b8c-3407-44a8-aeb9-253f145c869a"   },   "ids:issuerConnector" : {     "@id" : "https://www.iais.fraunhofer.de"   },   "ids:senderAgent" : {     "@id" : "https://www.iais.fraunhofer.de"   },   "ids:modelVersion" : "4.2.8-SNAPSHOT",   "ids:securityToken" : {     "@type" : "ids:DynamicAttributeToken",     "@id" : "https://w3id.org/idsa/autogen/dynamicAttributeToken/34f24f3a-2d6b-4ebc-98ee-0f2edab5d371",     "ids:tokenValue" : "eyJ0eXAiOiJhdCtqd3QiLCJraWQiOiJUQ1VGZUNOYXphbEtIZzlLenJ6TElBelJXVE1ERFdTYTdMY005WndITXlvIiwiYWxnIjoiUlMyNTYifQ.eyJhdWQiOiJpZHNjOklEU19DT05ORUNUT1JTX0FMTCIsImlzcyI6Imh0dHBzOi8vZGFwcy5haXNlYy5mcmF1bmhvZmVyLmRlIiwic3ViIjoiNUQ6OTc6ODM6RTc6RkU6MkM6MzQ6MDg6RTU6NzM6N0M6Mzc6ODM6QzI6OUE6NkQ6RjE6QzU6MTA6Mjc6a2V5aWQ6Q0I6OEM6Qzc6QjY6ODU6Nzk6QTg6MjM6QTY6Q0I6MTU6QUI6MTc6NTA6MkY6RTY6NjU6NDM6NUQ6RTgiLCJuYmYiOjE2NTMzMjMxNDMsImlhdCI6MTY1MzMyMzE0MywianRpIjoiTmprMk5EazVNelkzTmpJNE5UWTBOVFV3TkE9PSIsImV4cCI6MTY1MzMyNjc0MywiY2xpZW50X2lkIjoiNUQ6OTc6ODM6RTc6RkU6MkM6MzQ6MDg6RTU6NzM6N0M6Mzc6ODM6QzI6OUE6NkQ6RjE6QzU6MTA6Mjc6a2V5aWQ6Q0I6OEM6Qzc6QjY6ODU6Nzk6QTg6MjM6QTY6Q0I6MTU6QUI6MTc6NTA6MkY6RTY6NjU6NDM6NUQ6RTgiLCJzZWN1cml0eVByb2ZpbGUiOiJpZHNjOkJBU0VfU0VDVVJJVFlfUFJPRklMRSIsInJlZmVycmluZ0Nvbm5lY3RvciI6Imh0dHA6Ly9pZHMuYnJva2VyLnRlc3QubW9iaWxpdHlkYXRhc3BhY2UuaW8uZGVtbyIsIkB0eXBlIjoiaWRzOkRhdFBheWxvYWQiLCJAY29udGV4dCI6Imh0dHBzOi8vdzNpZC5vcmcvaWRzYS9jb250ZXh0cy9jb250ZXh0Lmpzb25sZCIsInRyYW5zcG9ydENlcnRzU2hhMjU2IjoiOTU5YmRjMjAyNWYxY2FlYWQ3ZmNiNzRmNzQwZGQ1NGQ3ZjNkN2Q4Yjg1ODJlMmQxN2ZhNzg3ZTc0MzE5YmRlNCIsInNjb3BlcyI6WyJp
ZHNjOklEU19DT05ORUNUT1JfQVRUUklCVVRFU19BTEwiXX0.HqM6AC9sxaoNaSeERaeAWslL96uz-Q6XS0cXbQ1pocgg4_RJ90y4G_p_rBPTT6WB0bhQH-c0WLtEO-lVf7BTgy_h9t-BLmDvUcHQWAkZEziafUCkOm9-6i48V0v5TmJzkHvfCqGAVZFxx66ipsiNHKYYbqyA_hSa0OaHlX958jYgJ2aEu8w4onfPiEOWpYxaSuSn647h0bCqF-VLfsdyKheiwGsk5V3BeyCHRwS-vVeQ1eawkWS7qJ5TtuDMjw3BhJsHRvtekly1ph0Or20uBaUAgqq_e4RsRymuoKQ22MJKwtdDIa60FPS1EdYVzlHhUnWR-FCSsFgiglnl-3A7Sg",     "ids:tokenFormat" : {       "@id" : "https://w3id.org/idsa/code/JWT"     }   } } --u0aGPDffsSDXXeGNrF93gmvHWNa4ulde Content-Disposition: form-data; name="payload" Content-Type: application/ld+json Content-Length: 552

{
  "@id" : "http://purl.org/db/nosql#Cypher",
  "@type" : "http://www.w3.org/2002/07/owl#Class",
  "comment" : {
    "@language" : "en",
    "@value" : "Query language used to intereact with a Neo4j database."
  },
  "subClassOf" : "http://purl.org/db/nosql#QueryLanguage",
  "@context" : {
    "subClassOf" : {
      "@id" : "http://www.w3.org/2000/01/rdf-schema#subClassOf",
      "@type" : "@id"
    },
    "comment" : {
      "@id" : "http://www.w3.org/2000/01/rdf-schema#comment"
    },
    "rdfs" : "http://www.w3.org/2000/01/rdf-schema#"
  }
}

--u0aGPDffsSDXXeGNrF93gmvHWNa4ulde--

 

 

QueryMessage

--msgpart
Content-Type: application/json
Content-Disposition: form-data; name="header"

{
  "@context" : {
    "ids" : "https://w3id.org/idsa/core/",
    "idsc" : "https://w3id.org/idsa/code/"
  },
  "@type" : "ids:QueryMessage",
  "@id" : "https://w3id.org/idsa/autogen/queryMessage/dbb77622-7508-4630-9830-aa07b196eebc",
  "ids:securityToken" : {
    "@type" : "ids:DynamicAttributeToken",
    "@id" : "https://w3id.org/idsa/autogen/dynamicAttributeToken/f9f2b139-0e9b-4e6f-b320-abf22a7224aa",
    "ids:tokenFormat" : {
      "@id" : "idsc:JWT"
    },
    "ids:tokenValue" : "{{dat}}"
  },
  "ids:senderAgent" : {
    "@id" : "https://apptest.connector.de/"
  },
  "ids:modelVersion" : "4.0.0",
  "ids:issued" : {
    "@value" : "2020-06-23T16:10:57.781+02:00",
    "@type" : "http://www.w3.org/2001/XMLSchema#dateTimeStamp"
  },
  "ids:issuerConnector" : {
    "@id" : "https://apptest.connector.de/"
  },
  "ids:queryLanguage" : {
    "@id" : "idsc:SPARQL"
  },
  "ids:queryScope" : {
    "@id" : "idsc:ALL"
  }
}
--msgpart
Content-Type: text/plain
Content-Disposition: form-data; name="payload"

SELECT ?subject ?predicate ?object
WHERE {
  ?subject ?predicate ?object
}
--msgpart--

References

[1] https://github.com/International-Data-Spaces-Association/InformationModel

[2] https://github.com/International-Data-Spaces-Association/InformationModel/tree/develop/testing

[3] https://vocol.iais.fraunhofer.de/

[4] http://tomcat.apache.org/

[5] https://projects.eclipse.org/projects/ee4j.jakartaee-platform

[6] https://jena.apache.org/documentation/fuseki2/

[7] https://github.com/vocol/vocol

[8] https://www.vocoreg.com/
