Skip to content

Open-MBEE/mms-rdf

Repository files navigation

MMS-RDF

Prototype Architecture

Slides

See slides here.

Difference Between RDF Graph and LPG

The generated labelled property graph (LPG) is a view of the Project Data Graph. Its purpose is to provide a means to perform graph traversal queries on instance-level data. The LPG view strictly covers the datatype properties of, and relationships between, model elements. It does not cover the UML and MMS ontologies, so queries that depend on the type hierarchy from the UML metmodel, such as those involving subclass relations, are not supported. In other words, while RDF allows for ABox and TBox statements to exist within the same model, LPGs are mostly suited for ABox statements.

General Requirements

Node.js version 10.*, 11.*, 12.* .

make (for building native add-ons for node.js; node-gyp)

'jq' executable.

The following environment variables also need to be set:

  • SPARQL_ENDPOINT - URL of the SPARQL endpoint (local or remote) which will be updated and queried during triplification, e.g., http://localhost:13030.
  • MMS_PROJECT_ALIAS - name of the project (determines name of generated data graph)
  • MMS_MAPPING_FILE - path to input JSON mapping file(s) (either absolute path or relative to this project root dir)
  • MMS_PROJECT_ID - the project ID to download from openmbee.org MMS API.

Using with AWS Neptune Requirements

The following environment variables need to be set:

  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • NEPTUNE_ENDPOINT - deprecated and removed. use SPARQL_ENDPOINT instead.
  • NEPTUNE_S3_BUCKET - S3 Bucket URI (see example)
  • NEPTUNE_REGION - S3 Bucket region string
  • NEPTUNE_PROXY - (optional) define a proxy to tunnel requests to the endpoint thru (see example)

Setup

From the project root dir:

  1. Install the npm package: $ npm i
  2. Create an environment variables file. Here is an example file for local usage: .local.env:
#!/bin/bash
export SPARQL_ENDPOINT=http://localhost:13030/ds
export SPARQL_QUERY_PATH=sparql
export SPARQL_UPDATE_PATH=update

export MMS_PROJECT_ALIAS=tmt
export MMS_MAPPING_FILE=input/tmt/mapping.json

export MMS_PROJECT_ID=PROJECT-d94630c2-576c-4edd-a8cd-ae3ecd25d16c
export MMS_PROJECT_ALIAS=tmt
export MMS_BASE_URL=https://mms.openmbee.org/alfresco/service
export MMS_AUTH_USERNAME=openmbeeguest
export MMS_AUTH_PASSWORD=guest

export HTTP_MAX_REQUESTS=32
export HTTP_MAX_SOCKETS=32
  1. Load the environment variables into you shell session, $ source .local.env .
  2. Ensure that you have a mapping file located at the MMS_MAPPING_FILE path (a default mapping for the TMT project is provided with this repository).
  3. Ensure that you have a project data file ready to use. If you cloned this repository, you can run this command to download the latest TMT snapshot from OpenMBEE:
$ emk -f input/tmt/data.json -g '{insecure:1}'
  1. If you are connecting to AWS Neptune, make sure to open a tunnel to an EC2 instance that is within the same VPC as the Neptune cluster:
$ ssh -i aws.pem -D 3031 ubuntu@EC2_IP

Getting Started For Local Usage

  1. Follow the steps in Setup to initialize the project.
  2. Prepare a local triplestore. A helper bash script is provided at ./util/local-endpoint.sh which you can run from the command-line. The script will read the port from the SPARQL_ENDPOINT URI string and attempt to launch a named docker container of the Apache Jena Fuseki triplestore with an in-memory database that binds to the host at 0.0.0.0 on the port in the endpoint string (e.g., :13030). Be aware that re-running this script will overwrite older containers, i.e., the triplestore may have to be reloaded.
  3. Build the vocabulary graph: $ npx emk local.update.vocabulary.* . It is OK if you see warnings about UML properties having multiple keys, this means the system is handling URI minting conflicts. You can now inspect the source of the vocabulary Turtle files under ./build/vocabulary/.
  4. Build the instance data graph: $ ./util/build-local.sh input/tmt/data.json . This is a multithreaded build tool that may take a while depending on the size of the input dataset. A progress bar will be printed to console. It is OK if you see warnings about unmapped object keys that begin with an underscore (unmapped metadata properties).
  5. All done! The script from the previous step will stitch together all the output files into a single master output Turtle file located at ./build/.

All together, the commands in order might look like this:

$ source .local.env
$ ./util/local-endpoint.sh
$ npx emk local.update.vocabulary.*
$ ./util/build-local.sh input/tmt/data.json

Building and Uploading RDF for Remote Usage

Build tasks are handled by emk.js. The following tasks are available:

In the following sections, TYPE is a placeholder for either vocabulary or data.

Building graphs (i.e., triplification):

  • local.TYPE - build the given TYPE graph locally (e.g., local.vocabulary or local.data), which outputs graphs as both RDF (in Turtle format) and LPG (in CSV format).

Modifying contents of the remote triplestore:

  • remote.clear.TYPE - clear the given TYPE graph from the remote triplestore
  • remote.upload.TYPE - upload the local TYPE graph file to the S3 bucket
  • remote.update.TYPE - load the TYPE graph from the S3 bucket into the triplestore

Example:

# clear all graphs from remote triplestore
$ npx emk remote.clear.*

# upload vocabulary graph to S3 and then update Neptune
$ npx emk remote.upload.vocabulary remote.update.vocabulary

The build targets automatically depend on the necessary tasks, so you can simply run:

$ npx emk remote.update.*

which will build the vocabulary and data graphs, upload them to S3, and then update the triplestore.

Example .neptune.env file:

#!/bin/bash
export SPARQL_ENDPOINT=http://open-cae-mms.c0fermrnxxyy.us-east-2.neptune.amazonaws.com:8182
export NEPTUNE_S3_BUCKET_URL=s3://open-cae-mms-rdf
export NEPTUNE_S3_IAM_ROLE_ARN=arn:aws:iam::230084004409:role/NeptuneLoadFromS3
export NEPTUNE_REGION=us-east-2
export NEPTUNE_PROXY=socks://127.0.0.1:3031

export MMS_PROJECT_ALIAS=tmt
export MMS_MAPPING_FILE=input/tmt/mapping.json

export MMS_PROJECT_ID=PROJECT-d94630c2-576c-4edd-a8cd-ae3ecd25d16c
export MMS_BASE_URL=https://mms.openmbee.org/alfresco/service
export MMS_AUTH_USERNAME=openmbeeguest
export MMS_AUTH_PASSWORD=guest

export AWS_ACCESS_KEY_ID=YOUR_AWS_ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY=YOUR_AWS_SECRET_ACCESS_KEY