The following briefly describes the complete workflow of mongo2neo4j using randomly generated sample data:
- generation of sample data and import into MongoDB
- extraction from MongoDB, transformation and import into Neo4j
- exploration with SemSpect
You need to have Node.js and npm installed on your system. Furthermore the data generator requires the mongoose library:
# npm install mongoose
The sample data simulates a typical web application using to the MEAN-stack with the help of the mongoose object modeling framework. The data set contains users with different subscriptions who have placed orders consisting of a set of products. You can adjust the number of objects in the sample data by changing the settings in generateDummyData.js.
In case your MongoDB deamon is set up and running with default setting just call the following to create and populate a MongoDB databases called mydatabase
:
# node generateDummyData.js
You can supply host, port and the database identifier when required:
# node generateDummyData.js [host [port [database]]]
The resulting database contains four collections (manually edited schema generated with Moon Modeler):
You can skip the generation process and directly import the additionally supplied database dump with:
# mongorestore --drop --gzip --archive=./MongooseEx_dump.gz
Use --uri='mongodb://localhost:27017'
to set host and port.
The Neo4j application must be installed. The most easiest way is to download and install Neo4j Desktop which is a desktop UI that allows to manage Neo4j DBMS on your local system. Furthermore, it's the most comfortable way of installing the SemSpect Graph App for Neo4j.
Once you have installed Neo4j Desktop:
- create a local DBMS
- install the SemSpect Graph App for Neo4j Desktop and the plugin for the created DBMS
- start the DMBS
When both - MongoDB and Neo4j - are running with the default settings you simply can load the data from MongoDB and import into Neo4j just by specifying the Neo4j password (here we use the default user neo4j
) and the MongoDB database (using the mydatabase
database that was created by the generation script as default).
# mongo2neo4j --neo4j_password=<your-password> mydatabase --conf mongo2neo4j.conf
The configuration file mongo2neo4j.conf
is optional but shows some key configurations that typically occur in most use cases (see the Script Arguments for a full list of options):
{
"chunk_size": 1000,
"sublabels": [
"users.subscription._user,`no subscription`"
],
"relations": [
"users.country,countries.isoname"
]
}
The sublabels configuration creates an additional Neo4j label for the value found in the subscription field (which are either free
, home
, pro
or no value in our sample data) for each object in the user
collection. A postfix _user
will be added to each label and the objects which do not have any value will get the no subscription
label.
The relations configuration furthermore adds an additional relationship to the Neo4j graph from objects in the user
collection to objects in the countries
collection whenever the field country
matches the field isoname
.
Given this configuration the resulting Neo4j graph schema looks as follows (shows the labels used in the graph data and all relationships in between them):
The objects of each collection get at least one Neo4j label named after the collection name (user
, orders
, products
, countries
). However, there is an additional label called orders_products
. That label is created by mongo2neo4j automatically and assigned to each JSON object in the array of the products slot of the objects in the orders collection (gray box in the picture above).
Note: you can skip data generation and data import by importing the also supplied Neo4j dump (file neo4j.dump
) directly into a Neo4j DBMS.
Now open the SemSpect Graph App and explore the graph by picking a label from the tree view on the left and dropping it into the exploration pane in the center. Expand the graph exploration by successively expanding the relationships.