This is a skeleton repository for generating graph structured data with engine block and AssetHub at DataStax. There are four sections to this README:
A. A quick startup.order refresher
C. Mapping Variables Between Files
A quick refresher on understanding .startup/startup.order
There are four required items for creating an asset in AssetHub that generates graph structured data.
hugo
This startup script downloads and installs hugo. AssetHub uses hugo for asset documentation.
nomnoml
The nomnoml startup script starts the server for documentation. Documentation is viewable on port :1313.
ebdserun
This startup script starts the data insertion process by calling runebdse.sh
uploadNotebook
The uploadNotebook startup script uses the Studio APIs to import NotebookSkeleton.studio-nb.tar into Studio.
Most of your work will be in the following files:
ebdse/runebdse.sh
This file defines the size of your graph, different engine block variables, and engine block commands.
ebdse/activities/driver.yaml
This file contains the DSE Graph statements that engine block will execute.
NotebookSkeleton.studio-nb.tar
This is an example notebook that is posted at the end of the data generation process.
docsrc/content/index.html
This file is the asset-specific documentation, viewable on port :1313.
The bulk of the work is understanding how to translate variables defined in runebdse.sh over to driver.yaml. This section gives you a quick tour of the ones you will use the most.
/tmp/ebdse/ebdse run yaml=driver
The important piece here is yaml=driver. This indicates that the commands for ebdse are in driver.yaml.
tags=phase:create-graph
This indicates which phase to execute in driver.yaml. It maps to the definition in your driver.yaml that looks like:
    blocks:
    - name: create-graph
      tags:
        phase: create-graph
nameofgraph=$graphname
This indicates which graph to alter. In runebdse.sh, you set this via graphname=<<nameOfYourGraph>>.
host=$host
This is configurable to be your local host or a node in your cluster. The best options:
a. node0 (for working in a cluster)
b. localhost (for working locally)
cycles=$person
cycles indicates how many times the statement is to be executed. In runebdse.sh, we defined person=500. This means that the statement in driver.yaml under the phase tag will be executed person number of times, or 500 times.
For schema and set-up related statements, we want cycles=1.
All other variable=$variable statements
This is the key to passing variables over to ebdse execution statements. The variables you define in runebdse.sh can be passed as command line arguments with this syntax.
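
To make the mapping concrete, here is a minimal dry-run sketch of how the arguments described above might be assembled from variables in runebdse.sh. It only prints the commands; it does not call ebdse. The graph name example_graph, the helper build_cmd, and the phase name add-person are hypothetical and used only for illustration. Combining the arguments into a single invocation is also an assumption about how your runebdse.sh is laid out.

```shell
#!/bin/sh
# Sketch only: prints the ebdse commands instead of running them.

# Variables as they might be defined at the top of runebdse.sh.
graphname="example_graph"   # hypothetical graph name
host="localhost"            # or node0 when working in a cluster
person=500                  # cycle count for person statements

# Hypothetical helper: assemble one ebdse invocation for a phase and cycle count.
build_cmd() {
    phase="$1"
    cycles="$2"
    echo "/tmp/ebdse/ebdse run yaml=driver tags=phase:$phase nameofgraph=$graphname host=$host cycles=$cycles"
}

# Schema and set-up phases run exactly once.
schema_cmd=$(build_cmd create-graph 1)

# Data phases run once per element; "add-person" is a hypothetical phase name.
data_cmd=$(build_cmd add-person "$person")

echo "$schema_cmd"
echo "$data_cmd"
```

Printing the commands first is a cheap way to check that every $variable expands to the value you expect before pointing ebdse at a real node.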