Quick Start Guide

mariehoffmann edited this page Oct 28, 2014 · 8 revisions

Getting Started

This short tutorial walks through bootstrapping, specifying, and building a Myriad-based data generator.

  1. Bootstrap Your Project

To bootstrap the development of a new Myriad-based data generator project, please follow these steps.

First, create the root folder for the new project (let's call it my-datagen) and set up the initial folder structure.

  my_datagen=my-datagen # name of your data generator
  mkdir $my_datagen
  cd $my_datagen
  mkdir vendor

a) Check Out Myriad's Git Repository

Assuming you want to use git to version the source code of your generator, the recommended way to do this is to check out the myriad-toolkit project as a git submodule:

  git init
  git submodule add https://github.com/TU-Berlin-DIMA/myriad-toolkit.git vendor/myriad-toolkit

b) Download Myriad as Zip

Alternatively, you can download the myriad-toolkit as an archive file. Place the unzipped toolkit into the vendor folder and remove the version suffix:

  unzip -d ./vendor <path-to-myriad-toolkit.zip>
  mv ./vendor/myriad-toolkit-<version> ./vendor/myriad-toolkit
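
If you do not want to type the version by hand, the rename step can be sketched as a small shell function. This is only an illustration: strip_version_suffix is a hypothetical helper, not part of the toolkit, and it assumes the archive unpacks into exactly one folder named myriad-toolkit-<version>.

```shell
# Hypothetical helper (not part of Myriad): rename the single unpacked
# myriad-toolkit-<version> folder inside the vendor folder to myriad-toolkit.
strip_version_suffix() {
  vendor="${1:-./vendor}"
  # Expand the glob into the function's positional parameters.
  set -- "$vendor"/myriad-toolkit-*
  # Refuse to guess if there is no match or more than one match.
  if [ "$#" -ne 1 ] || [ ! -d "$1" ]; then
    echo "expected exactly one unpacked myriad-toolkit-<version> folder in $vendor" >&2
    return 1
  fi
  mv "$1" "$vendor/myriad-toolkit"
}
```

Call it as strip_version_suffix ./vendor from the project root. The function stops with an error rather than picking one of several candidate folders.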

The Myriad Toolkit comes with a standard command line assistant tool available under vendor/myriad-toolkit/bin/assistant. This tool simplifies the implementation process by providing support for common development tasks. Because you will probably use the CLI tool a lot (especially if you intend to develop a new generator from scratch), we suggest creating a soft link to it under the project root:

  ln -s vendor/myriad-toolkit/bin/assistant myriad-assistant

Take a look at the list of tasks supported by the assistant by calling it without any options or arguments:

  ./myriad-assistant

The first common task that can be handled by the assistant is the initialization of an empty new project:

  ./myriad-assistant initialize:project --ns=MyDataGen $my_datagen

This will create the basic structure of a new generator project called my-datagen and will use MyDataGen as the default C++ namespace for all C++ library extensions. When the task is complete, you will see two new directories (build and src) as well as several other files in your root folder. The input parameters for the initialize:project task are stored under my-datagen/.myriad-settings and will be used as default values for all other tasks (e.g. compile:prototype).
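
To confirm that the initialization produced the entries named above, a quick check along the following lines may help. It is a minimal sketch: check_project_layout is a hypothetical helper, and it only looks for the build and src directories and the .myriad-settings entry mentioned in this guide, not for the other generated files.

```shell
# Hypothetical helper (not part of the Myriad assistant): check that the
# entries this guide says initialize:project creates are present.
check_project_layout() {
  root="${1:-.}"
  missing=0
  for dir in build src; do
    if [ ! -d "$root/$dir" ]; then
      echo "missing directory: $dir"
      missing=1
    fi
  done
  # -e is used because only the existence of .myriad-settings is assumed here.
  if [ ! -e "$root/.myriad-settings" ]; then
    echo "missing entry: .myriad-settings"
    missing=1
  fi
  [ "$missing" -eq 0 ] && echo "project layout looks complete"
  return "$missing"
}
```

Run check_project_layout . from the project root; a non-zero exit status means at least one expected entry was not found.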

  2. Specify the Data Generator Program

The Myriad toolkit promotes a general-purpose data generation model centered around the generation of pseudo-random sequences of user-defined domain types. To fully specify a Myriad data generator, the user must provide a family of domain types and an associated family of pseudo-random domain type generators (PRDGs), which are essentially programs that map sequences of pseudo-random numbers to sequences of the user-defined domain types.

The specification can be implemented at one of two levels: as a high-level XML prototype specification, or directly at the code level in one of the C++ classes extending the Myriad runtime library. The XML layer is the recommended entry point for new users, as it is well suited for rapid prototyping and probably sufficient for simple relational use cases. Code-level extensions are an advanced feature that is useful when tailor-made data generating logic is required; they are not discussed further in this section.

The XML specification for a Myriad-based data generator project is typically located under src/config/${my_datagen}-prototype.xml. To invoke the Myriad prototype compiler, execute the compile:prototype task in the assistant CLI tool:

  ./myriad-assistant compile:prototype

If you are working from the build folder, you can use the enclosing make target as a shortcut instead:

  make prototype

Note that when you create a new project, the src/config folder is initially empty. Before you execute the compile:prototype task, you have to create the XML specification for your project. You can do this in one of two ways:

  • The first option is to manually create the XML specification from scratch, following the expected XML specification syntax. You can find out more about the Myriad XML specification language in the XML Specification Reference Manual.
  • The second option is to use the Oligos tool to generate an XML specification that mirrors a reference database. This option is useful if you have a reference database with sensitive data that you want to mimic. A short introduction to the Oligos tool is available in the Using Oligos Guide.
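
Whichever option you choose, the compiler needs the specification to exist at the path given earlier. A small guard like the one below can catch a missing file before the task is invoked. This is a sketch under assumptions: compile_prototype_if_present is a hypothetical wrapper, it must be run from the project root, and it relies on the myriad-assistant soft link created during bootstrapping.

```shell
# Hypothetical wrapper (not part of Myriad): run compile:prototype only if
# the XML specification is where this guide says it is typically located.
compile_prototype_if_present() {
  name="$1"
  spec="src/config/${name}-prototype.xml"
  if [ ! -f "$spec" ]; then
    echo "no prototype specification at $spec" >&2
    return 1
  fi
  # Assumes the soft link from the bootstrap step exists in the current folder.
  ./myriad-assistant compile:prototype
}
```

For the example project, the call would be compile_prototype_if_present "$my_datagen".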

  3. Build the Data Generator Binary

Before you start the build process, make sure you have updated your build configuration. To do so, run the ./configure script. For more information about the supported build options, type

  ./configure --help

The ./configure script creates a file named makefile.defs that contains all build variables. To start the build process, go to the build folder and type

  make all

If you want to deploy the compiled data generator to the configured install folder, type

  make install

after the compilation is finished. This will copy the contents of the build/bin, build/config and build/lib folders into a new folder my-datagen located under MYRIAD_INSTALL_DIR (check makefile.defs for the current MYRIAD_INSTALL_DIR value).
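
One way to look up the current value is to search makefile.defs for the variable name. The helper below is a minimal sketch: show_install_dir is a hypothetical name, and because the location and exact syntax of makefile.defs are not specified here, the path is passed in explicitly and every matching line is printed unchanged.

```shell
# Hypothetical helper (not part of Myriad): print every line of the given
# makefile.defs that mentions MYRIAD_INSTALL_DIR.
show_install_dir() {
  defs="${1:-makefile.defs}"
  # grep exits with a non-zero status if the variable is not mentioned at all.
  grep 'MYRIAD_INSTALL_DIR' "$defs"
}
```

Call it with the path to the generated file, for example show_install_dir makefile.defs.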
