Home

GMQL

What is GMQL?
GenoMetric Query Language Engine

WHAT IS GMQL?

GMQL is a closed algebra over datasets: each operation takes one or two datasets, each containing many samples, and produces a new dataset that also contains many samples. Operations thus build new regions and trace the provenance of each resulting sample by computing its metadata. GMQL is inspired by Pig Latin (a language in the Hadoop family) and is targeted at cloud execution.

GenoMetric Query Language (GMQL) Engine

A GMQL script is expressed as a sequence of GMQL operations with the following structure:

<dataset> = operation(<parameters>) <datasets>

where each dataset stands for a Genomic Data Model (GDM) dataset. Operations are either unary (one input dataset) or binary (two input datasets), and each constructs one result dataset.
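The closure property can be illustrated with a small toy model. The Python sketch below is not the GMQL engine or its API. It only shows how a unary and a binary operation can each consume datasets and return a new dataset, so that operations chain freely. The names `Sample`, `Dataset`, `select`, and `map_count` are hypothetical (loosely modelled on the idea of selecting samples by metadata and mapping one dataset onto another), and the `ref.`/`exp.` metadata prefixes are an assumption chosen here to make provenance visible.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Sample:
    """Toy sample: a list of regions plus a metadata dictionary."""
    regions: List[Tuple]          # (chrom, start, stop, ...extra values)
    metadata: Dict[str, str]


@dataclass
class Dataset:
    """Toy dataset: a collection of samples."""
    samples: List[Sample]


def select(predicate: Callable[[Dict[str, str]], bool], dataset: Dataset) -> Dataset:
    """Unary operation: keep the samples whose metadata satisfies the predicate."""
    return Dataset([s for s in dataset.samples if predicate(s.metadata)])


def overlaps(a: Tuple, b: Tuple) -> bool:
    """True when two regions share a chromosome and their intervals intersect."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def map_count(reference: Dataset, experiment: Dataset) -> Dataset:
    """Binary operation: for every (reference, experiment) sample pair, emit one
    sample holding the reference regions, each extended with the number of
    overlapping experiment regions. Metadata from both inputs is kept under
    prefixes (an assumption of this sketch) to record provenance."""
    out = []
    for ref in reference.samples:
        for exp in experiment.samples:
            regions = [
                (r[0], r[1], r[2], sum(overlaps(r, e) for e in exp.regions))
                for r in ref.regions
            ]
            metadata = {f"ref.{k}": v for k, v in ref.metadata.items()}
            metadata.update({f"exp.{k}": v for k, v in exp.metadata.items()})
            out.append(Sample(regions, metadata))
    return Dataset(out)


# Hypothetical input data, invented for illustration only.
genes = Dataset([Sample([("chr1", 100, 200), ("chr1", 500, 600)],
                        {"annotation": "genes"})])
peaks = Dataset([
    Sample([("chr1", 150, 160), ("chr1", 190, 210), ("chr2", 100, 200)],
           {"cell": "HeLa"}),
    Sample([("chr1", 550, 560)], {"cell": "K562"}),
])

# Each step yields a Dataset, so the output of one operation feeds the next.
hela = select(lambda m: m.get("cell") == "HeLa", peaks)
result = map_count(genes, hela)
```

Because every function returns a `Dataset`, `result` could itself be passed to `select` or `map_count` again, which is the sense in which the algebra is closed.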

For a quick start, please refer to:

Installation Guide

For detailed GMQL language documentation:

GMQL Language Commands and documentation.

For an overview of the GDMS architecture:

Engine architecture and deployments.

To import the GDMS kernel JARs into Scala applications and script GMQL programmatically in Scala:

Scripting GMQL programmatically.

For more information about the GDMS repository architecture and the repository manager:

Repository Manager

The GDMS repository is based on the notion of a dataset. For more information about the data module and the GDM dataset architecture:

GDM DataSet architecture.

A shell API is provided for the GDMS repository to list, add, delete, and alter datasets:

Repository Manager shell API

The first step in the installation is to understand the engine configurations. Currently there are two sets of configurations: one for the repository and one for the executor.

Engine Configurations.
