Translation layer between the GDC Data Dictionary and psqlgraph
GDC Data Model

Repo to keep information about the GDC data model design.


To install the gdcdatamodel library run the setup script:

❯ python setup.py install

Jupyter + Graphviz

It's helpful to examine the relationships between nodes visually. One way to do this is to run a Jupyter notebook with a Python 2 kernel. Combined with Graphviz's SVG support, this lets you view a graphical representation of a subgraph directly in a REPL. To do so, install the dependencies listed in dev-requirements.txt. There is an example Jupyter notebook at examples/jupyter_example.ipynb (replicated in examples/ for clarity).

pip install -r dev-requirements.txt
PG_USER=* PG_HOST=* PG_DATABASE=* PG_PASSWORD=* jupyter notebook examples/jupyter_example.ipynb
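
To illustrate the kind of output Graphviz renders, here is a minimal sketch that turns a list of edges into DOT source. The helper name `edges_to_dot` and the edge list are hypothetical and are not taken from the example notebook; the node and edge labels are illustrative only.

```python
def edges_to_dot(edges, name="subgraph_view"):
    """Return DOT source for a list of (src, label, dst) edge triples."""
    lines = ["digraph %s {" % name, "  rankdir=LR;"]
    for src, label, dst in edges:
        # Quote node names so that ids containing dashes stay valid DOT
        lines.append('  "%s" -> "%s" [label="%s"];' % (src, dst, label))
    lines.append("}")
    return "\n".join(lines)


# Illustrative edges only; these are not read from the real data model.
edges = [
    ("aliquot", "derived_from", "analyte"),
    ("analyte", "derived_from", "portion"),
]
dot = edges_to_dot(edges)
print(dot)
```

In a notebook where the graphviz Python package is available (an assumption about what dev-requirements.txt installs), ending a cell with `graphviz.Source(dot)` should display the graph inline as SVG.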


Visual representation

For instructions on how to build the Graphviz representation of the datamodel, see the docs readme.


Before continuing you must have the following programs installed:

The gdcdatamodel library requires the following pip dependencies

Project Dependencies

Project dependencies are managed using pip.

Example validation usage

from gdcdatamodel import node_avsc_object
from gdcdatamodel.mappings import get_participant_es_mapping, get_file_es_mapping
from avro.io import validate  # assumed: validate(schema, datum) from the avro package
import json

# Load an example node and check it against the node Avro schema
with open('examples/nodes/aliquot_valid.json', 'r') as f:
    node = json.load(f)
print(validate(node_avsc_object, node))  # if valid, prints True

print(get_participant_es_mapping())  # Prints participant elasticsearch mapping
print(get_file_es_mapping())         # Prints file elasticsearch mapping
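
When several node files need checking, the single-file example above can be wrapped in a small loop. The following is a sketch under stated assumptions: `validate_files` and `failed_paths` are hypothetical helpers, not part of gdcdatamodel, and the validator is passed in as a callable with the `validate_fn(schema, datum)` shape used above.

```python
import json


def validate_files(paths, schema, validate_fn):
    """Map each path to True/False using validate_fn(schema, datum)."""
    results = {}
    for path in paths:
        with open(path, "r") as f:
            datum = json.load(f)
        results[path] = bool(validate_fn(schema, datum))
    return results


def failed_paths(results):
    """Return the sorted list of paths that did not validate."""
    return sorted(path for path, ok in results.items() if not ok)
```

With the imports from the example above, a call might look like `validate_files(['examples/nodes/aliquot_valid.json'], node_avsc_object, validate)`. Passing the validator as an argument keeps the helper independent of any one schema library.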

Example Elasticsearch mapping usage

from gdcdatamodel import mappings
print(mappings.get_file_es_mapping())
print(mappings.get_participant_es_mapping())
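
Printed mappings are easier to read or save when serialized as indented JSON. This sketch assumes the mapping functions return plain, JSON-serializable dicts, which is not confirmed here; `mapping_to_json` is a hypothetical helper and `toy_mapping` is a made-up stand-in for a real mapping.

```python
import json


def mapping_to_json(mapping):
    """Serialize a mapping dict as indented JSON with stable key order."""
    return json.dumps(mapping, indent=2, sort_keys=True)


# Made-up stand-in for the dict returned by a mapping function.
toy_mapping = {"properties": {"file_name": {"type": "string"}}}
text = mapping_to_json(toy_mapping)
print(text)
```

Sorting the keys keeps the output stable between runs, which helps when comparing two versions of a mapping.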


Run the tests with nosetests:

❯ nosetests -v
test_invalid_aliquot_node (test_avro_schemas.TestAvroSchemaValidation) ... ok
test_valid_aliquot_node (test_avro_schemas.TestAvroSchemaValidation) ... ok

Ran 2 tests in 0.033s



