Building Blocks and Schemas for GA4GH Implementations

GA4GH SchemaBlocks

A graph showing the basic objects and their relationships. The example attributes are placeholders for elements defined in the general schema description.

This repository contains the schema "blocks" for the GA4GH project, in a collaborative effort between members of the Clinical and Phenotypic Data Capture (GA4GH::CP) and the Genomic Knowledge Standards (GA4GH::GKS) work streams.

The primary documents are in the yaml directory, with JSON versions and examples extracted from them. The "readable" documentation is also created from the YAML files and can be accessed through the links below.

  • common (raw) object classes, which are used in the schemas themselves
  • biosample (raw) Most relevant "bio"data (such as diagnoses, phenotypes ...) is stored in the biosample object.
  • individual (raw) The individual object contains information which pertains to the whole biological entity biosamples are derived from (e.g. sex, heritable phenotypes...).

The "genomic" parts of the schema recommendations do not yet represent authoritative recommendations of the GA4GH::GKS group, but rather reflect extended versions of the original, VCF-derived GA4GH schema. Examples for current use of this schema are e.g. in the and the Beacon+ projects.

  • variant (raw) The variant object includes attributes and examples for both structural (DUP, DEL ...) and precise genome variants.
  • callset (raw) The callset object is for technical data and series information (e.g. the platform and analysis methods used). It is not strictly needed for querying combined variant + biosample aspects, since in the current implementation the variant object contains a reference to the biosample it was derived from.
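
The relationship described above (a variant pointing directly at its biosample, and a biosample pointing at its individual) can be sketched in Python as follows. This is only an illustration under stated assumptions: the attribute names `biosample_id`, `individual_id`, `variant_type` and `diagnosis`, the helper `biosamples_with_variant_type`, and all identifiers and values are hypothetical placeholders and are not taken from the YAML schema files.

```python
from dataclasses import dataclass


# Hypothetical, simplified stand-ins for the schema objects.
# Attribute names are assumptions, not the authoritative schema definitions.
@dataclass
class Individual:
    id: str
    sex: str


@dataclass
class Biosample:
    id: str
    individual_id: str  # assumed reference to the individual the sample was derived from
    diagnosis: str


@dataclass
class Variant:
    id: str
    biosample_id: str   # assumed direct reference to the biosample; no callset needed
    variant_type: str   # e.g. "DUP" or "DEL"


def biosamples_with_variant_type(variants, biosamples, variant_type):
    """Return the biosamples that carry at least one variant of the given type."""
    matching_ids = {v.biosample_id for v in variants if v.variant_type == variant_type}
    return [b for b in biosamples if b.id in matching_ids]


# Made-up example records.
individuals = [Individual("ind-1", "female")]
biosamples = [
    Biosample("bs-1", "ind-1", "glioblastoma"),
    Biosample("bs-2", "ind-1", "normal tissue"),
]
variants = [
    Variant("var-1", "bs-1", "DUP"),
    Variant("var-2", "bs-1", "DEL"),
    Variant("var-3", "bs-2", "DEL"),
]

hits = biosamples_with_variant_type(variants, biosamples, "DUP")
print([b.id for b in hits])  # ['bs-1']
```

Because each variant carries its own biosample reference in this sketch, the combined variant + biosample query never touches a callset record; a callset would only be consulted when platform or analysis details are needed.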