This repository contains the schema "blocks" for the GA4GH project, developed as a collaborative effort between members of the Clinical and Phenotypic Data Capture (GA4GH::CP) and the Genomic Knowledge Standards (GA4GH::GKS) work streams.
The primary documents are in the yaml directory, with JSON versions and examples extracted from them. The "readable" documentation is also created from the YAML files and can be accessed through the links below.
- common (raw) object classes, which are used in the schemas themselves
- biosample (raw)
  Most relevant "bio"data (such as diagnoses, phenotypes ...) is stored in the `biosample` object.
- individual (raw)
  The `individual` object contains information that pertains to the whole biological entity from which biosamples are derived (e.g. sex, heritable phenotypes ...).
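
As a rough illustration of this split between sample-level and whole-entity data, the sketch below models the two objects as Python dataclasses. All field names (`individual_id`, `diagnoses`, `phenotypes`, `sex`, `heritable_phenotypes`) and the helper `samples_by_individual` are assumptions made for this example; they are not taken from the YAML schema files.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Individual:
    # Whole-entity information (hypothetical field names).
    id: str
    sex: str
    heritable_phenotypes: List[str] = field(default_factory=list)


@dataclass
class Biosample:
    # Sample-level "bio"data (hypothetical field names).
    id: str
    individual_id: str  # assumed link back to the source individual
    diagnoses: List[str] = field(default_factory=list)
    phenotypes: List[str] = field(default_factory=list)


def samples_by_individual(samples: List[Biosample]) -> Dict[str, List[Biosample]]:
    """Group biosamples under the id of the individual they were derived from."""
    grouped: Dict[str, List[Biosample]] = {}
    for sample in samples:
        grouped.setdefault(sample.individual_id, []).append(sample)
    return grouped


# Two samples from one individual, each carrying its own diagnosis.
donor = Individual(id="ind-1", sex="female")
samples = [
    Biosample(id="bs-1", individual_id=donor.id, diagnoses=["example diagnosis A"]),
    Biosample(id="bs-2", individual_id=donor.id, diagnoses=["example diagnosis B"]),
]
grouped = samples_by_individual(samples)
```

In this sketch the diagnoses stay on each sample, while attributes such as sex are recorded once on the individual.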
The "genomic" parts of the schema are not yet authoritative recommendations of the GA4GH::GKS group; they reflect extended versions of the original, VCF-derived GA4GH schema. Current uses of this schema include the arraymap.org and Beacon+ projects.
- variant (raw)
  The `variant` object includes attributes and examples for both structural (DUP, DEL ...) and precise genome variants.
- callset (raw)
  The `callset` object holds technical data and series information (e.g. the platform and analysis methods used). It is not strictly needed for queries that combine variant and biosample aspects, since in the current implementation the `variant` object contains a reference to the `biosample` it was derived from.
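
The point that combined variant and biosample queries do not need the `callset` can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the schema itself: the field names (`biosample_id`, `variant_type`, `start`, `end`, `reference_bases`, `alternate_bases`, `diagnosis`) and the helper `variants_for_diagnosis` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Biosample:
    id: str
    diagnosis: str  # hypothetical sample-level attribute


@dataclass
class Variant:
    id: str
    biosample_id: str  # direct reference to the biosample (assumed field name)
    start: int
    end: Optional[int] = None              # used here for structural variants
    variant_type: Optional[str] = None     # e.g. "DUP" or "DEL"
    reference_bases: Optional[str] = None  # used here for precise variants
    alternate_bases: Optional[str] = None


def variants_for_diagnosis(
    variants: List[Variant], biosamples: List[Biosample], diagnosis: str
) -> List[Variant]:
    """Select variants whose biosample has the given diagnosis; no callset involved."""
    wanted = {b.id for b in biosamples if b.diagnosis == diagnosis}
    return [v for v in variants if v.biosample_id in wanted]


biosamples = [
    Biosample(id="bs-1", diagnosis="example diagnosis A"),
    Biosample(id="bs-2", diagnosis="example diagnosis B"),
]
variants = [
    # Structural variant: a deletion described by type and interval.
    Variant(id="v-1", biosample_id="bs-1", start=1000, end=5000, variant_type="DEL"),
    # Precise variant: a single-base substitution.
    Variant(id="v-2", biosample_id="bs-2", start=200,
            reference_bases="A", alternate_bases="G"),
]
matches = variants_for_diagnosis(variants, biosamples, "example diagnosis A")
```

Because each variant carries its own biosample reference, the lookup is a single join; a callset would only be consulted when platform or analysis details matter.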