This repository has been archived by the owner on Oct 28, 2022. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 114
/
genotypephenotype.avdl
176 lines (140 loc) · 5.79 KB
/
genotypephenotype.avdl
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
@namespace("org.ga4gh.models")
/**
This protocol defines the associations between genotype
and phenotype (G2P). Associations can be made as a
result of literature curation, computational modeling,
inference, etc., and modeled and shared using this schema.
Here, we follow the dogma of:
Genotype + Environment = Phenotype
where a G2P association is between the G(enotype) in the context of
some E(environment), which gives rise to a P(henotype). These
associations have further evidence, provenance, and attribution.
We leverage the GenomicFeature in the sequenceAnnotation schema here
as it can accomodate any genomic feature from a single nucleotide variation
(SNV), up through a gene, and/or complex rearrangements. Each can
be modeled as genomic features, and generally linked to a phenotype.
Collections of these features can represent a genotype at different levels
of completeness. Therefore, we can represent single allelic variation,
allelic complement, and multiple variants in a genotype that can each or
collectively be associated with a phenotype.
To enable standardized integration, this schema relies heavily on
OntologyTerms, for typing phenotype, genomic features, and levels
of evidence. Suggested ontologies to leverage include (with browser links):
Human Phenotype Ontology (HPO): http://www.ontobee.org/browser/index.php?o=hp
Disease Ontology (DO): http://purl.obolibrary.org/obo/DOID_4
Sequence Ontology (SO): http://www.sequenceontology.org/browser/
Evidence Code Ontology (ECO): http://www.ontobee.org/browser/index.php?o=ECO
Phenotypic Qualities (PATO): http://www.ontobee.org/browser/index.php?o=PATO
*/
protocol GenotypePhenotype {
import idl "common.avdl";
import idl "metadata.avdl";
import idl "sequenceAnnotations.avdl";
/**
The context in which a genotype gives rise to a phenotype.
This is fairly open-ended; as a stub we have a simple ontology term.
For example, a controlled term for a drug, or perhaps an instance of a
complex environment including temperature and air quality, or perhaps
the anatomical environment (gut vs tissue type vs whole organism).
*/
record EnvironmentalContext {
/** The Environment ID. */
union { null, string } id = null;
/**
Examples of some environment types could be drawn from:
Ontology for Biomedical Investigations (OBI): http://purl.obofoundry.org/obo/obi/browse
Chemical Entities of Interest (ChEBI): http://www.ontobee.org/browser/index.php?o=chebi
Environment Ontology (ENVO): http://www.ontobee.org/browser/index.php?o=ENVO
Anatomy (Uberon): http://www.ontobee.org/browser/index.php?o=uberon
*/
OntologyTerm environmentType;
/**
A textual description of the environment. This is used to complement
the structured description in the environmentType field
*/
union { null, string } description = null;
}
/**
An association to a phenotype and related information.
This record is intended primarily to be used in conjunction with variants, but
the record can also be composed with other kinds of entities such as diseases
*/
record PhenotypeInstance {
/** The Phenotype ID. */
union { null, string } id;
/** HPO is recommended */
OntologyTerm type;
/**
PATO is recommended. Often this qualifier might be for abnormal/normal,
or severity.
For example, severe: http://purl.obolibrary.org/obo/PATO_0000396
or abnormal: http://purl.obolibrary.org/obo/PATO_0000460
*/
union { null, array<OntologyTerm> } qualifier = null;
//TODO: also allow quantitative recording?
/**
HPO is recommended, for example, subclasses of
http://purl.obolibrary.org/obo/HP_0011007
*/
union { null, OntologyTerm } ageOfOnset = null;
/**
A textual description of the phenotype. This is used to complement the
structured phenotype description in the type field.
*/
union { null, string } description = null;
}
/**
Evidence for the phenotype association.
This is also a stub for further expansion. We should consider moving this into
it's own schema.
*/
record Evidence {
/** ECO or OBI is recommended */
OntologyTerm evidenceType;
/**
A textual description of the evidence. This is used to complement the
structured description in the evidenceType field
*/
union { null, string } description = null;
}
/**
An association between one or more genomic features and a phenotype.
The instance of association allows us to link a feature to a phenotype,
multiple times, each bearing potentially different levels of confidence,
such as resulting from alternative experiments and analysis.
*/
record FeaturePhenotypeAssociation {
string id;
/**
The set of features of the organism that bears the phenotype.
This could be as complete as a full complement of variants,
or as minimal as the confirmed variants that are known causation
for the annotated phenotype.
Examples of features could be variations at the nucleotide level,
large rearrangements at the chromosome level, or relevant epigenetic
markers. Relevant genomic feature types are suggested to be
those typed in the Sequence Ontology (SO).
The feature set can have only one item, and must not be null.
*/
array<Feature> features;
/**
The evidence for this specific instance of association between the
features and the phenotype.
*/
array<Evidence> evidence;
/**
The phenotypic component of this association.
Note that we delegate this to a separate record to allow us the flexibility
to composition of phenotype associations with records that are not
variant sets - for example, diseases.
*/
PhenotypeInstance phenotype;
/** A textual description of the association. */
union { null, string } description = null;
/**
The context in which the phenotype arises.
Multiple contexts can be specified - these are assumed to all hold together
*/
array<EnvironmentalContext> environmentalContexts;
}
}