Issue 404 #416

Merged
merged 13 commits into from
Apr 1, 2023
1 change: 1 addition & 0 deletions .gitignore
@@ -3,3 +3,4 @@ __pycache__
archive
docs/build
venv
pyproject.toml
3 changes: 2 additions & 1 deletion .requirements.txt
@@ -4,4 +4,5 @@ jsonschema==3.2.0
ipython
pyyaml
ga4gh.gks.metaschema==0.2.0rc4
sphinx ~= 3.5
sphinx ~= 4.5
sphinx-rtd-theme ~= 1.2
38 changes: 20 additions & 18 deletions docs/source/terms_and_model.rst
@@ -368,22 +368,21 @@ Systemic Variation

.. include:: defs/SystemicVariation.rst

.. _AbsoluteCopyNumber:
.. _CopyNumberCount:

AbsoluteCopyNumber
$$$$$$$$$$$$$$$$$$
CopyNumberCount
$$$$$$$$$$$$$$$

*Absolute Copy Number Variation* captures the copies of a molecule within a
genome, and can be used to express concepts such as amplification
and copy loss. Copy Number Variation has conflated meanings in the
*Copy Number Count* captures the integral copies of a molecule within a
genome. Copy Number Count has conflated meanings in the
genomics community, and can mean either (or both) the notion of copy
number *in a genome* or copy number *on a molecule*. VRS separates
the concerns of these two types of statements; this concept is a type
of :ref:`SystemicVariation` and so describes the number of copies in a
genome. The related :ref:`MolecularVariation` concept can be expressed
as an :ref:`Allele` with a :ref:`RepeatedSequenceExpression`.

.. include:: defs/AbsoluteCopyNumber.rst
.. include:: defs/CopyNumberCount.rst

**Examples**

@@ -401,21 +400,24 @@ Two, three, or four total copies of BRCA1:
"gene_id": "ncbigene:348",
"type": "Gene"
},
"type": "AbsoluteCopyNumber"
"type": "CopyNumberCount"
}

.. _RelativeCopyNumber:
.. _CopyNumberChange:

RelativeCopyNumber
$$$$$$$$$$$$$$$$$$
CopyNumberChange
$$$$$$$$$$$$$$$$

*Relative Copy Number Variation* captures a classification of copies
*Copy Number Change* captures a categorization of copies
of a molecule within a system, relative to a baseline. These types
of Variation are common outputs from CNV callers, particularly in the
somatic domain where Absolute Copy Counts are difficult to estimate
and less useful in practice than relative statements.
somatic domain where integral :ref:`CopyNumberCount` are difficult to
estimate and less useful in practice than relative statements. Somatic CNV
callers typically express changes as relative statements, and many HGVS
expressions submitted to express copy number variation are interpreted to be
relative copy changes.

.. include:: defs/RelativeCopyNumber.rst
.. include:: defs/CopyNumberChange.rst

**Examples**

@@ -424,12 +426,12 @@ Low-level copy gain of BRCA1:
.. parsed-literal::

{
"relative_copy_class": "low-level gain",
"relative_copy_class": "EFO_0030071", # low-level gain
"subject": {
"gene_id": "ncbigene:348",
"gene_id": "ncbigene:348", # BRCA1 gene
"type": "Gene"
},
"type": "RelativeCopyNumber"
"type": "CopyNumberChange"
}

.. _genotype:
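The renames in this file (``AbsoluteCopyNumber`` to ``CopyNumberCount``, ``RelativeCopyNumber`` to ``CopyNumberChange``) and the schema's switch from ``relative_copy_class`` to ``copy_assessment`` suggest a small migration step for existing messages. The following is a minimal sketch, not part of this PR: ``migrate`` and both lookup tables are hypothetical names, and the label-to-code table covers only the two old labels that reappear verbatim in the new enum descriptions. Note that the documentation example above still shows the ``relative_copy_class`` key, whereas the schema files in this diff require ``copy_assessment``; the sketch follows the schema.

```python
import copy

# Type renames taken from the diff above.
TYPE_RENAMES = {
    "AbsoluteCopyNumber": "CopyNumberCount",
    "RelativeCopyNumber": "CopyNumberChange",
}

# ASSUMPTION: only old labels that reappear verbatim in the new enum
# descriptions are mapped. The PR does not define a complete old-to-new
# mapping, so every other label is rejected rather than guessed.
LABEL_TO_EFO = {
    "low-level gain": "EFO_0030071",
    "high-level gain": "EFO_0030072",
}


def migrate(message: dict) -> dict:
    """Return a copy of `message` that uses the renamed type and field."""
    out = copy.deepcopy(message)
    if out.get("type") in TYPE_RENAMES:
        out["type"] = TYPE_RENAMES[out["type"]]
    if "relative_copy_class" in out:
        label = out.pop("relative_copy_class")
        if label not in LABEL_TO_EFO:
            raise ValueError(f"no unambiguous EFO code for {label!r}")
        out["copy_assessment"] = LABEL_TO_EFO[label]
    return out


old = {
    "relative_copy_class": "low-level gain",
    "subject": {"gene_id": "ncbigene:348", "type": "Gene"},
    "type": "RelativeCopyNumber",
}
new = migrate(old)
```

Labels such as "partial loss" or "copy neutral" raise ``ValueError`` here on purpose, since choosing a replacement code for them is a curation decision rather than a mechanical rename.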
3 changes: 0 additions & 3 deletions schema/defs/vrs/CopyNumber.rst

This file was deleted.

34 changes: 34 additions & 0 deletions schema/defs/vrs/CopyNumberChange.rst
@@ -0,0 +1,34 @@
**Computational Definition**

An assessment of the copy number of a :ref:`Location` or a :ref:`Feature` within a system (e.g. genome, cell, etc.) relative to a baseline ploidy.

**Information Model**

Some CopyNumberChange attributes are inherited from :ref:`Variation`.

.. list-table::
:class: clean-wrap
:header-rows: 1
:align: left
:widths: auto

* - Field
- Type
- Limits
- Description
* - _id
- :ref:`CURIE`
- 0..1
- Variation Id. MUST be unique within document.
* - type
- string
- 1..1
- MUST be "CopyNumberChange"
* - subject
- :ref:`Location` | :ref:`CURIE` | :ref:`Feature`
- 1..1
- A location for which the number of systemic copies is described.
* - copy_assessment
- string
- 1..1
- MUST be one of "EFO_0030069" (complete genomic loss), "EFO_0020073" (high-level loss), "EFO_0030068" (low-level loss), "EFO_0030067" (loss), "EFO_0030064" (regional base ploidy), "EFO_0030070" (gain), "EFO_0030071" (low-level gain), "EFO_0030072" (high-level gain).
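Because ``copy_assessment`` now carries opaque EFO codes rather than readable labels, consumers will probably want a lookup table. The sketch below copies the code/label pairs from the field description above; the names ``COPY_ASSESSMENT_LABELS`` and ``describe_copy_assessment`` are illustrative and do not come from the VRS codebase.

```python
# Code/label pairs copied from the copy_assessment description above.
COPY_ASSESSMENT_LABELS = {
    "EFO_0030069": "complete genomic loss",
    "EFO_0020073": "high-level loss",
    "EFO_0030068": "low-level loss",
    "EFO_0030067": "loss",
    "EFO_0030064": "regional base ploidy",
    "EFO_0030070": "gain",
    "EFO_0030071": "low-level gain",
    "EFO_0030072": "high-level gain",
}


def describe_copy_assessment(code: str) -> str:
    """Return the human-readable label for an allowed copy_assessment code."""
    try:
        return COPY_ASSESSMENT_LABELS[code]
    except KeyError:
        raise ValueError(f"not an allowed copy_assessment value: {code!r}") from None


label = describe_copy_assessment("EFO_0030071")
```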
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
**Computational Definition**

The absolute count of discrete copies of a :ref:`Location`, within a system (e.g. genome, cell, etc.).
The absolute count of discrete copies of a :ref:`Location` or :ref:`Feature`, within a system (e.g. genome, cell, etc.).

**Information Model**

Some AbsoluteCopyNumber attributes are inherited from :ref:`Variation`.
Some CopyNumberCount attributes are inherited from :ref:`Variation`.

.. list-table::
:class: clean-wrap
@@ -23,9 +23,9 @@ Some AbsoluteCopyNumber attributes are inherited from :ref:`Variation`.
* - type
- string
- 1..1
- MUST be "AbsoluteCopyNumber"
- MUST be "CopyNumberCount"
* - subject
- :ref:`Location` | :ref:`CURIE`
- :ref:`Location` | :ref:`CURIE` | :ref:`Feature`
- 1..1
- A location for which the number of systemic copies is described.
* - copies
34 changes: 0 additions & 34 deletions schema/defs/vrs/RelativeCopyNumber.rst

This file was deleted.

4 changes: 2 additions & 2 deletions schema/ga4gh.yaml
@@ -25,8 +25,8 @@ identifiers:
Text: VT
Genotype: GT
Haplotype: VH
AbsoluteCopyNumber: VAC
RelativeCopyNumber: VRC
CopyNumberCount: CN
CopyNumberChange: CX
SequenceLocation: VSL
ChromosomeLocation: VCL
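The type prefixes changed here (``CN``, ``CX``) end up in computed identifiers of the form ``ga4gh:<prefix>.<digest>``, as the validation fixtures later in this diff show. A minimal sketch of that composition follows, assuming the digest is the truncated SHA-512 (``sha512t24u``) that VRS specifies; the function names and the two-entry prefix table are illustrative, and this is not the reference implementation.

```python
import base64
import hashlib

# Prefixes introduced by this change to schema/ga4gh.yaml.
TYPE_PREFIXES = {
    "CopyNumberCount": "CN",
    "CopyNumberChange": "CX",
}


def sha512t24u(blob: bytes) -> str:
    """SHA-512 digest truncated to 24 bytes, then base64url-encoded."""
    digest = hashlib.sha512(blob).digest()[:24]
    return base64.urlsafe_b64encode(digest).decode("ascii")


def ga4gh_identify(type_name: str, serialized: bytes) -> str:
    """Compose a computed identifier from a type name and its serialization."""
    return f"ga4gh:{TYPE_PREFIXES[type_name]}.{sha512t24u(serialized)}"


# Illustrative input only; real identifiers hash the canonical serialization.
identifier = ga4gh_identify("CopyNumberChange", b"{}")
```

Twenty-four bytes encode to exactly 32 base64url characters, so the digest never carries ``=`` padding. Since the prefix is part of the identifier, renaming it changes every previously computed identifier for these two types.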

69 changes: 35 additions & 34 deletions schema/vrs-source.yaml
@@ -71,7 +71,8 @@ definitions:
A Variation of multiple molecules in the context of a system, e.g.
a genome, sample, or homologous chromosomes.
oneOf:
- $ref: "#/definitions/CopyNumber"
- $ref: "#/definitions/CopyNumberCount"
- $ref: "#/definitions/CopyNumberChange"
- $ref: "#/definitions/Genotype"
discriminator:
propertyName: type
@@ -187,64 +188,64 @@ definitions:
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# SystemicVariation

CopyNumber:
CopyNumberCount:
inherits: SystemicVariation
description: >-
The copies of :ref:`Location` in a system, expressed as an absolute integer
quantity (:ref:`AbsoluteCopyNumber`) or a qualitative description of copies
relative to a baseline state (:ref:`RelativeCopyNumber`).
heritable_properties:
subject:
oneOf:
- $ref: "#/definitions/Location"
- $ref: "#/definitions/CURIE"
description: >-
A location for which the number of systemic copies is described.
heritable_required: [ "subject" ]

AbsoluteCopyNumber:
inherits: CopyNumber
type: object
maturity: draft
description: >-
The absolute count of discrete copies of a :ref:`Location`,
The absolute count of discrete copies of a :ref:`Location` or :ref:`Feature`,
within a system (e.g. genome, cell, etc.).
properties:
type:
type: string
const: "AbsoluteCopyNumber"
default: "AbsoluteCopyNumber"
const: "CopyNumberCount"
default: "CopyNumberCount"
description: >-
MUST be "AbsoluteCopyNumber"
MUST be "CopyNumberCount"
subject:
oneOf:
- $ref: "#/definitions/Location"
- $ref: "#/definitions/CURIE"
- $ref: "#/definitions/Feature"
description: >-
A location for which the number of systemic copies is described.
copies:
oneOf:
- $ref: "#/definitions/Number"
- $ref: "#/definitions/IndefiniteRange"
- $ref: "#/definitions/DefiniteRange"
description: >-
The integral number of copies of the subject in a system
required: [ "copies" ]
required: [ "subject", "copies" ]

RelativeCopyNumber:
inherits: CopyNumber
CopyNumberChange:
inherits: SystemicVariation
type: object
maturity: draft
description: >-
The copies of a :ref:`Location` within a system (e.g. genome, cell, etc.)
relative to a baseline state.
An assessment of the copy number of a :ref:`Location` or a :ref:`Feature` within a system (e.g. genome, cell,
etc.) relative to a baseline ploidy.
properties:
type:
type: string
const: "RelativeCopyNumber"
default: "RelativeCopyNumber"
const: "CopyNumberChange"
default: "CopyNumberChange"
description: >-
MUST be "RelativeCopyNumber"
relative_copy_class:
MUST be "CopyNumberChange"
subject:
oneOf:
- $ref: "#/definitions/Location"
- $ref: "#/definitions/CURIE"
- $ref: "#/definitions/Feature"
description: >-
A location for which the number of systemic copies is described.
copy_assessment:
type: string
enum: [ "complete loss", "partial loss", "copy neutral", "low-level gain", "high-level gain" ]
enum: [ "EFO_0030069", "EFO_0020073", "EFO_0030068", "EFO_0030067", "EFO_0030064", "EFO_0030070", "EFO_0030071", "EFO_0030072" ]
description: >-
MUST be one of "complete loss", "partial loss", "copy neutral", "low-level gain" or "high-level gain".
required: [ "relative_copy_class" ]
MUST be one of "EFO_0030069" (complete genomic loss), "EFO_0020073" (high-level loss),
"EFO_0030068" (low-level loss), "EFO_0030067" (loss), "EFO_0030064" (regional base ploidy),
"EFO_0030070" (gain), "EFO_0030071" (low-level gain), "EFO_0030072" (high-level gain).
required: [ "subject", "copy_assessment" ]

Genotype:
inherits: SystemicVariation
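With the shared ``CopyNumber`` parent removed, ``CopyNumberCount`` now declares ``subject`` and ``copies`` as required itself. The sketch below hand-checks those constraints using only the standard library; it is a simplification for illustration (a real pipeline would more likely validate against the generated JSON Schema), and ``check_copy_number_count`` is a hypothetical helper.

```python
from typing import List

# The three alternatives listed under `copies` in the source schema.
COPIES_TYPES = {"Number", "IndefiniteRange", "DefiniteRange"}


def check_copy_number_count(obj: dict) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if obj.get("type") != "CopyNumberCount":
        problems.append('type MUST be "CopyNumberCount"')
    for field in ("subject", "copies"):
        if field not in obj:
            problems.append(f"missing required field: {field}")
    copies = obj.get("copies")
    if isinstance(copies, dict) and copies.get("type") not in COPIES_TYPES:
        problems.append("copies must be a Number, IndefiniteRange or DefiniteRange")
    return problems


# Shape borrowed from the ">=3 copies APOE" validation fixture in this PR.
candidate = {
    "type": "CopyNumberCount",
    "subject": "oz3NEuhtbBep3yqu3wrhqfDKbLPK7vcE",
    "copies": {"type": "IndefiniteRange", "comparator": ">=", "value": 3},
}
problems = check_copy_number_count(candidate)
```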
74 changes: 41 additions & 33 deletions schema/vrs.json
@@ -7,19 +7,19 @@
"description": "A representation of the state of one or more biomolecules.",
"oneOf": [
{
"$ref": "#/definitions/AbsoluteCopyNumber"
"$ref": "#/definitions/Allele"
},
{
"$ref": "#/definitions/Allele"
"$ref": "#/definitions/CopyNumberChange"
},
{
"$ref": "#/definitions/Genotype"
"$ref": "#/definitions/CopyNumberCount"
},
{
"$ref": "#/definitions/Haplotype"
"$ref": "#/definitions/Genotype"
},
{
"$ref": "#/definitions/RelativeCopyNumber"
"$ref": "#/definitions/Haplotype"
},
{
"$ref": "#/definitions/Text"
@@ -64,13 +64,13 @@
"description": "A Variation of multiple molecules in the context of a system, e.g. a genome, sample, or homologous chromosomes.",
"oneOf": [
{
"$ref": "#/definitions/AbsoluteCopyNumber"
"$ref": "#/definitions/CopyNumberChange"
},
{
"$ref": "#/definitions/Genotype"
"$ref": "#/definitions/CopyNumberCount"
},
{
"$ref": "#/definitions/RelativeCopyNumber"
"$ref": "#/definitions/Genotype"
}
],
"discriminator": {
@@ -221,23 +221,23 @@
"ordered": false,
"items": {
"oneOf": [
{
"$ref": "#/definitions/AbsoluteCopyNumber"
},
{
"$ref": "#/definitions/Allele"
},
{
"$ref": "#/definitions/CURIE"
},
{
"$ref": "#/definitions/Genotype"
"$ref": "#/definitions/CopyNumberChange"
},
{
"$ref": "#/definitions/Haplotype"
"$ref": "#/definitions/CopyNumberCount"
},
{
"$ref": "#/definitions/RelativeCopyNumber"
"$ref": "#/definitions/Genotype"
},
{
"$ref": "#/definitions/Haplotype"
},
{
"$ref": "#/definitions/Text"
@@ -256,20 +256,19 @@
],
"additionalProperties": false
},
"AbsoluteCopyNumber": {
"CopyNumberCount": {
"type": "object",
"maturity": "draft",
"description": "The absolute count of discrete copies of a Location, within a system (e.g. genome, cell, etc.).",
"description": "The absolute count of discrete copies of a Location or Feature, within a system (e.g. genome, cell, etc.).",
"properties": {
"_id": {
"$ref": "#/definitions/CURIE",
"description": "Variation Id. MUST be unique within document."
},
"type": {
"type": "string",
"const": "AbsoluteCopyNumber",
"default": "AbsoluteCopyNumber",
"description": "MUST be \"AbsoluteCopyNumber\""
"const": "CopyNumberCount",
"default": "CopyNumberCount",
"description": "MUST be \"CopyNumberCount\""
},
"subject": {
"oneOf": [
@@ -279,6 +278,9 @@
{
"$ref": "#/definitions/ChromosomeLocation"
},
{
"$ref": "#/definitions/Gene"
},
{
"$ref": "#/definitions/SequenceLocation"
}
@@ -307,20 +309,20 @@
],
"additionalProperties": false
},
"RelativeCopyNumber": {
"CopyNumberChange": {
"type": "object",
"maturity": "draft",
"description": "The copies of a Location within a system (e.g. genome, cell, etc.) relative to a baseline state.",
"description": "An assessment of the copy number of a Location or a Feature within a system (e.g. genome, cell, etc.) relative to a baseline ploidy.",
"properties": {
"_id": {
"$ref": "#/definitions/CURIE",
"description": "Variation Id. MUST be unique within document."
},
"type": {
"type": "string",
"const": "RelativeCopyNumber",
"default": "RelativeCopyNumber",
"description": "MUST be \"RelativeCopyNumber\""
"const": "CopyNumberChange",
"default": "CopyNumberChange",
"description": "MUST be \"CopyNumberChange\""
},
"subject": {
"oneOf": [
@@ -330,26 +332,32 @@
{
"$ref": "#/definitions/ChromosomeLocation"
},
{
"$ref": "#/definitions/Gene"
},
{
"$ref": "#/definitions/SequenceLocation"
}
],
"description": "A location for which the number of systemic copies is described."
},
"relative_copy_class": {
"copy_assessment": {
"type": "string",
"enum": [
"complete loss",
"partial loss",
"copy neutral",
"low-level gain",
"high-level gain"
"EFO_0030069",
"EFO_0020073",
"EFO_0030068",
"EFO_0030067",
"EFO_0030064",
"EFO_0030070",
"EFO_0030071",
"EFO_0030072"
],
"description": "MUST be one of \"complete loss\", \"partial loss\", \"copy neutral\", \"low-level gain\" or \"high-level gain\"."
"description": "MUST be one of \"EFO_0030069\" (complete genomic loss), \"EFO_0020073\" (high-level loss), \"EFO_0030068\" (low-level loss), \"EFO_0030067\" (loss), \"EFO_0030064\" (regional base ploidy), \"EFO_0030070\" (gain), \"EFO_0030071\" (low-level gain), \"EFO_0030072\" (high-level gain)."
}
},
"required": [
"relative_copy_class",
"copy_assessment",
"subject",
"type"
],
62 changes: 34 additions & 28 deletions schema/vrs.yaml
@@ -5,11 +5,11 @@ definitions:
Variation:
description: A representation of the state of one or more biomolecules.
oneOf:
- $ref: '#/definitions/AbsoluteCopyNumber'
- $ref: '#/definitions/Allele'
- $ref: '#/definitions/CopyNumberChange'
- $ref: '#/definitions/CopyNumberCount'
- $ref: '#/definitions/Genotype'
- $ref: '#/definitions/Haplotype'
- $ref: '#/definitions/RelativeCopyNumber'
- $ref: '#/definitions/Text'
- $ref: '#/definitions/VariationSet'
discriminator:
@@ -34,9 +34,9 @@ definitions:
description: A Variation of multiple molecules in the context of a system, e.g.
a genome, sample, or homologous chromosomes.
oneOf:
- $ref: '#/definitions/AbsoluteCopyNumber'
- $ref: '#/definitions/CopyNumberChange'
- $ref: '#/definitions/CopyNumberCount'
- $ref: '#/definitions/Genotype'
- $ref: '#/definitions/RelativeCopyNumber'
discriminator:
propertyName: type
Allele:
@@ -138,12 +138,12 @@ definitions:
ordered: false
items:
oneOf:
- $ref: '#/definitions/AbsoluteCopyNumber'
- $ref: '#/definitions/Allele'
- $ref: '#/definitions/CURIE'
- $ref: '#/definitions/CopyNumberChange'
- $ref: '#/definitions/CopyNumberCount'
- $ref: '#/definitions/Genotype'
- $ref: '#/definitions/Haplotype'
- $ref: '#/definitions/RelativeCopyNumber'
- $ref: '#/definitions/Text'
- $ref: '#/definitions/VariationSet'
description: List of Variation objects or identifiers. Attribute is required,
@@ -152,24 +152,24 @@ definitions:
- members
- type
additionalProperties: false
AbsoluteCopyNumber:
CopyNumberCount:
type: object
maturity: draft
description: The absolute count of discrete copies of a Location, within a system
(e.g. genome, cell, etc.).
description: The absolute count of discrete copies of a Location or Feature, within
a system (e.g. genome, cell, etc.).
properties:
_id:
$ref: '#/definitions/CURIE'
description: Variation Id. MUST be unique within document.
type:
type: string
const: AbsoluteCopyNumber
default: AbsoluteCopyNumber
description: MUST be "AbsoluteCopyNumber"
const: CopyNumberCount
default: CopyNumberCount
description: MUST be "CopyNumberCount"
subject:
oneOf:
- $ref: '#/definitions/CURIE'
- $ref: '#/definitions/ChromosomeLocation'
- $ref: '#/definitions/Gene'
- $ref: '#/definitions/SequenceLocation'
description: A location for which the number of systemic copies is described.
copies:
@@ -183,38 +183,44 @@ definitions:
- subject
- type
additionalProperties: false
RelativeCopyNumber:
CopyNumberChange:
type: object
maturity: draft
description: The copies of a Location within a system (e.g. genome, cell, etc.)
relative to a baseline state.
description: An assessment of the copy number of a Location or a Feature within
a system (e.g. genome, cell, etc.) relative to a baseline ploidy.
properties:
_id:
$ref: '#/definitions/CURIE'
description: Variation Id. MUST be unique within document.
type:
type: string
const: RelativeCopyNumber
default: RelativeCopyNumber
description: MUST be "RelativeCopyNumber"
const: CopyNumberChange
default: CopyNumberChange
description: MUST be "CopyNumberChange"
subject:
oneOf:
- $ref: '#/definitions/CURIE'
- $ref: '#/definitions/ChromosomeLocation'
- $ref: '#/definitions/Gene'
- $ref: '#/definitions/SequenceLocation'
description: A location for which the number of systemic copies is described.
relative_copy_class:
copy_assessment:
type: string
enum:
- complete loss
- partial loss
- copy neutral
- low-level gain
- high-level gain
description: MUST be one of "complete loss", "partial loss", "copy neutral",
"low-level gain" or "high-level gain".
- EFO_0030069
- EFO_0020073
- EFO_0030068
- EFO_0030067
- EFO_0030064
- EFO_0030070
- EFO_0030071
- EFO_0030072
description: MUST be one of "EFO_0030069" (complete genomic loss), "EFO_0020073"
(high-level loss), "EFO_0030068" (low-level loss), "EFO_0030067" (loss),
"EFO_0030064" (regional base ploidy), "EFO_0030070" (gain), "EFO_0030071"
(low-level gain), "EFO_0030072" (high-level gain).
required:
- relative_copy_class
- copy_assessment
- subject
- type
additionalProperties: false
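``SystemicVariation`` is a ``oneOf`` with a ``discriminator`` on ``type``, so a consumer can pick the concrete definition from that one property instead of trying each alternative in turn. A rough sketch of that dispatch, using the member list from this diff (``resolve_definition`` is a made-up helper, not a VRS API):

```python
# Members of the SystemicVariation oneOf after this change.
SYSTEMIC_VARIATION_TYPES = ("CopyNumberChange", "CopyNumberCount", "Genotype")


def resolve_definition(obj: dict) -> str:
    """Map an object's `type` discriminator to its schema definition ref."""
    type_name = obj.get("type")
    if type_name not in SYSTEMIC_VARIATION_TYPES:
        raise ValueError(f"not a SystemicVariation type: {type_name!r}")
    return f"#/definitions/{type_name}"


ref = resolve_definition({"type": "CopyNumberCount"})
```

Messages that still carry the pre-rename type names fail this lookup, which is one way to detect documents that predate this change.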
22 changes: 11 additions & 11 deletions validation/models.yaml
@@ -356,7 +356,7 @@ Haplotype:
ga4gh_digest: i8owCOBHIlRCPtcw_WzRFNTunwJRy99-
ga4gh_identify: ga4gh:VH.i8owCOBHIlRCPtcw_WzRFNTunwJRy99-
ga4gh_serialize: '{"members":["-kUJh47Pu24Y3Wdsk1rXEDKsXWNY-68x","Z_rYRxpUvwqCLsCBO3YLl70o2uf9_Op1"],"type":"Haplotype"}'
AbsoluteCopyNumber:
CopyNumberCount:
- name: ">=3 copies APOE"
in:
copies:
@@ -374,15 +374,15 @@ AbsoluteCopyNumber:
value: 44905795
type: SequenceInterval
type: SequenceLocation
type: AbsoluteCopyNumber
type: CopyNumberCount
out:
ga4gh_digest: New2SZ7NZU_gBbjzcmA8IwmA-EShG5JI
ga4gh_identify: ga4gh:VAC.New2SZ7NZU_gBbjzcmA8IwmA-EShG5JI
ga4gh_serialize: '{"copies":{"comparator":">=","type":"IndefiniteRange","value":3},"subject":"oz3NEuhtbBep3yqu3wrhqfDKbLPK7vcE","type":"AbsoluteCopyNumber"}'
RelativeCopyNumber:
ga4gh_digest: salZa9yW-GduRxsRFwIGCQvi_YfpjeF4
ga4gh_identify: ga4gh:CN.salZa9yW-GduRxsRFwIGCQvi_YfpjeF4
ga4gh_serialize: '{"copies":{"comparator":">=","type":"IndefiniteRange","value":3},"subject":"oz3NEuhtbBep3yqu3wrhqfDKbLPK7vcE","type":"CopyNumberCount"}'
CopyNumberChange:
- name: "Low-level copy gain of BRCA1"
in:
relative_copy_class: low-level gain
copy_assessment: EFO_0030071
subject:
sequence_id: ga4gh:SQ.IIB53T8CNeJJdUqzn9V_JnRtQadwWCbl
interval:
@@ -394,11 +394,11 @@ RelativeCopyNumber:
value: 44905795
type: SequenceInterval
type: SequenceLocation
type: RelativeCopyNumber
type: CopyNumberChange
out:
ga4gh_digest: 69x30aZU0KQF0RDqq3CaaVBid_xrgzrI
ga4gh_identify: ga4gh:VRC.69x30aZU0KQF0RDqq3CaaVBid_xrgzrI
ga4gh_serialize: '{"relative_copy_class":"low-level gain","subject":"oz3NEuhtbBep3yqu3wrhqfDKbLPK7vcE","type":"RelativeCopyNumber"}'
ga4gh_digest: MLA_TGdelT-_jrlsC6N19S2itmcWqHfj
ga4gh_identify: ga4gh:CX.MLA_TGdelT-_jrlsC6N19S2itmcWqHfj
ga4gh_serialize: '{"copy_assessment":"EFO_0030071","subject":"oz3NEuhtbBep3yqu3wrhqfDKbLPK7vcE","type":"CopyNumberChange"}'
Text:
-
in:
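The ``ga4gh_serialize`` strings in these fixtures are compact JSON with keys in sorted order. The sketch below reproduces the ``CopyNumberChange`` fixture string under the assumption that the subject has already been replaced by its digest, as it appears in the fixture; computing that digest from the nested location is outside the scope of this example. The function name mirrors the fixture key but is only an illustration.

```python
import json


def ga4gh_serialize(obj: dict) -> str:
    """Serialize with sorted keys and no insignificant whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))


# ASSUMPTION: `subject` is already the digest of the nested SequenceLocation.
copy_number_change = {
    "type": "CopyNumberChange",
    "subject": "oz3NEuhtbBep3yqu3wrhqfDKbLPK7vcE",
    "copy_assessment": "EFO_0030071",
}
serialized = ga4gh_serialize(copy_number_change)
```

Because key order and values both feed the digest, renaming ``relative_copy_class`` to ``copy_assessment`` and replacing labels with EFO codes is enough to change ``ga4gh_digest``, independently of the prefix change.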