Epic #480 "Update Haplotype to CisPhasedBlock" (#490)
* Addresses epic #480 "Update Haplotype to CisPhasedBlock".
* Addresses #481 bullet 1.
* Related to #461 (comment).
ahwagner committed Apr 19, 2024
1 parent fa0129f commit ab0c6e7
Showing 8 changed files with 73 additions and 45 deletions.
23 changes: 23 additions & 0 deletions docs/source/concepts/molecular_variation/CisPhasedBlock.rst
@@ -0,0 +1,23 @@
.. _CisPhasedBlock:

CisPhasedBlock
!!!!!!!!!!!!!!

The CisPhasedBlock is a set of Alleles that are found *in-cis*: occurring
on the same physical molecule. The CisPhasedBlock structure is useful for
representing genetic *Haplotypes*, which are commonly described with respect
to locations on a gene, a set of nearby genes, or other physically proximal
genetic markers that tend to be transmitted together. Unlike haplotypes, the
CisPhasedBlock is not also used to convey information about genetic ancestry.

.. admonition:: New in v2

In VRS v1, a class with the same computational use as the `CisPhasedBlock`
was defined and named the `Haplotype` class. This term is not used to describe
this concept in v2, as the use of the `Haplotype` name created confusion in the
community, due to the additional semantics of the term around genetic linkage
and ancestry. In practice, implementations transitioning from v1 to v2 should
find the `CisPhasedBlock` able to accommodate the same information content
from v1 `Haplotypes`.

.. include:: ../../def/CisPhasedBlock.rst
10 changes: 5 additions & 5 deletions examples/simple_haplotype.yaml
@@ -1,13 +1,9 @@
id: simple_haplotype
type: Haplotype
type: CisPhasedBlock
members:
- type: Allele
location:
type: SequenceLocation
sequenceReference:
refgetAccession: SQ.S_KjnFVz-FE7M0W6yoaUDgYxLPc1jyWU
residueAlphabet: na
id: NC_000001.10
start: 601
end: 602
state:
@@ -21,3 +17,7 @@ members:
state:
type: LiteralSequenceExpression
sequence: C
sequenceReference:
refgetAccession: SQ.S_KjnFVz-FE7M0W6yoaUDgYxLPc1jyWU
residueAlphabet: na
id: NC_000001.10
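
The updated example moves `sequenceReference` from a member's location up to the block. A consumer that needs fully specified Alleles could copy the block-level value back onto each member. The following is a minimal Python sketch of that step. The helper name `expand_sequence_reference` and the rule that a member's own `sequenceReference` takes precedence are assumptions made for illustration, not behavior defined by this commit. The coordinates and states in the sample data are placeholders, not the elided lines of the example file.

```python
import copy


def expand_sequence_reference(block):
    """Return a copy of a CisPhasedBlock dict in which each inline member
    Allele carries the block-level sequenceReference on its location.

    Hypothetical helper: the precedence rule (a member's own value wins)
    is an assumption, not part of the specification text.
    """
    shared = block.get("sequenceReference")
    expanded = copy.deepcopy(block)
    if shared is None:
        return expanded
    for member in expanded["members"]:
        # Members may also be IRI strings; only inline objects are touched.
        if isinstance(member, dict) and "location" in member:
            member["location"].setdefault(
                "sequenceReference", copy.deepcopy(shared)
            )
    return expanded


# Placeholder data shaped like examples/simple_haplotype.yaml.
block = {
    "type": "CisPhasedBlock",
    "members": [
        {
            "type": "Allele",
            "location": {"type": "SequenceLocation", "start": 601, "end": 602},
            "state": {"type": "LiteralSequenceExpression", "sequence": "T"},
        },
        {
            "type": "Allele",
            "location": {"type": "SequenceLocation", "start": 701, "end": 702},
            "state": {"type": "LiteralSequenceExpression", "sequence": "C"},
        },
    ],
    "sequenceReference": {
        "refgetAccession": "SQ.S_KjnFVz-FE7M0W6yoaUDgYxLPc1jyWU",
        "residueAlphabet": "na",
        "id": "NC_000001.10",
    },
}

expanded = expand_sequence_reference(block)
```

The input is deep-copied, so the compact form stays available for serialization while downstream code works with the expanded form.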
@@ -4,7 +4,7 @@ An ordered set of co-occurring :ref:`variants <Variation>` on the same molecule.

**Information Model**

Some Haplotype attributes are inherited from :ref:`Variation`.
Some CisPhasedBlock attributes are inherited from :ref:`Variation`.

.. list-table::
:class: clean-wrap
@@ -35,7 +35,7 @@ Some Haplotype attributes are inherited from :ref:`Variation`.
* - type
- string
- 0..1
- MUST be "Haplotype"
- MUST be "CisPhasedBlock"
* - digest
- string
- 0..1
@@ -45,6 +45,10 @@ Some Haplotype attributes are inherited from :ref:`Variation`.
- 0..m
-
* - members
- :ref:`Adjacency` | :ref:`Allele` | :ref:`IRI`
- :ref:`Allele` | :ref:`IRI`
- 2..m
- A list of :ref:`Alleles <Allele>` and :ref:`Adjacencies <Adjacency>` that comprise a Haplotype. Members must share the same reference sequence as adjacent members. Alleles should not have overlapping or adjacent coordinates with neighboring Alleles. Neighboring alleles should be ordered by ascending coordinates, unless represented on a DNA inversion (following an Adjacency with end-defined adjoinedSequences), in which case they should be ordered in descending coordinates. Sequence references MUST be consistent for all members between and including the end of one Adjacency and the beginning of another.
- A list of :ref:`Alleles <Allele>` that are found in-cis on a shared molecule.
* - sequenceReference
- :ref:`SequenceReference`
- 0..1
- An optional Sequence Reference on which all of the in-cis Alleles are found. When defined, this may be used to implicitly define the `sequenceReference` attribute for each of the CisPhasedBlock member Alleles.
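
The constraints in the table above (a fixed `type` value, and at least two members that are each an Allele or an IRI) can be checked with a few lines of Python. This is a rough sketch rather than a substitute for validating against the JSON Schema. The function name `check_cis_phased_block` is hypothetical, and it assumes that inline Alleles carry an explicit `type` of `"Allele"` and that IRIs appear as plain strings.

```python
def check_cis_phased_block(obj):
    """Return a list of problems; an empty list means the checks passed.

    Hypothetical helper covering only the constraints listed in the table.
    """
    problems = []
    # `type` is optional (0..1) but, when present, is fixed to one value.
    if obj.get("type", "CisPhasedBlock") != "CisPhasedBlock":
        problems.append('type MUST be "CisPhasedBlock"')
    members = obj.get("members")
    if not isinstance(members, list) or len(members) < 2:
        problems.append("members must be a list of at least 2 items")
    else:
        for i, member in enumerate(members):
            is_iri = isinstance(member, str)
            is_allele = isinstance(member, dict) and member.get("type") == "Allele"
            if not (is_iri or is_allele):
                problems.append(f"members[{i}] must be an Allele or an IRI")
    return problems


ok = check_cis_phased_block(
    {"type": "CisPhasedBlock", "members": [{"type": "Allele"}, "ga4gh:VA.example"]}
)
bad = check_cis_phased_block({"type": "Haplotype", "members": ["only-one"]})
```

Here `ok` is an empty list, while `bad` reports both the outdated `type` value and the too-short `members` list.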
23 changes: 12 additions & 11 deletions schema/vrs/json/Haplotype → schema/vrs/json/CisPhasedBlock
@@ -1,11 +1,11 @@
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://w3id.org/ga4gh/schema/vrs/2.x/json/Haplotype",
"title": "Haplotype",
"$id": "https://w3id.org/ga4gh/schema/vrs/2.x/json/CisPhasedBlock",
"title": "CisPhasedBlock",
"type": "object",
"maturity": "draft",
"ga4ghDigest": {
"prefix": "HT",
"prefix": "CPB",
"keys": [
"members",
"type"
@@ -34,9 +34,9 @@
},
"type": {
"type": "string",
"const": "Haplotype",
"default": "Haplotype",
"description": "MUST be \"Haplotype\""
"const": "CisPhasedBlock",
"default": "CisPhasedBlock",
"description": "MUST be \"CisPhasedBlock\""
},
"digest": {
"description": "A sha512t24u digest created using the VRS Computed Identifier algorithm.",
@@ -52,14 +52,11 @@
},
"members": {
"type": "array",
"ordered": true,
"ordered": false,
"minItems": 2,
"uniqueItems": false,
"items": {
"oneOf": [
{
"$ref": "/ga4gh/schema/vrs/2.x/json/Adjacency"
},
{
"$ref": "/ga4gh/schema/vrs/2.x/json/Allele"
},
@@ -68,7 +65,11 @@
}
]
},
"description": "A list of Alleles that comprise a Haplotype. Members must share the same reference sequence as adjacent members. Alleles should not have overlapping or adjacent coordinates with neighboring Alleles. Neighboring alleles should be ordered by ascending coordinates, unless represented on a DNA inversion (following an Adjacency with end-defined adjoinedSequences), in which case they should be ordered in descending coordinates. Sequence references MUST be consistent for all members between and including the end of one Adjacency and the beginning of another."
"description": "A list of Alleles that are found in-cis on a shared molecule."
},
"sequenceReference": {
"$ref": "/ga4gh/schema/vrs/2.x/json/SequenceReference",
"description": "An optional Sequence Reference on which all of the in-cis Alleles are found. When defined, this may be used to implicitly define the `sequenceReference` attribute for each of the CisPhasedBlock member Alleles."
}
},
"required": [
2 changes: 1 addition & 1 deletion schema/vrs/json/MolecularVariation
@@ -9,7 +9,7 @@
"$ref": "/ga4gh/schema/vrs/2.x/json/Allele"
},
{
"$ref": "/ga4gh/schema/vrs/2.x/json/Haplotype"
"$ref": "/ga4gh/schema/vrs/2.x/json/CisPhasedBlock"
}
],
"discriminator": {
6 changes: 3 additions & 3 deletions schema/vrs/json/Variation
@@ -47,13 +47,13 @@
"$ref": "/ga4gh/schema/vrs/2.x/json/Allele"
},
{
"$ref": "/ga4gh/schema/vrs/2.x/json/CopyNumberChange"
"$ref": "/ga4gh/schema/vrs/2.x/json/CisPhasedBlock"
},
{
"$ref": "/ga4gh/schema/vrs/2.x/json/CopyNumberCount"
"$ref": "/ga4gh/schema/vrs/2.x/json/CopyNumberChange"
},
{
"$ref": "/ga4gh/schema/vrs/2.x/json/Haplotype"
"$ref": "/ga4gh/schema/vrs/2.x/json/CopyNumberCount"
}
],
"discriminator": {
30 changes: 15 additions & 15 deletions schema/vrs/vrs-source.yaml
@@ -92,7 +92,7 @@ $defs:
A :ref:`variation` on a contiguous molecule.
oneOf:
- $ref: "#/$defs/Allele"
- $ref: "#/$defs/Haplotype"
- $ref: "#/$defs/CisPhasedBlock"
discriminator:
propertyName: type

@@ -145,10 +145,10 @@ $defs:
An expression of the sequence state
required: [ "location", "state" ]

Haplotype:
CisPhasedBlock:
maturity: draft
ga4ghDigest:
prefix: HT
prefix: CPB
keys:
- members
inherits: MolecularVariation
@@ -158,28 +158,28 @@ $defs:
properties:
type:
type: string
const: "Haplotype"
default: "Haplotype"
const: "CisPhasedBlock"
default: "CisPhasedBlock"
description: >-
MUST be "Haplotype"
MUST be "CisPhasedBlock"
members:
type: array
ordered: true
ordered: false
minItems: 2
uniqueItems: false
items:
oneOf:
- $ref: "#/$defs/Adjacency"
- $ref: "#/$defs/Allele"
- $refCurie: gks.common:IRI
description: >-
A list of :ref:`Alleles <Allele>` and :ref:`Adjacencies <Adjacency>` that comprise a Haplotype.
Members must share the same reference sequence as adjacent members. Alleles should not have
overlapping or adjacent coordinates with neighboring Alleles. Neighboring alleles should be ordered
by ascending coordinates, unless represented on a DNA inversion (following an Adjacency with
end-defined adjoinedSequences), in which case they should be ordered in descending coordinates.
Sequence references MUST be consistent for all members between and including the end of one
Adjacency and the beginning of another.
A list of :ref:`Alleles <Allele>` that are found in-cis on a shared molecule.
sequenceReference:
$ref: "#/$defs/SequenceReference"
description: >-
An optional Sequence Reference on which all of the in-cis Alleles are found.
When defined, this may be used to implicitly define the `sequenceReference`
attribute for each of the CisPhasedBlock member Alleles.
required: [ "members" ]
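
With the `ga4ghDigest` prefix changed from `HT` to `CPB`, computed identifiers for these objects take the form `ga4gh:CPB.<digest>`. The sketch below shows the `sha512t24u` truncated digest named in the schema's `digest` description (SHA-512, truncated to 24 bytes, base64url-encoded) and the identifier formatting. It does not show how a CisPhasedBlock is serialized to bytes: the `serialized` argument is a placeholder for that step, and the function name `cpb_identifier` is hypothetical.

```python
import base64
import hashlib


def sha512t24u(blob: bytes) -> str:
    """SHA-512 digest truncated to 24 bytes and base64url-encoded."""
    digest = hashlib.sha512(blob).digest()
    # 24 bytes encode to exactly 32 base64 characters, so no padding appears.
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


def cpb_identifier(serialized: bytes) -> str:
    """Format an identifier with the CPB prefix.

    `serialized` stands in for the canonical serialization of the object,
    which is outside the scope of this sketch.
    """
    return f"ga4gh:CPB.{sha512t24u(serialized)}"


example_id = cpb_identifier(b"placeholder-serialization")
```

Because `members` is now declared with `ordered: false`, the serialization step presumably has to produce the same bytes regardless of member order. How that is achieved is left to the computed identifier algorithm and is not sketched here.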

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
12 changes: 6 additions & 6 deletions tests/test_definitions.yaml
@@ -29,15 +29,15 @@ tests:
image: ../../docs/images/ex_ambiguous_linker.png
schema: vrs
definition: Adjacency
- test_file: sv_haplotype.yaml
description: A haplotype of 3 members. First an adjacency with a literal sequence linker followed by an SNV on the 2nd sequence and ending with a simple breakpoint adjacency that ends with the 1st sequence in the haplotype.
image: ../../docs/images/ex_sv_haplotype.png
schema: vrs
definition: Haplotype
# - test_file: sv_haplotype.yaml
# description: A haplotype of 3 members. First an adjacency with a literal sequence linker followed by an SNV on the 2nd sequence and ending with a simple breakpoint adjacency that ends with the 1st sequence in the haplotype.
# image: ../../docs/images/ex_sv_haplotype.png
# schema: vrs
# definition: CisPhasedBlock
- test_file: simple_haplotype.yaml
description: A haplotype of two alleles on a shared chromosome reference sequence.
schema: vrs
definition: Haplotype
definition: CisPhasedBlock
- test_file: SPDI_contraction.yaml
description: A simple RLE contraction from SPDI representation
schema: vrs