
Background

Francesco Palozzi edited this page Jun 5, 2026 · 2 revisions

Background — Formats and Nomenclature

BPSEQ

The standard BPSEQ format is a three-column tab-separated text file:

Index   Nucleotide   PairingPartner
1       G            29
2       G            28
3       G            27
...
  • Index is 1-based.
  • PairingPartner is 0 for unpaired positions, or the 1-based index of the paired nucleotide.
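As a rough illustration of the layout above, the following sketch reads standard BPSEQ text into tuples. The function names `parse_bpseq` and `check_symmetry` are hypothetical and are not part of RNA2DUnifier. The symmetry check assumes each pair is recorded at both of its positions, which this page does not state explicitly.

```python
from typing import List, Tuple

Record = Tuple[int, str, int]  # (index, nucleotide, pairing partner)


def parse_bpseq(text: str) -> List[Record]:
    """Parse standard BPSEQ text into (index, nucleotide, partner) tuples."""
    records = []
    for line in text.splitlines():
        fields = line.split()  # splits on tabs or spaces
        # Skip blank lines and any line that does not start with a numeric index.
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        records.append((int(fields[0]), fields[1], int(fields[2])))
    return records


def check_symmetry(records: List[Record]) -> List[int]:
    """Return the indices whose partner does not point back to them.

    Assumption: a pair (i, j) appears on both line i and line j.
    """
    partner_of = {index: partner for index, _, partner in records}
    return [
        index
        for index, partner in partner_of.items()
        if partner != 0 and partner_of.get(partner) != index
    ]


sample = "1\tG\t3\n2\tA\t0\n3\tC\t1\n"
records = parse_bpseq(sample)
print(records)                  # parsed tuples, one per position
print(check_symmetry(records))  # empty list when every pair is reciprocal
```

Splitting on any whitespace keeps the sketch tolerant of files that use spaces instead of tabs.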

Extended BPSEQ

RNA2DUnifier introduces an extended BPSEQ format that adds one column for each of the 12 Leontis–Westhof interaction families. Each extra column lists the 1-based index (or indices) of the pairing partners for that interaction type at that position. A value of 0 means the position has no partner of that type.

Example header:

Index   Nucleotide   cWW   tWW   cWH   tWH   cWS   tWS   cHH   tHH   cHS   tHS   cSS   tSS
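A single data row in this layout could be read along the following lines. This is only a sketch: `parse_extended_row` is a hypothetical helper, not an RNA2DUnifier function, and the comma used to separate multiple indices within one column is an assumption, since the delimiter is not specified here.

```python
from typing import Dict, List

# Column order taken from the example header above.
FAMILIES = ["cWW", "tWW", "cWH", "tWH", "cWS", "tWS",
            "cHH", "tHH", "cHS", "tHS", "cSS", "tSS"]


def parse_extended_row(line: str, sep: str = ",") -> Dict[str, object]:
    """Parse one extended BPSEQ row.

    Assumption: several partners in one column are joined by `sep`
    (a comma by default); the real delimiter may differ.
    """
    fields = line.split()
    if len(fields) != 2 + len(FAMILIES):
        raise ValueError(f"expected {2 + len(FAMILIES)} columns, got {len(fields)}")

    partners: Dict[str, List[int]] = {}
    for family, cell in zip(FAMILIES, fields[2:]):
        indices = [int(token) for token in cell.split(sep)]
        # 0 means "no partner of this type", so drop it from the list.
        partners[family] = [i for i in indices if i != 0]

    return {"index": int(fields[0]), "nucleotide": fields[1], "partners": partners}


row = parse_extended_row("5 G 20 0 0 0 0 0 0 0 0 12,14 0 0")
print(row["partners"]["cWW"])  # partners in the cWW column
print(row["partners"]["tHS"])  # partners in the tHS column
```

Storing an empty list for columns that contain 0 lets callers iterate over partners without special-casing unpaired positions.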

Leontis–Westhof Nomenclature

The 12 geometric families describe base pairs in terms of the interacting edges (Watson–Crick W, Hoogsteen H, Sugar S) and the glycosidic bond orientation (cis c / trans t):

Code   Edges   Orientation
cWW    W / W   cis
tWW    W / W   trans
cWH    W / H   cis
tWH    W / H   trans
cWS    W / S   cis
tWS    W / S   trans
cHH    H / H   cis
tHH    H / H   trans
cHS    H / S   cis
tHS    H / S   trans
cSS    S / S   cis
tSS    S / S   trans

Canonical Watson–Crick base pairs (A–U, G–C) are classified as cWW. The library also treats the tWW type as canonical.
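The 12 codes follow mechanically from the three edges and two orientations, which the sketch below enumerates. The names `lw_families` and `is_canonical` are hypothetical and do not reflect the library's actual API. The canonical set simply mirrors the statement above that cWW and tWW are treated as canonical.

```python
from itertools import combinations_with_replacement
from typing import List

EDGES = "WHS"          # Watson-Crick, Hoogsteen, Sugar
ORIENTATIONS = "ct"    # cis, trans
CANONICAL = {"cWW", "tWW"}  # per this page: both are treated as canonical


def lw_families() -> List[str]:
    """Enumerate the 12 family codes: 6 unordered edge pairs x 2 orientations."""
    return [
        orientation + edge_a + edge_b
        for edge_a, edge_b in combinations_with_replacement(EDGES, 2)
        for orientation in ORIENTATIONS
    ]


def is_canonical(code: str) -> bool:
    """Return True if the family code is in the canonical set."""
    if code not in lw_families():
        raise ValueError(f"unknown Leontis-Westhof code: {code!r}")
    return code in CANONICAL


print(lw_families())
print(is_canonical("tWW"), is_canonical("cWH"))
```

The enumeration order matches the column order of the extended BPSEQ header, so the same list can be reused to label those columns.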

