
Background

Francesco Palozzi edited this page Jun 19, 2026 · 2 revisions

Background — Formats and Nomenclature

BPSEQ

The standard BPSEQ format is a three-column, space-separated text file. Each line gives the Index, the nucleotide letter, and the PairingPartner:

1 G 29
2 G 28
3 G 27
...
  • Index is 1-based.
  • PairingPartner is 0 for unpaired positions, or the 1-based index of the paired nucleotide.
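As an illustration of the layout described above, here is a minimal Python sketch that reads standard BPSEQ text and extracts the base pairs. The function names `parse_bpseq` and `pairs` are hypothetical and are not part of RNA2DUnifier; skipping `#` comment lines is an assumption about how real files may look.

```python
from typing import List, Tuple

def parse_bpseq(text: str) -> List[Tuple[int, str, int]]:
    """Parse standard BPSEQ text into (index, nucleotide, partner) tuples."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '#' comment lines (comment handling is an assumption).
        if not line or line.startswith("#"):
            continue
        index, nucleotide, partner = line.split()
        records.append((int(index), nucleotide, int(partner)))
    return records

def pairs(records: List[Tuple[int, str, int]]) -> List[Tuple[int, int]]:
    """Return each base pair once as (i, j) with i < j; partner 0 means unpaired."""
    return sorted({(min(i, j), max(i, j)) for i, _, j in records if j != 0})

example = "1 G 29\n2 G 28\n3 G 27\n"
print(pairs(parse_bpseq(example)))  # [(1, 29), (2, 28), (3, 27)]
```

Using a set of ordered tuples means a pair that appears on both of its lines (once per partner) is reported only once.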

Extended BPSEQ

RNA2DUnifier introduces an extended BPSEQ format that adds one column for each of the 12 Leontis–Westhof interaction families. Each extra column lists the 1-based index (or indices) of pairing partners for that specific interaction type at that position. 0 means no partner of that type.

Example header:

id nt cWW tWW cWH tWH cWS tWS cHH tHH cHS tHS cSS tSS
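A row in this layout could be read along the following lines. This is only a sketch: the function `parse_extended_row` is hypothetical, the example row is made up, and the comma used to separate multiple indices within one column is an assumption, since the separator is not specified here.

```python
from typing import Dict, List, Tuple

# Column order taken from the example header above.
LW_FAMILIES = ["cWW", "tWW", "cWH", "tWH", "cWS", "tWS",
               "cHH", "tHH", "cHS", "tHS", "cSS", "tSS"]
HEADER = ["id", "nt"] + LW_FAMILIES

def parse_extended_row(line: str) -> Tuple[int, str, Dict[str, List[int]]]:
    """Parse one extended-BPSEQ data row into (index, nucleotide, partners by family)."""
    fields = line.split()
    if len(fields) != len(HEADER):
        raise ValueError(f"expected {len(HEADER)} columns, got {len(fields)}")
    row = dict(zip(HEADER, fields))
    partners = {}
    for family in LW_FAMILIES:
        # ASSUMPTION: multiple partners in one column are comma-separated.
        indices = [int(value) for value in row[family].split(",")]
        # 0 means "no partner of this type", so it is dropped.
        partners[family] = [i for i in indices if i != 0]
    return int(row["id"]), row["nt"], partners

# Made-up row: position 5 pairs cWW with 20 and tWH with 31 and 44.
index, nucleotide, partners = parse_extended_row("5 A 20 0 0 31,44 0 0 0 0 0 0 0 0")
print(index, nucleotide, partners["cWW"], partners["tWH"])  # 5 A [20] [31, 44]
```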

Leontis–Westhof Nomenclature

The 12 geometric families describe base pairs in terms of the interacting edges (Watson–Crick W, Hoogsteen H, Sugar S) and the glycosidic bond orientation (cis c / trans t):

Code Edges Orientation
cWW W / W cis
tWW W / W trans
cWH W / H cis
tWH W / H trans
cWS W / S cis
tWS W / S trans
cHH H / H cis
tHH H / H trans
cHS H / S cis
tHS H / S trans
cSS S / S cis
tSS S / S trans

Canonical Watson–Crick base pairs (A–U, G–C) are classified as cWW. The tWW type is also considered canonical by the library.
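The three-letter codes can be decoded mechanically, as the sketch below shows. The names `decode_lw` and `is_canonical` are hypothetical helpers written for illustration, not the library's API; the canonical set simply mirrors the statement above that cWW and tWW are treated as canonical.

```python
from itertools import combinations_with_replacement
from typing import Tuple

EDGE_NAMES = {"W": "Watson-Crick", "H": "Hoogsteen", "S": "Sugar"}
ORIENTATIONS = {"c": "cis", "t": "trans"}

# Enumerate the 12 families: 6 unordered edge pairs x 2 orientations.
LW_FAMILIES = [o + a + b
               for a, b in combinations_with_replacement("WHS", 2)
               for o in "ct"]

# Mirrors the text: cWW and tWW are both treated as canonical.
CANONICAL = {"cWW", "tWW"}

def decode_lw(code: str) -> Tuple[str, str, str]:
    """Split a code such as 'tWH' into (orientation, first edge, second edge)."""
    if (len(code) != 3 or code[0] not in ORIENTATIONS
            or code[1] not in EDGE_NAMES or code[2] not in EDGE_NAMES):
        raise ValueError(f"not a Leontis-Westhof code: {code!r}")
    return ORIENTATIONS[code[0]], EDGE_NAMES[code[1]], EDGE_NAMES[code[2]]

def is_canonical(code: str) -> bool:
    """True for the families the text above describes as canonical."""
    return code in CANONICAL

print(LW_FAMILIES[:4])    # ['cWW', 'tWW', 'cWH', 'tWH']
print(decode_lw("tWH"))   # ('trans', 'Watson-Crick', 'Hoogsteen')
print(is_canonical("cHS"))  # False
```

The generated list comes out in the same order as the table rows and the extended BPSEQ header columns.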

