Core API
`RnaUnifier` is the main facade class. Instantiate it with the default constructor, or inject a custom `BpseqExporter`.

```java
// Default usage
RnaUnifier unifier = new RnaUnifier();

// With a custom exporter
RnaUnifier customUnifier = new RnaUnifier(myCustomExporter);
```

| Method | Description |
|---|---|
| `String process(File, ToolType, boolean)` | Parse the `File` using the specified tool parser. The `boolean` selects extended (`true`) or canonical (`false`) BPSEQ. |
| `String process(File, boolean)` | Same as above, but auto-detects the tool type from the file content. |
| `String process(InputStream, ToolType, boolean)` | Parse from an `InputStream` with an explicit tool type. |
| `String process(InputStream, boolean)` | Parse from a mark/reset-capable `InputStream` with auto-detection. |
| `void processToFile(File, ToolType, File, boolean)` | Parse and write the result directly to an output file. |
| `void processToFile(File, File, boolean)` | Same, with auto-detection of the tool type. |
Note on `boolean extended`: pass `true` to obtain the full extended BPSEQ (all 12 Leontis–Westhof columns). Pass `false` to obtain canonical BPSEQ only (Watson–Crick pairs; positions with no canonical partner are omitted).
`ToolType` is an enum listing every supported tool. Import it from `it.unicam.cs.bdslab.rna2dunifier.parser.ToolType`.

| Constant | Tool / Format |
|---|---|
| `FR3D` | FR3D JSON output |
| `RNAVIEW` | RNAview plain-text output |
| `RNAPOLIS` | RNApolis FASTA + tabular output |
| `MCANNOTATE` | mc-annotate plain-text output |
| `BARNABA` | Barnaba tabular annotation output |
| `BPNET` | bpnet (BPFIND) tabular output |
| `X3DNA` | x3dna-DSSR JSON output |
`ParserFactory` is a static factory with two responsibilities: creating parser instances and auto-detecting the tool type.

```java
// Get a parser manually
RnaStructureParser parser = ParserFactory.getParser(ToolType.RNAPOLIS);

// Detect the tool from a stream (the stream must support mark/reset)
ToolType detected = ParserFactory.detectTool(bufferedInputStream);
```

Auto-detection reads the first 4096 bytes of the stream and looks for format-specific signatures:
| Priority | Signal | Detected tool |
|---|---|---|
| 1 | Contains `BEGIN_base-pair` | `RNAVIEW` |
| 2 | Contains `Residue conformations` | `MCANNOTATE` |
| 3 | Contains `>` and `seq` | `RNAPOLIS` |
| 4 | JSON with key `"annotations"` | `FR3D` |
| 5 | JSON with key `"pairs"` | `X3DNA` |
| 6 | Lines matching the `N_INT_INT N_INT_INT XXc` pattern | `BARNABA` |
| 7 | Lines with a `?` separator and `W:WC`-style tokens | `BPNET` |
If no signature matches, an IllegalArgumentException is thrown.
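The priority order can be pictured as a chain of checks over the inspected window. The following is a minimal sketch, not the code inside `ParserFactory`: the class `SignatureSniffer` and its `sniff` method are hypothetical, it returns tool names as plain strings, it covers only the substring-based rules (priorities 1 to 5), and treating a JSON key as a quoted substring is a simplifying assumption.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for signature sniffing; not the library's ParserFactory. */
final class SignatureSniffer {

    /** Maximum number of bytes inspected, matching the 4096-byte window described above. */
    static final int WINDOW = 4096;

    /**
     * Returns the name of the first matching tool, checking signatures in priority order.
     * Only the substring-based rules (priorities 1 to 5) are sketched here.
     */
    static String sniff(byte[] data) {
        int length = Math.min(data.length, WINDOW);
        String head = new String(data, 0, length, StandardCharsets.UTF_8);

        if (head.contains("BEGIN_base-pair")) return "RNAVIEW";
        if (head.contains("Residue conformations")) return "MCANNOTATE";
        if (head.contains(">") && head.contains("seq")) return "RNAPOLIS";
        // Assumption: a JSON key is recognised by its quoted name appearing in the window.
        if (head.contains("\"annotations\"")) return "FR3D";
        if (head.contains("\"pairs\"")) return "X3DNA";

        throw new IllegalArgumentException(
                "No known signature in the first " + WINDOW + " bytes");
    }
}
```

Because the checks run top to bottom, an input that happens to satisfy two rules resolves to the one with the lower priority number.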
Important: the stream passed to `detectTool` must support `mark()`/`reset()`. Wrap a plain `FileInputStream` in a `BufferedInputStream` before calling this method.
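To see why `mark()`/`reset()` matters, here is a small sketch of peeking at the head of a stream and then rewinding it so a parser can still read from the start. It uses only `java.io`; the class `PeekDemo` and its two methods are hypothetical helpers, not part of the library.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper illustrating mark/reset peeking; not part of the library. */
final class PeekDemo {

    /** Reads up to `limit` bytes for inspection, then rewinds the stream to where it was. */
    static String peek(InputStream in, int limit) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("Stream must support mark/reset");
        }
        in.mark(limit);                       // remember the current position
        byte[] buffer = new byte[limit];
        int total = 0;
        int n;
        while (total < limit && (n = in.read(buffer, total, limit - total)) != -1) {
            total += n;
        }
        in.reset();                           // rewind so later reads start over
        return new String(buffer, 0, total, StandardCharsets.UTF_8);
    }

    /** Wraps streams that cannot rewind, as recommended above for FileInputStream. */
    static InputStream ensureMarkSupported(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }
}
```

After `peek` returns, the next `read()` on the same stream yields the first byte again, which is what lets detection and parsing share one stream.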
`RnaStructureParser` is the interface implemented by all seven concrete parsers.

```java
public interface RnaStructureParser {
    ExtendedRNASecondaryStructure parse(InputStream inputStream)
            throws IOException, ParseException;
}
```

You do not normally need to use the concrete parsers directly; `ParserFactory.getParser(ToolType)` is the preferred way to obtain one.
`ExtendedRNASecondaryStructure` is the central domain model. Build instances through the nested `Builder`.

```java
ExtendedRNASecondaryStructure structure =
    new ExtendedRNASecondaryStructure.Builder()
        .setSequence("AUGCAUGU")
        .addPair(new Pair(0, 7, "A", "U", BondType.LEONTIS_WESTHOF_cWW))
        .addPair(new Pair(1, 6, "U", "G", BondType.LEONTIS_WESTHOF_tWW))
        .addHeaderInfo("PDB ID", "1YMO")
        .addHeaderInfo("Chain ID", "A")
        .build();
```

| Method | Returns | Description |
|---|---|---|
| `getSequence()` | `String` | The RNA nucleotide sequence (A, C, G, U, N). |
| `getPairs()` | `List<Pair>` | All base pairs, including non-canonical and stacking. |
| `getCanonical()` | `List<Pair>` | Only canonical pairs (cWW or tWW). |
| `getHeaderInfo()` | `Map<String,String>` | Metadata such as PDB ID and Chain ID (unmodifiable). |
Builder note: `addPair(Pair)` automatically adds the pair to the full list and, if `pair.getType().isCanonical()` is true, also to the canonical list. Use `setPairs()`/`setCanonical()` only when you want to replace the lists wholesale.
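The dual bookkeeping described in the builder note can be sketched in miniature. Everything here is a stand-in: `MiniStructure`, `MiniPair`, and the use of plain string labels instead of `BondType` are assumptions made so the example is self-contained, and the real builder may be organised differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical miniature of the builder behaviour; names and types are stand-ins. */
final class MiniStructure {

    /** Stand-in for a pair: two zero-based positions plus a label such as "cWW". */
    static final class MiniPair {
        final int pos1;
        final int pos2;
        final String type;

        MiniPair(int pos1, int pos2, String type) {
            this.pos1 = pos1;
            this.pos2 = pos2;
            this.type = type;
        }

        /** Mirrors the documented rule that cWW and tWW count as canonical. */
        boolean isCanonical() {
            return "cWW".equals(type) || "tWW".equals(type);
        }
    }

    static final class Builder {
        private final List<MiniPair> pairs = new ArrayList<>();
        private final List<MiniPair> canonical = new ArrayList<>();

        /** Every pair goes to the full list; canonical ones are also recorded separately. */
        Builder addPair(MiniPair pair) {
            pairs.add(pair);
            if (pair.isCanonical()) {
                canonical.add(pair);
            }
            return this;
        }

        List<MiniPair> getPairs() { return Collections.unmodifiableList(pairs); }
        List<MiniPair> getCanonical() { return Collections.unmodifiableList(canonical); }
    }
}
```

Adding one `cWW` pair and one `tHS` pair this way leaves two entries in the full list and one in the canonical list.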
`Pair` represents a single base-pair interaction. Instances are immutable. Two `Pair` objects are equal if they share the same unordered position pair and bond type (i.e., swapping `pos1`/`pos2` still yields the same pair).

```java
// Minimal: positions + bond type
Pair minimal = new Pair(0, 29, BondType.LEONTIS_WESTHOF_cWW);

// With nucleotide labels
Pair labelled = new Pair(0, 29, "G", "C", BondType.LEONTIS_WESTHOF_cWW);

// Via Builder
Pair built = new Pair.Builder()
    .setPos1(0)
    .setPos2(29)
    .setNucleotide1("G")
    .setNucleotide2("C")
    .setType(BondType.LEONTIS_WESTHOF_cWW)
    .build();
```

| Method | Returns | Description |
|---|---|---|
| `getPos1()` | `int` | Zero-based index of the first nucleotide. |
| `getPos2()` | `int` | Zero-based index of the second nucleotide. |
| `getType()` | `BondType` | The Leontis–Westhof bond type. |
| `getNucleotide1()` | `String` | Nucleotide label at `pos1` (may be `null`). |
| `getNucleotide2()` | `String` | Nucleotide label at `pos2` (may be `null`). |
Positions are zero-based internally. The exporter converts them to 1-based indices in the BPSEQ output.
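Order-independent equality, as described for `Pair`, is commonly implemented by comparing the smaller and larger position rather than the first and second. The sketch below shows one way to do that; `UnorderedPair` is a hypothetical class with a string label in place of `BondType`, and sorting the positions in the constructor is an illustrative choice, not necessarily how the library stores them.

```java
import java.util.Objects;

/** Hypothetical stand-in showing order-independent equality; not the library's Pair. */
final class UnorderedPair {
    private final int low;
    private final int high;
    private final String type;

    UnorderedPair(int pos1, int pos2, String type) {
        // Store positions sorted so (3, 17) and (17, 3) share one representation.
        this.low = Math.min(pos1, pos2);
        this.high = Math.max(pos1, pos2);
        this.type = type;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof UnorderedPair)) return false;
        UnorderedPair that = (UnorderedPair) other;
        return low == that.low && high == that.high && Objects.equals(type, that.type);
    }

    @Override
    public int hashCode() {
        // Built from the same fields as equals, so swapped pairs hash identically.
        return Objects.hash(low, high, type);
    }
}
```

Keeping `hashCode` consistent with `equals` means a `HashSet` treats a pair and its swapped form as a single element, which is useful when several tools report the same interaction from either side.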
`BondType` is an enum covering all Leontis–Westhof families plus special types.

```java
// Get a BondType from a string label (case-insensitive)
BondType type = BondType.fromString("cWW");     // → LEONTIS_WESTHOF_cWW
BondType tHS = BondType.fromString("tHS");      // → LEONTIS_WESTHOF_tHS
BondType unknown = BondType.fromString(null);   // → UNKNOWN

// Query the type
type.isCanonical();   // true for cWW and tWW
type.isCis();         // true for all cXX types
type.isTrans();       // true for all tXX types
type.getInfo();       // e.g. "cWW", "tHS", "stacking", "unknown"

// Retrieve all 12 LW types in a consistent order
List<BondType> lwFamilies = BondType.getLeontisWesthofFamily();
```

| Constant | Meaning |
|---|---|
| `UNKNOWN` | Unclassified or unrecognised bond |
| `STACKING` | Base-stacking interaction (not a base pair) |
`fromString` normalises the string: it uppercases it and corrects reversed edge notation (e.g., `SH` → `HS`, `SW` → `WS`, `HW` → `WH`) before matching.
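The normalisation step can be sketched as a small string function. This is an illustration under assumptions, not the body of `fromString`: the class `LwLabel` is hypothetical, it returns a label string rather than an enum constant, it handles only the 12 three-letter Leontis–Westhof labels, and it assumes edges are ordered Watson–Crick, Hoogsteen, Sugar, which is consistent with the three swaps listed above.

```java
import java.util.Locale;

/** Hypothetical normaliser for Leontis-Westhof labels; not the library's fromString. */
final class LwLabel {

    /** Edge rank used to order a pair of edges: Watson-Crick, then Hoogsteen, then Sugar. */
    private static final String EDGE_ORDER = "WHS";

    /** Returns a label such as "cHS", or "unknown" when the input is not a valid LW label. */
    static String normalise(String raw) {
        if (raw == null) return "unknown";
        String s = raw.trim().toUpperCase(Locale.ROOT);
        if (s.length() != 3) return "unknown";

        char orientation = s.charAt(0);
        char e1 = s.charAt(1);
        char e2 = s.charAt(2);
        if ((orientation != 'C' && orientation != 'T')
                || EDGE_ORDER.indexOf(e1) < 0 || EDGE_ORDER.indexOf(e2) < 0) {
            return "unknown";
        }
        // Swap reversed edges, e.g. SH -> HS, SW -> WS, HW -> WH.
        if (EDGE_ORDER.indexOf(e1) > EDGE_ORDER.indexOf(e2)) {
            char tmp = e1;
            e1 = e2;
            e2 = tmp;
        }
        return "" + Character.toLowerCase(orientation) + e1 + e2;
    }
}
```

With this ordering, `tSH` and `tHS` both normalise to `tHS`, so the two spellings end up in the same family.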
`BpseqExporter` converts an `ExtendedRNASecondaryStructure` into a string.

```java
BpseqExporter exporter = new BpseqExporter();

// Full extended BPSEQ (12 LW columns)
String extended = exporter.printExtendedBPSEQ(structure);

// Canonical BPSEQ only (positions without a canonical partner are omitted)
String canonical = exporter.printCanonicalBPSEQ(structure);
```

Extended output has one column per Leontis–Westhof family:

```
Index Nucleotide cWW tWW cWH tWH cWS tWS cHH tHH cHS tHS cSS tSS
1 G 29 0 0 0 0 0 0 0 0 0 0 0
2 G 28 0 0 0 0 0 0 0 0 0 0 0
...
```
If a nucleotide has multiple partners of the same type (e.g., two cWH interactions), the partners are comma-separated: 3,17.
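A single extended row can be assembled as follows. This is a sketch under assumptions rather than the exporter's code: `ExtendedRow` is a hypothetical class, family labels are plain strings, columns are separated by a single space, and `0` is taken to mean "no partner" as in the sample rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical row formatter for the extended layout; not the library's BpseqExporter. */
final class ExtendedRow {

    /** Column order taken from the extended BPSEQ header line. */
    static final List<String> FAMILIES = Arrays.asList(
            "cWW", "tWW", "cWH", "tWH", "cWS", "tWS",
            "cHH", "tHH", "cHS", "tHS", "cSS", "tSS");

    /**
     * Formats one row. `index` is zero-based; `partners` maps a family label
     * to the zero-based positions paired with this nucleotide.
     */
    static String format(int index, char nucleotide, Map<String, List<Integer>> partners) {
        StringBuilder row = new StringBuilder();
        row.append(index + 1).append(' ').append(nucleotide);   // 1-based in the output
        for (String family : FAMILIES) {
            List<Integer> positions = partners.get(family);
            row.append(' ');
            if (positions == null || positions.isEmpty()) {
                row.append('0');                                 // 0 marks "no partner"
            } else {
                List<String> oneBased = new ArrayList<>();
                for (int p : positions) {
                    oneBased.add(String.valueOf(p + 1));
                }
                row.append(String.join(",", oneBased));          // e.g. 3,17
            }
        }
        return row.toString();
    }
}
```

The zero-based positions 2 and 16 in the `cWH` family are therefore written as `3,17` in the third family column.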
Canonical output lists only the index, the nucleotide, and the canonical partner:

```
1 G 29
2 G 28
...
```
Positions with no canonical partner are not included in the canonical output.
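The omission rule can be sketched in a few lines. `CanonicalLines` is a hypothetical class, and representing partners as an `int[]` with `-1` for "unpaired" is an assumption made for the example, not the library's data model.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical canonical-only writer; not the library's BpseqExporter. */
final class CanonicalLines {

    /**
     * `partner[i]` holds the zero-based canonical partner of position i, or -1 for none.
     * Unpaired positions produce no line at all.
     */
    static List<String> format(String sequence, int[] partner) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < sequence.length(); i++) {
            if (partner[i] < 0) {
                continue;                                        // skip unpaired positions
            }
            // Both the index and the partner are converted to 1-based values.
            lines.add((i + 1) + " " + sequence.charAt(i) + " " + (partner[i] + 1));
        }
        return lines;
    }
}
```

For the sequence `GGAC` with positions 0 and 3 paired, only two lines are produced, `1 G 4` and `4 C 1`; the two unpaired positions in between do not appear.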
RNA2DUnifier – Copyright © 2026 Francesco Palozzi.
University of Camerino – Licensed under the Apache License, Version 2.0.