Core API

RnaUnifier

RnaUnifier is the main facade class. Instantiate it with the default constructor or inject a custom BpseqExporter.

// Default usage
RnaUnifier unifier = new RnaUnifier();

// With a custom exporter (myCustomExporter is a BpseqExporter instance)
RnaUnifier customUnifier = new RnaUnifier(myCustomExporter);

Methods

Method	Description
`String process(File, ToolType, boolean)`	Parse `File` using the specified tool parser. `boolean` selects extended (`true`) or canonical (`false`) BPSEQ.
`String process(File, boolean)`	Same as above but auto-detects the tool type from the file content.
`String process(InputStream, ToolType, boolean)`	Parse from an `InputStream` with explicit tool type.
`String process(InputStream, boolean)`	Parse from a mark/reset-capable `InputStream` with auto-detection.
`void processToFile(File, ToolType, File, boolean)`	Parse and write the result directly to an output file.
`void processToFile(File, File, boolean)`	Same, with auto-detection of tool type.

Note on boolean extended:
Pass true to obtain the full extended BPSEQ (all 12 Leontis–Westhof columns).
Pass false to obtain canonical BPSEQ only (Watson–Crick pairs; positions with no canonical partner are omitted).

ToolType

An enum listing every supported tool. Its fully qualified name is it.unicam.cs.bdslab.rna2dunifier.parser.ToolType.

Constant	Tool / Format
`FR3D`	FR3D JSON output
`RNAVIEW`	RNAview plain-text output
`RNAPOLIS`	RNApolis FASTA + tabular output
`MCANNOTATE`	mc-annotate plain-text output
`BARNABA`	Barnaba tabular annotation output
`BPNET`	bpnet (BPFIND) tabular output
`X3DNA`	x3dna-DSSR JSON output

ParserFactory

A static factory with two responsibilities: creating parser instances and auto-detecting the tool type.

// Get a parser manually
RnaStructureParser parser = ParserFactory.getParser(ToolType.RNAPOLIS);
 
// Detect the tool from a stream (stream must support mark/reset)
ToolType detected = ParserFactory.detectTool(bufferedInputStream);

Detection heuristics

Auto-detection reads the first 4,096 bytes of the stream and looks for format-specific signatures:

Priority	Signal	Detected tool
1	Contains `BEGIN_base-pair`	`RNAVIEW`
2	Contains `Residue conformations`	`MCANNOTATE`
3	Contains `>` and `seq`	`RNAPOLIS`
4	JSON with key `"annotations"`	`FR3D`
5	JSON with key `"pairs"`	`X3DNA`
6	Lines matching `N_INT_INT N_INT_INT XXc` pattern	`BARNABA`
7	Lines with `?` separator and `W:WC`-style tokens	`BPNET`

If no signature matches, an IllegalArgumentException is thrown.

Important: the stream passed to detectTool must support mark()/reset(). Wrap a plain FileInputStream in a BufferedInputStream before calling this method.
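The mark/reset requirement and the priority-ordered signature checks can be illustrated with the JDK alone. The following is a minimal sketch under stated assumptions, not the library's ParserFactory: ToolDetectionSketch, its nested Tool enum and the detect method are hypothetical names, only three of the seven signatures are modelled, and I/O failures are rethrown unchecked to keep the example short.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of signature-based detection; not the library's ParserFactory.
final class ToolDetectionSketch {

    // Stand-in for three of the ToolType constants listed above.
    enum Tool { RNAVIEW, MCANNOTATE, FR3D }

    // Size of the inspected prefix, as stated in the text.
    private static final int PEEK_BYTES = 4096;

    static Tool detect(InputStream in) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark()/reset()");
        }
        String head;
        try {
            in.mark(PEEK_BYTES);
            byte[] buf = new byte[PEEK_BYTES];
            int total = 0;
            while (total < buf.length) {
                int n = in.read(buf, total, buf.length - total);
                if (n < 0) break;            // end of stream before the window filled
                total += n;
            }
            in.reset();                      // rewind so a parser can read from the start
            head = new String(buf, 0, total, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Checks run in priority order; the first match wins.
        if (head.contains("BEGIN_base-pair")) return Tool.RNAVIEW;
        if (head.contains("Residue conformations")) return Tool.MCANNOTATE;
        if (head.contains("\"annotations\"")) return Tool.FR3D;
        throw new IllegalArgumentException("no known signature in the inspected prefix");
    }

    public static void main(String[] args) {
        byte[] data = "# demo\nBEGIN_base-pair\n".getBytes(StandardCharsets.UTF_8);
        // A BufferedInputStream adds mark/reset support to streams that lack it.
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(data));
        System.out.println(detect(in));
    }
}
```

Because the stream is rewound after the peek, the same stream can be handed to a parser afterwards without reopening the file.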

RnaStructureParser

The interface implemented by all seven concrete parsers.

public interface RnaStructureParser {
    ExtendedRNASecondaryStructure parse(InputStream inputStream)
            throws IOException, ParseException;
}

You do not normally need to use the concrete parsers directly — ParserFactory.getParser(ToolType) is the preferred way to obtain one.

ExtendedRNASecondaryStructure

The central domain model. Build instances through the nested Builder.

Builder usage

ExtendedRNASecondaryStructure structure =
    new ExtendedRNASecondaryStructure.Builder()
        .setSequence("AUGCAUGC")
        .addPair(new Pair(0, 7, "A", "U", BondType.LEONTIS_WESTHOF_cWW))
        .addPair(new Pair(1, 6, "U", "G", BondType.LEONTIS_WESTHOF_tWW))
        .addHeaderInfo("PDB ID", "1YMO")
        .addHeaderInfo("Chain ID", "A")
        .build();

Key methods

Method	Returns	Description
`getSequence()`	`String`	The RNA nucleotide sequence (A, C, G, U, N).
`getPairs()`	`List<Pair>`	All base pairs including non-canonical and stacking.
`getCanonical()`	`List<Pair>`	Only canonical pairs (cWW or tWW).
`getHeaderInfo()`	`Map<String,String>`	Metadata such as PDB ID, Chain ID (unmodifiable).

Builder note: addPair(Pair) automatically adds the pair to both the full list and — if pair.getType().isCanonical() is true — to the canonical list. Use setPairs() / setCanonical() only when you want to replace the lists wholesale.
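The routing rule in the builder note can be pictured with a small stand-alone class. This is a sketch only: PairRoutingSketch and MiniPair are hypothetical names, and the boolean flag stands in for the result of pair.getType().isCanonical().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the addPair routing rule; not the library's Builder.
final class PairRoutingSketch {

    // Minimal stand-in for a pair: two positions and a canonical flag.
    static final class MiniPair {
        final int pos1;
        final int pos2;
        final boolean canonical;

        MiniPair(int pos1, int pos2, boolean canonical) {
            this.pos1 = pos1;
            this.pos2 = pos2;
            this.canonical = canonical;
        }
    }

    private final List<MiniPair> pairs = new ArrayList<>();
    private final List<MiniPair> canonical = new ArrayList<>();

    // Every pair goes to the full list; canonical ones also go to the canonical list.
    PairRoutingSketch addPair(MiniPair pair) {
        pairs.add(pair);
        if (pair.canonical) {
            canonical.add(pair);
        }
        return this;
    }

    List<MiniPair> getPairs() {
        return Collections.unmodifiableList(pairs);
    }

    List<MiniPair> getCanonical() {
        return Collections.unmodifiableList(canonical);
    }
}
```

Adding one canonical and one non-canonical pair to this sketch leaves two entries in getPairs() and one in getCanonical().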

Pair

Represents a single base-pair interaction. Instances are immutable. Two Pair objects are equal if they share the same unordered position pair and bond type (i.e., swapping pos1/pos2 still yields the same pair).
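One way to obtain order-independent equality is to compare the smaller and larger position of each pair. The class below is a sketch of that idea, not the library's Pair: UnorderedPairSketch is a hypothetical name, and the bond type is reduced to a plain string for brevity.

```java
import java.util.Objects;

// Hypothetical sketch of unordered-position equality; not the library's Pair.
final class UnorderedPairSketch {
    private final int pos1;
    private final int pos2;
    private final String type;

    UnorderedPairSketch(int pos1, int pos2, String type) {
        this.pos1 = pos1;
        this.pos2 = pos2;
        this.type = Objects.requireNonNull(type);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof UnorderedPairSketch)) return false;
        UnorderedPairSketch other = (UnorderedPairSketch) o;
        // Compare (min, max) so that (0, 29) and (29, 0) are treated alike.
        return Math.min(pos1, pos2) == Math.min(other.pos1, other.pos2)
                && Math.max(pos1, pos2) == Math.max(other.pos1, other.pos2)
                && type.equals(other.type);
    }

    @Override
    public int hashCode() {
        // Hash the same normalised values that equals() compares.
        return Objects.hash(Math.min(pos1, pos2), Math.max(pos1, pos2), type);
    }
}
```

Keeping hashCode consistent with equals matters here because pairs are likely to end up in hash-based collections when duplicates are filtered.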

Constructors

// Minimal: positions + bond type
Pair minimal = new Pair(0, 29, BondType.LEONTIS_WESTHOF_cWW);

// With nucleotide labels
Pair labelled = new Pair(0, 29, "G", "C", BondType.LEONTIS_WESTHOF_cWW);

// Via Builder
Pair built = new Pair.Builder()
    .setPos1(0)
    .setPos2(29)
    .setNucleotide1("G")
    .setNucleotide2("C")
    .setType(BondType.LEONTIS_WESTHOF_cWW)
    .build();

Key methods

Method	Returns	Description
`getPos1()`	`int`	Zero-based index of the first nucleotide.
`getPos2()`	`int`	Zero-based index of the second nucleotide.
`getType()`	`BondType`	The Leontis–Westhof bond type.
`getNucleotide1()`	`String`	Nucleotide label at pos1 (may be `null`).
`getNucleotide2()`	`String`	Nucleotide label at pos2 (may be `null`).

Positions are zero-based internally. The exporter converts them to 1-based indices in the BPSEQ output.

BondType

An enum covering all Leontis–Westhof families plus special types.

// Get a BondType from a string label (case-insensitive)
BondType type = BondType.fromString("cWW");      // → LEONTIS_WESTHOF_cWW
BondType transHS = BondType.fromString("tHS");   // → LEONTIS_WESTHOF_tHS
BondType unknown = BondType.fromString(null);    // → UNKNOWN
 
// Query the type
type.isCanonical();   // true for cWW and tWW
type.isCis();         // true for all cXX types
type.isTrans();       // true for all tXX types
type.getInfo();       // e.g. "cWW", "tHS", "stacking", "unknown"
 
// Retrieve all 12 LW types in a consistent order
List<BondType> lwFamilies = BondType.getLeontisWesthofFamily();

Special values

Constant	Meaning
`UNKNOWN`	Unclassified or unrecognised bond
`STACKING`	Base-stacking interaction (not a base pair)

fromString normalises the label before matching: it uppercases the string and corrects reversed edge notation (e.g., SH → HS, SW → WS, HW → WH).
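The normalisation step can be sketched as a pure string function. Everything here is an assumption made for illustration: EdgeLabelSketch and normalize are hypothetical names, the label is assumed to be one orientation letter (c or t) followed by two edge letters, the edge order W, H, S is inferred from the three examples above, and malformed labels are mapped to "unknown" even though the source only documents that behaviour for null.

```java
import java.util.Locale;

// Hypothetical sketch of label normalisation; not the library's BondType.fromString.
final class EdgeLabelSketch {

    // Assumed edge order: Watson-Crick, Hoogsteen, Sugar.
    private static final String EDGE_ORDER = "WHS";

    static String normalize(String label) {
        if (label == null) return "unknown";
        String upper = label.trim().toUpperCase(Locale.ROOT);
        if (upper.length() != 3) return "unknown";

        char orientation = upper.charAt(0);
        int first = EDGE_ORDER.indexOf(upper.charAt(1));
        int second = EDGE_ORDER.indexOf(upper.charAt(2));
        boolean validOrientation = orientation == 'C' || orientation == 'T';
        if (!validOrientation || first < 0 || second < 0) return "unknown";

        // Swap reversed edges, e.g. SH -> HS, SW -> WS, HW -> WH.
        if (first > second) {
            int tmp = first;
            first = second;
            second = tmp;
        }
        return "" + Character.toLowerCase(orientation)
                + EDGE_ORDER.charAt(first) + EDGE_ORDER.charAt(second);
    }
}
```

With this sketch, both "tSH" and "tsh" normalise to "tHS", while "CWW" normalises to "cWW".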

BpseqExporter

Converts an ExtendedRNASecondaryStructure into a BPSEQ-formatted string, either extended or canonical.

BpseqExporter exporter = new BpseqExporter();
 
// Full extended BPSEQ (12 LW columns)
String extended = exporter.printExtendedBPSEQ(structure);
 
// Canonical BPSEQ only (positions without a canonical partner are omitted)
String canonical = exporter.printCanonicalBPSEQ(structure);

Extended BPSEQ output structure

Index   Nucleotide   cWW   tWW   cWH   tWH   cWS   tWS   cHH   tHH   cHS   tHS   cSS   tSS
1       G            29    0     0     0     0     0     0     0     0     0     0     0
2       G            28    0     0     0     0     0     0     0     0     0     0     0
...

If a nucleotide has multiple partners of the same type (e.g., two cWH interactions), the partners are comma-separated: 3,17.
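To make the row layout concrete, the sketch below assembles one extended row from a list of interactions. It is an illustration under assumptions, not BpseqExporter's code: ExtendedRowSketch and row are hypothetical names, each interaction is encoded as an int array {pos1, pos2, columnIndex} with zero-based positions, the column order follows the header shown above, and a tab is used as the separator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical sketch of one extended BPSEQ row; not the library's BpseqExporter.
final class ExtendedRowSketch {

    // Column order taken from the header line in the example output.
    static final List<String> COLUMNS = Arrays.asList(
            "cWW", "tWW", "cWH", "tWH", "cWS", "tWS",
            "cHH", "tHH", "cHS", "tHS", "cSS", "tSS");

    // index and the positions inside each interaction are zero-based.
    static String row(int index, char nucleotide, List<int[]> interactions) {
        List<List<Integer>> partners = new ArrayList<>();
        for (int c = 0; c < COLUMNS.size(); c++) {
            partners.add(new ArrayList<>());
        }
        for (int[] it : interactions) {
            if (it[0] == index) {
                partners.get(it[2]).add(it[1] + 1);   // store the partner 1-based
            } else if (it[1] == index) {
                partners.get(it[2]).add(it[0] + 1);
            }
        }
        StringJoiner line = new StringJoiner("\t");
        line.add(Integer.toString(index + 1));
        line.add(String.valueOf(nucleotide));
        for (List<Integer> cell : partners) {
            if (cell.isEmpty()) {
                line.add("0");                         // no partner of this type
            } else {
                StringJoiner joined = new StringJoiner(",");
                for (Integer p : cell) {
                    joined.add(p.toString());
                }
                line.add(joined.toString());           // e.g. "3,17"
            }
        }
        return line.toString();
    }
}
```

For a nucleotide at zero-based index 0 with two cWH interactions to indices 2 and 16, the cWH cell of the resulting row reads 3,17 and every other cell reads 0.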

Canonical BPSEQ output structure

1   G   29
2   G   28
...

Positions with no canonical partner are not included in the canonical output.
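The omission rule and the zero-based to one-based conversion can be sketched together. The code is a simplified illustration rather than the exporter itself: CanonicalBpseqSketch and export are hypothetical names, each pair is an int array {pos1, pos2} with zero-based positions, every position is assumed to have at most one canonical partner, and a tab is used as the separator.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of canonical BPSEQ output; not the library's BpseqExporter.
final class CanonicalBpseqSketch {

    static String export(String sequence, List<int[]> canonicalPairs) {
        // partner[i] holds the zero-based partner of position i, or -1 if unpaired.
        int[] partner = new int[sequence.length()];
        Arrays.fill(partner, -1);
        for (int[] pair : canonicalPairs) {
            partner[pair[0]] = pair[1];
            partner[pair[1]] = pair[0];
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < sequence.length(); i++) {
            if (partner[i] < 0) {
                continue;                       // unpaired positions are omitted
            }
            out.append(i + 1).append('\t')      // 1-based index
               .append(sequence.charAt(i)).append('\t')
               .append(partner[i] + 1)          // 1-based partner
               .append('\n');
        }
        return out.toString();
    }
}
```

For the sequence GAAC with a single pair between zero-based positions 0 and 3, the sketch emits two lines, one for position 1 (partner 4) and one for position 4 (partner 1), and skips the two unpaired adenines.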
