# 🧬 PARSEbp: Pairwise Agreement-based RNA Scoring with Emphasis on Base Pairings

**Authors:** Sumit Tarafder and Debswapna Bhattacharya  
Bhattacharya Lab, Virginia Tech

---
This notebook demonstrates how to install and use **PARSEbp**, a *Pairwise Agreement-based RNA Scoring* method emphasizing **base-pairing consistency** among RNA 3D decoys.

## ⚙️ Installation

Typical installation time should be **less than a minute** on Colab or a 64-bit Linux environment.

```bash
!pip install PARSEbp
```
or install from source:
```bash
!git clone https://github.com/Bhattacharya-Lab/PARSEbp.git
%cd PARSEbp
!pip install .
```

In [None]:
!pip install PARSEbp

In [None]:
!git clone https://github.com/Bhattacharya-Lab/PARSEbp.git
%cd PARSEbp
!pip install .

## 🧩 Quick Start Example

The following example shows how to load RNA structures, compute PARSEbp scores, and access the results.

In [None]:
from PARSEbp import parsebp

# Initialize PARSEbp
p = parsebp()
p.set_mode(1)  # 1 = TM-score weighted by INF (default); 0 = TM-score only
p.set_parallel_threads(100)  # threads for pairwise computations (default is 50)

# Example RNA sequence (for demonstration)
seq = "GGACACGAGUAACUCGUCUAUCUGCUGCAGGCUGCUUACGGUUUCGUCCGUGUUGCAGCCGAUCAUCAGAACAUCUAGGUUUCGUCCGGGUGUUACCGAAAGGUCAGAUGGAGAGCCUUGUCCC"
p.set_target_sequnece(seq)

## set_mode()
This function takes one argument. With `1` (the default), the PDB files are scored by pairwise TM-score similarity weighted by pairwise INF scores, which emphasizes base pairings.
With `0`, only pairwise TM-score similarity is used to score the PDB files.
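
To make the difference between the two modes concrete, here is a minimal sketch of pairwise agreement scoring on made-up numbers. It is not the PARSEbp implementation: the function `consensus_scores`, the use of a plain product to combine TM-score and INF, the averaging over the other decoys, and the assumption that both similarities are symmetric are all illustrative choices.

```python
from itertools import combinations

def consensus_scores(names, tm, inf=None):
    """Score each decoy by its mean pairwise similarity to every other decoy.

    tm and inf map frozenset({a, b}) to a similarity in [0, 1].
    With inf=None only TM-score is used (analogous to mode 0).
    With inf given, each TM-score is multiplied by the INF score of the
    same pair (analogous to mode 1). The product is an ASSUMPTION made
    for illustration, not the documented PARSEbp formula.
    """
    if len(names) < 2:
        raise ValueError("need at least two decoys for pairwise scoring")
    totals = {name: 0.0 for name in names}
    for a, b in combinations(names, 2):
        key = frozenset((a, b))
        sim = tm[key] * inf[key] if inf is not None else tm[key]
        totals[a] += sim
        totals[b] += sim
    n_others = len(names) - 1
    return {name: totals[name] / n_others for name in names}

# Hypothetical pairwise similarities for three decoys.
names = ["d1", "d2", "d3"]
tm = {
    frozenset(("d1", "d2")): 0.8,
    frozenset(("d1", "d3")): 0.4,
    frozenset(("d2", "d3")): 0.6,
}
inf = {
    frozenset(("d1", "d2")): 0.9,
    frozenset(("d1", "d3")): 0.5,
    frozenset(("d2", "d3")): 0.5,
}

mode0 = consensus_scores(names, tm)       # TM-score only
mode1 = consensus_scores(names, tm, inf)  # TM-score weighted by INF
```

In this toy example `d2` ranks first in both modes, but the weighted scores drop most for pairs whose base pairings disagree.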

## set_parallel_threads()
Sets the number of threads used for parallel pairwise score computations. The default value is 50.
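
Pairwise scoring grows quadratically: N decoys give N(N-1)/2 pairs, so 200 decoys require 19,900 comparisons. The sketch below shows one generic way to spread such comparisons over a thread pool using only the Python standard library. The names `pairwise_parallel` and `toy_compare` are hypothetical, and `toy_compare` is a placeholder for a real structural comparison; how PARSEbp schedules its work internally is not described here.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

def pairwise_parallel(items, compare, n_threads=50):
    """Apply compare(a, b) to every unordered pair using a thread pool."""
    pairs = list(combinations(items, 2))
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        values = list(pool.map(lambda pair: compare(*pair), pairs))
    return dict(zip(pairs, values))

def toy_compare(a, b):
    # Placeholder similarity: fraction of identical positions in two strings.
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))

result = pairwise_parallel(["GGAC", "GGAU", "CGAU"], toy_compare, n_threads=4)
```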

## set_target_sequnece()
If you set the target sequence to `""`, all PDB files in the input directory will be scored.
If you set a specific sequence, only the PDB files whose sequence exactly matches the target sequence will be scored.
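
The filtering rule can be illustrated with a small sketch that reads the sequence from the `ATOM` records of a PDB file and compares it with the target. The helpers `sequence_from_pdb_lines`, `should_score`, and `atom_line` are hypothetical and not part of PARSEbp. The sketch assumes standard fixed-column PDB formatting and one-letter residue names (A, C, G, U); PARSEbp's own parsing may differ.

```python
def sequence_from_pdb_lines(lines):
    """Build the sequence from ATOM records, one letter per residue."""
    seen = set()
    residues = []
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        # Chain ID, residue number, and insertion code identify a residue.
        res_id = (line[21:22], line[22:26], line[26:27])
        if res_id in seen:
            continue
        seen.add(res_id)
        residues.append(line[17:20].strip())
    return "".join(residues)

def should_score(lines, target):
    """An empty target accepts every file; otherwise require an exact match."""
    return target == "" or sequence_from_pdb_lines(lines) == target

def atom_line(serial, atom, res, chain, resseq):
    # Minimal ATOM record with only the columns this sketch reads.
    return f"ATOM  {serial:>5} {atom:<4} {res:>3} {chain}{resseq:>4} "

example = [
    atom_line(1, "P", "G", "A", 1),
    atom_line(2, "C4'", "G", "A", 1),
    atom_line(3, "P", "C", "A", 2),
]
example_sequence = sequence_from_pdb_lines(example)
```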

In [None]:
# Load a directory containing RNA 3D structures (.pdb files)
p.load_pdbs("Inputs")

In [None]:
# Compute scores
score = p.score()

In [None]:
# Save all results
score.save("score.txt")

## Output Explanation

Calling `score.save("score.txt")` writes the predicted quality score of each decoy structure to the file named by its argument, here **`score.txt`**.

| Decoy | Score |
|:------|:------|
| decoy_1.pdb | 0.812 |
| decoy_2.pdb | 0.793 |
| ... | ... |

### ⏱ Performance
- Scoring a typical RNA (~100 nucleotides) with ~200 decoys takes **≈30 seconds** using 50 threads.

## 📊 Accessing and Analyzing Results

You can retrieve individual or top-ranked scores directly from the `score` object.

### Get score for a specific 3D structure

In [None]:

pred_score = score.getScore("decoy_1.pdb")

### Get the top ranked pdb and the corresponding score using top1()
Returns a list of decoy names (multiple decoys if there is a tie) and the top score.

In [None]:
# Get top-1 ranked model(s)
pdbnames, top_score = score.top1()

### Get top N ranked decoys and their corresponding scores using topN(N)
Takes one argument, `N`: the number of top structures to return.

Returns a dictionary of the top N scoring decoys and their predicted scores, sorted in descending order of score.

In [None]:
# Get top-N ranked decoys
top_scored_dict = score.topN(3)
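
The return shapes described for `top1()` and `topN(N)` can be sketched on a plain dictionary of scores. The functions `top1_sketch` and `top_n_sketch` below are hypothetical stand-ins operating on made-up values, written only to show the tie handling and the descending order; they are not the library's code.

```python
def top1_sketch(scores):
    """Return every decoy tied for the best score, plus that score."""
    best = max(scores.values())
    return [name for name, s in scores.items() if s == best], best

def top_n_sketch(scores, n):
    """Return the n best decoys as a dict ordered by descending score."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:n])

# Made-up scores with a tie for first place.
scores = {
    "decoy_1.pdb": 0.812,
    "decoy_2.pdb": 0.793,
    "decoy_3.pdb": 0.812,
    "decoy_4.pdb": 0.640,
}

names, best = top1_sketch(scores)
top3 = top_n_sketch(scores, 3)

for name, value in top3.items():
    print(f"{name}\t{value:.3f}")
```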

## Summary

PARSEbp is a fast and accurate multi-model quality assessment (QA) method for RNA 3D structures, based on pairwise agreement with emphasis on base pairings. It is particularly useful for:
- Ranking RNA 3D decoys from prediction pipelines.
- Evaluating ensemble quality in RNA folding studies.
- Benchmarking or CASP-style assessments.

For detailed documentation, visit: [https://github.com/Bhattacharya-Lab/PARSEbp](https://github.com/Bhattacharya-Lab/PARSEbp)