Classical Solver for QUBO Optimization
This project uses a classical constraint programming solver to predict RNA secondary structure by minimizing a QUBO (Quadratic Unconstrained Binary Optimization) Hamiltonian. The QUBO is built from the potential stems together with penalty terms for pseudoknots and overlaps, so it encodes the energy landscape of the possible RNA foldings.
We solve the problem with the Google OR-Tools CP-SAT solver. Each potential stem is a Boolean variable (0 = not selected, 1 = selected), and the solver searches for the combination of stems with the lowest total energy. The overlap and pseudoknot penalties are terms of the QUBO itself, so minimizing the energy accounts for them without any extra step.
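To make the objective concrete, here is a minimal sketch of the energy being minimized. It assumes the linear terms are a dict mapping a stem index to a coefficient and the quadratic terms are a dict mapping an index pair to a coefficient; the actual layout returned by `model` is not shown here and may differ. The exhaustive search is only for checking tiny instances and is not what CP-SAT does internally.

```python
from itertools import product


def qubo_energy(x, linear, quadratic):
    """Energy of a 0/1 assignment x: sum of linear terms plus pairwise terms."""
    energy = sum(coeff * x[i] for i, coeff in linear.items())
    energy += sum(coeff * x[i] * x[j] for (i, j), coeff in quadratic.items())
    return energy


def brute_force_minimum(linear, quadratic):
    """Try every 0/1 assignment (2**n of them) and return the lowest-energy one."""
    n = len(linear)
    best = min(
        product((0, 1), repeat=n),
        key=lambda x: qubo_energy(x, linear, quadratic),
    )
    return best, qubo_energy(best, linear, quadratic)


# Hypothetical toy instance with three candidate stems.
# Negative linear terms reward selecting a stem; positive quadratic
# terms penalize selecting two stems together.
toy_linear = {0: -3.0, 1: -2.0, 2: -1.0}
toy_quadratic = {(0, 1): 10.0, (1, 2): 2.0}

best_assignment, best_energy = brute_force_minimum(toy_linear, toy_quadratic)
print(best_assignment, best_energy)
```

In this toy instance, stems 0 and 1 carry a large penalty for being chosen together, so the minimum selects stems 0 and 2.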
Key steps:
Preprocessing: Extract all potential stems, pseudoknots, and overlaps from the RNA sequence.
Model Construction: Build the QUBO Hamiltonian (linear and quadratic terms) using the preprocessing results.
Optimization: Use the CP-SAT solver to find the optimal set of stems (the predicted structure).
Evaluation: Compare the predicted structure to the actual (experimentally determined) structure.

from preprocess_sequence import (
    actual_stems, potential_stems, potential_pseudoknots, potential_overlaps, model
)
from classical_optimizer import solve_qubo_with_cpsat

# Example file paths and directory
ct_file = "your_file.ct"
fasta_file = "your_file.fasta"
subdirectory = "your_data_directory"
pseudoknot_penalty = 2.0

# 1. Get actual stems (ground truth)
actual = actual_stems(ct_file, fasta_file, subdirectory)

# 2. Get potential stems and related info
pot_stems, mu, rna, seq_len = potential_stems(fasta_file, subdirectory)

# 3. Get potential pseudoknots and overlaps
pot_pks = potential_pseudoknots(pot_stems, pseudoknot_penalty)
pot_ovs = potential_overlaps(pot_stems)

# 4. Build the QUBO model (linear and quadratic terms)
L, Q = model(pot_stems, pot_pks, pot_ovs, mu)

# 5. Solve the QUBO using the classical solver
solution, energy = solve_qubo_with_cpsat(L, Q)

# 6. Extract the predicted stems from the solution
#    (this expects `solution` to map each stem index to 0 or 1)
predicted_stems = [pot_stems[i] for i, v in solution.items() if v == 1]
print("Predicted stems:", predicted_stems)
print("Minimum energy:", energy)
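The Evaluation step is not shown in the cell above. One possible way to compare the prediction with the ground truth is sketched below. It assumes each stem can be converted to a hashable tuple and that two stems match only when they are identical; the helper `stem_overlap_scores` and the example tuples are hypothetical, and the project may use a different metric.

```python
def stem_overlap_scores(predicted, actual):
    """Precision, recall and F1 of predicted stems against actual stems."""
    predicted_set = {tuple(stem) for stem in predicted}
    actual_set = {tuple(stem) for stem in actual}
    true_positives = len(predicted_set & actual_set)

    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(actual_set) if actual_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up stems for illustration only, not real data.
example_predicted = [(1, 20, 4), (6, 15, 3)]
example_actual = [(1, 20, 4), (30, 45, 5), (6, 15, 3), (50, 60, 2)]

precision, recall, f1 = stem_overlap_scores(example_predicted, example_actual)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

With the variables from the cell above, the call would be `stem_overlap_scores(predicted_stems, actual)`, provided both lists use the same stem representation.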