Confidence-aware tools for protein structure predictions.
foldconf is a Python library and CLI tool that bridges the gap between structure prediction and downstream analysis. It parses output from AlphaFold 3, AlphaFold Database, OpenFold3, Boltz-2, and Chai-1, normalizes confidence metrics into a unified data model, and provides filtering, visualization, reporting, and export tools that propagate uncertainty through your analysis pipeline.
Unlike existing structure tools, foldconf treats confidence data as a first-class citizen. Instead of ignoring pLDDT, PAE, and pTM scores, it uses them to filter unreliable regions, detect interfaces with high confidence, identify structural disagreements between predictors, and generate reports that make prediction uncertainty explicit.
Requires Python 3.11+.
Basic install:
```bash
pip install foldconf
```

With optional 3D visualization (py3Dmol):

```bash
pip install 'foldconf[3d]'
```

With PDF report generation (weasyprint):

```bash
pip install 'foldconf[pdf]'
```

With all extras:

```bash
pip install 'foldconf[3d,pdf]'
```

Load and summarize a prediction:
```python
import foldconf

conf = foldconf.load("path/to/alphafold3_output/")
print(conf.summary())

# Also works with AlphaFold Database downloads
conf = foldconf.load("path/to/afdb_download/")
```

Output:

```text
Source: alphafold3 | Residues: 342 | Mean pLDDT: 82.4
Chains: A (245 res, 87.1), B (97 res, 70.6)
pTM: 0.71 | ipTM: 0.64
Low confidence regions: A:120-135, B:44-52
```
Filter to high-confidence regions and export:

```python
high_conf = conf.filter(plddt_min=70)
high_conf.export("filtered_high_confidence.cif")
```

Identify and analyze binding interfaces:
```python
interface = conf.select_interface(chain_a="A", chain_b="B", pae_max=5.0)
print(f"Interface has {interface.mean_plddt():.1f} mean pLDDT")
print(f"Interface residues: {len(interface.residues)}")
```

Generate an interactive report:
```python
report = foldconf.report(conf)
report.save("confidence_report.html")
```

Detect disagreement between two predictors:
```python
conf_af3 = foldconf.load("pred_alphafold3/")
conf_boltz = foldconf.load("pred_boltz2/")

diff = foldconf.diff(conf_af3, conf_boltz)
print(diff.summary())
# Shows regions where both predictors are confident but their structures
# disagree — often the most interesting predictions
foldconf.compare([conf_af3, conf_boltz], output="comparison.html")
```

From the CLI:
```bash
# Summarize a prediction
foldconf summary pred/ --json | jq '.mean_plddt'

# Filter and export
foldconf filter pred/ --plddt-min 70 -o filtered.cif

# Generate a report
foldconf report pred/ -o report.html

# Compare three predictions
foldconf compare pred_af3/ pred_boltz/ pred_openfold/ -o comparison.html

# Batch process a directory
foldconf batch predictions/ --output-dir reports/

# Check if a prediction is reliable enough to use
if foldconf check pred/ --plddt-min 70 2>/dev/null; then
    echo "Safe to use"
else
    echo "Check warnings"
fi
```

foldconf auto-detects the prediction tool by examining signature files. Explicit loaders are also available:
```python
import foldconf

# Auto-detect
conf = foldconf.load("path/")

# Explicit loaders
conf = foldconf.load_alphafold3("path/")
conf = foldconf.load_afdb("path/")
conf = foldconf.load_boltz2("path/")
conf = foldconf.load_openfold3("path/")
conf = foldconf.load_chai1("path/")

# Load all samples (for multi-sample predictions)
conf_set = foldconf.load_all("path/")
top = conf_set.top                      # highest-ranked sample
consensus = conf_set.consensus_plddt()  # mean pLDDT across samples
variance = conf_set.plddt_variance()    # per-residue variance
```
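The consensus helpers reduce a per-sample pLDDT matrix along the sample axis. As a standalone illustration of that reduction (plain NumPy; the array here is toy data, not foldconf output):

```python
import numpy as np

# Toy stand-in for a multi-sample prediction:
# rows are diffusion samples, columns are residues.
sample_plddt = np.array([
    [92.0, 88.0, 45.0, 71.0],
    [90.0, 85.0, 55.0, 69.0],
    [91.0, 86.0, 35.0, 73.0],
])

consensus = sample_plddt.mean(axis=0)  # per-residue mean across samples
variance = sample_plddt.var(axis=0)    # per-residue variance across samples

# High variance at residue 2 flags sample-to-sample disagreement
print(consensus)
print(variance)
```

High per-residue variance can be a more useful disagreement signal than a low mean alone, since samples may consistently agree that a region is unstructured.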
```python
# Filter by confidence threshold
high_conf = conf.filter(plddt_min=70)
very_high_conf = conf.filter(plddt_min=90)

# Select specific chains
chain_a = conf.filter(chains=["A"])

# Select a specific region by residue index
domain = conf.select(residues=range(50, 150))

# Select an interface using PAE
interface = conf.select_interface(
    chain_a="A",
    chain_b="B",
    pae_max=5.0,           # PAE threshold (Angstroms)
    contact_prob_min=0.5,  # contact probability threshold (if available)
)

# All filter operations return a new FoldConfidence object (immutable, chainable)
high_conf_a = conf.filter(plddt_min=70).filter(chains=["A"])
```
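PAE-based interface selection can be pictured as thresholding the cross-chain block of the PAE matrix: a residue joins the interface when its best PAE against the other chain beats the cutoff. A minimal NumPy sketch with toy data (illustrative only, not foldconf's actual implementation):

```python
import numpy as np

# Toy 6-residue complex: residues 0-3 form chain A, residues 4-5 form chain B
chain_ids = np.array(["A", "A", "A", "A", "B", "B"])

# Toy PAE matrix: everything uncertain (20 A) except a few cross-chain pairs
pae = np.full((6, 6), 20.0)
for i, j in [(2, 4), (3, 4)]:
    pae[i, j] = pae[j, i] = 3.0

mask_a = chain_ids == "A"
mask_b = chain_ids == "B"
cross = pae[np.ix_(mask_a, mask_b)]  # A rows x B columns

pae_max = 5.0
a_iface = np.flatnonzero(mask_a)[cross.min(axis=1) < pae_max]
b_iface = np.flatnonzero(mask_b)[cross.min(axis=0) < pae_max]

print(a_iface.tolist(), b_iface.tolist())  # [2, 3] [4]
```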
```python
# Global statistics
mean_plddt = conf.mean_plddt()

# Per-chain statistics
mean_plddt_a = conf.mean_plddt(chain="A")

# Find low-confidence regions
low_regions = conf.low_confidence_regions(threshold=50)
# Returns: [(start, end), ...]

# Quick reliability check
is_reliable = conf.is_reliable(plddt_min=70, ptm_min=0.5)  # returns bool

# Confidence classification
classification = conf.classify()
# classification.labels — per-residue categories (very_high, confident, low, very_low)
# classification.distribution — count per category
# classification.fraction_reliable — fraction above the "confident" threshold

# Pre-computed summaries
for chain_id, summary in conf.chain_summaries.items():
    print(f"{chain_id}: {summary.mean_plddt:.1f} pLDDT")

for (chain_a, chain_b), summary in conf.interface_summaries.items():
    print(f"{chain_a}-{chain_b} interface: PAE {summary.mean_interface_pae:.2f}")
```
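Low-confidence region detection reduces to run-length logic: scan the pLDDT array and collect contiguous stretches below the threshold. A self-contained sketch of that logic over a toy array (foldconf's own implementation may differ):

```python
import numpy as np

def runs_below(plddt, threshold):
    """Return (start, end) index pairs (inclusive) of contiguous runs below threshold."""
    regions, start = [], None
    for i, value in enumerate(plddt):
        if value < threshold and start is None:
            start = i
        elif value >= threshold and start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:  # close a run that reaches the end of the array
        regions.append((start, len(plddt) - 1))
    return regions

plddt = np.array([90, 85, 40, 35, 88, 92, 45, 91])
print(runs_below(plddt, 50))  # [(2, 3), (6, 6)]
```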
```python
import foldconf.plot as plot

# Line plot: pLDDT per residue, colored by confidence band
plot.plddt_line(conf, save_to="plddt.png")

# Heatmap: PAE matrix with chain boundaries
plot.pae_heatmap(conf, save_to="pae.png")

# Histogram: distribution of pLDDT scores
plot.confidence_histogram(conf, save_to="hist.png")

# Overlay: compare pLDDT curves from multiple predictions
plot.compare([conf1, conf2, conf3], save_to="comparison.png")

# Interactive 3D (requires foldconf[3d])
import foldconf.view as view
view.structure(conf)                      # opens browser
view.structure(conf, highlight_low=True)  # highlights unreliable regions
```
```python
# Generate a reliability report
report = foldconf.report(conf)
report.save("report.html")  # self-contained HTML with plots + tables
report.save("report.json")  # machine-readable
report.save("report.pdf")   # requires foldconf[pdf]

# Export structure with confidence embedded
conf.export("structure.cif")  # mmCIF with provenance
conf.export("structure.pdb")  # PDB with pLDDT as B-factor
conf.filter(plddt_min=70).export("filtered.cif")

# Export with pLDDT explicitly in the B-factor column
conf.export("structure_bfactor.pdb", bfactor="plddt")
```

Exported mmCIF files include custom categories that document:
- Source tool and version
- Global scores (pTM, ipTM, ranking_score)
- Filter history (e.g., `plddt_min=70, chains=["A"]`)
- Warnings raised during load
- Confidence band distribution
This means collaborators can see exactly what was done to produce a filtered structure without needing foldconf installed.
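For illustration, the provenance block in an exported file might look roughly like this; the category and item names below are hypothetical, invented for this sketch rather than taken from foldconf's documented schema:

```text
# hypothetical provenance category (illustrative names only)
_foldconf_provenance.source_tool     alphafold3
_foldconf_provenance.ptm             0.71
_foldconf_provenance.iptm            0.64
_foldconf_provenance.filter_history  'plddt_min=70; chains=A'
_foldconf_provenance.warnings        'low_iptm'
```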
```python
import foldconf
from foldconf.models import FoldConfidence

# Iterate over prediction directories without holding all in memory
for path, result in foldconf.batch_load("predictions/"):
    if isinstance(result, FoldConfidence):
        print(f"{path}: {result.mean_plddt():.1f} pLDDT")
    else:
        # result is a LoadError wrapping the exception
        print(f"{path}: failed — {result.error}")

# Generate reports for all predictions
foldconf.batch_report(
    foldconf.batch_load("predictions/"),
    output_dir="reports/",
)
# Generates report.html, report.json (or .pdf) per prediction
# Also produces failures.json with error details
```
```python
# Compare two predictions of the same target
diff = foldconf.diff(conf_af3, conf_boltz2)

# Per-residue structural difference (RMSD after alignment)
rmsd = diff.rmsd_per_residue

# Per-residue confidence agreement
confidence_diff = diff.confidence_agreement

# Interesting regions: both confident but structures disagree.
# These flag genuinely ambiguous biology rather than prediction noise.
interesting = diff.interesting_regions()
for start, end in interesting:
    print(f"residues {start}-{end}: disagreement despite high confidence")

# Summary statistics
print(diff.summary())

# Generate comparison HTML report
foldconf.compare([conf_af3, conf_boltz2, conf_openfold], output="comparison.html")
```
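The idea behind interesting-region detection can be sketched independently: mark residues where both predictions clear a confidence bar yet the aligned structures diverge, then merge the marks into spans. The thresholds here (pLDDT ≥ 70, RMSD > 2 A) are illustrative, not foldconf's documented defaults:

```python
import numpy as np

plddt_a = np.array([92, 90, 88, 91, 55, 93])
plddt_b = np.array([90, 89, 91, 90, 60, 92])
rmsd = np.array([0.4, 2.5, 3.1, 0.5, 4.0, 0.3])  # per-residue RMSD after alignment

both_confident = (plddt_a >= 70) & (plddt_b >= 70)
disagree = rmsd > 2.0
# Residue 4 also disagrees, but low confidence filters it out as prediction noise
mask = both_confident & disagree

# Merge the boolean mask into contiguous (start, end) spans
spans, start = [], None
for i, flag in enumerate(mask):
    if flag and start is None:
        start = i
    elif not flag and start is not None:
        spans.append((start, i - 1))
        start = None
if start is not None:
    spans.append((start, len(mask) - 1))

print(spans)  # [(1, 2)]
```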
The core object is `FoldConfidence`, which normalizes confidence data from different prediction tools:

```python
@dataclass
class FoldConfidence:
    # Identity
    source: str     # "alphafold3", "openfold3", "boltz2", "chai1"
    metadata: dict  # tool version, input params, etc.

    # Per-residue confidence
    plddt: np.ndarray  # 0-100, normalized across all sources

    # Pairwise confidence
    pae: np.ndarray            # predicted aligned error (Angstroms)
    contact_probs: np.ndarray  # contact probability (if available)

    # Global scores
    ptm: float            # predicted TM-score
    iptm: float           # interface pTM
    ranking_score: float  # AF3 composite score (if available)

    # Boltz-2 specific
    pde: np.ndarray  # predicted distance error
    affinity: float  # binding affinity (log IC50, µM)

    # Extended scores (tool-specific, preserved as-is)
    extended_scores: dict  # e.g., ligand_iptm, protein_iptm, etc.

    # Structure
    structure: gemmi.Structure  # atomic coordinates
    residues: list[Residue]     # sequence metadata
    chains: list[Chain]         # chain groupings

    # Pre-computed summaries
    chain_summaries: dict[str, ChainSummary]
    interface_summaries: dict[tuple[str, str], InterfaceSummary]

    # Warnings and provenance
    warnings: list[ConfidenceWarning]
    original_indices: np.ndarray  # maps filtered indices back to the source
```

All library operations are available from the command line. Every command supports `--json` for piping to jq.
Print a text summary of a prediction:
```bash
foldconf summary path/to/prediction/
foldconf summary path/to/prediction/ --json
```

Show the confidence band distribution (very_high, confident, low, very_low):

```bash
foldconf classify path/to/prediction/
foldconf classify path/to/prediction/ --json
```

Filter to high-confidence residues and export:

```bash
foldconf filter path/to/prediction/ --plddt-min 70 -o filtered.cif
foldconf filter path/to/prediction/ --plddt-min 90 --chain A -o binding_site.pdb
```

Options: `--plddt-min`, `--chain`, `-o`/`--output`
Generate static plots (matplotlib):
```bash
foldconf plot plddt path/to/prediction/ -o plddt.png
foldconf plot pae path/to/prediction/ -o pae.png
foldconf plot histogram path/to/prediction/ -o hist.png
```

Generate an interactive reliability report:

```bash
foldconf report path/to/prediction/ -o report.html
foldconf report path/to/prediction/ -o report.json
foldconf report path/to/prediction/ -o report.pdf  # requires foldconf[pdf]
```

Batch process a directory of predictions:
```bash
foldconf batch predictions/ --output-dir reports/
foldconf batch predictions/ --output-dir reports/ --verbose
```

Generates report.html (or .json/.pdf) per prediction, plus failures.json with errors.

Compare multiple predictions side-by-side:

```bash
foldconf compare pred_af3/ pred_boltz2/ pred_openfold/ -o comparison.html
```

Structural diff between two predictions (finds regions where both are confident but disagree):
```bash
foldconf diff pred_af3/ pred_boltz2/ -o diff_report.html
foldconf diff pred_af3/ pred_boltz2/ --json | jq '.interesting_regions'
```

Quick check: exit 0 if reliable, 1 if not. Useful in CI pipelines:

```bash
if foldconf check pred/ --plddt-min 70 2>/dev/null; then
    echo "Prediction is suitable for downstream use"
else
    echo "Check confidence metrics"
fi
```

Interactive 3D visualization (opens a browser):

```bash
foldconf view path/to/prediction/
```

Global options:

- `--json` — machine-readable JSON output (works with all commands)
- `--no-warnings` — suppress warning output to stderr
- `-o`, `--output` — write output to file instead of stdout
Pipe examples:

```bash
# Get mean pLDDT via JSON
foldconf summary pred/ --json | jq '.mean_plddt'

# Find all chains with mean pLDDT < 60
foldconf summary pred/ --json | jq '.chain_summaries[] | select(.mean_plddt < 60)'

# Batch: filter predictions by confidence
foldconf batch pred/ --json | jq 'select(.mean_plddt > 80)'

# Diff: extract interesting regions
foldconf diff af3/ boltz2/ --json | jq '.interesting_regions'

# Filter, then read the retained-residue count from the JSON summary
foldconf filter pred/ --plddt-min 70 -o filtered.cif --json | jq '.num_retained_residues'
```

Supported prediction tools:

- AlphaFold 3 — full support (pLDDT, PAE, pTM, ipTM, ranking_score)
- AlphaFold Database — full support (pLDDT, PAE). Reads native AFDB format directly — no conversion needed for files downloaded from alphafold.ebi.ac.uk
- OpenFold3 — full support (pLDDT, PAE, pTM, ipTM)
- Boltz-2 — full support (pLDDT, PAE, pTM, ipTM, pDE, binding affinity)
- Chai-1 — stub loader; format support to be determined
Auto-detection examines signature files to identify the source tool without user input; AFDB downloads from alphafold.ebi.ac.uk are detected the same way.
All metrics are normalized for consistency across sources:
- pLDDT — per-residue confidence (0-100, higher is better)
- PAE — predicted aligned error between residue pairs (Angstroms, lower is better)
- pTM — predicted TM-score (0-1, estimated accuracy of the overall fold)
- ipTM — interface pTM (0-1, confidence in interface modeling)
- pDE — predicted distance error (Boltz-2 only)
- Affinity — predicted binding affinity as log(IC50 in µM) (Boltz-2 only)
- Contact probability — cross-chain contact likelihood (if available)
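Normalization matters because predictors disagree on scales; for example, some tools report pLDDT on a 0-1 scale rather than 0-100 (which tool uses which is an implementation detail not covered here). A common heuristic is to rescale whenever the maximum value is at most 1:

```python
import numpy as np

def normalize_plddt(plddt):
    """Return pLDDT on the 0-100 scale, rescaling 0-1 inputs."""
    plddt = np.asarray(plddt, dtype=float)
    if plddt.size and plddt.max() <= 1.0:
        plddt = plddt * 100.0  # assume the source reported fractions
    return plddt

print(normalize_plddt([0.92, 0.45]))  # rescaled to the 0-100 convention
print(normalize_plddt([92.0, 45.0]))  # already 0-100, returned unchanged
```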
- Threshold filtering — retain residues above a pLDDT minimum
- Chain filtering — select specific chains
- Region selection — select by residue index range
- Interface detection — smart selection using PAE and contact probabilities
- Immutable operations — filter returns new FoldConfidence, original unchanged
- pLDDT line plot — per-residue confidence with AF3-standard color bands
- PAE heatmap — pairwise confidence matrix with chain boundaries
- Confidence histogram — distribution of pLDDT scores
- Multi-prediction overlay — compare pLDDT curves
- Interactive 3D — py3Dmol integration for browser-based visualization
- HTML reports — self-contained, viewable anywhere, includes plots and tables
- JSON reports — machine-readable, suitable for pipelines
- PDF reports — optional, requires weasyprint
- CIF export with provenance — custom categories document source tool, filter history, warnings
- Confidence classification — categorize residues into very_high/confident/low/very_low
- Low-confidence region detection — identify contiguous regions below threshold
- Interface summaries — mean PAE and pLDDT per chain-pair interface
- Structural diff — find regions where two predictions disagree despite high confidence
- Consensus across samples — mean/variance across multiple diffusion samples
- Streaming — process large prediction sets without holding all in memory
- Error handling — capture and report failures per prediction
- Batch reporting — generate reports for all predictions with a single call
- Structured logging — per-directory load time, counts, and error details
```python
import sys

import foldconf

# Load prediction
conf = foldconf.load("protein_ligand_complex/")

# Check global reliability
if not conf.is_reliable(plddt_min=70, ptm_min=0.5):
    print("Prediction confidence is too low for binding analysis")
    sys.exit(1)

# Extract binding interface
interface = conf.select_interface(chain_a="protein", chain_b="ligand", pae_max=5.0)

# Export for downstream docking
interface.export("validated_interface.pdb", bfactor="plddt")
```
```python
import foldconf

conf_af3 = foldconf.load("af3_prediction/")
conf_boltz = foldconf.load("boltz2_prediction/")

# Structural and confidence comparison
diff = foldconf.diff(conf_af3, conf_boltz)

# Regions where BOTH tools are confident but disagree,
# often flagging genuine biological uncertainty
interesting = diff.interesting_regions()
print(f"Found {len(interesting)} interesting regions:")
for start, end in interesting:
    print(f"  residues {start}-{end}")

# Generate visual comparison
foldconf.compare([conf_af3, conf_boltz], output="comparison.html")
```
```python
import foldconf
from foldconf.models import FoldConfidence

# Process all predictions, filter high-confidence regions, generate reports
for path, result in foldconf.batch_load("predictions/"):
    if isinstance(result, FoldConfidence):
        # Filter to high-confidence regions only
        filtered = result.filter(plddt_min=70)

        # Export the filtered structure
        filtered.export(f"filtered/{path.stem}_filtered.cif")

        # Generate a report
        report = foldconf.report(filtered)
        report.save(f"reports/{path.stem}_report.html")

        print(f"{path.stem}: {filtered.mean_plddt():.1f} pLDDT after filtering")
    else:
        print(f"{path}: failed — {result.error}")
```

```bash
#!/bin/bash
# Gate a pipeline on AlphaFold 3 output quality
foldconf check prediction_dir/ --plddt-min 70 || {
    echo "Prediction does not meet minimum confidence"
    foldconf summary prediction_dir/ --json | jq '.warnings'
    exit 1
}

# If we get here, the prediction is reliable enough to use
echo "Prediction passed reliability check"

# Export for downstream use
foldconf filter prediction_dir/ --plddt-min 70 -o output.cif
```

On parse failure, adapters raise typed exceptions:
- `FormatNotRecognizedError` — auto-detection found no matching adapter
- `IncompleteOutputError` — signature files present but required files missing
- `DataConsistencyError` — pLDDT length doesn't match residue count, PAE matrix not square, etc.
```python
import foldconf
from foldconf.errors import FormatNotRecognizedError, DataConsistencyError

try:
    conf = foldconf.load("unknown_format/")
except FormatNotRecognizedError as e:
    print(f"Could not detect prediction tool: {e}")
except DataConsistencyError as e:
    print(f"Confidence data is malformed: {e}")
```
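The consistency checks behind this kind of error are easy to sketch: validate array shapes against the residue count before trusting the data. This is an illustration, not foldconf's internal code:

```python
import numpy as np

def check_consistency(plddt, pae, n_residues):
    """Raise ValueError when confidence arrays disagree with the structure."""
    if len(plddt) != n_residues:
        raise ValueError(f"pLDDT length {len(plddt)} != residue count {n_residues}")
    if pae.shape != (n_residues, n_residues):
        raise ValueError(f"PAE shape {pae.shape} is not square over {n_residues} residues")

plddt = np.full(4, 80.0)
pae = np.zeros((4, 4))
check_consistency(plddt, pae, 4)  # passes silently

try:
    check_consistency(plddt, np.zeros((4, 3)), 4)  # malformed PAE
except ValueError as e:
    print(e)
```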
Predictions that load successfully but have reliability concerns generate warnings:

```python
import foldconf

conf = foldconf.load("prediction/")
for warning in conf.warnings:
    print(f"[{warning.severity}] {warning.code}: {warning.message}")
    print(f"Details: {warning.details}")
```

Common warning codes:
- `low_iptm` — interface confidence is low (< 0.6)
- `low_ptm` — overall fold confidence is low (< 0.5)
- `disordered_region` — contiguous region of ≥ 20 residues with pLDDT < 50
- `low_plddt_chain` — entire chain has mean pLDDT < 60
- `high_clash` — AF3 ranking_score penalized for clashes
- `single_chain` — no interface metrics available for a single-chain prediction
Suppress warnings:

```python
conf = foldconf.load("prediction/", suppress_warnings=["single_chain"])
conf = foldconf.load("prediction/", suppress_warnings="all")
```

batch_load never aborts on a single failure. Failures are returned as LoadError objects:
```python
from foldconf.models import FoldConfidence

for path, result in foldconf.batch_load("predictions/"):
    if isinstance(result, FoldConfidence):
        print(f"{path}: success")
    else:
        # result is a LoadError with .path and .error attributes
        print(f"{path}: {result.error}")
```

batch_report logs all failures to failures.json:
```json
[
  {
    "path": "predictions/broken_pred/",
    "error_type": "FormatNotRecognizedError",
    "error_message": "No matching adapter found for this directory"
  }
]
```

We welcome contributions! The project uses pytest for testing.
Run tests:
pytest tests/
pytest tests/ -v --cov=foldconfCode should follow PEP 8. All public functions need docstrings.
To add support for a new prediction tool:

1. Create an adapter in `foldconf/adapters/new_tool.py` implementing the `Adapter` protocol
2. Add detection logic to `foldconf/adapters/detect.py`
3. Add test fixtures and tests in `tests/`
4. Update this README
MIT License. See LICENSE file for details.
If you use foldconf in your research, please cite:
```bibtex
@software{foldconf2026,
  title  = {foldconf: Confidence-aware tools for protein structure predictions},
  author = {...},
  year   = {2026},
  url    = {https://github.com/...}
}
```

0.1.0 (2026-03-15)