Confidence-aware tools for protein structure predictions.
foldconf is a Python library and CLI tool that bridges the gap between structure prediction and downstream analysis. It parses output from AlphaFold 3, AlphaFold Database, OpenFold3, Boltz-2, and Chai-1, normalizes confidence metrics into a unified data model, and provides filtering, visualization, reporting, and export tools that propagate uncertainty through your analysis pipeline.
Unlike existing structure tools, foldconf treats confidence data as a first-class citizen. Instead of ignoring pLDDT, PAE, and pTM scores, it uses them to filter unreliable regions, detect interfaces with high confidence, identify structural disagreements between predictors, and generate reports that make prediction uncertainty explicit.
Requires Python 3.11+.
Basic install:
```bash
pip install foldconf
```

With optional 3D visualization (py3Dmol):

```bash
pip install 'foldconf[3d]'
```

With PDF report generation (weasyprint):

```bash
pip install 'foldconf[pdf]'
```

With all extras:

```bash
pip install 'foldconf[3d,pdf]'
```

Load and summarize a prediction:
```python
import foldconf

conf = foldconf.load("path/to/alphafold3_output/")
print(conf.summary())

# Also works with AlphaFold Database downloads
conf = foldconf.load("path/to/afdb_download/")
```

Output:

```text
Source: alphafold3 | Residues: 342 | Mean pLDDT: 82.4
Chains: A (245 res, 87.1), B (97 res, 70.6)
pTM: 0.71 | ipTM: 0.64
Low confidence regions: A:120-135, B:44-52
```
Filter to high-confidence regions and export:

```python
high_conf = conf.filter(plddt_min=70)
high_conf.export("filtered_high_confidence.cif")
```

Identify and analyze binding interfaces:
```python
interface = conf.select_interface(chain_a="A", chain_b="B", pae_max=5.0)
print(f"Interface has {interface.mean_plddt():.1f} mean pLDDT")
print(f"Interface residues: {len(interface.residues)}")
```

Generate an interactive report:
```python
report = foldconf.report(conf)
report.save("confidence_report.html")
```

Detect disagreement between two predictors:
```python
conf_af3 = foldconf.load("pred_alphafold3/")
conf_boltz = foldconf.load("pred_boltz2/")

diff = foldconf.diff(conf_af3, conf_boltz)
print(diff.summary())
# Shows regions where both predictors are confident but their structures
# disagree — often the most interesting predictions
foldconf.compare([conf_af3, conf_boltz], output="comparison.html")
```

From the CLI:
```bash
# Summarize a prediction
foldconf summary pred/ --json | jq '.mean_plddt'

# Filter and export
foldconf filter pred/ --plddt-min 70 -o filtered.cif

# Generate a report
foldconf report pred/ -o report.html

# Compare three predictions
foldconf compare pred_af3/ pred_boltz/ pred_openfold/ -o comparison.html

# Batch process a directory
foldconf batch predictions/ --output-dir reports/

# Check if a prediction is reliable enough to use
if foldconf check pred/ --plddt-min 70 2>/dev/null; then
    echo "Safe to use"
else
    echo "Check warnings"
fi
```

foldconf auto-detects the prediction tool by examining signature files. Explicit loaders are also available:
```python
import foldconf

# Auto-detect
conf = foldconf.load("path/")

# Explicit loaders
conf = foldconf.load_alphafold3("path/")
conf = foldconf.load_afdb("path/")
conf = foldconf.load_boltz2("path/")
conf = foldconf.load_openfold3("path/")
conf = foldconf.load_chai1("path/")

# Load all samples (for multi-sample predictions)
conf_set = foldconf.load_all("path/")
top = conf_set.top                      # highest-ranked sample
consensus = conf_set.consensus_plddt()  # mean pLDDT across samples
variance = conf_set.plddt_variance()    # per-residue variance
```
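The consensus helpers reduce a per-sample pLDDT matrix along the sample axis. As a standalone illustration of that reduction (plain NumPy; the array here is toy data, not foldconf output):

```python
import numpy as np

# Toy stand-in for a multi-sample prediction:
# rows are diffusion samples, columns are residues.
sample_plddt = np.array([
    [92.0, 88.0, 45.0, 71.0],
    [90.0, 85.0, 55.0, 69.0],
    [91.0, 86.0, 35.0, 73.0],
])

consensus = sample_plddt.mean(axis=0)  # per-residue mean across samples
variance = sample_plddt.var(axis=0)    # per-residue variance across samples

# High variance at residue 2 flags sample-to-sample disagreement
print(consensus)
print(variance)
```

High per-residue variance can be a more useful disagreement signal than a low mean alone, since samples may consistently agree that a region is unstructured.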
```python
# Filter by confidence threshold
high_conf = conf.filter(plddt_min=70)
very_high_conf = conf.filter(plddt_min=90)

# Select specific chains
chain_a = conf.filter(chains=["A"])

# Select a specific region by residue index
domain = conf.select(residues=range(50, 150))

# Select an interface using PAE
interface = conf.select_interface(
    chain_a="A",
    chain_b="B",
    pae_max=5.0,           # PAE threshold (Angstroms)
    contact_prob_min=0.5,  # contact probability threshold (if available)
)

# All filter operations return a new FoldConfidence object (immutable, chainable)
high_conf_a = conf.filter(plddt_min=70).filter(chains=["A"])
```
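PAE-based interface selection can be pictured as thresholding the cross-chain block of the PAE matrix: a residue joins the interface when its best PAE against the other chain beats the cutoff. A minimal NumPy sketch with toy data (illustrative only, not foldconf's actual implementation):

```python
import numpy as np

# Toy 6-residue complex: residues 0-3 form chain A, residues 4-5 form chain B
chain_ids = np.array(["A", "A", "A", "A", "B", "B"])

# Toy PAE matrix: everything uncertain (20 A) except a few cross-chain pairs
pae = np.full((6, 6), 20.0)
for i, j in [(2, 4), (3, 4)]:
    pae[i, j] = pae[j, i] = 3.0

mask_a = chain_ids == "A"
mask_b = chain_ids == "B"
cross = pae[np.ix_(mask_a, mask_b)]  # A rows x B columns

pae_max = 5.0
a_iface = np.flatnonzero(mask_a)[cross.min(axis=1) < pae_max]
b_iface = np.flatnonzero(mask_b)[cross.min(axis=0) < pae_max]

print(a_iface.tolist(), b_iface.tolist())  # [2, 3] [4]
```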
```python
# Global statistics
mean_plddt = conf.mean_plddt()

# Per-chain statistics
mean_plddt_a = conf.mean_plddt(chain="A")

# Find low-confidence regions
low_regions = conf.low_confidence_regions(threshold=50)
# Returns: [(start, end), ...]

# Quick reliability check
is_reliable = conf.is_reliable(plddt_min=70, ptm_min=0.5)  # returns bool

# Confidence classification
classification = conf.classify()
# classification.labels — per-residue categories (very_high, confident, low, very_low)
# classification.distribution — count per category
# classification.fraction_reliable — fraction above the "confident" threshold

# Pre-computed summaries
for chain_id, summary in conf.chain_summaries.items():
    print(f"{chain_id}: {summary.mean_plddt:.1f} pLDDT")

for (chain_a, chain_b), summary in conf.interface_summaries.items():
    print(f"{chain_a}-{chain_b} interface: PAE {summary.mean_interface_pae:.2f}")
```
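Low-confidence region detection reduces to run-length logic: scan the pLDDT array and collect contiguous stretches below the threshold. A self-contained sketch of that logic over a toy array (foldconf's own implementation may differ):

```python
import numpy as np

def runs_below(plddt, threshold):
    """Return (start, end) index pairs (inclusive) of contiguous runs below threshold."""
    regions, start = [], None
    for i, value in enumerate(plddt):
        if value < threshold and start is None:
            start = i
        elif value >= threshold and start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:  # close a run that reaches the end of the array
        regions.append((start, len(plddt) - 1))
    return regions

plddt = np.array([90, 85, 40, 35, 88, 92, 45, 91])
print(runs_below(plddt, 50))  # [(2, 3), (6, 6)]
```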
```python
import foldconf.plot as plot

# Line plot: pLDDT per residue, colored by confidence band
plot.plddt_line(conf, save_to="plddt.png")

# Heatmap: PAE matrix with chain boundaries
plot.pae_heatmap(conf, save_to="pae.png")

# Histogram: distribution of pLDDT scores
plot.confidence_histogram(conf, save_to="hist.png")

# Overlay: compare pLDDT curves from multiple predictions
plot.compare([conf1, conf2, conf3], save_to="comparison.png")

# Interactive 3D (requires foldconf[3d])
import foldconf.view as view
view.structure(conf)                      # opens browser
view.structure(conf, highlight_low=True)  # highlights unreliable regions
```
```python
# Generate a reliability report
report = foldconf.report(conf)
report.save("report.html")  # self-contained HTML with plots + tables
report.save("report.json")  # machine-readable
report.save("report.pdf")   # requires foldconf[pdf]

# Export structure with confidence embedded
conf.export("structure.cif")  # mmCIF with provenance
conf.export("structure.pdb")  # PDB with pLDDT as B-factor
conf.filter(plddt_min=70).export("filtered.cif")

# Export with pLDDT explicitly in the B-factor column
conf.export("structure_bfactor.pdb", bfactor="plddt")
```

Exported mmCIF files include custom categories that document:
- Source tool and version
- Global scores (pTM, ipTM, ranking_score)
- Filter history (e.g., `plddt_min=70, chains=["A"]`)
- Warnings raised during load
- Confidence band distribution
This means collaborators can see exactly what was done to produce a filtered structure without needing foldconf installed.
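For illustration, the provenance block in an exported file might look roughly like this; the category and item names below are hypothetical, invented for this sketch rather than taken from foldconf's documented schema:

```text
# hypothetical provenance category (illustrative names only)
_foldconf_provenance.source_tool     alphafold3
_foldconf_provenance.ptm             0.71
_foldconf_provenance.iptm            0.64
_foldconf_provenance.filter_history  'plddt_min=70; chains=A'
_foldconf_provenance.warnings        'low_iptm'
```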
```python
import foldconf
from foldconf.models import FoldConfidence

# Iterate over prediction directories without holding all in memory
for path, result in foldconf.batch_load("predictions/"):
    if isinstance(result, FoldConfidence):
        print(f"{path}: {result.mean_plddt():.1f} pLDDT")
    else:
        # result is a LoadError wrapping the exception
        print(f"{path}: failed — {result.error}")

# Generate reports for all predictions
foldconf.batch_report(
    foldconf.batch_load("predictions/"),
    output_dir="reports/",
)
# Generates report.html, report.json (or .pdf) per prediction
# Also produces failures.json with error details
```
```python
# Compare two predictions of the same target
diff = foldconf.diff(conf_af3, conf_boltz2)

# Per-residue structural difference (RMSD after alignment)
rmsd = diff.rmsd_per_residue

# Per-residue confidence agreement
confidence_diff = diff.confidence_agreement

# Interesting regions: both confident but structures disagree.
# These flag genuinely ambiguous biology rather than prediction noise.
interesting = diff.interesting_regions()
for start, end in interesting:
    print(f"residues {start}-{end}: disagreement despite high confidence")

# Summary statistics
print(diff.summary())

# Generate comparison HTML report
foldconf.compare([conf_af3, conf_boltz2, conf_openfold], output="comparison.html")
```
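The idea behind interesting-region detection can be sketched independently: mark residues where both predictions clear a confidence bar yet the aligned structures diverge, then merge the marks into spans. The thresholds here (pLDDT ≥ 70, RMSD > 2 A) are illustrative, not foldconf's documented defaults:

```python
import numpy as np

plddt_a = np.array([92, 90, 88, 91, 55, 93])
plddt_b = np.array([90, 89, 91, 90, 60, 92])
rmsd = np.array([0.4, 2.5, 3.1, 0.5, 4.0, 0.3])  # per-residue RMSD after alignment

both_confident = (plddt_a >= 70) & (plddt_b >= 70)
disagree = rmsd > 2.0
# Residue 4 also disagrees, but low confidence filters it out as prediction noise
mask = both_confident & disagree

# Merge the boolean mask into contiguous (start, end) spans
spans, start = [], None
for i, flag in enumerate(mask):
    if flag and start is None:
        start = i
    elif not flag and start is not None:
        spans.append((start, i - 1))
        start = None
if start is not None:
    spans.append((start, len(mask) - 1))

print(spans)  # [(1, 2)]
```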
The core object is `FoldConfidence`, which normalizes confidence data from different prediction tools:

```python
@dataclass
class FoldConfidence:
    # Identity
    source: str     # "alphafold3", "openfold3", "boltz2", "chai1"
    metadata: dict  # tool version, input params, etc.

    # Per-residue confidence
    plddt: np.ndarray  # 0-100, normalized across all sources

    # Pairwise confidence
    pae: np.ndarray            # predicted aligned error (Angstroms)
    contact_probs: np.ndarray  # contact probability (if available)

    # Global scores
    ptm: float            # predicted TM-score
    iptm: float           # interface pTM
    ranking_score: float  # AF3 composite score (if available)

    # Boltz-2 specific
    pde: np.ndarray  # predicted distance error
    affinity: float  # binding affinity (log IC50, µM)

    # Extended scores (tool-specific, preserved as-is)
    extended_scores: dict  # e.g., ligand_iptm, protein_iptm, etc.

    # Structure
    structure: gemmi.Structure  # atomic coordinates
    residues: list[Residue]     # sequence metadata
    chains: list[Chain]         # chain groupings

    # Pre-computed summaries
    chain_summaries: dict[str, ChainSummary]
    interface_summaries: dict[tuple[str, str], InterfaceSummary]

    # Warnings and provenance
    warnings: list[ConfidenceWarning]
    original_indices: np.ndarray  # maps filtered indices back to the source
```

All library operations are available from the command line. Every command supports `--json` for piping to jq.
Print a text summary of a prediction:
```bash
foldconf summary path/to/prediction/
foldconf summary path/to/prediction/ --json
```

Show the confidence band distribution (very_high, confident, low, very_low):

```bash
foldconf classify path/to/prediction/
foldconf classify path/to/prediction/ --json
```

Filter to high-confidence residues and export:

```bash
foldconf filter path/to/prediction/ --plddt-min 70 -o filtered.cif
foldconf filter path/to/prediction/ --plddt-min 90 --chain A -o binding_site.pdb
```

Options: `--plddt-min`, `--chain`, `-o`/`--output`
Generate static plots (matplotlib):
```bash
foldconf plot plddt path/to/prediction/ -o plddt.png
foldconf plot pae path/to/prediction/ -o pae.png
foldconf plot histogram path/to/prediction/ -o hist.png
```

Generate an interactive reliability report:

```bash
foldconf report path/to/prediction/ -o report.html
foldconf report path/to/prediction/ -o report.json
foldconf report path/to/prediction/ -o report.pdf  # requires foldconf[pdf]
```

Batch process a directory of predictions:
```bash
foldconf batch predictions/ --output-dir reports/
foldconf batch predictions/ --output-dir reports/ --verbose
```

Generates report.html (or .json/.pdf) per prediction, plus failures.json with errors.

Compare multiple predictions side-by-side:

```bash
foldconf compare pred_af3/ pred_boltz2/ pred_openfold/ -o comparison.html
```

Structural diff between two predictions (finds regions where both are confident but disagree):
```bash
foldconf diff pred_af3/ pred_boltz2/ -o diff_report.html
foldconf diff pred_af3/ pred_boltz2/ --json | jq '.interesting_regions'
```

Quick check: exit 0 if reliable, 1 if not. Useful in CI pipelines:

```bash
if foldconf check pred/ --plddt-min 70 2>/dev/null; then
    echo "Prediction is suitable for downstream use"
else
    echo "Check confidence metrics"
fi
```

Interactive 3D visualization (opens a browser):

```bash
foldconf view path/to/prediction/
```

Global options:

- `--json` — machine-readable JSON output (works with all commands)
- `--no-warnings` — suppress warning output to stderr
- `-o`, `--output` — write output to file instead of stdout
Pipe examples:

```bash
# Get mean pLDDT via JSON
foldconf summary pred/ --json | jq '.mean_plddt'

# Find all chains with mean pLDDT < 60
foldconf summary pred/ --json | jq '.chain_summaries[] | select(.mean_plddt < 60)'

# Batch: filter predictions by confidence
foldconf batch pred/ --json | jq 'select(.mean_plddt > 80)'

# Diff: extract interesting regions
foldconf diff af3/ boltz2/ --json | jq '.interesting_regions'

# Filter, then read the retained-residue count from the JSON summary
foldconf filter pred/ --plddt-min 70 -o filtered.cif --json | jq '.num_retained_residues'
```

Supported prediction tools:

- AlphaFold 3 — full support (pLDDT, PAE, pTM, ipTM, ranking_score)
- AlphaFold Database — full support (pLDDT, PAE). Reads native AFDB format directly — no conversion needed for files downloaded from alphafold.ebi.ac.uk
- OpenFold3 — full support (pLDDT, PAE, pTM, ipTM)
- Boltz-2 — full support (pLDDT, PAE, pTM, ipTM, pDE, binding affinity)
- Chai-1 — stub loader; format support to be determined
Auto-detection examines signature files to identify the source tool without user input; AFDB downloads from alphafold.ebi.ac.uk are detected the same way.
All metrics are normalized for consistency across sources:
- pLDDT — per-residue confidence (0-100, higher is better)
- PAE — predicted aligned error between residue pairs (Angstroms, lower is better)
- pTM — predicted TM-score (0-1, estimated accuracy of the overall fold)
- ipTM — interface pTM (0-1, confidence in interface modeling)
- pDE — predicted distance error (Boltz-2 only)
- Affinity — predicted binding affinity as log(IC50 in µM) (Boltz-2 only)
- Contact probability — cross-chain contact likelihood (if available)
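Normalization matters because predictors disagree on scales; for example, some tools report pLDDT on a 0-1 scale rather than 0-100 (which tool uses which is an implementation detail not covered here). A common heuristic is to rescale whenever the maximum value is at most 1:

```python
import numpy as np

def normalize_plddt(plddt):
    """Return pLDDT on the 0-100 scale, rescaling 0-1 inputs."""
    plddt = np.asarray(plddt, dtype=float)
    if plddt.size and plddt.max() <= 1.0:
        plddt = plddt * 100.0  # assume the source reported fractions
    return plddt

print(normalize_plddt([0.92, 0.45]))  # rescaled to the 0-100 convention
print(normalize_plddt([92.0, 45.0]))  # already 0-100, returned unchanged
```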
- Threshold filtering — retain residues above a pLDDT minimum
- Chain filtering — select specific chains
- Region selection — select by residue index range
- Interface detection — smart selection using PAE and contact probabilities
- Immutable operations — filter returns new FoldConfidence, original unchanged
- pLDDT line plot — per-residue confidence with AF3-standard color bands
- PAE heatmap — pairwise confidence matrix with chain boundaries
- Confidence histogram — distribution of pLDDT scores
- Multi-prediction overlay — compare pLDDT curves
- Interactive 3D — py3Dmol integration for browser-based visualization
- HTML reports — self-contained, viewable anywhere, includes plots and tables
- JSON reports — machine-readable, suitable for pipelines
- PDF reports — optional, requires weasyprint
- CIF export with provenance — custom categories document source tool, filter history, warnings
- Confidence classification — categorize residues into very_high/confident/low/very_low
- Low-confidence region detection — identify contiguous regions below threshold
- Interface summaries — mean PAE and pLDDT per chain-pair interface
- Structural diff — find regions where two predictions disagree despite high confidence
- Consensus across samples — mean/variance across multiple diffusion samples
- Streaming — process large prediction sets without holding all in memory
- Error handling — capture and report failures per prediction
- Batch reporting — generate reports for all predictions with a single call
- Structured logging — per-directory load time, counts, and error details
```python
import sys

import foldconf

# Load prediction
conf = foldconf.load("protein_ligand_complex/")

# Check global reliability
if not conf.is_reliable(plddt_min=70, ptm_min=0.5):
    print("Prediction confidence is too low for binding analysis")
    sys.exit(1)

# Extract binding interface
interface = conf.select_interface(chain_a="protein", chain_b="ligand", pae_max=5.0)

# Export for downstream docking
interface.export("validated_interface.pdb", bfactor="plddt")
```
```python
import foldconf

conf_af3 = foldconf.load("af3_prediction/")
conf_boltz = foldconf.load("boltz2_prediction/")

# Structural and confidence comparison
diff = foldconf.diff(conf_af3, conf_boltz)

# Regions where BOTH tools are confident but disagree,
# often flagging genuine biological uncertainty
interesting = diff.interesting_regions()
print(f"Found {len(interesting)} interesting regions:")
for start, end in interesting:
    print(f"  residues {start}-{end}")

# Generate visual comparison
foldconf.compare([conf_af3, conf_boltz], output="comparison.html")
```
```python
import foldconf
from foldconf.models import FoldConfidence

# Process all predictions, filter high-confidence regions, generate reports
for path, result in foldconf.batch_load("predictions/"):
    if isinstance(result, FoldConfidence):
        # Filter to high-confidence regions only
        filtered = result.filter(plddt_min=70)

        # Export the filtered structure
        filtered.export(f"filtered/{path.stem}_filtered.cif")

        # Generate a report
        report = foldconf.report(filtered)
        report.save(f"reports/{path.stem}_report.html")

        print(f"{path.stem}: {filtered.mean_plddt():.1f} pLDDT after filtering")
    else:
        print(f"{path}: failed — {result.error}")
```

```bash
#!/bin/bash
# Gate a pipeline on AlphaFold 3 output quality
foldconf check prediction_dir/ --plddt-min 70 || {
    echo "Prediction does not meet minimum confidence"
    foldconf summary prediction_dir/ --json | jq '.warnings'
    exit 1
}

# If we get here, the prediction is reliable enough to use
echo "Prediction passed reliability check"

# Export for downstream use
foldconf filter prediction_dir/ --plddt-min 70 -o output.cif
```

On parse failure, adapters raise typed exceptions:
- `FormatNotRecognizedError` — auto-detection found no matching adapter
- `IncompleteOutputError` — signature files present but required files missing
- `DataConsistencyError` — pLDDT length doesn't match residue count, PAE matrix not square, etc.
```python
import foldconf
from foldconf.errors import FormatNotRecognizedError, DataConsistencyError

try:
    conf = foldconf.load("unknown_format/")
except FormatNotRecognizedError as e:
    print(f"Could not detect prediction tool: {e}")
except DataConsistencyError as e:
    print(f"Confidence data is malformed: {e}")
```
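The consistency checks behind this kind of error are easy to sketch: validate array shapes against the residue count before trusting the data. This is an illustration, not foldconf's internal code:

```python
import numpy as np

def check_consistency(plddt, pae, n_residues):
    """Raise ValueError when confidence arrays disagree with the structure."""
    if len(plddt) != n_residues:
        raise ValueError(f"pLDDT length {len(plddt)} != residue count {n_residues}")
    if pae.shape != (n_residues, n_residues):
        raise ValueError(f"PAE shape {pae.shape} is not square over {n_residues} residues")

plddt = np.full(4, 80.0)
pae = np.zeros((4, 4))
check_consistency(plddt, pae, 4)  # passes silently

try:
    check_consistency(plddt, np.zeros((4, 3)), 4)  # malformed PAE
except ValueError as e:
    print(e)
```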
Predictions that load successfully but have reliability concerns generate warnings:

```python
import foldconf

conf = foldconf.load("prediction/")
for warning in conf.warnings:
    print(f"[{warning.severity}] {warning.code}: {warning.message}")
    print(f"Details: {warning.details}")
```

Common warning codes:
- `low_iptm` — interface confidence is low (< 0.6)
- `low_ptm` — overall fold confidence is low (< 0.5)
- `disordered_region` — contiguous region of ≥ 20 residues with pLDDT < 50
- `low_plddt_chain` — entire chain has mean pLDDT < 60
- `high_clash` — AF3 ranking_score penalized for clashes
- `single_chain` — no interface metrics available for a single-chain prediction
Suppress warnings:

```python
conf = foldconf.load("prediction/", suppress_warnings=["single_chain"])
conf = foldconf.load("prediction/", suppress_warnings="all")
```

batch_load never aborts on a single failure. Failures are returned as LoadError objects:
```python
from foldconf.models import FoldConfidence

for path, result in foldconf.batch_load("predictions/"):
    if isinstance(result, FoldConfidence):
        print(f"{path}: success")
    else:
        # result is a LoadError with .path and .error attributes
        print(f"{path}: {result.error}")
```

batch_report logs all failures to failures.json:
```json
[
  {
    "path": "predictions/broken_pred/",
    "error_type": "FormatNotRecognizedError",
    "error_message": "No matching adapter found for this directory"
  }
]
```

We welcome contributions! The project uses pytest for testing.
Run tests:
pytest tests/
pytest tests/ -v --cov=foldconfCode should follow PEP 8. All public functions need docstrings.
To add support for a new prediction tool:

1. Create an adapter in `foldconf/adapters/new_tool.py` implementing the `Adapter` protocol
2. Add detection logic to `foldconf/adapters/detect.py`
3. Add test fixtures and tests in `tests/`
4. Update this README
MIT License. See LICENSE file for details.
If you use foldconf in your research, please cite:
```bibtex
@software{foldconf2026,
  title  = {foldconf: Confidence-aware tools for protein structure predictions},
  author = {...},
  year   = {2026},
  url    = {https://github.com/...}
}
```

0.1.0 (2026-03-15)