A plan-agnostic library for generating PostgreSQL pg_hint_plan hints from any query optimizer's execution plans.
This project provides a generic, reusable hint generation engine that can convert query plans from any optimizer into PostgreSQL hints. The core hint-engine library is completely independent of any specific query engine—it works with logical plans, physical plans, or any other plan representation through a simple visitor interface.
The DataFusion implementations are reference examples, demonstrating how to integrate the hint engine with a real query optimizer. The same pattern can be applied to other systems like Apache Calcite, Spark, commercial databases, or custom query planners.
Build a universal hint generation library that:
- Works with any query plan format (logical, physical, custom)
- Generates PostgreSQL pg_hint_plan hints for join order, algorithms, and scan methods
- Requires only a simple visitor implementation to support new query engines
- Provides a mostly complete reference implementation with DataFusion
┌─────────────────────────────────────────────────────────────┐
│ Your Query Engine │
│ (DataFusion, Calcite, Spark, etc.) │
└──────────────────────────┬──────────────────────────────────┘
│
│ Implement PlanVisitor trait
▼
┌─────────────────────────────────────────────────────────────┐
│ hint-engine (Core) │
│ Generic, plan-agnostic hint generation │
│ Input: Vec<PlanNodeMetadata> (post-order traversal) │
│ Output: PostgreSQL pg_hint_plan hints │
└──────────────────────────┬──────────────────────────────────┘
│
▼
/*+ Leading(t1 t2 t3)
HashJoin(t1 t2) */
generic-hint-engine/
├── hint-engine/ # ⭐ Core library (plan-agnostic)
│ ├── src/
│ │ └── datafusion_visitor/ # Optional DataFusion visitors (feature: datafusion-visitor)
│ └── README.md
├── auto_explain_rs/ # PostgreSQL auto-explain log processor
│ └── src/plan_visitor.rs # PostgreSQL plan visitor implementation
└── df-autohint-runner/ # Example: CLI tool for DataFusion benchmarks
└── datafusion-logical-runner/
hint-engine ⭐ - The main library. Completely generic and plan-agnostic. Converts standardized plan metadata into PostgreSQL hints. This is the reusable core that works with any query engine.
- Optional datafusion-visitor feature provides reference DataFusion implementations
- Optional hint-table feature enables PostgreSQL hint table integration
auto_explain_rs - PostgreSQL auto-explain log processor with a PostgresPlanVisitor that converts PostgreSQL execution plans to hints.
datafusion-logical-runner - Example CLI tool that integrates the hint engine with DataFusion for TPC-H/JOB benchmarking. Uses the datafusion-visitor feature.
Hints are derived from plan node metadata, not computed by the engine. As the visitor implementor, you are responsible for:
- Setting the appropriate ScanMethod on LeafNode (e.g., SeqScan, Index("idx_name"))
- Setting the appropriate JoinAlgorithm on join metadata (e.g., HashJoin, NestedLoopJoin)
- Providing cardinality estimates, parallel hints, and memoization settings
The engine's role is to:
- Build the join tree structure from your metadata
- Traverse the tree and generate hint text based on the metadata you provided
- Apply configuration filters (which hint types to include)
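The engine steps above can be sketched in a few dozen lines. This is a minimal illustration, not the crate's implementation: the type definitions are simplified, hypothetical stand-ins that borrow the names used in this README (PlanNodeMetadata, ScanMethod, JoinAlgorithm), while HintConfig, JoinTree, and every function name are assumptions made for the example. It rebuilds the join tree from a post-order list with a stack, then emits only the hint types the configuration allows.

```rust
// Simplified, hypothetical stand-ins for the crate's metadata types.
// The real hint-engine definitions carry more fields and may differ.
#[derive(Debug, Clone, PartialEq)]
enum ScanMethod { SeqScan, Index(String) }

#[derive(Debug, Clone, Copy, PartialEq)]
enum JoinAlgorithm { HashJoin, NestedLoopJoin }

#[derive(Debug, Clone, PartialEq)]
enum PlanNodeMetadata {
    Leaf { alias: String, scan_method: ScanMethod },
    Join { algorithm: JoinAlgorithm },
}

// Join tree rebuilt from the post-order list (hypothetical type).
#[derive(Debug)]
enum JoinTree {
    Leaf { alias: String, scan_method: ScanMethod },
    Join { algorithm: JoinAlgorithm, left: Box<JoinTree>, right: Box<JoinTree> },
}

// Hypothetical configuration: which hint types to include.
struct HintConfig { leading: bool, join_methods: bool, scan_methods: bool }

// Post-order input: a join node consumes the two subtrees pushed before it.
fn build_tree(nodes: &[PlanNodeMetadata]) -> Option<JoinTree> {
    let mut stack: Vec<JoinTree> = Vec::new();
    for node in nodes {
        match node {
            PlanNodeMetadata::Leaf { alias, scan_method } => stack.push(JoinTree::Leaf {
                alias: alias.clone(),
                scan_method: scan_method.clone(),
            }),
            PlanNodeMetadata::Join { algorithm } => {
                let right = Box::new(stack.pop()?);
                let left = Box::new(stack.pop()?);
                stack.push(JoinTree::Join { algorithm: *algorithm, left, right });
            }
        }
    }
    let root = stack.pop()?;
    if stack.is_empty() { Some(root) } else { None }
}

// All table aliases under a subtree, left to right.
fn aliases(tree: &JoinTree) -> Vec<String> {
    match tree {
        JoinTree::Leaf { alias, .. } => vec![alias.clone()],
        JoinTree::Join { left, right, .. } => {
            let mut all = aliases(left);
            all.extend(aliases(right));
            all
        }
    }
}

// Nested-pair form accepted by the Leading hint, e.g. "((t1 t2) t3)".
fn leading_expr(tree: &JoinTree) -> String {
    match tree {
        JoinTree::Leaf { alias, .. } => alias.clone(),
        JoinTree::Join { left, right, .. } => {
            format!("({} {})", leading_expr(left), leading_expr(right))
        }
    }
}

// Walk the tree bottom-up and format the metadata the visitor supplied.
fn collect_hints(tree: &JoinTree, cfg: &HintConfig, out: &mut Vec<String>) {
    match tree {
        JoinTree::Leaf { alias, scan_method } => {
            if cfg.scan_methods {
                out.push(match scan_method {
                    ScanMethod::SeqScan => format!("SeqScan({alias})"),
                    ScanMethod::Index(idx) => format!("IndexScan({alias} {idx})"),
                });
            }
        }
        JoinTree::Join { algorithm, left, right } => {
            collect_hints(left, cfg, out);
            collect_hints(right, cfg, out);
            if cfg.join_methods {
                let keyword = match algorithm {
                    JoinAlgorithm::HashJoin => "HashJoin",
                    JoinAlgorithm::NestedLoopJoin => "NestLoop",
                };
                out.push(format!("{keyword}({})", aliases(tree).join(" ")));
            }
        }
    }
}

// Returns None for a malformed post-order list, an empty string if no hints apply.
fn generate_hints(nodes: &[PlanNodeMetadata], cfg: &HintConfig) -> Option<String> {
    let tree = build_tree(nodes)?;
    let mut hints = Vec::new();
    if cfg.leading && matches!(tree, JoinTree::Join { .. }) {
        hints.push(format!("Leading({})", leading_expr(&tree)));
    }
    collect_hints(&tree, cfg, &mut hints);
    if hints.is_empty() { Some(String::new()) } else { Some(format!("/*+ {} */", hints.join(" "))) }
}
```

In this sketch, a post-order list of t1, t2, a hash join, t3, and a nested-loop join with only join_methods enabled yields `/*+ HashJoin(t1 t2) NestLoop(t1 t2 t3) */`. The engine never chooses an algorithm itself; it only formats what the metadata already states.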
[dependencies]
hint-engine = { path = "path/to/hint-engine" }
Implement the PlanVisitor trait and populate metadata with hint information from your plan:
- Set scan_method on LeafNode to control scan hints (SeqScan, Index, etc.)
- Set join_algorithm on join metadata to control join method hints (HashJoin, NestLoop, etc.)
- Provide optional fields for cardinality, parallel hints, and memoization
See hint-engine/src/datafusion_visitor/logical.rs or hint-engine/src/datafusion_visitor/execution.rs for DataFusion implementations (enabled with datafusion-visitor feature).
See auto_explain_rs/src/plan_visitor.rs for a PostgreSQL plan visitor implementation.
Important: The visitor determines what hints are generated by setting metadata fields. The engine simply reads this metadata and formats it as PostgreSQL hints.
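To make the visitor's responsibility concrete, here is a rough sketch for a toy plan type. Everything in it is an assumption made for illustration: the PlanVisitor trait shown is a hypothetical one-method shape rather than the crate's real signature, the metadata types are cut-down stand-ins, and ToyPlan and ToyVisitor do not exist in this repository. For the actual trait, consult the referenced visitor source files.

```rust
// Cut-down, hypothetical stand-ins for the crate's metadata types.
#[derive(Debug, Clone, PartialEq)]
enum ScanMethod { SeqScan, Index(String) }

#[derive(Debug, Clone, Copy, PartialEq)]
enum JoinAlgorithm { HashJoin, NestedLoopJoin }

#[derive(Debug, Clone, PartialEq)]
enum PlanNodeMetadata {
    Leaf { alias: String, scan_method: ScanMethod },
    Join { algorithm: JoinAlgorithm },
}

// Hypothetical trait shape; the real PlanVisitor may look different.
trait PlanVisitor {
    type Plan;
    /// Flatten a plan into metadata in post-order (children before parent).
    fn visit(&self, plan: &Self::Plan) -> Vec<PlanNodeMetadata>;
}

// A toy engine-specific plan, standing in for a DataFusion or Calcite plan.
enum ToyPlan {
    Scan { alias: String, index: Option<String> },
    Join { left: Box<ToyPlan>, right: Box<ToyPlan>, has_equi_key: bool },
}

struct ToyVisitor;

impl ToyVisitor {
    fn walk(plan: &ToyPlan, out: &mut Vec<PlanNodeMetadata>) {
        match plan {
            ToyPlan::Scan { alias, index } => {
                // The visitor, not the engine, decides the scan method.
                let scan_method = match index {
                    Some(name) => ScanMethod::Index(name.clone()),
                    None => ScanMethod::SeqScan,
                };
                out.push(PlanNodeMetadata::Leaf { alias: alias.clone(), scan_method });
            }
            ToyPlan::Join { left, right, has_equi_key } => {
                Self::walk(left, out);
                Self::walk(right, out);
                // Illustrative policy only: hash join when an equi-join key exists.
                let algorithm = if *has_equi_key {
                    JoinAlgorithm::HashJoin
                } else {
                    JoinAlgorithm::NestedLoopJoin
                };
                out.push(PlanNodeMetadata::Join { algorithm });
            }
        }
    }
}

impl PlanVisitor for ToyVisitor {
    type Plan = ToyPlan;

    fn visit(&self, plan: &ToyPlan) -> Vec<PlanNodeMetadata> {
        let mut out = Vec::new();
        Self::walk(plan, &mut out);
        out
    }
}
```

Calling `ToyVisitor.visit(&plan)` on a join of two scans returns the two leaf entries followed by the join entry, which is the post-order layout the architecture diagram names as the engine's input.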
Configure the engine, generate hints, and combine with SQL. See the hint-engine crate documentation for detailed examples and API reference.
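As a small illustration of the last step, combining hints with SQL amounts to placing the hint block comment in front of the statement, where pg_hint_plan looks for it. The helper names below (render_hint_block, attach_hints) are hypothetical and not part of the hint-engine API; the crate may expose its own function for this.

```rust
/// Wrap individual hints in a pg_hint_plan block comment.
/// Hypothetical helper, not part of the hint-engine API.
fn render_hint_block(hints: &[&str]) -> String {
    if hints.is_empty() {
        String::new()
    } else {
        format!("/*+ {} */", hints.join(" "))
    }
}

/// Prepend the hint block to the SQL text; an empty block leaves the SQL unchanged.
fn attach_hints(hint_block: &str, sql: &str) -> String {
    let hint_block = hint_block.trim();
    if hint_block.is_empty() {
        return sql.to_string();
    }
    format!("{hint_block}\n{sql}")
}
```

For example, `attach_hints(&render_hint_block(&["Leading(t1 t2 t3)", "HashJoin(t1 t2)"]), "SELECT 1")` puts `/*+ Leading(t1 t2 t3) HashJoin(t1 t2) */` on the line above the query.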
The core hint-engine library supports most pg_hint_plan hint types. See the hints module documentation for full details.
See df-autohint-runner/ for instructions on running benchmarks.
- hint-engine/README.md - Detailed API documentation, architecture, and integration guide
- hint-engine datafusion_visitor - DataFusion visitor implementations
- datafusion-logical-runner/README.md - CLI tool usage
- df-autohint-runner/README.md - Benchmark setup
- auto_explain_rs/README.md - Auto-explain log processing and hint generation
- CLAUDE.md - Development guide
# Test core library
cargo test -p hint-engine
# Test DataFusion visitor implementations (requires datafusion-visitor feature)
cargo test -p hint-engine --features datafusion-visitor
# Test PostgreSQL integration
cargo test -p auto_explain_rs
# Test all
cargo test
- Rust 1.70+
- PostgreSQL 17 with pg_hint_plan (for running generated hints)
- ~10GB disk for TPC-H scale factor 1
- ~50GB disk for JOB dataset
- Cross-optimizer comparison - Compare join orders chosen by different optimizers
- Hint injection - Transfer optimization decisions between systems
- Query optimization research - Study the impact of different join orders and algorithms
- Performance tuning - Generate hints from a research optimizer for production databases
- Benchmark analysis - Understand how different optimizers handle standard benchmarks
PostgreSQL's pg_hint_plan extension is widely used and well-documented, making it an ideal target format. However, the core hint generation logic could be adapted to other hint formats (Oracle, SQL Server, etc.) by implementing different hint formatters.
- pg_hint_plan documentation
- DataFusion - Reference implementation
- TPC-H benchmark
- Join Order Benchmark (JOB)