chore(compile-time): add opt-in compile-time SQL validation (query_as!, derive(Table), VS Code squigglies)#93

Merged
StefanSteiner merged 17 commits into tableau:main from StefanSteiner:compile-time-sql-validator-milestone-a
Jun 1, 2026
@StefanSteiner StefanSteiner commented Jun 1, 2026

Summary

Adds opt-in, sqlx-style compile-time SQL validation to hyperdb-api. When the compile-time cargo feature is enabled on hyperdb-api-derive, query_as!(User, "SELECT id, name FROM users") validates the SQL against registered struct schemas at cargo build time — catching dropped/renamed columns, typos, and missing tables before they reach runtime.

The existing runtime path (Connection::fetch_all_as, #[derive(FromRow)]) is unchanged. Compile-time checking is strictly additive behind an opt-in cargo feature.

How it works

// Cargo.toml:
// hyperdb-api-derive = { version = "...", features = ["compile-time"] }

#[derive(Debug, FromRow, Table)]
#[hyperdb(table = "users", register)]   // ← registers schema at build time
struct User {
    #[hyperdb(primary_key)]
    id: i64,
    name: String,
    email: Option<String>,
}

// Validated at cargo build time when compile-time feature is on:
let users: Vec<User> = query_as!(User, "SELECT id, name, email FROM users")
    .fetch_all(&conn)?;

// Struct field missing from the SELECT list → compile error:
let _ = query_as!(User, "SELECT id, email FROM users");
// error: `User` requires column "name" but the query does not project it

// Non-existent column → compile error:
let _ = query_as!(User, "SELECT id, emai1, name FROM users");
// error: column "emai1" does not exist on any table in the query;
//        check for a typo or a renamed/dropped column

What's included

New crate: hyperdb-compile-check (standalone, not a workspace member — avoids a Cargo dep cycle)

  • CompileTimeDb: one HyperProcess + one Connection behind OnceLock<parking_lot::Mutex>, shared across all macro expansions in a crate compilation
  • registry: dual-indexed by Rust struct ident + SQL table name; populated at macro expansion time (not in user binary)
  • dry_run: wraps SQL as WITH __hdb_q AS (<sql>) SELECT * FROM __hdb_q LIMIT 0, drives next_chunk() once (lazy TCP execution), returns ResultSchema
  • error_extract: SQLSTATE-based classification (42P01/42703/42601) — no sqlparser dependency
  • validate_query_as / validate_scalar_sql: bounded seed-and-retry loop (up to 8 rounds) handles multi-table JOINs where all tables need first-seeding
  • Structured ValidationError variants with actionable messages for each error class (missing struct registration, unknown column, missing columns, unregistered table, syntax error)
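
The dry-run wrapping described in the list above can be sketched as a small string helper. This is a minimal sketch, not the crate's actual code: the function name `wrap_for_dry_run` and the trailing-semicolon handling are assumptions; only the `WITH __hdb_q AS (...) SELECT * FROM __hdb_q LIMIT 0` shape comes from this PR.

```rust
/// Wrap a user query in a CTE and select zero rows from it, so the server
/// has to plan the query and report its result columns without returning data.
/// (Hypothetical helper; the name and semicolon handling are assumptions.)
fn wrap_for_dry_run(sql: &str) -> String {
    // Strip surrounding whitespace and any trailing semicolons so the query
    // can sit inside the CTE's parentheses.
    let inner = sql.trim().trim_end_matches(';').trim_end();
    format!("WITH __hdb_q AS ({inner}) SELECT * FROM __hdb_q LIMIT 0")
}
```

Because the outer query is `LIMIT 0`, a successful dry run yields only the result schema, which is what the column-name comparison needs.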

hyperdb-api-derive additions

  • #[derive(Table)]: generates impl Table { const NAME; const CREATE_SQL } from field types; with #[hyperdb(register)] + compile-time feature, registers the struct in the compile-check registry at expansion time
  • query_as!(T, "sql" [, args…]): returns QueryAs<T> builder; validates SQL at build time when feature enabled
  • query_scalar!(T, "sql" [, args…]): returns QueryScalar<T> builder for single-column queries; validates at build time
  • #[hyperdb(primary_key)] on fields: now silently ignored by FromRow (previously caused an error when combining derive(FromRow) and derive(Table))
  • Fully rewritten README.md documenting all macros, compile-time feature, supported types, and VS Code setup

hyperdb-api additions

  • Table trait: const NAME: &'static str + const CREATE_SQL: &'static str
  • QueryAs<T: FromRow>: fetch_all/fetch_one/fetch_optional(&conn)
  • QueryScalar<T: RowValue>: same interface for scalar values

.vscode/settings.json (committed, force-added past .gitignore)

  • Enables compile-time feature in rust-analyzer so bad SQL shows squigglies in VS Code
  • Sets HYPERD_PATH via extraEnv for the proc-macro host
  • Uses the flat-array format for rust-analyzer.cargo.features — the JSON-object form is silently ignored by RA and was the root cause of IDE validation not firing

Architecture notes

  • Dep cycle resolved: hyperdb-api-derive is no longer a dep of hyperdb-api. Users import derive macros directly from hyperdb-api-derive. This is a breaking change for anyone who relied on use hyperdb_api::FromRow to bring the derive macro into scope (the trait re-export is unchanged; only the proc-macro re-export is removed).
  • Feature off = zero overhead: without compile-time, query_as! is a pure pass-through; hyperdb-compile-check is not built.
  • One Hyper instance per crate: ~156ms cold start amortized across all macro invocations; RA uses the same instance without re-expanding unchanged macros (salsa memoization).
  • Module ordering constraint: derive(Table) structs must be in modules declared before modules containing query_as! calls (within a file, ordering is always correct). Documented in README.
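
To illustrate the module ordering constraint, a crate root might be laid out as below. This is only a sketch: the module names `models` and `queries` are hypothetical, and the hyperdb macros appear in comments so the snippet compiles without hyperdb-api-derive.

```rust
// Hypothetical crate root. The module holding the registered structs is
// declared first, so its derive(Table) expansions run before any query_as!
// expansion in the module declared after it.
mod models {
    // In a real crate this struct would carry
    // #[derive(FromRow, Table)] and #[hyperdb(table = "users", register)].
    pub struct User {
        pub id: i64,
        pub name: String,
    }
}

mod queries {
    use super::models::User;

    // In a real crate this string would be the argument to query_as!(User, ...).
    pub const USER_SQL: &str = "SELECT id, name FROM users";

    pub fn describe(user: &User) -> String {
        format!("{}:{}", user.id, user.name)
    }
}
```

Swapping the two `mod` items would compile here, but with the real macros the registration would then happen too late for `query_as!` to see it.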

Diagnostic examples

| Error | Message |
| --- | --- |
| Column typo in SELECT | column "emai1" does not exist on any table in the query; check for a typo or a renamed/dropped column |
| Struct field not projected | `User` requires column "email" but the query does not project it; add it to the SELECT list or remove the field from `User` |
| Unregistered table | table "orders" is not registered; did you forget `#[derive(Table)] #[hyperdb(register)]`? |
| Struct not registered | type `Foo` must `#[derive(Table)]` with `#[hyperdb(register)]` to be used with `query_as!` |
| SQL syntax error | SQL syntax error: ERROR: syntax error at or near ... (42601) |
| query_scalar! multi-column | query_scalar! requires exactly one projected column, but the query projects 3 |
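
One way to picture how such messages could be produced is an error enum with a `Display` impl. The sketch below is an assumption-laden illustration: the variant and field names are hypothetical (the PR only names `UnknownColumn`, `MissingColumns`, and `HyperError`), and the missing-column case is simplified to a single column.

```rust
use std::fmt;

/// Hypothetical mirror of the error classes listed under "Diagnostic examples".
/// Variant and field names are assumptions, not the crate's definitions.
#[derive(Debug, PartialEq)]
enum ValidationError {
    UnknownColumn { column: String },
    MissingColumn { struct_name: String, column: String },
    UnregisteredTable { table: String },
    StructNotRegistered { struct_name: String },
    ScalarColumnCount { found: usize },
}

impl fmt::Display for ValidationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::UnknownColumn { column } => write!(
                f,
                "column \"{column}\" does not exist on any table in the query; \
                 check for a typo or a renamed/dropped column"
            ),
            Self::MissingColumn { struct_name, column } => write!(
                f,
                "`{struct_name}` requires column \"{column}\" but the query does not \
                 project it; add it to the SELECT list or remove the field from `{struct_name}`"
            ),
            Self::UnregisteredTable { table } => write!(
                f,
                "table \"{table}\" is not registered; did you forget \
                 `#[derive(Table)] #[hyperdb(register)]`?"
            ),
            Self::StructNotRegistered { struct_name } => write!(
                f,
                "type `{struct_name}` must `#[derive(Table)]` with \
                 `#[hyperdb(register)]` to be used with `query_as!`"
            ),
            Self::ScalarColumnCount { found } => write!(
                f,
                "query_scalar! requires exactly one projected column, \
                 but the query projects {found}"
            ),
        }
    }
}
```

A proc-macro shell would then only need to call `to_string()` on the error and hand the result to `compile_error!`.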

Tests

  • 13 unit tests in hyperdb-compile-check (no HYPERD_PATH needed)
  • 10 trybuild UI tests in hyperdb-api-derive/tests/ui/ (4 pass + 6 fail with golden .stderr)
  • 6 integration tests in hyperdb-api/tests/compile_time_validation_tests.rs (require HYPERD_PATH; cover full stack including JOINs)
  • All existing workspace tests continue to pass

Known limitations (documented in README)

  • Type checking deferred to v2: validates column names, not types. Runtime Error::Column { kind: TypeMismatch } still catches type drift.
  • No parameter type checking: bind parameters are opaque at compile time (Hyper has no PREPARE metadata endpoint).
  • Validates structs vs. SQL, not SQL vs. production DB: if your struct and prod schema have drifted, that remains a runtime error.

Test plan

  • cargo test --workspace — all tests green
  • cargo clippy --workspace --all-targets --all-features -- -D warnings — clean
  • cargo clippy --workspace --all-targets -- -D warnings — clean (feature off)
  • cargo test -p hyperdb-api-derive --test ui — 10 trybuild cases pass
  • cargo test -p hyperdb-api --test compile_time_validation_tests — 6 integration tests pass (requires HYPERD_PATH)
  • cargo build --example compile_time_validation -p hyperdb-api — example builds
  • VS Code: open repo, reload window, verify squigglies appear on bad query_as! SQL

The #[ignore]d measurement spikes (startup, LIMIT 0 dry-run, SQLSTATE
classification, lazy next_chunk drive) are the reference implementation
W1 lifts into hyperdb-compile-check. Delete once W1 lands. No CI cost
(ignored).

Unit-testable home for all compile-time SQL validation logic, following
the sqlx-macros-core split pattern. proc-macro shells in
hyperdb-api-derive call into this crate (feature-gated, not yet wired).

- db.rs: CompileTimeDb + get_or_init() via OnceLock<parking_lot::Mutex>
  (non-poisoning, safe at proc-macro panic sites)
- dry_run.rs: LIMIT 0 CTE wrapper; drives next_chunk() once (Phase 0 S6
  lazy-execution gotcha)
- error_extract.rs: SQLSTATE-based classification (42P01/42703/42601);
  no sqlparser dependency (Phase 0 S5)
- registry.rs: global table/struct registry behind OnceLock<Mutex>
- validate.rs: validate_query_as() entry point with Hyper-first
  seed-and-retry on 42P01 and name-subset diff
- diagnostic.rs: ValidationError variants with human-readable formatting

11 unit tests pass (no HYPERD_PATH needed). 10 integration tests marked
#[ignore]; run manually with HYPERD_PATH set. Cycle check + feature-off
invariant verified.
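
The SQLSTATE-based classification in error_extract.rs can be pictured as a plain match on the five-character code. This is a sketch under assumptions: the enum and function names are hypothetical; the three codes and their meanings (undefined table, undefined column, syntax error) are the ones this PR relies on.

```rust
/// Coarse error classes derived from a SQLSTATE code.
/// (Hypothetical type; only the three codes come from the PR.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ErrorClass {
    UndefinedTable,
    UndefinedColumn,
    SyntaxError,
    Other,
}

/// Classify a server error by its SQLSTATE, without parsing the SQL text.
fn classify_sqlstate(code: &str) -> ErrorClass {
    match code {
        "42P01" => ErrorClass::UndefinedTable,  // triggers seed-and-retry
        "42703" => ErrorClass::UndefinedColumn, // reported as an unknown column
        "42601" => ErrorClass::SyntaxError,
        _ => ErrorClass::Other,
    }
}
```

Matching on the code rather than the message text is what lets the crate avoid a sqlparser dependency.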
- hyperdb-api/src/table.rs: runtime Table trait with NAME + CREATE_SQL consts
- hyperdb-api-derive/src/table_derive.rs: derive(Table) macro
  - Parses #[hyperdb(table=, register)] struct attrs and
    #[hyperdb(primary_key, rename=)] field attrs
  - Emits CREATE TABLE IF NOT EXISTS SQL from Rust type map
    (i16/i32/i64/f32/f64/bool/String/Vec<u8>/NaiveDate/NaiveDateTime/
     NaiveTime/DateTime/Numeric; Option<T> → nullable)
  - default table name: lower_snake_case of struct ident
  - Unsupported types error at compile time with a helpful message
  - #[hyperdb(register)] parsed but registration deferred to Milestone B
    (dependency cycle resolution required before wiring compile-check in)
- hyperdb-compile-check: moved to standalone workspace (excluded from
  root workspace via exclude=[]) to break the Cargo dep cycle:
    hyperdb-api → hyperdb-api-derive → hyperdb-compile-check → hyperdb-api
  Retains its own [workspace] + [lints] so it builds/tests independently.
  Built via: cargo build --manifest-path hyperdb-compile-check/Cargo.toml
- Clarified unsafe Send/Sync comment in db.rs: the safety argument is
  that the Mutex serializes Connection access (one TCP session, never
  used concurrently), not that the process is single-threaded.

All workspace tests pass. Cycle check: cargo build -p hyperdb-api-derive
--no-default-features succeeds.
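
The default table name rule (lower_snake_case of the struct ident) could look roughly like the helper below. It is a simplified sketch with a hypothetical name; the real derive may treat acronyms or digits differently.

```rust
/// Convert a CamelCase struct ident to lower_snake_case.
/// Simplified: every ASCII uppercase letter after the first starts a new
/// word, so an acronym such as "HTTPServer" becomes "h_t_t_p_server".
fn default_table_name(ident: &str) -> String {
    let mut out = String::with_capacity(ident.len() + 4);
    for (i, ch) in ident.chars().enumerate() {
        if ch.is_ascii_uppercase() {
            if i > 0 {
                out.push('_');
            }
            out.push(ch.to_ascii_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}
```

Under this rule `User` maps to `user`, which is why the example in the summary sets `table = "users"` explicitly.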
One Connection (single TCP session) protected by parking_lot::Mutex so
macro expansion threads serialize on it. Contrasts with the production
pool (N independent connections via deadpool). The Mutex is the safety
mechanism, not process-level single-threading.
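
The shared-instance pattern described above can be sketched with the standard library alone. Everything named here is a stand-in: `FakeDb` replaces the real HyperProcess and Connection pair, and `std::sync::Mutex` replaces `parking_lot::Mutex`, so poisoning has to be handled explicitly.

```rust
use std::sync::{Mutex, MutexGuard, OnceLock};

/// Stand-in for the process + connection pair (hypothetical type).
struct FakeDb {
    queries_run: u32,
}

/// Initialised on first use, then shared by every caller in the process.
static DB: OnceLock<Mutex<FakeDb>> = OnceLock::new();

/// Lock the shared instance, creating it on the first call.
fn db() -> MutexGuard<'static, FakeDb> {
    DB.get_or_init(|| Mutex::new(FakeDb { queries_run: 0 }))
        .lock()
        // std's Mutex poisons after a panic; recover the guard so one failed
        // expansion does not break every later one.
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Pretend to run a query; callers serialize on the mutex.
fn run_query(_sql: &str) -> u32 {
    let mut guard = db();
    guard.queries_run += 1;
    guard.queries_run
}
```

The guard is dropped at the end of `run_query`, so each caller holds the single session only for the duration of one query.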
Adds the minimal proc-macro and runtime type needed to confirm
rust-analyzer expansion behavior (Milestone A, step A8).

- hyperdb-api-derive: query_as!(T, "sql" [, args…]) function-like macro
  Pass-through for now: emits QueryAs::<T>::new(sql, params).
  Compile-time validation wired in Milestone B (cycle resolution pending).
- hyperdb-api/src/query_as.rs: QueryAs<T> builder — fetch_all/fetch_one/
  fetch_optional delegating to Connection::fetch_*_as.
- hyperdb-api/tests/ra_expansion_check.rs: A8 observation harness.
  Open in VS Code, edit the "EDIT ME" comment, verify query_as! does NOT
  re-expand (salsa memoization on token-tree key). Remove after confirmed.

All workspace tests pass.

1. MAJOR: registry dual-indexed by struct name + table name
   validate_query_as received the Rust struct ident ("User") but
   registry::get() looked up by SQL table name ("users"). These differ
   by convention (snake_case) and override. Fixed by:
   - Adding STRUCT_TO_TABLE: OnceLock<Mutex<HashMap<String,String>>>
   - register() now takes (struct_name, table_name, create_sql, fields)
   - Added get_by_struct() → Option<(table_name, TableEntry)>
   - validate_query_as now uses get_by_struct for the initial lookup
   - Registry::seed_if_known still uses get_by_table (Hyper reports SQL names)

2. MAJOR: Vec<T> → BYTES only for Vec<u8>
   Vec<String>/Vec<i32>/etc. silently mapped to BYTES, producing incorrect
   CREATE TABLE SQL. Now inspects the generic argument; only Vec<u8> → BYTES,
   anything else → compile_error! with a helpful message.
   Added is_vec_u8() helper that checks the PathArguments.

3. MINOR (pre-Milestone-B fix): column_name_for returns "" for index-based
   fields; filter_map kept them, so the field list would contain "" entries
   causing spurious MissingColumns{ missing: vec![""] } errors at validation.
   Fixed: filter out empty strings in _field_names collection.

All 12 compile-check unit tests pass. Full workspace clippy clean.
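
The dual-index fix in item 1 can be sketched as two maps, one keyed by struct ident and one by SQL table name. The sketch keeps the registry as a plain value rather than a global behind `OnceLock<Mutex<...>>`, and the `TableEntry` fields are assumptions based on the `register()` arguments listed above.

```rust
use std::collections::HashMap;

/// What is stored per table (field set assumed from the register() arguments).
#[derive(Debug, Clone, PartialEq)]
struct TableEntry {
    create_sql: String,
    fields: Vec<String>,
}

/// Registry indexed both by Rust struct ident and by SQL table name.
#[derive(Default)]
struct Registry {
    struct_to_table: HashMap<String, String>,
    tables: HashMap<String, TableEntry>,
}

impl Registry {
    fn register(&mut self, struct_name: &str, table_name: &str, create_sql: &str, fields: &[&str]) {
        self.struct_to_table
            .insert(struct_name.to_string(), table_name.to_string());
        self.tables.insert(
            table_name.to_string(),
            TableEntry {
                create_sql: create_sql.to_string(),
                fields: fields.iter().map(|f| f.to_string()).collect(),
            },
        );
    }

    /// Lookup used by query_as!, which only knows the struct ident.
    fn get_by_struct(&self, struct_name: &str) -> Option<(&str, &TableEntry)> {
        let table = self.struct_to_table.get(struct_name)?;
        let entry = self.tables.get(table)?;
        Some((table.as_str(), entry))
    }

    /// Lookup used when seeding, since the server reports SQL table names.
    fn get_by_table(&self, table_name: &str) -> Option<&TableEntry> {
        self.tables.get(table_name)
    }
}
```

With `User` registered against `users`, a lookup by `"User"` resolves through the first map, while the seeding path can keep using the table name directly.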
…stone B core)

Resolves the Cargo dependency cycle and connects all the pieces:

## Cycle resolution
Remove hyperdb-api-derive from hyperdb-api's [dependencies] — it was only
needed for proc-macro re-exports. Re-exports (FromRow, query_as, Table)
are removed from hyperdb-api/src/lib.rs. Users import from
hyperdb-api-derive directly. hyperdb-api/Cargo.toml adds
hyperdb-api-derive as a dev-dependency only (for integration tests).

This breaks the former cycle:
  hyperdb-api → hyperdb-api-derive → hyperdb-compile-check → hyperdb-api

hyperdb-api-derive now has a clean path to hyperdb-compile-check via
the optional compile-time feature.

## Registration timing fix
derive(Table) #[hyperdb(register)] now calls registry::register()
directly inside the proc-macro expand() function (in the proc-macro host
process) instead of emitting a LazyLock into the user's binary. This is
the correct moment: query_as! runs in the same host process and finds
the entry in the registry. No registration code emitted into user binary.

## Validation wired
query_as!(T, "sql") now calls hyperdb_compile_check::validate_query_as()
at expansion time when compile-time feature is enabled. On error: emits
compile_error!(diagnostic_message) pinned to the call site. On success:
emits QueryAs::<T>::new(sql, &[args...]) as before.

## Changes
- hyperdb-api-derive/Cargo.toml: compile-time feature + hyperdb-compile-check dep
- hyperdb-api-derive/src/lib.rs: validation call in expand_query_as()
- hyperdb-api-derive/src/table_derive.rs: in-host registration; field_names/
  column_name_for gated behind cfg(feature="compile-time")
- hyperdb-api/Cargo.toml: remove derive dep from [dependencies], add to [dev-deps]
- hyperdb-api/src/lib.rs: remove proc-macro re-exports
- hyperdb-api/tests/: update imports to use hyperdb_api_derive directly
- hyperdb-compile-check: registry dual-index already in place from A9 fix

All workspace tests pass. Both feature-on and feature-off clippy clean.
W4 — query_scalar! macro:
- hyperdb-api/src/query_as.rs: add QueryScalar<T: RowValue> builder
  with fetch_all/fetch_one/fetch_optional via ScalarRow<T> wrapper
- hyperdb-api/src/lib.rs: re-export QueryScalar
- hyperdb-api-derive/src/lib.rs: query_scalar! proc-macro (pass-through
  without compile-time; validates exactly-one-column with it)
- hyperdb-compile-check/src/validate.rs: validate_scalar_sql() + shared
  run_dry_run_with_seed() helper (DRY refactor of validate_query_as)

B4 — end-to-end integration tests:
- hyperdb-api/tests/compile_time_validation_tests.rs: 6 tests covering
  derive(Table) CREATE SQL correctness, query_as! fetch_all/fetch_one/
  fetch_optional, lenient-additions (SELECT * ok), and JOIN across two
  registered tables. All 6 pass without compile-time feature; compile with
  --features hyperdb-api-derive/compile-time to also exercise validation.
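
The name-subset check with lenient additions, as exercised by these tests, might be expressed as below. The function name is hypothetical and the comparison is exact-match; whether the real check folds case is not stated in this PR.

```rust
use std::collections::HashSet;

/// Return the struct fields that the query does not project.
/// Extra projected columns (for example from SELECT *) are ignored.
fn missing_columns(struct_fields: &[&str], projected: &[&str]) -> Vec<String> {
    let projected: HashSet<&str> = projected.iter().copied().collect();
    struct_fields
        .iter()
        .copied()
        .filter(|field| !projected.contains(field))
        .map(String::from)
        .collect()
}
```

An empty result means the struct can be populated from the query; any names returned would feed a missing-columns diagnostic.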
1. MAJOR: bounded seed-and-retry loop in run_dry_run_with_seed
   Single retry was insufficient for multi-table JOINs where both tables
   need initial seeding. Changed to a loop (max 8 rounds) with `continue`
   on each successful 42P01 seed. Fixes confusing HyperError messages on
   JOIN queries where more than one registered table needs first-seeding.
   Simplified return type from Option<ResultSchema> to ResultSchema (the
   None arm was unreachable — removed the dead unreachable! callers).

2. MAJOR: document cross-module macro expansion ordering constraint
   derive(Table) must expand before query_as! in the same proc-macro
   host. Within a file this is always true (struct derives before function
   bodies). Across files, the module with derive(Table) structs must be
   declared (mod X;) before the module with query_as! calls. Added clear
   warning to query_as! rustdoc.

3. MINOR: stale comment on StructOpts::register field
   "registration wired in Milestone B" was stale — it's already wired.
   Updated to "only used when compile-time feature is enabled".
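
The bounded seed-and-retry loop from item 1 can be sketched with closures standing in for the dry run and the seeding step. All names here are hypothetical; only the behaviour (retry after each successful 42P01 seed, at most 8 rounds) is taken from the commit message.

```rust
/// Outcome of a failed dry run (hypothetical type).
#[derive(Debug, PartialEq)]
enum DryRunError {
    /// SQLSTATE 42P01, carrying the table name the server reported.
    UndefinedTable(String),
    Other(String),
}

const MAX_ROUNDS: usize = 8;

/// Run `dry_run` until it succeeds, seeding one missing table per round.
/// `seed` returns false when the table is not in the registry.
fn run_with_seed<D, S>(mut dry_run: D, mut seed: S) -> Result<Vec<String>, DryRunError>
where
    D: FnMut() -> Result<Vec<String>, DryRunError>,
    S: FnMut(&str) -> bool,
{
    for _ in 0..MAX_ROUNDS {
        match dry_run() {
            Ok(columns) => return Ok(columns),
            Err(DryRunError::UndefinedTable(table)) => {
                if !seed(&table) {
                    // Unknown table: surface the error instead of retrying.
                    return Err(DryRunError::UndefinedTable(table));
                }
                // Seeded successfully; fall through to the next round.
            }
            Err(other) => return Err(other),
        }
    }
    Err(DryRunError::Other(format!(
        "gave up after {} seeding rounds",
        MAX_ROUNDS
    )))
}
```

A two-table JOIN needs two seeding rounds before the third dry run succeeds, which is the case a single retry could not handle.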
…ow fix (W5 W6)

W5 — trybuild UI golden tests:
- 4 pass cases: derive(Table) basic, custom table name, field rename, derive(FromRow)
- 6 fail cases with .stderr golden files: Table on enum, FromRow on enum,
  unsupported Vec<T> field type, unrecognized attr, query_as!/query_scalar! missing args
- test runner: hyperdb-api-derive/tests/ui.rs
  Update with: TRYBUILD=overwrite cargo test -p hyperdb-api-derive --test ui

W6 — example + fix:
- examples/additional_examples/compile_time_validation.rs: end-to-end example
  showing derive(Table), query_as!, query_scalar!, QueryAs/QueryScalar builder
  reuse, and the CREATE_SQL const for runtime table creation.
- Fix: FromRow's field_source_for now silently ignores #[hyperdb(primary_key)]
  (a Table-derive attribute) so structs can derive both Table and FromRow with
  #[hyperdb(primary_key)] without a compile error from the FromRow parser.
…/W6 followup)

- hyperdb-compile-check/src/diagnostic.rs: add UnknownColumn variant
  (SQLSTATE 42703) with message "column X does not exist on any table in
  the query; check for a typo or a renamed/dropped column". Previously
  reported as a generic HyperError.
- hyperdb-compile-check/src/validate.rs: use UnknownColumn instead of
  HyperError for the 42703 path; remove redundant `continue` (clippy).
- hyperdb-api-derive/README.md: full rewrite documenting derive(Table),
  query_as!, query_scalar!, compile-time feature, VS Code setup (the
  settings.json array format, HYPERD_PATH), and known limitations.
- .vscode/settings.json: committed (force-added despite .gitignore) so
  contributors get compile-time squigglies out of the box. Uses
  rust-analyzer.cargo.features as a flat array — the map form is silently
  ignored by RA and was the root cause of validation not firing in IDE.
- release.yml: publish hyperdb-compile-check after hyperdb-api
  (topological order: it depends on hyperdb-api). Uses
  --manifest-path since it's not a workspace member.
- release.yml: version-check step now also verifies
  hyperdb-compile-check/Cargo.toml matches the release tag.
- release-please-config.json: add hyperdb-compile-check/Cargo.toml
  to extra-files so release-please bumps its version in lockstep.
- hyperdb-compile-check/Cargo.toml: add x-release-please-start/end
  markers around the package version for release-please to update.
- ci.yml: update publish-dry-run comment to mention
  hyperdb-compile-check is excluded from CI dry-run (same reason as
  hyperdb-api: it has a path dep that can't resolve before publish).
…pass case

Since derive macros are no longer re-exported from hyperdb_api,
`use hyperdb_api::FromRow` imported only the trait, which was unused.

--all-features enables hyperdb-api-derive/compile-time, which starts an
embedded Hyper instance inside the proc-macro host during clippy. Without
HYPERD_PATH + the hyperd binary the proc-macro panics on every query_as!
and query_scalar! call in the example file. Added the same cache/download
steps and HYPERD_PATH env var that the test job already has.

hyperdb-api-derive now has an optional path dep on hyperdb-compile-check.
cargo publish --dry-run resolves ALL deps (including optional ones) against
the live crates.io index, so the dry-run fails because hyperdb-compile-check
hasn't been published yet.

Same situation as hyperdb-api-core/hyperdb-api — excluded from CI dry-run
and exercised at release time by the full wave in release.yml.
@StefanSteiner StefanSteiner merged commit 73f9b0f into tableau:main Jun 1, 2026
11 checks passed