12 changes: 12 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

5 changes: 4 additions & 1 deletion Cargo.toml
@@ -101,7 +101,10 @@ file-locks = ["dep:fs2"]
[dependencies]
# Always-on engine deps — pure-Rust and wasm-compatible.
log = "0.4"
sqlparser = "0.61"
# SQLR-23: the `visitor` feature unlocks `visit_expressions_mut`, which
# we use for the prepared-statement `?` placeholder rewrite + bind
# substitution pass in `src/sql/params.rs`.
sqlparser = { version = "0.61", features = ["visitor"] }
thiserror = "2.0"
prettytable-rs = "0.10"
# Phase 7e: JSON column type. `serde_json` powers both the validation
2 changes: 1 addition & 1 deletion README.md
@@ -275,7 +275,7 @@ The project is staged in phases, each independently shippable. A finished phase
- [x] **4f — Transactions (`BEGIN` / `COMMIT` / `ROLLBACK`)**: `BEGIN` snapshots the in-memory tables (`Table::deep_clone`) and suppresses auto-save; every subsequent mutation stays in memory. `COMMIT` flushes accumulated changes in one `save_database` call (one WAL commit frame for the whole transaction). `ROLLBACK` restores the pre-BEGIN snapshot. Nested begins, orphan commits/rollbacks, and BEGIN on read-only DBs all return typed errors. Errors mid-transaction keep the transaction open so the caller can explicitly recover.

**Phase 5 — Embedding surface: public API + language SDKs**
- [x] **5a — Public Rust API** *(partial)*: `Connection` / `Statement` / `Rows` / `Row` / `OwnedRow` / `FromValue` / `Value` at the crate root; structured row return from the executor; `examples/rust/quickstart.rs` runnable via `cargo run --example quickstart`. Parameter binding + cursor abstraction deferred to 5a.2.
- [x] **5a — Public Rust API**: `Connection` / `Statement` / `Rows` / `Row` / `OwnedRow` / `FromValue` / `Value` at the crate root; structured row return from the executor; `examples/rust/quickstart.rs` runnable via `cargo run --example quickstart`. **SQLR-23 — parameter binding + plan cache:** `Connection::prepare_cached` (default 16-entry LRU), `Statement::query_with_params(&[Value])`, `Statement::execute_with_params(&[Value])`. The cached AST skips re-running sqlparser per execute; `?` placeholders bind via positional `&[Value]`. `Value::Vector(Vec<f32>)` is a first-class bind type so HNSW-eligible KNN queries skip per-iter lexing of the 4 KB query vector — and the HNSW optimizer hook still recognizes the bound vector. Cursor abstraction still deferred to 5a.2.
- [x] **5b — C FFI shim**: new `sqlrite-ffi/` workspace crate ships `libsqlrite_c.{so,dylib,dll}` + a cbindgen-generated `sqlrite.h`. Opaque-pointer types, thread-local last-error, split `sqlrite_execute` (DDL/DML/transactions) vs `sqlrite_query`/`sqlrite_step` (SELECT iteration). Runnable `examples/c/hello.c` + `Makefile` (`cd examples/c && make run`).
- [x] **5c — Python SDK**: new `sdk/python/` workspace crate via PyO3 (`abi3-py38`) + maturin. DB-API 2.0-inspired — `sqlrite.connect(path)` → `Cursor.execute` / `fetchall` / iteration, context-manager support (commit-on-clean-exit / rollback-on-exception), read-only connections, 16-test pytest suite. `examples/python/hello.py` runs after `maturin develop`. PyPI publish landed in Phase 6f as `sqlrite`.
- [x] **5d — Node.js SDK**: new `sdk/nodejs/` workspace crate via napi-rs (N-API v9, Node 18+). Prebuilt `.node` binaries — no `node-gyp` install step. `better-sqlite3`-style sync API (`new Database(path)`, `stmt.all() / get() / iterate()` returning row objects), auto-generated TypeScript defs, 11 `node:test` integration tests. `examples/nodejs/hello.mjs` runs after `npm install && npm run build`. npm publish landed in Phase 6g as `@joaoh82/sqlrite` (scoped — npm rejected the unscoped `sqlrite` name as too similar to `sqlite`).
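The positional `?` binding described in the 5a entry can be pictured with a toy model. This is a minimal sketch, assuming a hypothetical `Expr` tree in place of the real sqlparser AST; `Expr`, `substitute`, and `bind_params` are illustrative names, not SQLRite's API. The cached template is cloned per execution, so the parsed form stays reusable while each call gets its own bound copy.

```rust
/// Bind values, mirroring the shapes named in the README entry.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Integer(i64),
    Real(f64),
    Text(String),
    Vector(Vec<f32>),
}

/// Toy expression tree (an assumption; the real engine walks sqlparser's AST).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Placeholder,
    Literal(Value),
    Eq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Replace each `Placeholder` with the next parameter, left to right.
fn substitute<'a>(
    expr: &mut Expr,
    params: &mut impl Iterator<Item = &'a Value>,
) -> Result<(), String> {
    match expr {
        Expr::Placeholder => {
            let v = params
                .next()
                .ok_or("more `?` placeholders than params")?;
            *expr = Expr::Literal(v.clone());
            Ok(())
        }
        Expr::Eq(l, r) | Expr::And(l, r) => {
            substitute(l, &mut *params)?;
            substitute(r, &mut *params)
        }
        Expr::Column(_) | Expr::Literal(_) => Ok(()),
    }
}

/// Bind `params` positionally into a copy of `template`.
/// Errors if the placeholder count and the parameter count differ.
fn bind_params(template: &Expr, params: &[Value]) -> Result<Expr, String> {
    let mut bound = template.clone(); // the cached template is left untouched
    let mut iter = params.iter();
    substitute(&mut bound, &mut iter)?;
    if iter.next().is_some() {
        return Err("more params than `?` placeholders".to_string());
    }
    Ok(bound)
}
```

Because values are substituted into the tree rather than spliced into SQL text, no quoting or escaping step is needed, and a large `Value::Vector` is never rendered to a string.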
5 changes: 5 additions & 0 deletions benchmarks/src/data.rs
@@ -404,6 +404,11 @@ fn gen_vector(seed: u64, dim: usize) -> Vec<f32> {

/// Render a `&[f32]` as the bracket-array literal that SQLRite and
/// the `[f32; 4]` example use: `[0.123, -0.456, …]`.
///
/// SQLR-23 — no longer used by W10/W12 (those bind via
/// [`crate::Value::Vector`]); kept as a public helper for any future
/// workload that needs a SQL-string vector literal.
#[allow(dead_code)]
pub fn vec_to_sql_literal(v: &[f32]) -> String {
let mut s = String::with_capacity(v.len() * 12 + 2);
s.push('[');
28 changes: 20 additions & 8 deletions benchmarks/src/drivers/duckdb.rs
@@ -49,7 +49,7 @@ impl Driver for DuckDBDriver {
sql: &str,
params: &[Value],
) -> Result<()> {
let bound: Vec<duckdb::types::Value> = params.iter().map(to_duckdb).collect();
let bound = to_duckdb_params(params)?;
conn.execute(sql, duckdb::params_from_iter(bound.iter()))
.with_context(|| format!("duckdb execute: {sql}"))?;
Ok(())
@@ -70,7 +70,7 @@ impl Driver for DuckDBDriver {
let mut stmt = conn
.prepare(sql)
.with_context(|| format!("duckdb prepare: {sql}"))?;
let bound: Vec<duckdb::types::Value> = params.iter().map(to_duckdb).collect();
let bound = to_duckdb_params(params)?;
let mut rows = stmt
.query(duckdb::params_from_iter(bound.iter()))
.with_context(|| format!("duckdb query: {sql}"))?;
@@ -102,7 +102,7 @@ impl Driver for DuckDBDriver {
let mut stmt = conn
.prepare(sql)
.with_context(|| format!("duckdb prepare: {sql}"))?;
let bound: Vec<duckdb::types::Value> = params.iter().map(to_duckdb).collect();
let bound = to_duckdb_params(params)?;
let mut rows = stmt
.query(duckdb::params_from_iter(bound.iter()))
.with_context(|| format!("duckdb query: {sql}"))?;
@@ -125,12 +125,24 @@ impl Driver for DuckDBDriver {
}
}

fn to_duckdb(v: &Value) -> duckdb::types::Value {
fn to_duckdb_params(params: &[Value]) -> Result<Vec<duckdb::types::Value>> {
params.iter().map(to_duckdb).collect()
}

fn to_duckdb(v: &Value) -> Result<duckdb::types::Value> {
match v {
Value::Null => duckdb::types::Value::Null,
Value::Integer(i) => duckdb::types::Value::BigInt(*i),
Value::Real(f) => duckdb::types::Value::Double(*f),
Value::Text(s) => duckdb::types::Value::Text(s.clone()),
Value::Null => Ok(duckdb::types::Value::Null),
Value::Integer(i) => Ok(duckdb::types::Value::BigInt(*i)),
Value::Real(f) => Ok(duckdb::types::Value::Double(*f)),
Value::Text(s) => Ok(duckdb::types::Value::Text(s.clone())),
// VECTOR is SQLRite-only (DuckDB doesn't ship a comparable
// primitive). The Group C workloads gate on
// `driver_supports("sqlrite")`, so reaching this arm
// indicates a registration bug.
Value::Vector(_) => anyhow::bail!(
"duckdb driver: VECTOR params are SQLRite-only; this workload should not register \
against duckdb"
),
}
}
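The new `to_duckdb_params` helper leans on a standard-library behaviour: an iterator of `Result<T, E>` collects into `Result<Vec<T>, E>` and stops at the first `Err`. A minimal std-only sketch of that pattern follows; `BenchValue`, `NativeValue`, `to_native`, and `to_native_params` are hypothetical stand-ins for the harness and driver types, and a `String` error replaces `anyhow` so the block has no dependencies.

```rust
/// Stand-in for the bench harness `Value` (hypothetical, trimmed).
#[derive(Debug, Clone, PartialEq)]
enum BenchValue {
    Null,
    Integer(i64),
    Text(String),
    Vector(Vec<f32>),
}

/// Stand-in for a comparator engine's native bind type (hypothetical).
#[derive(Debug, Clone, PartialEq)]
enum NativeValue {
    Null,
    BigInt(i64),
    Text(String),
}

/// Convert one value; the unsupported variant is an error, not a coercion.
fn to_native(v: &BenchValue) -> Result<NativeValue, String> {
    match v {
        BenchValue::Null => Ok(NativeValue::Null),
        BenchValue::Integer(i) => Ok(NativeValue::BigInt(*i)),
        BenchValue::Text(s) => Ok(NativeValue::Text(s.clone())),
        BenchValue::Vector(_) => {
            Err("VECTOR params are not supported by this driver".to_string())
        }
    }
}

/// Convert a whole parameter list. `collect` into `Result<Vec<_>, _>`
/// short-circuits: the first `Err` is returned and no partial list leaks out.
fn to_native_params(params: &[BenchValue]) -> Result<Vec<NativeValue>, String> {
    params.iter().map(to_native).collect()
}
```

Returning `Result` from the per-value conversion is what lets each call site use `?`, so a misregistered workload fails before anything is sent to the engine.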

28 changes: 20 additions & 8 deletions benchmarks/src/drivers/sqlite.rs
@@ -64,7 +64,7 @@ impl Driver for SQLiteDriver {
sql: &str,
params: &[Value],
) -> Result<()> {
let bound: Vec<rusqlite::types::Value> = params.iter().map(to_rusqlite).collect();
let bound = to_rusqlite_params(params)?;
conn.execute(sql, rusqlite::params_from_iter(bound.iter()))
.with_context(|| format!("rusqlite execute: {sql}"))?;
Ok(())
@@ -79,7 +79,7 @@ impl Driver for SQLiteDriver {
let mut stmt = conn
.prepare_cached(sql)
.with_context(|| format!("rusqlite prepare_cached: {sql}"))?;
let bound: Vec<rusqlite::types::Value> = params.iter().map(to_rusqlite).collect();
let bound = to_rusqlite_params(params)?;
let cols = stmt.column_count();
let mut rows = stmt
.query(rusqlite::params_from_iter(bound.iter()))
@@ -111,7 +111,7 @@ impl Driver for SQLiteDriver {
let mut stmt = conn
.prepare_cached(sql)
.with_context(|| format!("rusqlite prepare_cached: {sql}"))?;
let bound: Vec<rusqlite::types::Value> = params.iter().map(to_rusqlite).collect();
let bound = to_rusqlite_params(params)?;
let cols = stmt.column_count();
let mut rows = stmt
.query(rusqlite::params_from_iter(bound.iter()))
@@ -131,12 +131,24 @@ impl Driver for SQLiteDriver {
}
}

fn to_rusqlite(v: &Value) -> rusqlite::types::Value {
fn to_rusqlite_params(params: &[Value]) -> Result<Vec<rusqlite::types::Value>> {
params.iter().map(to_rusqlite).collect()
}

fn to_rusqlite(v: &Value) -> Result<rusqlite::types::Value> {
match v {
Value::Null => rusqlite::types::Value::Null,
Value::Integer(i) => rusqlite::types::Value::Integer(*i),
Value::Real(f) => rusqlite::types::Value::Real(*f),
Value::Text(s) => rusqlite::types::Value::Text(s.clone()),
Value::Null => Ok(rusqlite::types::Value::Null),
Value::Integer(i) => Ok(rusqlite::types::Value::Integer(*i)),
Value::Real(f) => Ok(rusqlite::types::Value::Real(*f)),
Value::Text(s) => Ok(rusqlite::types::Value::Text(s.clone())),
// VECTOR is SQLRite-only; the W10/W12 workloads gate on
// `driver_supports("sqlrite")` so this branch indicates a
// bug in workload registration. Fail loudly rather than
// silently coercing.
Value::Vector(_) => anyhow::bail!(
"rusqlite driver: VECTOR params are SQLRite-only; this workload should not register \
against sqlite"
),
}
}

151 changes: 49 additions & 102 deletions benchmarks/src/drivers/sqlrite.rs
@@ -1,16 +1,26 @@
//! SQLRite driver.
//!
//! Binds against the engine's public [`sqlrite::Connection`] surface —
//! the same API the language SDKs use. SQLRite has no parameter
//! binding yet (see `connection.rs:145` — "parameter binding and
//! prepared-plan caching are future work"), so the driver formats
//! `[Value]` into the SQL string at call time. That's an honest cost
//! to include in the comparison: a SQLRite user calling a hot SELECT
//! today pays the same per-call parse + format overhead.
//! the same API the language SDKs use.
//!
//! Once SQLRite gains parameter binding (post-9.6 follow-up), this
//! driver will switch to the bound path and a workload `v` bump will
//! capture the methodology change.
//! ## SQLR-23 — bound + cached path
//!
//! SQLRite gained a prepared-statement plan cache + parameter binding
//! in SQLR-23. This driver uses both:
//!
//! - `query_one` / `query_all` route through [`sqlrite::Connection::prepare_cached`]
//! so a hot SELECT pays the sqlparser walk exactly once across the
//! whole bench loop (cache cap defaults to 16, plenty for any single
//! workload).
//! - `execute_with_params` does the same for INSERT-loop hot paths.
//! - `Value::Vector` binds directly through `Statement::query_with_params`
//! without round-tripping through a 4 KB bracket-array SQL literal —
//! this is the W10/W12 unlock. The HNSW probe optimizer recognizes
//! the bound vector via the same in-band shape an inline `[…]` would
//! produce, so the optimizer hook still kicks in on bound queries.
//!
//! That's how a perf-conscious SQLRite user would write hot-path code
//! today.

use std::path::Path;

@@ -44,20 +54,23 @@ impl Driver for SQLRiteDriver {
sql: &str,
params: &[Value],
) -> Result<()> {
let inlined = inline_params(sql, params)?;
conn.execute(&inlined)
.map_err(|e| anyhow::anyhow!("sqlrite execute_with_params: {e}\n sql: {inlined}"))?;
let bound = to_engine_values(params);
let mut stmt = conn
.prepare_cached(sql)
.map_err(|e| anyhow::anyhow!("sqlrite prepare_cached: {e}\n sql: {sql}"))?;
stmt.execute_with_params(&bound)
.map_err(|e| anyhow::anyhow!("sqlrite execute_with_params: {e}\n sql: {sql}"))?;
Ok(())
}

fn query_one(&self, conn: &mut Self::Conn, sql: &str, params: &[Value]) -> Result<Vec<Value>> {
let inlined = inline_params(sql, params)?;
let bound = to_engine_values(params);
let stmt = conn
.prepare(&inlined)
.map_err(|e| anyhow::anyhow!("sqlrite prepare: {e}\n sql: {inlined}"))?;
.prepare_cached(sql)
.map_err(|e| anyhow::anyhow!("sqlrite prepare_cached: {e}\n sql: {sql}"))?;
let mut rows = stmt
.query()
.map_err(|e| anyhow::anyhow!("sqlrite query: {e}\n sql: {inlined}"))?;
.query_with_params(&bound)
.map_err(|e| anyhow::anyhow!("sqlrite query_with_params: {e}\n sql: {sql}"))?;
let row = rows
.next()
.map_err(|e| anyhow::anyhow!("sqlrite row read: {e}"))?
@@ -84,13 +97,13 @@ impl Driver for SQLRiteDriver {
sql: &str,
params: &[Value],
) -> Result<Vec<Vec<Value>>> {
let inlined = inline_params(sql, params)?;
let bound = to_engine_values(params);
let stmt = conn
.prepare(&inlined)
.map_err(|e| anyhow::anyhow!("sqlrite prepare: {e}\n sql: {inlined}"))?;
.prepare_cached(sql)
.map_err(|e| anyhow::anyhow!("sqlrite prepare_cached: {e}\n sql: {sql}"))?;
let mut rows = stmt
.query()
.map_err(|e| anyhow::anyhow!("sqlrite query: {e}\n sql: {inlined}"))?;
.query_with_params(&bound)
.map_err(|e| anyhow::anyhow!("sqlrite query_with_params: {e}\n sql: {sql}"))?;
let mut out = Vec::new();
while let Some(row) = rows
.next()
@@ -108,49 +121,19 @@ impl Driver for SQLRiteDriver {
}
}

/// Inline `?`-positional placeholders with literal values. Replaces the
/// first `?` with `params[0]`, the second with `params[1]`, etc. Errors
/// if the count doesn't match. Strings are SQL-escaped.
fn inline_params(sql: &str, params: &[Value]) -> Result<String> {
let mut out = String::with_capacity(sql.len() + params.len() * 16);
let mut iter = params.iter();
let mut in_string = false;
for ch in sql.chars() {
if ch == '\'' {
in_string = !in_string;
out.push(ch);
continue;
}
if ch == '?' && !in_string {
let p = iter
.next()
.context("inline_params: more `?` placeholders than params")?;
push_literal(&mut out, p);
} else {
out.push(ch);
}
}
if iter.next().is_some() {
anyhow::bail!("inline_params: more params than `?` placeholders");
}
Ok(out)
/// Map the bench harness's `Value` to SQLRite's engine `Value`. Both
/// enums carry the same logical shapes; this is just a name-mapping.
fn to_engine_values(params: &[Value]) -> Vec<sqlrite::Value> {
params.iter().map(to_engine_value).collect()
}

fn push_literal(out: &mut String, v: &Value) {
fn to_engine_value(v: &Value) -> sqlrite::Value {
match v {
Value::Null => out.push_str("NULL"),
Value::Integer(i) => out.push_str(&i.to_string()),
Value::Real(f) => out.push_str(&format!("{f}")),
Value::Text(s) => {
out.push('\'');
for ch in s.chars() {
if ch == '\'' {
out.push('\'');
}
out.push(ch);
}
out.push('\'');
}
Value::Null => sqlrite::Value::Null,
Value::Integer(i) => sqlrite::Value::Integer(*i),
Value::Real(f) => sqlrite::Value::Real(*f),
Value::Text(s) => sqlrite::Value::Text(s.clone()),
Value::Vector(v) => sqlrite::Value::Vector(v.clone()),
}
}

@@ -160,46 +143,10 @@ fn from_engine_value(v: sqlrite::Value) -> Value {
sqlrite::Value::Integer(i) => Value::Integer(i),
sqlrite::Value::Real(f) => Value::Real(f),
sqlrite::Value::Text(s) => Value::Text(s),
// Bench inputs don't include booleans / vectors / JSON yet —
// when a workload starts using them, this match grows.
sqlrite::Value::Vector(v) => Value::Vector(v),
// Bool / JSON aren't bench `Value` variants yet; no workload
// surfaces them. If a future workload reads one back, grow this
// match alongside the harness `Value` enum.
other => Value::Text(format!("{other:?}")),
}
}

#[cfg(test)]
mod tests {
use super::*;

#[test]
fn inline_params_replaces_in_order() {
let s = inline_params(
"SELECT * FROM t WHERE a = ? AND b = ? AND c = ?",
&[Value::Integer(1), Value::Text("x".into()), Value::Null],
)
.unwrap();
assert_eq!(s, "SELECT * FROM t WHERE a = 1 AND b = 'x' AND c = NULL");
}

#[test]
fn inline_params_preserves_question_marks_in_strings() {
let s =
inline_params("SELECT 'what?', * FROM t WHERE a = ?", &[Value::Integer(7)]).unwrap();
assert_eq!(s, "SELECT 'what?', * FROM t WHERE a = 7");
}

#[test]
fn inline_params_escapes_quotes() {
let s = inline_params(
"SELECT * FROM t WHERE name = ?",
&[Value::Text("O'Hara".into())],
)
.unwrap();
assert_eq!(s, "SELECT * FROM t WHERE name = 'O''Hara'");
}

#[test]
fn inline_params_arity_mismatch_errors() {
assert!(inline_params("SELECT ?", &[]).is_err());
assert!(inline_params("SELECT 1", &[Value::Integer(1)]).is_err());
}
}
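The cached path described in this driver's module docs (parse once, reuse across the bench loop, bounded at 16 entries by default) can be modelled with a small LRU keyed by SQL text. This is a minimal sketch under stated assumptions: `PlanCache`, `Plan`, and the closure-based `parse` hook are hypothetical names, and SQLRite's actual cache internals are not shown in this diff.

```rust
use std::collections::VecDeque;
use std::rc::Rc;

/// Stand-in for a parsed statement (hypothetical; a real cache would hold an AST).
#[derive(Debug, PartialEq)]
struct Plan {
    sql: String,
}

/// Tiny LRU keyed by SQL text; the most recently used entry sits at the back.
struct PlanCache {
    cap: usize,
    entries: VecDeque<(String, Rc<Plan>)>,
    parses: usize, // how many times the parser actually ran
}

impl PlanCache {
    fn new(cap: usize) -> Self {
        assert!(cap > 0, "cache capacity must be non-zero");
        Self { cap, entries: VecDeque::with_capacity(cap), parses: 0 }
    }

    /// Return the cached plan for `sql`, parsing only on a miss.
    fn prepare_cached(&mut self, sql: &str, parse: impl FnOnce(&str) -> Plan) -> Rc<Plan> {
        if let Some(pos) = self.entries.iter().position(|(k, _)| k.as_str() == sql) {
            // Hit: move the entry to the back so it is evicted last.
            let entry = self.entries.remove(pos).expect("position is in bounds");
            let plan = Rc::clone(&entry.1);
            self.entries.push_back(entry);
            return plan;
        }
        // Miss: parse once, evicting the least recently used entry if full.
        self.parses += 1;
        let plan = Rc::new(parse(sql));
        if self.entries.len() == self.cap {
            self.entries.pop_front();
        }
        self.entries.push_back((sql.to_string(), Rc::clone(&plan)));
        plan
    }
}
```

A linear scan is adequate at a capacity of 16; a larger cache would likely pair a hash map with an ordering structure. Keying on the unmodified SQL text is also why the driver now passes `sql` straight through rather than an inlined string: inlining parameters would produce a different key on every call and defeat the cache.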
19 changes: 14 additions & 5 deletions benchmarks/src/lib.rs
@@ -41,18 +41,27 @@ pub mod workloads;

/// Driver-side value type. Tight enough that any of the engines under
/// test can map it onto their native bind types — rusqlite has
/// [`rusqlite::ToSql`], DuckDB has its own; SQLRite has no parameter
/// binding yet so the SQLRite driver inlines via SQL formatting.
/// [`rusqlite::ToSql`], DuckDB has its own. SQLRite gained parameter
/// binding in SQLR-23 (incl. `Value::Vector` for HNSW-eligible KNN
/// queries), so the SQLRite driver now binds through
/// `Statement::query_with_params` / `Statement::execute_with_params`
/// instead of formatting a SQL string per call.
///
/// Deliberately doesn't carry every type the engines support
/// (booleans, vectors, JSON, blobs); workload inputs only need these
/// four. New variants land alongside the workload that needs them.
/// `Vector` is SQLRite-only: the comparator drivers (SQLite, DuckDB)
/// return an error rather than silently coercing the type, since the
/// W10/W12 workloads that exercise it are explicitly SQLRite-only via
/// `driver_supports`.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
Null,
Integer(i64),
Real(f64),
Text(String),
/// Dense f32 query vector, bound directly into VECTOR columns
/// or distance-function args. SQLRite-only; comparator drivers
/// return an error if a workload tries to bind a vector
/// against them.
Vector(Vec<f32>),
}

/// Engine-agnostic surface every workload binds to.