
feat(sqlite): embedded SQLite via Turso (Phase 1 + Phase 2)#1502

Merged: chaliy merged 5 commits into main from claude/turso-sqlite-evaluation-dKt1F on May 2, 2026



@chaliy chaliy commented May 2, 2026

Summary

Adds a sqlite / sqlite3 builtin behind a new sqlite cargo feature, backed by turso_core — a pure-Rust SQLite-compatible engine. The CLI ships with the feature on by default (mirroring python and git), so bashkit -c "sqlite :memory: 'SELECT 1'" works out of the box.

Two IO backends share a single builtin surface:

  • Phase 1 (Backend::Memory, default) — turso's MemoryIO with whole-file load/flush against the bashkit VFS at command boundaries.
  • Phase 2 (Backend::Vfs) — custom BashkitVfsIO that plugs Arc<dyn FileSystem> into turso's IO trait. Same observable semantics, exercises the IO trait path. Selectable per-invocation (-backend vfs) or per-builder (SqliteLimits::backend(SqliteBackend::Vfs)).
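
The Phase 1 "whole-file load/flush at command boundaries" flow can be pictured roughly as follows. This is a sketch only: `MemVfs` and `run_command` are hypothetical stand-ins invented for illustration, not bashkit or turso APIs, and the closure merely simulates the engine mutating its in-memory image.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the bashkit VFS: path -> file bytes.
#[derive(Default)]
pub struct MemVfs {
    files: HashMap<String, Vec<u8>>,
}

impl MemVfs {
    pub fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.files.get(path).cloned()
    }

    pub fn write(&mut self, path: &str, bytes: Vec<u8>) {
        self.files.insert(path.to_string(), bytes);
    }
}

/// Runs one "command" against a database image using whole-file load/flush.
/// `work` stands in for the engine mutating its in-memory image.
pub fn run_command<F>(
    vfs: &mut MemVfs,
    path: &str,
    max_db_bytes: usize,
    work: F,
) -> Result<(), String>
where
    F: FnOnce(&mut Vec<u8>),
{
    // ":memory:" databases never touch the VFS.
    let persistent = path != ":memory:";

    // Load: copy the whole file into memory when the command starts.
    let mut image = if persistent {
        vfs.read(path).unwrap_or_default()
    } else {
        Vec::new()
    };
    if image.len() > max_db_bytes {
        return Err(format!("database exceeds {} bytes", max_db_bytes));
    }

    work(&mut image);

    // Flush: write the whole image back when the command ends.
    if persistent {
        vfs.write(path, image);
    }
    Ok(())
}
```

In this sketch the size cap is applied at load time, mirroring the "at VFS load" mechanism listed for `max_db_bytes`; whether the real builtin also re-checks at flush is not stated in the PR.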

Full sqlite3-shell-compatible CLI surface:

  • Output modes: -csv, -json, -tabs, -line, -column, -box, -markdown, -list (default)
  • Flags: -header/-noheader, -separator, -nullvalue, -cmd, -backend
  • Dot-commands: .tables, .schema, .indexes, .dump, .read, .headers, .mode, .separator, .nullvalue, .help, .quit/.exit
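
To make the difference between two of the output modes concrete, here is a minimal formatter sketch. The `Mode` enum, `render_row`, and `csv_field` are hypothetical names, the quoting rule is the conventional CSV one, and printing NULL as the `-nullvalue` string in both modes is an assumption rather than a statement about the builtin's formatter.

```rust
/// Hypothetical subset of the output modes listed above.
pub enum Mode {
    /// Fields joined by the active separator.
    List,
    /// Comma-separated, quoting fields that need it.
    Csv,
}

/// Renders one row. `None` models a SQL NULL and is printed as `nullvalue`.
pub fn render_row(mode: &Mode, row: &[Option<&str>], separator: &str, nullvalue: &str) -> String {
    match mode {
        Mode::List => row
            .iter()
            .map(|f| f.unwrap_or(nullvalue).to_string())
            .collect::<Vec<_>>()
            .join(separator),
        // CSV ignores `separator` in this sketch and always uses a comma.
        Mode::Csv => row
            .iter()
            .map(|f| csv_field(f.unwrap_or(nullvalue)))
            .collect::<Vec<_>>()
            .join(","),
    }
}

/// Quotes a field if it contains a comma, quote, or line break,
/// doubling any embedded quotes.
fn csv_field(s: &str) -> String {
    if s.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", s.replace('"', "\"\""))
    } else {
        s.to_string()
    }
}
```

For instance, `render_row(&Mode::List, &[Some("1"), None], "|", "NULL")` yields `1|NULL` in this sketch.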

Why

LLM agents and scripts increasingly want a real SQL surface for caching, eval results, and structured intermediate state. Shipping a bashkit-native SQLite (rather than spawning a host sqlite3) keeps everything inside the sandbox and inside the VFS, with deterministic clocks and bounded resource usage.

How

  • crates/bashkit/src/builtins/sqlite/: builtin module (mod, engine, vfs_io, formatter, parser, dot_commands, tests)
  • BashBuilder::sqlite() / sqlite_with_limits() registers both sqlite and sqlite3 aliases
  • Runtime gated by BASHKIT_ALLOW_INPROCESS_SQLITE=1 (auto-injected by the CLI)
  • Pure-Rust dep tree: 180 transitive crates, all gated behind the sqlite feature
  • CLI default-on with --no-sqlite to disable per-run
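
The runtime gate and the CLI's auto-injection might interact along these lines. Only the variable name comes from the PR; `check_gate`, `cli_env`, the `HashMap` view of the shell environment, and the error text are assumptions made for the sketch.

```rust
use std::collections::HashMap;

/// Name of the opt-in variable quoted in the PR description.
pub const GATE_VAR: &str = "BASHKIT_ALLOW_INPROCESS_SQLITE";

/// Returns Ok only when the interpreter's environment carries the opt-in.
/// `env` is a hypothetical view of the shell's variables, not the host env.
pub fn check_gate(env: &HashMap<String, String>) -> Result<(), String> {
    match env.get(GATE_VAR).map(String::as_str) {
        Some("1") => Ok(()),
        // Illustrative error text, not the builtin's actual message.
        _ => Err(format!("sqlite: disabled; set {}=1 to enable", GATE_VAR)),
    }
}

/// Mimics the described CLI behaviour: inject the opt-in unless
/// `--no-sqlite` was passed.
pub fn cli_env(no_sqlite: bool) -> HashMap<String, String> {
    let mut env = HashMap::new();
    if !no_sqlite {
        env.insert(GATE_VAR.to_string(), "1".to_string());
    }
    env
}
```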

Limits enforced

| Limit | Default | Mechanism |
| --- | --- | --- |
| max_script_bytes | 4 MiB | early return before splitting |
| max_rows_per_query | 1,000,000 | post-materialisation check |
| max_db_bytes | 256 MiB | at VFS load |
| max_duration | 30 s | shared Deadline checked on every step; calls Statement::interrupt() |
| max_statements | 10,000 | after splitting, before run |
| MAX_DOT_READ_DEPTH | 16 (const) | recursion guard in run_statements |
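
The ordering of the two script-level checks (size before splitting, statement count after splitting and before execution) can be sketched as below. `Limits`, `split_statements`, and `prepare` are hypothetical; the splitter is deliberately tiny and only understands single-quoted strings, unlike whatever splitter the builtin actually uses.

```rust
/// Hypothetical mirror of two of the limits in the table above.
pub struct Limits {
    pub max_script_bytes: usize,
    pub max_statements: usize,
}

/// Splits on `;` outside single-quoted strings. Deliberately small:
/// it ignores comments and double-quoted identifiers.
pub fn split_statements(script: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut current = String::new();
    let mut in_string = false;
    for ch in script.chars() {
        match ch {
            '\'' => {
                // A doubled '' escape toggles twice, so the state is unchanged.
                in_string = !in_string;
                current.push(ch);
            }
            ';' if !in_string => {
                if !current.trim().is_empty() {
                    out.push(current.trim().to_string());
                }
                current.clear();
            }
            _ => current.push(ch),
        }
    }
    if !current.trim().is_empty() {
        out.push(current.trim().to_string());
    }
    out
}

/// Applies the size cap first, then splits, then applies the count cap.
pub fn prepare(script: &str, limits: &Limits) -> Result<Vec<String>, String> {
    // Early return: an oversized script is never scanned.
    if script.len() > limits.max_script_bytes {
        return Err(format!("script exceeds {} bytes", limits.max_script_bytes));
    }
    let statements = split_statements(script);
    // Checked after splitting but before anything runs.
    if statements.len() > limits.max_statements {
        return Err(format!("script exceeds {} statements", limits.max_statements));
    }
    Ok(statements)
}
```

Checking the byte cap first keeps the cost of rejecting a huge input constant, while the statement cap catches scripts that stay small in bytes but consist of very many short statements.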

Security posture

  • Off by default at the cargo level for the library crate (the CLI turns the feature on); a runtime opt-in env var is required on top
  • All paths resolve through Arc<dyn FileSystem> — Phase 2 IO is bound to that FS only, no host filesystem access
  • No ATTACH/DETACH (sandbox isolation)
  • Bounded .read recursion + wall-clock deadline
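
A depth-bounded `.read` could look something like the following. The constant's name and value come from the limits table; `run_script`, the `HashMap` standing in for the VFS, the log of "executed" statements, and the exact comparison used at the boundary are all assumptions for the sketch.

```rust
use std::collections::HashMap;

pub const MAX_DOT_READ_DEPTH: usize = 16;

/// Executes a script where each line is either `.read <path>` or a statement.
/// `files` is a hypothetical in-memory stand-in for the VFS; "executed"
/// statements are appended to `log` instead of reaching a real engine.
pub fn run_script(
    files: &HashMap<String, String>,
    script: &str,
    depth: usize,
    log: &mut Vec<String>,
) -> Result<(), String> {
    // Recursion guard: refuse to descend past the cap.
    if depth > MAX_DOT_READ_DEPTH {
        return Err(format!(".read nesting exceeds {}", MAX_DOT_READ_DEPTH));
    }
    for line in script.lines().map(str::trim).filter(|l| !l.is_empty()) {
        if let Some(path) = line.strip_prefix(".read ") {
            let path = path.trim();
            let nested = files
                .get(path)
                .ok_or_else(|| format!("cannot open {}", path))?;
            run_script(files, nested, depth + 1, log)?;
        } else {
            log.push(line.to_string());
        }
    }
    Ok(())
}
```

A file that reads itself terminates with an error after a bounded number of nested calls instead of overflowing the stack.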

Test coverage

  • 78 unit tests in src/builtins/sqlite/tests.rs (positive, negative, every flag, every dot-command, every output mode, opt-in gate, recursion cap, oversize input, backend equivalence, proptest splitter)
  • 14 integration tests in tests/sqlite_integration_tests.rs (Bash::exec end-to-end: pipelines, redirection, env expansion, .read of a heredoc-built VFS file, .dump/.read round-trip, both backends)
  • 9 threat-model tests in tests/sqlite_security_tests.rs (TM-SQL-001…009 incl. recursive .read overflow guard and host-fs-isolation checks)
  • 8 sqlite3 parity tests in tests/sqlite_compat_tests.rs
  • 4 proptest fuzz harnesses in tests/sqlite_fuzz_tests.rs (no panic on arbitrary SQL, no host file leak, CSV well-formedness, no :memory: artifacts on the VFS)
  • 2 CLI tests in crates/bashkit-cli/src/main.rs (default-on / --no-sqlite)

Examples runnable in CI

  • examples/sqlite_basic.rs — minimal create/insert/select round-trip
  • examples/sqlite_workflow.rs — 8-step end-to-end demo (CRUD → JSON → CSV pipelined into wc → .dump/.read round-trip → markdown reporting → schema introspection → tightened limits). Each step asserts on observed output.
  • Both wired into .github/workflows/ci.yml examples job; sqlite added to the test job's feature list.

Docs and specs

  • specs/sqlite-builtin.md — full design + threat table + test plan
  • crates/bashkit/docs/sqlite.md — rustdoc guide via pub mod sqlite_guide
  • README.md — experimental SQLite section + cargo add snippet + builtin table entry
  • docs/cli.md — feature table, default-enabled list, --no-sqlite flag, example
  • AGENTS.md — added sqlite-builtin row to specs table
  • specs/implementation-status.md — bumped feature-gated count, added sqlite/sqlite3 row
  • crates/bashkit/src/lib.rs — preamble feature list + builtins table

Test plan

  • CI green (lint, audit, test, examples, fuzz-check)
  • cargo test --features sqlite -p bashkit --lib → 2274 passed
  • cargo test --features sqlite -p bashkit --test sqlite_* → 35 passed
  • cargo test -p bashkit-cli → 95 passed
  • cargo run --example sqlite_basic --features sqlite -p bashkit → ok
  • cargo run --example sqlite_workflow --features sqlite -p bashkit → all 8 steps ok
  • cargo build -p bashkit-cli → CLI ships with sqlite default-on
  • ./target/debug/bashkit -c "sqlite :memory: 'SELECT 1'" → 1
  • cargo tree -p bashkit --no-default-features confirms turso is not pulled in without the feature
  • cargo fmt --all -- --check clean

Generated by Claude Code

chaliy added 3 commits May 2, 2026 03:56
Introduces `sqlite`/`sqlite3` builtins behind a new `sqlite` cargo feature,
backed by `turso_core` (pure-Rust SQLite-compatible engine). Two IO
backends ship together:

- Phase 1 (default): turso's `MemoryIO` with whole-file load/flush against
  the bashkit VFS at command boundaries.
- Phase 2: `BashkitVfsIO` plugs the bashkit `FileSystem` into turso via a
  custom `IO` impl, bridged sync→async through a per-call OS thread.

Full sqlite3-shell-compatible CLI surface (list/csv/tabs/line/box/column/
json/markdown modes, `-header`/`-separator`/`-nullvalue`/`-cmd`/`-backend`
flags, dot-commands `.tables` `.schema` `.indexes` `.dump` `.read`
`.headers` `.mode` `.separator` `.nullvalue` `.help` `.quit`/`.exit`).

Security:
- Off by default at the cargo level.
- Runtime-gated via `BASHKIT_ALLOW_INPROCESS_SQLITE=1`.
- Per-invocation caps on script size, result-set rows, and DB file size.
- `.read` recursion bounded to MAX_DOT_READ_DEPTH=16.
- All paths resolve through `Arc<dyn FileSystem>` — no host FS access.

Test coverage (new):
- 74 unit tests in `src/builtins/sqlite/tests.rs` (positive, negative,
  output formatting, opt-in gate, backend equivalence, proptest splitter).
- 14 integration tests driving `Bash::exec` end-to-end.
- 9 threat-model tests (TM-SQL-001…009) including stack-overflow guard
  and host-fs-isolation checks.
- 8 sqlite3 parity tests pinning behaviour against the reference shell.
- 4 proptest harnesses (no-panic on arbitrary SQL, no host file leak,
  CSV well-formedness, no `:memory:` artifacts).

Specs: new `specs/sqlite-builtin.md`, registered in `AGENTS.md`. Docs:
new `crates/bashkit/docs/sqlite.md` rustdoc guide and `sqlite_basic`
example.
… examples

Three follow-ups to the embedded SQLite work:

1. **More limits.** SqliteLimits now exposes:
   - max_duration (30 s default) — wall-clock budget shared across every
     statement in an invocation. The step loop checks the deadline on each
     iteration and calls Statement::interrupt() once it expires. Pass
     Duration::ZERO to opt out (useful for slow CI hosts).
   - max_statements (10 000 default) — guards against statement-flood DoS
     where the script stays under the byte cap with millions of `;`.
   New tests:
   - too_many_statements_rejected
   - deadline_zero_means_unlimited
   - deadline_already_expired_aborts_with_timeout
   - limits_builder_round_trips

2. **CLI default-on parity with python.**
   - bashkit-cli default features now include `sqlite`.
   - configure_bash() registers .sqlite() and auto-injects
     BASHKIT_ALLOW_INPROCESS_SQLITE=1 unless --no-sqlite is passed.
   - New tests: sqlite_enabled_by_default / sqlite_can_be_disabled.
   - `bashkit -c "sqlite :memory: 'SELECT 1'"` now works out of the box.

3. **CI-runnable examples.**
   - New examples/sqlite_workflow.rs: 8-step end-to-end demo (CRUD, JSON,
     CSV pipelining, .dump→.read round-trip, markdown reporting, schema
     introspection). Each step asserts on its observed output so a
     regression breaks the example, not just a smoke test.
   - .github/workflows/ci.yml builds examples with `--features
     "git,http_client,ssh,sqlite"` and runs both sqlite_basic and
     sqlite_workflow.
   - The unit/integration `cargo test` step also picks up `sqlite` in its
     feature list so the new tests run in CI.
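
The shared wall-clock budget from item 1, including the `Duration::ZERO` opt-out, might be modelled as below. `Deadline` here is a hypothetical type written for this sketch, and `run_steps` only simulates a step loop; the real builtin is described as calling `Statement::interrupt()` at the point where the sketch returns an error.

```rust
use std::time::{Duration, Instant};

/// Hypothetical wall-clock budget shared by every statement in one invocation.
pub struct Deadline {
    /// `None` means "no limit".
    expires_at: Option<Instant>,
}

impl Deadline {
    pub fn starting_now(max_duration: Duration) -> Self {
        // Duration::ZERO opts out of the budget, as the commit message describes.
        let expires_at = if max_duration.is_zero() {
            None
        } else {
            Some(Instant::now() + max_duration)
        };
        Deadline { expires_at }
    }

    pub fn expired(&self) -> bool {
        match self.expires_at {
            Some(t) => Instant::now() >= t,
            None => false,
        }
    }
}

/// Step loop: `steps` stands in for repeatedly stepping a prepared statement.
/// Returns the number of completed steps, or a timeout error.
pub fn run_steps(deadline: &Deadline, steps: usize) -> Result<usize, String> {
    for done in 0..steps {
        // The deadline is checked on every iteration.
        if deadline.expired() {
            return Err(format!("timeout after {} steps", done));
        }
    }
    Ok(steps)
}
```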

Specs and docs updated with the new limits and the CLI section.

Surface the new sqlite/sqlite3 builtin in the user-facing docs that the
previous commits missed:

- README.md: add the experimental SQLite bullet, the `cargo add` snippet,
  the builtins table entry, and a full quick-start section with CLI flags
  + dot-commands.
- docs/cli.md: add `sqlite` to the default-feature table, list it under
  default-enabled builtins (noting the auto-injected env opt-in),
  document `--no-sqlite`, and add a one-shot example.
- specs/implementation-status.md: bump feature-gated builtin count from
  12 to 14 and append the sqlite/sqlite3 row.
cloudflare-workers-and-pages Bot commented May 2, 2026

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ Deployment successful! (View logs) | bashkit | 4b81d03 | Commit Preview URL, Branch Preview URL | May 02 2026, 04:28 AM |

chaliy added 2 commits May 2, 2026 04:13
CI feedback from PR #1502:

- Lint: clippy flags `quotes % 2 == 0` (the new fuzz-test invariant) as
  `manual_is_multiple_of`, which fails under `-D warnings`. Switch to
  `quotes.is_multiple_of(2)`.

- Audit: turso pulls 180 transitive crates that need supply-chain audits
  before `cargo vet --locked` accepts them. Regenerate the exemption
  list via `cargo vet regenerate exemptions` (`safe-to-deploy`).
  Verified locally: vetting succeeds (1 fully audited, 7 partially
  audited, 607 exempted).

CI runs on the just-released stable rustc 1.95, which adds
`collapsible_match` for an `if` inside a single match arm. Inline the
csv/tabs separator-restore guard into the outer match so the lint passes
while keeping the same behaviour (only restore `|` when the previous
mode had clobbered the separator).

Locally `cargo clippy --workspace --all-targets --all-features --
-D warnings` is green on rustc 1.95.0; sqlite dot-command tests still
pass.
@chaliy chaliy merged commit 62e62fe into main May 2, 2026
34 checks passed
@chaliy chaliy deleted the claude/turso-sqlite-evaluation-dKt1F branch May 2, 2026 14:54

codecov Bot commented May 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


chaliy added a commit that referenced this pull request May 2, 2026
…1507)

## Summary

Three small, independent hardenings to the embedded sqlite builtin
(#1502 follow-up):

1. **Dependabot rule for turso** — `.github/dependabot.yml` excludes
`turso_*` from the weekly aggregated rollup so each bump (and its 30+
transitive crates) gets a standalone PR for manual review. Turso is BETA
upstream; we don't want it riding on an unattended weekly group.
2. **`ATTACH` / `DETACH` rejected by policy** — both keywords blocked
before SQL reaches turso. Cross-DB access would bypass VFS isolation.
The check is comment- and case-aware via a lightweight
`parser::leading_keyword` helper.
3. **`SqliteLimits::pragma_deny`** — pre-execution check refuses
resource/FS-shaped PRAGMAs by default: `cache_size`, `mmap_size`,
`page_size`, `max_page_count`, `temp_store_directory`,
`data_store_directory`, `compile_options`, `locking_mode`,
`shared_cache`. Common operational PRAGMAs (`user_version`,
`wal_checkpoint`, `foreign_keys`, `journal_mode`) pass through.
Schema-qualified PRAGMAs (`PRAGMA main.cache_size`) also match. Override
via `SqliteLimits::pragma_deny([])`.
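
A comment- and case-aware keyword check in the spirit of item 2 could be sketched as follows. The function names echo the helpers mentioned in this PR, but the bodies are an independent illustration written for this note, not the code that was merged; `is_blocked` is a hypothetical wrapper.

```rust
/// Skips leading whitespace, `-- line` comments, and `/* block */` comments.
fn strip_leading_noise(mut sql: &str) -> &str {
    loop {
        sql = sql.trim_start();
        if let Some(rest) = sql.strip_prefix("--") {
            // Line comment: drop everything up to the newline (or the end).
            sql = rest.split_once('\n').map_or("", |(_, tail)| tail);
        } else if let Some(rest) = sql.strip_prefix("/*") {
            // Block comment: an unterminated one swallows the rest.
            sql = rest.split_once("*/").map_or("", |(_, tail)| tail);
        } else {
            return sql;
        }
    }
}

/// First keyword of a statement, upper-cased; empty if there is none.
pub fn leading_keyword(sql: &str) -> String {
    strip_leading_noise(sql)
        .chars()
        .take_while(|c| c.is_ascii_alphabetic() || *c == '_')
        .collect::<String>()
        .to_ascii_uppercase()
}

/// Policy decision only: true when the statement starts with ATTACH or DETACH.
pub fn is_blocked(sql: &str) -> bool {
    matches!(leading_keyword(sql).as_str(), "ATTACH" | "DETACH")
}
```

Because only the leading keyword is inspected, a string literal such as `SELECT 'ATTACH'` is not rejected, while `/* c */ DeTaCh y` is.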

## Why

These are the three lowest-risk items from the post-launch plan: a
Dependabot rule keeps a BETA dep on a tighter leash, and the two policy
checks close known bypasses (ATTACH escaping the VFS, PRAGMAs
sidestepping the resource caps).

## How

- New module-private helpers `parser::leading_keyword`,
`parser::pragma_name`, `parser::strip_leading_noise` — small lexer,
comment- and case-aware, used for *policy decisions only* (real parsing
stays with turso).
- `check_sql_policy(&sql, limits)` runs in the dispatch loop, rejecting
before `engine.execute`.
- `SqliteLimits` gains `pragma_deny: Vec<String>` (defaults populated
from `DEFAULT_PRAGMA_DENY`); builder method normalises to lower-case.
- TM rows added to spec; rustdoc guide section updated.
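
The PRAGMA side of the policy check, including the schema-qualified form, might be approximated like this. The deny list below is a shortened copy of the one in the summary; `pragma_name` borrows the helper's name but its body is a guess at the behaviour, and `check_pragma_policy` is a hypothetical function with a simpler signature than the `check_sql_policy(&sql, limits)` mentioned above. The sketch assumes leading comments were already stripped.

```rust
/// Shortened copy of the default deny list quoted in the summary.
pub const DEFAULT_PRAGMA_DENY: &[&str] =
    &["cache_size", "mmap_size", "page_size", "max_page_count"];

/// Extracts the lower-cased PRAGMA name, dropping a `schema.` qualifier.
/// Returns None when the statement is not a PRAGMA.
pub fn pragma_name(sql: &str) -> Option<String> {
    let trimmed = sql.trim_start();
    // `get` avoids a panic on short input or a non-ASCII prefix.
    let head = trimmed.get(..6)?;
    if !head.eq_ignore_ascii_case("pragma") {
        return None;
    }
    let rest = &trimmed[6..];
    // Require whitespace after the keyword so `pragmatic` is not matched.
    if !rest.starts_with(char::is_whitespace) {
        return None;
    }
    let qualified: String = rest
        .trim_start()
        .chars()
        .take_while(|c| c.is_ascii_alphanumeric() || *c == '_' || *c == '.')
        .collect();
    // `main.cache_size` -> `cache_size`
    let name = qualified.rsplit('.').next().unwrap_or("");
    if name.is_empty() {
        None
    } else {
        Some(name.to_ascii_lowercase())
    }
}

/// Rejects a statement before execution if it is a denied PRAGMA.
pub fn check_pragma_policy(sql: &str, deny: &[&str]) -> Result<(), String> {
    match pragma_name(sql) {
        Some(name) if deny.contains(&name.as_str()) => {
            Err(format!("PRAGMA {} is denied by policy", name))
        }
        _ => Ok(()),
    }
}
```

Passing an empty deny list lets every PRAGMA through, which corresponds to the `pragma_deny([])` override described in the summary.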

## Tests

- 11 new unit tests + 6 new parser helper tests in the sqlite module.
- 2 new TM cases (`tm_sql_009`, `tm_sql_010`) in
`tests/sqlite_security_tests.rs`.
- All previous green tests still green: **2289 lib + 14 integration + 11
security + 8 compat + 4 fuzz + 95 CLI**.

## Test plan

- [x] `cargo fmt --all -- --check`
- [x] `cargo clippy --workspace --all-targets --all-features -- -D
warnings`
- [x] `cargo test --features sqlite -p bashkit --lib` → 2289 passed
- [x] `cargo test --features sqlite -p bashkit --test sqlite_*` → 37
passed
- [x] `cargo test -p bashkit-cli` → 95 passed
- [x] `cargo run --example sqlite_basic --features sqlite -p bashkit` →
ok
- [x] `cargo run --example sqlite_workflow --features sqlite -p bashkit`
→ all 8 steps ok
- [x] `cargo vet --locked` → succeeds
- [ ] CI green

---
_Generated by [Claude
Code](https://claude.ai/code/session_018ERUz5RrSCoZ5Hjku3URK2)_
chaliy added a commit that referenced this pull request May 2, 2026
## Summary

Expose the embedded SQLite builtin (#1502) to the Python (PyO3) and
JavaScript (NAPI) SDKs by mirroring the existing `python=True` opt-in
pattern. PR B of the post-launch plan.

## Why

After #1502 / #1507, the builtin only reached users via the Rust API or
the `bashkit` CLI. Anyone consuming bashkit through the language
bindings (the primary surface for LLM agent integrations) had to
hand-roll `Bash::builder().sqlite()` or shell out to the CLI. This
closes the gap.

## How

### Python (`bashkit-python`)

- `Bash.__init__(..., sqlite: bool = False)` and the matching
`from_snapshot` static constructor accept the new keyword.
- New `apply_sqlite_config()` helper, called from both `new()` and the
`reset()` rebuild path. Sets up the builtin and injects
`BASHKIT_ALLOW_INPROCESS_SQLITE=1` so the runtime gate is satisfied
transparently — same shape as `apply_python_config()`.
- The flag survives across `reset()` (covered by
`test_sqlite_survives_reset`).
- `_bashkit.pyi` updated with the keyword + docstring describing the
default policy (4 MiB script cap, 256 MiB DB cap, 30 s deadline, ATTACH
and resource-PRAGMAs rejected).

### JavaScript (`bashkit-js`)

- `BashOptions.sqlite?: boolean` added to the public option struct.
- `default_opts()` includes it; `shared_state_from_opts()` threads it
through `SharedState`.
- Builder hook applies `.sqlite()` + env opt-in symmetrically to the
existing python branch.
- `.d.ts` regenerated by NAPI from the struct (no manual stub).

### Cargo features

Both `bashkit-python/Cargo.toml` and `bashkit-js/Cargo.toml` now include
`sqlite` in the `bashkit` feature list. Neither had previously enabled
it, so the binding now actually pulls in turso when built.

## Tests

- `tests/test_sqlite.py` (Python, 12 cases): opt-in gate, basic queries
(CRUD, headers, CSV, JSON), VFS persistence, dot-commands (`.tables`,
`.dump`/`.read` round-trip), security policy surfacing (ATTACH/DETACH,
PRAGMA deny), reset preservation.
- `__test__/runtime-compat/sqlite.test.mjs` (JS, 8 cases): same shape
from JS.
- All previously-green tests still green: **2289 lib + 95 CLI**.
- `ruff check` and `ruff format --check` clean.
- `cargo vet --locked` succeeds.

## Test plan

- [x] `cargo build -p bashkit-python` → ok
- [x] `cargo build -p bashkit-js` → ok
- [x] `cargo fmt --all -- --check`
- [x] `cargo clippy --workspace --all-targets --all-features -- -D
warnings`
- [x] `cargo test --features sqlite -p bashkit --lib` → 2289 passed
- [x] `cargo test -p bashkit-cli` → 95 passed
- [x] `cargo vet --locked` → succeeds
- [x] `ruff check crates/bashkit-python && ruff format --check
crates/bashkit-python` → clean
- [ ] CI green (Python + JS suites)

---
_Generated by [Claude
Code](https://claude.ai/code/session_018ERUz5RrSCoZ5Hjku3URK2)_