Skip to content

feat(deltagraph): Bolt e2e boot test + QUICKSTART docs (Phase 4.3)#337

Merged
genezhang merged 2 commits into
mainfrom
feat/bolt-databricks-e2e
May 17, 2026
Merged

feat(deltagraph): Bolt e2e boot test + QUICKSTART docs (Phase 4.3)#337
genezhang merged 2 commits into
mainfrom
feat/bolt-databricks-e2e

Conversation

@genezhang
Copy link
Copy Markdown
Owner

Summary

Ships Phase 4.3 of the DeltaGraph plan — automated regression coverage that deltagraph boots end-to-end in Databricks mode and exposes Bolt unchanged, plus a manual Neo4j Browser walkthrough that turns this into the user-promised killer demo.

What ships

Automated test (tests/rust/bin/deltagraph_bolt_e2e.rs):
Spawns the real deltagraph subprocess against a wiremock Databricks endpoint, waits up to 10s for the Bolt port to bind, then completes the Bolt v5 handshake (4-byte magic + 4 version slots → server picks a version). Proves:

  • server::run_with_config's Databricks dispatch actually builds an executor from DATABRICKS_* env vars
  • HTTP + Bolt servers boot to a listening state — no ClickHouse-specific init was tripped by mistake when databricks=true
  • Bolt protocol negotiation works against a Databricks-backed AppState

Runs in ~110ms on a developer laptop.

Server change (src/server/mod.rs): build_databricks_config now reads an optional DATABRICKS_BASE_URL env var that redirects the executor's HTTP client at a wiremock URL. Documented in-source as test-only; production users leave it unset.

Docs (docs/deltagraph/QUICKSTART.md): end-user walkthrough — build with --features databricks, configure DATABRICKS_HOST/WAREHOUSE_ID/TOKEN (token is env-only — never a flag), point Neo4j Browser at bolt://localhost:7687, run sample Cypher (count, top followers, friends-of-friends VLP), inspect generated Spark SQL via --sql-only or cg, troubleshoot common failures.

What's deferred to a follow-up

Driving an actual RUN/PULL query through Bolt end-to-end would require neo4rs as a dev-dep (+ its tokio/tower/h2 transitive tree). The current scope is the cheapest credible regression net for "deltagraph boots + speaks Bolt"; the next natural step is a query-level e2e that asserts a MATCH (u) RETURN ... round-trips through Bolt to wiremock with collect_list( in the SQL body.

Tests

$ cargo test --features databricks --test deltagraph_bolt_e2e   # 1/1 passes in ~110ms
$ cargo test --features databricks --test deltagraph_bin        # 2/2 smoke tests still pass
$ cargo clippy --all-targets --features databricks              # 0 warnings
$ cargo clippy --all-targets                                    # 0 warnings (no-feature build)
$ cargo fmt --all --check                                       # clean

Test plan

  • cargo build --bin deltagraph --features databricks produces working binary
  • cargo test --features databricks --test deltagraph_bolt_e2e — Bolt handshake passes
  • cargo test --features databricks --test deltagraph_bin — existing smoke tests unaffected
  • cargo clippy --all-targets — 0 warnings, both feature flavors
  • cargo fmt --all --check clean
  • QUICKSTART.md walks through the Neo4j Browser demo path with concrete sample queries

🤖 Generated with Claude Code

Ships Phase 4.3 of the DeltaGraph plan — automated regression coverage
that the deltagraph server boots end-to-end in Databricks mode and
exposes the Bolt protocol unchanged, plus a manual Neo4j Browser
walkthrough that turns this into the user-promised killer demo.

**Test (tests/rust/bin/deltagraph_bolt_e2e.rs)**

Spawns the actual `deltagraph` subprocess against a wiremock Databricks
endpoint, waits up to 10s for the Bolt port to bind, then completes
the Bolt v5 handshake (4-byte magic + 4 version slots → server picks a
version). Asserts the server returned a non-zero version. Proves:

  1. server::run_with_config's Databricks dispatch actually builds an
     executor from DATABRICKS_* env vars,
  2. HTTP + Bolt servers boot to a listening state — no ClickHouse-
     specific init was tripped by mistake,
  3. Bolt protocol negotiation works against a Databricks-backed
     AppState — the listener is wired, not just bound.

Driving an actual RUN/PULL query through Bolt is deferred to a
follow-up (would require neo4rs as a dev-dep); the current scope is
the cheapest credible regression net for "deltagraph boots + speaks
Bolt." Runs in ~110ms on a developer laptop.

To support the test, build_databricks_config in src/server/mod.rs now
reads an optional DATABRICKS_BASE_URL env var that redirects the
executor's HTTP client at the wiremock URL. Production users leave it
unset; documented in-source as test-only.

**Docs (docs/deltagraph/QUICKSTART.md)**

End-user walkthrough: build deltagraph with --features databricks,
configure DATABRICKS_HOST/WAREHOUSE_ID/TOKEN (token is env-only — never
a flag), point Neo4j Browser at bolt://localhost:7687, run sample
Cypher (count, top followers, friends-of-friends VLP), inspect the
generated Spark SQL via --sql-only or `cg`, troubleshoot the most
common failure modes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 17, 2026 03:10
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds end-to-end regression coverage and user-facing documentation for DeltaGraph (the deltagraph binary in databricks mode), ensuring the server boots with a Databricks-backed AppState and exposes a functioning Bolt listener.

Changes:

  • Introduces a new Rust integration test that spawns deltagraph, waits for the Bolt port, and completes the Bolt handshake.
  • Extends Databricks config resolution to support a test-oriented DATABRICKS_BASE_URL override (to redirect executor HTTP to wiremock).
  • Adds a DeltaGraph QUICKSTART guide for building, configuring, and using Neo4j Browser + cg with Databricks.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
tests/rust/bin/deltagraph_bolt_e2e.rs New Bolt-level boot/handshake e2e test for the deltagraph binary in Databricks mode.
src/server/mod.rs Adds optional DATABRICKS_BASE_URL env override when building DatabricksConfig.
docs/deltagraph/QUICKSTART.md New end-user quickstart for DeltaGraph (build, env vars, Browser demo, troubleshooting).
Cargo.toml Registers the new deltagraph_bolt_e2e test target (gated on databricks).

Comment thread tests/rust/bin/deltagraph_bolt_e2e.rs Outdated
Comment on lines +41 to +46
// Use unusual port range to avoid colliding with a dev server on the
// default 7475/7687. If a parallel test invocation needs different
// ports, factor this into an allocator — for one test this is fine.
const HTTP_PORT: u16 = 17_475;
const BOLT_PORT: u16 = 17_687;

Comment thread tests/rust/bin/deltagraph_bolt_e2e.rs Outdated
Comment on lines +96 to +104
// Magic + propose [5.4, 5.0, 4.4, 0] — `deltagraph` advertises
// through Bolt 5.x, so 5.4 should win. The fallback slots let an
// older server (or a future one with narrower support) still pick
// something rather than rejecting outright.
let mut preamble = vec![0x60u8, 0x60, 0xB0, 0x17];
preamble.extend_from_slice(&0x0000_0504u32.to_be_bytes()); // 5.4
preamble.extend_from_slice(&0x0000_0500u32.to_be_bytes()); // 5.0
preamble.extend_from_slice(&0x0000_0404u32.to_be_bytes()); // 4.4
preamble.extend_from_slice(&0u32.to_be_bytes()); // empty
impl Drop for ChildGuard {
fn drop(&mut self) {
if let Some(mut child) = self.0.take() {
let _ = child.start_kill();
Comment thread src/server/mod.rs Outdated
Comment on lines +650 to +655
// Test-only override for the executor's request base URL. Honored
// by the executor (`DatabricksConfig.base_url`) and used by the
// deltagraph subprocess test in tests/rust/bin/ to redirect HTTP
// at a wiremock URL. Production callers leave this unset — the
// executor falls back to `https://{hostname}`.
cfg.base_url = std::env::var("DATABRICKS_BASE_URL").ok();
Comment thread docs/deltagraph/QUICKSTART.md Outdated
Comment on lines +152 to +153
against Databricks for now — embedded chdb is the write target. Phase 5
may add limited write support via `INSERT INTO` for Delta tables.
All five inline comments addressed:

1. **Dynamic ports** (tests/rust/bin/deltagraph_bolt_e2e.rs): the
   subprocess test now picks free ports via `TcpListener::bind("127.0.0.1:0")`
   instead of hard-coding 17475/17687, eliminating collision with
   deltagraph_smoke.rs and with a developer's local server on the same
   ports. Comment acknowledges the small bind-race.

2. **Wider Bolt version proposal** (same file): handshake now offers
   5.8/5.7/5.6/4.4 instead of 5.4/5.0/4.4. Tracks the server's
   SUPPORTED_VERSIONS head; stays forward-compatible if older minors
   are pruned.

3. **ChildGuard reaps on Drop** (same file): when Drop fires on a
   panic, the guard now spawns a detached tokio task that calls
   child.wait().await after start_kill(). Without this, the kernel
   keeps a zombie entry until the test process exits, briefly holding
   the ports. Documented why Handle::try_current is needed (Drop may
   run post-runtime).

4. **DATABRICKS_BASE_URL scheme validation** (src/server/mod.rs): a
   typo'd override would otherwise POST the PAT in plaintext to
   whatever endpoint is named. The new validate_databricks_base_url
   accepts https:// unconditionally and http:// only for loopback
   (localhost / 127.0.0.1). Any successful override also emits a
   log::warn! so a stray production setting can't be silent. Adds 4
   unit tests covering accept / reject / unknown-scheme cases.

5. **QUICKSTART writes section** (docs/deltagraph/QUICKSTART.md):
   removed the speculative Phase 5 INSERT INTO claim. The plan
   doesn't promise write support; the wording is now a neutral
   "not on the current roadmap, no committed timeline" pointer to
   DELTAGRAPH_PLAN.md.

Re-requesting Copilot review per the HOUSE RULE (5 comments
≥ 5 → re-review required, not direct merge).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@genezhang
Copy link
Copy Markdown
Owner Author

@copilot please re-review — all 5 inline comments addressed in 47f1a94:

  1. Dynamic ports via TcpListener::bind(127.0.0.1:0)
  2. Bolt version proposal widened to 5.8/5.7/5.6/4.4
  3. ChildGuard now spawns a detached wait() task after start_kill() so panic-path cleanup actually reaps the subprocess
  4. DATABRICKS_BASE_URL now scheme-validated (https any host, http loopback only) + emits log::warn! on override — plus 4 new unit tests
  5. QUICKSTART writes section rephrased to neutral 'not on roadmap'

Copy link
Copy Markdown

Copilot AI commented May 17, 2026

@copilot please re-review — all 5 inline comments addressed in 47f1a94:

  1. Dynamic ports via TcpListener::bind(127.0.0.1:0)...

All 5 changes verified in commit 47f1a94. Here's what was confirmed:

  1. Dynamic portspick_free_port() uses TcpListener::bind("127.0.0.1:0") to pick OS-assigned ports; both HTTP and Bolt ports are dynamic and passed via --http-port/--bolt-port args.

  2. Wider Bolt version proposal — handshake now offers 5.8 / 5.7 / 5.6 / 4.4 (previously 5.4 / 5.0 / 4.4 / 0). Stays forward-compatible if older minors are pruned.

  3. ChildGuard reaps on Drop — after start_kill(), if a Tokio runtime is still active (Handle::try_current()), a detached task calls child.wait().await. Comment explains why try_current is needed (Drop may run post-runtime).

  4. DATABRICKS_BASE_URL scheme validationvalidate_databricks_base_url() accepts https:// (any host) and http://localhost or http://127.0.0.1 only; rejects all other plaintext HTTP. A log::warn! fires on any successful override. 4 unit tests (accepts_https_for_real_workspaces, accepts_http_for_loopback_tests, rejects_plaintext_http_to_external_hosts, rejects_unknown_schemes) all pass.

  5. QUICKSTART writes section — speculative "Phase 5 INSERT INTO" claim removed; replaced with neutral "not on the current roadmap, no committed timeline" with a pointer to DELTAGRAPH_PLAN.md.

Test results:

  • cargo test --features databricks --test deltagraph_bolt_e2e → 1/1 pass (~110ms)
  • cargo test --features databricks --lib -- databricks_base_url → 4/4 pass
  • cargo clippy --all-targets --features databricks → 0 warnings
  • cargo fmt --all --check → clean

@genezhang genezhang merged commit 4718534 into main May 17, 2026
5 checks passed
@genezhang genezhang deleted the feat/bolt-databricks-e2e branch May 17, 2026 03:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants