Release gcode v0.9.2#6
Conversation
Addressed all 4 R3 adversary findings: - F1: Moved degradation contract from P3/§3.3 to P1/§1.2 for correct sequencing - F2: Changed setup contexts to use mutable borrows for postgres::Client - F3: Added dep:urlencoding to falkor feature gate - F4: Added decode_config_value, resolve_env_pattern, and ValueResolver callback Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Fix project identity handling for self-referential parent markers, make build/dist excludes path-aware, and prune stale duplicate project entries. Task: #209
Addresses Round 9 adversary findings F1-F3: - F1: Document decode_config_value null behavior as intentional fix - F2: Remove unused ServiceState/EmbeddingConfig imports from search/qdrant - F3: Add deterministic source ordering to rrf_merge Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…avior in build-restart issues documentation
…tegration/214-context-and-setup-contracts
…tegration/214-context-and-setup-contracts
…tegration/214-context-and-setup-contracts
…tegration/214-context-and-setup-contracts
…tegration/214-context-and-setup-contracts
…tegration/215-datastore-adapters
…tegration/210-build-gcore-rust-foundation.md
…-add-qdrant-and-embedding-configuration-a # Conflicts: # Cargo.lock # crates/gcore/Cargo.toml # crates/gcore/src/postgres.rs
…-add-falkordb-adapter-and-query-safety-bo # Conflicts: # Cargo.lock # crates/gcore/Cargo.toml # crates/gcore/src/postgres.rs
…tegration/235-core-boundary-and-setup
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughExpands gobby-core with feature-gated adapters and primitives; migrates gcode to local adapters and new indexing/projection APIs; adds standalone provisioning, Qdrant/Falkor/Postgres adapters, expanded CLI (graph/vector/setup), and many end-to-end and unit tests plus Docker assets. ChangesCore foundation and CLI standalone flows
Sequence Diagram(s)sequenceDiagram
participant User
participant gcode_CLI as gcode CLI
participant GCore as gobby_core
participant Postgres
participant FalkorDB
participant Qdrant
User->>gcode_CLI: run graph/vector/index/setup commands
gcode_CLI->>GCore: resolve CoreContext/config via ConfigSource
gcode_CLI->>Postgres: index/write/read code facts (index/api)
gcode_CLI->>FalkorDB: code_graph sync/read via with_graph
gcode_CLI->>Qdrant: with_qdrant search/upsert vectors
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120+ minutes Poem
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 17
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/gcode/src/db.rs (1)
251-266:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftRestore the mandated SQLite connection path in
db.rs.Lines 251-266 route through PostgreSQL connectors, but this module’s contract requires SQLite connections via
open_readwrite/open_readonlywith a 5-second busy timeout (and WAL for readwrite). This is a core boundary mismatch.As per coding guidelines
crates/gcode/src/db.rs: Create SQLite database connections with 5-second busy timeout using helpers indbmodule (open_readwritewith WAL mode,open_readonly).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gcode/src/db.rs` around lines 251 - 266, connect_readwrite and connect_readonly currently call PostgreSQL connectors but must instead open SQLite connections using the local db helpers: replace gobby_core::postgres::connect_readwrite/connect_readonly calls with the module's open_readwrite and open_readonly helpers, ensuring connect_readwrite uses open_readwrite configured with WAL mode and a 5-second busy timeout, and connect_readonly uses open_readonly with the same 5-second busy timeout; keep the existing schema::validate_runtime_schema(&mut client)? call and return the client.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.gobby/plans/gcore-rust-foundation.md:
- Line 39: Add a blank line immediately before each section heading that
currently lacks separation (e.g., "### 1.1 Define the gobby-core public boundary
[category: code]" and the other flagged headings at the specified locations) so
that every top-level or subsection heading is preceded by one empty line to
satisfy markdownlint MD022; apply the same blank-line pattern consistently to
all flagged headings throughout the document to fix the repeated heading-spacing
lint violations.
In `@crates/gcode/src/commands/graph.rs`:
- Around line 368-375: The graph read path currently propagates errors from
code_graph::require_graph_reads, code_graph::count_callers and
code_graph::find_callers; change these to gracefully degrade to empty results:
call code_graph::require_graph_reads(ctx) and if it returns an error, return the
same empty_response_for_unresolved(format) used for unresolved symbols (or the
equivalent empty response used by this command) instead of propagating the
error; similarly, wrap code_graph::count_callers(ctx, &symbol.id) and
code_graph::find_callers(ctx, &symbol.id, offset, limit) so any graph-service
errors are caught and converted into zero/empty results (total=0 and
results=vec![]) rather than returning Err. Update the same pattern in the other
affected blocks noted (around the calls to require_graph_reads, count_callers,
find_callers) so all graph-read commands degrade to empty arrays when Neo4j is
unavailable.
In `@crates/gcode/src/commands/index.rs`:
- Around line 115-119: The warning uses a panic-prone slice
&ctx.project_id[..8]; replace that slicing with the existing helper short_id to
safely format both IDs (use short_id(&ctx.project_id) and
short_id(&identity.project_id) where the warning message is constructed) so the
log call no longer risks panicking on short or non-ASCII project IDs.
In `@crates/gcode/src/commands/status.rs`:
- Around line 130-143: The cleanup_project_projections function currently treats
failures from code_graph::clear_project and
code_symbols::delete_project_collection as hard errors (propagating via
map_err), which causes the whole command to fail after indexer::invalidate has
already cleared the hub; change these calls to be best-effort: call
code_graph::clear_project(ctx) and
code_symbols::delete_project_collection(qdrant, &ctx.project_id) and on Err(_)
do not return Err—log a warning (e.g., using tracing::warn! or log::warn!) that
includes the error and continue, then return Ok(()) so projection cleanup is
non-fatal while indexer::invalidate remains propagated.
In `@crates/gcode/src/falkor.rs`:
- Around line 490-600: The current tests in functions like
falkor_client_wrapper_shape, phase7_read_helpers_visible,
phase7_source_fragments_visible, phase7_query_surface_visible,
phase7_query_helpers_and_literal_fragments_visible and
phase7_cargo_and_lockfile_state rely on include_str!("falkor.rs") and brittle
string-fragment assertions; remove or disable these source-fragment tests and
replace them with real integration/unit tests that exercise the public API
(e.g., create a FalkorClient via FalkorClient::from_config or
FalkorClientBuilder, call
client.query/cypher_string_literal/count_callers/find_callers/find_usages/find_callers_batch/find_callees_batch/get_imports/blast_radius
and assert behavior/results), or if documenting only, delete the phase7_* tests
entirely; ensure new tests import the same symbols (FalkorClient, SyncGraph,
FalkorConfig, find_callers_query, etc.) and verify runtime behavior rather than
string presence.
- Around line 327-360: Both batch helpers perform N+1 queries by looping over
symbol_ids and calling per-id functions; change find_callers_batch to call the
underlying batch query (crate::graph::code_graph::find_callers_batch_query) once
with the entire symbol_ids slice and limit, then transform its result into the
HashMap<String, Vec<GraphResult>> expected by find_callers_batch. Similarly,
change find_callees_batch to call
crate::graph::code_graph::find_callees_batch_query directly with the symbol_ids
slice and limit (don't wrap each id with from_ref), and convert that single
batch response into the same HashMap grouping keyed by symbol id; ensure any
result/ordering types from the code_graph batch query are adapted to match the
function signature and propagate errors with ?.
In `@crates/gcode/src/graph/code_graph.rs`:
- Around line 259-261: The usize_value helper silently clamps large usize into
i64 which loses precision; update the function usize_value to avoid silent
truncation by returning a Result (e.g., Result<TypedValue, Error>) and
explicitly error when value > i64::MAX instead of casting, or alternatively
document the limitation on usize overflow in the function comment and ensure
callers explicitly handle large values; reference TypedValue::Integer as the
target variant to construct when the value fits.
- Around line 979-982: The HTTP client is hard-coded with a 15s timeout which is
too short for GraphLifecycleAction::Rebuild; update the
reqwest::blocking::Client::builder() call that creates client to use a longer
timeout when the lifecycle action is Rebuild or make the timeout configurable
(e.g., accept a timeout parameter or read from configuration/environment) and
apply that value to .timeout(Duration::from_secs(...)); ensure the branch that
handles GraphLifecycleAction::Rebuild sets an appropriately larger timeout (or
falls back to a configurable default) while keeping shorter timeouts for other
actions.
In `@crates/gcode/src/graph/report.rs`:
- Around line 679-686: The current logic uses sentinel
f64::INFINITY/f64::NEG_INFINITY plus has_confidence to represent "no valid
confidence"; change this to use Option<(f64,f64)> (or two Option<f64>) for the
running min/max to make the absent-state explicit: replace
confidence_min/confidence_max/has_confidence with e.g. confidence_range:
Option<(f64,f64)>, when iterating edges check edge.metadata.confidence and
confidence.is_finite() and then set confidence_range = Some((val,val)) for the
first finite value or update the tuple to (min, max) for subsequent finite
values; update any later uses that relied on has_confidence or the sentinel
values to pattern-match on the Option instead.
- Around line 920-948: now_iso8601 and days_to_ymd implement custom calendar
conversion that can be brittle; add unit tests targeting edge cases (leap years
like 2000-02-29 and 1900-02-28/1900-03-01, Unix epoch and negative timestamps,
year boundaries around Dec 31/Jan 1, and subsecond precision) to validate
outputs, and add a short doc comment above days_to_ymd and now_iso8601 citing
the Howard Hinnant algorithm as the reference and describing expected
input/outputs to make the intent explicit and aid future maintenance.
In `@crates/gcode/src/index/api.rs`:
- Around line 75-116: upsert_symbols currently issues conn.execute for each
Symbol causing N round-trips; replace the per-item loop with a batched upsert:
either construct a single parameterized INSERT ... VALUES (...),(...),... with
ON CONFLICT that includes all symbols and call conn.execute once (build the SQL
using grouped ($1..$N) placeholders and a single Vec<&(dyn ToSql)>) or use
GenericClient's transaction + prepared statement approach (start a transaction,
conn.prepare once for the INSERT ... ON CONFLICT statement and call
transaction.execute for each symbol) or use execute_batch if available to submit
a multi-statement payload; update code in upsert_symbols to build parameters and
call the batched API instead of repeatedly calling conn.execute.
- Around line 47-73: The function delete_file_facts runs multiple DELETEs on
conn without a transaction, which can leave the DB partially updated on failure;
modify delete_file_facts to perform all statements inside a single transaction
(use conn.transaction() / conn.build_transaction() and execute the DELETEs on
the transaction handle, then commit) or change the function signature to accept
a transaction handle (e.g., &mut Transaction<...> / &mut impl
GenericClientTransaction) so callers can control the transaction; ensure you
execute the same SQL statements against the transaction handle and return an
error without committing on failure.
In `@crates/gcode/src/index/chunker.rs`:
- Around line 74-91: The self-referential test
chunker_stays_gcode_owned_with_documented_narrowing (and its
doc_phrase/forbidden string checks) should be removed and replaced with an
explicit module-level documentation comment that states the intended design
(e.g., "line-based `ContentChunk` records" and that the chunker must remain
gcode-owned) so the behavior is documented without reading source text; delete
the test function and its forbidden array, and add a clear #[doc = "..."] or ///
comment near the chunker implementation that contains the documented phrase
instead of using string concatenation workarounds.
In `@crates/gcode/src/vector/code_symbols.rs`:
- Around line 222-225: The collection_name function currently builds a
collection from collection_prefix + project_id and returns it via
CollectionScope::Custom(&collection) which is later interpolated into REST paths
like "/collections/{collection}"; fix this by percent-encoding (path-segment
encoding) or validating/sanitizing project_id before it reaches collection_name
so reserved URL characters (/, ?, #, %) cannot appear in the collection
segment—either: (A) encode the resulting collection string when used in HTTP
paths (where "/collections/{collection}" is formatted) or (B) enforce a
Qdrant-safe whitelist in normalize_project_id so collection_name always receives
a safe string; reference functions: collection_name, CollectionScope::Custom,
and normalize_project_id to locate the change.
In `@crates/gcore/assets/docker-compose.services.yml`:
- Line 7: The compose file uses floating image tags "falkordb/falkordb:latest"
and "qdrant/qdrant:latest" which causes deploy drift; update the image
references used by the FalkorDB and Qdrant services (search for the literal
strings "falkordb/falkordb:latest" and "qdrant/qdrant:latest") to use either
immutable tags or OCI digests (preferred), e.g. replace with
"falkordb/falkordb@sha256:<digest>" and "qdrant/qdrant@sha256:<digest>" (or a
fixed version tag) so the images are pinned and reproducible. Ensure you use the
correct architecture-specific digest if needed and update any documentation or
variables that reference these images.
In `@crates/gcore/src/qdrant.rs`:
- Around line 411-415: The test-server's Content-Length parsing is
case-sensitive (the headers.lines().find_map(...) that sets content_len) which
can miss "Content-Length" or other-cased headers; update the logic that computes
content_len to compare header names using case-insensitive comparison (e.g.,
split each header line on ':' and use eq_ignore_ascii_case("content-length") or
to_ascii_lowercase() on the header name) and then trim and parse the value with
parse::<usize>().ok() so the test harness reliably recognizes Content-Length
regardless of case.
- Around line 67-69: In with_qdrant, change the current only-None check on
config.url so that empty or whitespace-only strings are treated as NotConfigured
too; specifically, before proceeding, verify config.url is present and not just
whitespace (e.g., use config.url.as_ref().map(|s|
s.trim().is_empty()).unwrap_or(true) to detect blank) and return Ok((default,
ServiceState::NotConfigured)) when the URL is None or blank. This ensures
blank/whitespace Qdrant URLs do not proceed and will be handled the same as
missing configuration.
---
Outside diff comments:
In `@crates/gcode/src/db.rs`:
- Around line 251-266: connect_readwrite and connect_readonly currently call
PostgreSQL connectors but must instead open SQLite connections using the local
db helpers: replace gobby_core::postgres::connect_readwrite/connect_readonly
calls with the module's open_readwrite and open_readonly helpers, ensuring
connect_readwrite uses open_readwrite configured with WAL mode and a 5-second
busy timeout, and connect_readonly uses open_readonly with the same 5-second
busy timeout; keep the existing schema::validate_runtime_schema(&mut client)?
call and return the client.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 81b172ee-7611-4b5f-ac61-2e8b123b5d90
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (73)
.gobby/plans/gcode-graph-enhancements.md.gobby/plans/gcore-rust-foundation.mdCHANGELOG.mdREADME.mdcrates/gcode/Cargo.tomlcrates/gcode/README.mdcrates/gcode/src/commands/graph.rscrates/gcode/src/commands/index.rscrates/gcode/src/commands/init.rscrates/gcode/src/commands/mod.rscrates/gcode/src/commands/scope.rscrates/gcode/src/commands/setup.rscrates/gcode/src/commands/status.rscrates/gcode/src/commands/symbols.rscrates/gcode/src/commands/vector.rscrates/gcode/src/config.rscrates/gcode/src/db.rscrates/gcode/src/falkor.rscrates/gcode/src/freshness.rscrates/gcode/src/graph/code_graph.rscrates/gcode/src/graph/mod.rscrates/gcode/src/graph/report.rscrates/gcode/src/graph/typed_query.rscrates/gcode/src/index/api.rscrates/gcode/src/index/chunker.rscrates/gcode/src/index/hasher.rscrates/gcode/src/index/indexer.rscrates/gcode/src/index/mod.rscrates/gcode/src/index/parser.rscrates/gcode/src/index/security.rscrates/gcode/src/index/walker.rscrates/gcode/src/lib.rscrates/gcode/src/main.rscrates/gcode/src/models.rscrates/gcode/src/output.rscrates/gcode/src/projection/mod.rscrates/gcode/src/projection/sync.rscrates/gcode/src/schema.rscrates/gcode/src/search/graph_boost.rscrates/gcode/src/search/rrf.rscrates/gcode/src/search/semantic.rscrates/gcode/src/setup.rscrates/gcode/src/utils.rscrates/gcode/src/vector/code_symbols.rscrates/gcode/src/vector/mod.rscrates/gcode/tests/graph_standalone.rscrates/gcode/tests/projection_standalone.rscrates/gcode/tests/vector_projection.rscrates/gcore/Cargo.tomlcrates/gcore/assets/docker-compose.services.ymlcrates/gcore/assets/postgres-pgsearch/Dockerfilecrates/gcore/assets/postgres-pgsearch/initdb.d/01-pg_search.sqlcrates/gcore/assets/postgres-pgsearch/initdb.d/02-pgaudit.sqlcrates/gcore/assets/postgres-pgsearch/scripts/pg_audit_export.shcrates/gcore/assets/postgres-pgsearch/version.jsoncrates/gcore/src/config.rscrates/gcore/src/context.rscrates/gcore/src/degradation.rscrates/gcore/src/falkor.rscrates/gcore/src/indexing.rscrates/gcore/src/lib.rscrates/gcore/src/postgres.rscrates/gcore/src/project.rscrates/gcore/src/provisioning.rscrates/gcore/src/qdrant.rscrates/gcore/src/search.rscrates/gcore/src/setup.rscrates/gcore/tests/public_boundary.rscrates/ghook/Cargo.tomldocs/guides/gcode-development-guide.mddocs/guides/gcode-graph-core.mddocs/guides/gcode-user-guide.mddocs/guides/gcore-development-guide.md
| **Goal**: define the shared Rust foundation boundary and the setup modes that every consumer crate can use without inheriting another domain's behavior. | ||
| **Goal**: define the shared Rust foundation boundary, degradation vocabulary, context/config resolution, and setup modes that every consumer crate can use without inheriting another domain's behavior. | ||
|
|
||
| ### 1.1 Define the gobby-core public boundary [category: code] |
There was a problem hiding this comment.
Fix repeated heading-spacing lint violations.
Several section headings are missing the required blank line separation, which keeps markdownlint MD022 warnings active.
Suggested fix pattern
### 1.1 Define the gobby-core public boundary [category: code]
`kind: deliverable`
+
Targets: `crates/gcore/Cargo.toml`, `crates/gcore/src/lib.rs`, `docs/guides/gcore-development-guide.md`Apply the same blank-line pattern to the other flagged headings.
Also applies to: 122-122, 209-209, 501-501, 732-732, 848-848, 1097-1097, 1243-1243
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 39-39: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.gobby/plans/gcore-rust-foundation.md at line 39, Add a blank line
immediately before each section heading that currently lacks separation (e.g.,
"### 1.1 Define the gobby-core public boundary [category: code]" and the other
flagged headings at the specified locations) so that every top-level or
subsection heading is preceded by one empty line to satisfy markdownlint MD022;
apply the same blank-line pattern consistently to all flagged headings
throughout the document to fix the repeated heading-spacing lint violations.
| code_graph::require_graph_reads(ctx)?; | ||
| let symbol = match resolve_symbol(ctx, symbol_name) { | ||
| Some(symbol) => symbol, | ||
| None => return empty_response_for_unresolved(ctx, format), | ||
| }; | ||
| let total = falkor::count_callers(ctx, &symbol.id)?; | ||
| let results = falkor::find_callers(ctx, &symbol.id, offset, limit)?; | ||
| let total = code_graph::count_callers(ctx, &symbol.id)?; | ||
| let results = code_graph::find_callers(ctx, &symbol.id, offset, limit)?; | ||
|
|
There was a problem hiding this comment.
Graph read paths fail hard instead of degrading to empty results.
These command paths currently return an error when graph service is unavailable, but they should degrade to empty results for read commands.
As per coding guidelines, "crates/gcode/src/{commands,search}/**/*.rs: Handle graceful degradation: Neo4j/Qdrant/embedding API can each be unavailable independently. Graph commands return [] when Neo4j is down; semantic search returns [] when services are unavailable".
Also applies to: 419-425, 465-467, 497-503
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/gcode/src/commands/graph.rs` around lines 368 - 375, The graph read
path currently propagates errors from code_graph::require_graph_reads,
code_graph::count_callers and code_graph::find_callers; change these to
gracefully degrade to empty results: call code_graph::require_graph_reads(ctx)
and if it returns an error, return the same
empty_response_for_unresolved(format) used for unresolved symbols (or the
equivalent empty response used by this command) instead of propagating the
error; similarly, wrap code_graph::count_callers(ctx, &symbol.id) and
code_graph::find_callers(ctx, &symbol.id, offset, limit) so any graph-service
errors are caught and converted into zero/empty results (total=0 and
results=vec![]) rather than returning Err. Update the same pattern in the other
affected blocks noted (around the calls to require_graph_reads, count_callers,
find_callers) so all graph-read commands degrade to empty arrays when Neo4j is
unavailable.
| "Warning: path '{}' belongs to project {} (not {}), re-resolving context", | ||
| p, | ||
| short_id(&identity.project_id), | ||
| &ctx.project_id[..8] | ||
| ); |
There was a problem hiding this comment.
Avoid panic-prone string slicing in warning output.
Line 118 slices ctx.project_id with [..8], which can panic for short or non-ASCII IDs. Use short_id(&ctx.project_id) (already imported) for both IDs.
Proposed fix
- &ctx.project_id[..8]
+ short_id(&ctx.project_id)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "Warning: path '{}' belongs to project {} (not {}), re-resolving context", | |
| p, | |
| short_id(&identity.project_id), | |
| &ctx.project_id[..8] | |
| ); | |
| "Warning: path '{}' belongs to project {} (not {}), re-resolving context", | |
| p, | |
| short_id(&identity.project_id), | |
| short_id(&ctx.project_id) | |
| ); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/gcode/src/commands/index.rs` around lines 115 - 119, The warning uses
a panic-prone slice &ctx.project_id[..8]; replace that slicing with the existing
helper short_id to safely format both IDs (use short_id(&ctx.project_id) and
short_id(&identity.project_id) where the warning message is constructed) so the
log call no longer risks panicking on short or non-ASCII project IDs.
| indexer::invalidate(&mut conn, &ctx.project_id, ctx.daemon_url.as_deref())?; | ||
| cleanup_project_projections(ctx) | ||
| } | ||
|
|
||
| fn cleanup_project_projections(ctx: &Context) -> anyhow::Result<()> { | ||
| if ctx.falkordb.is_some() { | ||
| code_graph::clear_project(ctx) | ||
| .map_err(|err| anyhow::anyhow!("failed to clear FalkorDB projection: {err}"))?; | ||
| } | ||
| if let Some(qdrant) = &ctx.qdrant { | ||
| code_symbols::delete_project_collection(qdrant, &ctx.project_id) | ||
| .map_err(|err| anyhow::anyhow!("failed to delete Qdrant projection: {err}"))?; | ||
| } | ||
| Ok(()) |
There was a problem hiding this comment.
Make projection cleanup best-effort after hub invalidation.
Line 130 invalidates the PostgreSQL index first, but Lines 134-142 hard-fail on FalkorDB/Qdrant cleanup errors. That creates a partial-success failure mode (hub already cleared, command reports failure). Treat external projection cleanup as warning-level and continue.
Suggested adjustment
fn cleanup_project_projections(ctx: &Context) -> anyhow::Result<()> {
if ctx.falkordb.is_some() {
- code_graph::clear_project(ctx)
- .map_err(|err| anyhow::anyhow!("failed to clear FalkorDB projection: {err}"))?;
+ if let Err(err) = code_graph::clear_project(ctx) {
+ eprintln!("Warning: failed to clear FalkorDB projection: {err}");
+ }
}
if let Some(qdrant) = &ctx.qdrant {
- code_symbols::delete_project_collection(qdrant, &ctx.project_id)
- .map_err(|err| anyhow::anyhow!("failed to delete Qdrant projection: {err}"))?;
+ if let Err(err) = code_symbols::delete_project_collection(qdrant, &ctx.project_id) {
+ eprintln!("Warning: failed to delete Qdrant projection: {err}");
+ }
}
Ok(())
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| indexer::invalidate(&mut conn, &ctx.project_id, ctx.daemon_url.as_deref())?; | |
| cleanup_project_projections(ctx) | |
| } | |
| fn cleanup_project_projections(ctx: &Context) -> anyhow::Result<()> { | |
| if ctx.falkordb.is_some() { | |
| code_graph::clear_project(ctx) | |
| .map_err(|err| anyhow::anyhow!("failed to clear FalkorDB projection: {err}"))?; | |
| } | |
| if let Some(qdrant) = &ctx.qdrant { | |
| code_symbols::delete_project_collection(qdrant, &ctx.project_id) | |
| .map_err(|err| anyhow::anyhow!("failed to delete Qdrant projection: {err}"))?; | |
| } | |
| Ok(()) | |
| fn cleanup_project_projections(ctx: &Context) -> anyhow::Result<()> { | |
| if ctx.falkordb.is_some() { | |
| if let Err(err) = code_graph::clear_project(ctx) { | |
| eprintln!("Warning: failed to clear FalkorDB projection: {err}"); | |
| } | |
| } | |
| if let Some(qdrant) = &ctx.qdrant { | |
| if let Err(err) = code_symbols::delete_project_collection(qdrant, &ctx.project_id) { | |
| eprintln!("Warning: failed to delete Qdrant projection: {err}"); | |
| } | |
| } | |
| Ok(()) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/gcode/src/commands/status.rs` around lines 130 - 143, The
cleanup_project_projections function currently treats failures from
code_graph::clear_project and code_symbols::delete_project_collection as hard
errors (propagating via map_err), which causes the whole command to fail after
indexer::invalidate has already cleared the hub; change these calls to be
best-effort: call code_graph::clear_project(ctx) and
code_symbols::delete_project_collection(qdrant, &ctx.project_id) and on Err(_)
do not return Err—log a warning (e.g., using tracing::warn! or log::warn!) that
includes the error and continue, then return Ok(()) so projection cleanup is
non-fatal while indexer::invalidate remains propagated.
| pub fn find_callers_batch( | ||
| ctx: &Context, | ||
| symbol_ids: &[String], | ||
| limit: usize, | ||
| ) -> anyhow::Result<Vec<GraphResult>> { | ||
| if symbol_ids.is_empty() { | ||
| return Ok(vec![]); | ||
| ) -> anyhow::Result<HashMap<String, Vec<GraphResult>>> { | ||
| let mut grouped = HashMap::new(); | ||
| for symbol_id in symbol_ids { | ||
| grouped.insert( | ||
| symbol_id.clone(), | ||
| crate::graph::code_graph::find_callers(ctx, symbol_id, 0, limit)?, | ||
| ); | ||
| } | ||
| with_falkor(ctx, vec![], |client| { | ||
| let (query, params) = find_callers_batch_query(&ctx.project_id, symbol_ids, limit); | ||
| let rows = client.query(&query, Some(params))?; | ||
| Ok(rows.iter().map(row_to_graph_result).collect()) | ||
| }) | ||
| Ok(grouped) | ||
| } | ||
|
|
||
| /// Find call targets reached by any of the given source ids. | ||
| pub fn find_callees_batch( | ||
| ctx: &Context, | ||
| symbol_ids: &[String], | ||
| limit: usize, | ||
| ) -> anyhow::Result<Vec<GraphResult>> { | ||
| if symbol_ids.is_empty() { | ||
| return Ok(vec![]); | ||
| ) -> anyhow::Result<HashMap<String, Vec<GraphResult>>> { | ||
| let mut grouped = HashMap::new(); | ||
| for symbol_id in symbol_ids { | ||
| grouped.insert( | ||
| symbol_id.clone(), | ||
| crate::graph::code_graph::find_callees_batch( | ||
| ctx, | ||
| std::slice::from_ref(symbol_id), | ||
| limit, | ||
| )?, | ||
| ); | ||
| } | ||
| with_falkor(ctx, vec![], |client| { | ||
| let (query, params) = find_callees_batch_query(&ctx.project_id, symbol_ids, limit); | ||
| let rows = client.query(&query, Some(params))?; | ||
| Ok(rows.iter().map(row_to_graph_result).collect()) | ||
| }) | ||
| Ok(grouped) | ||
| } |
There was a problem hiding this comment.
N+1 query pattern in batch functions.
Both find_callers_batch and find_callees_batch execute one query per symbol ID instead of using the existing batch query functions (find_callers_batch_query, find_callees_batch_query). This defeats the purpose of batch operations and introduces O(N) database round-trips.
Additionally, find_callees_batch wraps each symbol_id in std::slice::from_ref and calls code_graph::find_callees_batch which itself issues a batch query for a single ID - this is semantically inconsistent with the function's documented purpose.
Suggested fix to use actual batch queries
pub fn find_callers_batch(
ctx: &Context,
symbol_ids: &[String],
limit: usize,
) -> anyhow::Result<HashMap<String, Vec<GraphResult>>> {
- let mut grouped = HashMap::new();
- for symbol_id in symbol_ids {
- grouped.insert(
- symbol_id.clone(),
- crate::graph::code_graph::find_callers(ctx, symbol_id, 0, limit)?,
- );
- }
- Ok(grouped)
+ // Use actual batch query to reduce round trips
+ let results = crate::graph::code_graph::find_callers_batch(ctx, symbol_ids, limit)?;
+ // Group results by target symbol_id if needed, or return flat results
+ // Note: This requires the batch query to return caller->target mapping
+ let mut grouped = HashMap::new();
+ for symbol_id in symbol_ids {
+ grouped.insert(symbol_id.clone(), vec![]);
+ }
+ for result in results {
+ // Results need target_id to properly group - consider adjusting batch query
+ // to return which target each caller corresponds to
+ }
+ Ok(grouped)
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/gcode/src/falkor.rs` around lines 327 - 360, Both batch helpers
perform N+1 queries by looping over symbol_ids and calling per-id functions;
change find_callers_batch to call the underlying batch query
(crate::graph::code_graph::find_callers_batch_query) once with the entire
symbol_ids slice and limit, then transform its result into the HashMap<String,
Vec<GraphResult>> expected by find_callers_batch. Similarly, change
find_callees_batch to call crate::graph::code_graph::find_callees_batch_query
directly with the symbol_ids slice and limit (don't wrap each id with from_ref),
and convert that single batch response into the same HashMap grouping keyed by
symbol id; ensure any result/ordering types from the code_graph batch query are
adapted to match the function signature and propagate errors with ?.
| #[cfg(test)] | ||
| mod tests { | ||
| #[test] | ||
| fn chunker_stays_gcode_owned_with_documented_narrowing() { | ||
| let source = include_str!("chunker.rs"); | ||
| let doc_phrase = ["line-based `ContentChunk`", " records"].concat(); | ||
| assert!(source.contains(&doc_phrase)); | ||
|
|
||
| for forbidden in [ | ||
| ["use gobby_core", "::indexing::Chunk"].concat(), | ||
| ["use gobby_core", "::indexing::ChunkIdentity"].concat(), | ||
| ["use gobby_core", "::indexing::IndexEvent"].concat(), | ||
| ["use gobby_core", "::indexing::index_events_from_hashes"].concat(), | ||
| ] { | ||
| assert!(!source.contains(&forbidden)); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | 💤 Low value
Self-referential source verification test is unusual.
This test reads its own source file and asserts that certain strings exist or don't exist. While this documents the design decision to keep chunking gcode-owned, it:
- Will fail if the documentation is refactored
- Uses string concatenation to avoid matching itself (e.g.,
["use gobby_core", "::indexing::Chunk"].concat())
Consider whether a simpler approach like a #[doc] comment or a separate design doc would serve the same purpose without the maintenance burden.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/gcode/src/index/chunker.rs` around lines 74 - 91, The self-referential
test chunker_stays_gcode_owned_with_documented_narrowing (and its
doc_phrase/forbidden string checks) should be removed and replaced with an
explicit module-level documentation comment that states the intended design
(e.g., "line-based `ContentChunk` records" and that the chunker must remain
gcode-owned) so the behavior is documented without reading source text; delete
the test function and its forbidden array, and add a clear #[doc = "..."] or ///
comment near the chunker implementation that contains the documented phrase
instead of using string concatenation workarounds.
| pub fn collection_name(collection_prefix: &str, project_id: &str) -> String { | ||
| let collection = format!("{collection_prefix}{project_id}"); | ||
| gobby_core::qdrant::collection_name("gcode", CollectionScope::Custom(&collection)) | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify whether project_id is validated/sanitized before reaching collection_name()
# Expected: either strict validation exists, or this confirms raw values can flow to URL paths.
set -euo pipefail
echo "== project_id sources =="
rg -n -C3 'project_id|gcode\.json|project\.json' --type=rs
echo
echo "== collection name/path construction in vector lifecycle =="
rg -n -C3 'collection_name\(|/collections/\{collection\}|format!\("/collections/\{.*\}"' --type=rs
echo
echo "== explicit project_id validation patterns =="
rg -n -C2 'regex|validate.*project|project.*validate|is_ascii|[Aa]lphanumeric|sanitize' --type=rsRepository: GobbyAI/gobby-cli
Length of output: 131
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate the target file
echo "== Locate code_symbols.rs =="
fd -a "code_symbols.rs" .
echo
echo "== Show lines around collection_name() =="
sed -n '180,260p' crates/gcode/src/vector/code_symbols.rs 2>/dev/null || true
echo
echo "== Search for project_id sources through the gcode crate =="
rg -n -S 'project_id' crates/gcode/src --type-add 'rust:*.rs' --type rust || true
echo
echo "== Search where collection_name() is called =="
rg -n -S 'collection_name\(' crates/gcode/src --type rust || true
echo
echo "== Follow into gobby_core::qdrant::collection_name implementation =="
rg -n -S 'fn collection_name' -S crates --type rust || true
rg -n -S 'qdrant::collection_name\(' crates --type rust || true
echo
echo "== Search for URL path construction using collections/{collection} =="
rg -n -S '/collections/\{collection\}|/collections/\\{' crates --type rust || true
rg -n -S 'collections/.+\{collection\}|/collections/.+\{.*collection.*\}' crates --type rust || true
echo
echo "== Search for URL encoding utilities (percent-encoding) =="
rg -n -S 'percent|urlencoding|utf8|encode_component|form_urlencoded|Url::|Url::parse|query_pairs' crates --type rust || true
echo
echo "== Search for project_id validation/sanitization =="
rg -n -S '(regex|validate.*project|project.*validate|is_ascii|alphanumeric|sanitize|encode|escape|slug|uuid|Uuid|[Pp]rojectId)' crates --type rust || trueRepository: GobbyAI/gobby-cli
Length of output: 50543
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== project_id normalization in crates/gcode/src/config.rs =="
rg -n "fn normalize_project_id|trim\(|is_empty" crates/gcode/src/config.rs --type rust
sed -n '270,320p' crates/gcode/src/config.rs
echo
echo "== where collection name is turned into Qdrant request paths =="
# Find code that builds URLs containing /collections/{collection}
rg -n --fixed-strings "/collections/" crates --type-add 'rust:*.rs'
echo
echo "== locate gobby_core::qdrant::collection_name implementation in-workspace =="
rg -n "qdrant::collection_name|fn collection_name\\(" crates --type-add 'rust:*.rs'
echo
echo "== look for URL-encoding / percent-encoding utilities used with Qdrant paths =="
rg -n -S "(percent-encoding|percent_encoding|urlencoding|form_urlencoded|encode_component|Url::parse|query_pairs)" crates --type rust || trueRepository: GobbyAI/gobby-cli
Length of output: 7951
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== crates/gcore/src/qdrant.rs collection_name() =="
sed -n '1,220p' crates/gcore/src/qdrant.rs
echo
echo "== crates/gcode/src/vector/code_symbols.rs: collection field and where it’s used in HTTP paths =="
rg -n "self\\.collection|/collections/\\{collection\\}|/collections/\\{\\}|/collections/\\{collection\\}/points|collection_name\\(" crates/gcode/src/vector/code_symbols.rs --type rust
sed -n '560,760p' crates/gcode/src/vector/code_symbols.rs
sed -n '780,920p' crates/gcode/src/vector/code_symbols.rs
sed -n '980,1060p' crates/gcode/src/vector/code_symbols.rs
sed -n '1240,1320p' crates/gcode/src/vector/code_symbols.rsRepository: GobbyAI/gobby-cli
Length of output: 24433
Escape Qdrant collection path segments derived from project_id
pub fn collection_name(collection_prefix: &str, project_id: &str) -> String {
let collection = format!("{collection_prefix}{project_id}");
gobby_core::qdrant::collection_name("gcode", CollectionScope::Custom(&collection))
}project_idis only trimmed (no sanitization),CollectionScope::Customreturns the collection name verbatim, and the code then interpolates it into REST paths like/collections/{collection}...—so characters like/,?,#, and%inproject_idcan break URL parsing or target unintended endpoints.- Escape/percent-encode the collection path segment (or enforce Qdrant-safe character validation for
project_id/collection names) before constructing/collections/{collection}...requests.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/gcode/src/vector/code_symbols.rs` around lines 222 - 225, The
collection_name function currently builds a collection from collection_prefix +
project_id and returns it via CollectionScope::Custom(&collection) which is
later interpolated into REST paths like "/collections/{collection}"; fix this by
percent-encoding (path-segment encoding) or validating/sanitizing project_id
before it reaches collection_name so reserved URL characters (/, ?, #, %) cannot
appear in the collection segment—either: (A) encode the resulting collection
string when used in HTTP paths (where "/collections/{collection}" is formatted)
or (B) enforce a Qdrant-safe whitelist in normalize_project_id so
collection_name always receives a safe string; reference functions:
collection_name, CollectionScope::Custom, and normalize_project_id to locate the
change.
|
|
||
| services: | ||
| falkordb: | ||
| image: falkordb/falkordb:latest |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
What are the current stable version tags and immutable image digests for falkordb/falkordb and qdrant/qdrant (linux/amd64 and linux/arm64)?
💡 Result:
Current stable tag(s) and immutable image digests for multi-arch (linux/amd64, linux/arm64) builds: 1) falkordb/falkordb - Stable tag: latest (docker hub tag “latest”) - linux/amd64 digest: sha256:31faa7afbd0e [1] - linux/arm64 digest: (digest shown on the same “latest” tag page, but not captured in the search snippet we have) [1] 2) qdrant/qdrant - Stable tag: latest (docker hub tag “latest”) - linux/amd64 digest: sha256:ce6abddfc042 [2] - linux/arm64 digest: sha256:fa2fe78a4c7f [2] Notes: - Docker Hub tag pages provide per-OS/ARCH digests for a given tag; these digests are immutable content hashes for the corresponding manifest/platform image. [1][2]
Citations:
Pin docker-compose image tags/digests (avoid :latest) for falkordb and qdrant
falkordb/falkordb:latest (line 7) and qdrant/qdrant:latest (line 31) make deployments drift over time. Docker Hub currently maps latest to immutable per-arch digests (e.g., falkordb amd64 sha256:31faa7afbd0e; qdrant amd64 sha256:ce6abddfc042 and arm64 sha256:fa2fe78a4c7f). Pin to a fixed tag or (preferably) the digest.
Suggested hardening
- image: falkordb/falkordb:latest
+ image: falkordb/falkordb:${GOBBY_FALKORDB_IMAGE_TAG:?set_explicit_tag_or_digest}
- image: qdrant/qdrant:latest
+ image: qdrant/qdrant:${GOBBY_QDRANT_IMAGE_TAG:?set_explicit_tag_or_digest}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/gcore/assets/docker-compose.services.yml` at line 7, The compose file
uses floating image tags "falkordb/falkordb:latest" and "qdrant/qdrant:latest"
which causes deploy drift; update the image references used by the FalkorDB and
Qdrant services (search for the literal strings "falkordb/falkordb:latest" and
"qdrant/qdrant:latest") to use either immutable tags or OCI digests (preferred),
e.g. replace with "falkordb/falkordb@sha256:<digest>" and
"qdrant/qdrant@sha256:<digest>" (or a fixed version tag) so the images are
pinned and reproducible. Ensure you use the correct architecture-specific digest
if needed and update any documentation or variables that reference these images.
| if config.url.is_none() { | ||
| return Ok((default, ServiceState::NotConfigured)); | ||
| } |
There was a problem hiding this comment.
Treat blank Qdrant URLs as not configured.
with_qdrant currently degrades only when the URL is None; empty/whitespace URLs proceed and fail later as hard errors, which breaks the intended missing-config behavior.
Suggested fix
- if config.url.is_none() {
+ if config
+ .url
+ .as_deref()
+ .map(str::trim)
+ .filter(|url| !url.is_empty())
+ .is_none()
+ {
return Ok((default, ServiceState::NotConfigured));
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if config.url.is_none() { | |
| return Ok((default, ServiceState::NotConfigured)); | |
| } | |
| if config | |
| .url | |
| .as_deref() | |
| .map(str::trim) | |
| .filter(|url| !url.is_empty()) | |
| .is_none() | |
| { | |
| return Ok((default, ServiceState::NotConfigured)); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/gcore/src/qdrant.rs` around lines 67 - 69, In with_qdrant, change the
current only-None check on config.url so that empty or whitespace-only strings
are treated as NotConfigured too; specifically, before proceeding, verify
config.url is present and not just whitespace (e.g., use
config.url.as_ref().map(|s| s.trim().is_empty()).unwrap_or(true) to detect
blank) and return Ok((default, ServiceState::NotConfigured)) when the URL is
None or blank. This ensures blank/whitespace Qdrant URLs do not proceed and will
be handled the same as missing configuration.
| let content_len = headers | ||
| .lines() | ||
| .find_map(|line| line.strip_prefix("content-length: ")) | ||
| .and_then(|value| value.parse::<usize>().ok()) | ||
| .unwrap_or(0); |
There was a problem hiding this comment.
Parse Content-Length case-insensitively in the test server.
HTTP header names are case-insensitive; matching only lowercase can make this harness brittle and potentially hang request parsing.
Suggested fix
- let content_len = headers
- .lines()
- .find_map(|line| line.strip_prefix("content-length: "))
- .and_then(|value| value.parse::<usize>().ok())
- .unwrap_or(0);
+ let content_len = headers
+ .lines()
+ .find_map(|line| {
+ let (name, value) = line.split_once(':')?;
+ if name.trim().eq_ignore_ascii_case("content-length") {
+ value.trim().parse::<usize>().ok()
+ } else {
+ None
+ }
+ })
+ .unwrap_or(0);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/gcore/src/qdrant.rs` around lines 411 - 415, The test-server's
Content-Length parsing is case-sensitive (the headers.lines().find_map(...) that
sets content_len) which can miss "Content-Length" or other-cased headers; update
the logic that computes content_len to compare header names using
case-insensitive comparison (e.g., split each header line on ':' and use
eq_ignore_ascii_case("content-length") or to_ascii_lowercase() on the header
name) and then trim and parse the value with parse::<usize>().ok() so the test
harness reliably recognizes Content-Length regardless of case.
|
Actionable comments posted: 0 |
Summary
Validation
Summary by CodeRabbit
New Features
Improvements