Resync Rust port to graphify-py v0.8.30 (724f1e3)#17
Conversation
Port the applicable v0.8.27..v0.8.30 changes and bump the workspace version to 0.8.30 in lockstep with the submodule pointer. Ports: - cluster: total-order community-ID tiebreak so an identical grouping always gets identical integer IDs run-to-run (#1090 follow-up) - detect: F2 office/PDF resource caps - 50 MiB on-disk screen plus a zip-bomb guard (512 MiB decompressed, 200:1 ratio, chunked streaming ceiling) gating PDF/DOCX/XLSX text extraction - extract: rewrite the Dart extractor (inheritance, mixins, interfaces, generics, annotations, Riverpod/Bloc codegen, extensions, typedefs, records/destructuring, navigation, `part of` redirection, generic type-lookups); F5 - pass `cpp` an absolute path so an attacker-named Fortran file cannot be parsed as an option - llm: F1 - validate custom-provider `base_url` (reject non-http(s), warn on plaintext egress) and gate a project-local `providers.json` behind `GRAPHIFY_ALLOW_LOCAL_PROVIDERS`; F3 - hard-block an `OLLAMA_BASE_URL` that is, or resolves to, a link-local/cloud-metadata address while still only warning for a general LAN host - hooks: Read/Glob `PreToolUse` nudge for Claude Code (#1114); Kilo Code platform - native skill, `/graphify` command, `.kilo` plugin and JSONC-safe `kilo.json` registration; Antigravity project-scoped installs now write the full rules/workflow layer; `claude uninstall` removes the orphaned user-scope skill tree (#1121) Divergences from graphify-py (documented in code + notes): the skillgen progressive-disclosure split and per-platform `references/` sidecars stay collapsed to one canonical skill; the `_backend_pkg_hint` install message has no Rust equivalent (all backends compile in). Glory to the Omnissiah
Fixes:
- kilo: `strip_json_comments` built the output by pushing each input
byte as a `char`, corrupting multibyte UTF-8 in a `.kilo/kilo.jsonc`;
accumulate raw bytes and decode once at the end (major)
- llm: parse the `127.0.0.0/8` loopback case as an `Ipv4Addr` instead of
a `starts_with("127.")` prefix, so a hostname like `127.evil.com` is
correctly treated as non-loopback (in both `provider_base_url_ok` and
`validate_ollama_base_url`); strip brackets from an IPv6 host so a
bracketed link-local literal like `[fe80::1]` is caught
- llm: the `detect_backend` ollama gate now validates with `warn=false`
so a non-loopback LAN host is not warned about twice
- extract: `resolve_cpp_path` falls back to a `./`-prefixed path when the
cwd is unavailable, so the result never looks like a `cpp` option
- hooks: `#[must_use]` on `claude_user_skill_dst` and `read_settings_hook`;
drop redundant `serde_json`/`url` dev-deps (already normal deps)
- tests: IPv6 (`[fe80::1]`/`[::1]`/fe80 resolver) ollama cases, a
`127.evil.com` provider case, a `#[cfg(unix)]` guard on the leading-`/`
cpp assertion, and a `HomeGuard` RAII in the kilo tests
Disputes (documented in code comments):
- `cmd_kilo` intentionally ignores the shared `--project` flag: graphify-py's
`kilo` command has no project-scope variant
- the `# ...` markers in the Dart test fixture are an intentional
byte-faithful copy of graphify-py's test_dart.py (they exercise that
non-`//` lines survive comment-stripping)
By the will of the Machine God
Fixes: - llm: strip IPv6 brackets from the provider `base_url` host (`[::1]`) before the loopback check, matching the same fix already applied to `validate_ollama_base_url` - detect: drop the redundant `zip` dev-dependency (already a normal dep, so it is available to the `office_limits` integration test) - extract: `#[must_use]` on `split_types`; correct the `resolve_cpp_path` doc to state the `./`-prefixed fallback when the cwd is unreadable - hooks: clarify the `KILO_PLUGIN_JS` doc (structurally mirrors the OpenCode plugin but the echo text differs; byte-identical to the Python `_KILO_PLUGIN_JS`, not to OpenCode) Dispute (documented in code): the Read/Glob hook command is kept as one whole literal rather than decomposed into fragments, so it stays byte-identical to graphify-py's `_READ_SETTINGS_HOOK["command"]`; its behaviour is validated by executing it via `sh -c` in tests/read_hook.rs. Ave Deus Mechanicus
Fixes: - llm: detect loopback via `IpAddr::is_loopback` (covers `127.0.0.0/8` and `::1`, including IPv6-bracket and IPv6-loopback forms) in both `provider_base_url_ok` and `validate_ollama_base_url` - hooks: fold the Codex hook removal into the shared agents-uninstall helper so `.codex/hooks.json` is cleaned in every branch (a missing or markerless AGENTS.md no longer orphans it), alongside the OpenCode / Kilo plugins - detect: stream office-zip members in 64 KiB chunks (lower peak memory; the chunk size never affects the cap result) - tests: a `call_llm` call-site test proving a metadata `OLLAMA_BASE_URL` is refused end-to-end on the no-key ollama path Disputes (documented in code / here): - `tiktoken-rs = "0.12"` is published (confirmed via `cargo search`, it resolves in `Cargo.lock`, and the full suite builds and passes), so the "may not be published" concern does not apply - the claude skill-removal status message is emitted before the best-effort, infallible `remove_skill`, matching `gemini_uninstall` - ollama URL validation runs only on the no-API-key path (graphify-py parity); the realistic ollama path has no key, and that path is now covered end-to-end By the Omnissiah's grace
- kilo: `plugin_uri` now falls back to a lexical absolute path when the parent directory can't be canonicalized, so a stale `kilo.json` plugin entry can still be deregistered (matches Python's always-compute `resolve().as_uri()`); `#[must_use]` on `plugin_uri` and `display_rel` - extract: document that the Dart `part of` redirect falls back to standalone extraction when the referenced parent file is missing (mirrors Python's `resolve()` + `exists()` fallback) Ave Omnissiah
- extract: apply the Dart `variable_type` blacklist to the *cleaned* type, so a generic primitive like `Map<String, int>` is correctly skipped instead of emitting a spurious `map` reference (fixes a blacklist-bypass bug graphify-py shares; intentional divergence noted) - llm: drop a redundant `host.to_string()` in `validate_ollama_base_url` (use `&str`, allocate only for the error path); add IPv6 loopback / link-local provider test cases; document why the `local != global` comparison is a plain path compare (Python parity, avoids extra I/O) - tests: generalise the cli_install HOME-isolation comment Disputes (documented in code / here): - read_hook.rs keeps the file-top `#![allow(clippy::expect_used)]`: AGENTS.md explicitly sanctions this for test files, and the helpers orchestrate process spawning + fixture setup where `expect` with a message is the crate's established test pattern The Machine God wills it
Document that the third `provider_base_url_ok` test argument is `warn = true`, so the test intentionally exercises the warning paths. Glory to the Omnissiah
Use exact-equality (not substring) for the destructure-key assertion in the Dart roadmap test, so an unrelated label can never false-match. By the will of the Machine God
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (12)
🚧 Files skipped from review as they are similar to previous changes (7)
📝 WalkthroughWalkthroughThis PR introduces Kilo Code editor platform support with global skill/command installation and project-local plugin registration, implements SSRF and zip-bomb security gates for Ollama and untrusted documents, refactors Dart extraction to support part-of redirection and richer pattern detection, hardens Fortran C-preprocessor invocations, adds Claude Read|Glob hook nudging and user-scope skill management, improves cluster community determinism, and updates dependencies and documentation accordingly. ChangesKilo Code Platform Integration
Security Hardening and Extractor Improvements
Claude Hook Enhancements and User-Scope Skill Management
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
✨ Finishing Touches📝 Generate docstrings
|
Resolve the findings from the latest CodeRabbit review of the v0.8.30 resync. - `xlsx_extract_structure` now runs the same `zip_within_caps` screen as `xlsx_to_markdown` so every public XLSX parsing path honours the F2 decompression/ratio caps. graphify-py leaves this path unguarded but flags (F-035) that it needs a bomb audit before use; Rust applies the guard now. - F3 ollama validation in `call_llm` and `extract_files_direct` now hard-blocks link-local / cloud-metadata `OLLAMA_BASE_URL` for every ollama call, not only the no-key path. Reaching a real local ollama needs `GRAPHIFY_TEST_ALLOW_PRIVATE_IPS`, which also disarms the downstream SSRF guard, so this check is the only metadata defence on the ollama path. The LAN warning stays on the no-key path. Fixes a gap shared with graphify-py. - `uninstall_claude_hook` now inspects nested `hooks[].command` strings via a shared `hook_targets_graphify` helper instead of stringifying the whole entry, mirroring the install path's precise matching. - An empty `CLAUDE_CONFIG_DIR` is treated as unset via a shared `claude_config_dir` helper, so it never collapses to a stray relative path that install writes but uninstall cannot find. Applied to both install and uninstall sites. - Add `#[must_use]` to the pure `PathBuf`-returning helpers in `kilo.rs`. Test improvements: - `pdf_over_cap_returns_empty` builds a genuinely valid PDF and proves the size cap (not a parse error) yields the empty string by extracting the same file under a large vs. tiny cap, via a new `extract_pdf_text_with` seam. Adds `lopdf` as a dev-dependency. - `converters_return_empty_for_bomb` uses structurally valid DOCX/XLSX fixtures with the bomb payload in a real internal part. - New `structure_extraction_returns_empty_for_bomb` covers the new `xlsx_extract_structure` guard. - New non-empty-key assertion in `call_llm_blocks_metadata_ollama_url_at_call_site`. The two file-top `expect_used`/`unwrap_used` nitpicks are declined: the blanket allow in test files is the sanctioned project convention; the rationale is recorded in code comments at each site rather than here. Glory to the Omnissiah
Read `OLLAMA_BASE_URL` once into `ollama_base_url` before the backend branching in `extract_files_direct_mode`, and reuse that single value for both the F3 hard-block validation and the `ollama` dispatch arm. Reading the environment twice could validate a different value than the one actually sent. This mirrors the pattern already used in `call_llm`. By the will of the Machine God
e8c400c to
0af1036
Compare
Advances the
graphify-pysubmodule to v0.8.30 (724f1e3) and ports the applicablev0.8.27..v0.8.30changes to the Rust workspace. Workspace version bumped to0.8.30.Ports
(-len, sorted members)) so an identicalgrouping always gets identical integer IDs run-to-run (#1090 follow-up).
.docx/.xlsxzip-bombguard (512 MiB decompressed, 200:1 ratio, chunked streaming ceiling).
Bloc/Riverpod/Navigator patterns, annotations, typedefs, records/destructuring,
part ofredirection, generic type-lookups); F5 —
cppis passed an absolute path so anattacker-named Fortran file can't be parsed as an option.
base_url(reject non-http(s), warn on plaintextegress) and gate a project-local
providers.jsonbehindGRAPHIFY_ALLOW_LOCAL_PROVIDERS;F3: hard-block an
OLLAMA_BASE_URLthat is, or resolves to, a link-local/cloud-metadataaddress while still only warning for a general LAN host.
PreToolUsenudge for Claude Code (#1114); Kilo Code platform(native skill +
/graphifycommand +.kiloplugin + JSONC-safekilo.jsonregistration);Antigravity project-scoped installs now write the full rules/workflow layer;
claude uninstallremoves the orphaned user-scope skill tree (#1121).
Divergences from graphify-py (intentional)
references/sidecars staycollapsed to one canonical
skill.md(the Rust binary runs the whole pipeline as a singlegraphify extract, with no host-driven subagent fan-out)._backend_pkg_hinthas no Rust equivalent — all backends compile in, so there is nooptional-package import that can fail.
Map<String, int>is correctly skipped (graphify-py checks the raw string and leaks aspurious
mapreference — a bug fixed here, not replicated).Verification
cargo nextest run --workspace— 1715 pass (2 net-prefixed skipped).cargo clippy --all-targets --all-features --workspace— clean.hk check— clean.--type committed), every finding fixed or documented as a disputein-code. The final zero-findings re-verify was blocked by an org usage-credit rate-limit; the
last actionable finding is fixed in
29d23ce.Summary by CodeRabbit
New Features
Security
Improvements
Documentation
Version