docs: add CodeGraph OpenClaw example tutorial #87
Review Summary by Qodo
Add CodeGraph tutorial with OpenClaw codebase example
Walkthroughs
Description
- Adds comprehensive CodeGraph tutorial with OpenClaw codebase example
- Demonstrates CLI usage, Python API, and built-in analysis methods
- Includes practical examples of hotspot, bridge, and dead code analysis
- Registers tutorial in documentation index under Tutorials section
Diagram
flowchart LR
A["Documentation Index"] -->|registers| B["CodeGraph Tutorial"]
B -->|covers| C["CLI Usage"]
B -->|covers| D["Python API"]
B -->|covers| E["Analysis Methods"]
E -->|includes| F["Hotspots"]
E -->|includes| G["Bridge Functions"]
E -->|includes| H["Dead Code Detection"]
E -->|includes| I["Semantic Search"]
File Changes
1. doc/source/tutorials/codegraph-openclaw-example.md
Code Review by Qodo
1. No MODIFIES backfill enabled
High-risk functions ranked by fan-in × fan-out:

```python
for h in cs.hotspots(topk=10):
    print(f"{h.name} @ {h.file_path}")
    print(f" fan_in={h.fan_in}, fan_out={h.fan_out}")
```

**Actual output:**

```
push @ ui/src/ui/chat/input-history.ts
 fan_in=1747, fan_out=0
createConfigIO @ src/config/io.ts
 fan_in=18, fan_out=57
fn @ extensions/diffs/assets/viewer-runtime.js
 fan_in=533, fan_out=1
runEmbeddedPiAgent @ src/agents/pi-embedded-runner/run.ts
 fan_in=14, fan_out=65
startGatewayServer @ src/gateway/server.impl.ts
 fan_in=10, fan_out=88
now @ src/auto-reply/reply/export-html/template.security.test.ts
 fan_in=857, fan_out=0
loadOpenClawPlugins @ src/plugins/loader.ts
 fan_in=21, fan_out=36
runCronIsolatedAgentTurn @ src/cron/isolated-agent/run.ts
 fan_in=11, fan_out=56
loadSessionStore @ src/config/sessions/store.ts
 fan_in=60, fan_out=8
getReplyFromConfig @ src/auto-reply/reply/get-reply.ts
 fan_in=20, fan_out=24
```
Hotspot ranking formula contradicts the actual output
The section header states hotspots are "ranked by fan-in × fan-out," but several of the top-ranked results have fan_out=0, which would produce a score of 0 under that formula and should place them at the bottom — not the top:
- push: fan_in=1747, fan_out=0 → product = 0
- fn: fan_in=533, fan_out=1 → product = 533
- now: fan_in=857, fan_out=0 → product = 0
Under the stated formula, createConfigIO (18 × 57 = 1,026), runEmbeddedPiAgent (14 × 65 = 910), and startGatewayServer (10 × 88 = 880) would rank highest. Either the formula description is incorrect (the actual ranking may be something like max(fan_in, fan_out) or fan_in + fan_out), or the output was produced with different logic. This inconsistency will confuse readers trying to understand the hotspot scoring model.
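The mismatch can be checked mechanically. The following is a minimal sketch that re-ranks the ten rows from the tutorial's output under three candidate rules. Only the product rule is claimed by the tutorial; the `sum` and `max` rules are the reviewer's guesses, and the helper names here are illustrative rather than part of CodeGraph.

```python
# Hotspot rows copied from the tutorial's "Actual output" block, in listed order:
# (function name, fan_in, fan_out)
rows = [
    ("push", 1747, 0),
    ("createConfigIO", 18, 57),
    ("fn", 533, 1),
    ("runEmbeddedPiAgent", 14, 65),
    ("startGatewayServer", 10, 88),
    ("now", 857, 0),
    ("loadOpenClawPlugins", 21, 36),
    ("runCronIsolatedAgentTurn", 11, 56),
    ("loadSessionStore", 60, 8),
    ("getReplyFromConfig", 20, 24),
]

# Candidate scoring rules (assumptions for comparison, not CodeGraph internals).
SCORERS = {
    "product": lambda fi, fo: fi * fo,
    "sum": lambda fi, fo: fi + fo,
    "max": lambda fi, fo: max(fi, fo),
}


def ranked(scorer):
    """Return function names sorted by descending score (stable for ties)."""
    ordered = sorted(rows, key=lambda r: -scorer(r[1], r[2]))
    return [name for name, _, _ in ordered]


listed = [name for name, _, _ in rows]
by_product = ranked(SCORERS["product"])

# For each rule, does it reproduce the order shown in the tutorial?
matches = {label: ranked(fn) == listed for label, fn in SCORERS.items()}

print(by_product[:3])
print(matches)
```

If none of the candidate rules reproduces the listed order, the tutorial likely needs the real scoring rule from the implementation rather than a guess.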
```python
results = cs.vector_only_search('heartbeat periodic wake agent schedule', topk=5)
for r in results:
    print(f"id={r['id'][:20]}... score={r['score']:.3f}")
```

**Actual output:**

```
id=59744ec14e23575012c1... score=0.514
id=0b27570192377b7077cd... score=0.481
id=11fad68a6ba0d7fa0228... score=0.478
id=b33f6f3241c0a61d7118... score=0.477
id=8221fa3eb46b7e06e561... score=0.473
```
Semantic search output shows only opaque IDs — function names are missing
The example prints truncated vector IDs and scores, but gives readers no way to identify which functions were matched:
id=59744ec14e23575012c1... score=0.514
As a tutorial, this output is not actionable: a user can't act on an ID alone without knowing the corresponding function name and file path. If the vector_only_search result dict contains those fields (e.g., name, file_path), the example should include them. For instance:

    results = cs.vector_only_search('heartbeat periodic wake agent schedule', topk=5)
    for r in results:
        print(f"{r.get('name', r['id'][:20])} @ {r.get('file_path', '?')} - score={r['score']:.3f}")

Even if the API only returns IDs, the tutorial should explain what a reader should do next (e.g., query the graph to resolve the ID to a function node).
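One way to cover both cases is sketched below. It assumes each hit is a dict with `id` and `score`, as in the tutorial's snippet. The optional `name` and `file_path` keys, the `describe_hit` helper, and the `nodes_by_id` lookup table are all hypothetical; how that table would be built from the graph is not shown in the tutorial and is left open here.

```python
from typing import Mapping, Optional


def describe_hit(hit: Mapping, nodes_by_id: Optional[Mapping[str, Mapping]] = None) -> str:
    """Render one search hit as 'name @ file_path (score=...)'.

    Prefers fields on the hit itself, then a caller-supplied id -> node
    lookup, and finally falls back to the truncated id.
    """
    node = (nodes_by_id or {}).get(hit["id"], {})
    name = hit.get("name") or node.get("name") or hit["id"][:20] + "..."
    path = hit.get("file_path") or node.get("file_path") or "?"
    return f"{name} @ {path} (score={hit['score']:.3f})"


# Hypothetical data for illustration only; not real CodeGraph output.
hits = [
    {"id": "hypothetical-id-0001", "score": 0.514},
    {"id": "hypothetical-id-0002", "score": 0.481,
     "name": "exampleFn", "file_path": "src/example.ts"},
]
nodes_by_id = {
    "hypothetical-id-0001": {"name": "resolvedFn", "file_path": "src/resolved.ts"},
}

lines = [describe_hit(h, nodes_by_id) for h in hits]
for line in lines:
    print(line)
```

Keeping the fallback chain in one helper means the tutorial's loop stays a one-liner regardless of which fields the search API actually returns.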
> **Note**: Dead code detection may include external dependencies. Filter by `is_external = 0` for project-specific results.
Dead-code filter note lacks a code example
The note advises filtering by is_external = 0 to exclude virtual-environment files, but the tutorial is code-driven and this is exactly the kind of practical pitfall a reader will hit first. Providing the actual query would make this note actionable:

    # Filter out external/vendored dependencies
    for d in cs.dead_code()[:10]:
        if not getattr(d, 'is_external', 0):
            print(f"{d.name} @ {d.file_path}")

Or, if dead_code() accepts a parameter:

    for d in cs.dead_code(is_external=False)[:10]:
        print(f"{d.name} @ {d.file_path}")

Without a working snippet, the reader must guess the attribute name and the filtering mechanism.
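One detail worth noting about the first suggestion: slicing to ten entries before filtering can leave fewer than ten project-local results. A minimal sketch of filtering first and truncating afterwards follows. `DeadFn` is a hypothetical stand-in for whatever `dead_code()` returns, and the `is_external` attribute name is taken from the tutorial's note, not from a confirmed API.

```python
from dataclasses import dataclass
from itertools import islice


@dataclass
class DeadFn:
    """Hypothetical stand-in for one dead_code() result."""
    name: str
    file_path: str
    is_external: int = 0


def project_dead_code(entries, limit=10):
    """Return up to `limit` entries that are not marked external."""
    internal = (e for e in entries if not getattr(e, "is_external", 0))
    return list(islice(internal, limit))


# Hypothetical entries for illustration only.
entries = [
    DeadFn("vendored_helper", ".venv/lib/pkg/util.py", is_external=1),
    DeadFn("unused_parser", "src/parse.py"),
    DeadFn("vendored_shim", ".venv/lib/pkg/shim.py", is_external=1),
    DeadFn("old_handler", "src/handlers.py"),
]

# Filter first, then truncate: both project-local entries are kept.
top2 = project_dead_code(entries, limit=2)
for d in top2:
    print(f"{d.name} @ {d.file_path}")

# Truncate first, then filter: only one project-local entry survives.
sliced_first = [e for e in entries[:2] if not e.is_external]
```

Using `getattr` with a default keeps the helper usable even if some result objects lack the attribute.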
```bash
# Create index (first time)
codegraph init --repo /path/to/your/project --lang auto --commits 100

# Check index status
codegraph status --db $CODESCOPE_DB_DIR
```
1. No MODIFIES backfill enabled (Bug, Correctness)
The tutorial’s codegraph init command omits --backfill-limit, so function-level MODIFIES edges are never computed and evolution features relying on them won’t work (your own sample output shows MODIFIES: 0). This contradicts the documented requirement that MODIFIES edges require backfill.
Agent Prompt
### Issue description
The tutorial instructs running `codegraph init` without `--backfill-limit`, which prevents generating function-level `MODIFIES` edges and breaks evolution workflows that depend on them.
### Issue Context
Repository CodeGraph docs state `MODIFIES` edges require backfill and the CLI supports this via `--backfill-limit`.
### Fix Focus Areas
- doc/source/tutorials/codegraph-openclaw-example.md[52-58]
- doc/source/tutorials/codegraph-openclaw-example.md[74-78]
### What to change
- Update the `codegraph init` example to include an explicit `--backfill-limit` value (or add an adjacent note explaining that without backfill `MODIFIES` remains 0 and evolution queries will be limited).
- Ensure the sample `codegraph status` output is consistent with the updated command (either show non-zero MODIFIES, or explicitly explain why it may be 0).
Co-authored-by: Longbin Lai <longbin.lai@gmail.com>
Adds a CodeGraph use case tutorial to the documentation.
Changes
- doc/source/tutorials/codegraph-openclaw-example.md: end-to-end walkthrough of CodeGraph with the OpenClaw codebase, covering indexing, CLI usage, Python API, hotspot/bridge/dead-code analysis, and semantic search
- doc/source/index.rst: registers the tutorial under the Tutorials section
Greptile Summary
This PR adds an end-to-end tutorial (codegraph-openclaw-example.md) for the CodeGraph skill, a code analysis tool built on NeuG and a vector database, and registers it in the documentation's Tutorials toctree. The tutorial walks through installation, CLI usage, Python API queries, and built-in analysis methods (hotspots, bridge functions, dead code, semantic search) using the OpenClaw codebase as a concrete example.
Key issues found:
- Line 229: The section states hotspots are "ranked by fan-in × fan-out," but the example output places functions with fan_out=0 (e.g., push with fan_in=1747, fan_out=0 → product=0) at the very top, which contradicts the stated formula. The ranking criterion should be corrected or the formula description should be updated.
- Line 323: The example only prints truncated vector IDs and scores. Readers cannot identify which functions matched from the output alone; function names and file paths should be shown.
- Line 317: The note recommends filtering by is_external = 0 to exclude vendored/virtual-env files, but provides no snippet demonstrating how to apply that filter in practice.
Confidence Score: 3/5
The index.rst change is trivial and correct. The tutorial itself is well-structured and covers the feature thoroughly, but the hotspot description contains a demonstrable factual error (items with fan_out=0 cannot rank first under a fan-in × fan-out formula), which would mislead readers about how the scoring actually works. The semantic search and dead-code issues are usability gaps rather than blockers.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Source Repository] -->|codegraph init| B[Indexing Pipeline]
B --> C[(NeuG Graph DB\nFile, Function, Class,\nModule, Commit nodes)]
B --> D[(zvec Vector DB\nFunction Embeddings)]
C & D --> E[CodeGraph API / CLI]
E --> F[CLI Commands]
F --> F1[codegraph status]
F --> F2[codegraph query NL]
F --> F3[codegraph analyze]
E --> G[Python API — CodeScope]
G --> G1[cs.conn.execute\nCypher Queries]
G --> G2[cs.hotspots\nfan-in × fan-out ranking]
G --> G3[cs.bridge_functions\ncross-module callers]
G --> G4[cs.dead_code\nzero-caller functions]
G --> G5[cs.vector_only_search\nsemantic similarity]
G1 --> H[Call Chain / Impact Analysis]
G2 --> I[Architecture Hotspots]
G3 --> J[Bridge Functions Report]
G4 --> K[Dead Code Report]
G5 --> L[Semantic Search Results]
Last reviewed commit: "add a codegraph exam..."