feat(graph,encoder): persist DependencyEdge symbols and add cross-boundary excerpts by amondnet · Pull Request #156 · pleaseai/soop

amondnet · 2026-03-04T02:43:50Z

Summary

DependencyEdge symbol persistence (packages/graph/src/adapters.ts): edgeToAttrs() now persists symbol → dep_symbol and targetSymbol → dep_target_symbol; attrsToEdge() restores them on round-trip
Cross-boundary code excerpts (packages/encoder/src/encoder.ts): new getCrossBoundaryExcerpts() method reads actual import/call lines from source files for cross-area dependency edges; wired into analyzeCrossAreaDataFlows() replacing the empty '' TODO

Test plan

Round-trip test: symbol/targetSymbol survive serialize → deserialize
Cross-boundary excerpts: positive case with temp files, empty case with same-area edges
bun run typecheck — clean
Graph tests: 18/18 pass

Closes #150, closes #153

Summary by cubic

Persist DependencyEdge symbols across serialize/deserialize and add safe cross-area code excerpts in the encoder with symlink-aware path checks and backtick escaping, fulfilling #150 and #153.

New Features
- Graph adapters serialize/deserialize dep_symbol and dep_target_symbol with null checks to preserve empty strings.
- Encoder adds getCrossBoundaryExcerpts and uses it in analyzeCrossAreaDataFlows to read import/call lines for up to 50 cross-area edges, with memoized area lookup, path traversal and symlink containment checks, 300-char truncation, and warning logs.
Bug Fixes
- Adapters use null checks for dependencyType and now throw on unknown edge types to avoid silent fallthrough.
- Removed DataFlowEdge handling on this branch where data_flow edges don’t exist.
- Escaped backticks in cross-boundary excerpts to prevent prompt injection in markdown prompts.

^{Written for commit 5f98480. Summary will update on new commits.}

gemini-code-assist · 2026-03-04T02:44:07Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the repository planning graph's capabilities by improving how dependency information is stored and presented. It introduces the persistence of symbol and targetSymbol attributes for dependency edges, ensuring richer graph data. Additionally, it implements a mechanism to generate concrete code excerpts for cross-boundary dependencies, providing more actionable insights into how different domain areas interact.

Highlights

Dependency Edge Symbol Persistence: The symbol and targetSymbol properties of DependencyEdge objects are now correctly serialized and deserialized, ensuring their persistence across graph operations.
Cross-Boundary Code Excerpt Generation: A new method, getCrossBoundaryExcerpts, has been introduced to extract actual code lines for dependency edges that span different domain areas, replacing a previous placeholder.

Changelog

packages/encoder/src/encoder.ts
- Implemented getCrossBoundaryExcerpts to read import/call lines for cross-area dependency edges.
- Integrated getCrossBoundaryExcerpts into analyzeCrossAreaDataFlows to populate cross-boundary code excerpts.
packages/encoder/tests/encoder.test.ts
- Added a test suite for RPGEncoder.getCrossBoundaryExcerpts to verify its functionality.
- Included tests for both successful excerpt generation and cases where no cross-area edges exist.
packages/graph/src/adapters.ts
- Updated edgeToAttrs to serialize symbol and targetSymbol from DependencyEdge to dep_symbol and dep_target_symbol attributes.
- Updated attrsToEdge to deserialize dep_symbol and dep_target_symbol attributes back into symbol and targetSymbol for DependencyEdge.
packages/graph/tests/graph.test.ts
- Added a test case to confirm that symbol and targetSymbol properties of DependencyEdge objects are preserved during graph serialization and deserialization.

Activity

Verified symbol/targetSymbol persistence through a round-trip test.
Confirmed cross-boundary excerpts generation with positive and empty test cases.
Ensured type-checking passes (bun run typecheck).
Confirmed all graph tests pass (18/18).

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

cubic-dev-ai

3 issues found across 4 files

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/graph/src/adapters.ts">

<violation number="1" location="packages/graph/src/adapters.ts:136">
P2: Using truthy checks for `symbol`/`targetSymbol` can silently drop valid empty-string values and break exact serialize→deserialize round-trips. Use null/undefined checks instead of truthiness.</violation>
</file>

<file name="packages/encoder/tests/encoder.test.ts">

<violation number="1" location="packages/encoder/tests/encoder.test.ts:630">
P2: Avoid hardcoded `/tmp/...` paths in tests; derive a temp path via `os.tmpdir()`/`path.join(...)` so the test runs consistently across platforms.

(Based on your team's feedback about avoiding hardcoded absolute paths in tests.) [FEEDBACK_USED]</violation>
</file>

<file name="packages/encoder/src/encoder.ts">

<violation number="1" location="packages/encoder/src/encoder.ts:960">
P2: Do not silently swallow excerpt read failures; log a non-fatal warning so partial analysis degradation is observable.

(Based on your team's feedback about avoiding silent catch-block failures in recoverable paths.) [FEEDBACK_USED]</violation>
</file>

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.}

packages/graph/src/adapters.ts

packages/encoder/tests/encoder.test.ts

packages/encoder/src/encoder.ts

gemini-code-assist

Code Review

This pull request introduces two main features: persisting symbol and targetSymbol for DependencyEdge during serialization, and implementing the generation of cross-boundary code excerpts for dependency analysis. While the new getCrossBoundaryExcerpts method enhances data flow analysis, it introduces two critical security concerns: a potential path traversal vulnerability via symbolic links and a risk of indirect prompt injection by including unsanitized file content in LLM prompts. These issues must be addressed to ensure the tool can safely handle untrusted repositories. Furthermore, there is an opportunity for performance improvement in getCrossBoundaryExcerpts that could benefit large repositories.

packages/encoder/src/encoder.ts

…ndary excerpts - Add symbol/targetSymbol persistence in edgeToAttrs/attrsToEdge adapters so these fields survive serialize→deserialize round-trips (#150) - Add getCrossBoundaryExcerpts() to RPGEncoder that reads import/call lines from source files for cross-area dependency edges, providing real code context to the LLM data-flow analysis prompt (#153) - Add round-trip test for symbol/targetSymbol in graph.test.ts - Add tests for cross-boundary excerpts in encoder.test.ts Closes #150 Closes #153

The data_flow edge type doesn't exist on this branch (it's added in the DataFlowEdge migration PR). Remove the DataFlowEdge import and attrs handling that was incorrectly included, keeping only the DependencyEdge symbol persistence changes (dep_symbol, dep_target_symbol).

- Use null checks instead of truthy for DependencyEdge symbol/targetSymbol in adapters.ts to avoid silently dropping empty-string values - Replace hardcoded /tmp/test path with os.tmpdir() in encoder tests for cross-platform consistency - Add warning log to catch block in getCrossBoundaryExcerpts instead of silently swallowing read failures - Add memoization cache to findDomainArea for performance improvement on large repositories

- Add path traversal and symlink containment check in getCrossBoundaryExcerpts using path.resolve pre-check and realpath verification, hoisted outside loop - Truncate source code lines at 300 chars before embedding in LLM prompt - Extract magic number 50 to named MAX_CROSS_BOUNDARY_EXCERPTS constant - Fix dependencyType guard from truthy to null-check in edgeToAttrs - Change attrsToEdge to throw on unknown edge types instead of silent fallthrough - Remove unused createLogger import from adapters.ts - Add direct unit tests for edgeToAttrs/attrsToEdge covering symbol fields - Add test for file-read failure path in getCrossBoundaryExcerpts - Strengthen output format assertion with regex in excerpt tests - Fix cross-platform temp path using Date.now() suffix in tests

cubic-dev-ai

1 issue found across 4 files (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/encoder/src/encoder.ts">

<violation number="1" location="packages/encoder/src/encoder.ts:956">
P1: Path traversal pre-check is broken when `repoPath` is a symlink: `preResolved` uses `path.resolve` (no symlink resolution) but is compared against `resolvedRepo` from `realpath` (symlinks resolved). When `repoPath` is itself a symlink, the two will never share a prefix, causing all legitimate files to be silently skipped. The pre-check should compare against `path.resolve(repoPath)` instead of `resolvedRepo`.</violation>
</file>

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.}

packages/encoder/src/encoder.ts

The pre-check compared preResolved (path.resolve, no symlink resolution) against resolvedRepo (realpath, symlinks resolved). When repoPath itself is a symlink, the two never share a prefix, causing all legitimate files to be silently skipped. Fix: compare preResolved against path.resolve(repoPath) instead of resolvedRepo so both sides use consistent non-symlink resolution. The actual symlink boundary check (after realpath) remains unchanged. Addresses P1 review from cubic-dev-ai on PR #156.

… prompt injection Source code lines passed directly to LLM prompts could contain backtick sequences that break out of markdown code spans or embed instructions. Escape backticks in the excerpt after truncation so the string is safe to embed in a markdown-formatted LLM prompt. Addresses security feedback from gemini-code-assist on PR #156.

sonarqubecloud · 2026-03-04T11:19:25Z

Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

cubic-dev-ai bot reviewed Mar 4, 2026

View reviewed changes

packages/graph/src/adapters.ts Outdated Show resolved Hide resolved

packages/encoder/tests/encoder.test.ts Outdated Show resolved Hide resolved

packages/encoder/src/encoder.ts Outdated Show resolved Hide resolved

gemini-code-assist bot reviewed Mar 4, 2026

View reviewed changes

packages/encoder/src/encoder.ts Outdated Show resolved Hide resolved

packages/encoder/src/encoder.ts Outdated Show resolved Hide resolved

packages/encoder/src/encoder.ts Outdated Show resolved Hide resolved

amondnet added 2 commits March 4, 2026 16:32

amondnet force-pushed the feat/dependency-edge-symbols-cross-excerpts branch from 019c730 to 02c5dc1 Compare March 4, 2026 07:41

This was referenced Mar 4, 2026

Security: validate symlink paths in getCrossBoundaryExcerpts against repoPath #158

Open

Security: consider prompt injection risk from source code in getCrossBoundaryExcerpts #159

Open

cubic-dev-ai bot reviewed Mar 4, 2026

View reviewed changes

packages/encoder/src/encoder.ts Outdated Show resolved Hide resolved

amondnet self-assigned this Mar 4, 2026

amondnet merged commit 0fadc4f into main Mar 4, 2026
7 checks passed

amondnet deleted the feat/dependency-edge-symbols-cross-excerpts branch March 4, 2026 11:23

pleaeai-bot bot mentioned this pull request Mar 4, 2026

chore: release #157

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(graph,encoder): persist DependencyEdge symbols and add cross-boundary excerpts#156

feat(graph,encoder): persist DependencyEdge symbols and add cross-boundary excerpts#156
amondnet merged 6 commits intomainfrom
feat/dependency-edge-symbols-cross-excerpts

amondnet commented Mar 4, 2026 •

edited by cubic-dev-ai bot

Loading

Uh oh!

gemini-code-assist bot commented Mar 4, 2026

Uh oh!

cubic-dev-ai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cubic-dev-ai bot left a comment

Uh oh!

Uh oh!

sonarqubecloud bot commented Mar 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

amondnet commented Mar 4, 2026 • edited by cubic-dev-ai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by cubic

Uh oh!

gemini-code-assist bot commented Mar 4, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

sonarqubecloud bot commented Mar 4, 2026

Quality Gate passed

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

amondnet commented Mar 4, 2026 •

edited by cubic-dev-ai bot

Loading