Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [Unreleased]

### Fixed
- **`codegraph_files` now returns the whole project when an agent passes `path="/"`, `"."`, `"./"`, `""`, or a Windows-style `"\\"` — instead of "No files found matching the criteria."** Indexed file paths are stored as project-relative POSIX (e.g. `src/foo.ts`), but the path filter used a plain `startsWith`, so a leading slash or any of the other root-ish shapes an agent might guess matched nothing and pushed the agent back to Read/Glob — the exact opencode + Gemini Flash regression reported on Windows 11. Subdirectory filters are now equally forgiving: `"/src"`, `"./src"`, `"src/"`, `"src\\components"`, etc. all resolve correctly. Sibling-prefix bleed (`"src"` was previously matching `src-utils/...`) is also fixed — the filter now requires either an exact match or a `<filter>/` boundary. Closes #426.
- **File watcher no longer marks edited files as fresh when another process holds the index lock.** When a second writer (concurrent `codegraph index`, a git hook, another MCP daemon) held `.codegraph/codegraph.lock`, `CodeGraph.sync()` returned a zero-shape no-op instead of throwing. The file watcher took that as a successful sync and cleared `pendingFiles` — so the per-file staleness signal MCP tools surface to agents (issue #403) dropped immediately, even though the edit was never indexed. `CodeGraph.watch()` now converts that no-op into a typed `LockUnavailableError` thrown into the watcher; the existing retry path preserves `pendingFiles` and reschedules until the lock becomes available. The error is logged at debug only (no `onSyncError` callback) so a long-running external indexer doesn't spam stderr every debounce cycle. Closes #449.
- **Watch sync no longer aborts with `FOREIGN KEY constraint failed`.** PR #62 plugged this FK violation at the extraction layer (empty-named nodes whose containment edges had no target), but the same violation kept reappearing on v0.9.5 during the daemon's *watch sync* — not on initial index. Once an agent's daemon had been running long enough to accumulate edits, a resolver lookup that crossed a framework-specific cache could hand back a node whose row had been removed by a recent file rewrite, and the FK check then aborted the entire resolution batch, leaving the user's daemon log filling with `Watch sync failed { error: 'FOREIGN KEY constraint failed' }`. `QueryBuilder.insertEdges` now validates every batch's endpoints against the `nodes` table directly (one fresh `SELECT id IN (...)` per batch, no cache) and silently skips edges with missing source or target — so a stale lookup result drops one edge instead of aborting the whole sync. Surfaces as a fresh `codegraph init`/`index` cycle now surviving its first watch-sync cycle without the FK error, and the daemon recovering naturally instead of compounding into further failures. Closes #455.
- **Hermes Agent: `codegraph install --target hermes` no longer corrupts `~/.hermes/config.yaml`.** Hermes serializes its config with PyYAML's default block style, which writes list items at the *same* indent as the parent mapping key (`cli:` and `- hermes-cli` both at column 2). The previous line-based YAML patcher mistook that first ` - hermes-cli` for the next sibling key, truncated the `cli:` block, and then spliced `- mcp-codegraph` at indent 4 *before* the existing items — leaving subsequent entries (`- browser`, `- clarify`, …) and even other platforms (`telegram:`, `discord:`) appearing at the `platform_toolsets:` level, which is no longer parseable YAML. The installer now recognizes the same-indent list style, finds the real end of the block at the next sibling key, and appends `- mcp-codegraph` at whatever indent the existing items already use. Re-installing on an already-corrupted file (or a 4-space-nested config that worked before) still produces a clean, parseable result. Closes #456.
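The endpoint check described in the #455 entry can be pictured with a small in-memory sketch. Everything here is an assumption for illustration: the `Edge` shape, the `dropDanglingEdges` name, and the `Set` that stands in for the fresh per-batch lookup against the `nodes` table. The real `QueryBuilder.insertEdges` is not part of this diff.

```typescript
// Hypothetical edge shape; the real codegraph types are not shown in this diff.
interface Edge {
  source: string;
  target: string;
  kind: string;
}

/**
 * Keep only edges whose source and target both exist.
 * `existingIds` stands in for the ids returned by a fresh lookup against
 * the `nodes` table for this batch (no cache involved).
 */
function dropDanglingEdges(edges: Edge[], existingIds: ReadonlySet<string>): Edge[] {
  return edges.filter(e => existingIds.has(e.source) && existingIds.has(e.target));
}

const existing = new Set(['a', 'b']);
const batch: Edge[] = [
  { source: 'a', target: 'b', kind: 'calls' },
  // The target row was removed by a recent file rewrite, so this edge is stale.
  { source: 'a', target: 'gone', kind: 'calls' },
];

const insertable = dropDanglingEdges(batch, existing);
console.log(insertable.length); // prints 1
```

Dropping the one stale edge keeps the rest of the batch insertable, which is the behavior the entry describes: a stale lookup costs a single edge rather than the whole sync.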
113 changes: 113 additions & 0 deletions __tests__/mcp-files-path-normalization.test.ts
@@ -0,0 +1,113 @@
/**
 * codegraph_files path-filter normalization (#426)
 *
 * Stored file paths are project-relative POSIX (e.g. "src/foo.ts"). Some
 * agents pass project-root variants like "/", ".", "./" or "" when they want
 * "the whole project", and Windows-style backslashes or leading "/" / "./"
 * prefixes when they want a subtree. The old filter used a plain
 * `startsWith(pathFilter)`, so any of those buried the agent at "no files
 * found" and pushed it back to Read/Glob — the exact opencode regression in
 * #426. These tests pin every branch of the normalization.
 */

import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import * as fs from 'fs';
import * as path from 'path';
import * as os from 'os';
import CodeGraph from '../src/index';
import { ToolHandler } from '../src/mcp/tools';

describe('codegraph_files path normalization', () => {
  let tempDir: string;
  let cg: CodeGraph;
  let handler: ToolHandler;

  beforeEach(async () => {
    tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-files-paths-'));
    fs.mkdirSync(path.join(tempDir, 'src', 'components'), { recursive: true });
    fs.mkdirSync(path.join(tempDir, 'tests'), { recursive: true });
    fs.writeFileSync(path.join(tempDir, 'src', 'index.ts'), `export const x = 1;\n`);
    fs.writeFileSync(
      path.join(tempDir, 'src', 'components', 'Button.ts'),
      `export const Button = () => 1;\n`
    );
    fs.writeFileSync(path.join(tempDir, 'tests', 'a.test.ts'), `export const t = 1;\n`);
    cg = await CodeGraph.init(tempDir, {
      config: { include: ['**/*.ts'], exclude: [] },
    });
    await cg.indexAll();
    handler = new ToolHandler(cg);
  });

  afterEach(() => {
    if (cg) cg.destroy();
    if (fs.existsSync(tempDir)) {
      fs.rmSync(tempDir, { recursive: true, force: true });
    }
  });

  async function listed(pathFilter: string | undefined): Promise<string> {
    const result = await handler.execute('codegraph_files', {
      ...(pathFilter !== undefined ? { path: pathFilter } : {}),
      format: 'flat',
      includeMetadata: false,
    });
    expect(result.isError).toBeFalsy();
    return result.content[0]!.text as string;
  }

  // Root-ish filters: every shape an agent might guess for "whole project"
  // must list the same files as no filter at all.
  for (const rootish of ['/', '.', './', '', '\\', '//', './/']) {
    it(`treats path=${JSON.stringify(rootish)} as project root`, async () => {
      const output = await listed(rootish);
      expect(output).toContain('src/index.ts');
      expect(output).toContain('src/components/Button.ts');
      expect(output).toContain('tests/a.test.ts');
    });
  }

  it('matches a real subdirectory prefix', async () => {
    const output = await listed('src');
    expect(output).toContain('src/index.ts');
    expect(output).toContain('src/components/Button.ts');
    expect(output).not.toContain('tests/a.test.ts');
  });

  it('tolerates a leading slash on a real subdirectory', async () => {
    const output = await listed('/src');
    expect(output).toContain('src/index.ts');
    expect(output).not.toContain('tests/a.test.ts');
  });

  it('tolerates a leading "./" on a real subdirectory', async () => {
    const output = await listed('./src');
    expect(output).toContain('src/index.ts');
    expect(output).not.toContain('tests/a.test.ts');
  });

  it('tolerates a trailing slash on a real subdirectory', async () => {
    const output = await listed('src/');
    expect(output).toContain('src/index.ts');
    expect(output).not.toContain('tests/a.test.ts');
  });

  it('normalizes Windows backslashes', async () => {
    const output = await listed('src\\components');
    expect(output).toContain('src/components/Button.ts');
    expect(output).not.toContain('src/index.ts');
  });

  // Old code matched on raw `startsWith`, so a filter "src" would also
  // return a sibling like "src-utils/...". The new code requires either an
  // exact match or a "<filter>/" boundary, so prefixes don't bleed.
  it('does not match sibling directories that share a prefix', async () => {
    fs.mkdirSync(path.join(tempDir, 'src-utils'), { recursive: true });
    fs.writeFileSync(path.join(tempDir, 'src-utils', 'helper.ts'), `export const h = 1;\n`);
    await cg.indexAll();

    const output = await listed('src');
    expect(output).toContain('src/index.ts');
    expect(output).not.toContain('src-utils/helper.ts');
  });
});
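The sibling-prefix case pinned by the last test can be reproduced without an index. The sketch below is illustrative only: `matchesPrefix` is a hypothetical helper name, and it assumes the filter has already been normalized to a project-relative POSIX path with no leading or trailing slashes.

```typescript
// Hypothetical helper; it mirrors the exact-or-"<filter>/" boundary rule.
function matchesPrefix(filePath: string, filter: string): boolean {
  if (filter === '') return true; // an empty filter means "whole project"
  return filePath === filter || filePath.startsWith(filter + '/');
}

const paths = [
  'src/index.ts',
  'src/components/Button.ts',
  'src-utils/helper.ts',
  'tests/a.test.ts',
];

// Plain startsWith lets the sibling directory through.
const naive = paths.filter(p => p.startsWith('src'));
// Boundary-aware matching keeps only real descendants of "src".
const bounded = paths.filter(p => matchesPrefix(p, 'src'));

console.log(naive);   // includes 'src-utils/helper.ts'
console.log(bounded); // [ 'src/index.ts', 'src/components/Button.ts' ]
```

Appending `/` before the prefix comparison is what separates `src/...` from `src-utils/...`; the exact-match branch keeps a filter that names a single file working.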
17 changes: 14 additions & 3 deletions src/mcp/tools.ts
@@ -2248,9 +2248,20 @@ export class ToolHandler {
       return this.textResult('No files indexed. Run `codegraph index` first.');
     }

-    // Filter by path prefix
-    let files = pathFilter
-      ? allFiles.filter(f => f.path.startsWith(pathFilter) || f.path.startsWith('./' + pathFilter))
+    // Filter by path prefix. Stored paths are project-relative POSIX (e.g.
+    // "src/foo.ts"), but agents commonly pass project-root variants like "/",
+    // ".", "./", "" or Windows-style "src\foo" — and prefixes with leading
+    // "/", "./" or "\". Normalize all of those before matching so the agent
+    // gets results instead of falling back to Read/Glob (see #426).
+    const normalizedFilter = pathFilter
+      ? pathFilter
+          .replace(/\\/g, '/')
+          .replace(/^(?:\.?\/+)+/, '')
+          .replace(/^\.$/, '')
+          .replace(/\/+$/, '')
+      : '';
+    let files = normalizedFilter
+      ? allFiles.filter(f => f.path === normalizedFilter || f.path.startsWith(normalizedFilter + '/'))
       : allFiles;

     // Filter by glob pattern
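To see what each `replace` step contributes, the chain can be lifted into a standalone function and run over the filter shapes the tests exercise. This is a sketch: `normalizePathFilter` is a hypothetical name (the PR inlines the chain inside `ToolHandler`), and the per-step comments are a reading of the regexes rather than documentation from the project.

```typescript
// Hypothetical standalone wrapper around the same replace chain.
function normalizePathFilter(pathFilter: string | undefined): string {
  if (!pathFilter) return '';
  return pathFilter
    .replace(/\\/g, '/')         // Windows separators become POSIX slashes
    .replace(/^(?:\.?\/+)+/, '') // strip leading "/", "./", ".//" and repeats
    .replace(/^\.$/, '')         // a bare "." means project root
    .replace(/\/+$/, '');        // strip trailing slashes
}

const samples = [
  '/', '.', './', '', '\\', '//', './/',          // root-ish: all become ""
  '/src', './src', 'src/', 'src\\components',     // subtree variants
];

for (const s of samples) {
  console.log(JSON.stringify(s), '->', JSON.stringify(normalizePathFilter(s)));
}
```

Every root-ish shape collapses to the empty string, which the caller treats as "no filter". Note that a parent reference such as `"../src"` is left untouched by these regexes, so it would simply match nothing.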