Context
I've been building a tool that consumes .codegraph/ output to understand project structure. The data in .codegraph/ (communities, hub nodes, entry points) is exactly what I need — but I'm reading it by guessing at file names and JSON shapes rather than relying on a documented contract.
The ask
Would you consider documenting a stable schema for .codegraph/index.json (or whichever file is meant to be the primary output)?
Specifically: the fields that are guaranteed to be present, their types, and whether the format is considered stable across versions.
Even a brief note in the README like "The .codegraph/ directory contains X. The top-level keys are Y and Z. This format is stable/experimental." would be enough for downstream tools to consume it reliably.
Why it matters
Right now anyone wanting to build on codegraph output has to reverse-engineer the schema from examples or source code. A documented, stable schema would let:
- IDE extensions index it without re-running the graph
- Documentation tools render architecture summaries
- Context generators (like mine) include structural info in AI context files
- Scripts in CI surface community/hub data without shelling out
Happy to help
If this is something you'd find useful, I'm happy to contribute:
- A JSON Schema or TypeScript type definition for the output format
- A brief section in the README documenting the fields
- A versioning note (e.g. "format is stable as of v0.x")
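For illustration, here is a rough sketch of what the TypeScript definition could look like. Every name in it (`formatVersion`, `communities`, `hubNodes`, `entryPoints`, `id`, `members`) is my guess based on the concepts above, not the actual output format. The real fields would come from whatever you decide to guarantee.

```typescript
// Hypothetical contract for .codegraph/index.json.
// None of these field names are confirmed by codegraph; they are placeholders.
interface Community {
  id: number;        // assumed: numeric community identifier
  members: string[]; // assumed: file paths or node ids in the community
}

interface CodegraphIndex {
  formatVersion: string; // assumed: lets consumers detect breaking changes
  communities: Community[];
  hubNodes: string[];
  entryPoints: string[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((v) => typeof v === "string");
}

function isCommunity(value: unknown): value is Community {
  if (typeof value !== "object" || value === null) return false;
  const c = value as Record<string, unknown>;
  return typeof c.id === "number" && isStringArray(c.members);
}

// Runtime guard so a downstream tool fails early instead of guessing at shapes.
function isCodegraphIndex(value: unknown): value is CodegraphIndex {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.formatVersion === "string" &&
    Array.isArray(v.communities) &&
    v.communities.every((c) => isCommunity(c)) &&
    isStringArray(v.hubNodes) &&
    isStringArray(v.entryPoints)
  );
}
```

A consumer would parse the file with `JSON.parse`, run the result through `isCodegraphIndex`, and bail out with a clear message if the check fails or the version is one it doesn't recognize.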
Just let me know what would be most helpful. Thanks for building codegraph — the community detection approach is genuinely useful for large codebases.