Graph-informed structural refactoring control plane for Python codebases.
codescaffold helps coding agents and humans reorganize Python projects safely by combining repository graph analysis, explicit refactor plans, mechanical rope/LibCST rewrites, sandboxed git worktrees, staged validation, and optional import-linter contracts.
It is not a fully autonomous architecture fixer. It is a control plane that makes large refactors inspectable, reviewable, and mechanically safer.
Coding agents are good at judgment-heavy work:
- deciding names
- interpreting intent
- explaining architecture
- writing documentation
- resolving ambiguous design choices
They are weaker at structural bookkeeping:
- remembering every import edge
- tracking file moves across a repo
- updating references consistently
- preserving package importability
- enforcing architecture after a refactor
- safely rolling back failed changes
codescaffold is designed to cover that mechanical side.
graphify / graph data
↓
codescaffold analysis
↓
reviewable placement decisions
↓
approved file/symbol moves
↓
rope / LibCST rewrites
↓
sandbox validation
↓
contracts / audit trail
Experimental but functional.
The current project focus is:
- MCP-first workflow
- sandboxed structural refactoring
- graph-derived placement evidence
- staged validation
- import-linter contract generation
- agent-assisted rename/docstring workflow
- better use of graphify graph data beyond simple community-to-file grouping
codescaffold can:
- analyze a Python repository graph
- identify communities/clusters in the codebase
- surface placement decisions for an agent to review
- show graph evidence for clusters and symbols
- approve selected file moves
- apply approved moves in a git worktree sandbox
- create required package `__init__.py` files
- rewrite imports after moves
- validate structurally and behaviorally
- apply package/module/symbol rename maps
- insert or replace docstrings
- generate import-linter contracts
- validate contracts
- merge or discard sandbox branches
codescaffold does not try to infer perfect architecture by itself.
It should not be used as:
- a blind auto-refactor button
- a replacement for code review
- proof that graph communities are architecturally correct
- proof that similar-looking functions are semantically equivalent
- a substitute for human/API compatibility judgment
The graph is evidence, not authority.
Requires Python 3.11+.
git clone https://github.com/Xopher00/codescaffold.git
cd codescaffold
pip install -e ".[dev]"
The package installs the MCP server entry point:
codescaffold-mcp
Register codescaffold-mcp with your MCP-capable coding agent.
Example MCP server configuration:
{
  "mcpServers": {
    "codescaffold": {
      "command": "codescaffold-mcp",
      "args": []
    }
  }
}
Exact configuration depends on the agent or client you use.
analyze
get_cluster_context
approve_moves
apply
apply_rename_map
validate
merge_sandbox
discard_sandbox
reset
contracts
validate_contracts
update_contract
propose_violation_fix
| Tool | Purpose |
|---|---|
| `analyze` | Run graphify, propose move candidates, persist a plan |
| `get_cluster_context` | Show graph evidence for a community/cluster |
| `approve_moves` | Record agent-approved moves into the persisted plan |
| `apply` | Execute approved moves in a sandboxed worktree; runs compileall + pytest |
| `apply_rename_map` | Batch-rename symbols/modules across the repo in a sandbox (single rope session) |
| `validate` | Re-run compileall + pytest inside an existing sandbox branch |
| `merge_sandbox` | Merge a completed sandbox branch into HEAD with --no-ff |
| `discard_sandbox` | Discard a sandbox branch and remove its worktree |
| `reset` | Delete the persisted plan and audit records |
| `contracts` | Generate .importlinter from current graph layers; surface cycle-break moves if cyclic |
| `validate_contracts` | Run lint-imports and return a formatted pass/fail report |
| `update_contract` | Regenerate .importlinter from a sandbox's post-move state |
| `propose_violation_fix` | Suggest alternative move targets that satisfy the existing contract layers |
1. analyze repository
2. inspect cluster context
3. approve selected moves
4. apply approved moves in sandbox
5. inspect result
6. apply rename map if placeholder names remain
7. validate
8. generate / validate contracts
9. merge or discard sandbox
A typical agent-guided flow:
analyze
→ get_cluster_context
→ approve_moves
→ apply
→ apply_rename_map
→ validate
→ contracts
→ validate_contracts
→ merge_sandbox
The agent should make placement and naming decisions. codescaffold should perform the mechanical work.
Destructive operations default to sandbox mode.
The sandbox mechanism uses git worktrees under paths like:
/tmp/codescaffold_<timestamp>
The intended behavior is:
create worktree branch
→ apply approved changes
→ validate
→ commit branch on success
→ keep branch for review/merge
→ discard on failure if requested
This makes large refactors auditable and reversible.
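The lifecycle above could be driven with plain git commands. The following is a minimal sketch, not codescaffold's actual implementation: the helper names and the branch name are hypothetical, and only `git worktree add -b`, `git worktree remove --force`, and `git branch -D` are standard git.

```python
from __future__ import annotations

import subprocess
import time
from pathlib import Path


def sandbox_path(timestamp: int | None = None) -> Path:
    """Hypothetical helper following the /tmp/codescaffold_<timestamp> pattern."""
    ts = int(time.time()) if timestamp is None else timestamp
    return Path("/tmp") / f"codescaffold_{ts}"


def create_cmd(worktree: Path, branch: str) -> list[str]:
    # `git worktree add -b <branch> <path>` creates <branch> and checks it out at <path>.
    return ["git", "worktree", "add", "-b", branch, str(worktree)]


def discard_cmds(worktree: Path, branch: str) -> list[list[str]]:
    # Remove the worktree directory first, then delete the branch it was using.
    return [
        ["git", "worktree", "remove", "--force", str(worktree)],
        ["git", "branch", "-D", branch],
    ]


def run_in_repo(repo: Path, cmd: list[str]) -> None:
    # check=True raises CalledProcessError if git exits with a non-zero status.
    subprocess.run(cmd, cwd=repo, check=True)
```

Keeping command construction separate from execution makes the sequence easy to log or dry-run before anything touches the repository.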
Validation is split into phases.
Structural validation:
compileall
syntax/import-shape checks where safe
Installability validation:
import smoke checks
package import checks
entry point import checks where applicable
Behavioral validation:
pytest or project test suite, when present
A project does not need to have tests for codescaffold to perform useful structural validation. However, existing human-written tests remain the strongest behavioral signal.
Generated tests, if added later, should be treated as smoke or characterization scaffolding, not as proof of correctness.
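One way to keep the phases separate is to run them in order and stop at the first failure. This is an illustrative sketch only: the function names are hypothetical, and treating pytest's exit code 5 ("no tests collected") as a pass is an assumption that reflects the point above about projects without tests.

```python
from __future__ import annotations

import compileall
import subprocess
import sys
from pathlib import Path
from typing import Callable

Phase = tuple[str, Callable[[Path], bool]]


def structural(repo: Path) -> bool:
    # compile_dir returns a true value only if every file compiled.
    return bool(compileall.compile_dir(str(repo), quiet=1))


def behavioral(repo: Path) -> bool:
    # Assumption: exit code 5 (no tests collected) is not treated as a failure.
    result = subprocess.run([sys.executable, "-m", "pytest", "-q"], cwd=repo)
    return result.returncode in (0, 5)


def run_phases(repo: Path, phases: list[Phase]) -> list[tuple[str, bool]]:
    """Run phases in order; later phases are skipped after the first failure."""
    results: list[tuple[str, bool]] = []
    for name, check in phases:
        ok = check(repo)
        results.append((name, ok))
        if not ok:
            break
    return results
```

A caller would pass something like `[("structural", structural), ("behavioral", behavioral)]`, so a syntax error is reported as a structural failure rather than surfacing as a confusing pytest collection error.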
codescaffold uses graph-derived structure as the perception layer.
Graph evidence may include:
- files
- symbols
- communities
- imports
- calls
- source locations
- relation types
- edge confidence
- god/high-degree nodes
- bridge nodes
- cross-cluster edges
- surprising connections
- shortest paths
The project should not reduce graph data to only:
community_id → files
The goal is to expose graph evidence in a way that helps agents make better placement and naming decisions.
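As an illustration of evidence beyond a community-to-files mapping, the sketch below keeps relation type and confidence on each edge and derives cross-cluster coupling from them. The `Edge` shape and function names are assumptions made for this example, not graphify's or codescaffold's actual data model.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    relation: str      # e.g. "imports" or "calls"
    confidence: float  # 0.0 to 1.0


def cross_cluster_edges(edges: list[Edge], community_of: dict[str, int]) -> list[Edge]:
    """Edges whose endpoints live in different communities."""
    return [e for e in edges if community_of[e.source] != community_of[e.target]]


def coupling_by_pair(edges: list[Edge], community_of: dict[str, int]) -> Counter:
    """Count cross-cluster edges per (source community, target community) pair."""
    pairs: Counter = Counter()
    for e in cross_cluster_edges(edges, community_of):
        pairs[(community_of[e.source], community_of[e.target])] += 1
    return pairs


edges = [
    Edge("a.py", "b.py", "imports", 1.0),
    Edge("a.py", "c.py", "calls", 0.6),
    Edge("c.py", "d.py", "imports", 1.0),
]
community_of = {"a.py": 0, "b.py": 0, "c.py": 1, "d.py": 1}
crossing = cross_cluster_edges(edges, community_of)  # only the a.py -> c.py edge
```

Because the crossing edge still carries `relation` and `confidence`, an agent can tell a low-confidence inferred call apart from a hard import when judging a boundary.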
Current directory layout is history, not proof of correct architecture.
A cluster should not be considered correct merely because its files are already co-located.
Placement decisions should consider:
- internal cohesion
- dependency direction
- incoming vs outgoing edges
- relation types
- bridge files
- god nodes
- cross-cluster coupling
- surprising connections
- import cycles
- edge confidence
Useful distinction:
[co-located] means files are currently together.
It does not mean they are correctly placed.
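Some of these signals can be computed directly from an edge list. The sketch below is one possible way to summarize internal cohesion and dependency direction for a candidate cluster; the function name and the cohesion formula (internal edges divided by all edges touching the cluster) are assumptions for illustration, not codescaffold's scoring.

```python
from __future__ import annotations


def cluster_profile(cluster: set[str], edges: list[tuple[str, str]]) -> dict[str, float]:
    """Summarize how a cluster's files connect to each other and to the rest."""
    internal = incoming = outgoing = 0
    for src, dst in edges:
        src_in, dst_in = src in cluster, dst in cluster
        if src_in and dst_in:
            internal += 1
        elif src_in:
            outgoing += 1  # the cluster depends on something outside
        elif dst_in:
            incoming += 1  # something outside depends on the cluster
    touching = internal + incoming + outgoing
    return {
        "internal": internal,
        "incoming": incoming,
        "outgoing": outgoing,
        "cohesion": internal / touching if touching else 0.0,
    }


profile = cluster_profile(
    {"a.py", "b.py"},
    [("a.py", "b.py"), ("b.py", "a.py"), ("a.py", "c.py"), ("d.py", "b.py")],
)
```

A co-located directory with low cohesion and many outgoing edges is a candidate for review, regardless of how long its files have lived together.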
codescaffold can generate import-linter contracts from graph-derived structure.
Supported contract concepts include:
- forbidden imports
- layers
- independence
Contracts are intended to turn discovered structure into enforceable architecture.
Generated contracts are durable architecture guards, not temporary sandbox artifacts. They survive merge and are validated after every move or rename via validate_contracts. Use update_contract to regenerate them after a sandbox changes the layer structure.
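For orientation, a layers contract in an `.importlinter` file can be produced with the standard library's `configparser`. This is a hedged sketch rather than codescaffold's generator: the function name, the contract id `layers`, and the package names are hypothetical. The section and key names follow import-linter's INI format, where layers are listed from highest to lowest and a higher layer may import a lower one but not the reverse.

```python
from __future__ import annotations

import configparser
import io


def layers_contract(root_package: str, layers: list[str]) -> str:
    """Render a minimal .importlinter file with a single layers contract."""
    config = configparser.ConfigParser()
    config["importlinter"] = {"root_package": root_package}
    config["importlinter:contract:layers"] = {
        "name": "Generated layers",
        "type": "layers",
        # Multi-line value: one layer per line, highest layer first.
        "layers": "\n" + "\n".join(layers),
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


text = layers_contract("myapp", ["myapp.api", "myapp.core"])
```

Writing `text` to `.importlinter` at the repository root and running `lint-imports` would then check that `myapp.core` never imports `myapp.api`.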
Initial structural moves may use neutral placeholder names such as:
pkg_000
mod_000.py
This keeps structural placement separate from semantic naming.
After the structural move succeeds, an agent can inspect graph context and apply a rename map through apply_rename_map.
This allows the agent to handle judgment-heavy naming while codescaffold handles mechanical rename and import updates.
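A rename map can be thought of as a mapping from placeholder names to agent-chosen names. The shape below is an assumption for illustration (the real `apply_rename_map` input format may differ), and the helper names are hypothetical; the sketch only shows validating the chosen names and previewing the effect on a dotted module path.

```python
from __future__ import annotations

import keyword


def check_rename_map(rename_map: dict[str, str]) -> None:
    """Reject target names that cannot be used as Python module names."""
    for placeholder, name in rename_map.items():
        if not name.isidentifier() or keyword.iskeyword(name):
            raise ValueError(f"{placeholder!r} -> {name!r} is not a valid module name")


def rename_dotted_path(dotted: str, rename_map: dict[str, str]) -> str:
    """Preview a rename by replacing each dotted-path segment found in the map."""
    return ".".join(rename_map.get(part, part) for part in dotted.split("."))


rename_map = {"pkg_000": "ingest", "mod_000": "readers"}
check_rename_map(rename_map)
preview = rename_dotted_path("pkg_000.mod_000.load", rename_map)
```

Here `preview` is `"ingest.readers.load"`. The actual rename, including import rewrites across the repository, is the mechanical part left to the tool.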
Docstrings follow the same pattern:
agent writes or revises docstring
→ codescaffold inserts it mechanically
codescaffold writes reviewable artifacts such as:
.importlinter — import-linter contract (durable, survives merge)
<repo>/.claude/plans/ — persisted plan JSON (graph hash + approved moves/renames)
Exact artifacts may vary by workflow stage.
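Storing a graph hash in the plan makes it possible to detect when a plan no longer matches the repository it was computed from. The following is a minimal sketch under assumed field names (`graph_hash`, `approved_moves`); the real plan schema is not shown here and may differ.

```python
from __future__ import annotations

import hashlib
import json


def graph_hash(graph: dict) -> str:
    """Hash a JSON-serializable graph in a key-order-independent way."""
    canonical = json.dumps(graph, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_stale(plan: dict, current_graph: dict) -> bool:
    """A plan is stale when the graph it was built from has since changed."""
    return plan.get("graph_hash") != graph_hash(current_graph)


graph = {"nodes": ["a.py", "b.py"], "edges": [["a.py", "b.py"]]}
plan = {"graph_hash": graph_hash(graph), "approved_moves": []}

changed = {"nodes": ["a.py", "b.py"], "edges": [["a.py", "b.py"], ["b.py", "a.py"]]}
```

With this data, `is_stale(plan, graph)` is false and `is_stale(plan, changed)` is true, so approved moves would not be applied against a graph they were never reviewed for.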
Generated artifacts should make the refactor auditable:
- what graph was used
- what moves were proposed
- what moves were approved
- what validation passed
- what contracts were generated
- what branch was produced
- what still needs review
- Graph evidence first. Use repository structure, not vibes, to identify candidate boundaries.
- Agent judgment where needed. Let the agent decide names, intent, and ambiguous placement.
- Mechanical changes through tools. Use rope and LibCST for deterministic edits.
- Sandbox before merge. Destructive changes should happen in a git worktree branch first.
- Validation in phases. Do not fuse file moves, import rewrites, package creation, and pytest into one opaque step.
- Contracts should not go stale. If contracts are generated, they need a lifecycle: generate, validate, refresh, preserve or explicitly discard.
- No hidden architecture assumptions. Source roots, test paths, and package layout should be detected from config where possible and made explicit where not.
Install dev dependencies:
pip install -e ".[dev]"
Run tests:
pytest
Run linting if configured:
ruff check .
Run the MCP server locally:
codescaffold-mcp
src/codescaffold/
mcp/ thin MCP interface (tools.py, server.py)
graphify/ graphify integration and graph snapshots
candidates/ graph-informed move candidate proposals
bridge/ rope preflight resolution (Layer 1)
plans/ plan schema, lifecycle, approval, staleness
operations/ typed mechanical Rope operations
sandbox/ git worktree isolation
validation/ compileall + pytest + import-linter checks
contracts/ import-linter contract generation and violation recovery (grimp-based DAG)
audit/ result summaries and durable records
Near-term:
- Pynguin test generation: a generate_tests(source_file, repo_path) MCP tool wrapping pynguin for automatic test generation. Design constraints: run pynguin via subprocess in the repo's own venv (never import it into the MCP process, because it instruments modules at runtime); write generated tests into a sandbox worktree so the agent can review and approve before committing, mirroring the apply → audit → merge_sandbox flow.
- richer get_cluster_context: show inter-community edge weights and dominant source files
Later:
- contract staleness detection and incremental update
- docstring insertion (LibCST-based)
- rollback/manifest for applied moves
- read-only duplicate-logic reports
MIT.