codescaffold

Graph-informed structural refactoring control plane for Python codebases.

codescaffold helps coding agents and humans reorganize Python projects safely by combining repository graph analysis, explicit refactor plans, mechanical rope/LibCST rewrites, sandboxed git worktrees, staged validation, and optional import-linter contracts.

It is not a fully autonomous architecture fixer. It is a control plane that makes large refactors inspectable, reviewable, and mechanically safer.

Core idea

Coding agents are good at judgment-heavy work:

deciding names
interpreting intent
explaining architecture
writing documentation
resolving ambiguous design choices

They are weaker at structural bookkeeping:

remembering every import edge
tracking file moves across a repo
updating references consistently
preserving package importability
enforcing architecture after a refactor
safely rolling back failed changes

codescaffold is designed to cover that mechanical side.

graphify / graph data
    ↓
codescaffold analysis
    ↓
reviewable placement decisions
    ↓
approved file/symbol moves
    ↓
rope / LibCST rewrites
    ↓
sandbox validation
    ↓
contracts / audit trail

Status

Experimental but functional.

The current project focus is:

MCP-first workflow
sandboxed structural refactoring
graph-derived placement evidence
staged validation
import-linter contract generation
agent-assisted rename/docstring workflow
better use of graphify graph data beyond simple community-to-file grouping

What it does

codescaffold can:

analyze a Python repository graph
identify communities/clusters in the codebase
surface placement decisions for an agent to review
show graph evidence for clusters and symbols
approve selected file moves
apply approved moves in a git worktree sandbox
create required package __init__.py files
rewrite imports after moves
validate structurally and behaviorally
apply package/module/symbol rename maps
insert or replace docstrings
generate import-linter contracts
validate contracts
merge or discard sandbox branches

What it is not

codescaffold does not try to infer perfect architecture by itself.

It should not be used as:

a blind auto-refactor button
a replacement for code review
proof that graph communities are architecturally correct
proof that similar-looking functions are semantically equivalent
a substitute for human/API compatibility judgment

The graph is evidence, not authority.

Installation

Requires Python 3.11+.

git clone https://github.com/Xopher00/codescaffold.git
cd codescaffold
pip install -e ".[dev]"

The package installs the MCP server entry point:

codescaffold-mcp

MCP usage

Register codescaffold-mcp with your MCP-capable coding agent.

Example MCP server configuration:

{
  "mcpServers": {
    "codescaffold": {
      "command": "codescaffold-mcp",
      "args": []
    }
  }
}

Exact configuration depends on the agent or client you use.

Current MCP tools

analyze
get_cluster_context
approve_moves
apply
apply_rename_map
validate
merge_sandbox
discard_sandbox
reset
contracts
validate_contracts
update_contract
propose_violation_fix

Tool roles

| Tool | Purpose |
| --- | --- |
| `analyze` | Run graphify, propose move candidates, persist a plan |
| `get_cluster_context` | Show graph evidence for a community/cluster |
| `approve_moves` | Record agent-approved moves into the persisted plan |
| `apply` | Execute approved moves in a sandboxed worktree; runs compileall + pytest |
| `apply_rename_map` | Batch-rename symbols/modules across the repo in a sandbox (single rope session) |
| `validate` | Re-run compileall + pytest inside an existing sandbox branch |
| `merge_sandbox` | Merge a completed sandbox branch into HEAD with --no-ff |
| `discard_sandbox` | Discard a sandbox branch and remove its worktree |
| `reset` | Delete the persisted plan and audit records |
| `contracts` | Generate `.importlinter` from current graph layers; surface cycle-break moves if cyclic |
| `validate_contracts` | Run lint-imports and return a formatted pass/fail report |
| `update_contract` | Regenerate `.importlinter` from a sandbox's post-move state |
| `propose_violation_fix` | Suggest alternative move targets that satisfy the existing contract layers |

Typical workflow

1. analyze repository
2. inspect cluster context
3. approve selected moves
4. apply approved moves in sandbox
5. inspect result
6. apply rename map if placeholder names remain
7. validate
8. generate / validate contracts
9. merge or discard sandbox

A typical agent-guided flow:

analyze
→ get_cluster_context
→ approve_moves
→ apply
→ apply_rename_map
→ validate
→ contracts
→ validate_contracts
→ merge_sandbox

The agent should make placement and naming decisions. codescaffold should perform the mechanical work.

Sandboxed apply model

Destructive operations default to sandbox mode.

The sandbox mechanism uses git worktrees under paths like:

/tmp/codescaffold_<timestamp>

The intended behavior is:

create worktree branch
→ apply approved changes
→ validate
→ commit branch on success
→ keep branch for review/merge
→ discard on failure if requested

This makes large refactors auditable and reversible.

Staged validation

Validation is split into phases.

Structural validation:

compileall
syntax/import-shape checks where safe

Installability validation:

import smoke checks
package import checks
entry point import checks where applicable

Behavioral validation:

pytest or project test suite, when present

A project does not need to have tests for codescaffold to perform useful structural validation. However, existing human-written tests remain the strongest behavioral signal.

Generated tests, if added later, should be treated as smoke or characterization scaffolding, not as proof of correctness.
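
To make the phase split concrete, here is a minimal sketch of staged validation that stops at the first failing phase. It is an assumption-laden illustration, not the project's validator: the function name `validate_staged`, the result shape, and the rule "run pytest only if a `tests` directory exists" are all hypothetical, and the installability phase is omitted for brevity.

```python
import compileall
import subprocess
import sys
from pathlib import Path


def validate_staged(root: Path) -> list[tuple[str, bool]]:
    """Run validation phases in order and stop at the first failure."""
    results: list[tuple[str, bool]] = []

    # Phase 1: structural. compile_dir returns a false value if any file
    # fails to compile; quiet=2 suppresses its console output.
    ok = bool(compileall.compile_dir(str(root), quiet=2, force=True))
    results.append(("structural", ok))
    if not ok:
        return results

    # Phase 2: behavioral. Only attempted when a tests directory exists
    # (an assumption; real projects may configure test paths differently).
    if (root / "tests").is_dir():
        proc = subprocess.run(
            [sys.executable, "-m", "pytest", "-q"],
            cwd=root,
            capture_output=True,
            text=True,
        )
        results.append(("behavioral", proc.returncode == 0))
    return results
```

Returning one entry per phase keeps a structural failure distinguishable from a test failure, and a project without tests still gets a meaningful structural result.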

Graphify integration

codescaffold uses graph-derived structure as the perception layer.

Graph evidence may include:

files
symbols
communities
imports
calls
source locations
relation types
edge confidence
god/high-degree nodes
bridge nodes
cross-cluster edges
surprising connections
shortest paths

The project should not reduce graph data to only:

community_id → files

The goal is to expose graph evidence in a way that helps agents make better placement and naming decisions.
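
As a small illustration of evidence beyond community membership, the sketch below derives cross-cluster edges and high-degree nodes from a toy graph. The data format (edge tuples plus a file-to-community dict) and the function names are assumptions chosen for the example; they are not graphify's schema or codescaffold's API.

```python
from collections import Counter

Edge = tuple[str, str]


def cross_cluster_edges(edges: list[Edge], community: dict[str, int]) -> list[Edge]:
    """Return edges whose endpoints live in different communities."""
    return [(a, b) for a, b in edges if community[a] != community[b]]


def high_degree_nodes(edges: list[Edge], top: int = 3) -> list[tuple[str, int]]:
    """Rank nodes by total degree (in + out) as a rough god-node signal."""
    degree: Counter[str] = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree.most_common(top)


# Toy import graph: (importer, imported). File names are made up.
edges = [
    ("api.py", "core.py"),
    ("cli.py", "core.py"),
    ("core.py", "db.py"),
    ("api.py", "cli.py"),
]
community = {"api.py": 0, "cli.py": 0, "core.py": 1, "db.py": 1}

print(cross_cluster_edges(edges, community))
print(high_degree_nodes(edges, top=1))
```

In this toy graph, two of the four edges cross the community boundary, and `core.py` has the highest degree. That is the kind of detail a bare `community_id → files` mapping would hide.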

Placement review principles

Current directory layout is history, not proof of correct architecture.

A cluster should not be considered correct merely because its files are already co-located.

Placement decisions should consider:

internal cohesion
dependency direction
incoming vs outgoing edges
relation types
bridge files
god nodes
cross-cluster coupling
surprising connections
import cycles
edge confidence

Useful distinction:

[co-located] means files are currently together.
It does not mean they are correctly placed.
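
Several of the signals listed above can be computed directly from edge counts. The sketch below is one possible way to summarize cohesion and dependency direction for a single cluster; the metric definition (internal edges divided by all edges touching the cluster) and the names used are assumptions for illustration, not codescaffold's scoring.

```python
Edge = tuple[str, str]


def cluster_metrics(
    edges: list[Edge], community: dict[str, int], cluster_id: int
) -> dict[str, float]:
    """Count internal, incoming, and outgoing edges for one cluster."""
    internal = incoming = outgoing = 0
    for src, dst in edges:
        src_in = community[src] == cluster_id
        dst_in = community[dst] == cluster_id
        if src_in and dst_in:
            internal += 1
        elif src_in:
            outgoing += 1  # the cluster depends on something outside it
        elif dst_in:
            incoming += 1  # something outside depends on the cluster
    total = internal + incoming + outgoing
    return {
        "internal": internal,
        "incoming": incoming,
        "outgoing": outgoing,
        # Share of the cluster's edges that stay inside it.
        "cohesion": internal / total if total else 0.0,
    }


# Toy import graph: (importer, imported). File names are made up.
edges = [
    ("api.py", "core.py"),
    ("cli.py", "core.py"),
    ("core.py", "db.py"),
    ("api.py", "cli.py"),
]
community = {"api.py": 0, "cli.py": 0, "core.py": 1, "db.py": 1}

print(cluster_metrics(edges, community, 0))
print(cluster_metrics(edges, community, 1))
```

A cluster with only outgoing edges looks like a consumer layer and one with only incoming edges looks like a foundation layer. Low cohesion in a co-located cluster is a hint that the current layout deserves a second look.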

Import-linter contracts

codescaffold can generate import-linter contracts from graph-derived structure.

Supported contract concepts include:

forbidden imports
layers
independence

Contracts are intended to turn discovered structure into enforceable architecture.

Generated contracts are durable architecture guards, not temporary sandbox artifacts. They survive merge and are validated after every move or rename via validate_contracts. Use update_contract to regenerate them after a sandbox changes the layer structure.
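
For orientation, a layers contract in import-linter's INI format can be rendered with the standard library alone. This is a hedged sketch rather than the generator codescaffold uses: the function name `render_layers_contract`, the contract name, and the example package and layer names are hypothetical.

```python
import configparser
import io


def render_layers_contract(root_package: str, layers: list[str]) -> str:
    """Render a minimal .importlinter file containing one layers contract."""
    cfg = configparser.ConfigParser()
    cfg["importlinter"] = {"root_package": root_package}
    cfg["importlinter:contract:layers"] = {
        "name": "Graph-derived layers",
        "type": "layers",
        # Higher layers come first; a layer may import only from layers below it.
        "layers": "\n" + "\n".join(f"{root_package}.{layer}" for layer in layers),
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


# Package and layer names below are placeholders, not a real project layout.
print(render_layers_contract("myproject", ["interface", "services", "storage"]))
```

Writing the result to `.importlinter` at the repository root would let `lint-imports` pick it up, which is the check that `validate_contracts` is described as running.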

Rename and docstring workflow

Initial structural moves may use neutral placeholder names such as:

pkg_000
mod_000.py

This keeps structural placement separate from semantic naming.

After the structural move succeeds, an agent can inspect graph context and apply a rename map through apply_rename_map.

This allows the agent to handle judgment-heavy naming while codescaffold handles mechanical rename and import updates.

Docstrings follow the same pattern:

agent writes or revises docstring
→ codescaffold inserts it mechanically
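
For the rename step, it can help to check which placeholder names are still present and which of them a proposed rename map does not yet cover. The sketch below assumes placeholders always look like `pkg_NNN` or `mod_NNN` (inferred from the two examples above) and assumes a rename map keyed by dotted module name; both are illustrative guesses, not the format `apply_rename_map` accepts.

```python
import re
from pathlib import PurePosixPath

# Assumed pattern, inferred from the pkg_000 / mod_000.py examples.
PLACEHOLDER = re.compile(r"^(pkg|mod)_\d{3}$")


def remaining_placeholders(paths: list[str]) -> list[str]:
    """Return dotted module names that still contain a placeholder segment."""
    hits: set[str] = set()
    for raw in paths:
        parts = [
            part
            for part in PurePosixPath(raw).with_suffix("").parts
            if part != "__init__"  # a package's __init__.py names the package itself
        ]
        if any(PLACEHOLDER.match(part) for part in parts):
            hits.add(".".join(parts))
    return sorted(hits)


def unmapped(paths: list[str], rename_map: dict[str, str]) -> list[str]:
    """Placeholders that the proposed rename map does not cover yet."""
    return [name for name in remaining_placeholders(paths) if name not in rename_map]


paths = ["pkg_000/__init__.py", "pkg_000/mod_001.py", "core/models.py"]
print(remaining_placeholders(paths))
print(unmapped(paths, {"pkg_000": "storage"}))
```

A check like this gives the agent a concrete list of names that still need a decision before the sandbox is merged.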

Generated artifacts

codescaffold writes reviewable artifacts such as:

.importlinter          — import-linter contract (durable, survives merge)
<repo>/.claude/plans/  — persisted plan JSON (graph hash + approved moves/renames)

Exact artifacts may vary by workflow stage.

Generated artifacts should make the refactor auditable:

what graph was used
what moves were proposed
what moves were approved
what validation passed
what contracts were generated
what branch was produced
what still needs review
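
The persisted plan is described above as a graph hash plus approved moves and renames. A minimal sketch of how such a record could detect a stale plan follows; the class name `Plan`, its field names, and the hashing scheme are assumptions made for the example and do not reflect the project's actual plan schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


def hash_graph(graph: dict) -> str:
    """Hash a graph snapshot in a key-order-independent way."""
    canonical = json.dumps(graph, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class Plan:
    """Hypothetical plan record: the graph it was built from plus approvals."""

    graph_hash: str
    approved_moves: list[dict] = field(default_factory=list)
    approved_renames: dict[str, str] = field(default_factory=dict)

    def is_stale(self, current_graph: dict) -> bool:
        # A plan built from a different graph should not be applied blindly.
        return self.graph_hash != hash_graph(current_graph)

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2, sort_keys=True)


graph = {"nodes": ["a.py", "b.py"], "edges": [["a.py", "b.py"]]}
plan = Plan(
    graph_hash=hash_graph(graph),
    approved_moves=[{"source": "a.py", "target": "pkg_000/a.py"}],
)
print(plan.to_json())
print(plan.is_stale(graph))
```

Storing the hash next to the approvals ties every approved move to the exact graph it was reviewed against, which is what makes the audit questions above answerable after the fact.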

Design principles

Graph evidence first: Use repository structure, not vibes, to identify candidate boundaries.
Agent judgment where needed: Let the agent decide names, intent, and ambiguous placement.
Mechanical changes through tools: Use rope and LibCST for deterministic edits.
Sandbox before merge: Destructive changes should happen in a git worktree branch first.
Validation in phases: Do not fuse file moves, import rewrites, package creation, and pytest into one opaque step.
Contracts should not go stale: If contracts are generated, they need a lifecycle: generate, validate, refresh, preserve or explicitly discard.
No hidden architecture assumptions: Source roots, test paths, and package layout should be detected from config where possible and made explicit where not.

Development

Install dev dependencies:

pip install -e ".[dev]"

Run tests:

pytest

Run linting if configured:

ruff check .

Run the MCP server locally:

codescaffold-mcp

Project layout

src/codescaffold/
    mcp/           thin MCP interface (tools.py, server.py)
    graphify/      graphify integration and graph snapshots
    candidates/    graph-informed move candidate proposals
    bridge/        rope preflight resolution (Layer 1)
    plans/         plan schema, lifecycle, approval, staleness
    operations/    typed mechanical Rope operations
    sandbox/       git worktree isolation
    validation/    compileall + pytest + import-linter checks
    contracts/     import-linter contract generation and violation recovery (grimp-based DAG)
    audit/         result summaries and durable records

Roadmap

Near-term:

Pynguin test generation: a generate_tests(source_file, repo_path) MCP tool that wraps pynguin for automatic test generation. Design constraints: run pynguin as a subprocess inside the repo's own venv, and never import it into the MCP process, because it instruments modules at runtime. Write generated tests into a sandbox worktree so the agent can review and approve them before committing, mirroring the apply → audit → merge_sandbox flow.
richer get_cluster_context: show inter-community edge weights and dominant source files

Later:

contract staleness detection and incremental update
docstring insertion (LibCST-based)
rollback/manifest for applied moves
read-only duplicate-logic reports

License

MIT.
