The source of truth for codebase understanding & governance.
This document is the primary technical reference for CodePlug (@dinyangetoh/codeplug-cli), a professional-grade, local-first CLI tool designed to detect, enforce, and document coding conventions across polyglot codebases. It covers installation, configuration, usage commands, and architecture to help developers integrate CodePlug into their workflows. CodePlug combines signal extraction with on-device ML inference to provide automated convention governance, semantic drift detection, living documentation generation, and structured AI agent context export.
CodePlug acts as the source of truth for codebase understanding. It uses signal extractors and on-device ML models (via @huggingface/transformers, with no Python dependency) to identify patterns across naming, structure, frameworks, components, and semantic role/layer behavior. It generates "living documentation" and structured rule files for AI agents (Claude, Cursor, Copilot), and provides a compliance scoring engine to prevent architectural drift over time.
- Multi-Dimensional AST Analysis: A suite of specialized AST visitors identifies naming patterns, component styles, test organization, error handling, and import conventions.
- Baseline Lifecycle: Supports observed → normalized → locked baseline flow to separate inferred behavior from enforced policy.
- Drift & Compliance Scoring: Classifies git diffs against stored conventions, scores compliance with severity-weighted metrics, and tracks trends over time.
- Semantic Drift (ML-first): Uses code embeddings and role centroids to flag role drift, layer leakage, and abstraction mismatch.
- Auto-Fix: Supports automated renaming and provides guided remediation for complex violations.
- Living Documentation: Automatically maintains `ARCHITECTURE.md`, `CONVENTIONS.md`, and `ONBOARDING.md` using ML summarization pipelines (BART, BERT).
- AI Agent Export: Generates context files including `CLAUDE.md`, `.cursor/rules`, and `.github/copilot-instructions.md`.
- CI/CD Ready: Exports SARIF-format reports for integration with GitHub Actions and other CI pipelines.
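As a toy illustration of what severity-weighted compliance scoring can look like, here is a minimal sketch; the weights and formula are assumptions for illustration, not CodePlug's actual scoring engine:

```typescript
// Hypothetical severity-weighted compliance score.
// WEIGHTS and the normalization formula are illustrative assumptions.
type Severity = "low" | "medium" | "high";

const WEIGHTS: Record<Severity, number> = { low: 1, medium: 3, high: 5 };

function complianceScore(violations: Severity[], checksRun: number): number {
  if (checksRun === 0) return 100;
  const penalty = violations.reduce((sum, s) => sum + WEIGHTS[s], 0);
  // Worst case: every check fails at "high" severity.
  const maxPenalty = checksRun * WEIGHTS.high;
  return Math.max(0, Math.round(100 * (1 - penalty / maxPenalty)));
}

// Two "low" and one "high" violation across 10 checks:
console.log(complianceScore(["low", "low", "high"], 10)); // 86
```

A CI gate would then compare such a score against a configured threshold and exit non-zero when it falls below it.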
| Category | Feature | Command | Notes |
|---|---|---|---|
| CLI | Interactive Wizard | `codeplug start` | Single entry point for all menus |
| Audit | Compliance Score | `codeplug convention score` | Severity-weighted, time-series tracking |
| Baseline | Normalize & Lock | `codeplug convention baseline normalize/lock` | Observed vs enforced convention lifecycle |
| Explainability | Finding Evidence | `codeplug convention explain <id>` | Shows ML confidence and evidence payload |
| Docs | ML Generation | `codeplug docs generate` | 5 core docs; optional LLM enhancement |
| Export | AI Context | `codeplug export --all` | Supports Claude, Cursor, Copilot, JSON |
| Config | Model Validation | `codeplug config validate-models` | Verifies local HF model-role readiness |
| Requirement | Version | Notes |
|---|---|---|
| Node.js | >=20.0.0 | Required for native fetch and modern ESM support |
| Git | Any modern version | Required for drift detection and history analysis |
| Ollama | Optional | For local LLM-enhanced documentation; any OpenAI-compatible provider is also supported |
```shell
npm install -g @dinyangetoh/codeplug-cli
```

To build from source:

```shell
git clone <repository-url>
cd codeplug
npm install
npm run build
node dist/cli/index.js --help
```

```shell
# Initialize and detect conventions in your project
cd your-project
codeplug convention init

# Run a compliance audit
codeplug convention audit

# Normalize and lock your baseline policy
codeplug convention baseline normalize
codeplug convention baseline lock

# Launch the interactive wizard
codeplug start
```

The `codeplug start` command provides a guided interface organized into four areas:
| Area | Description |
|---|---|
| Config | Manage LLM providers, API keys, and model tiers |
| Convention | Initialize detection, run audits, and apply auto-fixes |
| Docs | Generate and update stale documentation |
| Export | Refresh AI agent rule files |
```shell
# Audit changes from the last 7 days
codeplug convention audit --since 7d

# CI mode — exits non-zero if score is below threshold
codeplug convention audit --ci

# Explain a specific finding with model evidence
codeplug convention explain <violation-id>

# Validate all configured local model roles
codeplug config validate-models

# Generate docs for a junior audience in concise style
codeplug docs generate --audience junior --style concise

# Export rules for Cursor
codeplug export --target cursor

# Export all AI agent context files
codeplug export --all
```

CodePlug stores project-level settings in `.codeplug/config.json`.
Convention detection and drift analysis run on-device. Documentation prose generation can optionally be enhanced via a local or cloud LLM provider:
- Local: Ollama (default)
- Cloud: OpenAI, Anthropic, Gemini, OpenRouter, Groq, DeepSeek, Grok
```shell
codeplug config set llm.provider openai
codeplug config set llm.apiKey sk-...
```

| Tier | RAM | Disk | Use Case |
|---|---|---|---|
| Default | 8 GB+ | ~1.2 GB | Production quality, CI pipelines |
| Lite | 4 GB+ | ~450 MB | Constrained local hardware |

```shell
codeplug config set models.tier lite
codeplug config validate-models --tier lite
```

CodePlug uses role-based local models and keeps ML as the primary signal interpreter:
- `classifier` / `frameworkClassifier`: `onnx-community/codebert-base-ONNX`
- `codeEmbedding`: `onnx-community/codebert-base-ONNX` (fallback to `Xenova/all-MiniLM-L6-v2`)
- `languageIdentifier`: `Xenova/distilbert-base-uncased-mnli`
- `zeroShot`: `Xenova/distilbert-base-uncased-mnli`
- `sentenceSimilarity`: `Xenova/all-MiniLM-L6-v2`
- `extractor`: `distilbert-base-cased-distilled-squad`
- `ner`: `dslim/bert-base-NER` (lite: `dslim/distilbert-NER`)
- `summarizer`: `facebook/bart-large-cnn` (lite: `sshleifer/distilbart-cnn-6-6`)
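Putting the configuration keys shown above (`llm.provider`, `llm.apiKey`, `models.tier`) together, a `.codeplug/config.json` might look roughly like this. This is a sketch inferred from the dotted keys in the CLI examples; the actual schema may differ:

```json
{
  "llm": {
    "provider": "ollama",
    "apiKey": ""
  },
  "models": {
    "tier": "default"
  }
}
```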
Define regex-based rules in .codeplug/rules.json to extend or override detected conventions:
```json
[{
  "id": "no-index-export",
  "pattern": "export \\* from",
  "scope": "content",
  "message": "Prefer named exports",
  "severity": "low"
}]
```

```
./                          # 119 files total
src/                        # 88 source files
├── cli/
│   └── commands/           # Command handlers (7 files)
├── config/                 # Zod schemas and provider presets (4 files)
├── core/
│   ├── analyzer/           # AST analysis — specialized visitors (29 files)
│   ├── classifier/         # Drift classification & confidence gating (2 files)
│   ├── convention/         # Convention management (1 file)
│   ├── exporter/           # Formatters — Claude, Cursor, SARIF (9 files)
│   ├── generator/          # ML pipelines (BART, BERT) & LLM client (19 files)
│   ├── git/                # Diff analysis & hook management (2 files)
│   └── scorer/             # Compliance engine & violation detection (5 files)
├── models/                 # Tier-aware ML model registry (2 files)
├── storage/                # sql.js (scores) and JSON stores (7 files)
└── templates/              # Export templates for AI agents
tests/                      # 28 test files
├── integration/            # Integration tests (1 file)
├── unit/                   # Unit tests by module (22 files)
│   ├── analyzer/
│   ├── config/
│   ├── cli/
│   ├── exporter/
│   ├── models/
│   ├── scorer/
│   ├── generator/
│   └── storage/
└── fixtures/
    └── sample-react-app/   # Test fixtures (5 files)
```
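Conceptually, a content-scoped rule like the `no-index-export` example in `.codeplug/rules.json` above could be evaluated line by line against a file. The following is a hypothetical sketch of that idea, not CodePlug's actual implementation:

```typescript
// Hypothetical types mirroring the rules.json fields shown above.
interface CustomRule {
  id: string;
  pattern: string; // JavaScript regex source, applied per line
  scope: "content" | "filename";
  message: string;
  severity: "low" | "medium" | "high";
}

interface Violation {
  ruleId: string;
  file: string;
  line: number;
  message: string;
  severity: string;
}

// Apply one content-scoped rule to a file's text, reporting 1-based line numbers.
function applyRule(rule: CustomRule, file: string, content: string): Violation[] {
  const regex = new RegExp(rule.pattern);
  const violations: Violation[] = [];
  content.split("\n").forEach((text, i) => {
    if (regex.test(text)) {
      violations.push({
        ruleId: rule.id,
        file,
        line: i + 1,
        message: rule.message,
        severity: rule.severity,
      });
    }
  });
  return violations;
}

const rule: CustomRule = {
  id: "no-index-export",
  pattern: "export \\* from",
  scope: "content",
  message: "Prefer named exports",
  severity: "low",
};

const hits = applyRule(
  rule,
  "src/index.ts",
  'export * from "./utils";\nexport { a } from "./a";'
);
console.log(hits); // one violation, on line 1
```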
| Layer | Technology |
|---|---|
| ML Inference | @huggingface/transformers — Python-free, Node.js native |
| Summarization | BART (documentation prose) |
| Embeddings | BERT (pattern classification) |
| Database | sql.js (compliance time-series) |
| Schema Validation | Zod |
| Test Runner | Vitest |
These are the conventions CodePlug enforces and itself follows internally.
| Target | Convention | Example |
|---|---|---|
| Class / service files | PascalCase | `ConfigManager.ts` |
| Utility files | camelCase | `formatDate.ts` |
| Class names | PascalCase | `class ConventionScorer` |
| Interface names | PascalCase | `interface RuleConfig` |
| Type aliases | PascalCase | `type SeverityLevel` |
| Factory functions | camelCase | `createAnalyzer()` |
Note: Kebab-case filenames are not used anywhere in this project.
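The filename rules above can be approximated with simple regexes. This is an illustrative sketch (the regexes and function are assumptions, not CodePlug's actual detector):

```typescript
// Illustrative casing checks for the naming table above.
const pascalCase = /^[A-Z][A-Za-z0-9]*$/;
const camelCase = /^[a-z][A-Za-z0-9]*$/;

function classifyFileName(file: string): "PascalCase" | "camelCase" | "other" {
  const base = file.replace(/\.[^.]+$/, ""); // strip the extension
  if (pascalCase.test(base)) return "PascalCase";
  if (camelCase.test(base)) return "camelCase";
  return "other";
}

console.log(classifyFileName("ConfigManager.ts")); // "PascalCase"
console.log(classifyFileName("formatDate.ts"));    // "camelCase"
console.log(classifyFileName("format-date.ts"));   // "other" (kebab-case is not used)
```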
- All I/O operations use `async/await`; callbacks and raw Promise chains are avoided.
- All backend requests should go through a dedicated API layer, not raw `fetch()` calls scattered in component code.
- Use explicit `builder.query<Result, Arg>` and `builder.mutation<Result, Arg>` to prevent incorrect call signatures and unsafe data assumptions.
- Keep dependencies one-way: presentation → business → data → models. Models must not depend on UI.
- Page components should remain thin orchestration shells that delegate feature logic to dedicated modules.
- Server data belongs in RTK Query, not regular slices.
- Prefer typed hooks (`useAppDispatch`, `useAppSelector`) over raw react-redux hooks for type safety.
- Integration tests live in the root `tests/` directory.
- Unit tests are co-located in `__tests__/` subdirectories alongside source files.
- Test files follow the `.test.{ext}` naming convention.
- Maintain strict TypeScript — eliminate explicit `any` annotations in source code.
- Components typically annotate return types with `React.JSX.Element` (or a Promise variant for server components).
```shell
npm run build       # Compile TypeScript
npm run dev         # Watch mode for active development
npm run test        # Run the full Vitest suite
npm run coverage    # Generate test coverage report
npm run lint        # Check code style
npm run typecheck   # Verify TypeScript types
```

MIT