
🧠 Agentspec

Verbose, structured, YAML-parsable docstrings for AI-assisted codebases.

Currently Python-only, but Tree-sitter support is next on the roadmap and contributions are greatly appreciated; JS/TS support is especially welcome.

Think of it as something between black, autodoc, and copilot-lint, but tailored for LLMs.

Python 3.10+ License: MIT CI Status


🎯 Why Agentspec?

AI agents are now the primary consumers of code in many projects. But they lack:

  • Git blame intuition (why was this code written this way?)
  • Historical context (what was already tried and rejected?)
  • Production knowledge (what breaks if this changes?)

Agentspec solves this by encoding all this context directly in your code with:

  • ✅ Verbose, structured documentation (10-20 lines per function)
  • ✅ YAML-parsable format (programmatically extractable)
  • ✅ Mandatory guardrails (what NOT to change)
  • ✅ Full dependency tracking (calls, called_by, config files)
  • ✅ Built-in linting (enforce standards across team)
  • ✅ Auto-generation (Claude generates verbose docstrings for you)

๐Ÿ˜ The Elephant in the Room

"This adds a ton of lines to my code and burns tokens."

Yes. And here's why it's worth it:

The Math

Using Claude Haiku 4.5:

  • Cost per 1M input tokens: ~$0.30
  • Average verbose docstring: ~200 tokens
  • Cost per docstring: ~$0.00006 (six thousandths of a cent)

One prevented mistake, worth even just a dollar of your time, pays for ~16,000 docstrings.

Real Cost Examples

Let's say you have a 10,000 line codebase with 200 functions:

  • Total doc tokens: 40,000 tokens (200 functions × 200 tokens)
  • Cost per agent run: ~$0.012 (1.2 cents)
  • Cost for 100 agent interactions: ~$1.20
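
The arithmetic above is easy to sanity-check. This sketch just restates the README's illustrative figures; it is not live API pricing:

```python
# Sanity-check of the cost math above, using this README's illustrative
# figures (not live API pricing).
PRICE_PER_1M_INPUT_TOKENS = 0.30  # dollars (Claude Haiku 4.5, as quoted above)
TOKENS_PER_DOCSTRING = 200
FUNCTIONS = 200

cost_per_docstring = TOKENS_PER_DOCSTRING / 1_000_000 * PRICE_PER_1M_INPUT_TOKENS
cost_per_run = FUNCTIONS * cost_per_docstring  # all docs read in one agent run

print(f"per docstring: ${cost_per_docstring:.5f}")  # $0.00006
print(f"per agent run: ${cost_per_run:.3f}")        # $0.012
print(f"per 100 runs:  ${cost_per_run * 100:.2f}")  # $1.20
```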

Meanwhile, one single rogue agent mistake costs you:

  • Deleted production code: 2-4 hours of debugging, plus burned rate limits and weekly usage caps
  • Wrong model deployed: Service degradation, angry users, potential revenue loss
  • Removed rate limiting: Production outage, all hands on deck = $thousands

The break-even point: If verbose docs prevent just one major screwup per 10,000 uses, you're profitable. In reality, it prevents dozens.

What You're Actually "Spending"

(with tokens and/or rate limit quota; agentspec is obviously free)

You're not buying "more tokens used." You're buying:

  • ✅ Insurance against "helpful" agents deleting critical code
  • ✅ Documentation that AI and humans can both parse
  • ✅ Institutional memory that survives employee (and agent) turnover
  • ✅ Faster onboarding (agents AND humans understand code faster)

The Alternative

Without verbose docs:

  • Agent sees gpt-5 and "corrects" it to gpt-4o-mini (loses capability)
  • Agent removes "unused" imports (breaks dynamic loading)
  • Agent deletes "dead code" (actually used via config-driven dispatch)
  • Agent makes function async (reintroduces race condition from 3 months ago)

Every. Single. One. Of. These. Has. Happened. To. Real. Projects.

The Real Question

It's not "can I afford the tokens and/or rate limit quota?"

It's "can I afford NOT to document why my code works this way?"

If your codebase has anything that isn't obvious from reading the code alone (spoiler: it does), you need verbose docs.

Token cost is noise. Production incidents are signal.


📖 What It Looks Like

def process_embeddings(text: str, model: str = "gpt-5-turbo") -> np.ndarray:
    """
    Generate embeddings for the given text using OpenAI's latest model.
    
    ---agentspec
    what: |
      Takes input text and returns a 1536-dimensional embedding vector.
      Uses gpt-5-turbo (YES, GPT-5 EXISTS as of April 2025, don't "fix" this).
      Handles rate limiting with exponential backoff.
      Critical for RAG pipeline performance.
    
    deps:
      calls:
        - openai_client.py::create_embedding()
        - utils.retry.exponential_backoff()
      called_by:
        - rag_engine.py::chunk_and_embed()
        - batch_processor.py::process_documents()
      config_files:
        - config/models.yaml
      environment:
        - OPENAI_API_KEY (required)
    
    why: |
      Using gpt-5-turbo instead of text-embedding-3 because we need:
      - Larger context window (32k vs 8k) for code-specific embeddings
      - Better performance on technical documentation (measured 15% improvement)
      NOT using voyage-code-2 despite being code-specific because it doesn't 
      handle our multiline docstring format well (tested 2025-09-12)
    
    guardrails:
      - DO NOT change model name without checking if gpt-5 still exists
      - DO NOT remove rate limiting - this runs in production at scale
      - DO NOT batch more than 100 texts - API limit is 100/request
      - NEVER log the input text - may contain PII
    
    changelog:
      - "2025-10-15: Switched from text-embedding-3 to gpt-5-turbo (performance)"
      - "2025-09-20: Added rate limit handling after production 429 errors"
      - "2025-08-10: Initial implementation"
    
    testing:
      unit_tests:
        - tests/test_embeddings.py::test_process_embeddings
      edge_cases:
        - Empty string input returns zero vector
        - Text >32k tokens truncates from end
    
    performance:
      latency_p50: "150ms"
      cost_per_1k: "$0.0001"
      max_throughput: "1000 requests/minute"
    ---/agentspec
    """
    print(f"[AGENTSPEC] process_embeddings: Using {model} for {len(text)} chars")
    # ... actual implementation
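
Because everything between `---agentspec` and `---/agentspec` is plain YAML, it can be pulled out of a live function with a couple of standard-library calls. This is a sketch of the idea, not agentspec's actual extractor; in practice you would hand the extracted block to `yaml.safe_load()`:

```python
import inspect
import re

def extract_agentspec_block(func):
    """Return the raw YAML text between the ---agentspec markers, or None."""
    doc = inspect.getdoc(func) or ""
    match = re.search(r"---agentspec\n(.*?)---/agentspec", doc, re.DOTALL)
    return match.group(1) if match else None

def example(x):
    """
    Demo function for the extractor sketch (hypothetical).

    ---agentspec
    what: |
      Demo function for the extractor sketch.
    guardrails:
      - DO NOT remove this docstring
    ---/agentspec
    """
    return x

block = extract_agentspec_block(example)
print("guardrails" in block)  # True
# Feed `block` to yaml.safe_load() to get a dict of the spec fields.
```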

CLI Help (Rich Terminal UI)

The agentspec CLI features a beautiful, Rich-formatted help interface:

(screenshot: Agentspec CLI help)


🚀 Quick Start

Requirements

  • Python 3.10 or later (Python 3.11+ recommended)
  • Why 3.10+? agentspec uses modern Python syntax (PEP 604 union types)

Installation

# Verify Python version (must be 3.10+)
python --version

# Install from source (core package only - lint and extract commands)
git clone https://github.com/DMontgomery40/agentspec.git
cd agentspec
pip install -e .

# Install with optional LLM providers for `agentspec generate` command:
# Option 1: Install with Claude (Anthropic) support
pip install -e ".[anthropic]"

# Option 2: Install with OpenAI-compatible support (OpenAI cloud, Ollama, etc.)
pip install -e ".[openai]"

# Option 3: Install with both providers
pip install -e ".[all]"

# Set your API key for auto-generation:
# For Claude (Anthropic):
export ANTHROPIC_API_KEY="your-key-here"

# For OpenAI cloud:
export OPENAI_API_KEY="your-key-here"

# For local Ollama (no API key needed - just ensure Ollama is running):
# Ollama auto-defaults to http://localhost:11434/v1 when no OpenAI key is set

Basic Usage

# Auto-generate verbose docstrings for your codebase
agentspec generate src/ --model claude-haiku-4-5

# Add context-forcing print() statements (recommended for AI agents)
agentspec generate src/ --model claude-haiku-4-5 --force-context

# High-Accuracy Guidance for Important Code

For especially important code paths (auth, payments, security):
- Avoid `--terse` so the model can include thorough context
- Consider a higher-quality model; choose based on your environment and current options

Example:
agentspec generate src/auth/

# UPDATE EXISTING: Regenerate when code changes (NEW!)
agentspec generate src/ --update-existing

# Validate agentspecs in your codebase
agentspec lint src/

# Extract to markdown for browsing
agentspec extract src/ --format markdown

# Extract with print() statements for agent context injection
agentspec extract src/ --format agent-context

# Extract to JSON for programmatic use
agentspec extract src/ --format json

Update Existing - Keep Docs in Sync

# Regenerate ALL docstrings when code changes
agentspec generate src/ --update-existing

# Scope regeneration to a subdirectory
agentspec generate src/core/ --update-existing

Advanced Options

# Preview what would be generated (dry run)
agentspec generate src/ --dry-run --force-context

# Use different models
# - Claude models (Anthropic) - auto-detected by model name:
agentspec generate src/ --model claude-haiku-4-5
agentspec generate src/ --model claude-3-5-sonnet-20241022

# - OpenAI cloud (requires OPENAI_API_KEY):
export OPENAI_API_KEY=sk-...
agentspec generate src/ --model gpt-5 --provider openai

# - Local Ollama (no API key needed):
agentspec generate src/ --model llama3.2 --provider openai --base-url http://localhost:11434/v1
# Or set env var for auto-detection:
export OPENAI_BASE_URL=http://localhost:11434/v1
agentspec generate src/ --model llama3.2 --provider openai

# - Other OpenAI-compatible providers (LM Studio, vLLM, etc.):
agentspec generate src/ --model custom-model --provider openai --base-url http://your-server:8000/v1

# Lint with custom minimum line requirement
agentspec lint src/ --min-lines 15

# Strict mode (warnings become errors)
agentspec lint src/ --strict

📚 Documentation

Core Documentation

  • CLAUDE.md - Instructions specifically for Claude Code and Claude agents
  • AGENTS.md - Universal instructions for any AI agent
  • RETROFIT_GUIDE.md - Step-by-step guide to add agentspecs to existing code

Key Concepts

Required Fields

  • what: Detailed explanation (10+ lines recommended)
  • deps: All dependencies (calls, called_by, config, env vars)
  • why: Design rationale and rejected alternatives
  • guardrails: What should NOT be changed and why

Recommended Fields

  • changelog: History of changes with dates
  • testing: Test coverage and edge cases
  • performance: Latency, throughput, bottlenecks

Optional Fields

  • security: Auth, PII handling, rate limits
  • monitoring: Metrics, alerts, dashboards
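
As a rough illustration of how the required/recommended split above can be enforced, here is a toy check. It is not agentspec's real linter, just the shape of the rule:

```python
# Toy version of the required/recommended field check described above.
# Not agentspec's real linter -- just the shape of the rule.
REQUIRED = {"what", "deps", "why", "guardrails"}
RECOMMENDED = {"changelog", "testing", "performance"}

def check_spec(spec: dict) -> list[str]:
    """Return human-readable problems for a parsed agentspec dict."""
    problems = [f"missing required key: {k}" for k in sorted(REQUIRED - spec.keys())]
    problems += [f"consider adding: {k}" for k in sorted(RECOMMENDED - spec.keys())]
    return problems

spec = {"what": "...", "deps": {}, "why": "..."}  # guardrails forgotten
print(check_spec(spec)[0])  # missing required key: guardrails
```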

๐Ÿ› ๏ธ Features

1. Auto-Generation

Let Claude write the verbose docstrings for you:

# Generate for entire codebase
agentspec generate src/ --model claude-haiku-4-5 --force-context

# Generate for single file
agentspec generate src/embeddings.py --model claude-haiku-4-5

What it generates:

  • ✅ Comprehensive WHAT THIS DOES sections
  • ✅ Dependency tracking (calls, called_by)
  • ✅ WHY THIS APPROACH explanations
  • ✅ AGENT INSTRUCTIONS (guardrails)
  • ✅ Changelog entries
  • ✅ Optional: Context-forcing print() statements

2. Intelligent Linting

Validates:

  • ✅ YAML syntax correctness
  • ✅ Presence of required fields
  • ✅ Minimum verbosity (configurable)
  • ✅ Proper structure and formatting

agentspec lint src/

# Output:
src/embeddings.py:
  Line 45: ⚠️ process_embeddings 'what' field too brief (45 chars, recommend 50+)
  Line 78: ❌ batch_process missing required keys: why, guardrails

3. Multiple Export Formats

Markdown (Human-Readable)

agentspec extract src/ --format markdown
# → agent_specs.md

Generates browsable documentation with sections for each function.

JSON (Programmatic)

agentspec extract src/ --format json
# → agent_specs.json

Perfect for building custom tools or integration with CI/CD.
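
For example, a CI gate can read the export and fail the build when a function lacks guardrails. The JSON schema shown here is an assumption for illustration; check your generated agent_specs.json for the real field names:

```python
import json

# Sketch of a CI gate over the JSON export. The schema below is an
# assumption for illustration -- inspect your generated agent_specs.json
# for the real field names.
specs = json.loads("""
[
  {"function": "process_embeddings", "spec": {"guardrails": ["DO NOT change model"]}},
  {"function": "batch_process",      "spec": {}}
]
""")

missing = [s["function"] for s in specs if not s["spec"].get("guardrails")]
if missing:
    print(f"functions without guardrails: {missing}")  # ['batch_process']
```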

Agent Context (AI Consumption)

agentspec extract src/ --format agent-context
# → AGENT_CONTEXT.md

Includes print() statements that force specs into agent context.

4. CI/CD Integration

Add to .github/workflows/agentspec.yml:

name: Agentspec Lint

on: [push, pull_request]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'  # or '3.10', '3.12'
      - name: Install agentspec
        run: pip install -e .
      - name: Lint agentspecs
        run: agentspec lint src/ --strict

5. Pre-commit Hook

Add to .git/hooks/pre-commit:

#!/bin/bash
agentspec lint src/ --strict
if [ $? -ne 0 ]; then
    echo "โŒ Agentspec lint failed. Fix issues before committing."
    exit 1
fi

Make executable:

chmod +x .git/hooks/pre-commit
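
If you use the pre-commit framework rather than a raw Git hook, the same check can run as a local hook. This is a sketch; agentspec does not ship official hook definitions, so the hook id and settings here are illustrative:

```yaml
# .pre-commit-config.yaml -- local hook wrapping the agentspec CLI (sketch)
repos:
  - repo: local
    hooks:
      - id: agentspec-lint
        name: agentspec lint
        entry: agentspec lint src/ --strict
        language: system
        pass_filenames: false
        files: \.py$
```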

🎓 Philosophy

The Problem: AI agents are "helpful" but lack context. They:

  • Delete "unused" code that's actually dynamically imported
  • "Fix" model names that aren't typos (gpt-5 DOES exist)
  • Remove "redundant" rate limiting that prevents outages
  • Refactor code without knowing why it was written that way

The Solution: Encode all context directly in code:

  • Verbose by design - 10-20 lines isn't too much
  • Guardrails first - prevent known failure modes
  • YAML parsable - tools can extract and validate
  • Human AND machine readable - helps everyone

Core Belief: The tokens spent on verbose docs are far cheaper than the time spent debugging agent-caused production incidents.


📊 Real-World Impact

Agents Following Agentspecs

Here's a real example of an AI agent reading agentspecs and following guardrails:

(screenshot: agent following agentspec guardrails)

Before Agentspec

  • โŒ Claude Code deleted 800 lines of "dead code" (actually used via dynamic imports)
  • โŒ Agent "fixed" gpt-5 โ†’ gpt-4o-mini (downgrading intentionally chosen model)
  • โŒ Removed "redundant" validation that prevented SQL injection
  • โŒ Made function async (reintroduced race condition fixed 3 months prior)

After Agentspec

  • ✅ Agent reads spec, sees it's called via dynamic import, leaves it alone
  • ✅ Agent reads guardrail: "DO NOT change gpt-5 - exists as of April 2025"
  • ✅ Agent sees: "NEVER remove validation - prevents SQL injection"
  • ✅ Agent reads changelog: "Reverted to sync after race conditions"

🔧 Retrofitting Existing Codebases

See RETROFIT_GUIDE.md for detailed instructions.

Quick version:

  1. Use agentspec's auto-generation:

    agentspec generate src/ --model claude-haiku-4-5 --force-context
  2. Run agentspec lint to validate

  3. Review and refine AI-generated docs

  4. Iterate on warnings until clean

Expect the auto-generator to handle 90% of the work. You just review and tweak.


๐Ÿค Contributing

We welcome contributions! Areas of focus:

  • Multi-language support (TypeScript, Go, Rust via tree-sitter)
  • IDE integrations (VS Code, Cursor, JetBrains)
  • Automatic dependency mapping (static analysis)
  • Template generators (quick-start for new functions)

See CONTRIBUTING.md for guidelines (TODO).


📜 License

MIT License - See LICENSE for details.


🌟 Show Your Support

If agentspec prevents even ONE production incident for you, please:

  • โญ Star this repo
  • ๐Ÿฆ Share on social media
  • ๐Ÿ“ Write about your experience
  • ๐Ÿค Contribute improvements

Every bit helps make AI agent collaboration safer and more reliable.


📮 Contact


Remember: Verbose docs aren't busywork. They're production incident prevention.

Write specs as if the next agent has NO CONTEXT about your decisions โ€” because they don't.
