Characterization Test Generator

Lock your legacy code behavior before AI touches it.

A Claude Code plugin that generates characterization tests (also known as approval tests, golden master tests, or snapshot tests) for existing code. These tests document what your code actually does — not what it should do — creating a safety net before refactoring or AI-assisted modification.

Why?

AI coding agents are powerful but dangerous with legacy code:

  • πŸ› They "fix" behavior that users depend on
  • πŸ—‘οΈ They delete tests to make them pass (Kent Beck's warning)
  • πŸ”€ They refactor beyond the requested scope

Characterization tests prevent all three by locking current behavior before any changes.

"When a system goes into production, it becomes its own specification." — Michael Feathers, Working Effectively with Legacy Code

Installation

Claude Code (Plugin Marketplace)

/plugin install characterization-test-generator

Manual Installation

git clone https://github.com/duybv/characterization-test-generator.git
cp -r characterization-test-generator/skills/characterize ~/.claude/skills/

Cursor

/add-plugin characterization-test-generator

Usage

In Claude Code, invoke:

/characterize src/services/optimizer.go

Or describe what you need:

Generate characterization tests for the route optimization service before I refactor it

The skill will:

  1. Identify all public functions/methods in the target
  2. Analyze code paths, inputs, and outputs
  3. Generate characterization tests with realistic data
  4. Scrub unstable fields (timestamps, IDs, random values)
  5. Suggest mutations to verify test effectiveness
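
Step 4 is the part most snapshot setups get wrong: any volatile value baked into a golden file makes the test flaky. A minimal sketch of a recursive scrubber in Python — the key names in `UNSTABLE_KEYS` are illustrative assumptions, not part of the plugin:

```python
# Sketch of step 4: scrub unstable fields so snapshots stay deterministic.
# The key names below are hypothetical; adjust them to your own payloads.
UNSTABLE_KEYS = {"id", "created_at", "updated_at", "request_id"}

def scrub(value):
    """Recursively replace volatile values with a stable placeholder."""
    if isinstance(value, dict):
        return {k: "<scrubbed>" if k in UNSTABLE_KEYS else scrub(v)
                for k, v in value.items()}
    if isinstance(value, list):
        return [scrub(v) for v in value]
    return value

print(scrub({"id": 42, "total": 9.5,
             "items": [{"created_at": "2024-01-01", "sku": "A1"}]}))
# → {'id': '<scrubbed>', 'total': 9.5, 'items': [{'created_at': '<scrubbed>', 'sku': 'A1'}]}
```

The stable fields (`total`, `sku`) survive untouched, so the snapshot still locks real behavior.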

Supported Languages

| Language   | Test Framework | Snapshot Method                      |
|------------|----------------|--------------------------------------|
| Go         | testing        | Golden files (testdata/golden/)      |
| Python     | pytest         | approvaltests or manual golden files |
| TypeScript | jest           | toMatchSnapshot()                    |
| JavaScript | jest           | toMatchSnapshot()                    |
| Kotlin     | JUnit          | ApprovalTests                        |
| Java       | JUnit          | ApprovalTests                        |

How It Works

Based on the Feathers Method (Michael Feathers, 2004):

1. Write test named "x" with expected = null
2. Run test → fails, revealing actual output
3. Paste actual output as expected value
4. Rename test to describe discovered behavior
5. Repeat for different inputs and code paths
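
Done by hand in Python, the loop above might look like this (`total_price` and its 10% markup are a made-up legacy function for illustration):

```python
# Hypothetical legacy function whose behavior we want to lock, quirks and all.
def total_price(items):
    return round(sum(i["qty"] * i["price"] for i in items) * 1.1, 2)

# Step 1: write a test with a throwaway name and no real expectation yet.
# Step 2: run it; the failure message reveals the actual output (6.6).
# Step 3: paste that value in as the expected result.
# Step 4: rename the test to describe what you discovered (the markup).
def test_total_price_adds_10_percent_markup():
    assert total_price([{"qty": 2, "price": 3.0}]) == 6.6

test_total_price_adds_10_percent_markup()  # step 5: repeat with more inputs
```

Note that the test documents the surprise markup rather than judging it; deciding whether 10% is correct comes later.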

This plugin automates steps 1-5 using AI:

┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│  AI reads code   │────▶│ Generates tests  │────▶│ Human reviews    │
│  (understands    │     │ with realistic   │     │ and approves     │
│   all paths)     │     │ inputs+scrubbing │     │ the tests        │
└──────────────────┘     └──────────────────┘     └────────┬─────────┘
                                                           │
                         ┌──────────────────┐              │
                         │ Safe to refactor │◀─────────────┘
                         │ or use AI agents │
                         └──────────────────┘

Example Output

Go

var update = flag.Bool("update", false, "rewrite golden files instead of comparing")

func TestCharacterize_OptimizeRoute_BasicInput(t *testing.T) {
    input := loadFixture(t, "testdata/10_stops_2_warehouses.json")

    result, err := solver.Optimize(input)

    require.NoError(t, err)
    golden := filepath.Join("testdata", "golden", t.Name()+".json")
    actual := toJSON(t, scrub(result))

    if *update {
        require.NoError(t, os.WriteFile(golden, actual, 0644))
        return
    }

    expected, err := os.ReadFile(golden)
    require.NoError(t, err, "no golden file yet; run with -update to record one")
    assert.JSONEq(t, string(expected), string(actual))
}

Python

from approvaltests import verify_as_json

def test_characterize_calculate_distance():
    result = calculate_distance(lat1=10.7, lon1=106.7, lat2=21.0, lon2=105.8)
    verify_as_json(scrub(result))

TypeScript

test("characterize formatAddress", () => {
  const result = formatAddress({ street: "Nguyen Hue", city: "HCM" });
  expect(scrub(result)).toMatchSnapshot();
});

Characterization Tests vs Other Tests

|            | Characterization Test       | Unit Test                  | E2E Test                |
|------------|-----------------------------|----------------------------|-------------------------|
| Purpose    | Document current behavior   | Verify correctness         | Verify user flow        |
| When       | Before changing legacy code | When building new features | After building features |
| Fail means | Behavior changed            | Code is wrong              | User flow broken        |
| Written by | AI (reviewed by human)      | Developer                  | QA/Developer            |

The Three-Step Fast Recipe

From understandlegacycode.com:

  1. 📸 Snapshot — Capture what the code produces
  2. ✅ Coverage — Use coverage reports to find untested paths, add more inputs
  3. 👽 Mutations — Deliberately break code to verify tests catch it
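
A hand-rolled version of the mutation step, as a sketch (`distance` and the flipped operator are illustrative examples, not part of the plugin):

```python
# Sketch of step 3: break the code on purpose and confirm the snapshot notices.
def distance(a, b):          # behavior being locked by the snapshot
    return abs(a - b)

GOLDEN = 7                   # value captured from distance(10, 3) in step 1

def mutant(a, b):            # deliberate mutation: operator flipped
    return abs(a + b)

assert distance(10, 3) == GOLDEN   # real code still matches the golden value
assert mutant(10, 3) != GOLDEN     # a mutant that passed would mean the test is too weak
print("mutation caught")
```

Tools like mutmut (Python) or go-mutesting automate this, but even a few manual mutations quickly expose snapshots that assert too little.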

Recommended Workflow with AI Agents

Phase 1: PROTECT
  /characterize src/services/        ← this plugin
  Commit characterization tests

Phase 2: CHANGE
  AI refactors code
  Run characterization tests
  → All pass? ✅ Behavior preserved
  β†’ Some fail? ⚠️ Review what changed

Phase 3: EVOLVE
  Write proper unit tests (TDD)      ← use superpowers/test-driven-development
  Gradually replace characterization tests with intent-based tests

Theoretical Background

This plugin is based on research and practices cited elsewhere in this README:

  • Michael Feathers, Working Effectively with Legacy Code (2004), the source of the characterization-test technique
  • Kent Beck's warning that AI agents will delete tests in order to make them pass
  • understandlegacycode.com's three-step recipe (snapshot, coverage, mutations)

Full citations live in skills/characterize/references/theory.md.

Plugin Structure

characterization-test-generator/
├── .claude-plugin/
│   └── plugin.json              # Plugin metadata
├── skills/
│   └── characterize/
│       ├── SKILL.md             # Main skill instructions
│       └── references/
│           ├── theory.md        # Background theory & citations
│           └── language-patterns.md  # Code patterns per language
├── docs/
│   └── workflow.md              # Detailed workflow guide
├── tests/                       # Plugin tests
├── README.md
└── LICENSE (MIT)

Contributing

  1. Fork the repository
  2. Create a branch for your improvement
  3. Add support for new languages or improve existing patterns
  4. Submit a PR

License

MIT — see LICENSE for details.


Built for developers who maintain legacy code in the age of AI.

If AI is going to touch your code, make sure you have a safety net first.
