195 changes: 195 additions & 0 deletions internal/mapper/README.md
@@ -0,0 +1,195 @@
# JSON Schema Error Mapper

The `internal/mapper` package maps JSON Schema validation errors to precise YAML text spans, using RFC 6901 JSON Pointers and the position metadata in the goccy/go-yaml AST.

## Overview

When JSON Schema validation fails, it typically provides:
- An `instancePath` (RFC6901 JSON Pointer) indicating where the error occurred
- An error type/kind (e.g., "type", "required", "additionalProperties")
- Additional metadata about the error

This package translates these errors into precise line/column ranges in the original YAML source file, enabling IDEs and tools to highlight the exact problematic tokens.

## API

### Core Function

```go
func MapErrorToSpans(yamlBytes []byte, instancePath string, meta ErrorMeta) ([]Span, error)
```

Maps a JSON Schema validation error to one or more candidate YAML spans, ordered by confidence.

**Parameters:**
- `yamlBytes`: The original YAML content as bytes
- `instancePath`: RFC6901 JSON Pointer (e.g., "/jobs/build/steps/0/uses")
- `meta`: Error metadata including kind and property information

**Returns:**
- `[]Span`: Ordered list of candidate spans (highest confidence first)
- `error`: Parse error if YAML is invalid

### Types

```go
// Span describes a location in the source YAML file
type Span struct {
	StartLine  int     // 1-based line number
	StartCol   int     // 1-based column number
	EndLine    int     // 1-based end line
	EndCol     int     // 1-based end column
	Confidence float64 // 0.0 - 1.0 confidence score
	Reason     string  // Human-readable explanation
}

// ErrorMeta contains validator-provided metadata
type ErrorMeta struct {
	Kind          string // Error type: "type", "required", "additionalProperties", etc.
	Property      string // Property name for property-specific errors
	SchemaSnippet string // Optional schema fragment for context
}
```

## Algorithm

The mapper uses a multi-step approach:

### 1. JSON Pointer Decoding
- Decodes RFC6901 pointers into path segments
- Handles escaping (`~0` → `~`, `~1` → `/`)
- Validates pointer format
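
The decoding step can be sketched as follows. This is a minimal illustration of RFC 6901 unescaping, not the package's actual code; `decodePointer` is a hypothetical name.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// decodePointer splits an RFC 6901 JSON Pointer into unescaped segments.
// Hypothetical helper for illustration only.
func decodePointer(ptr string) ([]string, error) {
	if ptr == "" {
		return nil, nil // the empty pointer refers to the whole document
	}
	if !strings.HasPrefix(ptr, "/") {
		return nil, errors.New("JSON pointer must be empty or start with '/'")
	}
	parts := strings.Split(ptr[1:], "/")
	for i, p := range parts {
		// Reject a '~' that is not followed by '0' or '1'.
		for j := 0; j < len(p); j++ {
			if p[j] == '~' && (j+1 >= len(p) || (p[j+1] != '0' && p[j+1] != '1')) {
				return nil, fmt.Errorf("invalid escape in segment %q", p)
			}
		}
		// Order matters: replace ~1 before ~0 so that "~01" decodes to "~1".
		p = strings.ReplaceAll(p, "~1", "/")
		p = strings.ReplaceAll(p, "~0", "~")
		parts[i] = p
	}
	return parts, nil
}

func main() {
	segs, err := decodePointer("/jobs/build/steps/0/uses")
	fmt.Println(segs, err) // [jobs build steps 0 uses] <nil>

	segs, _ = decodePointer("/a~1b/m~0n")
	fmt.Println(segs) // [a/b m~n]
}
```

Replacing `~1` first is what keeps a literal `~1` (encoded as `~01`) from being turned into `/`.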

### 2. AST Traversal
- Parses YAML using goccy/go-yaml with position metadata
- Traverses AST following pointer segments:
  - **Mapping nodes**: Match child keys to segment names
  - **Sequence nodes**: Use numeric indices
- Returns target node, parent, and key node
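
The segment-by-segment walk can be illustrated without the YAML library. The sketch below traverses plain Go values (`map[string]any` and `[]any`) rather than goccy/go-yaml AST nodes, so it carries no position metadata; `walk` is a hypothetical name and the real traversal presumably differs in detail.

```go
package main

import (
	"fmt"
	"strconv"
)

// walk follows decoded pointer segments through a decoded document.
// On success it returns the target and its parent container.
// On failure ok is false and parent is the deepest node that was reached.
// Illustration only: the package walks AST nodes, not plain Go values.
func walk(root any, segments []string) (target, parent any, ok bool) {
	target = root
	for _, seg := range segments {
		parent = target
		switch node := target.(type) {
		case map[string]any:
			// Mapping: match the segment against child keys.
			child, found := node[seg]
			if !found {
				return nil, parent, false
			}
			target = child
		case []any:
			// Sequence: the segment must be a valid in-range index.
			idx, err := strconv.Atoi(seg)
			if err != nil || idx < 0 || idx >= len(node) {
				return nil, parent, false
			}
			target = node[idx]
		default:
			// Scalars have no children to descend into.
			return nil, parent, false
		}
	}
	return target, parent, true
}

func main() {
	doc := map[string]any{
		"jobs": map[string]any{
			"build": map[string]any{
				"steps": []any{map[string]any{"uses": "actions/checkout@v4"}},
			},
		},
	}
	target, _, ok := walk(doc, []string{"jobs", "build", "steps", "0", "uses"})
	fmt.Println(target, ok)
}
```

Returning the deepest reachable node on failure is what makes a parent-context fallback possible when the full path does not exist.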

### 3. Error-Kind Mapping

#### Type Errors (`"type"`)
- **Direct hit**: Highlight the value token with high confidence (0.95)
- **Fallback**: Highlight entire node with good confidence (0.9)

#### Missing Required Properties (`"required"`)
- Find parent mapping that should contain the property
- Compute insertion anchor (position after last sibling)
- Confidence: 0.7-0.75 depending on context

#### Additional Properties (`"additionalProperties"`)
- Find the specific key node using `meta.Property`
- Highlight the key token (not the value)
- Confidence: 0.98 for exact key matches

#### Other Error Types
- Generic node highlighting with moderate confidence (0.8)
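
The kind-to-strategy rules above amount to a small dispatch table. The sketch below restates them in Go; `strategy` and `strategyFor` are hypothetical names. The README does not say which context earns 0.75 rather than 0.7 for `required`, so the sketch assumes a directly resolved parent gets the higher score.

```go
package main

import "fmt"

// strategy describes what to highlight and with what confidence.
// Hypothetical type for illustration only.
type strategy struct {
	Target     string
	Confidence float64
}

// strategyFor restates the documented rules. direct reports whether the
// traversal reached the exact node the rule needs. ok is false when the
// caller should fall through to the fallback heuristics instead.
func strategyFor(kind string, direct bool) (s strategy, ok bool) {
	switch kind {
	case "type":
		if direct {
			return strategy{Target: "value token", Confidence: 0.95}, true
		}
		return strategy{Target: "entire node", Confidence: 0.9}, true
	case "required":
		// Assumption: a directly resolved parent mapping scores 0.75.
		if direct {
			return strategy{Target: "insertion anchor after last sibling", Confidence: 0.75}, true
		}
		return strategy{Target: "insertion anchor after last sibling", Confidence: 0.7}, true
	case "additionalProperties":
		if direct {
			return strategy{Target: "key token", Confidence: 0.98}, true
		}
		return strategy{}, false // no exact key match: use fallbacks
	default:
		return strategy{Target: "node", Confidence: 0.8}, true
	}
}

func main() {
	for _, kind := range []string{"type", "required", "additionalProperties", "oneOf"} {
		s, ok := strategyFor(kind, true)
		fmt.Printf("%-22s %-38s %.2f %v\n", kind, s.Target, s.Confidence, ok)
	}
}
```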

### 4. Fallback Heuristics

When exact traversal fails:

1. **Text Search**: Search for property names in raw YAML text (confidence: 0.6)
2. **Parent Context**: Find nearest existing parent node (confidence: 0.4)
3. **Document Fallback**: Return document-level span (confidence: 0.2)
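
As an illustration of the text-search heuristic, the sketch below scans the raw YAML for a line whose key matches the property name and reports it with confidence 0.6. `findKeyInText` is a hypothetical name, and the matching rule (block-style keys only, ASCII columns, inclusive end column) is an assumption rather than the package's documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// Span mirrors the type documented in the API section.
type Span struct {
	StartLine  int
	StartCol   int
	EndLine    int
	EndCol     int
	Confidence float64
	Reason     string
}

// findKeyInText returns the span of the first line whose key is property.
// Assumptions: block-style YAML, unquoted ASCII keys, 1-based inclusive columns.
func findKeyInText(yamlText, property string) (Span, bool) {
	if property == "" {
		return Span{}, false
	}
	for i, line := range strings.Split(yamlText, "\n") {
		rest := strings.TrimLeft(line, " ")
		indent := len(line) - len(rest)
		// Allow a key that starts a sequence item, e.g. "- uses: ...".
		if strings.HasPrefix(rest, "- ") {
			rest = rest[2:]
			indent += 2
		}
		if !strings.HasPrefix(rest, property+":") {
			continue
		}
		return Span{
			StartLine:  i + 1,
			StartCol:   indent + 1,
			EndLine:    i + 1,
			EndCol:     indent + len(property),
			Confidence: 0.6,
			Reason:     fmt.Sprintf("text search matched key %q", property),
		}, true
	}
	return Span{}, false
}

func main() {
	src := "config:\n  host: \"localhost\"\n  port: 8080"
	span, ok := findKeyInText(src, "port")
	fmt.Printf("%+v %v\n", span, ok)
}
```

A text match can land on a same-named key elsewhere in the document, which is why this heuristic ranks below any span produced by AST traversal.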

## Confidence Scoring

Confidence scores guide tooling on which spans to prefer:

- **0.9-1.0**: Exact matches, high certainty
- **0.7-0.9**: Good matches with minor ambiguity
- **0.4-0.7**: Reasonable guesses, contextual matches
- **0.2-0.4**: Fallback heuristics
- **0.0-0.2**: Last resort, document-level
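
A consumer might bucket and select spans as sketched below. The ranges above share their endpoints, so the sketch assumes each band includes its upper bound, which matches the scores quoted elsewhere in this README (0.2 for the document fallback, 0.4 for parent context, 0.9 for the type fallback). `tier` and `best` are hypothetical helpers.

```go
package main

import "fmt"

// Span mirrors the type documented in the API section.
type Span struct {
	StartLine  int
	StartCol   int
	EndLine    int
	EndCol     int
	Confidence float64
	Reason     string
}

// tier names the band a confidence score falls into.
// Assumption: each band includes its upper bound.
func tier(c float64) string {
	switch {
	case c > 0.9:
		return "exact"
	case c > 0.7:
		return "good"
	case c > 0.4:
		return "reasonable"
	case c > 0.2:
		return "fallback"
	default:
		return "last resort"
	}
}

// best returns the highest-confidence span. The first span wins ties,
// which preserves the ordering returned by the mapper.
func best(spans []Span) (Span, bool) {
	if len(spans) == 0 {
		return Span{}, false
	}
	top := spans[0]
	for _, s := range spans[1:] {
		if s.Confidence > top.Confidence {
			top = s
		}
	}
	return top, true
}

func main() {
	spans := []Span{
		{StartLine: 2, StartCol: 9, EndLine: 2, EndCol: 14, Confidence: 0.95, Reason: "value token"},
		{StartLine: 1, StartCol: 1, EndLine: 3, EndCol: 19, Confidence: 0.2, Reason: "document"},
	}
	if top, ok := best(spans); ok {
		fmt.Println(tier(top.Confidence), top.Reason)
	}
}
```

An editor integration could, for example, underline only `exact` and `good` spans and show the rest as file-level diagnostics.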

## Usage Examples

### Type Mismatch Error

```go
yaml := `config:
  port: "8080"  # Should be integer
  host: "localhost"`

spans, err := MapErrorToSpans([]byte(yaml), "/config/port", ErrorMeta{
	Kind: "type",
})
// Returns: Span{StartLine: 2, StartCol: 9, EndLine: 2, EndCol: 14, Confidence: 0.95}
// Highlights the value "8080"
```

### Missing Required Property

```go
yaml := `config:
  host: "localhost"`

spans, err := MapErrorToSpans([]byte(yaml), "/config/port", ErrorMeta{
	Kind:     "required",
	Property: "port",
})
// Returns: Span{StartLine: 3, StartCol: 18, ...} with insertion anchor
```

### Additional Property Error

```go
yaml := `config:
  port: 8080
  extra: "not allowed"`

spans, err := MapErrorToSpans([]byte(yaml), "/config/extra", ErrorMeta{
	Kind:     "additionalProperties",
	Property: "extra",
})
// Returns: Span highlighting the key "extra" with high confidence
```

## Supported YAML Features

- **Block style** mappings and sequences
- **Flow style** (`{key: value}`, `[item1, item2]`)
- **Mixed styles** within the same document
- **Nested structures** (arbitrarily deep)
- **Special characters** in keys
- **Numeric keys**
- **Empty string keys**

## Limitations

- **Multi-document YAML**: Only the first document is processed
- **Anchors/Aliases**: Basic support, may return multiple candidates
- **Complex merge keys**: Limited support for YAML merge semantics
- **Position accuracy**: Depends on goccy/go-yaml token positions

## Error Handling

- **Invalid YAML**: Returns parse error
- **Invalid JSON Pointer**: Returns validation error
- **Missing paths**: Returns fallback spans with low confidence
- **Empty input**: Returns document-level fallback span

## Testing

The package includes comprehensive tests covering:

- All supported error kinds
- Complex YAML structures (nested, flow style, mixed)
- Edge cases (invalid YAML, special characters, large indices)
- Confidence scoring validation
- Position accuracy verification

Run tests with:
```bash
go test ./internal/mapper -v
```

## Performance Considerations

- **Parse cost**: Full YAML parse required for each mapping
- **Memory usage**: AST held in memory during traversal
- **Caching**: Consider caching parsed AST for multiple error mappings

For high-frequency usage, parse YAML once and reuse the AST for multiple error mappings.
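
One way to parse once and reuse the result is a small cache keyed by a hash of the source bytes, sketched below. The package's documented entry point takes raw bytes, so `astCache` and the `parse` callback are hypothetical; a real integration would plug in the YAML parser and a mapping function that accepts a parsed AST. The sketch uses generics and therefore assumes Go 1.18 or later.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// astCache memoizes parse results by content hash.
// Hypothetical helper: T stands in for whatever the parser returns.
type astCache[T any] struct {
	mu      sync.Mutex
	parse   func([]byte) (T, error)
	entries map[[sha256.Size]byte]T
}

func newASTCache[T any](parse func([]byte) (T, error)) *astCache[T] {
	return &astCache[T]{parse: parse, entries: make(map[[sha256.Size]byte]T)}
}

// get returns the cached result for src, parsing it on first use.
// The lock is held while parsing, which keeps the sketch simple at the
// cost of serializing concurrent parses. Failed parses are not cached.
func (c *astCache[T]) get(src []byte) (T, error) {
	key := sha256.Sum256(src)
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.entries[key]; ok {
		return v, nil
	}
	v, err := c.parse(src)
	if err != nil {
		var zero T
		return zero, err
	}
	c.entries[key] = v
	return v, nil
}

func main() {
	parses := 0
	// Stand-in parser that only counts how often it runs.
	cache := newASTCache(func(b []byte) (int, error) {
		parses++
		return len(b), nil
	})
	src := []byte("config:\n  port: 8080\n")
	for i := 0; i < 3; i++ {
		if _, err := cache.get(src); err != nil {
			fmt.Println("parse error:", err)
		}
	}
	fmt.Println("parses:", parses) // parses: 1
}
```

The cache grows without bound as written; a long-running tool would want eviction, for example dropping the entry when a file changes.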
171 changes: 171 additions & 0 deletions internal/mapper/cli.go
@@ -0,0 +1,171 @@
package mapper

import (
	"encoding/json"
	"fmt"
	"os"
)

// CLIError represents an error in CLI format for testing
type CLIError struct {
	InstancePath string    `json:"instancePath"`
	Meta         ErrorMeta `json:"meta"`
}

// TestRunner provides a simple interface for testing the mapper
type TestRunner struct {
	Verbose bool
}

// NewTestRunner creates a new test runner
func NewTestRunner(verbose bool) *TestRunner {
	return &TestRunner{Verbose: verbose}
}

// RunTest executes a mapping test with the given YAML and error
func (tr *TestRunner) RunTest(yamlContent string, cliError CLIError) {
	if tr.Verbose {
		fmt.Printf("=== Testing Mapper ===\n")
		fmt.Printf("YAML Content:\n%s\n", yamlContent)
		fmt.Printf("Instance Path: %s\n", cliError.InstancePath)
		fmt.Printf("Error Kind: %s\n", cliError.Meta.Kind)
		if cliError.Meta.Property != "" {
			fmt.Printf("Property: %s\n", cliError.Meta.Property)
		}
		fmt.Printf("\n")
	}

	spans, err := MapErrorToSpans([]byte(yamlContent), cliError.InstancePath, cliError.Meta)
	if err != nil {
		fmt.Printf("ERROR: %v\n", err)
		return
	}

	if len(spans) == 0 {
		fmt.Printf("No spans returned\n")
		return
	}

	fmt.Printf("Results (%d spans):\n", len(spans))
	for i, span := range spans {
		fmt.Printf("  %d. Line %d:%d - %d:%d (confidence: %.2f)\n",
			i+1, span.StartLine, span.StartCol, span.EndLine, span.EndCol, span.Confidence)
		fmt.Printf("     Reason: %s\n", span.Reason)
	}

	if tr.Verbose {
		fmt.Printf("\n=== End Test ===\n\n")
	}
}

// RunTestFromJSON runs a test from JSON-formatted input
func (tr *TestRunner) RunTestFromJSON(yamlContent, errorJSON string) error {
	var cliError CLIError
	if err := json.Unmarshal([]byte(errorJSON), &cliError); err != nil {
		return fmt.Errorf("failed to parse error JSON: %w", err)
	}

	tr.RunTest(yamlContent, cliError)
	return nil
}

// PrintHelp prints usage information
func (tr *TestRunner) PrintHelp() {
	fmt.Printf(`JSON Schema Error Mapper Test Tool

USAGE:
  Set up YAML content and error JSON, then call RunTest methods.

ERROR JSON FORMAT:
  {
    "instancePath": "/path/to/property",
    "meta": {
      "kind": "type|required|additionalProperties|oneOf|anyOf",
      "property": "property_name",
      "schemaSnippet": "optional schema info"
    }
  }

EXAMPLE ERROR TYPES:
  - Type mismatch: {"instancePath": "/config/port", "meta": {"kind": "type"}}
  - Missing required: {"instancePath": "/config/missing", "meta": {"kind": "required", "property": "missing"}}
  - Additional property: {"instancePath": "/config/extra", "meta": {"kind": "additionalProperties", "property": "extra"}}
`)
}

// ExampleUsage runs three sample mappings and can be called from tests or a CLI.
// The YAML literals are indented with spaces so that port, host, and extra
// are children of config, as the instance paths require.
func ExampleUsage() {
	runner := NewTestRunner(true)

	// Example 1: Type error
	yaml1 := `config:
  port: "8080"
  host: "localhost"`

	error1 := CLIError{
		InstancePath: "/config/port",
		Meta:         ErrorMeta{Kind: "type"},
	}

	fmt.Println("Example 1: Type Error")
	runner.RunTest(yaml1, error1)

	// Example 2: Missing required property
	yaml2 := `config:
  host: "localhost"`

	error2 := CLIError{
		InstancePath: "/config/port",
		Meta:         ErrorMeta{Kind: "required", Property: "port"},
	}

	fmt.Println("Example 2: Missing Required Property")
	runner.RunTest(yaml2, error2)

	// Example 3: Additional property
	yaml3 := `config:
  port: 8080
  host: "localhost"
  extra: "not allowed"`

	error3 := CLIError{
		InstancePath: "/config/extra",
		Meta:         ErrorMeta{Kind: "additionalProperties", Property: "extra"},
	}

	fmt.Println("Example 3: Additional Property")
	runner.RunTest(yaml3, error3)
}

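// RunInteractiveExample runs examples and can be called from main or tests
func RunInteractiveExample() {
	fmt.Println("JSON Schema Error Mapper - Interactive Examples")
	// Separator rule. The previous "=" + fmt.Sprintf("%50s", "=") printed two
	// '=' characters separated by 49 spaces rather than a solid line.
	fmt.Println("===================================================")

	ExampleUsage()
}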

// SaveExampleToFile saves a test case to a file for external testing
func SaveExampleToFile(filename, yamlContent, errorJSON string) error {
	type TestCase struct {
		YAML  string   `json:"yaml"`
		Error CLIError `json:"error"`
	}

	var cliError CLIError
	if err := json.Unmarshal([]byte(errorJSON), &cliError); err != nil {
		// Wrap the error the same way RunTestFromJSON does.
		return fmt.Errorf("failed to parse error JSON: %w", err)
	}

	testCase := TestCase{
		YAML:  yamlContent,
		Error: cliError,
	}

	data, err := json.MarshalIndent(testCase, "", " ")
	if err != nil {
		return fmt.Errorf("failed to encode test case: %w", err)
	}

	return os.WriteFile(filename, data, 0644)
}