Skip to content

fix(sdk): pack validator spec compliance (#933)#935

Merged
chaholl merged 15 commits intomainfrom
fix/933-pack-validator-spec-compliance
Apr 10, 2026
Merged

fix(sdk): pack validator spec compliance (#933)#935
chaholl merged 15 commits intomainfrom
fix/933-pack-validator-spec-compliance

Conversation

@chaholl
Copy link
Copy Markdown
Contributor

@chaholl chaholl commented Apr 10, 2026

Summary

Fixes #933 — the SDK silently replaced every response with DefaultBlockedMessage when a pack declared any validator, due to a JSON tag mismatch that dropped validator params.

This PR delivers the immediate fix and a set of prevention mechanisms so the class of bug cannot recur.

The bug

sdk/internal/pack.Validator had json:"config" where the promptpack spec uses json:"params", plus forbidden monitor/message fields the spec's additionalProperties:false rejects. Params silently unmarshalled to nil, every declared validator ran with empty params, the max_length handler reported "missing or zero 'max'/'max_characters' param" every turn, and the guardrail adapter replaced every response with DefaultBlockedMessage.

The pack in the issue is fully valid per the promptpack spec. The SDK was the defect.

What changed

Immediate fix (e5c66290): SDK Validator struct aligned to $defs/Validator in runtime/prompt/schema/promptpack.schema.jsontype, enabled (required), fail_on_violation (optional), params (optional). convertPackValidatorsToHooks updated to skip disabled validators, honor the spec default of monitor-only when fail_on_violation is absent, and pass Params (not Config) to the guardrail factory.

Prevention mechanisms:

  • Spec-parity reflection test (2538c7d3) — walks pack.Validator via reflection and diffs against the embedded schema's $defs/Validator. Any future tag rename, extra field, or required-with-omitempty fails the build.
  • Schema rejection test (a02bed79) — pins that ValidateAgainstSchema rejects validators with forbidden monitor/config/message fields before struct unmarshal.
  • ParamValidator interface (4edfb07b, 580cfe11) — new optional interface in runtime/evals that handlers can implement to expose required-param contracts. Implemented on MaxLengthHandler, MinLengthHandler, WorkflowStateIsHandler, GuardrailTriggeredHandler.
  • Guardrail factory validates params at construction (a32e8170) — NewGuardrailHookFromRegistry normalises params (ApplyDefaults + NormalizeParams) and calls ValidateParams, so pack validators with unusable params fail at SDK load instead of silently enforcing every turn.
  • ValidateEvalTypes extended (af6ad2b3) — the existing function used by Arena fail-fast and SDK eval preflight now also invokes ParamValidator. Arena gains stricter checks for free.
  • SDK warn-and-skip for evals (d1e76e49) — sdk/eval_middleware.go filters bad eval defs at middleware creation with a warning, mirroring the existing validator warn-and-skip in convertPackValidatorsToHooks.
  • Public sdk.ValidatePack(path, skipSchemaValidation) preflight API (8849cb5b, 2809df11, a868f1a2) — lets CI gates and operator tools check whether a pack will load cleanly before sdk.Open() is called. Strict-by-default (uses the embedded promptpack schema). Returns []PackIssue for semantic problems and error for structural/schema failures.

Tests:

  • End-to-end regression (77c070ad, b88a385f) — full path sdk.Open() → pack.Load() → Send() with a mock provider, asserts content unchanged AND validator actually registered (not silently dropped by the warn-and-skip layer). Empirically verified to fail if the struct tag regresses.
  • Ticket-exact reproducer (9f74362c) — literal mirror of the reproducer in Pack validators silently run with empty params due to SDK/runtime struct tag mismatch #933 (fail_on_violation: false, 25-char response).
  • Arena fail-fast test (7517857c) — pins that Arena's existing fail-fast picks up ParamValidator errors.
  • ValidatePack tests covering valid pack, unknown validator type, missing required params, disabled skip, unknown eval type, missing eval params, multiple issues, file-not-found, structural error, strict schema rejection of non-spec fields, and the skip-schema opt-out.

Latent bugs surfaced

The stricter validation revealed two pre-existing test-fixture bugs, both fixed in the same commits that introduced the stricter check:

  • TestNewGuardrailHook_AllTypes declared max_length with only max_tokens (no max_characters) — was relying on silent runtime tolerance.
  • TestEvalMiddleware_ResolvesPackAndPromptEvals used fabricated eval types (regex_override) and bare length with empty params — was bypassing validation entirely.

API addition

// strict (default — zero value)
issues, err := sdk.ValidatePack("pack.json", false)

// opt-out for tooling that wants only handler-level checks
issues, err := sdk.ValidatePack("pack.json", true)

type PackIssue struct {
    Severity string  // "error"
    Kind     string  // "validator" or "eval"
    PromptID string  // owning prompt, or "" for pack-level evals
    Index    int     // position within validators/evals slice
    Type     string  // handler type
    ID       string  // eval def ID (empty for validators)
    Reason   string  // human-readable explanation
}

Test plan

  • Unit tests for ParamValidator on all four handlers with happy paths, missing keys, nil params, zero, negative, and alias key names
  • Unit tests for guardrail factory happy path, missing params, unknown type, handlers without ParamValidator
  • Unit tests for ValidateEvalTypes extension with stub handler
  • SDK warn-and-skip filter tests (unknown type, missing params, all-valid)
  • Spec-parity reflection test against embedded schema
  • Schema rejection test for additionalProperties:false fields
  • JSON round-trip fidelity test for Validator struct
  • End-to-end happy path: short response not blocked (empirically verified to fail on struct-tag regression via log-capture assertion)
  • End-to-end enforcement: long response truncated with fail_on_violation:true
  • End-to-end monitor-only: long response passes through with fail_on_violation:false
  • Ticket-exact reproducer matching the issue's literal JSON
  • Arena fail-fast regression test for missing required params
  • sdk.ValidatePack tests covering 12 scenarios including strict default and opt-out
  • Full test matrix (runtime, sdk, pkg, tools/arena, tools/packc, tools/schema-gen) passes with -count=1 -race
  • Lint clean (golangci-lint run --new-from-rev=main) on all changed files

chaholl added 15 commits April 10, 2026 16:08
The sdk/internal/pack.Validator struct had json:"config" where the
promptpack spec uses json:"params", and extra fields (monitor, message)
forbidden by the spec's additionalProperties:false. Validator params
silently unmarshalled to nil, every declared validator ran with empty
params, and the guardrail adapter replaced every response with
DefaultBlockedMessage.

- Replace Validator struct with an exact mirror of the spec Validator
  definition: type, enabled (required), fail_on_violation, params.
- Update convertPackValidatorsToHooks to skip disabled validators, honor
  the spec default of monitor-only when fail_on_violation is absent, and
  pass params (not config) to the guardrail factory.
- Rewrite the existing hook-conversion tests against the new fields.
- Add regression tests that load a promptpack-shaped JSON and assert
  params survive the round-trip.

Fixes silent guardrail enforcement in SDK pack loading path.
Diffs the pack.Validator struct's JSON tags against the embedded
promptpack schema's $defs/Validator definition. Any future drift —
renaming a tag, adding a non-spec field, dropping a required field —
fails the build.
The promptpack schema sets additionalProperties:false on Validator;
adding monitor, config, or message to a validator must fail the
load-time schema check before reaching struct unmarshal.
Optional interface that EvalTypeHandler implementations can adopt to
expose their required-param contract. Used by guardrail hook
construction and ValidateEvalTypes so invalid pack validators/evals
surface at SDK load time instead of failing silently at request time.
MaxLengthHandler, MinLengthHandler, WorkflowStateIsHandler, and
GuardrailTriggeredHandler now implement ParamValidator, hoisting
the required-key checks that previously lived inside Eval() into
a load-time contract.
NewGuardrailHookFromRegistry now calls ParamValidator.ValidateParams
(after ApplyDefaults + NormalizeParams) when the handler implements
the interface. Pack validators with unusable params surface as errors
at SDK load time instead of silently blocking every response at
request time.

Part of the fix for #933 - combined with the struct-tag fix earlier
in the series, this makes the class of silent guardrail enforcement
impossible at the SDK boundary.
Arena's existing fail-fast at tools/arena/engine/builder_integration.go
and the SDK's public preflight at sdk/evaluate.go both pick up
param-validation errors for free. The SDK warn-and-skip middleware
(next commit) uses the same function to filter bad eval defs at
middleware creation.
Eval defs whose type is unregistered or whose params are unusable by
the runtime handler are now filtered out at middleware creation, with
a warning logged per skipped def. Mirrors the validator warn-and-skip
pattern in convertPackValidatorsToHooks.

Arena still fail-fasts on the same conditions via builder_integration.
Exercises the full path: promptpack JSON -> Open() -> Send() with a
mock provider. Asserts that a short response is not replaced with
DefaultBlockedMessage when a max_length validator with a generous
limit is declared. Also covers fail_on_violation:true enforcement
and the spec-default monitor-only mode.
Review showed the content-equality assertion alone is a tautology:
with the struct tag broken, the validator hook is silently skipped by
convertPackValidatorsToHooks' warn-and-skip path, so a short response
passes through unchanged for the wrong reason. Capture the SDK logger
and assert NO "Skipping unusable pack validator" warning is emitted,
pinning both structural correctness (hook registered) and behavioural
correctness (content unchanged).
Arena's existing ValidateEvalTypes fail-fast now also surfaces
param-validation errors via the ParamValidator interface extension.
This test pins the contract so future changes can't silently
weaken it.
Mirrors the exact scenario from the GitHub issue: fail_on_violation:false,
max_characters:2000, 25-char response. Existing tests cover the two bugs
separately; this one pins the literal combination the issue author reported.
ValidatePack(path) loads a pack and reports any semantic issues —
unknown validator/eval types, missing required params — that would
cause Open() to warn-and-skip them or Arena to fail fast. Runs the
same handler-level param validation the SDK runs internally during
Open(), exposed as a standalone function for CI gates and operator
tools.

Structural failures (malformed JSON, missing required fields) are
returned as an error. Semantic issues are returned as a slice of
PackIssue records with enough context (prompt, index, type, id,
reason) for operators to pinpoint each problem.
Flip ValidatePack to strict promptpack schema validation by default.
Packs with forbidden fields (e.g. validator.monitor, validator.config)
are returned as load-time errors, not PackIssues — matching the
semantics of sdk.Open() which validates against the embedded schema.

ValidatePack now takes ...Option and consults WithSkipSchemaValidation
as an explicit opt-out for tooling that wants only handler-level
checks. Update test fixtures so the minimal valid pack is spec-
compliant (adds template_engine) without needing the skip flag, and
add tests pinning both the strict default and the opt-out path.
Replace the functional-options signature with an explicit boolean
parameter. The previous approach reused sdk.Option (shared with Open)
which advertised 80+ options but only actually read one — confusing
and a latent footgun. A named boolean makes the contract obvious at
every call site: the caller must explicitly decide strict-or-skip,
and the Go zero value (false) gives strict-by-default if they forget.
@sonarqubecloud
Copy link
Copy Markdown

@chaholl chaholl merged commit 2d2ba82 into main Apr 10, 2026
29 checks passed
@chaholl chaholl deleted the fix/933-pack-validator-spec-compliance branch April 10, 2026 18:18
chaholl added a commit that referenced this pull request Apr 10, 2026
Resolves a conflict in sdk/eval_middleware.go where #935 introduced
resolveRunnerAndFilter (to filter invalid eval defs) in the same block
this branch had replaced to clone the runner. Resolution keeps both:
resolveRunnerAndFilter picks the runner and filters defs, then we
clone the returned runner before attaching emitter and eval hooks so a
user-supplied WithEvalRunner instance is never mutated across Open()
calls.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Pack validators silently run with empty params due to SDK/runtime struct tag mismatch

1 participant