fix(sdk): pack validator spec compliance (#933)#935
Merged
Conversation
The sdk/internal/pack.Validator struct had json:"config" where the promptpack spec uses json:"params", and extra fields (monitor, message) forbidden by the spec's additionalProperties:false. Validator params silently unmarshalled to nil, every declared validator ran with empty params, and the guardrail adapter replaced every response with DefaultBlockedMessage. - Replace Validator struct with an exact mirror of the spec Validator definition: type, enabled (required), fail_on_violation, params. - Update convertPackValidatorsToHooks to skip disabled validators, honor the spec default of monitor-only when fail_on_violation is absent, and pass params (not config) to the guardrail factory. - Rewrite the existing hook-conversion tests against the new fields. - Add regression tests that load a promptpack-shaped JSON and assert params survive the round-trip. Fixes silent guardrail enforcement in SDK pack loading path.
Diffs the pack.Validator struct's JSON tags against the embedded promptpack schema's $defs/Validator definition. Any future drift — renaming a tag, adding a non-spec field, dropping a required field — fails the build.
The promptpack schema sets additionalProperties:false on Validator; adding monitor, config, or message to a validator must fail the load-time schema check before reaching struct unmarshal.
Optional interface that EvalTypeHandler implementations can adopt to expose their required-param contract. Used by guardrail hook construction and ValidateEvalTypes so invalid pack validators/evals surface at SDK load time instead of failing silently at request time.
MaxLengthHandler, MinLengthHandler, WorkflowStateIsHandler, and GuardrailTriggeredHandler now implement ParamValidator, hoisting the required-key checks that previously lived inside Eval() into a load-time contract.
NewGuardrailHookFromRegistry now calls ParamValidator.ValidateParams (after ApplyDefaults + NormalizeParams) when the handler implements the interface. Pack validators with unusable params surface as errors at SDK load time instead of silently blocking every response at request time. Part of the fix for #933 - combined with the struct-tag fix earlier in the series, this makes the class of silent guardrail enforcement impossible at the SDK boundary.
Arena's existing fail-fast at tools/arena/engine/builder_integration.go and the SDK's public preflight at sdk/evaluate.go both pick up param-validation errors for free. The SDK warn-and-skip middleware (next commit) uses the same function to filter bad eval defs at middleware creation.
Eval defs whose type is unregistered or whose params are unusable by the runtime handler are now filtered out at middleware creation, with a warning logged per skipped def. Mirrors the validator warn-and-skip pattern in convertPackValidatorsToHooks. Arena still fail-fasts on the same conditions via builder_integration.
Exercises the full path: promptpack JSON -> Open() -> Send() with a mock provider. Asserts that a short response is not replaced with DefaultBlockedMessage when a max_length validator with a generous limit is declared. Also covers fail_on_violation:true enforcement and the spec-default monitor-only mode.
Review showed the content-equality assertion alone is a tautology: with the struct tag broken, the validator hook is silently skipped by convertPackValidatorsToHooks' warn-and-skip path, so a short response passes through unchanged for the wrong reason. Capture the SDK logger and assert NO "Skipping unusable pack validator" warning is emitted, pinning both structural correctness (hook registered) and behavioural correctness (content unchanged).
Arena's existing ValidateEvalTypes fail-fast now also surfaces param-validation errors via the ParamValidator interface extension. This test pins the contract so future changes can't silently weaken it.
Mirrors the exact scenario from the GitHub issue: fail_on_violation:false, max_characters:2000, 25-char response. Existing tests cover the two bugs separately; this one pins the literal combination the issue author reported.
ValidatePack(path) loads a pack and reports any semantic issues — unknown validator/eval types, missing required params — that would cause Open() to warn-and-skip them or Arena to fail fast. Runs the same handler-level param validation the SDK runs internally during Open(), exposed as a standalone function for CI gates and operator tools. Structural failures (malformed JSON, missing required fields) are returned as an error. Semantic issues are returned as a slice of PackIssue records with enough context (prompt, index, type, id, reason) for operators to pinpoint each problem.
Flip ValidatePack to strict promptpack schema validation by default. Packs with forbidden fields (e.g. validator.monitor, validator.config) are returned as load-time errors, not PackIssues — matching the semantics of sdk.Open() which validates against the embedded schema. ValidatePack now takes ...Option and consults WithSkipSchemaValidation as an explicit opt-out for tooling that wants only handler-level checks. Update test fixtures so the minimal valid pack is spec- compliant (adds template_engine) without needing the skip flag, and add tests pinning both the strict default and the opt-out path.
Replace the functional-options signature with an explicit boolean parameter. The previous approach reused sdk.Option (shared with Open) which advertised 80+ options but only actually read one — confusing and a latent footgun. A named boolean makes the contract obvious at every call site: the caller must explicitly decide strict-or-skip, and the Go zero value (false) gives strict-by-default if they forget.
|
chaholl
added a commit
that referenced
this pull request
Apr 10, 2026
Resolves a conflict in sdk/eval_middleware.go where #935 introduced resolveRunnerAndFilter (to filter invalid eval defs) in the same block this branch had replaced to clone the runner. Resolution keeps both: resolveRunnerAndFilter picks the runner and filters defs, then we clone the returned runner before attaching emitter and eval hooks so a user-supplied WithEvalRunner instance is never mutated across Open() calls.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.



Summary
Fixes #933 — the SDK silently replaced every response with
DefaultBlockedMessagewhen a pack declared any validator, due to a JSON tag mismatch that dropped validator params.This PR delivers the immediate fix and a set of prevention mechanisms so the class of bug cannot recur.
The bug
sdk/internal/pack.Validatorhadjson:"config"where the promptpack spec usesjson:"params", plus forbiddenmonitor/messagefields the spec'sadditionalProperties:falserejects. Params silently unmarshalled to nil, every declared validator ran with empty params, themax_lengthhandler reported"missing or zero 'max'/'max_characters' param"every turn, and the guardrail adapter replaced every response withDefaultBlockedMessage.The pack in the issue is fully valid per the promptpack spec. The SDK was the defect.
What changed
Immediate fix (
e5c66290): SDKValidatorstruct aligned to$defs/Validatorinruntime/prompt/schema/promptpack.schema.json—type,enabled(required),fail_on_violation(optional),params(optional).convertPackValidatorsToHooksupdated to skip disabled validators, honor the spec default of monitor-only whenfail_on_violationis absent, and passParams(notConfig) to the guardrail factory.Prevention mechanisms:
2538c7d3) — walkspack.Validatorvia reflection and diffs against the embedded schema's$defs/Validator. Any future tag rename, extra field, or required-with-omitempty fails the build.a02bed79) — pins thatValidateAgainstSchemarejects validators with forbiddenmonitor/config/messagefields before struct unmarshal.ParamValidatorinterface (4edfb07b,580cfe11) — new optional interface inruntime/evalsthat handlers can implement to expose required-param contracts. Implemented onMaxLengthHandler,MinLengthHandler,WorkflowStateIsHandler,GuardrailTriggeredHandler.a32e8170) —NewGuardrailHookFromRegistrynormalises params (ApplyDefaults+NormalizeParams) and callsValidateParams, so pack validators with unusable params fail at SDK load instead of silently enforcing every turn.ValidateEvalTypesextended (af6ad2b3) — the existing function used by Arena fail-fast and SDK eval preflight now also invokesParamValidator. Arena gains stricter checks for free.d1e76e49) —sdk/eval_middleware.gofilters bad eval defs at middleware creation with a warning, mirroring the existing validator warn-and-skip inconvertPackValidatorsToHooks.sdk.ValidatePack(path, skipSchemaValidation)preflight API (8849cb5b,2809df11,a868f1a2) — lets CI gates and operator tools check whether a pack will load cleanly beforesdk.Open()is called. Strict-by-default (uses the embedded promptpack schema). Returns[]PackIssuefor semantic problems anderrorfor structural/schema failures.Tests:
77c070ad,b88a385f) — full pathsdk.Open() → pack.Load() → Send()with a mock provider, asserts content unchanged AND validator actually registered (not silently dropped by the warn-and-skip layer). Empirically verified to fail if the struct tag regresses.9f74362c) — literal mirror of the reproducer in Pack validators silently run with empty params due to SDK/runtime struct tag mismatch #933 (fail_on_violation: false, 25-char response).7517857c) — pins that Arena's existing fail-fast picks upParamValidatorerrors.ValidatePacktests covering valid pack, unknown validator type, missing required params, disabled skip, unknown eval type, missing eval params, multiple issues, file-not-found, structural error, strict schema rejection of non-spec fields, and the skip-schema opt-out.Latent bugs surfaced
The stricter validation revealed two pre-existing test-fixture bugs, both fixed in the same commits that introduced the stricter check:
TestNewGuardrailHook_AllTypesdeclaredmax_lengthwith onlymax_tokens(nomax_characters) — was relying on silent runtime tolerance.TestEvalMiddleware_ResolvesPackAndPromptEvalsused fabricated eval types (regex_override) and barelengthwith empty params — was bypassing validation entirely.API addition
Test plan
ParamValidatoron all four handlers with happy paths, missing keys, nil params, zero, negative, and alias key namesParamValidatorValidateEvalTypesextension with stub handleradditionalProperties:falsefieldsValidatorstructfail_on_violation:truefail_on_violation:falsesdk.ValidatePacktests covering 12 scenarios including strict default and opt-outruntime,sdk,pkg,tools/arena,tools/packc,tools/schema-gen) passes with-count=1 -racegolangci-lint run --new-from-rev=main) on all changed files