Prompt quality auditing: self-validation protocol for assembled prompts #134

@Alan-Jowett

Description

Summary

Add a prompt quality auditing protocol that validates the assembled prompt itself — checking structural completeness, coherence, and alignment before the prompt is used.

Motivation

GitHub Spec Kit introduced a powerful concept: "unit tests for requirements" — checklists that test whether the requirements themselves are well-written, not whether the implementation works. Examples:

  • ✅ ""Are error handling requirements specified for all failure modes?"" (tests the spec)
  • ❌ ""Test error handling works"" (tests the implementation)

PromptKit should apply this same rigor to its own output: the assembled prompt. Before a user loads a prompt into an LLM session, we should be able to validate:

  • Are all {{param}} placeholders filled? (no leftover template variables)
  • Does the persona's domain align with the template's domain?
  • Are non-goals specified? (prevents scope creep)
  • Are all referenced protocols actually loaded?
  • Does the artifact type in the format's `produces` field match the template's `output_contract`?
  • Are there conflicting instructions between protocols?
  • Is the prompt within reasonable token bounds for the target model?
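Several of these checks are purely structural and can be run against the assembled prompt string. A minimal sketch in TypeScript, assuming a hypothetical `auditPrompt` helper and issue shape (none of these names are existing PromptKit API; the placeholder and protocol-reference syntaxes are illustrative assumptions):

```typescript
// Hypothetical structural audit of an assembled prompt string.
// Assumed conventions: {{param}} placeholders, a "Non-goals" heading,
// and "protocol: <name>" references — all illustrative, not PromptKit's real syntax.

interface PromptAuditIssue {
  check: string;   // which checklist category the issue falls under
  message: string; // human-readable description of the problem
}

function auditPrompt(prompt: string, loadedProtocols: string[]): PromptAuditIssue[] {
  const issues: PromptAuditIssue[] = [];

  // Completeness: no leftover {{param}} template variables.
  const unfilled = prompt.match(/\{\{\s*[\w.-]+\s*\}\}/g) ?? [];
  for (const param of unfilled) {
    issues.push({ check: "completeness", message: `Unfilled placeholder: ${param}` });
  }

  // Scope: a "Non-goals" section should be present to prevent scope creep.
  if (!/^#+\s*Non-goals/im.test(prompt)) {
    issues.push({ check: "scope", message: "No 'Non-goals' section found" });
  }

  // References: every protocol mentioned in the prompt should actually be loaded.
  const referenced = prompt.match(/protocol:\s*[\w\/-]+/g) ?? [];
  for (const ref of referenced) {
    const name = ref.replace(/^protocol:\s*/, "");
    if (!loadedProtocols.includes(name)) {
      issues.push({ check: "references", message: `Protocol referenced but not loaded: ${name}` });
    }
  }

  return issues;
}
```

An empty result would mean the prompt passed these structural checks; anything returned maps directly onto the checklist categories above. Semantic checks (persona/domain alignment, conflicting instructions) would need richer inputs than the raw string.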

Proposed Design

  1. New protocol: guardrails/prompt-self-validation — a meta-protocol that checks the assembled prompt for structural issues
  2. Checklist approach: Following Spec Kit's pattern, produce a quality checklist:
    • Completeness: All params filled, all sections present
    • Coherence: Persona domain matches task domain
    • Consistency: No conflicting protocol instructions
    • Measurability: Output format has concrete structure requirements
    • Scope: Non-goals are defined
  3. CLI integration: npx promptkit assemble could run validation automatically and emit warnings
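For the CLI integration, the checklist results could be surfaced as warnings at assemble time. A sketch of what that reporting layer might look like, assuming a hypothetical `ChecklistResult` shape whose categories mirror the five checklist dimensions above (this is not an existing PromptKit interface):

```typescript
// Hypothetical reporting layer: turn checklist results into CLI warnings.
// Category names follow the checklist in this proposal; the types are illustrative.

type Category = "completeness" | "coherence" | "consistency" | "measurability" | "scope";

interface ChecklistResult {
  category: Category;
  passed: boolean;
  detail: string; // explanation shown to the user when the check fails
}

function emitWarnings(results: ChecklistResult[]): number {
  const failures = results.filter(r => !r.passed);
  for (const f of failures) {
    console.warn(`[promptkit] ${f.category}: ${f.detail}`);
  }
  // Returning the failure count lets the CLI decide whether to warn or
  // (under a hypothetical --strict flag) exit non-zero.
  return failures.length;
}
```

Emitting warnings rather than failing outright keeps the audit advisory by default, matching the proposal's intent that validation runs automatically without blocking assembly.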

Credit

This pattern is inspired by GitHub Spec Kit's "unit tests for requirements" concept in their checklist-template.md. The insight — testing the specification quality rather than the implementation — maps directly to testing prompt quality rather than LLM output quality.

Labels: enhancement (New feature or request)