Skip to content

Add generic voice artifact manifest schema#812

Merged
Atharva-Kanherkar merged 3 commits into
mainfrom
codex/voice-manifest-schema
May 14, 2026
Merged

Add generic voice artifact manifest schema#812
Atharva-Kanherkar merged 3 commits into
mainfrom
codex/voice-manifest-schema

Conversation

@Atharva-Kanherkar
Copy link
Copy Markdown
Collaborator

Closes #811\n\n## Summary\n- adds a producer-neutral JSON Schema for voice artifact manifests\n- validates manifest examples and common malformed references in backend tests\n- links the manifest schema from the generic voice artifact contract docs\n\n## Testing\n- cd backend && go test ./internal/voiceartifacts\n\n## Review checkpoint\n- testing/codex-voice-manifest-schema.md\n\nThis stays generic to AgentClash voice evals. Voicey is not a canonical schema surface here; product-specific details belong in metadata or raw traces.

@Atharva-Kanherkar Atharva-Kanherkar merged commit 1047b09 into main May 14, 2026
5 checks passed
@Atharva-Kanherkar Atharva-Kanherkar deleted the codex/voice-manifest-schema branch May 14, 2026 12:10
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 14, 2026

Greptile Summary

This PR introduces a producer-neutral JSON Schema (Draft 2020-12) for AgentClash voice artifact manifests, adds backend Go tests that validate both positive and negative examples against that schema, and links the new schema from the voice-artifact contracts docs.

  • docs/schemas/voice-artifact-manifest.schema.json: New schema encoding schema_version as a pinned const, UUID patterns for run_id/run_agent_id, location-based if/then guards (local_path vs object_storage), path-traversal rejection, and uniqueItems for exact-duplicate detection.
  • backend/internal/voiceartifacts/schema_test.go: Two new test functions; the rejection suite carries an unreachable validate struct field and an unconditional strings.Contains assertion on the error message that diverges from the existing optional-errSubstr pattern.
  • web/content/docs/concepts/voice-artifact-contracts.mdx and testing/codex-voice-manifest-schema.md: Documentation wiring and test-contract description; no logic changes.

Confidence Score: 4/5

Safe to merge; the schema and documentation are correct, and the tests cover the intended cases, but the test helper code has minor rough edges that could cause confusion on future library upgrades.

The JSON schema itself is well-formed and the positive/negative test coverage is meaningful. Two issues in the test file keep the score from being clean: an unused validate field that is structurally dead code, and an unconditional strings.Contains assertion on the jsonschema library's error text that could produce misleading failures if the library's error formatting changes during a future upgrade. Neither affects current correctness.

backend/internal/voiceartifacts/schema_test.go — the rejection test struct and error-string check both warrant a quick look before merging.

Important Files Changed

Filename Overview
backend/internal/voiceartifacts/schema_test.go Adds two new test functions for the manifest schema; contains an unreachable validate field and an unconditional error-string assertion that diverges from the established pattern.
docs/schemas/voice-artifact-manifest.schema.json New JSON Schema (Draft 2020-12) for voice artifact manifests; correctly encodes required kinds, location-based if/then constraints, path-traversal guards, and checksum/UUID patterns.
testing/codex-voice-manifest-schema.md New markdown test-contract document for the manifest schema; describes functional behavior, unit tests, and expected test commands. Documentation only, no issues.
web/content/docs/concepts/voice-artifact-contracts.mdx Documentation update linking the new manifest schema and clarifying that the Go validator enforces unique artifact keys beyond what JSON Schema's uniqueItems covers.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    M[Voice Artifact Manifest] --> SV["schema_version == '2026-05-13'"]
    M --> RI["run_id (UUID)"]
    M --> RAI["run_agent_id (UUID)"]
    M --> VSI["voice_session_id (minLength: 1)"]
    M --> A["artifacts array (minItems: 1, uniqueItems: true)"]
    M --> META["metadata (optional)"]

    A --> CONTAINS["allOf: contains checks"]
    CONTAINS --> C1["caller_audio ✓"]
    CONTAINS --> C2["agent_audio ✓"]
    CONTAINS --> C3["transcript_json ✓"]
    CONTAINS --> C4["waveform_timeline_json ✓"]
    CONTAINS --> C5["structured_output_json ✓"]

    A --> AREF["artifact_ref"]
    AREF --> KEY["key (minLength: 1)"]
    AREF --> KIND["kind (enum: 11 values)"]
    AREF --> LOC["location"]
    AREF --> CHKSUM["checksum_sha256 (^[0-9a-f]{64}$)"]

    LOC --> LP["local_path"]
    LOC --> OS["object_storage"]

    LP --> LPREQ["requires: path (relative, no traversal)"]
    LP --> LPFORBID["must NOT have: bucket, object_key"]

    OS --> OSREQ["requires: bucket + object_key"]
    OS --> OSFORBID["must NOT have: path"]
Loading

Fix All in Codex

Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
backend/internal/voiceartifacts/schema_test.go:254-258
**Unused `validate` field is dead code**

The `validate func(t *testing.T, manifest map[string]any)` field is declared in the struct but none of the eight test cases ever assign it. The `if tt.validate != nil` guard on line 316 is therefore unreachable. The existing rejection-test helper (`TestVoiceReportSchemasRejectInvalidExamples`) uses a simpler `errSubstr string` pattern, which is both populated and checked. Either remove the field or populate it in at least one case to keep the code meaningful.

### Issue 2 of 2
backend/internal/voiceartifacts/schema_test.go:322-325
**Unconditional schema-filename assertion diverges from the established pattern and may mask real failures**

Every rejection case now hard-requires `err.Error()` to contain `"voice-artifact-manifest.schema.json"`. The existing `TestVoiceReportSchemasRejectInvalidExamples` uses a per-case optional `errSubstr` field that is only checked `if tt.errSubstr != ""`, so the substring check can be enabled selectively. If the `jsonschema` library changes how it formats its error strings (e.g., during a library upgrade), *all* eight rejection sub-tests would fail with `"error … does not mention manifest schema"` even though each invalid document was correctly rejected — the misleading failure would make a library upgrade look like a schema regression.

Reviews (1): Last reviewed commit: "fix: align voice manifest schema preflig..." | Re-trigger Greptile

Comment on lines +254 to +258
tests := []struct {
name string
mutate func(map[string]any)
validate func(t *testing.T, manifest map[string]any)
}{
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Unused validate field is dead code

The validate func(t *testing.T, manifest map[string]any) field is declared in the struct but none of the eight test cases ever assign it. The if tt.validate != nil guard on line 316 is therefore unreachable. The existing rejection-test helper (TestVoiceReportSchemasRejectInvalidExamples) uses a simpler errSubstr string pattern, which is both populated and checked. Either remove the field or populate it in at least one case to keep the code meaningful.

Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/internal/voiceartifacts/schema_test.go
Line: 254-258

Comment:
**Unused `validate` field is dead code**

The `validate func(t *testing.T, manifest map[string]any)` field is declared in the struct but none of the eight test cases ever assign it. The `if tt.validate != nil` guard on line 316 is therefore unreachable. The existing rejection-test helper (`TestVoiceReportSchemasRejectInvalidExamples`) uses a simpler `errSubstr string` pattern, which is both populated and checked. Either remove the field or populate it in at least one case to keep the code meaningful.

How can I resolve this? If you propose a fix, please make it concise.

Fix in Codex

Comment on lines +322 to +325
if tt.validate != nil {
tt.validate(t, manifest)
}
err := schema.Validate(asJSONDocument(t, manifest))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Unconditional schema-filename assertion diverges from the established pattern and may mask real failures

Every rejection case now hard-requires err.Error() to contain "voice-artifact-manifest.schema.json". The existing TestVoiceReportSchemasRejectInvalidExamples uses a per-case optional errSubstr field that is only checked if tt.errSubstr != "", so the substring check can be enabled selectively. If the jsonschema library changes how it formats its error strings (e.g., during a library upgrade), all eight rejection sub-tests would fail with "error … does not mention manifest schema" even though each invalid document was correctly rejected — the misleading failure would make a library upgrade look like a schema regression.

Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/internal/voiceartifacts/schema_test.go
Line: 322-325

Comment:
**Unconditional schema-filename assertion diverges from the established pattern and may mask real failures**

Every rejection case now hard-requires `err.Error()` to contain `"voice-artifact-manifest.schema.json"`. The existing `TestVoiceReportSchemasRejectInvalidExamples` uses a per-case optional `errSubstr` field that is only checked `if tt.errSubstr != ""`, so the substring check can be enabled selectively. If the `jsonschema` library changes how it formats its error strings (e.g., during a library upgrade), *all* eight rejection sub-tests would fail with `"error … does not mention manifest schema"` even though each invalid document was correctly rejected — the misleading failure would make a library upgrade look like a schema regression.

How can I resolve this? If you propose a fix, please make it concise.

Fix in Codex

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add a generic voice artifact manifest schema

1 participant