Skip to content

[codex] Add manifest auxiliary artifact verification#158

Merged
project-navi-bot merged 4 commits into
mainfrom
codex/auxiliary-artifact-manifest
Jun 3, 2026
Merged

[codex] Add manifest auxiliary artifact verification#158
project-navi-bot merged 4 commits into
mainfrom
codex/auxiliary-artifact-manifest

Conversation

@Fieldnote-Echo
Copy link
Copy Markdown
Owner

Summary

  • add auxiliary_artifacts to the ordvec-manifest schema for caller-named sidecars with path, SHA-256, byte length, and required/optional presence
  • verify declared auxiliary artifacts under the same default path policy as index and row-map paths, with deterministic per-member report entries and stable reason/error codes
  • include auxiliary artifact state/bytes in the SQLite verification cache key so stale reports are not reused after sidecar drift or optional absent/present changes
  • document the contract in the manifest README, provenance docs, root README, and changelog

Scope

Validation

  • cargo fmt --check
  • cargo test -p ordvec-manifest
  • cargo test -p ordvec-manifest --features sqlite
  • cargo check -p ordvec-manifest --no-default-features
  • cargo clippy -p ordvec-manifest --all-targets --features sqlite -- -D warnings
  • git diff --check

Adversarial Review

  • Read-only scouts checked manifest: verify named auxiliary artifacts beside an index #144 scope, ordgrep downstream sidecar needs, and the existing ordvec-manifest surface before implementation.
  • Final adversarial review found no blockers; its only residual risk was optional-absent to present SQLite cache behavior, which is now covered by sqlite_cache_key_distinguishes_optional_auxiliary_absent_and_present.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces named auxiliary artifact verification to ordvec-manifest. It allows manifests to declare caller-owned sidecars (auxiliary artifacts) that are integrity-checked for path, size, and SHA-256 digest under the same path policy as the primary index. It supports both required and optional sidecars, produces deterministic report entries, and integrates with the SQLite cache to invalidate cached reports when sidecar bytes or states change. Documentation and comprehensive tests have been updated accordingly. There are no review comments, so no feedback is provided.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

@codecov
Copy link
Copy Markdown

codecov Bot commented Jun 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@Fieldnote-Echo Fieldnote-Echo marked this pull request as ready for review June 3, 2026 13:57
@Fieldnote-Echo Fieldnote-Echo requested a review from Copilot June 3, 2026 13:57
@qodo-code-review
Copy link
Copy Markdown

Review Summary by Qodo

Add manifest auxiliary artifact verification with sidecar integrity checks

✨ Enhancement

Grey Divider

Walkthroughs

Description
• Add auxiliary artifact verification to manifest schema with required/optional sidecar support
• Verify declared sidecars' paths, SHA-256 digests, and byte lengths under default path policy
• Include auxiliary artifact state/bytes in SQLite cache key for stale report invalidation
• Document auxiliary artifact contract in manifest and provenance documentation
Diagram
flowchart LR
  A["Manifest Schema"] -->|"Add auxiliary_artifacts field"| B["AuxiliaryArtifact struct"]
  B -->|"name, path, sha256, file_size_bytes, required"| C["Verification Logic"]
  C -->|"Path resolution & validation"| D["State Determination"]
  D -->|"Verified/OptionalAbsent/MissingRequired/Failed"| E["AuxiliaryArtifactReport"]
  E -->|"Include in cache key"| F["SQLite Cache"]
  F -->|"Invalidate on drift"| G["Fresh Verification"]

Loading

Grey Divider

File Changes

1. ordvec-manifest/src/lib.rs ✨ Enhancement +308/-0

Core auxiliary artifact verification and schema definitions

• Add AuxiliaryArtifact struct with name, path, sha256, file_size_bytes, and required fields
• Add AuxiliaryArtifactReport struct and AuxiliaryArtifactState enum for verification results
• Implement verify_auxiliary_artifacts() function to verify declared sidecars with deterministic
 ordering
• Implement validate_auxiliary_artifact_shape() to check schema constraints (non-empty names,
 unique names, valid SHA-256)
• Implement resolve_auxiliary_artifact_path() to handle path resolution with escape detection and
 optional/required logic
• Add auxiliary_artifacts field to IndexManifest and VerificationReport structures
• Initialize auxiliary_artifacts vector in manifest creation and verification report initialization

ordvec-manifest/src/lib.rs


2. ordvec-manifest/src/main.rs ✨ Enhancement +4/-0

Display auxiliary artifacts count in CLI output

• Add display of auxiliary_artifacts count in manifest info output

ordvec-manifest/src/main.rs


3. ordvec-manifest/src/sqlite.rs ✨ Enhancement +96/-4

SQLite cache key integration for auxiliary artifacts

• Add auxiliary_artifacts_sha256 column to verification_reports table schema
• Add migration logic to handle schema updates for new auxiliary artifacts column
• Implement current_auxiliary_artifacts_sha256() to compute cache key from current artifact state
• Implement auxiliary_artifacts_sha256_from_report() to extract cache key from verification report
• Add AuxiliaryArtifactCacheEntry struct for serializable cache key computation
• Update cache key queries and storage to include auxiliary artifacts SHA-256 digest
• Update cache invalidation logic to distinguish optional absent vs present states

ordvec-manifest/src/sqlite.rs


View more (5)
4. ordvec-manifest/tests/manifest.rs 🧪 Tests +252/-4

Comprehensive test coverage for auxiliary artifacts

• Add auxiliary_artifact() helper function for test artifact creation
• Add auxiliary_artifacts_verify_and_report_deterministically() test verifying sorted report order
• Add auxiliary_artifacts_fail_closed_on_tamper_missing_and_path_escape() test for integrity
 checks
• Add auxiliary_artifact_schema_rejects_unknown_fields_and_duplicate_names() test for schema
 validation
• Add sqlite_cache_key_includes_auxiliary_artifact_bytes() test for cache invalidation on sidecar
 drift
• Add sqlite_cache_key_distinguishes_optional_auxiliary_absent_and_present() test for optional
 state transitions

ordvec-manifest/tests/manifest.rs


5. CHANGELOG.md 📝 Documentation +6/-0

Update changelog with auxiliary artifacts feature

• Document addition of named auxiliary artifact verification feature
• Note support for required/optional sidecar states and cache invalidation

CHANGELOG.md


6. README.md 📝 Documentation +4/-3

Update root README with auxiliary artifacts mention

• Update manifest verifier description to include named auxiliary sidecars
• Clarify that verifier checks auxiliary artifact path, size, and digest integrity

README.md


7. docs/INDEX_PROVENANCE.md 📝 Documentation +11/-0

Document auxiliary artifact verification contract

• Document auxiliary artifact verification checks (path, SHA-256, byte length)
• Explain auxiliary artifact purpose for application-owned sidecars
• Clarify verifier reports presence/absence and integrity without interpreting bytes
• Note caller responsibility to load sidecars only after verification

docs/INDEX_PROVENANCE.md


8. ordvec-manifest/README.md 📝 Documentation +20/-10

Document auxiliary artifacts in manifest README

• Update feature description to include named auxiliary artifacts
• Document auxiliary_artifacts field structure and semantics
• Explain required vs optional artifact behavior and verification outcomes
• Update SQLite cache description to include auxiliary artifact state/bytes in cache key

ordvec-manifest/README.md


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review Bot commented Jun 3, 2026

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (0) 📎 Requirement gaps (1) 🎨 UX issues (0) 🔗 Cross-repo conflicts (0)

Grey Divider


Action required

1. ordvec-manifest missing report examples ✓ Resolved 📎 Requirement gap ⚙ Maintainability
Description
Documentation updates describe auxiliary artifact verification but do not include concrete example
verification reports for both a successful run and a tampered/failed case. This violates the
requirement to document the unified report format with explicit examples.
Code

ordvec-manifest/README.md[R77-94]

Evidence
PR Compliance ID 7 requires documentation to include at least two example reports (success and
failure/tamper). The updated ordvec-manifest README documents auxiliary artifact semantics but
provides no example report outputs illustrating those semantics.

Documentation includes example reports for success and tamper/failure cases
ordvec-manifest/README.md[1-94]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The documentation does not include example unified verification reports (JSON) showing both a success case and a tampered/failed case, as required for downstream auditability and user comprehension.

## Issue Context
The README documents the verifier behavior and limits, including `auxiliary_artifacts`, but does not show sample report payloads that demonstrate statuses/codes/fields.

## Fix Focus Areas
- ordvec-manifest/README.md[77-94]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Cache ignores failed aux bytes 🐞 Bug ≡ Correctness
Description
SQLite cache keys omit auxiliary artifact signatures whenever any auxiliary entry is
Failed/MissingRequired, so a cached failing report can be reused even after auxiliary bytes change
(still failing), returning stale observed sha/size in the report. This breaks the stated cache
contract of invalidating reports on sidecar drift.
Code

ordvec-manifest/src/sqlite.rs[R394-455]

Evidence
current_auxiliary_artifacts_sha256 returns None when auxiliary verification produces any errors,
and auxiliary_artifacts_sha256_from_report also returns None for Failed/MissingRequired
states—so cache lookup treats all failing auxiliary states as equivalent (NULL) and may reuse
stale reports. Meanwhile, auxiliary verification records observed hash/size even before marking an
entry Failed for mismatches, so the data needed to distinguish drift is available but discarded by
the cache-key builder.

ordvec-manifest/src/sqlite.rs[394-455]
ordvec-manifest/src/lib.rs[1139-1171]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
SQLite cache-key computation drops `auxiliary_artifacts_sha256` to `NULL` whenever any auxiliary artifact is `Failed` or `MissingRequired` (and also bails out when `verify_auxiliary_artifacts` emits any errors). This allows a cached *failing* report to be reused even if the auxiliary file changes again but remains failing, producing stale `sha256`/`size_bytes` fields in the cached report.

## Issue Context
- `verify_auxiliary_artifacts` already captures observed `sha256` and `size_bytes` before marking an entry as `Failed` for digest/size mismatches, so the cache key can still be made sensitive to drift even in failure cases.
- The cache should invalidate when auxiliary state/bytes change, regardless of whether verification succeeds.

## Fix Focus Areas
- ordvec-manifest/src/sqlite.rs[394-455]
- ordvec-manifest/src/lib.rs[1113-1171]

## Fix approach
1. Remove/relax the `if !report.errors.is_empty() { return Ok(None); }` guard in `current_auxiliary_artifacts_sha256` so you still produce a cache component when verification fails.
2. Update `auxiliary_artifacts_sha256_from_report` to *include* `Failed` and `MissingRequired` entries in the serialized cache material instead of returning `Ok(None)`.
  - Include at minimum: `name`, `manifest_path`, `required`, `state`, and `reason_code`.
  - For `Failed` entries where `entry.sha256`/`entry.size_bytes` are present (e.g., mismatch cases), include the observed values so drift changes the cache key.
  - For `MissingRequired`, include state/reason without sha/size.
3. Keep ordering deterministic by hashing the already-deterministic `report.auxiliary_artifacts` sequence.
4. Add/extend a test demonstrating: failing aux (mismatch) -> mutate aux to a different mismatch -> cached report must not be reused (observed sha/size should update).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. No aux artifact count limit 📎 Requirement gap ⛨ Security
Description
The new auxiliary_artifacts feature can accept an unbounded number of entries and verification
sorts/iterates them without any explicit, configurable resource limit. This can enable excessive
CPU/memory use on hostile manifests and provides no stable error code for a declaration-count limit
violation.
Code

ordvec-manifest/src/lib.rs[R1206-1215]

Evidence
PR Compliance IDs 4 and 5 require explicit, configurable resource limits (including auxiliary
declaration counts once auxiliary artifacts exist) and stable error codes for limit violations. The
PR adds auxiliary_artifacts to the manifest and sorts/iterates over the full list (collect() +
sort_by(...)) with no limit enforcement, and VerifyOptions contains no limit fields to configure
such a bound.

Verification uses bounded parsing with explicit resource limits configurable via VerifyOptions
Stable error codes for limit violations with deterministic failure behavior and test coverage
ordvec-manifest/src/lib.rs[1518-1555]
ordvec-manifest/src/lib.rs[1206-1215]
ordvec-manifest/src/lib.rs[1400-1406]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`auxiliary_artifacts` declarations are unbounded: the verifier collects and sorts all entries and then verifies them without any explicit limit, and `VerifyOptions` has no configurable limit for auxiliary declaration count. This violates the checklist requirement for bounded parsing/resource limits and stable error codes for limit violations.

## Issue Context
- `auxiliary_artifacts` is now part of the manifest schema and is processed during verification.
- To fail deterministically under hostile input, introduce a safe default maximum (configurable via `VerifyOptions`) and emit a stable error code when exceeded.

## Fix Focus Areas
- ordvec-manifest/src/lib.rs[1400-1406]
- ordvec-manifest/src/lib.rs[1518-1555]
- ordvec-manifest/src/lib.rs[347-378]
- ordvec-manifest/src/lib.rs[1206-1215]
- ordvec-manifest/tests/manifest.rs[951-1065]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

4. AuxiliaryArtifactReport missing path/digest ✓ Resolved 📎 Requirement gap ◔ Observability
Description
Auxiliary artifact report entries for OptionalAbsent and MissingRequired do not include
resolved/canonical path information or any digest/length details, so callers must re-open the
manifest (or mine error strings) to log what was checked. This violates the requirement that report
entries carry resolved-path and digest/length fields sufficient for audit logging.
Code

ordvec-manifest/src/lib.rs[R1296-1303]

Evidence
PR Compliance ID 5 requires per-artifact report entries to include resolved/normalized path and
digest/length details sufficient for logging. In verify_auxiliary_artifacts, the OptionalAbsent
and MissingRequired branches set only state/reason_code and leave
canonical_path/sha256/size_bytes unset, and the AuxiliaryArtifactReport struct has no
separate fields for declared/expected digest/length.

Report includes resolved-path and digest details sufficient for logging
ordvec-manifest/src/lib.rs[1213-1311]
ordvec-manifest/src/lib.rs[1986-1996]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Auxiliary artifact report entries omit audit fields (resolved/normalized path and digest/length details) for non-verified states like `optional_absent` and `missing_required`.

## Issue Context
PR Compliance requires that each artifact entry in the verification report contains enough resolved-path and digest/length information for logging/auditing without re-reading the manifest. Currently, optional-absent/missing-required entries only carry `name`, `manifest_path`, `state`, and `reason_code`.

## Fix Focus Areas
- ordvec-manifest/src/lib.rs[1213-1311]
- ordvec-manifest/src/lib.rs[1986-1996]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. Redundant base canonicalize ✓ Resolved 🐞 Bug ➹ Performance
Description
resolve_auxiliary_artifact_path() canonicalizes base_dir for every auxiliary artifact, causing
redundant filesystem work when many sidecars are declared and again when SQLite computes the
auxiliary cache key.
Code

ordvec-manifest/src/lib.rs[R1380-1395]

Evidence
Auxiliary verification iterates over all declared auxiliary artifacts and calls the resolver each
time; the resolver canonicalizes the same base_dir on every call. SQLite cache-key computation also
calls verify_auxiliary_artifacts, repeating the same pattern during cache evaluation.

ordvec-manifest/src/lib.rs[1213-1222]
ordvec-manifest/src/lib.rs[1380-1395]
ordvec-manifest/src/sqlite.rs[459-469]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`resolve_auxiliary_artifact_path()` calls `fs::canonicalize(base_dir)` on every auxiliary artifact verification. With up to `max_auxiliary_artifacts` entries (default 1024), this repeats identical filesystem work (and is also exercised in SQLite cache-key calculation via `current_auxiliary_artifacts_sha256()`).

### Issue Context
`base_dir` is constant per verification run (`document.base_dir`). Canonicalizing it once and reusing the result removes O(n) redundant `canonicalize()` calls while preserving the same path-escape enforcement.

### Fix Focus Areas
- ordvec-manifest/src/lib.rs[1213-1466]
- ordvec-manifest/src/sqlite.rs[459-469]

### Suggested approach
1. Compute `base_canonical` once per verification (e.g., at the start of `verify_auxiliary_artifacts`).
2. Pass `&base_canonical` into the resolver (either by adding a new resolver that accepts it, or by refactoring `resolve_auxiliary_artifact_path` to accept an optional precomputed canonical base).
3. Optional: if `options.allow_path_escape` is true, skip canonical-base computation entirely (since the `starts_with(base_canonical)` check is not needed).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bf1c7eb99c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread ordvec-manifest/src/sqlite.rs Outdated
Comment thread ordvec-manifest/src/lib.rs
Comment thread ordvec-manifest/src/sqlite.rs
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends ordvec-manifest to support caller-declared “auxiliary artifacts” (sidecar files) in the manifest schema, verifies them under the existing path-safety policy, reports their verification state deterministically, and integrates auxiliary state/bytes into the SQLite verification cache key to avoid reusing stale reports after sidecar drift.

Changes:

  • Add auxiliary_artifacts to the manifest schema and verification report, including required/optional semantics and stable reason codes.
  • Implement auxiliary artifact verification (path policy + size + SHA-256) with deterministic report ordering.
  • Extend SQLite caching to account for auxiliary artifact state/bytes and document the new contract across READMEs/provenance/changelog.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
README.md Updates top-level README to mention auxiliary sidecar verification.
ordvec-manifest/tests/manifest.rs Adds tests for deterministic auxiliary reporting, fail-closed behavior, schema strictness, and SQLite cache-key behavior.
ordvec-manifest/src/sqlite.rs Adds auxiliary_artifacts_sha256 to cache key / schema and computes auxiliary cache-key material.
ordvec-manifest/src/main.rs Prints auxiliary artifact count in CLI manifest info output.
ordvec-manifest/src/lib.rs Adds schema field + report field + auxiliary verification implementation and shape validation.
ordvec-manifest/README.md Documents auxiliary artifact schema/behavior and SQLite cache implications.
docs/INDEX_PROVENANCE.md Documents auxiliary artifact verification as part of provenance checks.
CHANGELOG.md Notes the new auxiliary artifact verification feature.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread ordvec-manifest/src/lib.rs
Comment thread ordvec-manifest/src/sqlite.rs
Comment thread ordvec-manifest/src/sqlite.rs Outdated
@Fieldnote-Echo Fieldnote-Echo force-pushed the codex/auxiliary-artifact-manifest branch 2 times, most recently from a6cce83 to 0b19f0a Compare June 3, 2026 14:46
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@Fieldnote-Echo Fieldnote-Echo force-pushed the codex/auxiliary-artifact-manifest branch from 0b19f0a to ca62d24 Compare June 3, 2026 17:11
Copy link
Copy Markdown
Owner Author

Rebased #158 onto current main after #157 and added the remaining auxiliary sidecar byte bound.

  • Resolved rebase conflicts in README.md and ordvec-manifest/README.md, preserving the persisted-format/resource-limit docs and adding the auxiliary-artifact wording.
  • Added ResourceLimits::max_auxiliary_artifact_bytes with a 64 MiB default and CLI override --max-auxiliary-artifact-bytes.
  • Routed auxiliary artifact verification through sha256_file_bounded() with stable report code auxiliary_artifact_file_too_large instead of direct unbounded sha256_file().
  • Documented the new limit/code in ordvec-manifest/README.md, docs/INDEX_PROVENANCE.md, and the changelog.
  • Added auxiliary_artifact_byte_limit_is_enforced_before_hashing to pin the behavior.

Validation: cargo fmt --all --check, cargo test -p ordvec-manifest, and git diff --check.

@Fieldnote-Echo
Copy link
Copy Markdown
Owner Author

/agentic_review

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review Bot commented Jun 3, 2026

Code review by qodo was updated up to the latest commit ca62d24

Comment thread ordvec-manifest/README.md
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
Copy link
Copy Markdown
Owner Author

Follow-up for the latest Qodo rereview on #158:

  • Added concrete ordvec-manifest README examples for a successful auxiliary-artifact report and a tamper/missing-required failure report.
  • Extended AuxiliaryArtifactReport with audit fields populated from the manifest side: resolved_path, expected_sha256, and expected_size_bytes, while preserving observed sha256/size_bytes when bytes are readable.
  • Canonicalized the auxiliary base directory once per verification run and reused it for each sidecar path check.
  • Confirmed the repeated count-limit and failed-cache-key findings are already addressed on the current head: ResourceLimits::max_auxiliary_artifacts plus auxiliary_artifact_count_limit_exceeded; current_auxiliary_artifacts_sha256() no longer bails on auxiliary errors; cache material includes failed and missing_required entries; sqlite test coverage includes failed sidecar byte drift.

Validation:

  • cargo fmt --all --check
  • cargo test -p ordvec-manifest
  • cargo test -p ordvec-manifest --features sqlite
  • git diff --check

@project-navi-bot project-navi-bot merged commit d86c546 into main Jun 3, 2026
30 checks passed
@project-navi-bot project-navi-bot deleted the codex/auxiliary-artifact-manifest branch June 3, 2026 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

manifest: verify named auxiliary artifacts beside an index

3 participants