Add C2PA Monitor experiment #459

Open
lnispel wants to merge 7 commits into WordPress:develop from OpenVerifiable:feature/c2pa-monitor
lnispel commented Apr 22, 2026

Read-only feature that detects [C2PA Content Credentials](https://c2pa.org/) in uploaded JPEG/PNG/WebP images at the add_attachment hook, captures the raw manifest store to a sidecar file under wp-content/uploads/ai-c2pa/, and persists a structured _wpai_monitor_record postmeta entry for downstream consumers.

What?

Closes

Adds a new C2pa_Monitor experiment that:

  • Detects C2PA segments via streaming magic-byte/segment walks: JPEG APP11/JUMBF, PNG caBX, WebP RIFF C2PA. Hard byte caps throughout.
  • Reassembles JPEG manifests fragmented across multiple APP11 markers by tracking Box Instance Numbers (per ISO 19566-5). Continuation segments do not need the c2pa/jumb token in their first 64 bytes.
  • Streams raw manifest bytes to disk under uploads/ai-c2pa/<attachment_id>.<format>.c2pa, hashing in flight (SHA-256). Postmeta stores the hash, length, and relative sidecar path — not the bytes themselves.
  • Creates the sidecar directory on demand with .htaccess (Apache deny) and index.php hardening. nginx operators must add a deny rule manually (documented in the experiment README).
  • Wraps the entire capture path in a fail-open try / catch ( Throwable ) boundary: errors land in the record's errors[] array and the upload itself is never blocked.
  • Adds no external dependencies, no Composer additions, and no outbound HTTP. Pure PHP, compatible with the plugin's PHP 7.4 floor.
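The JPEG detection and reassembly bullets above can be sketched roughly as follows. This is an illustrative Python sketch, not the PR's PHP `Format_Detector`: the function names and the `MAX_SEGMENTS` value are hypothetical, and it assumes the APP11 layout from ISO 19566-5 (a 2-byte `JP` common identifier, a 2-byte Box Instance Number, then a 4-byte packet sequence number ahead of the box data).

```python
import io
import struct

APP11 = 0xEB          # JPEG APP11 marker (byte following 0xFF)
SOS = 0xDA            # start of scan: entropy-coded data follows, stop walking
EOI = 0xD9            # end of image
MAX_SEGMENTS = 1024   # hypothetical cap, not the PR's JPEG_MAX_SEGMENTS value


def find_app11_jumbf(stream, max_segments=MAX_SEGMENTS):
    """Return [(payload_offset, payload_length, box_instance, sequence)] for
    APP11 segments that carry the 'JP' common identifier."""
    if stream.read(2) != b"\xff\xd8":           # SOI magic bytes
        return []
    found = []
    for _ in range(max_segments):
        header = stream.read(4)
        if len(header) < 4 or header[0] != 0xFF:
            break                               # truncated or malformed: stop
        marker = header[1]
        if marker in (SOS, EOI):
            break
        (seg_len,) = struct.unpack(">H", header[2:4])
        if seg_len < 2:
            break
        body_start = stream.tell()
        body_len = seg_len - 2                  # the length field counts itself
        if marker == APP11 and body_len >= 8:
            prefix = stream.read(8)             # CI(2) + En(2) + Z(4)
            if len(prefix) == 8 and prefix[:2] == b"JP":
                en, z = struct.unpack(">HI", prefix[2:8])
                found.append((body_start + 8, body_len - 8, en, z))
        stream.seek(body_start + body_len)      # skip the payload unread
    return found


def group_by_instance(segments):
    """Group segments by Box Instance Number, ordered by sequence number."""
    groups = {}
    for offset, length, en, z in segments:
        groups.setdefault(en, []).append((z, offset, length))
    return {en: [(o, n) for _, o, n in sorted(parts)]
            for en, parts in groups.items()}
```

Grouping on the Box Instance Number rather than on a content token is what lets continuation segments be matched even when they carry no `c2pa`/`jumb` string of their own.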

Why?

C2PA Content Credentials are increasingly embedded in images uploaded to WordPress sites (camera firmware, AI image generators, editorial signing tools), but WordPress's image processing pipeline destroys them — APP11 segments don't survive GD/Imagick re-encoding. Capturing the manifest at add_attachment, before subsize generation, is the only point in the WordPress lifecycle where the original bytes are still intact.

This PR establishes that capture as read-only infrastructure: nothing user-facing changes, but the manifest data becomes available to downstream consumers (admin UI, REST endpoints, verification tooling) without each of them having to re-parse containers.

How?

Implementation lives entirely under includes/Experiments/C2pa_Monitor/ with one entry point:

  • C2pa_Monitor::capture_for_attachment() is hooked to add_attachment at priority 20, gated by MIME on image/jpeg, image/png, and image/webp.
  • Format_Detector walks containers and returns a location descriptor (segments + total length) without reading payload bytes.
  • Manifest_Reader consumes that descriptor, streams the bytes via fread into a hash context and an in-memory buffer, and produces an immutable Raw_Manifest value object.
  • Sidecar_Writer persists the bytes, ensuring the directory and hardening files exist exactly once.
  • Record normalizes the structured payload and persists it as JSON-encoded postmeta at _wpai_monitor_record.
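The read step described above (consume a location descriptor, stream the bytes into a hash context and a buffer, enforce a byte cap) might look like the sketch below. It is written in Python for illustration rather than taken from the PR's PHP `Manifest_Reader`; the function name, the error messages, and the 16 MiB value for `MAX_MANIFEST_BYTES` are assumptions.

```python
import hashlib
import io

MAX_MANIFEST_BYTES = 16 * 1024 * 1024  # hypothetical cap; the PR's value is not stated
CHUNK = 8192


def read_manifest(stream, segments, max_bytes=MAX_MANIFEST_BYTES):
    """Stream (offset, length) ranges into a SHA-256 context and a buffer.

    Returns (sha256_hex, data). Raises ValueError on an empty descriptor,
    an oversize manifest, a bad offset, or a short read.
    """
    if any(offset < 0 or length < 0 for offset, length in segments):
        raise ValueError("bad offset")
    total = sum(length for _, length in segments)
    if total == 0:
        raise ValueError("empty segment list")
    if total > max_bytes:
        raise ValueError("manifest exceeds byte cap")
    digest = hashlib.sha256()
    buffer = bytearray()
    for offset, length in segments:
        stream.seek(offset)
        remaining = length
        while remaining > 0:
            chunk = stream.read(min(CHUNK, remaining))
            if not chunk:
                raise ValueError("segment extends past end of file")
            digest.update(chunk)            # hash in flight, chunk by chunk
            buffer.extend(chunk)
            remaining -= len(chunk)
    return digest.hexdigest(), bytes(buffer)
```

Checking the declared total against the cap before any read means an oversized or hostile descriptor is rejected without touching the payload.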

Reviewing this PR: the diff is large but cleanly partitioned. The work was developed in five layers, and the integration branch's commits are organized to match:

  1. Register the experiment (scaffolding, no behavior).
  2. Define the record schema (Raw_Manifest, Record, tests).
  3. Detect image formats (Format_Detector, fixtures, tests).
  4. Read and persist manifests (Manifest_Reader, Sidecar_Writer, tests).
  5. Wire the intake hook (register() body, capture_for_attachment, end-to-end tests).

Each layer is independently coherent if you'd rather review incrementally; otherwise, the integrated diff stands on its own.

Use of AI Tools

AI assistance: Yes
Tool(s): Cursor, Claude
Model(s):
Used for: Architecture and PR-segmentation planning, initial code drafts, test scaffolding. Final implementation, all design decisions, and the byte-level format work were reviewed and edited by me.

Testing Instructions

  1. Enable the AI plugin globally and toggle on C2PA Monitor under experiments.
  2. Upload a JPEG with embedded C2PA credentials (e.g., the Adobe or Truepic samples from the C2PA conformance set; c2patool can also generate test files).
  3. Inspect the attachment's postmeta for _wpai_monitor_record — it should be JSON with c2pa.present: true, a SHA-256 hash, and a sidecar_path_relative value.
  4. Confirm the sidecar file exists at wp-content/uploads/ai-c2pa/<attachment_id>.jpeg.c2pa and that its bytes match manifest_sha256.
  5. Upload a JPEG without C2PA — the postmeta should record c2pa.present: false and no sidecar should be written.
  6. Upload a non-image (e.g., a .txt file) — no postmeta should be written at all.
  7. Disable the experiment and re-upload — no postmeta should be written.
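Steps 3 and 4 can be checked with a small script once the postmeta JSON has been exported (for example via WP-CLI). The Python sketch below is only an assumption about the record layout: it guesses that `manifest_sha256` and `sidecar_path_relative` sit under the `c2pa` object and that the path is relative to the uploads directory, so adjust the keys to the real schema. `verify_sidecar` is a hypothetical helper, not part of the PR.

```python
import hashlib
import json
from pathlib import Path


def verify_sidecar(record_json, uploads_dir):
    """Check a sidecar file against the hash stored in the monitor record.

    Returns None when the record says no manifest was captured, otherwise
    True or False depending on whether the SHA-256 values match.
    """
    record = json.loads(record_json)
    c2pa = record.get("c2pa", {})            # assumed location of the fields
    if not c2pa.get("present"):
        return None                          # nothing captured, nothing to verify
    sidecar = Path(uploads_dir) / c2pa["sidecar_path_relative"]
    digest = hashlib.sha256()
    with sidecar.open("rb") as handle:
        # Hash in 64 KiB chunks so large sidecars are not loaded whole.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == c2pa["manifest_sha256"]
```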

Automated coverage:

  • Format_Detector: magic bytes, single-segment APP11, multi-segment reassembly, interleaved APP0/APP1/APP2 around C2PA, generic JUMBF (non-C2PA) ignored, truncated input, JPEG_MAX_SEGMENTS cap (positive and negative), PNG/caBX, WebP simple + extended (VP8X) + odd-length padding.
  • Manifest_Reader: byte-exact roundtrip for JPEG/PNG/WebP, multi-segment reassembly, deterministic SHA-256, MAX_MANIFEST_BYTES rejection, missing file, empty segments, bad offsets.
  • Sidecar_Writer: write + roundtrip, hardening files, format sanitization, overwrite, multi-attachment coexistence, custom .htaccess preserved across ensure_dir().
  • Record: roundtrip, defaults on empty input, JSON-not-serialize storage format, null on corrupt JSON, null when absent.
  • C2pa_Monitor end-to-end: JPEG/PNG/WebP present, JPEG absent, unsupported MIME, fail-open on bogus ID, truncated JPEG, duration_ms recorded, add_attachment hook actually fires, file-deleted-on-disk produces errors[0].stage = 'resolve_path'.
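The WebP odd-length padding case listed for Format_Detector exists because RIFF chunks are padded to an even length, so a walker that ignores the pad byte lands one byte off on the next chunk header. A minimal Python sketch of that walk, with a hypothetical function name and chunk cap, assuming the manifest sits in a chunk whose FourCC is `C2PA`:

```python
import io
import struct


def find_webp_chunk(stream, fourcc=b"C2PA", max_chunks=256):
    """Return (payload_offset, payload_length) of the first matching chunk, or None."""
    header = stream.read(12)
    if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WEBP":
        return None
    (riff_size,) = struct.unpack("<I", header[4:8])
    end = 8 + riff_size                      # RIFF size excludes the first 8 bytes
    for _ in range(max_chunks):              # hard cap on chunks walked
        pos = stream.tell()
        if pos + 8 > end:
            break
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            break                            # truncated file
        (size,) = struct.unpack("<I", chunk_header[4:8])
        if chunk_header[:4] == fourcc:
            return (pos + 8, size)
        # Skip the payload plus one pad byte when the size is odd.
        stream.seek(pos + 8 + size + (size & 1))
    return None
```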

Synthetic fixtures are generated at runtime so no binary blobs land in the repo and there is no third-party fixture licensing question.
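As a rough idea of what a runtime-generated fixture involves, the Python sketch below builds a PNG chunk sequence with a `caBX` chunk and then locates it again. It is not the PR's fixture code: the helper names are hypothetical, and the result is structurally valid but has no image data, so it is not a renderable image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(chunk_type, data):
    """Encode one chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def synthetic_png_with_cabx(manifest):
    """Build signature + IHDR + caBX + IEND with no IDAT (not renderable)."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)   # 1x1, 8-bit greyscale
    return (PNG_SIGNATURE + png_chunk(b"IHDR", ihdr)
            + png_chunk(b"caBX", manifest) + png_chunk(b"IEND", b""))


def find_cabx(data):
    """Return (payload_offset, payload_length) of the caBX chunk, or None."""
    if not data.startswith(PNG_SIGNATURE):
        return None
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        chunk_type = data[pos + 4:pos + 8]
        if chunk_type == b"caBX":
            return (pos + 8, length)
        if chunk_type == b"IEND":
            break
        pos += 12 + length                   # length + type + data + CRC
    return None
```

Because the bytes are produced on demand from a few `struct.pack` calls, a test can vary the manifest size or chunk order without any binary file in the repository.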

Deferred (out of scope for this PR)

  • JUMBF box reader and CBOR decoder; populating c2pa.decoded claim summary (claim generator, digital source type, action history).
  • Admin UI, media library badge, CR icon — gated on C2PA conformance.
  • Cryptographic verification of manifests.
  • Preserving manifests through WordPress's GD/Imagick subsize pipeline.

Screenshots or screencast

No UI changes in this PR.

Changelog Entry

Added - New Experiment: C2PA Monitor — read-only detection of [C2PA Content Credentials](https://c2pa.org/) in uploaded JPEG/PNG/WebP images. Captures the raw manifest store to a sidecar file under wp-content/uploads/ai-c2pa/ and stores a structured _wpai_monitor_record postmeta entry for downstream consumers. Fail-open and never blocks an upload. JUMBF/CBOR claim decoding deferred to a follow-up PR.

codecov Bot commented Apr 22, 2026

Codecov Report

❌ Patch coverage is 81.68103% with 85 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.25%. Comparing base (1a55305) to head (eafd662).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...ludes/Experiments/C2pa_Monitor/Format_Detector.php | 79.67% | 38 Missing ⚠️ |
| includes/Experiments/C2pa_Monitor/C2pa_Monitor.php | 78.44% | 25 Missing ⚠️ |
| ...ludes/Experiments/C2pa_Monitor/Manifest_Reader.php | 80.35% | 11 Missing ⚠️ |
| ...cludes/Experiments/C2pa_Monitor/Sidecar_Writer.php | 86.95% | 6 Missing ⚠️ |
| includes/Experiments/C2pa_Monitor/Record.php | 92.15% | 4 Missing ⚠️ |
| includes/Experiments/C2pa_Monitor/Raw_Manifest.php | 87.50% | 1 Missing ⚠️ |

Additional details and impacted files
@@              Coverage Diff              @@
##             develop     #459      +/-   ##
=============================================
+ Coverage      69.07%   70.25%   +1.17%     
- Complexity       957     1129     +172     
=============================================
  Files             60       66       +6     
  Lines           4511     4975     +464     
=============================================
+ Hits            3116     3495     +379     
- Misses          1395     1480      +85     

| Flag | Coverage Δ |
| --- | --- |
| unit | 70.25% <81.68%> (+1.17%) ⬆️ |

@jeffpaul jeffpaul added this to the 1.0.0 milestone Apr 22, 2026
@lnispel lnispel closed this Apr 26, 2026
@lnispel lnispel force-pushed the feature/c2pa-monitor branch from ec97690 to 43d3bde April 26, 2026 20:06
@github-project-automation github-project-automation Bot moved this from In progress to Done in WordPress AI Planning & Roadmap Apr 26, 2026
lnispel added 5 commits April 26, 2026 13:06
… tests

- JSON postmeta contract; @see DIF wpai-monitor-record schema
- Partial README (postmeta + constraints)

Made-with: Cursor

- Format_Detector for JPEG/PNG/WebP C2PA segments
- Synthetic fixtures and Format_DetectorTest

Made-with: Cursor

- Streaming Manifest_Reader and Sidecar_Writer
- README: sidecar layout, rationale, test fixtures

Made-with: Cursor

- capture_for_attachment, sidecar, Record persistence
- C2pa_MonitorTest; README: full flow, DIF schema cross-links, out of scope

Made-with: Cursor

lnispel added a commit to OpenVerifiable/ai that referenced this pull request Apr 26, 2026
@lnispel lnispel reopened this Apr 26, 2026
@lnispel lnispel force-pushed the feature/c2pa-monitor branch 2 times, most recently from 1e33baa to 8257f84 April 26, 2026 20:59
@lnispel lnispel marked this pull request as ready for review April 26, 2026 21:12
@github-actions

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Unlinked Accounts

The following contributors have not linked their GitHub and WordPress.org accounts: @lnispel.

Contributors, please read how to link your accounts to ensure your work is properly credited in WordPress releases.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Unlinked contributors: lnispel.


To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

Synthetic fixtures are not valid renderable images. GD's WebP codec
fatals when create_upload_object triggers wp_generate_attachment_metadata.
Suppress intermediate_image_sizes_advanced in setUp/tearDown so GD is
never invoked on fixture bytes.

Made-with: Cursor