Wpcomsh fatal-error: emit a transportable plugin/version/core/php signature #48369

Merged: taipeicoder merged 6 commits into trunk from add/wpcom-fatal-error-signature on Apr 29, 2026

@taipeicoder taipeicoder commented Apr 29, 2026

Proposed changes

  • Add wpcom_build_fatal_error_signature() / wpcom_decode_fatal_error_signature() to jetpack-mu-wpcom (src/common/fatal-error-signature.php): a shared helper that produces a base64url-encoded JSON token over {kind, slug, version, wp, php}. The token is fully reversible, so consumers can group on the decoded parts. The helper is sized for reuse by other mu-plugins (e.g. the activation-probe failure path in #48261, "Plugin Conflicts Guardian: Pre-flight activation gate") so they can correlate on the same encoded token.
  • Wire wpcomsh's fatal-error screen to log the signature to logstash via WPCOMSH_Log::unsafe_direct_log(). Telemetry runs independent of viewer (admin or anonymous) so dashboards aren't biased toward admin-only sites.
  • Two-layer dedup so a persistent fatal doesn't emit one record + one outbound HTTP per visitor:
    • Coarse file-path gate (anonymous viewers only): skips plugin identification entirely on duplicate hits — wp_cache_add( 'wpcomsh_fatal_file:' . sha256($error['file']), …, 5 min ). Identification (glob + get_plugin_data) only runs on the first anonymous request per file/5-min, then no-ops the rest.
    • Fine signature gate (all viewers): skips the WPCOMSH_Log file load + outbound HTTP — wp_cache_add( 'wpcomsh_fatal_sig:' . sha256($signature), …, 5 min ). Belt-and-suspenders for the case where the file-path gate doesn't catch (e.g. same plugin loading from multiple paths).
  • Log the decoded parts (kind, slug, extension_version, wp_version, php_version) as separate fields alongside the encoded signature, so dashboards can term-aggregate without a base64+JSON decode step.
  • Defensive class_exists( 'WPCOMSH_Log', false ) + on-demand require_once from WPCOMSH__PLUGIN_DIR_PATH . '/jetpack_vendor/automattic/jetpack-mu-wpcom/...' (the wpcomsh convention used by lib/tonesque.php and lib/class.color.php), gated on defined( 'WPCOMSH__PLUGIN_DIR_PATH' ) for the bootstrap window before constants.php is loaded.
  • The fatal-error screen and recovery email UI are unchanged — the signature is purely server-side telemetry.
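
A minimal sketch of the token round-trip described in the first bullet, for readers who want to see the shape of a base64url-encoded JSON signature. The function names sketch_build_signature() and sketch_decode_signature() are hypothetical stand-ins, not the helpers added by this PR, and the real implementation in fatal-error-signature.php may validate or order fields differently.

```php
<?php
// Hypothetical stand-ins for the helper pair; illustration only.

/** Build a base64url-encoded JSON token over the five signature parts. */
function sketch_build_signature( string $kind, string $slug, string $version, string $wp, string $php ): string {
	$json = (string) json_encode(
		array(
			'kind'    => $kind,
			'slug'    => strtolower( trim( $slug ) ),
			'version' => $version,
			'wp'      => $wp,
			'php'     => $php,
		)
	);
	// base64url: swap the two URL-unsafe characters and drop the padding.
	return rtrim( strtr( base64_encode( $json ), '+/', '-_' ), '=' );
}

/** Reverse the token; returns null for anything that does not decode to an array. */
function sketch_decode_signature( string $token ): ?array {
	$b64  = strtr( $token, '-_', '+/' );
	$b64 .= str_repeat( '=', ( 4 - strlen( $b64 ) % 4 ) % 4 ); // Restore padding.
	$raw  = base64_decode( $b64, true );
	if ( false === $raw ) {
		return null;
	}
	$parts = json_decode( $raw, true );
	return is_array( $parts ) ? $parts : null;
}
```

Because the token is reversible rather than hashed, a consumer that only has the encoded value can still recover the slug or version for grouping.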

Related product discussion/links

  • The shared helper is sized for reuse by the in-flight Plugin Conflicts Guardian work in #48261 ("Plugin Conflicts Guardian: Pre-flight activation gate"), which can call wpcom_build_fatal_error_signature() directly from its activation-probe failure path. The encoded signature field stays in the wpcomsh logstash payload so cross-system consumers can join on the same opaque token.

Does this pull request change what data or activity we track or use?

Yes. On the first fatal per (site, signature, 5-min) where wpcomsh can identify the offending extension, a new logstash record is queued via the existing WPCOMSH_Log pipeline.

Signature contents (the new fields this PR adds to the logstash record):

  • extra.signature — base64url-encoded JSON token. Decoded JSON contains:
    • kind: plugin, muplugin, or theme
    • slug: extension slug, lowercased and trimmed
    • version: extension version string
    • wp: $wp_version
    • php: PHP_MAJOR_VERSION.PHP_MINOR_VERSION.PHP_RELEASE_VERSION (normalized so distro suffixes don't fragment grouping)
  • extra.kind, extra.slug, extra.extension_version, extra.wp_version, extra.php_version — the same parts emitted as separate fields for term-aggregation.
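
To illustrate why the php part is normalized, here is a small sketch. The PR builds the value from the three PHP constants. The string-parsing function sketch_normalize_php_version() is a hypothetical addition, included only to show what a distro-suffixed version string would collapse to.

```php
<?php
/** Reduce a raw PHP version string to MAJOR.MINOR.PATCH; null if it does not parse. */
function sketch_normalize_php_version( string $raw ): ?string {
	if ( 1 !== preg_match( '/^(\d+)\.(\d+)\.(\d+)/', $raw, $m ) ) {
		return null;
	}
	return "{$m[1]}.{$m[2]}.{$m[3]}";
}

// The constants yield the same three components without any parsing.
$from_constants = PHP_MAJOR_VERSION . '.' . PHP_MINOR_VERSION . '.' . PHP_RELEASE_VERSION;
```

Without this step, two hosts running the same PHP release under different package suffixes would produce different tokens for the same fatal.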

Logstash record envelope (added by WPCOMSH_Log::send_to_api(), applies to every wpcomsh telemetry record):

  • Top-level siteurl — the result of get_site_url(). Consistent with the existing safeguard / woa / marketplace / atomic-storage telemetry baseline; no new exposure beyond that baseline.

What we deliberately do not include: error message, stack trace, file paths, user id, request URI.

Coverage caveat: wpcomsh_fatal_identify_plugin() only resolves directory-based plugins/themes (e.g. wp-content/plugins/akismet/akismet.php). Single-file extensions (e.g. wp-content/mu-plugins/foo.php) aren't identified and therefore produce no signature record. Pre-existing limitation of the identification helper, not introduced by this PR.
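
The caveat can be pictured with a short sketch. It assumes identification keys off the first directory segment under the plugins or mu-plugins directory. sketch_extension_dir() is a hypothetical helper and does not reproduce wpcomsh_fatal_identify_plugin(), which also uses glob and get_plugin_data.

```php
<?php
/**
 * Return the extension directory name for a file under $base_dir, or null
 * when the file sits directly in $base_dir (a single-file extension) or
 * lies outside it. Hypothetical helper for illustration.
 */
function sketch_extension_dir( string $file, string $base_dir ): ?string {
	$base = rtrim( str_replace( '\\', '/', $base_dir ), '/' ) . '/';
	$file = str_replace( '\\', '/', $file );
	if ( 0 !== strpos( $file, $base ) ) {
		return null; // Not under the base directory at all.
	}
	$segments = explode( '/', substr( $file, strlen( $base ) ) );
	// A single segment means a file such as "foo.php" directly in the base directory.
	return ( count( $segments ) > 1 && '' !== $segments[0] ) ? $segments[0] : null;
}
```

Under this assumption, wp-content/plugins/akismet/akismet.php resolves to akismet, while wp-content/mu-plugins/foo.php resolves to nothing and so yields no signature record.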

Testing instructions

  • On an Atomic test site, trigger a fatal from a directory-based plugin or mu-plugin (e.g. drop wp-content/mu-plugins/boom/boom.php containing trigger_error( 'boom', E_USER_ERROR ) on init).
  • Confirm the WordPress.com fatal-error screen still renders the same way for admins and anonymous viewers.
  • In Kibana (log2logstash index, filter tags:atomic_wpcomsh_errors AND extra.messages.message:"wpcomsh_fatal_signature", or just free-text "wpcomsh_fatal_signature"), confirm a record arrives shortly after the request. The wpcom receiver wraps it: top-level message is [remote_error] [blog: …] [transfer: …], our payload sits at extra.messages[0].message with the decoded parts under extra.messages[0].extra.* (signature, kind, slug, extension_version, wp_version, php_version).
  • Verify dedup: hit the failing site multiple times in quick succession (anonymous + logged-in admin). Confirm only one logstash record is emitted per signature within the 5-minute TTL window.
  • Decode the signature locally to verify the round-trip:
    var_dump( wpcom_decode_fatal_error_signature( '<token from logstash>' ) );
    Verify the decoded parts match the offending plugin and the site's WP / PHP versions, and agree with the separate extra.* fields.
  • Trigger a fatal with no identifiable file path (e.g. an out-of-memory in core) and confirm no wpcomsh_fatal_signature record is emitted.

…nature

Add `wpcom_build_fatal_error_signature()` /
`wpcom_decode_fatal_error_signature()` to jetpack-mu-wpcom as a shared
helper, then wire wpcomsh's fatal-error screen to compute the signature
once per fatal, fire `do_action( 'wpcomsh_fatal_signature', ... )`, and
ship it to logstash via `WPCOMSH_Log::unsafe_direct_log()` — the
in-repo precedent already used by safeguard, woa, marketplace, and the
Atomic storage provider.

The signature is PII-free: only the kind, lowercased extension slug,
extension version, WordPress version, and PHP version (normalized to
MAJOR.MINOR.PATCH so distro suffixes don't fragment the grouping). It
travels as a single base64url-encoded JSON token, fully decodable by
consumers so they can group on parts.

Screen and recovery email rendering are unchanged — the signature is
purely server-side telemetry.

github-actions Bot commented Apr 29, 2026

Are you an Automattician? Please test your changes on all WordPress.com environments to help mitigate accidental explosions.

  • To test on WoA, go to the Plugins menu on a WoA dev site. Click on the "Upload" button and follow the upgrade flow to be able to upload, install, and activate the Jetpack Beta plugin. Once the plugin is active, go to Jetpack > Jetpack Beta, select your plugin (WordPress.com Site Helper), and enable the add/wpcom-fatal-error-signature branch.
  • To test on Simple, run the following command on your sandbox:
bin/jetpack-downloader test jetpack-mu-wpcom-plugin add/wpcom-fatal-error-signature

Interested in more tips and information?

  • In your local development environment, use the jetpack rsync command to sync your changes to a WoA dev blog.
  • Read more about our development workflow here: PCYsg-eg0-p2
  • Figure out when your changes will be shipped to customers here: PCYsg-eg5-p2


github-actions Bot commented Apr 29, 2026

Thank you for your PR!

When contributing to Jetpack, we have a few suggestions that can help us test and review your patch:

  • ✅ Include a description of your PR changes.
  • ✅ Add a "[Status]" label (In Progress, Needs Review, ...).
  • ✅ Add testing instructions.
  • ✅ Specify whether this PR includes any changes to data or privacy.
  • ✅ Add changelog entries to affected projects

This comment will be updated as you work on your PR and make changes. If you think that some of those checks are not needed for your PR, please explain why you think so. Thanks for cooperation 🤖


Follow this PR Review Process:

  1. Ensure all required checks appearing at the bottom of this PR are passing.
  2. Make sure to test your changes on all platforms that it applies to. You're responsible for the quality of the code you ship.
  3. You can use GitHub's Reviewers functionality to request a review.
  4. When it's reviewed and merged, you will be pinged in Slack to deploy the changes to WordPress.com simple once the build is done.

If you have questions about anything, reach out in #jetpack-developers for guidance!


Wpcomsh plugin:

  • Next scheduled release: Atomic deploys happen twice daily on weekdays (p9o2xV-2EN-p2)

If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack.

@github-actions github-actions Bot added the "[Status] Needs Author Reply" label Apr 29, 2026

jp-launch-control Bot commented Apr 29, 2026

Code Coverage Summary

Coverage changed in 1 file.

File | Coverage | Δ% | Δ Uncovered
projects/packages/jetpack-mu-wpcom/src/class-jetpack-mu-wpcom.php | 2/352 (0.57%) | -0.00% | 1 ❤️‍🩹

1 file is newly checked for coverage.

File | Coverage
projects/packages/jetpack-mu-wpcom/src/common/fatal-error-signature.php | 0/52 (0.00%) 💔

Full summary · PHP report

Coverage check overridden by the "I don't care about code coverage for this PR" label.

Collapse the action + default listener pair into a direct call from
the screen filter. The action existed for hypothetical third-party
listeners; none exist. Other mu-plugins that want the same signature
shape can call `wpcom_build_fatal_error_signature()` directly from
their own fatal-detection paths (e.g. #48261's activation probe) — the
shared helper in jetpack-mu-wpcom is the correct extension point, not
a parallel wpcomsh-specific hook.

Also log the decoded parts (kind, slug, extension_version, wp_version,
php_version) alongside the encoded signature so Kibana queries can
term-aggregate without an ingest-time base64+JSON decode step. Use
`class_exists( 'WPCOMSH_Log', false )` to skip autoloader I/O during
fatal handling.

Net: -1 file, ~150 fewer lines, no public API surface for nonexistent
consumers. Helper in jetpack-mu-wpcom is unchanged and remains
available for cross-mu-plugin reuse.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@taipeicoder taipeicoder self-assigned this Apr 29, 2026
taipeicoder and others added 4 commits April 29, 2026 12:17
A persistent fatal on a high-traffic site would otherwise emit one
logstash row + one outbound wp_remote_post() per visitor, since the
filter fires for every rendered fatal and unsafe_direct_log() always
schedules a shutdown send. Gate on wp_cache_add() with a 5-min TTL
keyed by sha256(signature), placed before the WPCOMSH_Log require so
dedup hits skip the file load. Throwable-catch around the cache call
fails open so a broken cache never silences telemetry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
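
The signature gate in the commit message above can be sketched as follows. Sketch_Cache is an in-memory stand-in for an add-if-absent cache such as wp_cache_add(), and sketch_should_log() is a hypothetical name. The key prefix, sha256 hashing, and 5-minute TTL come from the PR description. Everything else is an assumption.

```php
<?php
/** In-memory stand-in for an add-if-absent cache with a TTL. */
final class Sketch_Cache {
	/** @var array<string,int> Expiry timestamp per key. */
	private array $expires = array();

	/** Store the key and return true, or return false if it is still live. */
	public function add( string $key, int $ttl, ?int $now = null ): bool {
		$now = $now ?? time();
		if ( isset( $this->expires[ $key ] ) && $this->expires[ $key ] > $now ) {
			return false; // Key is still live: this hit is a duplicate.
		}
		$this->expires[ $key ] = $now + $ttl;
		return true;
	}
}

/** True when the caller should emit a record for this signature. */
function sketch_should_log( Sketch_Cache $cache, string $signature, ?int $now = null ): bool {
	try {
		return $cache->add( 'wpcomsh_fatal_sig:' . hash( 'sha256', $signature ), 300, $now );
	} catch ( \Throwable $e ) {
		return true; // Fail open: a broken cache must not silence telemetry.
	}
}
```

Running the check before any logger file is loaded is what lets a duplicate hit skip both the require and the outbound request.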
…s hits

Identification (glob + get_plugin_data) was running for every visitor
of a persistent fatal even though the anonymous render path doesn't
display plugin info — its only use on that path is to feed the
signature logger. Hoist user_id / is_admin resolution into the screen
filter, identify unconditionally for admins (the rendered notice needs
$plugin), and gate identify+log behind a coarse $error['file'] cache
key for anonymous viewers. Thread the resolved user_id and is_admin
into wpcomsh_fatal_build_render_context so they're not recomputed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
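
A rough sketch of the viewer-dependent gating this commit describes. The $seen array stands in for the shared cache (the TTL is omitted for brevity), and $identify is any callable that resolves a file to plugin data. Both, along with the function name, are hypothetical.

```php
<?php
/**
 * Identify the offending extension, skipping the work for duplicate
 * anonymous hits on the same file. Returns null when identification is skipped.
 */
function sketch_handle_fatal( string $file, bool $is_admin, array &$seen, callable $identify ): ?array {
	if ( ! $is_admin ) {
		$key = 'wpcomsh_fatal_file:' . hash( 'sha256', $file );
		if ( isset( $seen[ $key ] ) ) {
			return null; // Duplicate anonymous hit: skip identification entirely.
		}
		$seen[ $key ] = true;
	}
	// Admins always identify, because the rendered notice needs the plugin.
	return $identify( $file );
}
```

The admin path bypasses the coarse gate, so repeated admin views are deduplicated only by the finer signature gate at logging time.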
Drop verbose framing and references outside wpcomsh / jetpack
(log2logstash, RFC 4648, libsodium-seal aside, fix #31284,
memcached/Atomic detail, Error_Handler precedent). No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The relative require for the signature helper used vendor/, but the
released wpcomsh has the package at jetpack_vendor/. Switch both
fallback requires (signature helper and WPCOMSH_Log) to the
WPCOMSH__PLUGIN_DIR_PATH + jetpack_vendor convention used elsewhere
in wpcomsh (lib/tonesque.php, lib/class.color.php), and gate on
defined() since wpcom-fatal-error/load.php is required before
constants.php.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
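
The guarded fallback require can be sketched like this. sketch_ensure_class() is a hypothetical helper. The caller supplies the relative path, since the PR description elides the real one and none is assumed here.

```php
<?php
/**
 * Load a class file on demand without touching autoloaders.
 * Returns true when the class is available afterwards.
 */
function sketch_ensure_class( string $class, string $dir_constant, string $relative_path ): bool {
	if ( class_exists( $class, false ) ) {
		return true; // Already loaded; passing false skips autoloader I/O.
	}
	if ( ! defined( $dir_constant ) ) {
		return false; // Bootstrap window: the directory constant is not set yet.
	}
	$path = rtrim( (string) constant( $dir_constant ), '/' ) . '/' . ltrim( $relative_path, '/' );
	if ( ! is_readable( $path ) ) {
		return false;
	}
	require_once $path;
	return class_exists( $class, false );
}
```

Returning false instead of throwing keeps the fatal-error screen rendering even when the logger cannot be loaded.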
@taipeicoder taipeicoder added the "I don't care about code coverage for this PR" label Apr 29, 2026
@taipeicoder taipeicoder marked this pull request as ready for review April 29, 2026 05:12
@taipeicoder taipeicoder added the "[Status] Needs Review" label and removed the "[Status] Needs Author Reply" and "[Status] In Progress" labels Apr 29, 2026
@taipeicoder taipeicoder merged commit 97409c1 into trunk Apr 29, 2026
76 checks passed
@taipeicoder taipeicoder deleted the add/wpcom-fatal-error-signature branch April 29, 2026 06:33
@github-actions github-actions Bot removed the "[Status] Needs Review" label Apr 29, 2026
taipeicoder added a commit that referenced this pull request Apr 30, 2026
…c_extension_conflict

Follow-up to #48369. Adds a sibling `WPCOMSH_Log::unsafe_direct_log_logstash( $feature, $message, $options = [] )` that POSTs to /rest/v1.1/logstash with caller-supplied `properties` (indexed under `properties.*` in Kibana for filter/sort/aggregate), `severity`, and `extra` (unstructured context). Switches the fatal-error signature emitter to it under feature `atomic_extension_conflict` (severity `critical`), so these records get their own Kibana bucket — and the decoded parts (`signature`, `kind`, `slug`, `extension_version`, `wp_version`, `php_version`) land under `properties.*` rather than the noisier nested `extra.*` path used by the parent PR's /automated-transfers/log envelope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
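
A sketch of how the $options argument for the sibling logger might be assembled. The keys properties, severity, and extra and the field names under properties come from the commit message above. The function name sketch_conflict_log_options() and the exact mapping from decoded parts are assumptions.

```php
<?php
/**
 * Assemble a hypothetical $options array: decoded parts go under
 * 'properties' so they can be filtered and aggregated, while free-form
 * context goes under 'extra'.
 */
function sketch_conflict_log_options( string $signature, array $parts, array $extra = array() ): array {
	return array(
		'severity'   => 'critical',
		'properties' => array(
			'signature'         => $signature,
			'kind'              => (string) ( $parts['kind'] ?? '' ),
			'slug'              => (string) ( $parts['slug'] ?? '' ),
			'extension_version' => (string) ( $parts['version'] ?? '' ),
			'wp_version'        => (string) ( $parts['wp'] ?? '' ),
			'php_version'       => (string) ( $parts['php'] ?? '' ),
		),
		'extra'      => $extra,
	);
}

// Assumed call shape, based on the signature quoted in the commit message:
// WPCOMSH_Log::unsafe_direct_log_logstash( 'atomic_extension_conflict', $message, $options );
```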
arthur791004 added a commit that referenced this pull request May 4, 2026
* wpcomsh recovery-mode sync: include per-extension error info

Follow-up to #48213. The state snapshot now also carries an extracted
view of the live *_paused_extensions option, so wpcom-side consumers
(Calypso) can surface what fataled instead of just that something
fataled.

Each record carries kind/slug/version + errno/message/file/line plus
the transportable signature token from #48369, so a fatal seen via
the recovery email and via the wpcomsh fatal-error screen can be
joined on the same opaque token. file is reduced to its basename so
server paths don't leak.

Reading from the live option on every snapshot (instead of stashing
errors in our own option, or threading them through one capture path)
means every POST — email / session-start / session-end — emits a
complete state, and session-end naturally shows errors=[] without any
explicit clear step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phan: widen \$payload type to array<string,mixed>

The new recovery_session_errors field is an array of records, so the
existing array<string,int> phpdoc no longer fits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Drop signature from recovery-mode-sync error records

The flat fields (kind/slug/version/errno/message/file/line) cover the
Calypso display use case. The signature was for cross-surface
analytics joining (recovery email vs. fatal-error screen logstash),
which has no consumer yet. We can re-add when one materializes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Drop signature mention from changelog entry

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Capture error_get_last() at email-send time

So the fatal-request POST already carries the error info, instead of
waiting for the admin to click the recovery email link.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Recovery sync: match core's slug shape in resolve_extension_for_file

Use the first path segment under WP_PLUGIN_DIR as the plugin slug — the
same value WP_Recovery_Mode::get_extension_for_error() produces and the
key WP itself uses inside *_paused_extensions. Previously we returned
the main-file path (e.g. akismet/akismet.php), which would not match the
slug stored once a session is created for the same fatal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI pushed a commit to dognose24/jetpack that referenced this pull request May 4, 2026
…tic#48440)

Co-authored-by: dognose24 <6869813+dognose24@users.noreply.github.com>
Labels

I don't care about code coverage for this PR Use this label to ignore the check for insufficient code coveage. [Package] Jetpack mu wpcom WordPress.com Features [Plugin] Wpcomsh
