Prevent unnecessary outbox activities from third-party plugin post updates #2927

@pfefferle

Problem

Third-party plugins (editorial calendars, social schedulers, content converters, hosting sync tools) frequently call wp_update_post() on published posts without meaningfully changing the content. Each call fires wp_after_insert_post, which creates an outbox entry even when nothing that gets federated has changed.

This results in:

  • Excessive Update activities sent to followers for unchanged posts
  • Spurious Delete activities when plugins briefly transition post status
  • Outbox flooding at rates of several activities per minute during bulk operations
  • Years-old posts being re-federated unnecessarily

Contributing factors

  1. Plugins calling wp_update_post() remotely or in batch — e.g., cloud-based editorial calendars syncing post metadata, content migration tools converting classic posts to blocks, search-and-replace operations
  2. Non-deterministic the_content output — plugins injecting HTML with random values (nonces, embed secrets, random player IDs) into the_content filter output, making content appear different on every render even when the actual post content is identical. Examples:
    • WordPress core's get_post_embed_html() generates a fresh #?secret= on every call
    • Podcast player plugins inject random player_id values via wp_rand()
    • Subscription/paywall blocks rendering dynamic URLs
  3. No deduplication gate — activity type is currently determined based solely on post status transitions, without checking whether the federated content actually changed

Proposed solution

Add a lightweight content-change detection step before creating outbox entries for Update activities. When the activity type resolves to Update, compare a hash of the relevant post properties against the hash stored for the previously federated version, and skip the outbox write if the two match.
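
A minimal sketch of the comparison, in plain PHP so it can run outside WordPress. The function names are hypothetical, and the field list is an assumption taken from the fields named in this issue; it is not the plugin's actual implementation.

```php
<?php
/**
 * Hash the raw post fields that matter for federation.
 * Hypothetical helper: the name and the exact field list are assumptions.
 * Rendered the_content output is deliberately left out.
 */
function example_federation_hash( array $post ): string {
	$fields = array( 'post_content', 'post_title', 'post_excerpt', 'post_status' );
	$subset = array();
	foreach ( $fields as $field ) {
		// Missing fields hash as empty strings so the key order stays fixed.
		$subset[ $field ] = isset( $post[ $field ] ) ? (string) $post[ $field ] : '';
	}
	return hash( 'sha256', serialize( $subset ) );
}

/**
 * True when the post differs from the previously federated version.
 * An empty stored hash means nothing has been federated yet.
 */
function example_content_changed( array $post, string $stored_hash ): bool {
	return '' === $stored_hash
		|| ! hash_equals( $stored_hash, example_federation_hash( $post ) );
}
```

Inside WordPress, the array could come from get_post( $post_id, ARRAY_A ). Fields outside the list, such as post_modified, do not affect the hash, so a wp_update_post() call that only touches those would not count as a change.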

Considerations

  • Latency budget is tight — this runs on every wp_after_insert_post, so the check must be fast (hash comparison, not full object transformation)
  • Storage — could store a content hash in post meta (e.g., _activitypub_content_hash) that gets set when an activity is actually dispatched
  • What to hash — should include the fields that matter for federation: post_content, post_title, post_excerpt, post_status, key meta fields. Should NOT include rendered the_content output (which is non-deterministic)
  • Create vs Update — this check should only apply to Update activities; Create and Delete should proceed unconditionally
  • Escape hatch — provide a filter so plugins can bypass the check if needed
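
The considerations above could combine into a gate along these lines. This is a sketch under assumptions: the function names and the filter name example_activitypub_bypass_hash_check are hypothetical, and _activitypub_content_hash is only the meta key suggested in this issue. The second function calls WordPress APIs and is only meant to be invoked inside WordPress.

```php
<?php
/**
 * Decide whether an outbox entry should be written.
 * Pure logic, no WordPress dependency. Hypothetical helper.
 */
function example_should_write_outbox(
	string $type,
	string $current_hash,
	string $stored_hash,
	bool $bypass = false
): bool {
	// Create and Delete proceed unconditionally.
	if ( 'Update' !== $type ) {
		return true;
	}
	// Escape hatch: a plugin asked to skip the check.
	if ( $bypass ) {
		return true;
	}
	// No stored hash means the post was never federated with a hash.
	return '' === $stored_hash || ! hash_equals( $stored_hash, $current_hash );
}

/**
 * WordPress wiring (assumed names). Defined here but never called,
 * so the file still loads outside WordPress.
 */
function example_gate_for_post( int $post_id, string $type, string $current_hash ): bool {
	$stored = (string) get_post_meta( $post_id, '_activitypub_content_hash', true );
	$bypass = (bool) apply_filters( 'example_activitypub_bypass_hash_check', false, $post_id );
	return example_should_write_outbox( $type, $current_hash, $stored, $bypass );
}
```

The stored hash would be refreshed with update_post_meta() only when an activity is actually dispatched, so a skipped Update leaves the previous value in place.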

Alternative approaches

  • Rate-limiting outbox entries per post (e.g., max 1 Update per post per N minutes)
  • Debouncing via short-lived transients before scheduling the outbox write
  • Comparing post_modified_gmt against a stored "last federated" timestamp
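
For comparison, the rate-limiting alternative reduces to a timestamp check. The sketch below is plain PHP with a hypothetical function name; where the "last federated" timestamp is stored (post meta, a transient) is left open.

```php
<?php
/**
 * Allow at most one Update per post per $min_interval seconds.
 * Hypothetical helper. Timestamps are Unix seconds; 0 means "never federated".
 */
function example_update_allowed( int $now, int $last_federated, int $min_interval ): bool {
	if ( 0 === $last_federated ) {
		return true;
	}
	return ( $now - $last_federated ) >= $min_interval;
}
```

Unlike the hash comparison, a rate limit alone would still let one unchanged Update through per window.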
