Post Media: move Meta_Extractor class to jetpack-post-media package #47205

Open
jeherve wants to merge 19 commits into trunk from jeherve/move-media-extractor

Conversation

@jeherve jeherve commented Feb 18, 2026

Fixes CM-76
Fixes CM-546
Fixes CM-543

Proposed changes:

  • Move Jetpack_Media_Meta_Extractor class logic into Automattic\Jetpack\Post_Media\Meta_Extractor in the jetpack-post-media package.
  • Deprecate the old Jetpack_Media_Meta_Extractor class, replacing method bodies with _deprecated_function() notices and forwarding calls to the new class.
  • Update Jetpack_Media_Summary (the only production caller) to use the new namespaced class directly.
  • Port tests from the Jetpack plugin to the post-media package, adapting from WP_UnitTestCase to WorDBless.
  • Remove the old Jetpack_MediaExtractor_Test from the Jetpack plugin (tests now live in the package).
  • Add automattic/jetpack-post-media as a dependency of the Jetpack plugin.
  • Clean up Phan baseline for the now-simplified deprecated file.
  • Use the package-local Images class (from Post Media: Add Images class (copy of Jetpack_PostImages) #47208) instead of \Jetpack_PostImages — no more external dependency on the Jetpack plugin for image extraction.
  • Use the package-local Shortcodes class (from Post Media: Add Shortcodes class for shortcode ID extraction #47200) instead of jetpack_shortcode_get_{$shortcode}_id() functions for shortcode ID extraction.
  • Move the Shortcodes class from Automattic\Jetpack to Automattic\Jetpack\Post_Media namespace for consistency with Images and Meta_Extractor.
  • Remove the placeholder class-post-media.php and unused PACKAGE_VERSION constant.
  • Clean up class references to rely on use import statements (WP_Post, WP_Error).
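
The deprecation approach described in the second bullet can be sketched as follows. This is a minimal, self-contained illustration and not the PR's actual code: Example_Meta_Extractor, Example_Legacy_Meta_Extractor and Example_Deprecation_Log are hypothetical stand-ins, the version string is made up, and _deprecated_function() is stubbed only so the snippet runs outside WordPress (the stub records the call instead of emitting a notice).

```php
<?php
// Hypothetical recorder used by the stub below (not part of WordPress or Jetpack).
class Example_Deprecation_Log {
	public static $calls = array();
}

// Stub for running outside WordPress (assumption: WordPress is not loaded).
// The real _deprecated_function() emits a notice; this stub only records the call.
if ( ! function_exists( '_deprecated_function' ) ) {
	function _deprecated_function( $function_name, $version, $replacement = '' ) {
		Example_Deprecation_Log::$calls[] = array( $function_name, $version, $replacement );
	}
}

// Hypothetical stand-in for the package class that now owns the logic.
class Example_Meta_Extractor {
	public static function extract_images_from_content( $content, $image_list, $extract_alt_text = false ) {
		// Placeholder logic: return the arguments so the forwarding is observable.
		return array(
			'content'          => $content,
			'image_list'       => $image_list,
			'extract_alt_text' => $extract_alt_text,
		);
	}
}

// Hypothetical stand-in for the deprecated legacy class:
// flag the deprecation first, then forward the call unchanged.
class Example_Legacy_Meta_Extractor {
	public static function extract_images_from_content( $content, $image_list, $extract_alt_text = false ) {
		_deprecated_function( __METHOD__, 'example-1.0.0', 'Example_Meta_Extractor::extract_images_from_content' );
		return Example_Meta_Extractor::extract_images_from_content( $content, $image_list, $extract_alt_text );
	}
}
```

Existing callers of the legacy class keep receiving the same return value, while the recorded (or, in WordPress, emitted) deprecation points them at the replacement.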

Other information:

  • Have you written new tests for your changes, if applicable?
  • Have you checked the E2E test CI results, and verified that your changes do not break them?
  • Have you tested your changes on WordPress.com, if applicable (if so, you'll see a generated comment below with a script to run)?

Does this pull request change what data or activity we track or use?

No.

Testing instructions:

  • Verify the new Meta_Extractor class at projects/packages/post-media/src/class-meta-extractor.php contains all methods from the original Jetpack_Media_Meta_Extractor.
  • Verify the old class at projects/plugins/jetpack/_inc/lib/class.media-extractor.php forwards all calls to the new class with deprecation notices.
  • Verify projects/plugins/jetpack/_inc/lib/class.media-summary.php uses Meta_Extractor from the post-media package.
  • Verify Meta_Extractor uses Images:: (same namespace) instead of \Jetpack_PostImages::.
  • Verify Meta_Extractor uses Shortcodes::get_{shortcode}_id() instead of the old jetpack_shortcode_get_* function lookups.
  • Verify Shortcodes class is now in the Automattic\Jetpack\Post_Media namespace, consistent with Images and Meta_Extractor.
  • Verify changelog entries are present for both the package and the plugin.
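
The first check above (the new class contains every method of the original) could be automated roughly like this. It is only a sketch: Example_Old_Extractor and Example_New_Extractor are hypothetical stand-ins with made-up parameter lists, and example_missing_methods() is an illustrative helper, not something provided by the PR.

```php
<?php
// Hypothetical stand-ins; in the PR the real classes would be
// Jetpack_Media_Meta_Extractor and Automattic\Jetpack\Post_Media\Meta_Extractor.
class Example_Old_Extractor {
	public static function extract( $blog_id, $post_id ) {}
	public static function extract_from_content( $content ) {}
	public static function get_images_from_html( $html, $images ) {}
}

class Example_New_Extractor {
	public static function extract( $blog_id, $post_id ) {}
	public static function extract_from_content( $content ) {}
}

/**
 * List the methods that exist on the old class but not on the new one.
 * Called from outside the classes, get_class_methods() only reports public methods.
 */
function example_missing_methods( $old_class, $new_class ) {
	$old = get_class_methods( $old_class );
	$new = get_class_methods( $new_class );
	return array_values( array_diff( $old, $new ) );
}

$missing = example_missing_methods( 'Example_Old_Extractor', 'Example_New_Extractor' );
// $missing is array( 'get_images_from_html' ) for the stand-ins above.
```

An empty result would indicate that every public method of the old class has a counterpart with the same name; it says nothing about signatures or behavior.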

Copilot AI review requested due to automatic review settings February 18, 2026 17:18
github-actions bot added the [Package] Post Media, [Plugin] Jetpack, and [Tests] Includes Tests labels Feb 18, 2026

github-actions bot commented Feb 18, 2026

Thank you for your PR!

When contributing to Jetpack, we have a few suggestions that can help us test and review your patch:

  • ✅ Include a description of your PR changes.
  • ✅ Add a "[Status]" label (In Progress, Needs Review, ...).
  • ✅ Add testing instructions.
  • ✅ Specify whether this PR includes any changes to data or privacy.
  • ✅ Add changelog entries to affected projects

This comment will be updated as you work on your PR and make changes. If you think that some of those checks are not needed for your PR, please explain why you think so. Thanks for cooperation 🤖


Follow this PR Review Process:

  1. Ensure all required checks appearing at the bottom of this PR are passing.
  2. Make sure to test your changes on all platforms that it applies to. You're responsible for the quality of the code you ship.
  3. You can use GitHub's Reviewers functionality to request a review.
  4. When it's reviewed and merged, you will be pinged in Slack to deploy the changes to WordPress.com simple once the build is done.

If you have questions about anything, reach out in #jetpack-developers for guidance!


Jetpack plugin:

The Jetpack plugin has different release cadences depending on the platform:

  • WordPress.com Simple releases happen as soon as you deploy your changes after merging this PR (PCYsg-Jjm-p2).
  • WoA releases happen weekly.
  • Releases to self-hosted sites happen monthly:
    • Scheduled release: March 3, 2026
    • Code freeze: March 3, 2026

If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack.


Social plugin:

No scheduled milestone found for this plugin.


Mu Wpcom plugin:

  • Next scheduled release: WordPress.com Simple releases happen semi-continuously (PCYsg-Jjm-p2)


Wpcomsh plugin:

  • Next scheduled release: Atomic deploys happen twice daily on weekdays (p9o2xV-2EN-p2)


Classic Theme helper plugin:

No scheduled milestone found for this plugin.


github-actions bot commented Feb 18, 2026

Are you an Automattician? Please test your changes on all WordPress.com environments to help mitigate accidental explosions.

  • To test on WoA, go to the Plugins menu on a WoA dev site. Click on the "Upload" button and follow the upgrade flow to be able to upload, install, and activate the Jetpack Beta plugin. Once the plugin is active, go to Jetpack > Jetpack Beta, select your plugin (Jetpack or WordPress.com Site Helper), and enable the jeherve/move-media-extractor branch.
  • To test on Simple, run the following command on your sandbox:
bin/jetpack-downloader test jetpack jeherve/move-media-extractor
bin/jetpack-downloader test jetpack-mu-wpcom-plugin jeherve/move-media-extractor

Interested in more tips and information?

  • In your local development environment, use the jetpack rsync command to sync your changes to a WoA dev blog.
  • Read more about our development workflow here: PCYsg-eg0-p2
  • Figure out when your changes will be shipped to customers here: PCYsg-eg5-p2

Copilot AI left a comment

Pull request overview

This PR migrates the post media metadata extraction logic out of the Jetpack plugin into the automattic/jetpack-post-media package, while keeping backward compatibility via a deprecated wrapper class in the plugin.

Changes:

  • Adds Automattic\Jetpack\Post_Media\Meta_Extractor to the jetpack-post-media package and ports the corresponding test suite to WorDBless.
  • Deprecates Jetpack_Media_Meta_Extractor and forwards calls to the new namespaced implementation; updates Jetpack_Media_Summary to use the new class directly.
  • Adds automattic/jetpack-post-media as a Jetpack plugin dependency and updates the Phan baseline accordingly.

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 10 comments.

Summary per file:

  • projects/plugins/jetpack/composer.json: Adds automattic/jetpack-post-media as a dependency.
  • projects/plugins/jetpack/composer.lock: Locks the newly added package dependency.
  • projects/plugins/jetpack/changelog/move-media-meta-extractor: Plugin changelog entry for the deprecation.
  • projects/plugins/jetpack/_inc/lib/class.media-summary.php: Switches production usage to the namespaced Meta_Extractor.
  • projects/plugins/jetpack/_inc/lib/class.media-extractor.php: Deprecates legacy class and forwards to the package implementation.
  • projects/plugins/jetpack/.phan/baseline.php: Removes now-unneeded baseline entries for the simplified legacy file.
  • projects/packages/post-media/src/class-meta-extractor.php: Introduces the package Meta_Extractor implementation.
  • projects/packages/post-media/tests/php/bootstrap.php: Initializes the shared WP/WorDBless test environment.
  • projects/packages/post-media/tests/php/Meta_Extractor_Test.php: Ports/exercises extractor behavior in package tests.
  • projects/packages/post-media/changelog/move-media-meta-extractor: Package changelog entry for the new class.

jp-launch-control bot commented Feb 18, 2026

Code Coverage Summary

Cannot generate coverage summary while tests are failing. 🤐

Please fix the tests, or re-run the Code coverage job if it was something being flaky.

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.

Comments suppressed due to low confidence (1)

projects/plugins/jetpack/_inc/lib/class.media-extractor.php:73

  • The @param type for $extract_alt_text is documented as string, but the function signature and usage treat it as a boolean. Update the docblock type to bool/boolean to match actual behavior (helps static analysis and IDE hints).
	 * @param string $content HTML content.
	 * @param array  $image_list Array of already found images.
	 * @param string $extract_alt_text Whether or not to extract the alt text.
	 *

Copilot AI review requested due to automatic review settings February 20, 2026 09:51
Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated no new comments.


Copilot AI review requested due to automatic review settings February 20, 2026 10:00
@jeherve jeherve self-assigned this Feb 20, 2026
Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.


Comment on lines 76 to 79
 	public static function extract_images_from_content( $content, $image_list, $extract_alt_text = false ) {
-		$image_list = self::get_images_from_html( $content, $image_list, $extract_alt_text );
-		return self::build_image_struct( $image_list, array() );
+		_deprecated_function( __METHOD__, 'jetpack-$$next-version$$', 'Automattic\Jetpack\Post_Media\Meta_Extractor::extract_images_from_content' );
+		return Meta_Extractor::extract_images_from_content( $content, $image_list, $extract_alt_text );
 	}

Copilot AI Feb 20, 2026

The parameter type for $extract_alt_text is documented as string but should be boolean or bool to match the actual parameter type and usage. This appears to be a typo introduced during the deprecation.

$srcs = wp_list_pluck( $from_gallery, 'src' );
$image_list = array_merge( $image_list, $srcs );
}
++$image_booleans['gallery']; // @todo This count isn't correct, will only every count 1

Copilot AI Feb 20, 2026

The comment contains a typo: "will only every count 1" should be "will only ever count 1". This typo was carried over from the original code and should be fixed.

Suggested change
-	++$image_booleans['gallery']; // @todo This count isn't correct, will only every count 1
+	++$image_booleans['gallery']; // @todo This count isn't correct, will only ever count 1

github-actions bot added the [Plugin] Social label Feb 20, 2026
Copilot AI review requested due to automatic review settings February 20, 2026 10:29
Copilot AI left a comment

Pull request overview

Copilot reviewed 22 out of 24 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

projects/packages/post-media/tests/php/Meta_Extractor_Test.php:478

  • The expected image order in the test has been changed from image1-then-image2 to image2-then-image1. While the code may still function correctly (since the extracted images are deduplicated using array_unique), this change in ordering could indicate a behavioral difference between the old Jetpack_Media_Meta_Extractor and the new Meta_Extractor.

Please verify that this order change is intentional and doesn't break any existing code that may depend on a specific image order. If the order doesn't matter semantically, consider documenting this in the PR description or adding a comment explaining why the order changed.
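
To illustrate why ordering can differ even when deduplication is in place, here is a small sketch. It assumes, purely for illustration, that image URLs are combined with array_merge() before array_unique(); the helper name and file names are hypothetical, and this does not claim to be the cause of the change in the test.

```php
<?php
/**
 * Merge two lists and drop duplicates.
 * array_unique() keeps the first occurrence of each value, so the result
 * order depends on which list is merged first.
 */
function example_merge_unique( array $first, array $second ) {
	return array_values( array_unique( array_merge( $first, $second ) ) );
}

$content_first = example_merge_unique( array( 'image1.jpg', 'image2.jpg' ), array( 'image2.jpg' ) );
// array( 'image1.jpg', 'image2.jpg' )

$gallery_first = example_merge_unique( array( 'image2.jpg' ), array( 'image1.jpg', 'image2.jpg' ) );
// array( 'image2.jpg', 'image1.jpg' )
```

Both results contain the same set of images; only the order in which the sources were merged differs.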


* @deprecated $$next-version$$
*
* @param string $content HTML content.
* @param array $image_list Array of already found images.

Copilot AI Feb 20, 2026

The parameter type for $extract_alt_text is documented as string but should be boolean or bool. The parameter is actually a boolean flag indicating whether to extract alt text, as can be seen from the method signature (default value is false) and the corresponding parameter in the new Meta_Extractor class.

jeherve and others added 16 commits February 27, 2026 10:30

Replace Jetpack_Media_Meta_Extractor usage with the new namespaced
Automattic\Jetpack\Post_Media\Meta_Extractor class.

Required for the new Meta_Extractor class that replaces
Jetpack_Media_Meta_Extractor.

- Add @covers annotation to match CoversClass attribute
- Fix equals sign alignment in create_attachment() and get_images_from_html()
- Convert inline associative array to multi-line in add_test_post()

Replace the dynamic jetpack_shortcode_get_{$shortcode}_id() function
lookups and {Name}Shortcode class method calls with the new
Automattic\Jetpack\Shortcodes class from the post-media package.

- Replace \Jetpack_PostImages calls with Images:: (same namespace)
- Remove @phan-suppress annotations no longer needed
- Add `use` for Shortcodes class, simplify references
- Remove placeholder class-post-media.php and PACKAGE_VERSION
- Remove stale comment in class.media-summary.php
- Remove old Jetpack_MediaExtractor_Test (now in post-media package)

Clean up class references to use imports instead of inline
fully-qualified names. Update keeper_shortcodes docblock to
reference the Shortcodes class.

Move the Shortcodes class from Automattic\Jetpack to
Automattic\Jetpack\Post_Media for consistency with the
Images and Meta_Extractor classes in the same package.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

The forwarding call to Meta_Extractor::reduce_extracted_images()
would cause a fatal error since the method is protected and
Jetpack_Media_Meta_Extractor does not extend Meta_Extractor.
Inline the implementation instead.

Fixes CM-546, Fixes CM-543

These functions have been extracted into the automattic/jetpack-post-media
package as the Shortcodes class under the Automattic\Jetpack\Post_Media
namespace.

The old functions (jetpack_shortcode_get_*_id, jetpack_get_youtube_id,
jetpack_youtube_sanitize_url) are now deprecated shims that delegate
to their package counterparts.

Callers continue to work through the shims; they will emit deprecation
notices when WP_DEBUG is enabled, guiding developers to the new API.

The move of media extraction code from the Jetpack plugin to the
post-media package introduced several test failures:

- Phan flagged a redundant assignment (`$x = $x - Y` → `$x -= Y`)
  and two potentially undefined array keys in get_images_from_html
  (src_width / src_height may not exist on the extracted image).

- Meta_Extractor_Test ran under WorDBless but lacked the environment
  setup that the full WordPress test harness provides: no admin user
  (so wp_kses_post sanitized content on insert), no upload-dir
  normalization, no image_downsize shim for virtual files, and no
  dummy shortcode registrations for youtube/vimeo/ted/wpvideo. Add a
  set_up() following the same pattern already used in ImagesTest.

- Functions_Compat_Test and Jetpack_Shortcodes_Vimeo_Test now call
  deprecated wrapper functions (jetpack_youtube_sanitize_url,
  jetpack_get_youtube_id, jetpack_shortcode_get_vimeo_id) which
  trigger _deprecated_function notices caught by WP_UnitTestCase.
  Declare the expected deprecations with setExpectedDeprecated().
Copilot AI review requested due to automatic review settings February 27, 2026 09:32
jeherve force-pushed the jeherve/move-media-extractor branch from ce71443 to 2e69122 February 27, 2026 09:32
Copilot AI left a comment

Pull request overview

Copilot reviewed 27 out of 32 changed files in this pull request and generated 1 comment.


Comment on lines +124 to +143
		$ret_images = array();
		foreach ( $images as $image ) {
			if ( empty( $image['src'] ) ) {
				continue;
			}
			$ret_image = array(
				'url' => $image['src'],
			);
			if ( ! empty( $image['src_height'] ) || ! empty( $image['src_width'] ) ) {
				$ret_image['src_width']  = $image['src_width'] ?? '';
				$ret_image['src_height'] = $image['src_height'] ?? '';
			}
			if ( ! empty( $image['alt_text'] ) ) {
				$ret_image['alt_text'] = $image['alt_text'];
			} else {
				$ret_image = $image['src'];
			}
			$ret_images[] = $ret_image;
		}
		return $ret_images;

Copilot AI Feb 27, 2026

The reduce_extracted_images method contains the full implementation instead of just forwarding to the new Meta_Extractor class. For consistency with the other deprecated methods (extract, extract_from_content, etc.), this should only call _deprecated_function() and forward to Meta_Extractor::reduce_extracted_images(). The current implementation defeats the purpose of moving the logic to the package and creates code duplication.

Suggested change
-		$ret_images = array();
-		foreach ( $images as $image ) {
-			if ( empty( $image['src'] ) ) {
-				continue;
-			}
-			$ret_image = array(
-				'url' => $image['src'],
-			);
-			if ( ! empty( $image['src_height'] ) || ! empty( $image['src_width'] ) ) {
-				$ret_image['src_width']  = $image['src_width'] ?? '';
-				$ret_image['src_height'] = $image['src_height'] ?? '';
-			}
-			if ( ! empty( $image['alt_text'] ) ) {
-				$ret_image['alt_text'] = $image['alt_text'];
-			} else {
-				$ret_image = $image['src'];
-			}
-			$ret_images[] = $ret_image;
-		}
-		return $ret_images;
+		return Meta_Extractor::reduce_extracted_images( $images );
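
One commit message in this PR states that forwarding to Meta_Extractor::reduce_extracted_images() would be fatal, because the method is protected and the legacy class does not extend Meta_Extractor. The PHP visibility rule behind that statement can be sketched with hypothetical stand-in classes (none of these names come from the PR):

```php
<?php
// Hypothetical stand-in for a package class with a protected static helper.
class Example_Package_Extractor {
	protected static function reduce( array $images ) {
		return array_column( $images, 'src' );
	}
}

// Hypothetical stand-in for a legacy class that does NOT extend the package class.
class Example_Legacy_Extractor {
	public static function reduce( array $images ) {
		// Protected methods are only callable from the declaring class and its
		// subclasses, so this call throws an Error at runtime.
		return Example_Package_Extractor::reduce( $images );
	}
}

// Attempt the forwarding call and report what happened.
function example_try_forwarding() {
	try {
		Example_Legacy_Extractor::reduce( array( array( 'src' => 'a.jpg' ) ) );
		return 'ok';
	} catch ( Error $e ) {
		return $e->getMessage();
	}
}
```

Under this rule, forwarding only works if the target method is public or the calling class is in the same inheritance chain; otherwise the logic has to be inlined, as that commit describes.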

Internal callers (youtube.php, vimeo.php, archiveorg.php,
archiveorg-book.php, class.media-summary.php, enhanced-open-graph.php)
now call Shortcodes:: methods directly instead of going through the
deprecated global function wrappers. This fixes unexpected deprecation
notices in Jetpack_Shortcodes_Youtube_Test (15 failures across all PHP
versions) and resolves phan PhanDeprecatedFunction errors.

Also fixes a docblock type mismatch in class.media-extractor.php
(string → bool) and adds phan baseline entries for functions.compat.php
and Functions_Compat_Test.php which intentionally call deprecated
functions.