Comments: Allow WP_Comment_Query fields to select arbitrary columns #11934

Open
dd32 wants to merge 1 commit into WordPress:trunk from dd32:65313-comment-query-fields-projections

dd32 commented May 22, 2026

Trac ticket: https://core.trac.wordpress.org/ticket/65313

Summary

Extends WP_Comment_Query's fields argument to accept:

  • A column name from $wpdb->comments (e.g. 'comment_post_ID') — returns a flat array of distinct values for that column.
  • A 'col_a=>col_b' pair (e.g. 'comment_ID=>comment_post_ID') — returns an associative array keyed by col_a's value with col_b's value as the entry, mirroring the 'id=>parent' idiom in WP_Query and WP_Term_Query.

'ids' and '' (default) are unchanged — the new behaviour is purely additive.

Why

Plugins that want "the distinct comment_post_IDs matching X" have three options today, none good:

  1. fields => '' + array_column( $q->comments, 'comment_post_ID' ) — hydrates every matched comment just to discard all but one field.
  2. fields => 'ids' + get_comment( $id )->comment_post_ID in a loop — N+1 cache lookups.
  3. Drop to raw SQL — bypasses filters, caching, and meta-query.

In the gp-translation-helpers reference case in the Trac ticket, hydrating ~32K comments to extract distinct post IDs took ~6.6s; the equivalent raw SQL returned the same 25,694 IDs in ~610ms (~11x speedup, all of it from skipping hydration).

API

// Distinct comment_post_IDs — replaces array_unique( wp_list_pluck( ... ) ).
$post_ids = ( new WP_Comment_Query() )->query( array(
    'meta_key'   => 'locale',
    'meta_value' => 'nl',
    'user_id'    => 42,
    'fields'     => 'comment_post_ID',
) );

// [ comment_ID => comment_post_ID ] map — same shape as WP_Query's 'id=>parent'.
$by_id = ( new WP_Comment_Query() )->query( array(
    'post_id' => $post_id,
    'fields'  => 'comment_ID=>comment_post_ID',
) );

Column names must be passed in their exact case. The only exception is the ID suffix of comment_ID / comment_post_ID, which is accepted in any case (comment_id, comment_Id, comment_ID). Unknown columns fall through to the default behaviour (WP_Comment objects), keeping the path forward-compatible.
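
To make the case rule concrete, here is a minimal standalone sketch of the normalization. The function name `sketch_parse_field_column` is hypothetical (the PR's own helper is a private method on `WP_Comment_Query`), and the column list is the one quoted in the review thread on this PR, so treat it as an assumption rather than the final patch.

```php
<?php
/**
 * Hypothetical standalone version of the column normalizer.
 * Returns the canonical column name, or null for an unknown column.
 */
function sketch_parse_field_column( string $field ): ?string {
	// Assumed column list, as quoted in the review thread.
	$columns = array(
		'comment_ID', 'comment_post_ID', 'comment_author', 'comment_author_email',
		'comment_author_url', 'comment_author_IP', 'comment_date', 'comment_date_gmt',
		'comment_content', 'comment_karma', 'comment_approved', 'comment_agent',
		'comment_type', 'comment_parent', 'user_id',
	);

	// Only the trailing "ID" of comment_ID / comment_post_ID is case-flexible.
	if ( preg_match( '/^(comment(?:_post)?_)([iI][dD])$/', $field, $matches ) ) {
		return $matches[1] . 'ID';
	}

	// Every other column must match exactly, including case.
	return in_array( $field, $columns, true ) ? $field : null;
}

var_export( sketch_parse_field_column( 'comment_post_id' ) ); // 'comment_post_ID'
echo PHP_EOL;
var_export( sketch_parse_field_column( 'comment_Author' ) ); // NULL
echo PHP_EOL;
```

A null return is what would let the caller fall back to the default `WP_Comment` objects for unknown columns.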

Test plan

  • npm run test:php -- tests/phpunit/tests/comment/query.php — 166 tests / 489 assertions pass.
  • npm run test:php -- --group comment — 540 tests / 1367 assertions pass (no regressions).
  • New tests cover:
    • Single-column form returns distinct values.
    • 'col_a=>col_b' map form returns associative array.
    • Case-flexibility of the ID suffix on both sides of =>; strict case for non-ID columns.
    • Unknown column (single + map form) falls back to full WP_Comment objects.
    • 'ids' semantics unchanged.
    • Empty result sets return array() without a follow-up query.
  • phpcs --standard=phpcs.xml.dist clean on both changed files (only pre-existing warnings on untouched lines).

🤖 Generated with Claude Code

@github-actions

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props dd32.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@github-actions

Test using WordPress Playground

The changes in this pull request can be previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

dd32 force-pushed the 65313-comment-query-fields-projections branch from 9115f83 to 5f33400 on May 22, 2026 05:50

Extends the `fields` argument of `WP_Comment_Query` to accept:

  - A column name from the `$wpdb->comments` table (e.g.
    `'comment_post_ID'`). Returns a flat array of distinct values for
    that column.
  - A `'col_a=>col_b'` pair (e.g. `'comment_ID=>comment_post_ID'`).
    Returns an associative array keyed by col_a's value, with col_b's
    value as the entry — mirroring `WP_Query` / `WP_Term_Query`'s
    `'id=>parent'` idiom.
  - An array of column names. Returns an array of `stdClass` objects
    with the requested columns as properties (one entry per matched
    comment, no deduplication).

The motivation is to avoid hydrating full `WP_Comment` objects when the
caller only needs a subset of columns. In one reference case
(gp-translation-helpers), hydrating ~32K comments to extract distinct
post IDs took ~6.6s; the equivalent raw SQL took ~610ms (~11x), all of
it from skipping hydration.

Single-column results apply `DISTINCT` in SQL so callers can drop the
surrounding `array_unique( wp_list_pluck( ... ) )`. The array form does
not — it returns one row per matched comment, like a hand-written
projection.
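
The difference between the three result shapes can be sketched on plain arrays. This is only an illustration: `sketch_shape_rows` and the `'col'` / `'map'` / `'list'` tuple it takes are hypothetical stand-ins, and the deduplication is mimicked in PHP here, whereas the PR applies `DISTINCT` in SQL.

```php
<?php
/**
 * Hypothetical helper: shapes already-fetched rows per `fields` form.
 *
 * @param array $rows   Rows as associative arrays (column => value).
 * @param array $fields Assumed tuple: ['col', c], ['map', k, v] or ['list', [c, ...]].
 */
function sketch_shape_rows( array $rows, array $fields ): array {
	switch ( $fields[0] ) {
		case 'col':
			// Flat list of distinct values (stand-in for SELECT DISTINCT).
			return array_values( array_unique( array_column( $rows, $fields[1] ) ) );
		case 'map':
			// Associative array: key column => value column.
			return array_column( $rows, $fields[2], $fields[1] );
		case 'list':
			// One stdClass per row, no deduplication.
			$wanted = array_flip( $fields[1] );
			return array_map(
				static function ( array $row ) use ( $wanted ) {
					return (object) array_intersect_key( $row, $wanted );
				},
				$rows
			);
	}
	return $rows;
}

$rows = array(
	array( 'comment_ID' => 1, 'comment_post_ID' => 10, 'comment_author' => 'a' ),
	array( 'comment_ID' => 2, 'comment_post_ID' => 10, 'comment_author' => 'b' ),
	array( 'comment_ID' => 3, 'comment_post_ID' => 20, 'comment_author' => 'a' ),
);

echo json_encode( sketch_shape_rows( $rows, array( 'col', 'comment_post_ID' ) ) ), PHP_EOL; // [10,20]
echo json_encode( sketch_shape_rows( $rows, array( 'map', 'comment_ID', 'comment_post_ID' ) ) ), PHP_EOL; // {"1":10,"2":10,"3":20}
echo count( sketch_shape_rows( $rows, array( 'list', array( 'comment_ID', 'comment_author' ) ) ) ), PHP_EOL; // 3
```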

Field-selection results are cached separately from the base
comment-ID query, sharing the comment `last_changed` salt so any
comment mutation invalidates both layers together. A repeat call with
identical args runs zero SQL queries.
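
The salt mechanism can be sketched without WordPress. Everything named here (`Sketch_Field_Cache`, `comment_changed()`, the integer salt) is hypothetical and models a single cache layer only; in core the salt would presumably come from `wp_cache_get_last_changed( 'comment' )`, and the exact key format in the patch is not shown in this thread.

```php
<?php
/**
 * Hypothetical in-memory cache keyed by query args plus a "last changed" salt.
 */
final class Sketch_Field_Cache {
	/** @var array<string, array> Cached results by salted key. */
	private $store = array();

	/** @var int Stand-in for the comment `last_changed` value. */
	private $last_changed = 0;

	/** @var int Number of times the underlying query actually ran. */
	public $queries_run = 0;

	public function get( array $args, callable $run_query ): array {
		// The salt is part of the key, so bumping it orphans every old entry.
		$key = md5( serialize( $args ) ) . ':' . $this->last_changed;
		if ( ! array_key_exists( $key, $this->store ) ) {
			++$this->queries_run;
			$this->store[ $key ] = $run_query();
		}
		return $this->store[ $key ];
	}

	/** Simulates any comment insert, update or delete. */
	public function comment_changed(): void {
		++$this->last_changed;
	}
}

$cache = new Sketch_Field_Cache();
$args  = array( 'fields' => 'comment_post_ID', 'user_id' => 42 );
$fetch = static function (): array {
	return array( 10, 20 ); // Pretend this hit the database.
};

$cache->get( $args, $fetch );
$cache->get( $args, $fetch );     // Served from cache.
echo $cache->queries_run, PHP_EOL; // 1

$cache->comment_changed();        // Invalidate via the salt.
$cache->get( $args, $fetch );
echo $cache->queries_run, PHP_EOL; // 2
```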

Column names must be passed in their exact case; the only exception is
the `ID` suffix of `comment_ID` / `comment_post_ID`, which is accepted
in any case.

`'ids'` and `''` retain their existing meaning; the new behaviour is
purely additive.

Props dd32.
See #65313.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

westonruter left a comment

An initial review pass. I'll have some more suggestions, but I'm short on time for this week.

* @param string $field
* @return string|null
*/
private function parse_field_column( $field ) {

Suggested change
- private function parse_field_column( $field ) {
+ private function parse_field_column( string $field ): ?string {

*
* @since 7.1.0
*
* @param string $field

Suggested change
- * @param string $field
+ * @param string $field Field name.

* @since 7.1.0
*
* @param string $field
* @return string|null

Suggested change
- * @return string|null
+ * @return non-empty-string|null Canonical column name or null if unknown.

Comment on lines +1128 to +1129
if ( preg_match( '/^(comment(?:_post)?_)([iI][dD])$/', $field, $m ) ) {
return $m[1] . 'ID';

To avoid abbreviations:

Suggested change
- if ( preg_match( '/^(comment(?:_post)?_)([iI][dD])$/', $field, $m ) ) {
- 	return $m[1] . 'ID';
+ if ( preg_match( '/^(comment(?:_post)?_)([iI][dD])$/', $field, $matches ) ) {
+ 	return $matches[1] . 'ID';

* @param array $fields Tagged tuple from `parse_fields()`.
* @return array
*/
private function get_comment_field_values( $comment_ids, $fields ) {

Suggested change
- private function get_comment_field_values( $comment_ids, $fields ) {
+ private function get_comment_field_values( array $comment_ids, array $fields ) {

* @param mixed $fields Raw `fields` query var.
* @return array|null
*/
private function parse_fields( $fields ) {

Suggested change
- private function parse_fields( $fields ) {
+ private function parse_fields( $fields ): ?array {

* @since 7.1.0
*
* @param mixed $fields Raw `fields` query var.
* @return array|null

Suggested change
- * @return array|null
+ * @param string[]|string $fields Raw `fields` query var.
+ * @return array{ 'list', non-empty-string[] }
+ * |array{ 'map', non-empty-string, non-empty-string }
+ * |array{ 'col', non-empty-string }
+ * |null
+ */

* Return shapes:
* - `array( 'col', $column )` — single column, DISTINCT.
* - `array( 'map', $key_col, $val_col )` — `'col_a=>col_b'` associative map.
* - `array( 'list', $columns )` — array form, returns `stdClass[]`.

I was analyzing this method with PHPStan and this doesn't seem to be correct. It seems this is actually:

Suggested change
- * - `array( 'list', $columns )` — array form, returns `stdClass[]`.
+ * - `array( 'list', $columns )` — array form, returns `string[]`.
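
For reference while reading the return shapes, here is a rough standalone sketch of how a raw `fields` value could map onto those tagged tuples. `sketch_parse_fields` and its `$normalize` callback are hypothetical, and returning null when any entry of the array form is unknown is an assumption (the thread only states the fallback for the single and map forms).

```php
<?php
/**
 * Hypothetical parser: raw `fields` value => tagged tuple, or null to
 * signal "use the existing behaviour" ('ids', '' or an unknown column).
 *
 * @param mixed    $fields    Raw `fields` query var.
 * @param callable $normalize Maps a column name to its canonical form or null.
 */
function sketch_parse_fields( $fields, callable $normalize ): ?array {
	if ( is_array( $fields ) ) {
		$columns = array();
		foreach ( $fields as $field ) {
			$column = is_string( $field ) ? $normalize( $field ) : null;
			if ( null === $column ) {
				return null; // Assumption: one unknown column rejects the list.
			}
			$columns[] = $column;
		}
		return $columns ? array( 'list', $columns ) : null;
	}

	if ( ! is_string( $fields ) || '' === $fields || 'ids' === $fields ) {
		return null;
	}

	if ( false !== strpos( $fields, '=>' ) ) {
		$parts = explode( '=>', $fields );
		if ( 2 !== count( $parts ) ) {
			return null;
		}
		$key   = $normalize( $parts[0] );
		$value = $normalize( $parts[1] );
		return ( null === $key || null === $value ) ? null : array( 'map', $key, $value );
	}

	$column = $normalize( $fields );
	return null === $column ? null : array( 'col', $column );
}

// Toy normalizer for the demo: exact match against three columns only.
$exact = static function ( string $field ): ?string {
	return in_array( $field, array( 'comment_ID', 'comment_post_ID', 'user_id' ), true ) ? $field : null;
};

echo json_encode( sketch_parse_fields( 'comment_ID=>comment_post_ID', $exact ) ), PHP_EOL; // ["map","comment_ID","comment_post_ID"]
echo json_encode( sketch_parse_fields( 'ids', $exact ) ), PHP_EOL; // null
```

In this sketch the second element of the `'list'` tuple is a plain array of column-name strings.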

* @return string|null
*/
private function parse_field_column( $field ) {
static $columns = array(

I suggest the static be removed. It doesn't really gain much if anything.

Suggested change
- static $columns = array(
+ $columns = array(

Also, PHPStan is tripped up by it below, for some reason:

Parameter #2 $haystack of function in_array expects array, mixed given.

But what about a private const instead?

This could be added to the class:

	private const COLUMNS = array(
		'comment_ID',
		'comment_post_ID',
		'comment_author',
		'comment_author_email',
		'comment_author_url',
		'comment_author_IP',
		'comment_date',
		'comment_date_gmt',
		'comment_content',
		'comment_karma',
		'comment_approved',
		'comment_agent',
		'comment_type',
		'comment_parent',
		'user_id',
	);

This would seem to be reusable in the ::parse_orderby() method as well.
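
As a rough illustration of that reuse, the sketch below shares one class constant between a field check and an orderby check. The class and method names are hypothetical, the column list is abbreviated, and the `comments.` table prefix is a placeholder; the real `::parse_orderby()` also accepts values that are not plain columns, which this does not model.

```php
<?php
/**
 * Hypothetical class showing one private constant shared by two methods.
 */
final class Sketch_Comment_Columns {
	// Abbreviated for the sketch; the list above has fifteen columns.
	private const COLUMNS = array(
		'comment_ID',
		'comment_post_ID',
		'comment_date',
		'user_id',
	);

	/** True when $field is a known column (exact case). */
	public function is_field_column( string $field ): bool {
		// A constant array has a known type, so in_array() gets a real array.
		return in_array( $field, self::COLUMNS, true );
	}

	/** Returns a qualified column for ORDER BY, or null if not a known column. */
	public function sanitize_orderby( string $orderby ): ?string {
		if ( ! in_array( $orderby, self::COLUMNS, true ) ) {
			return null;
		}
		return 'comments.' . $orderby; // Placeholder table name.
	}
}

$columns = new Sketch_Comment_Columns();
var_export( $columns->is_field_column( 'user_id' ) ); // true
echo PHP_EOL;
var_export( $columns->sanitize_orderby( 'comment_date' ) ); // 'comments.comment_date'
echo PHP_EOL;
```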
