Implement new array function array_column() #56

Closed
wants to merge 12 commits into
from

Conversation

Projects
None yet
10 participants
@ramsey
Contributor

ramsey commented Apr 14, 2012

array_column() returns the values of the specified column from a multi-dimensional array.

I've actually had this patch sitting around for several years, and I decided to take a look, clean it up, and finally submit it. Inspired by database methods like PDOStatement::fetchColumn(), I wrote this to scratch an itch, since I've often had the need to take a recordset array and pull a single column of values from it, which normally involves looping over the array and creating a new array of the values for the column I want to pull. I decided to implement this in the PHP core to save that step.

You can take a look at the test included to see how it works. The second parameter is the key for the column to return, which can be a number for numeric indexes or a string for associative arrays. If it can't find the column specified, it simply returns an empty array.

@smalyshev

View changes

ext/standard/array.c
+ break;
+ default:
+ php_error_docref(NULL TSRMLS_CC, E_WARNING, "The key should be either a string or an integer");
+ return;

This comment has been minimized.

@smalyshev

smalyshev Apr 14, 2012

Contributor

Maybe RETURN_FALSE ? Though then I'd move the array init after the check.

@smalyshev

smalyshev Apr 14, 2012

Contributor

Maybe RETURN_FALSE ? Though then I'd move the array init after the check.

This comment has been minimized.

@ramsey

ramsey Apr 14, 2012

Contributor

I've updated this to RETURN_FALSE. I've also cleaned it up a bit by removing some of the other cases.

@ramsey

ramsey Apr 14, 2012

Contributor

I've updated this to RETURN_FALSE. I've also cleaned it up a bit by removing some of the other cases.

This comment has been minimized.

@pierrejoye

pierrejoye Apr 15, 2012

Contributor

IS_OBJECT should be there too, calling convert_to_string if necessary (will deal with the existence of __toString or not.

@pierrejoye

pierrejoye Apr 15, 2012

Contributor

IS_OBJECT should be there too, calling convert_to_string if necessary (will deal with the existence of __toString or not.

This comment has been minimized.

@pierrejoye

pierrejoye Apr 15, 2012

Contributor

convert_to_string_ex afair :)

@pierrejoye

pierrejoye Apr 15, 2012

Contributor

convert_to_string_ex afair :)

This comment has been minimized.

@ramsey

ramsey Apr 16, 2012

Contributor

I've added IS_OBJECT.

@ramsey

ramsey Apr 16, 2012

Contributor

I've added IS_OBJECT.

@nikic

View changes

ext/standard/array.c
+ }
+
+ Z_ADDREF_PP(zvalue);
+ zend_hash_next_index_insert(HASH_OF(return_value), (void **)zvalue, sizeof(zval *), NULL);

This comment has been minimized.

@nikic

nikic Apr 14, 2012

Member

As you already know that return_value is an array you probably should use Z_ARRVAL_P instead of HASH_OF.

Or even easier just use the add_next_index_zval API function:

add_next_index_zval(return_value, *zvalue);
@nikic

nikic Apr 14, 2012

Member

As you already know that return_value is an array you probably should use Z_ARRVAL_P instead of HASH_OF.

Or even easier just use the add_next_index_zval API function:

add_next_index_zval(return_value, *zvalue);

This comment has been minimized.

@ramsey

ramsey Apr 14, 2012

Contributor

Thanks. I've updated it to use add_next_index_zval().

@ramsey

ramsey Apr 14, 2012

Contributor

Thanks. I've updated it to use add_next_index_zval().

@cataphract

This comment has been minimized.

Show comment
Hide comment
@cataphract

cataphract Apr 16, 2012

Contributor

I'd rather have a more general mechanism for extracting subsets of arrays, like in Mathematica:

In[6]:= {{1, 2}, {3, 4}, {5, 6, 7}}[[All, 2;;]]
Out[6]= {{2}, {4}, {6, 7}}

In[7]:= {{1, 2}, {3, 4}, {5, 6, 7}}[[All, 2]]
Out[7]= {2, 4, 6}

In[8]:= {{1, 2}, {3, 4}, {5, 6, 7}}[[{1,3}, 2]]
Out[8]= {2, 6}

but I guess this is better than nothing.

Contributor

cataphract commented Apr 16, 2012

I'd rather have a more general mechanism for extracting subsets of arrays, like in Mathematica:

In[6]:= {{1, 2}, {3, 4}, {5, 6, 7}}[[All, 2;;]]
Out[6]= {{2}, {4}, {6, 7}}

In[7]:= {{1, 2}, {3, 4}, {5, 6, 7}}[[All, 2]]
Out[7]= {2, 4, 6}

In[8]:= {{1, 2}, {3, 4}, {5, 6, 7}}[[{1,3}, 2]]
Out[8]= {2, 6}

but I guess this is better than nothing.

@ramsey

This comment has been minimized.

Show comment
Hide comment
@ramsey

ramsey Apr 18, 2012

Contributor

@cataphract, can you suggest how that might look like PHP? I don't think my implementation is too far off. With some slight modifications, it might be possible to achieve something similar to what you're asking.

Contributor

ramsey commented Apr 18, 2012

@cataphract, can you suggest how that might look like PHP? I don't think my implementation is too far off. With some slight modifications, it might be possible to achieve something similar to what you're asking.

Merge branch 'PHP-5.3' into feature/array_column
* PHP-5.3:
  Fix bug 61746 Failing tests in ext/standard/tests/file/windows_links/*
  Fix bug 61720 ext\libxml\tests\bug61367-read.phpt fails
  Fix bug 61719 ext\soap\tests\bugs\bug31422.phpt fails
  Fix bug 61718 ext\ldap\tests\ldap_set_rebind_proc_error.phpt fails
  Fix bug 61717 ext\ldap\tests\ldap_sasl_bind_basic.phpt fails
  Fix bug 61716 tests\basic\021.phpt fails
  Fix bug 61683 tests\basic\bug20539.phpt fails
  Fix bug 61680 ext\zlib\tests\gzencode_variation1-win32.phpt fails
  Fix bug 61676 ext\tidy\tests\bug54682.phpt fails
  Fix bug 61743 tests in ext\standard\tests\file\windows_acls\* fail
@pierrejoye

This comment has been minimized.

Show comment
Hide comment
@pierrejoye

pierrejoye Apr 18, 2012

Contributor

@ramsey

btw, pls put a RFC together and send a notification mail to internals once you are ready, referring to this discussion too to avoid to have to discuss everything again from scratch :)

Contributor

pierrejoye commented Apr 18, 2012

@ramsey

btw, pls put a RFC together and send a notification mail to internals once you are ready, referring to this discussion too to avoid to have to discuss everything again from scratch :)

@cataphract

This comment has been minimized.

Show comment
Hide comment
@cataphract

cataphract Apr 18, 2012

Contributor

@ramsey You could for instance use null mean both All and to specify a range. This can be sorted out later. For instance, for my examples above:

array_part($foo, [null, [2, null]]);
array_part($foo, [null, 2]);
array_part($foo, [[1, 3], 2]);
Contributor

cataphract commented Apr 18, 2012

@ramsey You could for instance use null mean both All and to specify a range. This can be sorted out later. For instance, for my examples above:

array_part($foo, [null, [2, null]]);
array_part($foo, [null, 2]);
array_part($foo, [[1, 3], 2]);
@smalyshev

This comment has been minimized.

Show comment
Hide comment
@smalyshev

smalyshev May 2, 2012

Contributor

I think we shouldn't try to boil the ocean here. array_column is very useful on its own. If we will need super-method to do everything and make fresh coffee, we could add it later, but that IMHO should not hold adding array_column.

Contributor

smalyshev commented May 2, 2012

I think we shouldn't try to boil the ocean here. array_column is very useful on its own. If we will need super-method to do everything and make fresh coffee, we could add it later, but that IMHO should not hold adding array_column.

@dsp

This comment has been minimized.

Show comment
Hide comment
@dsp

dsp Jun 6, 2012

Member

Yes I agree. I think array_column is good enough to be added. I think we are still waiting for an RFC.

Member

dsp commented Jun 6, 2012

Yes I agree. I think array_column is good enough to be added. I think we are still waiting for an RFC.

ramsey added some commits Jun 21, 2012

Merge branch 'PHP-5.3' into feature/array_column
* PHP-5.3: (160 commits)
  Remove outdated and user-specific files
  Add NEWS for bug #62262
  Fixed bug RecursiveArrayIterator does not implement Countable
  sync zip ext version with pecl
  one more correction for COM upgrading notes
  split gzgetc_basic.phpt for zlib 1.2.7
  com ext upgrading correction
  com ext upgrading info
  Reverted the BC fix regarding to #57905, test adopted
  Merge PHP 5.3.14 NEWS
  re-add 61755 to NEWS
  Make travis silent
  Adding a test for ext/posix/tests/posix_getegid_basic.phpt
  typo
  improve overflow checks
  fix potential overflow in _php_stream_scandir
  set current versions for libzip and zip ext
  updated NEWS
  zip windows fixes
  fixed bc break related to #57905
  ...
@ramsey

This comment has been minimized.

Show comment
Hide comment
@ramsey

ramsey Jun 21, 2012

Contributor

I finally got around to creating an RFC and submitting this to internals:

Contributor

ramsey commented Jun 21, 2012

I finally got around to creating an RFC and submitting this to internals:

@ccampbell

This comment has been minimized.

Show comment
Hide comment
@ccampbell

ccampbell Jul 12, 2012

Hey @ramsey, I think think this proposal is a great idea (I do stuff like this all the time), but how do you feel about extending it slightly.

For example if you start with the following array:

$data = array(
    array('id' => 1, 'name' => 'bob', 'valid' => 1, 'email' => 'bob@yahoo.com'),
    array('id' => 2, 'name' => 'joe', 'valid' => 1, 'email' => 'joe@gmail.com'),
    array('id' => 3, 'name' => 'allison', 'valid' => 1, 'email' => 'allison@gmail.com')
);

Then you do array_column($data, 'name'); it would return

array(
    'bob',
    'joe',
    'allison
)

But what if you could do array_column($data, 'id', 'name', 'email') or array_column($data, ['id', 'name', 'email']) and have that return

array(
    'id' => array(
        1,
        2,
        3
    ),
    'name' => array(
        'bob',
        'joe',
        'allison'
    ),
    'email' => array(
        'bob@yahoo.com',
        'joe@gmail.com',
        'allison@gmail.com'
    )
)

This allows you to basically achieve the same thing but instead of having to make multiple calls you could get a bunch of arrays back in one iteration.

Not sure if this would really hurt performance in which case maybe it could be a separate method called array_columns, but I think it could be useful.

We actually do this at Vimeo quite a bit. Functions like array_multisort require the data in this format.

Hey @ramsey, I think think this proposal is a great idea (I do stuff like this all the time), but how do you feel about extending it slightly.

For example if you start with the following array:

$data = array(
    array('id' => 1, 'name' => 'bob', 'valid' => 1, 'email' => 'bob@yahoo.com'),
    array('id' => 2, 'name' => 'joe', 'valid' => 1, 'email' => 'joe@gmail.com'),
    array('id' => 3, 'name' => 'allison', 'valid' => 1, 'email' => 'allison@gmail.com')
);

Then you do array_column($data, 'name'); it would return

array(
    'bob',
    'joe',
    'allison
)

But what if you could do array_column($data, 'id', 'name', 'email') or array_column($data, ['id', 'name', 'email']) and have that return

array(
    'id' => array(
        1,
        2,
        3
    ),
    'name' => array(
        'bob',
        'joe',
        'allison'
    ),
    'email' => array(
        'bob@yahoo.com',
        'joe@gmail.com',
        'allison@gmail.com'
    )
)

This allows you to basically achieve the same thing but instead of having to make multiple calls you could get a bunch of arrays back in one iteration.

Not sure if this would really hurt performance in which case maybe it could be a separate method called array_columns, but I think it could be useful.

We actually do this at Vimeo quite a bit. Functions like array_multisort require the data in this format.

@Abeja

This comment has been minimized.

Show comment
Hide comment
@Abeja

Abeja Jul 13, 2012

Hello @ramsey, in my work I often use a similar function, perhaps it will be interesting to you:

function array_collect( $Items , $IndexKey , $ValueKey ) {
    if ( is_null( $IndexKey ) && is_null( $ValueKey ) || !count( $Items ) ) return $Items;
    $Result = array();
    if ( !is_null( $IndexKey ) ) {
        $Keys = is_array( $IndexKey ) ? array_values( $IndexKey ) : array( $IndexKey );
        $Key = ( is_null( end( $Keys ) ) && !array_pop( $Keys ) ) ? '[]' : '';
        $Code = '$Result[ $Item[ $Keys[ ' . implode( '] ] ][ $Item[ $Keys[ ' , array_keys( $Keys ) ) . '] ] ]' . $Key . ' = is_null( $ValueKey ) ? $Item : $Item[ $ValueKey ];';
        foreach ( $Items as $Item ) eval( $Code );
    } else
        foreach ( $Items as $Item ) 
            $Result[] = is_null( $ValueKey ) ? $Item : $Item[ $ValueKey ]; 
    return $Result;                             
}

Examples:

$records = array(
    array( 'id' => 1 , 'parent-id' => 7 , 'name' => 'Doe'),
    array( 'id' => 2 , 'parent-id' => 3 , 'name' => 'Smith'),
    array( 'id' => 4 , 'parent-id' => 7 , 'name' => 'Jones'),
);

array_collect( $records , array( 'parent-id' , 'id' ) , 'name' )
// => array( 7 => array( 1 => 'Doe' , 4 => 'Jones' ), 3 => array( 2 => 'Smith' ) )

array_collect( $records , array( 'parent-id' , null ) , 'name' )
// => array( 7 => array( 'Doe' , 'Jones' ), 3 => array ('Smith' ) )

array_collect( $records , null , 'name' )
// => array( 'Doe' , 'Smith' , 'Jones' )

array_collect( $records , 'id' , null )
/* => 
    array(
        1 => array( 'id' => 1 , 'parent-id' => 7 , 'name' => 'Doe'),
        2 => array( 'id' => 2 , 'parent-id' => 3 , 'name' => 'Smith'),
        4 => array( 'id' => 4 , 'parent-id' => 7 , 'name' => 'Jones'),
    );
*/

Abeja commented Jul 13, 2012

Hello @ramsey, in my work I often use a similar function, perhaps it will be interesting to you:

function array_collect( $Items , $IndexKey , $ValueKey ) {
    if ( is_null( $IndexKey ) && is_null( $ValueKey ) || !count( $Items ) ) return $Items;
    $Result = array();
    if ( !is_null( $IndexKey ) ) {
        $Keys = is_array( $IndexKey ) ? array_values( $IndexKey ) : array( $IndexKey );
        $Key = ( is_null( end( $Keys ) ) && !array_pop( $Keys ) ) ? '[]' : '';
        $Code = '$Result[ $Item[ $Keys[ ' . implode( '] ] ][ $Item[ $Keys[ ' , array_keys( $Keys ) ) . '] ] ]' . $Key . ' = is_null( $ValueKey ) ? $Item : $Item[ $ValueKey ];';
        foreach ( $Items as $Item ) eval( $Code );
    } else
        foreach ( $Items as $Item ) 
            $Result[] = is_null( $ValueKey ) ? $Item : $Item[ $ValueKey ]; 
    return $Result;                             
}

Examples:

$records = array(
    array( 'id' => 1 , 'parent-id' => 7 , 'name' => 'Doe'),
    array( 'id' => 2 , 'parent-id' => 3 , 'name' => 'Smith'),
    array( 'id' => 4 , 'parent-id' => 7 , 'name' => 'Jones'),
);

array_collect( $records , array( 'parent-id' , 'id' ) , 'name' )
// => array( 7 => array( 1 => 'Doe' , 4 => 'Jones' ), 3 => array( 2 => 'Smith' ) )

array_collect( $records , array( 'parent-id' , null ) , 'name' )
// => array( 7 => array( 'Doe' , 'Jones' ), 3 => array ('Smith' ) )

array_collect( $records , null , 'name' )
// => array( 'Doe' , 'Smith' , 'Jones' )

array_collect( $records , 'id' , null )
/* => 
    array(
        1 => array( 'id' => 1 , 'parent-id' => 7 , 'name' => 'Doe'),
        2 => array( 'id' => 2 , 'parent-id' => 3 , 'name' => 'Smith'),
        4 => array( 'id' => 4 , 'parent-id' => 7 , 'name' => 'Jones'),
    );
*/
@pierrejoye

This comment has been minimized.

Show comment
Hide comment
@pierrejoye

pierrejoye Sep 5, 2012

Contributor

What's the status of this PR? Are we ready to vote on the RFC?

Contributor

pierrejoye commented Sep 5, 2012

What's the status of this PR? Are we ready to vote on the RFC?

@ramsey

This comment has been minimized.

Show comment
Hide comment
@ramsey

ramsey Sep 5, 2012

Contributor

I need to make some changes, based on feedback.

Contributor

ramsey commented Sep 5, 2012

I need to make some changes, based on feedback.

@lstrojny

This comment has been minimized.

Show comment
Hide comment
@lstrojny

lstrojny Jan 6, 2013

Contributor

@ramsey could you call a vote on internals on this function?

Contributor

lstrojny commented Jan 6, 2013

@ramsey could you call a vote on internals on this function?

@ramsey

This comment has been minimized.

Show comment
Hide comment
@ramsey

ramsey Jan 11, 2013

Contributor

Closing this pull request. It is superseded by pull request #257.

Contributor

ramsey commented Jan 11, 2013

Closing this pull request. It is superseded by pull request #257.

@ramsey ramsey closed this Jan 11, 2013

@Majkl578

This comment has been minimized.

Show comment
Hide comment
@Majkl578

Majkl578 Jan 12, 2013

Contributor

@ramsey: Next time, there is no need for new issue, just rebase your branch and force-push it, history will be updated here automatically.

Contributor

Majkl578 commented Jan 12, 2013

@ramsey: Next time, there is no need for new issue, just rebase your branch and force-push it, history will be updated here automatically.

salathe added a commit to salathe/php-src that referenced this pull request Mar 11, 2013

dsp added a commit that referenced this pull request Mar 20, 2013

Merge branch 'pull-request/257' into PHP-5.5
* pull-request/257:
  array_column: Fix compile-time warnings
  array_column: Removed array_pluck() alias
  array_column: Set array_pluck as an alias for array_column
  array_column: Implement ability to specify an index column
  Cleaning up a memory leak.
  array_column: Adding test for IS_OBJECT and converting object to string
  array_column: Using add_next_index_zval() at nikic's recommendation.
  array_column: Improved tests
  array_column: Cleaning up, as recommended in pull request #56 comments
  Fixing typo in test for array_column()
  Simplify the code and use zend_hash_next_index_insert()
  Adding test for columns not present in all rows for array_column().
  Adding tests for the negative results of array_column()
  Implement new array function array_column()

dsp added a commit that referenced this pull request Mar 20, 2013

Merge branch 'PHP-5.5'
* PHP-5.5:
  NEWS for array_column
  array_column: Fix compile-time warnings
  array_column: Removed array_pluck() alias
  array_column: Set array_pluck as an alias for array_column
  array_column: Implement ability to specify an index column
  Cleaning up a memory leak.
  array_column: Adding test for IS_OBJECT and converting object to string
  array_column: Using add_next_index_zval() at nikic's recommendation.
  array_column: Improved tests
  array_column: Cleaning up, as recommended in pull request #56 comments
  Fixing typo in test for array_column()
  Simplify the code and use zend_hash_next_index_insert()
  Adding test for columns not present in all rows for array_column().
  Adding tests for the negative results of array_column()
  Implement new array function array_column()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment