Implement new array function array_column() #257

Closed
wants to merge 14 commits into from

Conversation

@ramsey
Member

ramsey commented Jan 11, 2013

This pull request supersedes pull request #56. I have cleaned it up and have rebased branch PHP-5.3 onto my branch.

This pull request also includes new work as a result of feedback received on the original pull request and mailing list discussion.

References:

@sc0ttkclark

Ready for this!

@ramsey
Member Author

ramsey commented Jan 12, 2013

Thanks to the push from @lstrojny, I've opened up voting for this:
http://news.php.net/php.internals/64870

@asgrim
Contributor

asgrim commented Jan 12, 2013

I don't think there is a need for the alias, is there? Surely aliases are just for backwards compatibility? Apologies if that was already discussed on the other PR; I'm on my phone with a slow connection. Apart from that, this looks useful! 👍

@fititnt

fititnt commented Jan 12, 2013

👍

return;
}

switch (Z_TYPE_P(zcolumn)) {
Contributor

What about double and resource types? array_column() could handle them as well.

@oaass

oaass commented Jan 20, 2013

Really looking forward to this! (Y)

@ccampbell

@ramsey I still think it would be useful if there was a way to get multiple columns back with a single call by passing an array like we had talked about. Is there any plan for this?

Example

$records = array(
    array(
        'id' => 2135,
        'first_name' => 'John',
        'last_name' => 'Doe'
    ),
    array(
        'id' => 3245,
        'first_name' => 'Sally',
        'last_name' => 'Smith'
    ),
    array(
        'id' => 5342,
        'first_name' => 'Jane',
        'last_name' => 'Jones'
    )
);

$results = array_column($records, ['id', 'first_name']);
print_r($results);

would output

Array
(
    [id] => Array
        (
            [0] => 2135
            [1] => 3245
            [2] => 5342
        )

    [first_name] => Array
        (
            [0] => John
            [1] => Sally
            [2] => Jane
        )

)

If the third argument was specified it would work like this:

$results = array_column($records, ['first_name', 'last_name'], 'id');
print_r($results);

would return

Array
(
    [first_name] => Array
        (
            [2135] => John
            [3245] => Sally
            [5342] => Jane
        )

    [last_name] => Array
        (
            [2135] => Doe
            [3245] => Smith
            [5342] => Jones
        )

)

It doesn't change any of the existing functionality, and it would save having to make multiple calls to array_column in order to extract multiple columns.
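
A minimal userland sketch of the column-major output proposed above, assuming PHP 5.5+ where array_column() is available. The helper name array_columns_by_name() is hypothetical and not part of this pull request; it simply makes one array_column() call per requested column.

```php
<?php
// Hypothetical helper (not part of the PR): returns one list per requested column.
function array_columns_by_name(array $records, array $columns, $indexKey = null)
{
    $result = array();
    foreach ($columns as $column) {
        // One array_column() call per column; $indexKey may be null.
        $result[$column] = array_column($records, $column, $indexKey);
    }
    return $result;
}

$records = array(
    array('id' => 2135, 'first_name' => 'John', 'last_name' => 'Doe'),
    array('id' => 3245, 'first_name' => 'Sally', 'last_name' => 'Smith'),
    array('id' => 5342, 'first_name' => 'Jane', 'last_name' => 'Jones'),
);

$byColumn = array_columns_by_name($records, array('first_name', 'last_name'), 'id');
print_r($byColumn);
```

This still iterates over the records once per column internally, so it only saves typing, not work.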

@hakre
Contributor

hakre commented Jan 23, 2013

@ccampbell But how would you decide in which arrangement to return the array keys? Your example, for instance, swaps the two axes. From a straightforward point of view, I'd say this should not be part of the implementation, and the following kind of output looks more straightforward to me (no preference given):

$results = array_column($records, ['id', 'first_name']);
print_r($results);

Array
(
    [0] => Array
        (
            [id] => 2135
            [first_name] => John
        )

    [1] => Array
        (
            [id] => 3245
            [first_name] => Sally
        )

    [2] => Array
        (
            [id] => 5342
            [first_name] => Jane
        )

)

And the second example:

$results = array_column($records, ['id', 'first_name'], 'id');
print_r($results);

Array
(
    [2135] => Array
        (
            [id] => 2135
            [first_name] => John
        )

    [3245] => Array
        (
            [id] => 3245
            [first_name] => Sally
        )

    [5342] => Array
        (
            [id] => 5342
            [first_name] => Jane
        )

)

This output, by the way, is compatible with the existing one, meaning you could apply array_column a second time.
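
The row-wise output shown above can be approximated in userland today. This is a sketch only: array_pick_columns() is a hypothetical name, and it assumes every row is an array that contains the index key when one is given.

```php
<?php
// Hypothetical helper: keeps only the requested columns in each row.
function array_pick_columns(array $records, array $columns, $indexKey = null)
{
    // Flip so the column names become keys for array_intersect_key().
    $wanted = array_flip($columns);
    $result = array();
    foreach ($records as $row) {
        $picked = array_intersect_key($row, $wanted);
        if ($indexKey !== null && array_key_exists($indexKey, $row)) {
            $result[$row[$indexKey]] = $picked;
        } else {
            $result[] = $picked;
        }
    }
    return $result;
}

$records = array(
    array('id' => 2135, 'first_name' => 'John', 'last_name' => 'Doe'),
    array('id' => 3245, 'first_name' => 'Sally', 'last_name' => 'Smith'),
    array('id' => 5342, 'first_name' => 'Jane', 'last_name' => 'Jones'),
);

$rows = array_pick_columns($records, array('id', 'first_name'), 'id');

// The result is still a list of rows, so array_column() can be applied again.
$firstNames = array_column($rows, 'first_name');
```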

@ccampbell

@hakre your proposed output doesn't achieve the same thing as the purpose of this function though. It is just filtering out columns from the data set.

It doesn't make sense to me that if you pass in just 'id' you would get an array of ids that you can iterate over directly as ids (foreach ($ids as $id)), but if you pass in ['id', 'first_name'] you can't iterate over first names any differently than you could with the data set you started with. If you only wanted id and first_name like your first example, why not only select those columns from the database and save yourself the call completely?

Perhaps a use case would make more sense. The primary use case I think for getting back the data the way I proposed is for heavy traffic applications where you want to make data more cacheable at the database and sort your dataset in php. We do this at @vimeo pretty heavily and I'm pretty sure other people do as well. For example if you wanted to sort your user ids by last name you could do

$results = array_column($records, ['first_name', 'last_name'], 'id');
natsort($results['last_name']);
$ids = array_keys($results['last_name']);

Now you easily have all your user ids sorted alphabetically by last name.

What if you now wanted to sort by last name, but fall back to first name when people share the same last name?

It would look something like

$results = array_column($records, ['first_name', 'last_name'], 'id');
$ids = array_keys($results['last_name']);
array_multisort($results['last_name'], SORT_ASC, $results['first_name'], SORT_ASC, $ids);

In your format to do this you would be right back where you started and would have to build the arrays of first_names and last_names manually.

Check out the documentation for http://php.net/manual/en/function.array-multisort.php. The expected format of the data is basically the same way that I proposed because it is the most efficient way to do sorting.
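
For comparison, the same ordering can be produced with the single-column array_column() from this pull request, at the cost of one call per column. A minimal sketch, assuming PHP 5.5+ and the sample records used earlier in this thread:

```php
<?php
$records = array(
    array('id' => 2135, 'first_name' => 'John', 'last_name' => 'Doe'),
    array('id' => 3245, 'first_name' => 'Sally', 'last_name' => 'Smith'),
    array('id' => 5342, 'first_name' => 'Jane', 'last_name' => 'Jones'),
);

// One call per column: this is the repetition the array form would avoid.
$ids        = array_column($records, 'id');
$firstNames = array_column($records, 'first_name');
$lastNames  = array_column($records, 'last_name');

// Sort by last name, then first name; $ids is reordered to match.
array_multisort($lastNames, SORT_ASC, $firstNames, SORT_ASC, $ids);

print_r($ids);
```

The ids are passed as a separate list because array_multisort() re-indexes numeric keys, so ids stored as array keys would not survive the sort.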

@hakre
Contributor

hakre commented Jan 24, 2013

@ccampbell: I did not propose any output. I just wrote that one might equally expect something different. When there is more than one dimension in the output, there is more than one way to arrange it. That's all.

@Abeja

Abeja commented Jan 29, 2013

Why can't "index_key" be an array?
It could be useful for building tree indexes.

Examples:

$records = array(
    array( 'id' => 1 , 'parent-id' => 7 , 'name' => 'Doe'),
    array( 'id' => 2 , 'parent-id' => 3 , 'name' => 'Smith'),
    array( 'id' => 4 , 'parent-id' => 7 , 'name' => 'Jones'),
);

array_column( $records , 'name' , array( 'parent-id' , 'id' ) )
// => array( 7 => array( 1 => 'Doe' , 4 => 'Jones' ), 3 => array( 2 => 'Smith' ) )

The result is a collection of children grouped by the 'parent-id' key, where 'id' is the subkey and 'name' is the value.

More examples: #56 (comment)

P.S. It is also compatible with the current array_column and with the multi-column form proposed above:

$records = array(
    array( 'id' => 1 , 'parent-id' => 7 , 'name' => 'Doe' , 'sex' => 'male' ),
    array( 'id' => 2 , 'parent-id' => 3 , 'name' => 'Smith' , 'sex' => 'female' ),
    array( 'id' => 4 , 'parent-id' => 7 , 'name' => 'Jones' , 'sex' => 'male' ),
);

array_column( $records , array( 'name' , 'sex' ) , array( 'parent-id' , 'id' ) )
/* => 
array(
    7 => array( 
        1 => array( 'Doe' , 'male' ) , 
        4 => array( 'Jones' , 'male' ) 
    ), 
    3 => array( 
        2 => array( 'Smith' , 'female' ) 
    ) 
)
*/

For myself, I call this array_collect
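
A userland sketch of the nested indexing described in this comment. The function name array_collect is taken from the comment, but the body below is an assumption about how it might work, not the commenter's code; it assumes every row contains all requested keys, and a later row overwrites an earlier one with the same key path.

```php
<?php
// Sketch only: $column is a column name or a list of names,
// $indexKeys is the list of columns that form the nested key path.
function array_collect(array $records, $column, array $indexKeys)
{
    $result = array();
    foreach ($records as $row) {
        // Pick a single value, or a list of values when $column is an array.
        if (is_array($column)) {
            $value = array();
            foreach ($column as $name) {
                $value[] = $row[$name];
            }
        } else {
            $value = $row[$column];
        }

        // Walk down the key path by reference, creating levels as needed.
        $node = &$result;
        foreach ($indexKeys as $key) {
            $node = &$node[$row[$key]];
        }
        $node = $value;
        unset($node); // break the reference before the next row
    }
    return $result;
}

$records = array(
    array('id' => 1, 'parent-id' => 7, 'name' => 'Doe', 'sex' => 'male'),
    array('id' => 2, 'parent-id' => 3, 'name' => 'Smith', 'sex' => 'female'),
    array('id' => 4, 'parent-id' => 7, 'name' => 'Jones', 'sex' => 'male'),
);

$tree  = array_collect($records, 'name', array('parent-id', 'id'));
$pairs = array_collect($records, array('name', 'sex'), array('parent-id', 'id'));
```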

@dsp
Member

dsp commented Mar 20, 2013

merged. thank you.

@dsp dsp closed this Mar 20, 2013
@wilmoore

Nice...Would be great if this function would check whether the value is "callable" and if so, call it and use the returned value.

@ramsey
Member Author

ramsey commented Mar 31, 2013

@wilmoore Which parameter are you interested in having check whether the value is callable? All of them?

@ccampbell I've been considering your request, but I find myself agreeing with @hakre on the expected output of array_column() if the second parameter is an array. It also lines up perfectly with this bug request: https://bugs.php.net/bug.php?id=64493. In this bug request, the request is to allow NULL to be passed as the second parameter, which would return an array that is nearly identical to the input array but could index each row by the third parameter.

Thoughts?
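
To illustrate the NULL form from that bug report: released PHP versions (5.5 and later) accept null as the second argument and return the complete rows, re-indexed by the third argument. A short sketch using the sample records from this thread:

```php
<?php
$records = array(
    array('id' => 2135, 'first_name' => 'John', 'last_name' => 'Doe'),
    array('id' => 3245, 'first_name' => 'Sally', 'last_name' => 'Smith'),
    array('id' => 5342, 'first_name' => 'Jane', 'last_name' => 'Jones'),
);

// null as the column key keeps each row whole; 'id' becomes the outer key.
$byId = array_column($records, null, 'id');

echo $byId[3245]['first_name'], "\n";
```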

@ccampbell

@ramsey I agree with you that @hakre's comment and that bug definitely fit together better. I do think most people would probably expect that output more than the one I proposed even though that functionality is already possible using array_intersect_key($data, array_flip($desired_columns)).

What I am trying to achieve is not having to make multiple calls to array_column, which that solution does not address.

Starting with

$records = array(
    array(
        'id' => 2135,
        'first_name' => 'John',
        'last_name' => 'Doe'
    ),
    array(
        'id' => 3245,
        'first_name' => 'Sally',
        'last_name' => 'Smith'
    ),
    array(
        'id' => 5342,
        'first_name' => 'Jane',
        'last_name' => 'Jones'
    )
);

The only way to get a list of ids [2135, 3245, 5342] and first_names ['John', 'Sally', 'Jane'] would be to call

$ids = array_column($records, 'id');
$names = array_column($records, 'first_name');

Calling

$data = array_column($records, ['id', 'first_name']);

Would just filter out the last_name column and not actually pluck out the single columns I am looking for. It's not the end of the world. I bet more people are trying to filter out columns from the data set than trying to grab single column lists. Maybe array_multisort is the actual problem for expecting the data in that format.

@hakre
Contributor

hakre commented Mar 31, 2013

@ccampbell: Good point about array_intersect_key($data, array_flip($desired_columns)), and also the array_multisort reference earlier. I like array_multisort. My initial suggestion now looks a bit short-sighted.

@wilmoore

> Which parameter are you interested in having check whether the value is callable? All of them?

I was thinking, any callable returned values might be unwrapped regardless of which parameters were given. Further thought brings me to:

What to do when you want the raw function/callable (for whatever reason)? At that point, you'd have to add yet another optional parameter to satisfy that need. That could get out of hand quickly.
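
A sketch of what the unwrapping idea might look like in userland, including the extra opt-out parameter mentioned above. The name array_column_unwrap() and its $unwrap flag are hypothetical. The check uses `instanceof Closure` rather than is_callable(), on the assumption that plain strings such as 'date' should not be treated as function names.

```php
<?php
// Hypothetical helper: extracts a column and invokes any closures found in it.
function array_column_unwrap(array $records, $column, $unwrap = true)
{
    $values = array_column($records, $column);
    if (!$unwrap) {
        // Caller wants the raw closures back.
        return $values;
    }
    return array_map(function ($value) {
        return $value instanceof Closure ? $value() : $value;
    }, $values);
}

$records = array(
    array('id' => 1, 'name' => 'Doe'),
    array('id' => 2, 'name' => function () { return 'Smith'; }),
);

$names = array_column_unwrap($records, 'name');
$raw   = array_column_unwrap($records, 'name', false);
```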
