Proposal: Add iterable\any(iterable $input, ?callable $cb=null), all(...), none(...), find(...), reduce(...) #6053
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,6 +2,8 @@ | |
|
||
/** @generate-class-entries */ | ||
|
||
namespace { | ||
|
||
final class __PHP_Incomplete_Class | ||
{ | ||
} | ||
|
@@ -1510,3 +1512,18 @@ function sapi_windows_set_ctrl_handler(?callable $handler, bool $add = true): bo | |
|
||
function sapi_windows_generate_ctrl_event(int $event, int $pid = 0): bool {} | ||
#endif | ||
} // end global namespace | ||
|
||
namespace iterable { | ||
|
||
function any(iterable $iterable, ?callable $callback = null): bool {} | ||
|
||
function all(iterable $iterable, ?callable $callback = null): bool {} | ||
|
||
function none(iterable $iterable, ?callable $callback = null): bool {} | ||
|
||
function reduce(iterable $iterable, callable $callback, mixed $initial = null): mixed {} | ||
|
||
function find(iterable $iterable, callable $callback, mixed $default = null): mixed {} | ||
Can we remove this one? I'd rather add filter, map, and flatmap if we need more functions.

find() is also useful, and many other programming languages include both (JS, Haskell, etc.). For example, if you have an array of a million elements and only want the first match, it is much more efficient to call find() if the iterable contains a matching value (and there would be fewer service calls and DB calls) compared to

Additionally, filter() and map() would be waiting on the existence of CachedIterable when it starts, because Traversables can have repeated keys.

This does not require a CachedIterable; they can return any iterator. Edit: on second thought, they should not return a cached iterable. These routines are often chained together; if every piece of the chain caches its results, it will balloon memory usage.

I mean that there's nothing in the standard library yet that supports rewinding, counting, and especially arbitrary offset access. It seems much more difficult to use without support for rewindability, random/repeated offset access, countability, etc. But yes, I suppose you could hide the implementation entirely with InternalIterator and only support a single iteration.

CachedIterable is basically the name chosen for a rewindable immutable key-value sequence. It isn't cached permanently; it has a regular object lifetime. I'm referring to https://wiki.php.net/rfc/cachediterable

Yes, I am specifically saying they should not return this object. It will balloon memory usage to do it that way. Think about it; you have a map + filter plus some terminator like

Iterables are made from arrays and generators. Since there is already a large body of functions which work with arrays, it seems reasonable to assume they would use an

No, it's much better to compose something if you want it to be eager, e.g.

Iterables are made from arrays and Traversable objects such as SplObjectStorage, ArrayObject, user-defined data structures, and Generators.

In common use cases, the memory increase may be small, especially if the lifetime of the variable or temporary expression holding the result is short. Additionally, a lazy iterable would require keeping a reference to the previous (possibly lazy) iterable, and the iterables/arrays those reference - if the initial iterable is larger than the result then eagerly evaluating

That seems error prone and I'm personally opposed to that. PHP has typically been imperative rather than functional, and focused on "cater[ing] to the skill-levels and platforms of a wide range of users", as RFC authors are repeatedly reminded in https://wiki.php.net/rfc/template, and my interpretation of that is that imperative would be much more acceptable (aside: the loosely typed language part seems less applicable nowadays).

Lazy data structures would be easy to misuse (consuming twice, attempting to serialize or encode, var_dump or inspect with Xdebug, or logging the full iterable and thereby consuming it twice, etc.) without (or even with) linters and static analyzers, so this really doesn't seem like catering to a wide range of users. Explicitly using a different family of functions to act on generators internally would probably make more sense than being the default, e.g. https://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html#findFirst-- and https://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html (Streams are separate from java.util.Collection in Java, JavaScript eagerly evaluates https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/Map, etc.)

Having the eager version be longer than the lazy version (instead of shorter or the same length) would also encourage the use of the lazy version, which I'd objected to for being error prone and easy to misuse. |
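The early-exit argument for find() made in this thread can be sketched in userland. This is a minimal illustration, not the PR's C implementation: `find_sketch()` is a hypothetical stand-in for the proposed `iterable\find()`, and using `array_filter()` plus `reset()` as the comparison is an assumption, since the thread's own comparison snippet is not shown.

```php
<?php
// Hypothetical userland stand-in for the proposed iterable\find().
function find_sketch(iterable $iterable, callable $callback, mixed $default = null): mixed
{
    foreach ($iterable as $value) {
        if ($callback($value)) {
            return $value; // stop at the first match
        }
    }
    return $default; // no element matched
}

$calls = 0;
$isEven = function (int $n) use (&$calls): bool {
    ++$calls; // count how often the predicate runs
    return $n % 2 === 0;
};

$data = range(1, 1000);

$first = find_sketch($data, $isEven);
$callsForFind = $calls; // predicate ran only until the first match

$calls = 0;
$matches = array_filter($data, $isEven);
$firstViaFilter = reset($matches);
$callsForFilter = $calls; // predicate ran once per element

printf(
    "find: %d after %d calls; filter: %d after %d calls\n",
    $first,
    $callsForFind,
    $firstViaFilter,
    $callsForFilter
);
// prints "find: 2 after 2 calls; filter: 2 after 1000 calls"
```

Both approaches return the same value here; the difference is only in how many times the predicate (and anything expensive it does) is invoked.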
||
|
||
} |
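The lazy-versus-eager trade-off debated in the review thread above can be sketched with generators. `lazy_map()` and `lazy_filter()` are hypothetical helpers written for this illustration (they are not part of the PR), and using `iterator_to_array()` as the explicit "compose it if you want it eager" step is an assumption.

```php
<?php
// Hypothetical lazy helpers built on generators; names are illustrative only.
function lazy_map(iterable $iterable, callable $callback): Generator
{
    foreach ($iterable as $key => $value) {
        yield $key => $callback($value);
    }
}

function lazy_filter(iterable $iterable, callable $callback): Generator
{
    foreach ($iterable as $key => $value) {
        if ($callback($value)) {
            yield $key => $value;
        }
    }
}

// Nothing is evaluated yet; each stage only holds a reference to the previous one.
$pipeline = lazy_filter(
    lazy_map([1, 2, 3, 4], fn (int $n): int => $n * 10),
    fn (int $n): bool => $n > 15
);

// Explicitly composing an eager result; the second argument preserves keys.
$eager = iterator_to_array($pipeline, true);

// The generator is now finished, so a second traversal throws.
$secondPassError = null;
try {
    foreach ($pipeline as $value) {
        // never reached
    }
} catch (Exception $e) {
    $secondPassError = $e->getMessage();
}

var_dump($eager);
var_dump($secondPassError !== null); // bool(true)
```

The second traversal failing is the "consume twice" misuse raised in the thread; an eager array result can be iterated, counted, or logged any number of times at the cost of holding all elements in memory.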
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,68 @@ | ||
--TEST-- | ||
Test all() function | ||
--FILE-- | ||
<?php | ||
|
||
use function iterable\all; | ||
|
||
/* | ||
Prototype: bool all(iterable $iterable, ?callable $callback = null); | ||
Description: Iterate over the iterable and stop based on the return value of the callback | ||
*/ | ||
|
||
function is_int_ex($item) | ||
{ | ||
return is_int($item); | ||
} | ||
|
||
echo "\n*** Testing not enough or wrong arguments ***\n"; | ||
|
||
function dump_all(...$args) { | ||
try { | ||
var_dump(all(...$args)); | ||
} catch (Error $e) { | ||
printf("Caught %s: %s\n", $e::class, $e->getMessage()); | ||
} | ||
} | ||
|
||
dump_all(); | ||
dump_all(true); | ||
dump_all([]); | ||
dump_all(true, function () {}); | ||
dump_all([], true); | ||
|
||
echo "\n*** Testing basic functionality ***\n"; | ||
|
||
dump_all([1, 2, 3], 'is_int_ex'); | ||
dump_all(['hello', 1, 2, 3], 'is_int_ex'); | ||
$iterations = 0; | ||
dump_all(['hello', 1, 2, 3], function($item) use (&$iterations) { | ||
++$iterations; | ||
return is_int($item); | ||
}); | ||
var_dump($iterations); | ||
|
||
echo "\n*** Testing edge cases ***\n"; | ||
|
||
dump_all([], 'is_int_ex'); | ||
|
||
echo "\nDone"; | ||
?> | ||
--EXPECT-- | ||
*** Testing not enough or wrong arguments *** | ||
Caught ArgumentCountError: iterable\all() expects at least 1 argument, 0 given | ||
Caught TypeError: iterable\all(): Argument #1 ($iterable) must be of type iterable, bool given | ||
bool(true) | ||
Caught TypeError: iterable\all(): Argument #1 ($iterable) must be of type iterable, bool given | ||
Caught TypeError: iterable\all(): Argument #2 ($callback) must be a valid callback or null, no array or string given | ||
|
||
*** Testing basic functionality *** | ||
bool(true) | ||
bool(false) | ||
bool(false) | ||
int(1) | ||
|
||
*** Testing edge cases *** | ||
bool(true) | ||
|
||
Done |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
--TEST-- | ||
Test all() function | ||
--FILE-- | ||
<?php | ||
|
||
use function iterable\all; | ||
|
||
/* | ||
Prototype: bool all(iterable $iterable, ?callable $callback = null); | ||
Description: Iterate over the iterable and stop based on the return value of the callback | ||
*/ | ||
|
||
function is_int_ex($item) | ||
{ | ||
return is_int($item); | ||
} | ||
|
||
function dump_all(...$args) { | ||
try { | ||
var_dump(all(...$args)); | ||
} catch (Error $e) { | ||
printf("Caught %s: %s\n", $e::class, $e->getMessage()); | ||
} | ||
} | ||
|
||
|
||
echo "\n*** Testing not enough or wrong arguments ***\n"; | ||
|
||
dump_all(new ArrayIterator()); | ||
dump_all(new ArrayIterator(), true); | ||
|
||
echo "\n*** Testing basic functionality ***\n"; | ||
|
||
dump_all(new ArrayIterator([1, 2, 3]), 'is_int_ex'); | ||
dump_all(new ArrayIterator(['hello', 1, 2, 3]), 'is_int_ex'); | ||
$iterations = 0; | ||
dump_all(new ArrayIterator(['hello', 1, 2, 3]), function($item) use (&$iterations) { | ||
++$iterations; | ||
return is_int($item); | ||
}); | ||
var_dump($iterations); | ||
|
||
echo "\n*** Testing edge cases ***\n"; | ||
|
||
dump_all(new ArrayIterator(), 'is_int_ex'); | ||
|
||
echo "\nDone"; | ||
?> | ||
--EXPECT-- | ||
*** Testing not enough or wrong arguments *** | ||
bool(true) | ||
Caught TypeError: iterable\all(): Argument #2 ($callback) must be a valid callback or null, no array or string given | ||
|
||
*** Testing basic functionality *** | ||
bool(true) | ||
bool(false) | ||
bool(false) | ||
int(1) | ||
|
||
*** Testing edge cases *** | ||
bool(true) | ||
|
||
Done |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,81 @@ | ||
--TEST-- | ||
Test any() function | ||
--FILE-- | ||
<?php | ||
|
||
use function iterable\any; | ||
|
||
/* | ||
Prototype: bool any(iterable $iterable, ?callable $callback = null); | ||
Description: Iterate over the iterable and stop based on the return value of the callback | ||
*/ | ||
|
||
function is_int_ex($nr) | ||
{ | ||
return is_int($nr); | ||
} | ||
|
||
echo "\n*** Testing not enough or wrong arguments ***\n"; | ||
|
||
function dump_any(...$args) { | ||
try { | ||
var_dump(any(...$args)); | ||
} catch (Error $e) { | ||
printf("Caught %s: %s\n", $e::class, $e->getMessage()); | ||
} | ||
} | ||
|
||
dump_any(); | ||
dump_any(true); | ||
dump_any([]); | ||
dump_any(true, function () {}); | ||
dump_any([], true); | ||
|
||
echo "\n*** Testing basic functionality ***\n"; | ||
|
||
dump_any(['hello', 'world'], 'is_int_ex'); | ||
dump_any(['hello', 1, 2, 3], 'is_int_ex'); | ||
$iterations = 0; | ||
dump_any(['hello', 1, 2, 3], function($item) use (&$iterations) { | ||
++$iterations; | ||
return is_int($item); | ||
}); | ||
var_dump($iterations); | ||
|
||
echo "\n*** Testing second argument to predicate ***\n"; | ||
|
||
dump_any([1, 2, 3], function($item, $key) { | ||
var_dump($key); | ||
return false; | ||
}); | ||
|
||
echo "\n*** Testing edge cases ***\n"; | ||
|
||
dump_any([], 'is_int_ex'); | ||
|
||
dump_any(['key' => 'x'], null); | ||
|
||
echo "\nDone"; | ||
?> | ||
--EXPECT-- | ||
*** Testing not enough or wrong arguments *** | ||
Caught ArgumentCountError: iterable\any() expects at least 1 argument, 0 given | ||
Caught TypeError: iterable\any(): Argument #1 ($iterable) must be of type iterable, bool given | ||
bool(false) | ||
Caught TypeError: iterable\any(): Argument #1 ($iterable) must be of type iterable, bool given | ||
Caught TypeError: iterable\any(): Argument #2 ($callback) must be a valid callback or null, no array or string given | ||
|
||
*** Testing basic functionality *** | ||
bool(false) | ||
bool(true) | ||
bool(true) | ||
int(2) | ||
|
||
*** Testing second argument to predicate *** | ||
Caught ArgumentCountError: Too few arguments to function {closure}(), 1 passed and exactly 2 expected | ||
|
||
*** Testing edge cases *** | ||
bool(false) | ||
bool(true) | ||
|
||
Done |
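The behavior these tests pin down (short-circuiting, truthiness checks when the callback is null, and the callback receiving only the value) can be summarized in a userland sketch. The `*_sketch` names are hypothetical, the PR implements the real functions in C, and defining `none` as the negation of `any` is an assumption because no `none()` test appears above.

```php
<?php
// Hypothetical userland equivalents of the proposed functions.
function any_sketch(iterable $iterable, ?callable $callback = null): bool
{
    foreach ($iterable as $value) {
        // With no callback, the value's own truthiness is tested.
        if ($callback !== null ? $callback($value) : $value) {
            return true; // short-circuit on the first truthy result
        }
    }
    return false; // also the result for an empty iterable
}

function all_sketch(iterable $iterable, ?callable $callback = null): bool
{
    foreach ($iterable as $value) {
        if (!($callback !== null ? $callback($value) : $value)) {
            return false; // short-circuit on the first falsy result
        }
    }
    return true; // vacuously true for an empty iterable
}

// Assumption: none() is the negation of any().
function none_sketch(iterable $iterable, ?callable $callback = null): bool
{
    return !any_sketch($iterable, $callback);
}

$iterations = 0;
$result = any_sketch(['hello', 1, 2, 3], function ($item) use (&$iterations) {
    ++$iterations;
    return is_int($item);
});

var_dump($result);     // bool(true)
var_dump($iterations); // int(2)
```

The iteration count of 2 matches the expected output of the `any()` test above: the predicate stops being called as soon as the first integer is seen.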

I strongly prefer removing `$initial` and adding `fold`, which always requires it:

The `fold` function doesn't throw on empty, and `reduce` will. This pattern exists in other languages, such as Kotlin.

I'm happy to do this work if you'll agree.

I feel like the inconsistency with array_reduce (which has an optional $initial=null) would draw more objections, because it makes it harder for beginners to learn the language or to switch code from array_reduce to `*reduce` intuitively.

There is an error condition in `reduce` where there is no initial value and the iterable is empty, and it should throw because there is no legal value we can return that isn't already possible in the reduction. We should not repeat the mistakes of the past. You argue in another comment that `find` is useful in other languages, and yet you don't buy that same argument here? What gives?

Looking at this again, I agree that my earlier proposal for reduce was a mistake and that it's worth changing: adding fold and either removing reduce entirely or requiring a non-empty array.
The other argument was about including a function, not a change.

I can work on `fold` tonight.

Are you okay with these signatures and semantics for `fold` and `reduce`?

I imagine there is some discussion to be had for naming the parameters to make sure named parameters are a good experience, so let me know what you think. The name `$into` I picked from Swift. I used `$by` because it's short but not an abbreviation like `acc`; my quick glance in other languages' docs didn't turn up anything better.
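A minimal sketch of semantics along these lines, under stated assumptions: `fold_sketch()` and `reduce_sketch()` are hypothetical userland names, the parameter order (callback last for fold) is inferred from the follow-up comment rather than confirmed, and throwing `ValueError` for an empty input is an assumption about which exception type would be used.

```php
<?php
// Hypothetical sketch: fold always takes an initial value and never throws on empty input.
function fold_sketch(iterable $seq, mixed $into, callable $by): mixed
{
    foreach ($seq as $value) {
        $into = $by($into, $value);
    }
    return $into; // an empty $seq simply yields the initial value
}

// Hypothetical sketch: reduce has no initial value and seeds itself from the first element.
function reduce_sketch(iterable $seq, callable $by): mixed
{
    $acc = null;
    $seeded = false;
    foreach ($seq as $value) {
        if (!$seeded) {
            $acc = $value; // first element becomes the accumulator
            $seeded = true;
            continue;
        }
        $acc = $by($acc, $value);
    }
    if (!$seeded) {
        // Assumption: ValueError; the thread only says that reduce should throw.
        throw new ValueError('reduce_sketch(): $seq must contain at least one element');
    }
    return $acc;
}

$sum = fn ($carry, $value) => $carry + $value;

var_dump(fold_sketch([], 0, $sum));       // int(0)
var_dump(reduce_sketch([1, 2, 3], $sum)); // int(6)
```

With this split, an empty input is either handled by the caller's explicit initial value (fold) or reported as an error (reduce), so no ambiguous `null` is ever returned.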

For fold, I think having the callback third would be hard to remember when it's second in other reduce() functions.
The inner implementation seems reasonable enough. I assume the ArrayObject is just for illustrating the behavior and not the internal implementation.
$seq seems like a harder-to-remember naming choice compared to the $array/$iterable used elsewhere - https://www.php.net/manual/en/function.iterator-apply.php and https://www.php.net/manual/en/function.array-walk.php - especially for non-English speakers.
PHP is already using $initial for https://www.php.net/manual/en/function.array-reduce.php and I don't see a strong reason to switch to a different name - initial has been used elsewhere (e.g. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/Reduce)

The `ArrayIterator` is just for showing behavior, yes. Notably, this will not pass `NULL` as the first parameter to the callback on the very first time it is called, unlike `array_reduce` and what this PR currently does.

As an example with the data set `[1, 3, 5, 7]`:

This will print:

And not:
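The difference in the first callback invocation can be illustrated as follows. This is a sketch, not the code from the comment or the linked branch: `seeded_reduce()` is a hypothetical helper that seeds the accumulator with the first element, and the summing callback is chosen only for illustration.

```php
<?php
// Hypothetical reduce that seeds the accumulator with the first element.
function seeded_reduce(iterable $iterable, callable $callback): mixed
{
    $carry = null;
    $seeded = false;
    foreach ($iterable as $item) {
        if ($seeded) {
            $carry = $callback($carry, $item);
        } else {
            $carry = $item; // the first element is never passed to the callback as $item
            $seeded = true;
        }
    }
    if (!$seeded) {
        throw new ValueError('seeded_reduce(): iterable must not be empty');
    }
    return $carry;
}

$data = [1, 3, 5, 7];

// array_reduce() without $initial: the first call receives NULL as the carry.
$arrayReduceCalls = [];
$arrayReduceResult = array_reduce($data, function ($carry, $item) use (&$arrayReduceCalls) {
    $arrayReduceCalls[] = [$carry, $item];
    return $carry + $item;
});

// Seeded version: the first call already has a real element as the carry.
$seededCalls = [];
$seededResult = seeded_reduce(new ArrayIterator($data), function ($carry, $item) use (&$seededCalls) {
    $seededCalls[] = [$carry, $item];
    return $carry + $item;
});

var_dump($arrayReduceCalls[0]); // carry is NULL, item is 1
var_dump($seededCalls[0]);      // carry is 1, item is 3
```

For a sum the final result is the same either way; the distinction matters for callbacks that cannot accept `null` as their first argument.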

I implemented the changes to `reduce` on this branch: https://github.com/morrisonlevi/php-src/tree/levi/any-all-iterable-checks. For some reason it wouldn't let me select your fork as the merge-base, so I didn't open a PR, but you can look at the last two commits. I did not yet add `fold`.