Add generators support #177

Merged
merged 74 commits into php:master on Sep 1, 2012

8 participants

@nikic
Collaborator
nikic commented Aug 25, 2012

PR for generators as outlined in https://wiki.php.net/rfc/generators.

(For the diff)

nikic added some commits May 15, 2012
@nikic nikic Add T_YIELD "yield" keyword 9b101ac
@nikic nikic Add flag for generator functions
Generator functions have to specify the * (asterisk) modifier after the
function keyword. If they do so the ZEND_ACC_GENERATOR flag is added to
the fn_flags.
252f623
@nikic nikic Minor code cleanup
The block for the foreach separator was unnecessarily nested. This commit
simply removes that nesting.
9b51a3b
@nikic nikic Add error if yield is used outside a generator
The yield statement can only be used in generator functions, which are
marked with an asterisk.
fd2a109
@nikic nikic Add zend_do_suspend_if_generator calls
The execution of generator functions will be suspended right after the
arguments were RECVed. This will be done in zend_do_suspend_if_generator.
e14cfaf
@nikic nikic Add ZEND_SUSPEND_AND_RETURN_GENERATOR opcode
If the function is a generator this opcode will be invoked right after
receiving the function arguments.

The current implementation is just a dummy.
1cec3f1
@nikic nikic Add empty Generator class ca59e54
@nikic nikic Add some boilerplate code for Generator class
The Generator class now uses a zend_generator struct, so it'll be able to
store additional info.

This commit also ensures that Generator cannot be directly instantiated
and extended. The error tests are now in a separate folder from the
(yet-to-come) functional tests.
40b7533
@nikic nikic Make generator functions return a Generator object
Right now generator functions simply immediately return a new Generator
object (no suspension yet).
46fa26a
@nikic nikic Allocate execute_data using malloc for generators
Generators need to switch the execute_data very often. If the execute_data
were allocated on the VM stack, every switch would require copying the
structure (which is quite large). That's why the execution context is
allocated on the heap instead (only for generators, obviously).
5e763d9
@nikic nikic Add initial code for suspending execution
This is just some initial code, which is still quite broken (and needs to be
moved so it can be reused.)
9ce9a7e
@nikic nikic Add dummy Iterator implementation
This simply adds dummy rewind/valid/current/key/next methods to Generator.
2c5ecb4
@nikic nikic Allow calling zend_vm_gen from everywhere
Before one could only call it with cwd=Zend.
ececcbc
@nikic nikic Add support for executing a zend_execute_data
This adds another function execute_ex(), which accepts a zend_execute_data
struct to run (contrary to execute(), which accepts a zend_op_array from
which it initialized the execute_data).

This needs a bit more cleanup.
f627be5
@nikic nikic Add way to pass generator object to opcode handlers
The generator zval is put into the return_value_ptr_ptr.
1a99d1c
@nikic nikic Add YIELD opcode implementation fafce58
@nikic nikic Implement return for generators
For generators ZEND_RETURN directly calls ZEND_VM_RETURN(), thus passing
execution back to the caller (zend_generator_resume).

This commit also adds a check that only return; is used in generators and
not return $value;.
5bb3a99
@nikic nikic Close generator on return d49d397
@nikic nikic Remove wrong dtor call cbfa96c
@nikic nikic Add first real generator test
The test implements an xrange() function (the generator version of range()).
39d3d5e
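
For illustration, an xrange()-style generator along the lines this test describes might look like the sketch below. It uses the final syntax (no asterisk modifier); the signature and the step argument are assumptions, not the contents of the test file.

```php
<?php
// Hypothetical xrange(): for a positive step it yields the same values as
// range($start, $end, $step), but one at a time instead of as a full array.
function xrange($start, $end, $step = 1) {
    for ($i = $start; $i <= $end; $i += $step) {
        yield $i;
    }
}

$values = [];
foreach (xrange(1, 9, 2) as $n) {
    $values[] = $n;
}
```

The point of the generator version is that only the current value is held in memory while iterating.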
@nikic nikic Add support for generator methods 247bb73
@nikic nikic Free loop variables
If the generator is closed before it has finished running, it may happen
that some FREE or SWITCH_FREE opcodes haven't been executed and memory is
leaked.

This fixes it by walking the brk_cont_array and manually freeing the
variables.
64a643a
@nikic nikic Fix generator creation when execute_data is not nested
This happens primarily when the generator is invoked from some internal
place like a dynamic function call.
9f52c5c
@nikic nikic Set EG(current_execute_data)
This fixes several issues. In particular it makes method generators work
properly and also allows generators using a symbol table.
bcc7d97
@nikic nikic Properly free resources when generator return value not used
To keep things clean two new functions are introduced:

zend_clean_and_cache_symbol_table(HashTable *symbol_table)
zend_free_compiled_variables(zval ***CVs, int num)
4aab08b
@nikic nikic Make the GOTO and SWITCH VMs work again b770b22
@nikic nikic Add support for $generator->send()
Yield now is an expression and the return value is the value passed to
$generator->send(). By default (i.e. if ->next() is called) the value is
NULL.

Unlike in Python ->send() can be run without priming the generator with a
->next() call first.
3600914
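
A minimal sketch of the behaviour described in this commit, assuming a PHP version with generators (5.5 or later). The receiver() function and the ArrayObject used as a log are illustrative names only.

```php
<?php
// Hypothetical coroutine: records whatever value each yield expression
// evaluates to after the generator is resumed.
function receiver(ArrayObject $log) {
    while (true) {
        $log[] = (yield 'waiting');
    }
}

$log = new ArrayObject();
$gen = receiver($log);

$gen->send('first');   // no priming next() call needed
$gen->next();          // resuming via next() makes the yield evaluate to NULL
$gen->send('second');

$received = $log->getArrayCopy();
```

Here $received ends up holding 'first', NULL and 'second', in that order.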
@nikic nikic Allow to use yield without value
If the generator is used as a coroutine it often doesn't make sense to yield
anything. In this case one can simply receive values using

    $value = yield;

The yield here will simply yield NULL.
ad525c2
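
As a sketch of the coroutine use case, the hypothetical runningTotal() below only receives values and never yields anything meaningful. The function name and the ArrayObject are assumptions made for the example.

```php
<?php
// Hypothetical coroutine that keeps a running total of the values sent in.
function runningTotal(ArrayObject $totals) {
    $sum = 0;
    while (true) {
        $value = yield;        // yields NULL, receives the sent value
        $sum += $value;
        $totals[] = $sum;
    }
}

$totals = new ArrayObject();
$gen = runningTotal($totals);
$gen->send(1);
$gen->send(2);
$gen->send(3);

$yielded = $gen->current();    // the bare yield produced NULL
$result = $totals->getArrayCopy();
```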
@nikic nikic Fix segfault when send()ing to a closed generator 12e9283
@nikic nikic Add $generator->close() method
Calling $generator->close() is equivalent to executing a return statement
at the current position in the generator.
72a91d0
@nikic nikic Add support for yielding keys
Keys are yielded using the

    yield $key => $value

syntax. Currently this is implemented as a statement only and not as an
expression, because conflicts arise considering nesting and use in arrays:

    yield yield $a => $b;
    // could be either
    yield (yield $a) => $b;
    // or
    yield (yield $a => $b);

Once I find some way to resolve these conflicts this should be available
as an expression too.

Also the key yielding code is rather copy-and-paste-y from the value yielding
code, so that should be factored out.
bc08c2c
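
For illustration, the statement form of key yielding can be consumed with an ordinary key/value foreach. The headers() function and its data are made up for this sketch.

```php
<?php
// Hypothetical generator yielding explicit string keys.
function headers() {
    yield 'Content-Type' => 'text/plain';
    yield 'Content-Length' => '42';
}

$lines = [];
foreach (headers() as $name => $value) {
    $lines[] = "$name: $value";
}
```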
@nikic nikic Add auto-increment keys
When no key is explicitly yielded PHP will use auto-incrementing keys
as a fallback. They behave the same as with arrays, i.e. the key is the
successor of the largest previously used integer key.
8790160
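
A small sketch of the fallback rule, mixing explicit and implicit keys. The mixedKeys() name is hypothetical, and the expected keys follow from the array-like rule stated above.

```php
<?php
// Hypothetical generator mixing explicit keys with auto-incremented ones.
function mixedKeys() {
    yield 'a';           // no key given: starts at 0
    yield 5 => 'b';      // explicit integer key
    yield 'k' => 'c';    // string keys do not affect the counter
    yield 'd';           // successor of the largest integer key so far
}

$keys = [];
foreach (mixedKeys() as $key => $value) {
    $keys[] = $key;
}
```

With this input the collected keys should be 0, 5, 'k' and 6.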
@nikic nikic Allow throwing exceptions from generators
The missing piece is how one can find the next stack frame, which is
required for dtor'ing arguments pushed to the stack. As the generator
execute_data does not live on the stack one can't use it to figure out the
start of the next stack frame. So there must be some other method.
0033a52
@nikic nikic Allow yielding during function calls
During function calls arguments are pushed onto the stack. Now these are
backed up on yield and restored on resume. This requires memcpy'ing them,
but there doesn't seem to be any better way to do it.

Also this fixes the issue with exceptions thrown during function calls.
ee89e22
@nikic nikic Make $generator->send() return the current value
This makes the API easier to use (and is consistent with Python and JS).
1477be9
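
To illustrate why the return value is convenient, the sketch below sends a value and reads the generator's next yielded value in one call. The echoer() function is an assumption for the example.

```php
<?php
// Hypothetical coroutine that answers every sent value with a reply.
function echoer() {
    $received = (yield 'ready');
    while (true) {
        $received = (yield "got $received");
    }
}

$gen = echoer();
$first  = $gen->current();      // value of the first yield
$reply1 = $gen->send('ping');   // send() returns the next yielded value
$reply2 = $gen->send('pong');
```

Without the return value each send() would need a separate current() call to fetch the reply.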
@nikic nikic Add cloning support for generators
Generators can now be cloned. I'm pretty sure that my current code does not
yet cover all the edge cases of cloning the execution context, so there are
probably a few bugs in there :)
6117f4c
@nikic nikic Improve backtraces from generators
The current situation is still not perfect, as the generator function itself
does not appear in the stack trace. This makes sense in some way, but it
would probably be more helpful if it would show up (with the bound arguments)
after the $generator->xyz() call. This could be misleading too though as the
function is not *really* called there.
7b3bfa5
@nikic nikic Properly handle yield during method calls bf82f46
@nikic nikic Fix cloning of generator methods
Forgot to add a reference to the this variable
40760ec
@nikic nikic Fix backtraces and func_get_args()
To make the generator function show up in backtraces one has to insert an
additional execute_data into the chain, as prev_execute_data->function_state
is used to determine the called function.

Adding the additional stack frame is also required for func_get_args(), as
the arguments are fetched from there too. The arguments have to be copied
in order to keep them around. Due to the way they are saved doing so is
quite ugly, so I added another function zend_copy_arguments to zend_execute.c
which handles this.
f169b26
@nikic nikic Add skeleton for yield* expression
This does not yet actually implement any delegation.
d939d2d
@nikic nikic Fix thread safe build 6233408
@nikic nikic Fix segfault in method test
A ref has to be added to $this if the generator is called !nested (which
is the case when it is invoked via getIterator).
1d3f37d
@nikic nikic Implement get_iterator
This implements the get_iterator handler for Generator objects, thus making
direct foreach() iteration significantly faster.
04e781f
@nikic nikic Pass zend_generator directly to Zend VM
Previously the zval* of the generator was passed into the VM by misusing
EG(return_value_ptr_ptr). Now the zend_generator* itself is directly passed
in. This saves us from always having to pass the zval* around everywhere.
14766e1
@nikic nikic Disallow closing a generator during its execution
If a generator is closed while it is running an E_WARNING is thrown and the
call is ignored. Maybe a fatal error should be thrown instead?
ab75ed6
@nikic nikic Forgot to git add two tests 5a9bddb
@nikic nikic Add support for yielding by-reference 85f077c
@nikic nikic Remove asterisk modifier (*) for generators
Generators are now automatically detected by the presence of a `yield`
expression in their body.

This removes the ZEND_SUSPEND_AND_RETURN_GENERATOR opcode. Instead
additional checks for ZEND_ACC_GENERATOR are added to the fcall_common
helper and zend_call_function.

This also adds a new function zend_generator_create_zval, which handles
the actual creation of the generator zval from an op array.

I feel like I should deglobalize the zend_create_execute_data_from_op_array
code a bit. It currently changes EG(current_execute_data) and
EG(opline_ptr) which is somewhat confusing (given the name).
c9709bf
@nikic nikic Move a variable 612c249
@nikic nikic Add some more tests 1f70a4c
@nikic nikic Require parentheses around yield expressions
If yield is used in an expression context parentheses are now required.
This ensures that the code is unambiguous.

Yield statements can still be used without parentheses (which should be
the most common case).

Also yield expressions without value can be used without parentheses,
too (this should be the most common case for coroutines).

If the yield expression is used in a context where parentheses are required
anyway, no additional parentheses have to be inserted.

Examples:

    // Statements don't need parentheses
    yield $foo;
    yield $foo => $bar;

    // Yield without value doesn't need parentheses either
    $data = yield;

    // Parentheses don't have to be duplicated
    foo(yield $bar);
    if (yield $bar) { ... }

    // But we have to use parentheses here
    $foo = (yield $bar);

This commit also fixes an issue with by-ref passing of $foo[0] like
variables. They previously weren't properly fetched for write.

Additionally this fixes valgrind warnings which were caused by access to
uninitialized memory in zend_is_function_or_method_call().
8074863
@nikic nikic Remove reference restrictions from foreach
foreach only allowed variables to be traversed by reference. This never
really made sense because

    a) Expressions like array(&$a, &$b) can be meaningfully iterated by-ref
    b) Function calls can return by-ref (so they can also be meaningfully
       iterated)
    c) Iterators could at least in theory also be iterated by-ref (not
       sure if any iterator makes use of this)

With by-ref generators the restriction makes even less sense, so I removed
it altogether.
de80e3c
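
A sketch of what the lifted restriction allows, assuming a PHP version with generators. The eachRef() function is a hypothetical by-reference generator; the second loop shows case a) from the list above.

```php
<?php
// Hypothetical by-reference generator: the & before the name makes each
// yield hand out a reference instead of a copy.
function &eachRef(array &$items) {
    foreach ($items as &$item) {
        yield $item;
    }
}

$data = [1, 2, 3];
foreach (eachRef($data) as &$value) {
    $value *= 2;          // writes through to $data
}
unset($value);            // drop the leftover reference

// An expression (not a plain variable) iterated by reference.
$a = 1;
$b = 2;
foreach (array(&$a, &$b) as &$x) {
    $x *= 10;             // modifies $a and $b through the references
}
unset($x);
```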
@nikic nikic Fix throwing of exceptions within a generator
If a generator threw an exception and was iterated using foreach (i.e. not
manually) an infinite loop was triggered. The reason was that the exception
was not properly rethrown using zend_throw_exception_internal.
94b2cca
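
The scenario this commit fixes can be sketched as follows: an exception thrown inside the generator should surface in the code running the foreach loop. The failing() function is made up for the example.

```php
<?php
// Hypothetical generator that throws after its first value.
function failing() {
    yield 1;
    throw new RuntimeException('boom');
}

$seen = [];
$message = null;
try {
    foreach (failing() as $value) {
        $seen[] = $value;
    }
} catch (RuntimeException $e) {
    $message = $e->getMessage();   // the exception propagates out of foreach
}
```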
@nikic nikic Throw error also for return occurring before yield
Previously an error was only thrown when return occurred after yield. Also
returns before the first yield would fail for by-ref generators.

Now the error message is handled in pass_two, so all returns are checked.
1340893
@nikic nikic Add T_YIELD in tokenizer_data.c
Also had to fix up some tokenizer tests that were affected by the token
number changes.
99f93dd
@nikic nikic Fix implementation of Iterator interface
It looks like you have to implement the Iterator interface *before*
assigning get_iterator. Otherwise the structure for user iterators isn't
correctly zeroed out.

Additionally I'm setting class_entry->iterator_funcs.funcs now. Not sure if
this is strictly necessary, but better safe than sorry ;)
268740d
@nikic nikic Merge remote-tracking branch 'php-src/master' into addGeneratorsSupport
This is just an initial merge. It does not yet make generators and finally
work together.

Conflicts:
	Zend/zend_language_scanner.c
	Zend/zend_language_scanner_defs.h
	Zend/zend_vm_def.h
	Zend/zend_vm_execute.h
	Zend/zend_vm_execute.skl
	Zend/zend_vm_opcodes.h
f4ce364
@nikic nikic Support trivial finally in generators (no yield, no return)
The finally clause is now properly run when an exception is thrown in the
try-block. It is not yet run on `return` and also not run when the generator
is closed within a try block.

I'll add those two things as soon as laruence has refactored the finally code.
ae71693
@nikic nikic Forgot to add test 7195a5b
@nikic nikic Drop Generator::close() method 05f1048
@nikic nikic Fix zts build (typo) 9003cd1
@nikic nikic Merge remote-tracking branch 'php-src/master' into addGeneratorsSupport
Merging master to fix Windows build

Conflicts:
	Zend/zend_language_scanner.c
	Zend/zend_language_scanner_defs.h
	Zend/zend_vm_def.h
1823b16
@nikic nikic Disallow serialization and unserialization f45a0f3
@nikic nikic Merge remote-tracking branch 'php-src/master' into addGeneratorsSupport
Conflicts:
	Zend/zend_vm_def.h
	Zend/zend_vm_execute.h
6517ed0
@nikic nikic Add dedicated opcode for returns from a generator
Generators don't have a return value, so it doesn't make sense to have
a shared implementation here.
68c1e1c
@nikic nikic Finally with return now works in generators too 7cdf636
@nikic nikic Run finally if generator is closed before finishing 4d8edda
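
As a sketch of this behaviour, the generator below is destroyed while suspended inside a try block, and its finally clause still runs. The guarded() name and the ArrayObject log are assumptions for the example.

```php
<?php
// Hypothetical generator with cleanup code in a finally block.
function guarded(ArrayObject $log) {
    try {
        $log[] = 'start';
        yield 1;
        yield 2;
    } finally {
        $log[] = 'cleanup';
    }
}

$log = new ArrayObject();
$gen = guarded($log);
$gen->current();     // run to the first yield, inside the try block
unset($gen);         // destroy the generator before it has finished

$events = $log->getArrayCopy();
```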
@nikic nikic Fix several issues and allow rewind only at/before first yield
 * Trying to resume a generator while it is already running now throws a
   fatal error.
 * Trying to use yield in finally while the generator is being force-closed
   (by GC) throws a fatal error.
 * Rewinding after the first yield now throws an Exception
f53225a
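
The rewind rule can be sketched like this: rewind() is accepted while the generator is at or before its first yield, and throws once it has advanced past it. The counter() function is illustrative.

```php
<?php
// Hypothetical two-value generator.
function counter() {
    yield 1;
    yield 2;
}

$gen = counter();
$gen->rewind();                 // fine: runs to the first yield
$first = $gen->current();
$gen->rewind();                 // still at the first yield, so still fine

$gen->next();                   // advance past the first yield
$rewindFailed = false;
try {
    $gen->rewind();
} catch (Exception $e) {
    $rewindFailed = true;       // generators cannot be restarted
}
$current = $gen->current();     // the position is unchanged
```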
@laruence
Member

Now I get the diff, thanks

nikic added some commits Aug 25, 2012
@nikic nikic Remove implementation stubs for yield delegation
I decided to leave out yield delegation for an initial proposal, so remove
the stubs for it too.
bd70d15
@nikic nikic Merge remote-tracking branch 'php-src/master' into addGeneratorsSupport
Conflicts:
	Zend/zend_language_parser.y
	Zend/zend_vm_execute.skl
d60e3c6
@smalyshev smalyshev commented on the diff Aug 26, 2012
Zend/tests/errmsg_043.phpt
@@ -1,12 +0,0 @@
---TEST--
@smalyshev
smalyshev Aug 26, 2012 collaborator

why delete this test?

@smalyshev smalyshev commented on the diff Aug 26, 2012
Zend/zend_compile.c
@@ -6289,9 +6336,7 @@ void zend_do_foreach_cont(znode *foreach_token, const znode *open_brackets_token
if (value->EA & ZEND_PARSED_REFERENCE_VARIABLE) {
assign_by_ref = 1;
- if (!(opline-1)->extended_value) {
@smalyshev
smalyshev Aug 26, 2012 collaborator

why this is removed?

@nikic
nikic Aug 27, 2012 collaborator

I removed the by-ref restrictions for foreach. More info in this commit message: nikic@de80e3c

(Same for the test)

@smalyshev smalyshev and 1 other commented on an outdated diff Aug 26, 2012
Zend/zend_generators.c
+ size_t offset = (char *) orig->send_target - (char *) execute_data->Ts;
+ clone->send_target = (temp_variable *) (
+ (char *) clone->execute_data->Ts + offset
+ );
+ Z_ADDREF_P(clone->send_target->var.ptr);
+ }
+
+ if (execute_data->current_this) {
+ Z_ADDREF_P(execute_data->current_this);
+ }
+
+ if (execute_data->object) {
+ Z_ADDREF_P(execute_data->object);
+ }
+
+ /* Prev execute data contains an additional stack frame (for proper)
@smalyshev
smalyshev Aug 26, 2012 collaborator

typo - extra ) here

@nikic
nikic Aug 29, 2012 collaborator

fixed

@reeze reeze commented on the diff Aug 29, 2012
Zend/zend_generators.c
+ }
+
+ /* The sent value was initialized to NULL, so dtor that */
+ zval_ptr_dtor(&generator->send_target->var.ptr);
+
+ /* Set new sent value */
+ Z_ADDREF_P(value);
+ generator->send_target->var.ptr = value;
+ generator->send_target->var.ptr_ptr = &value;
+
+ zend_generator_resume(generator TSRMLS_CC);
+
+ if (generator->value) {
+ RETURN_ZVAL(generator->value, 1, 0);
+ }
+}
@reeze
reeze Aug 29, 2012

missing a close folder comment /* }}} */ here :)

@nikic
nikic Aug 29, 2012 collaborator

fixed

@reeze reeze and 1 other commented on an outdated diff Aug 29, 2012
Zend/zend_generators.c
+ zval_ptr_dtor(&generator->send_target->var.ptr);
+
+ /* Set new sent value */
+ Z_ADDREF_P(value);
+ generator->send_target->var.ptr = value;
+ generator->send_target->var.ptr_ptr = &value;
+
+ zend_generator_resume(generator TSRMLS_CC);
+
+ if (generator->value) {
+ RETURN_ZVAL(generator->value, 1, 0);
+ }
+}
+
+
+/* {{{ proto void Generator::__wakeup
@reeze
reeze Aug 29, 2012

missing a '()'

@nikic
nikic Aug 29, 2012 collaborator

fixed

nikic added some commits Aug 29, 2012
@nikic nikic Make sure that exception is thrown on rewind() after closing too cc07038
@nikic nikic Fix segfault when traversing a by-ref generator twice
If you try to traverse an already closed generator an exception will now be
thrown.

Furthermore this changes the error for traversing a by-val generator by-ref
from an E_ERROR to an Exception.
bef7958
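
Both cases from this commit can be sketched as below: traversing a finished generator a second time, and traversing a by-value generator by reference. The numbers() function is a made-up example.

```php
<?php
// Hypothetical by-value generator.
function numbers() {
    yield 1;
    yield 2;
}

// Case 1: a second traversal of an already closed generator.
$gen = numbers();
foreach ($gen as $value) {
    // first traversal runs the generator to completion
}
$secondTraversalFailed = false;
try {
    foreach ($gen as $value) {
    }
} catch (Exception $e) {
    $secondTraversalFailed = true;
}

// Case 2: by-reference traversal of a generator that yields by value.
$byRefFailed = false;
try {
    foreach (numbers() as &$ref) {
    }
} catch (Exception $e) {
    $byRefFailed = true;
}
unset($ref);
```

Because both errors are exceptions rather than fatal errors, calling code can recover from them.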
@nikic nikic Fix typos dbc7809
@travisbot

This pull request fails (merged dbc7809 into 35951d4).

@php-pulls php-pulls merged commit dbc7809 into php:master Sep 1, 2012
@KendallHopkins

@nikic I'm getting segfaults with this code https://gist.github.com/3588986

@smalyshev
Collaborator

@KendallHopkins please submit a bug report to bugs.php.net so it could be tracked properly.

@bradfeehan bradfeehan commented on the diff Feb 8, 2013
Zend/zend_vm_def.h
@@ -1845,17 +1845,22 @@
zend_bool nested;
zend_op_array *op_array = EX(op_array);
+ /* Generators go throw a different cleanup process */