New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix simulation of return from unwind block like in following example:… #8509
Fix simulation of return from unwind block like in following example:… #8509
Conversation
… [ #some ] ensure: [ ^#unwindReturn ]
Fixes #8443 |
While it fixes original example it is more complex problem. But if tests are fine PR can be integrated. It's a good step anyway |
(context notNil and: [ context unwindBlock = closureOrNil ]) ifTrue: [ | ||
"Here the return happens inside the unwind block like in following example: | ||
[ #test ] ensure: [ ^ #returnFromUnwindBlock ] | ||
In such cases #findNextUnwindContextUpTo: returns the corresponding #ensure context | ||
because it is always a sender of unwind block. | ||
Here we detect this case and lookup the next unwind context after the current ensure section (current unwind)" | ||
context := context findNextUnwindContextUpTo: newTop ]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, before your fix the aboutToReturn:through:
is sent with the ensure:
context as context.
Then the fix makes sure that the returned context is the next context to unwind, which is the return in the ensure block.
Did you identify what causes the hanging in the first case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The real bug still exists: when simulation is going into #aboutToReturn:through: the image hangs.
My fix just removes the one specific case where simulation should not go this way. But the real problem is still here. And it's easy to reproduce:
[
[ 100 + 200 ] ensure: [^600]
] ensure: [ #done ]
Step inside [^600] block. Then step over it will hang the image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the situation is much worser. We can't step over return inside ensure:
[ ^100 ] ensure: [ 10+30]
Step into block [^100]. Then step over it will hang the image.
I checked also Pharo 7 - same issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So why the image hangs in all those cases? This is very interesting story :)
The short answer: bugs in the debugger lead to the state of target process where it never ends.
The last call of any process is the termination which ends with #suspend.
newProcess
^Process
forContext:
[self value.
Processor terminateActive] asContext
priority: Processor activePriority
The block here is actually never returned and it must not return. If you will remove the #terminateActive call then processes will end with BlockCannotReturn error (see cannotReturn:) because VM will not find the sender to continue the process.
And it is what really happens when we got hanging in the debugger:
- #terminateActive call does not suspend the current process and simply returns.
- VM calls #cannotReturn: which triggers the debugger:
- new debugger is requested
- debugger tries to #suspend activeProcess but it does nothing (see below) and process continues.
- as process continues the method #cannotReturn: returns.
- cannotReturn: has no sender and therefore VM calls cannotReturn: again - Infinite recursions began.
Now why #terminateActive and #suspend can be not working.
The problem is how debugger simulates the #activeProcess calls during stepping operations:.
process := Process activeProcess
The StepOver this expression is evaluated by the debugger process (UIProcess) but still the result will be the process under the debugger (not a UIProcess).
To achieve this the debugger wraps all stepping operations using #effectiveProcess trick:
step
^Processor activeProcess
evaluate: [suspendedContext := suspendedContext step]
onBehalfOf: self
The active process here is UIProcess (which handles StepOver click). It sets the variable #effectiveProcess to the process under the debugger. During the block execution the Processor will always use #effectiveProcess as an active one instead of the real active process which executes the debugger command.
For example you can step over "Processor terminateActive" and it will not terminate the UIProcess while the command is really executed by UIProcess.
Now it should be clear how the normal process completion becomes broken: if #effectiveProcess variable is not reset by the debugger the #terminateActive call will never kill the current process.
And same issue happens with #suspend call. When debugger is requested it uses "Processor activeProcess" to get a process for debugging and to suspend it but it will never return the actual current process. Therefore the #suspend from the debugger will not stop the current process (it will suspend some other process) and the current process will continue execution which will cause the buggy recursion.
Now why effectiveProcess can be not reset? Method #evaluate:onBehalfOf: contains the ensure logic to reset the variable back.
The block here is supposed to do the context/stack manipulations. And some of them are really not trivial. The processing of unwinds (ensure parts) are their responsibility. So it's obvious that there can be some bugs or missing parts. Especially that we don't have good tests for simulation logic.
For this concrete issue with StepOver the problem is in the way how debugger intercepts the return from the callee. The current trap is not enough for the returns inside unwind contexts. And therefore the simulated code is not stopped on the current context and completely finishes the execution up to last #terminateActive in the #newProcess. So #terminateActive is executed with effective process and it hangs the image.
The trap needs to be improved and maybe it's not really easy to do. But what is more important is to find a general fix for described problem.
I hope you understand my explanation. I will push some obvious changes soon: #newProcess should ensure the termination of actual active process despite on any assigned effectiveProcess.
Hi denis |
Hi, sorry I am now on holidays even if I keep an eye from time to time. |
Hi, please check PR #10383; I think the PR fixes the issue. Thanks. |
Hi all, it's been really difficult to review the batch of issues (#8567 , #8845, #9137, #8993, #8509). @tesonep is correctly (!!) proposing to my ear that we make a single PR joining all of them to reduce the merge noise, let the CI test the combination of all issues. I then propose that we eagerly merge this (meta) PR and be ready to quickly apply hot-fixes as soon as possible or revert it in case of fire. Thoughts? @dionisiydk @Ducasse @isCzech @estebanlm |
Superseeded by #10429 |
Fix simulation of return from unwind block like in following example: