Fix binding freevars in comprehension scope by isaiah · Pull Request #1073 · IronLanguages/ironpython3

isaiah · 2020-12-07T14:12:41Z

[lambda: i for i in [1,2]] should bind the i correctly now.

dnfadmin · 2020-12-07T14:12:58Z

All CLA requirements met.

slozier · 2020-12-07T15:21:07Z

Thanks for the PR! Adding some test cases would be useful:

def test_ipy3_gh817(self):
    """https://github.com/IronLanguages/ironpython3/issues/817"""
    for x in [lambda: i for i in [1,2]]:
        self.assertEqual(x(), 2)

    self.assertEqual([tuple(i for j in [1]) for i in [1]], [(1,)])

    self.assertEqual({i: tuple(j for j in t if i != j) for t in ((1,2),) for i in t}, {1: (2,), 2: (1,)})

isaiah · 2020-12-07T17:38:50Z

@slozier nice catch, I actually didn't know there is an issue reported.
Do you know how to check if the following is working with this fix?

For IsolationLevel=SCOPE or ENGINE, expression rewriting kicks in and again nested scopes get completely mismanaged.

isaiah · 2020-12-07T18:07:29Z

Regarding regression tests, this is already covered by the comprehension tests, e.g. https://github.com/IronLanguages/ironpython3/blob/master/Src/StdLib/Lib/test/test_listcomps.py#L84

In case you want to a specific test for the PR, do you have a convention to do this? Couldn't find other specs related to PRs.

slozier · 2020-12-07T18:23:17Z

I tried the tests cases I posted above with IsolationLevel=SCOPE and it failed. I'm not super sure about the details, @BCSharp would know more. The following program also fails but I'm not sure if it's the same failure:

var engine = Python.CreateEngine();
var scope = engine.CreateScope();
var source = engine.CreateScriptSourceFromString("assert [lambda: i for i in [1,2]][0]() == 2");
var compiledCode = source.Compile();
compiledCode.Execute(scope);

If test_listcomps covers this then great, it should be enable by removing the lines at:

ironpython3/Src/IronPythonTest/Cases/CPythonCasesManifest.ini

Lines 573 to 574 in cbfbac2

    
           [CPython.test_listcomps] 
        
           Ignore=true

As for the regression tests I would probably still add them at least to Tests\test_regressions.py since there are also non-listcomp cases (although that one is running with IsolationLevel=PROCESS so it wouldn't have caught the failure mentioned by @BCSharp).

BCSharp · 2020-12-07T20:35:56Z

@isaiah Thanks for the contribution! It's more fun when more minds are engaged with the project.

About the tests, test_listcomps should now be enabled as per @slozier's suggestion, although it tests with doctest so it still uses process isolation, therefore it does not cover rewriting.

To test rewriting, I suggest to add def test_ipy3_gh817 proposed by @slozier to Tests/test_closure.py. This test module is already enabled and uses the default isolation level for tests, which is SCOPE.

The following program also fails but I'm not sure if it's the same failure

@slozier, I believe the C# example you have given fails exactly because of the failure in rewriting. So this is another way how it can be tested.

As for tackling rewriting issue itself, my suggestion would be to look into lookup rewriting section in PythonAst, especially method VisitExtension. It already contains handling of special cases of LambdaExpression and GeneratorExpression, and needs to handle comprehensions too. Further I was considering of hooking ComprehensionScope up properly to the AST hierarchy, so that it always has the same scope Parent as the comprehension it serves.

When all this works, some cleanup in ComprehensionScope would probably be worthwhile.

isaiah · 2020-12-08T19:44:51Z

@BCSharp thanks for the information, it's very helpful. I managed to reproduce the issue, but i had trouble debug the problematic code, as VisualStudio test explorer failed to discover the NUnit tests, how do you do that normally?

BCSharp · 2020-12-08T20:30:01Z

I don't know why your VisualStudio test explorer cannot discover tests, it shows me all 3489 tests (I'm using VS version 16.8.2). But I don't use the test explorer (unless I want to debug tests written in C#) and use the command line instead.

For instance, to debug test_closure:

./make test-debug-test_closure -frameworks net46

Just place Debug.Assert(false) in your code where you would your first breakpoint to be hit (or even use any fancy condition as the assertion to create a conditional breakpoint), then when the assertion fails, you get a popup with buttons Abort, Retry, Ignore, hit Retry and select the currently running VS instance to debug. From there, normal VS debugging happens.

isaiah · 2020-12-11T11:35:20Z

@BCSharp Could you shed me more light on how the expression rewrite works? Cannot really tell where it splits based on the isolation level.

BCSharp · 2020-12-11T19:51:14Z

With the PROCESS isolation level, tests are run by spawning ipy.exe (or equivalent for the system/framework) as a separate process. The ipy console program creates the Python engine, scope, runtime together, so when the test file is compiled, it can be run right away as compiled.

With the SCOPE isolation level, the test case executer creates one engine for all SCOPE-level tests, and only creates a new scope for each test module. The engine is reused. This means that when the test code is compiled by the engine, the references to the global variables are not yet bound to the right scope, which will be provided separately. When the scope is given the code has to be updated, but since expressions are immutable, updating of the global references happens through rewriting of the AST tree.

For details how the test isolation levels are handled, look into CaseExecuter.cs.

isaiah · 2020-12-14T12:58:24Z

@BCSharp @slozier Because of the difficulty to treat a comprehension as a new scope, I used the same technic as the GeneratorExpression to encupsulate the comprehension body inside a FunctionDefinition, the comprehension specs are green, but I'm not sure if it breaks something else. Please have a look.

slozier · 2020-12-15T16:07:47Z

While I like the idea of using the same approach for comprehensions and generators, I have a few concerns:

It appears to be 2-3 times slower?
The parser is constructing the FunctionDefinition. I know that is how the generators do it, but it seems like that doesn't belong in the parser (unless there are technicalities I'm missing)
The Ast diverges from the Python ast. I guess this is the same as the previous point in that it's how the generators did it.

isaiah · 2020-12-15T19:50:32Z

It appears to be 2-3 times slower?

Yes, there are some performance penalty unfortunately, that it has to use dynamic method binding for the accumulation, instead of the native lock free method.

The parser is constructing the FunctionDefinition. I know that is how the generators do it, but it seems like that doesn't belong in the parser (unless there are technicalities I'm missing)

The Ast diverges from the Python ast. I guess this is the same as the previous point in that it's how the generators did it.

Currently the parser generates "lowered" AST, which is the form to be reduced. There are two forms of AST at the moment, one for Python, which are defined in _ast.cs, the other is the Expression Tree extensions. They are able to convert between each other, that's how it works for the GenExp.
I agree with you, ideally the Parser should generate strictly Python AST and have a "lower" phase to convert to the expression tree.

At the moment all specs including test_ast.py passed, except the following one:

class Foo():
  a = [1,2]
  b = [i for i in a]

=> NameError: name "a" is not defined

BCSharp · 2020-12-15T21:22:21Z

@isaiah Thank you for putting effort into this challenge. This is by no means an easy one. To have it properly resolved, I believe some parts of the compiler machinery would have to be redesigned.

This is obviously not the path you chose here, and that's OK, if we can get things working at reasonable cost. This means that some hacks and kinks in the code are unavoidable, but the performance impact of the current proposed solution in my opinion is too high.

Using FunctionDefiniton to handle scope for comprehensions is tempting, and indeed this is one way how it can be made work. Indeed this is what generators do, however, the difference with comprehensions is that generators (and lambdas) are objects that can be passed around, while comprehensions are merely expressions (or a block of statements, if you like) that are always executed inline.

This is probably where the performance hit comes from. Given that in most of the cases in Python programs comprehensions do not contain embedded generators or lambdas, this is a bad trade-off. If we continue on this path, perhaps a solution could be to only use FunctionDefinition for comprehensions that really require it (due to nested non-comprehension scopes) and keep the old way of handling it for the common light-weight cases. Also, along this line, we would only need FunctionDefinition after rewriting, so no need to create it during parsing. Further, there is no need to create a function for each nested comprehension, they could be folded into one function. This all does have a feeling of a heavy hack, but at least it might give us something working with reasonable performance.

On the other hand, did you investigate the option of hooking the old comprehension scope up to the scope hierarchy, for the benefit of proper rewriting?

There are two forms of AST at the moment, one for Python, which are defined in _ast.cs, the other is the Expression Tree extensions. They are able to convert between each other,

I didn't know it was possible to convert an expression tree back to a Python AST. This is interesting, can you point to me where it is done/used?

slozier · 2020-12-15T22:29:36Z

At the moment all specs including test_ast.py passed, except the following one:

Is this what's causing the failures in IronPython.test_function, CPython.test_argparse and all the distutils tests?

isaiah · 2020-12-16T00:36:37Z

Thanks for the review!

@BCSharp What you said makes sense, it could be possible to only generate FunctionDefinition during rewriting. Will give a try with this approach.

Further, there is no need to create a function for each nested comprehension

there is indeed only one function per comprehension, not for each ComprehensionFor

did you investigate the option of hooking the old comprehension scope up to the scope hierarchy

this would be the cleanest solution, I imagine if we move the scope handling out of the AST, and make PythonNameBinder the counterpart of symtable in CPython, it would happen, but I still need to play more with the code.

I didn't know it was possible to convert an expression tree back to a Python AST

There are multiple XxxConvert methods in the _ast module. i.e.
https://github.com/IronLanguages/ironpython3/blob/master/Src/IronPython/Modules/_ast.cs#L213

It needs this because the Parser produces only expression tree, if we need Python AST, we have to convert back.

re: @slozier

Is this what's causing the failures in IronPython.test_function, CPython.test_argparse and all the distutils tests?

the late two for sure, test_function is a bit hard to tell.

slozier · 2020-12-16T01:57:04Z

Interesting that using a comprehension creates a closure on an unrelated function. The test_function failure is essentially this:

def test_function_closure():
    def f(): pass

    assert f.__closure__ is None

    f = []
    [x for x in f]

test_function_closure()

BCSharp · 2020-12-16T05:04:35Z

there is indeed only one function per comprehension, not for each ComprehensionFor

What I meant is one function for nested comprehensions like

[{ y: (z**2 for z in range(x)) for y in range(3)} for x in range(3)]

But if we can keep ordinary cases inlined, I suppose this becomes less important.

There are multiple XxxConvert methods in the _ast module.

Oh yes, I knew about them. Sorry, I got confused about the terminology. I thought that "lowered" AST meant reduced AST. So am I assuming correctly that a reduced AST (expression tree) never gets converted back to a Python AST? What about a rewritten AST? If so, we can keep the expression tree produced by the parser aligned with Python, and only create hidden lambdas when really needed.

isaiah · 2020-12-16T18:05:41Z

@BCSharp for cases like:
[{ y: (z**2 for z in range(x)) for y in range(3)} for x in range(3)]
it generates following function:

def <dictcomp>():
  __ret__ = {}
  for x in range(3):
    for y in range(3):
      __ret__.__setitem__(y, (z**2 for z in range(x)))
  return __ret__

You might have seen two scopes, because the value itself is an generator expression, if you were talking about merging both scope, probably not easy and will cause other issues.

[lambda: i for i in [1,2]] should bind the `i` correctly now.

isaiah · 2020-12-16T21:55:46Z

Rolled back the AST changes and only kept the outer binding in ComprehensionScope, at the moment I'm not capable of fixing the failing spec when IsolationLevel=Scope, so I skipped the regression test in test_closure.
Since it is not a regression and this fix would still improve comprehension in ipy, please advise if we are good to merge or we should close this PR. @slozier @BCSharp

slozier · 2020-12-16T23:50:54Z

As you say, it's an improvement over the current state so if @BCSharp is good with it, so am I.

BCSharp · 2020-12-17T04:36:37Z

Yes, I am OK with that, it is a step in the right direction (functionality-wise) and we can always come back to it later. Good to merge.

isaiah · 2020-12-18T16:25:08Z

The specs are green, please approve.

BCSharp

Looks good to me.
The delay in review had nothing to do with the quality of the work, I was simply not at my workstation.
For future submissions, I'd recommend using a dedicated (per pull request) git branch rather than master. Otherwise the history of your pull request commits is erased the moment you reset your master to the upstream master and push it to GitHub again.

slozier · 2020-12-18T18:12:22Z

Thanks for all the work!

BCSharp · 2020-12-18T18:28:00Z

And my thanks!

isaiah · 2020-12-19T11:49:20Z

Thank you, looking forward to contributing more.

For future submissions, I'd recommend using a dedicated (per pull request) git branch rather than master.

Good point, will keep that in mind.

slozier requested a review from BCSharp December 7, 2020 15:18

slozier linked an issue Dec 7, 2020 that may be closed by this pull request

Name fails to resolve in generator's if when inside comprehension #817

Closed

BCSharp reviewed Dec 7, 2020

View reviewed changes

Comment thread Src/IronPython/Compiler/Ast/Comprehension.cs Outdated

BCSharp reviewed Dec 7, 2020

View reviewed changes

Comment thread Src/IronPythonTest/Cases/CPythonCasesManifest.ini Outdated

BCSharp reviewed Dec 7, 2020

View reviewed changes

Comment thread Src/IronPythonTest/Cases/CPythonCasesManifest.ini Outdated

isaiah added 2 commits December 16, 2020 22:46

Fix binding freevars in comprehension scope

5dbed9f

[lambda: i for i in [1,2]] should bind the `i` correctly now.

Enable test_listcomp and test_setcomp

3db4865

isaiah force-pushed the master branch from eaf11f6 to 3db4865 Compare December 16, 2020 21:50

update DLR

a642f75

isaiah requested a review from BCSharp December 17, 2020 10:31

BCSharp approved these changes Dec 18, 2020

View reviewed changes

slozier merged commit 6fd3527 into IronLanguages:master Dec 18, 2020

BCSharp mentioned this pull request Dec 20, 2020

Support rewriting of comprehension scopes #1083

Merged

slozier mentioned this pull request Mar 7, 2021

NullReferenceException in comprehension expression #1119

Closed

Conversation

isaiah commented Dec 7, 2020

Uh oh!

dnfadmin commented Dec 7, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

slozier commented Dec 7, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

isaiah commented Dec 7, 2020

Uh oh!

isaiah commented Dec 7, 2020

Uh oh!

slozier commented Dec 7, 2020

Uh oh!

BCSharp commented Dec 7, 2020

Uh oh!

Uh oh!

Uh oh!

Uh oh!

isaiah commented Dec 8, 2020

Uh oh!

BCSharp commented Dec 8, 2020

Uh oh!

isaiah commented Dec 11, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

BCSharp commented Dec 11, 2020

Uh oh!

isaiah commented Dec 14, 2020

Uh oh!

slozier commented Dec 15, 2020

Uh oh!

isaiah commented Dec 15, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

BCSharp commented Dec 15, 2020

Uh oh!

slozier commented Dec 15, 2020

Uh oh!

isaiah commented Dec 16, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

slozier commented Dec 16, 2020

Uh oh!

BCSharp commented Dec 16, 2020

Uh oh!

isaiah commented Dec 16, 2020

Uh oh!

isaiah commented Dec 16, 2020

Uh oh!

slozier commented Dec 16, 2020

Uh oh!

BCSharp commented Dec 17, 2020

Uh oh!

isaiah commented Dec 18, 2020

Uh oh!

BCSharp left a comment

Choose a reason for hiding this comment

Uh oh!

slozier commented Dec 18, 2020

Uh oh!

BCSharp commented Dec 18, 2020

Uh oh!

isaiah commented Dec 19, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

dnfadmin commented Dec 7, 2020 •

edited

Loading

slozier commented Dec 7, 2020 •

edited

Loading

isaiah commented Dec 11, 2020 •

edited

Loading

isaiah commented Dec 15, 2020 •

edited

Loading

isaiah commented Dec 16, 2020 •

edited

Loading