Skip to content

Fix binding freevars in comprehension scope#1073

Merged
slozier merged 3 commits intoIronLanguages:masterfrom
isaiah:master
Dec 18, 2020
Merged

Fix binding freevars in comprehension scope#1073
slozier merged 3 commits intoIronLanguages:masterfrom
isaiah:master

Conversation

@isaiah
Copy link
Copy Markdown
Contributor

@isaiah isaiah commented Dec 7, 2020

[lambda: i for i in [1,2]] should bind the i correctly now.

@dnfadmin
Copy link
Copy Markdown

dnfadmin commented Dec 7, 2020

CLA assistant check
All CLA requirements met.

@slozier slozier requested a review from BCSharp December 7, 2020 15:18
@slozier
Copy link
Copy Markdown
Contributor

slozier commented Dec 7, 2020

Thanks for the PR! Adding some test cases would be useful:

def test_ipy3_gh817(self):
    """https://github.com/IronLanguages/ironpython3/issues/817"""
    for x in [lambda: i for i in [1,2]]:
        self.assertEqual(x(), 2)

    self.assertEqual([tuple(i for j in [1]) for i in [1]], [(1,)])

    self.assertEqual({i: tuple(j for j in t if i != j) for t in ((1,2),) for i in t}, {1: (2,), 2: (1,)})

@isaiah
Copy link
Copy Markdown
Contributor Author

isaiah commented Dec 7, 2020

@slozier nice catch, I actually didn't know there is an issue reported.
Do you know how to check if the following is working with this fix?

For IsolationLevel=SCOPE or ENGINE, expression rewriting kicks in and again nested scopes get completely mismanaged.

@isaiah
Copy link
Copy Markdown
Contributor Author

isaiah commented Dec 7, 2020

Regarding regression tests, this is already covered by the comprehension tests, e.g. https://github.com/IronLanguages/ironpython3/blob/master/Src/StdLib/Lib/test/test_listcomps.py#L84

In case you want to a specific test for the PR, do you have a convention to do this? Couldn't find other specs related to PRs.

@slozier
Copy link
Copy Markdown
Contributor

slozier commented Dec 7, 2020

I tried the tests cases I posted above with IsolationLevel=SCOPE and it failed. I'm not super sure about the details, @BCSharp would know more. The following program also fails but I'm not sure if it's the same failure:

var engine = Python.CreateEngine();
var scope = engine.CreateScope();
var source = engine.CreateScriptSourceFromString("assert [lambda: i for i in [1,2]][0]() == 2");
var compiledCode = source.Compile();
compiledCode.Execute(scope);

If test_listcomps covers this then great, it should be enable by removing the lines at:

[CPython.test_listcomps]
Ignore=true

As for the regression tests I would probably still add them at least to Tests\test_regressions.py since there are also non-listcomp cases (although that one is running with IsolationLevel=PROCESS so it wouldn't have caught the failure mentioned by @BCSharp).

@slozier slozier linked an issue Dec 7, 2020 that may be closed by this pull request
@BCSharp
Copy link
Copy Markdown
Member

BCSharp commented Dec 7, 2020

@isaiah Thanks for the contribution! It's more fun when more minds are engaged with the project.

About the tests, test_listcomps should now be enabled as per @slozier's suggestion, although it tests with doctest so it still uses process isolation, therefore it does not cover rewriting.

To test rewriting, I suggest to add def test_ipy3_gh817 proposed by @slozier to Tests/test_closure.py. This test module is already enabled and uses the default isolation level for tests, which is SCOPE.

The following program also fails but I'm not sure if it's the same failure

@slozier, I believe the C# example you have given fails exactly because of the failure in rewriting. So this is another way how it can be tested.

As for tackling rewriting issue itself, my suggestion would be to look into lookup rewriting section in PythonAst, especially method VisitExtension. It already contains handling of special cases of LambdaExpression and GeneratorExpression, and needs to handle comprehensions too. Further I was considering of hooking ComprehensionScope up properly to the AST hierarchy, so that it always has the same scope Parent as the comprehension it serves.

When all this works, some cleanup in ComprehensionScope would probably be worthwhile.

Comment thread Src/IronPython/Compiler/Ast/Comprehension.cs Outdated
Comment thread Src/IronPythonTest/Cases/CPythonCasesManifest.ini Outdated
Comment thread Src/IronPythonTest/Cases/CPythonCasesManifest.ini Outdated
@isaiah
Copy link
Copy Markdown
Contributor Author

isaiah commented Dec 8, 2020

@BCSharp thanks for the information, it's very helpful. I managed to reproduce the issue, but i had trouble debug the problematic code, as VisualStudio test explorer failed to discover the NUnit tests, how do you do that normally?

@BCSharp
Copy link
Copy Markdown
Member

BCSharp commented Dec 8, 2020

I don't know why your VisualStudio test explorer cannot discover tests, it shows me all 3489 tests (I'm using VS version 16.8.2). But I don't use the test explorer (unless I want to debug tests written in C#) and use the command line instead.

For instance, to debug test_closure:

./make test-debug-test_closure -frameworks net46

Just place Debug.Assert(false) in your code where you would your first breakpoint to be hit (or even use any fancy condition as the assertion to create a conditional breakpoint), then when the assertion fails, you get a popup with buttons Abort, Retry, Ignore, hit Retry and select the currently running VS instance to debug. From there, normal VS debugging happens.

@isaiah
Copy link
Copy Markdown
Contributor Author

isaiah commented Dec 11, 2020

@BCSharp Could you shed me more light on how the expression rewrite works? Cannot really tell where it splits based on the isolation level.

@BCSharp
Copy link
Copy Markdown
Member

BCSharp commented Dec 11, 2020

With the PROCESS isolation level, tests are run by spawning ipy.exe (or equivalent for the system/framework) as a separate process. The ipy console program creates the Python engine, scope, runtime together, so when the test file is compiled, it can be run right away as compiled.

With the SCOPE isolation level, the test case executer creates one engine for all SCOPE-level tests, and only creates a new scope for each test module. The engine is reused. This means that when the test code is compiled by the engine, the references to the global variables are not yet bound to the right scope, which will be provided separately. When the scope is given the code has to be updated, but since expressions are immutable, updating of the global references happens through rewriting of the AST tree.

For details how the test isolation levels are handled, look into CaseExecuter.cs.

@isaiah
Copy link
Copy Markdown
Contributor Author

isaiah commented Dec 14, 2020

@BCSharp @slozier Because of the difficulty to treat a comprehension as a new scope, I used the same technic as the GeneratorExpression to encupsulate the comprehension body inside a FunctionDefinition, the comprehension specs are green, but I'm not sure if it breaks something else. Please have a look.

@slozier
Copy link
Copy Markdown
Contributor

slozier commented Dec 15, 2020

While I like the idea of using the same approach for comprehensions and generators, I have a few concerns:

  1. It appears to be 2-3 times slower?
  2. The parser is constructing the FunctionDefinition. I know that is how the generators do it, but it seems like that doesn't belong in the parser (unless there are technicalities I'm missing)
  3. The Ast diverges from the Python ast. I guess this is the same as the previous point in that it's how the generators did it.

@isaiah
Copy link
Copy Markdown
Contributor Author

isaiah commented Dec 15, 2020

  1. It appears to be 2-3 times slower?

Yes, there are some performance penalty unfortunately, that it has to use dynamic method binding for the accumulation, instead of the native lock free method.

  1. The parser is constructing the FunctionDefinition. I know that is how the generators do it, but it seems like that doesn't belong in the parser (unless there are technicalities I'm missing)
  2. The Ast diverges from the Python ast. I guess this is the same as the previous point in that it's how the generators did it.

Currently the parser generates "lowered" AST, which is the form to be reduced. There are two forms of AST at the moment, one for Python, which are defined in _ast.cs, the other is the Expression Tree extensions. They are able to convert between each other, that's how it works for the GenExp.
I agree with you, ideally the Parser should generate strictly Python AST and have a "lower" phase to convert to the expression tree.

At the moment all specs including test_ast.py passed, except the following one:

class Foo():
  a = [1,2]
  b = [i for i in a]

=> NameError: name "a" is not defined

@BCSharp
Copy link
Copy Markdown
Member

BCSharp commented Dec 15, 2020

@isaiah Thank you for putting effort into this challenge. This is by no means an easy one. To have it properly resolved, I believe some parts of the compiler machinery would have to be redesigned.

This is obviously not the path you chose here, and that's OK, if we can get things working at reasonable cost. This means that some hacks and kinks in the code are unavoidable, but the performance impact of the current proposed solution in my opinion is too high.

Using FunctionDefiniton to handle scope for comprehensions is tempting, and indeed this is one way how it can be made work. Indeed this is what generators do, however, the difference with comprehensions is that generators (and lambdas) are objects that can be passed around, while comprehensions are merely expressions (or a block of statements, if you like) that are always executed inline.

This is probably where the performance hit comes from. Given that in most of the cases in Python programs comprehensions do not contain embedded generators or lambdas, this is a bad trade-off. If we continue on this path, perhaps a solution could be to only use FunctionDefinition for comprehensions that really require it (due to nested non-comprehension scopes) and keep the old way of handling it for the common light-weight cases. Also, along this line, we would only need FunctionDefinition after rewriting, so no need to create it during parsing. Further, there is no need to create a function for each nested comprehension, they could be folded into one function. This all does have a feeling of a heavy hack, but at least it might give us something working with reasonable performance.

On the other hand, did you investigate the option of hooking the old comprehension scope up to the scope hierarchy, for the benefit of proper rewriting?


There are two forms of AST at the moment, one for Python, which are defined in _ast.cs, the other is the Expression Tree extensions. They are able to convert between each other,

I didn't know it was possible to convert an expression tree back to a Python AST. This is interesting, can you point to me where it is done/used?

@slozier
Copy link
Copy Markdown
Contributor

slozier commented Dec 15, 2020

At the moment all specs including test_ast.py passed, except the following one:

Is this what's causing the failures in IronPython.test_function, CPython.test_argparse and all the distutils tests?

@isaiah
Copy link
Copy Markdown
Contributor Author

isaiah commented Dec 16, 2020

Thanks for the review!

@BCSharp What you said makes sense, it could be possible to only generate FunctionDefinition during rewriting. Will give a try with this approach.

Further, there is no need to create a function for each nested comprehension

there is indeed only one function per comprehension, not for each ComprehensionFor

did you investigate the option of hooking the old comprehension scope up to the scope hierarchy

this would be the cleanest solution, I imagine if we move the scope handling out of the AST, and make PythonNameBinder the counterpart of symtable in CPython, it would happen, but I still need to play more with the code.

I didn't know it was possible to convert an expression tree back to a Python AST

There are multiple XxxConvert methods in the _ast module. i.e.
https://github.com/IronLanguages/ironpython3/blob/master/Src/IronPython/Modules/_ast.cs#L213

It needs this because the Parser produces only expression tree, if we need Python AST, we have to convert back.

re: @slozier

Is this what's causing the failures in IronPython.test_function, CPython.test_argparse and all the distutils tests?

the late two for sure, test_function is a bit hard to tell.

@slozier
Copy link
Copy Markdown
Contributor

slozier commented Dec 16, 2020

Interesting that using a comprehension creates a closure on an unrelated function. The test_function failure is essentially this:

def test_function_closure():
    def f(): pass

    assert f.__closure__ is None

    f = []
    [x for x in f]

test_function_closure()

@BCSharp
Copy link
Copy Markdown
Member

BCSharp commented Dec 16, 2020

there is indeed only one function per comprehension, not for each ComprehensionFor

What I meant is one function for nested comprehensions like

[{ y: (z**2 for z in range(x)) for y in range(3)} for x in range(3)]

But if we can keep ordinary cases inlined, I suppose this becomes less important.

There are multiple XxxConvert methods in the _ast module.

Oh yes, I knew about them. Sorry, I got confused about the terminology. I thought that "lowered" AST meant reduced AST. So am I assuming correctly that a reduced AST (expression tree) never gets converted back to a Python AST? What about a rewritten AST? If so, we can keep the expression tree produced by the parser aligned with Python, and only create hidden lambdas when really needed.

@isaiah
Copy link
Copy Markdown
Contributor Author

isaiah commented Dec 16, 2020

@BCSharp for cases like:
[{ y: (z**2 for z in range(x)) for y in range(3)} for x in range(3)]
it generates following function:

def <dictcomp>():
  __ret__ = {}
  for x in range(3):
    for y in range(3):
      __ret__.__setitem__(y, (z**2 for z in range(x)))
  return __ret__

You might have seen two scopes, because the value itself is an generator expression, if you were talking about merging both scope, probably not easy and will cause other issues.

[lambda: i for i in [1,2]] should bind the `i` correctly now.
@isaiah
Copy link
Copy Markdown
Contributor Author

isaiah commented Dec 16, 2020

Rolled back the AST changes and only kept the outer binding in ComprehensionScope, at the moment I'm not capable of fixing the failing spec when IsolationLevel=Scope, so I skipped the regression test in test_closure.
Since it is not a regression and this fix would still improve comprehension in ipy, please advise if we are good to merge or we should close this PR. @slozier @BCSharp

@slozier
Copy link
Copy Markdown
Contributor

slozier commented Dec 16, 2020

As you say, it's an improvement over the current state so if @BCSharp is good with it, so am I.

@BCSharp
Copy link
Copy Markdown
Member

BCSharp commented Dec 17, 2020

Yes, I am OK with that, it is a step in the right direction (functionality-wise) and we can always come back to it later. Good to merge.

@isaiah isaiah requested a review from BCSharp December 17, 2020 10:31
@isaiah
Copy link
Copy Markdown
Contributor Author

isaiah commented Dec 18, 2020

The specs are green, please approve.

Copy link
Copy Markdown
Member

@BCSharp BCSharp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
The delay in review had nothing to do with the quality of the work, I was simply not at my workstation.
For future submissions, I'd recommend using a dedicated (per pull request) git branch rather than master. Otherwise the history of your pull request commits is erased the moment you reset your master to the upstream master and push it to GitHub again.

@slozier slozier merged commit 6fd3527 into IronLanguages:master Dec 18, 2020
@slozier
Copy link
Copy Markdown
Contributor

slozier commented Dec 18, 2020

Thanks for all the work!

@BCSharp
Copy link
Copy Markdown
Member

BCSharp commented Dec 18, 2020

And my thanks!

@isaiah
Copy link
Copy Markdown
Contributor Author

isaiah commented Dec 19, 2020

Thank you, looking forward to contributing more.

For future submissions, I'd recommend using a dedicated (per pull request) git branch rather than master.

Good point, will keep that in mind.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Name fails to resolve in generator's if when inside comprehension

4 participants