Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

variable 'referenced before assignment' warning, error in all() with .pxd but not .pyx #3477

Closed
rob-miller opened this issue Apr 1, 2020 · 8 comments · Fixed by #4127
Closed

Comments

@rob-miller
Copy link

The following is a standalone example of a problem I've run into trying to accelerate some code:

given allExt.py (takes a set of words and checks if 3rd letter in each is a vowel):

class a3v(object):
    def __init__(self, inset):
        self.myStrSet = inset   # set input word list as an attribute on  class instance

    def all3Vowels(self):     # test if 3rd letter of each word is a vowel
        vndx = 2                    # local variable for 3rd letter index
        inStrSet = self.myStrSet    # local variable for class instance word list
        if all(
            ltr in ("a", "e", "i", "o", "u")
            for ltr in (w[vndx] for w in inStrSet)
        ):
            return True
        return False

and tst.py:

#!/usr/local/bin/python3
import allExt

a3vL = allExt.a3v(["the", "bright", "flower"])

if a3vL.all3Vowels():
   print('yes')
else:
   print('no')

The code runs without issue, and fits with examples I have found in the cython docs. However, when augmented with allExt.pxd:

import cython

cdef class a3v:
     cdef public list myStrSet

     @cython.locals(vndx=int, inStrSet=list, ltr=str)
     cpdef bint all3Vowels(self)

Upon compiling as shown I get:

rob@nova-370:01 14:45:55 atmNdx $ cythonize -3 -f -i allExt.py allExt.pxd
[1/1] Cythonizing /Users/rob/sync/rgithub/cython/atmNdx/allExt.py
warning: allExt.py:12:26: local variable 'vndx' referenced before assignment

Error compiling Cython file:
------------------------------------------------------------
...
    def all3Vowels(self):
        vndx = 2
        inStrSet = self.myStrSet
        if all(
            ltr in ("a", "e", "i", "o", "u")
            for ltr in (w[vndx] for w in inStrSet)
                                        ^
------------------------------------------------------------

allExt.py:12:41: local variable 'inStrSet' referenced before assignment
Traceback (most recent call last):

OTOH, the equivalent (?) pyx version allExt.pyx:

class a3v(object):
    def __init__(self, inset):
        self.myStrSet = inset

    def all3Vowels(self):
        cdef int vndx = 2
        cdef list inStrSet = self.myStrSet
        cdef str ltr
        if all(
            ltr in ("a", "e", "i", "o", "u")
            for ltr in (w[vndx] for w in inStrSet)
        ):
            return True
        return False

compiles and runs without issue.

Have I missed something in the docs or some other simple issue?
Thank you.

@scoder
Copy link
Contributor

scoder commented Apr 3, 2020

Thanks for the report. There seems to be something wrong with the nested generator expressions here and the way the type information is applied from the decorator (or maybe when it is applied).

You didn't mention which Cython version you were using, so I guess it's a recent release. Could you try the latest master branch, and also PR #3323?

BTW, you don't need to declare inStrSet here because its type should get inferred from the assignment, since the attribute is typed.

@rob-miller
Copy link
Author

rob-miller commented Apr 3, 2020 via email

@matusvalo
Copy link
Contributor

matusvalo commented Apr 16, 2021

I have spend several hours investigating the bug and here are my findings.

Bug details

The issue can be easilly reproduced by any construct using for ... in (maybe except list and dict comprehension) when .pxd file is used:

$ cat test.py
def foo():
    a = (x for x in range(5))
foo()

$ cat test.pxd
cdef foo()

but also all(), any(), etc... are causing crashing cython. The example above is crashing with following exception:

Traceback (most recent call last):
  File "/home/matus/dev/cython39/bin/cythonize", line 33, in <module>
    sys.exit(load_entry_point('Cython', 'console_scripts', 'cythonize')())
  File "/home/matus/dev/cython/Cython/Build/Cythonize.py", line 224, in main
    cython_compile(path, options)
  File "/home/matus/dev/cython/Cython/Build/Cythonize.py", line 68, in cython_compile
    ext_modules = cythonize(
  File "/home/matus/dev/cython/Cython/Build/Dependencies.py", line 1115, in cythonize
    cythonize_one(*args)
  File "/home/matus/dev/cython/Cython/Build/Dependencies.py", line 1261, in cythonize_one
    result = compile_single(pyx_file, options, full_module_name=full_module_name)
  File "/home/matus/dev/cython/Cython/Compiler/Main.py", line 575, in compile_single
    return run_pipeline(source, options, full_module_name)
  File "/home/matus/dev/cython/Cython/Compiler/Main.py", line 503, in run_pipeline
    err, enddata = Pipeline.run_pipeline(pipeline, source)
  File "/home/matus/dev/cython/Cython/Compiler/Pipeline.py", line 370, in run_pipeline
    data = run(phase, data)
  File "/home/matus/dev/cython/Cython/Compiler/Pipeline.py", line 350, in run
    return phase(data)
  File "/home/matus/dev/cython/Cython/Compiler/Pipeline.py", line 52, in generate_pyx_code_stage
    module_node.process_implementation(options, result)
  File "/home/matus/dev/cython/Cython/Compiler/ModuleNode.py", line 171, in process_implementation
    self.generate_c_code(env, options, result)
  File "/home/matus/dev/cython/Cython/Compiler/ModuleNode.py", line 451, in generate_c_code
    self.body.generate_function_definitions(env, code)
  File "/home/matus/dev/cython/Cython/Compiler/Nodes.py", line 400, in generate_function_definitions
    stat.generate_function_definitions(env, code)
  File "/home/matus/dev/cython/Cython/Compiler/Nodes.py", line 1839, in generate_function_definitions
    self.generate_lambda_definitions(lenv, code)
  File "/home/matus/dev/cython/Cython/Compiler/Nodes.py", line 374, in generate_lambda_definitions
    node.generate_function_definitions(env, code)
  File "/home/matus/dev/cython/Cython/Compiler/Nodes.py", line 4432, in generate_function_definitions
    super(GeneratorDefNode, self).generate_function_definitions(env, code)
  File "/home/matus/dev/cython/Cython/Compiler/Nodes.py", line 3341, in generate_function_definitions
    FuncDefNode.generate_function_definitions(self, env, code)
  File "/home/matus/dev/cython/Cython/Compiler/Nodes.py", line 1886, in generate_function_definitions
    if self.needs_closure:
AttributeError: 'ClosureScope' object has no attribute 'scope_class'

The cause of the bug

I have compared the version above with following .pyx version:

cdef foo():
    a = (x for x in range(5))

foo()

It seems that the problem is in the following part:

> /home/matus/dev/cython/Cython/Compiler/ParseTreeTransforms.py(2879)visit_FuncDefNode()
-> if node.needs_closure or self.path:
(Pdb) l
2874        def visit_FuncDefNode(self, node):
2875            if self.in_lambda:
2876                self.visitchildren(node)
2877                return node
2878            breakpoint()
2879 ->         if node.needs_closure or self.path:
2880                self.create_class_from_scope(node, self.module_scope)
2881                self.path.append(node)
2882                self.visitchildren(node)
2883                self.path.pop()
2884            return node

In working .pyx case node.needs_closure is evaluated as True and the block is executed. In broken .py case it is evaluated as False, the block is not executed and crash is followed. I did experiment and during breakpoint I have executed following:

(Pdb) l
2874        def visit_FuncDefNode(self, node):
2875            if self.in_lambda:
2876                self.visitchildren(node)
2877                return node
2878            breakpoint()
2879 ->         if node.needs_closure or self.path:
2880                self.create_class_from_scope(node, self.module_scope)
2881                self.path.append(node)
2882                self.visitchildren(node)
2883                self.path.pop()
2884            return node
(Pdb) p node.needs_closure
False
(Pdb) node.needs_closure = True
(Pdb) c

After that the build was finished successfully. Hence, from that point, I focused on investigating needs_closure.

needs_closure details

And I have found out the following:
1. case with .pyx file which is not crashing:

> /home/matus/dev/cython/Cython/Compiler/ParseTreeTransforms.py(2678)visit_FuncDefNode()
-> print('3', self.needs_closure)
(Pdb) l
2673        def visit_FuncDefNode(self, node):
2674            print('2', False)
2675            self.needs_closure = False
2676            self.visitchildren(node)
2677            breakpoint()
2678 ->         print('3', self.needs_closure)
2679            print('3', True)
2680            node.needs_closure = self.needs_closure
2681            self.needs_closure = True
2682
2683            collector = YieldNodeCollector()
(Pdb) p node
<Cython.Compiler.Nodes.DefNode object at 0x7f53cf6a0490>
(Pdb) p self
<Cython.Compiler.ParseTreeTransforms.MarkClosureVisitor object at 0x7f53c8c0f400>
(Pdb) c
3 False
3 True
5 True
5 True
> /home/matus/dev/cython/Cython/Compiler/ParseTreeTransforms.py(2724)visit_CFuncDefNode()
-> print('4', self.needs_closure)
(Pdb) p node
<Cython.Compiler.Nodes.CFuncDefNode object at 0x7f53cf5e4310>                   # Node object Address is 0x7f53cf5e4310
(Pdb) p self
<Cython.Compiler.ParseTreeTransforms.MarkClosureVisitor object at 0x7f53c8c0f400>
(Pdb) c
4 True
4 True
> /home/matus/dev/cython/Cython/Compiler/ParseTreeTransforms.py(2878)visit_FuncDefNode()
-> if node.needs_closure or self.path:
(Pdb) p node
<Cython.Compiler.Nodes.CFuncDefNode object at 0x7f53cf5e4310>                   # Node object Address is 0x7f53cf5e4310
(Pdb)

Here can be seen that node object has needs_closure is set to True (in the line 2724, both nodes are CFuncDefNode) and it was set as displayed above (node objects has same address).

2. case with .py file which is crashing:

> /home/matus/dev/cython/Cython/Compiler/ParseTreeTransforms.py(2678)visit_FuncDefNode()
-> print('3', self.needs_closure)
(Pdb) p node
<Cython.Compiler.Nodes.DefNode object at 0x7f6016a82c10>
(Pdb) p self
<Cython.Compiler.ParseTreeTransforms.MarkClosureVisitor object at 0x7f6016a866a0>
(Pdb) c
3 False
3 True
5 True
5 True
> /home/matus/dev/cython/Cython/Compiler/ParseTreeTransforms.py(2678)visit_FuncDefNode()
-> print('3', self.needs_closure)
<Cython.Compiler.Nodes.DefNode object at 0x7f6016a9e040>
(Pdb) p self
<Cython.Compiler.ParseTreeTransforms.MarkClosureVisitor object at 0x7f6016a866a0>
(Pdb) c
3 True
3 True
> /home/matus/dev/cython/Cython/Compiler/ParseTreeTransforms.py(2878)visit_FuncDefNode()
-> if node.needs_closure or self.path:
(Pdb) p node
<Cython.Compiler.Nodes.CFuncDefNode object at 0x7f600fdfd640>

Here can be seen that node object has needs_closure is set to False because there is no object with the address 0x7f600fdfd640 - the previous nodes are DefNode not CFuncDefNode. This is causing that the statement if node.needs_closure or self.path is evaluated as false causing later on crash as showed before.

@scoder I would like to fix this bug, but now I am slightly stuck. Could you give me a point which part should I look into?

@da-woods
Copy link
Contributor

@matusvalo The node will be converted into a CFuncDefNode in AlignFunctionDefinitions

node = node.as_cfunction(pxd_def)
.

It looks like that gets run just after MarkClosureTransform

MarkClosureVisitor(context),
_align_function_definitions,

You might be able to get away with just swapping their order. If that doesn't work then you might have to do something cleverer to repair the relevant links in AlignFunctionDefinitions.

@matusvalo
Copy link
Contributor

Thank you @da-woods

@matusvalo
Copy link
Contributor

I have swapped the order and it works!

matus@L3505101303:~/dev/cython/Cython$ git diff
diff --git a/Cython/Compiler/Pipeline.py b/Cython/Compiler/Pipeline.py
index 4e8e98c71..342481c32 100644
--- a/Cython/Compiler/Pipeline.py
+++ b/Cython/Compiler/Pipeline.py
@@ -188,8 +188,8 @@ def create_pipeline(context, mode, exclude_classes=()):
         ParallelRangeTransform(context),
         AdjustDefByDirectives(context),
         WithTransform(context),
-        MarkClosureVisitor(context),
         _align_function_definitions,
+        MarkClosureVisitor(context),
         RemoveUnreachableCode(context),
         ConstantFolding(),
         FlattenInListTransform(),

Now the file is compiling without error:

$ rm *.c; cythonize -3ib test.py
rm: cannot remove '*.c': No such file or directory
Compiling /home/matus/dev/test/test.py because it changed.
[1/1] Cythonizing /home/matus/dev/test/test.py
running build_ext
building 'test' extension
creating /home/matus/dev/test/tmp8cp6hu35/home
creating /home/matus/dev/test/tmp8cp6hu35/home/matus
creating /home/matus/dev/test/tmp8cp6hu35/home/matus/dev
creating /home/matus/dev/test/tmp8cp6hu35/home/matus/dev/test
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/matus/dev/cython39/include -I/usr/include/python3.9 -c /home/matus/dev/test/test.c -o /home/matus/dev/test/tmp8cp6hu35/home/matus/dev/test/test.o
x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 /home/matus/dev/test/tmp8cp6hu35/home/matus/dev/test/test.o -o /home/matus/dev/test/test.cpython-39-x86_64-linux-gnu.so

I have checked also the functionality and the generated library is working correctly (at least for my simple example)!

@da-woods
Copy link
Contributor

Good! When you have time feel free to submit it as a PR (with your small example test integrated into https://github.com/cython/cython/blob/master/tests/run/pure_pxd.srctree).

Hopefully it won't break anything else, but the full lot of tests will get run on the PR so we'll find out

@matusvalo
Copy link
Contributor

PR created. I will do further validation in larger code base.

scoder pushed a commit that referenced this issue May 11, 2021
This commit fixes a crash of Cython when generator expressions are used in cdef functions in pure python mode.
Closes #3477
@scoder scoder added this to the 3.0 milestone May 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants