Skip to content
This repository has been archived by the owner on Jul 5, 2023. It is now read-only.

ast27 gives syntax error for valid code on s390x #151

Closed
Ikke opened this issue Dec 25, 2020 · 26 comments · Fixed by #152
Closed

ast27 gives syntax error for valid code on s390x #151

Ikke opened this issue Dec 25, 2020 · 26 comments · Fixed by #152

Comments

@Ikke
Copy link

Ikke commented Dec 25, 2020

This examples comes from the test suite of black

from typed_ast import ast27

ast27.parse('''#!/usr/bin/env python2.7

def function((_globals, _locals)):
    print _globals, _locals
''')

Results in:

Traceback (most recent call last):

  File "/usr/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3417, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  File "<ipython-input-3-46b135e3f4de>", line 1, in <module>
    ast27.parse('''#!/usr/bin/env python2.7

  File "/usr/lib/python3.8/site-packages/typed_ast/ast27.py", line 50, in parse
    return _ast27.parse(source, filename, mode)

  File "<unknown>", line 3
SyntaxError: can't assign to ()

The same code parses fine on python 2.7.18.

This is with typed-ast 1.4.1. It only happens on s390x on Alpine Linux edge.

Let me know if you need more information.

@gvanrossum
Copy link
Member

I’m as stumped as you. Does ast27 parse this on other platforms?

@Ikke
Copy link
Author

Ikke commented Dec 25, 2020

Yes, for example on x86_64:

>>> from typed_ast import ast27
>>> ast27.parse('''#!/usr/bin/env python2.7
...
... def function((_globals, _locals)):
...     print _globals, _locals
... ''')
<typed_ast._ast27.Module object at 0x7f23004190d0>

@hauntsaninja
Copy link
Collaborator

Related: #139 (comment)

It's interesting that it passes on Python 2.7.18. Diffing the two ast.c's and looking through the changes in ast_for_arguments and compiler_complex_args didn't reveal anything that looked too suspicious to my eyes.

@gvanrossum
Copy link
Member

Maybe it's an endianness problem? IIUC s390 is big-endian, while most prevailing architectures (certainly x86) are little-endian. This would be tough to track down without access to the hardware, so I think the OP will have to debug their own problem. :-(

@ethanhs
Copy link
Contributor

ethanhs commented Dec 26, 2020

Well this is rather bizzare. I got somewhat nerd sniped by this problem and initially tried running s390x in QEMU, but I couldn't get Linux to boot correctly, so I gave up (suffice to say it isn't the most supported platform out there).

I then tried on an old powerbook g4, which uses powerpc and thus is also big endian (though it is only 32bit). I tried running things and everything worked correctly, both the example shown here and all the tests passed unlike #139. So either this is not specific to BE, or its only 64bit BE.

@gvanrossum
Copy link
Member

Thanks for trying that! I believe we have some copies of header files that are used to describe internal APIs. Maybe those are slightly out of sync in a way that only becomes a problem on BE64?

FWIW This may just be a duplicate of #139, as @hauntsaninja pointed out.

@Ikke, are you interested in digging in?

@NattyNarwhal
Copy link

Can't seem to repro here. (It's the PASE Python build shipped by IBM for i, FWIW.)

Python 3.6.12 (default, Nov 10 2020, 13:50:57) 
[GCC 6.3.0] on aix7
Type "help", "copyright", "credits" or "license" for more information.
>>> from typed_ast import ast27
>>> ast27.parse('''#!/usr/bin/env python2.7
... 
... def function((_globals, _locals)):
...     print _globals, _locals
... ''')
<typed_ast._ast27.Module object at 0x7000000002bb320>
>>> import platform
>>> platform.architecture()
('64bit', 'COFF')

@Ikke
Copy link
Author

Ikke commented Dec 27, 2020

@Ikke, are you interested in digging in?

Sure, though I do need some pointers on how to debug python extensions, which I do not have experience with.

I have python installed with debug symbols, and also built typed-ast with debug symbols, but am I bit stuck to get useful information out of gdb.

@gvanrossum
Copy link
Member

I would start by finding how the error message is generated, e.g. by grepping for "can't assign to". It's possible that this message is constructed out of pieces, maybe grep for "assign to".

@Ikke
Copy link
Author

Ikke commented Dec 28, 2020

Thanks, I tried something like that before, but could not find anything. Now I found the set_context function that most likely returns this error.

Specifically this code:

        case Tuple_kind:
            if (asdl_seq_LEN(e->v.Tuple.elts))  {
                e->v.Tuple.ctx = ctx;
                s = e->v.Tuple.elts;
            }
            else {
                expr_name = "()";
            }
            break;

asdl_seq_LEN is defined as

#define asdl_seq_LEN(S) ((S) == NULL ? 0 : (S)->size)

So I guess we need to confirm whether e->v.Tuple.elts is NULL.

@gvanrossum
Copy link
Member

I suppose you could put a breakpoint there and get a traceback from gdb so you can tell where it's called from.

@Ikke
Copy link
Author

Ikke commented Dec 28, 2020

Yes, was working on that. This is what I found so far:

(gdb) p *e.v.Tuple.elts
$2 = {size = 0, elements = {0x2aa000b26d8}}
(gdb) bt
#0  set_context (c=c@entry=0x3ffffffe920, e=e@entry=0x2aa000b2748, ctx=ctx@entry=Store, n=n@entry=0x3fffdafb198) at ast27/Python/ast.c:459
#1  0x000003fffda92546 in compiler_complex_args (c=c@entry=0x3ffffffe920, n=n@entry=0x3fffdafb198) at ast27/Python/ast.c:726
#2  0x000003fffda93224 in ast_for_arguments (c=c@entry=0x3ffffffe920, n=<optimized out>) at ast27/Python/ast.c:823
#3  0x000003fffda9838c in ast_for_funcdef (c=c@entry=0x3ffffffe920, n=0x3fffdacf8d0, decorator_seq=decorator_seq@entry=0x0) at ast27/Python/ast.c:1022
#4  0x000003fffda96350 in ast_for_stmt (c=c@entry=0x3ffffffe920, n=<optimized out>, n@entry=0x3fffdafb3f0) at ast27/Python/ast.c:3493
#5  0x000003fffda98db6 in Ta27AST_FromNode (n=n@entry=0x3fffdacf810, flags=flags@entry=0x3ffffffea30, filename=<optimized out>, arena=arena@entry=0x3fffdba5af0) at ast27/Python/ast.c:275
#6  0x000003fffdaacb1a in string_object_to_c_ast (arena=0x3fffdba5af0, flags=0x3ffffffea30, start=257, filename=0x3fffdad2a30,
    s=0x3fffdb95210 "#!/usr/bin/env python2.7\n\ndef function((_globals, _locals)):\n    print _globals, _locals\n") at ast27/Custom/typed_ast.c:226
#7  string_object_to_py_ast (flags=0x3ffffffea30, start=<optimized out>, filename=0x3fffdad2a30, str=0x3fffdb95210 "#!/usr/bin/env python2.7\n\ndef function((_globals, _locals)):\n    print _globals, _locals\n")
    at ast27/Custom/typed_ast.c:248
#8  ast27_parse_impl (mode=<optimized out>, filename=0x3fffdad2a30, source=<optimized out>) at ast27/Custom/typed_ast.c:291
#9  ast27_parse (self=<optimized out>, args=<optimized out>) at ast27/Custom/typed_ast.c:312
#10 0x000003fffdcd0c30 in cfunction_call_varargs (func=0x3fffdac2630, args=0x3fffdaf3f40, kwargs=0x0) at Objects/call.c:757
[..]

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Dec 28, 2020

Yup, compiler_complex_args is what's responsible for tuple parameter unpacking (as I alluded to in #151 (comment)). I'd just step through it on both architectures and see where they differ. Maybe len is different?

@gvanrossum
Copy link
Member

You should probably turn off optimizations in the compiler so you don't get the <optimized out> cookies. For me, having built Python itself with -g -O, if I use that Python build to run setup.py build --debug here, it looks like it passes those flags here (but only if I use --debug).

FWIW I worry that what you're seeing in this traceback is just the code producing the error message, which maybe removed from the actual parsing problem that's the root cause.

@Ikke
Copy link
Author

Ikke commented Dec 28, 2020

I've added -O0 to extra_compiler_args, and now it's no longer optimized.

FWIW I worry that what you're seeing in this traceback is just the code producing the error message, which maybe removed from the actual parsing problem that's the root cause

Yes, I figured as much, so I would need to find where this part of the AST is generated.

@gvanrossum
Copy link
Member

gvanrossum commented Dec 28, 2020

I'd look into compiler_complex_args().

@Ikke
Copy link
Author

Ikke commented Dec 29, 2020

Not sure if it means something, but one difference I found:

I found something peculiar:

In compiler_complex_args, a new asdl_seq is created with a provided length 2.

If I step through with gdb in s390x, on line 685:

(gdb) p *args                     
$14 = {size = 0, elements = {0x0}}
(gdb) n                           

So size equals 0.

Same on x86_64:

(gdb) p *args
$5 = {size = 2, elements = {0x0}}

Here, size is 2, as expected.

The peculiar thing, if I check the value of size right before it's returned from asdl_seq_new (on s390x):

(gdb) p *seq                      
$10 = {size = 2, elements = {0x0}}

So here it's still 2, but right after it returns, it suddenly 0.

Note that for some reason, it looks for _Py_asdl_seq_new in Parser/asdl.c. To be able to step into that function, I symlinked ast27/Parser to the root of the project, not sure if that affects anything.

But the fact that size is 0 in that asdl_seq is the reason that it returns that error message. Any clue why this might be happening?

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Dec 29, 2020

So to confirm, on S390X, seq->size is 2 at

return seq;
but is somehow changed to 0 when we get to
if (!args)

That's weird.

Does x-ing the memory show it's all zeroes? Maybe try looking at the assembly?

@Ikke
Copy link
Author

Ikke commented Dec 29, 2020

Does x-ing the memory confirm it's all zeroes?

Interesting:

(gdb) x/16xb args                                                           
0x2aa000b26c0:  0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x02
0x2aa000b26c8:  0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00

So the size is still present in the data

For comparrson, on x86_64:

(gdb) x/16hb args
0x7ffff7ad1240: 0x02    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7ffff7ad1248: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00

For looking at the assembly, I would need a bit more guidance, I have no experience with that (though, I can probably defer it other with more experience on s390x, but I don't know when they have time).

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Dec 29, 2020

Nice!

Okay, so to confirm, on S390X, when the same memory is interpreted as asdl_seq *args (in ast.c) you get args->size == 0 and when it's interpreted as asdl_seq *seq (in asdl.c) you get seq->size == 2?

Maybe asdl_seq in ast.c is different from asdl_seq in asdl.c, and they have different memory layouts?

@Ikke
Copy link
Author

Ikke commented Dec 29, 2020

Bingo:

in asdl.c:

(gdb) ptype asdl_seq  
type = struct {       
    Py_ssize_t size;  
    void *elements[1];
}

in ast.c:

(gdb) ptype asdl_seq  
type = struct {       
    int size;         
    void *elements[1];
}                     

So Py_ssize_t vs int for size. Py_ssize_t is 8 bytes instead of 4.

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Dec 29, 2020

Yay! That explains @ethanhs ' findings as well.
And now we have a way to repro the issue on x86. It seems that #include "../Include/asdl.h" in ast27/Python/asdl.c isn't getting ast27/Include/asdl.h, but I can't figure out why.

Edit:
info source tells me I'm actually ending up in Python's asdl.c. So maybe this is a linking issue? It's not often that I get to say I miss C++

hauntsaninja pushed a commit to hauntsaninja/typed_ast that referenced this issue Dec 30, 2020
@hauntsaninja
Copy link
Collaborator

@Ikke can you check if #152 fixes it for you?

@Ikke
Copy link
Author

Ikke commented Dec 30, 2020

Yes, it does. I also ran the black test suite, which passes now. Thanks!

@hauntsaninja
Copy link
Collaborator

Hooray! Thank you for debugging (and reporting)!

@gvanrossum would you be willing to review #152 and maybe make a release?

hauntsaninja added a commit that referenced this issue Dec 30, 2020
Fixes #151 and fixes #139

Linking shenanigans caused ast27 to link against Python's asdl.c, instead of ast27/Python/asdl.c. Python's asdl.c uses Py_ssize_t for asdl_seq's size whereas ast27/Python/asdl.c uses int. This means we were interpreting memory differently in asdl.c and ast.c. On 64-bit big-endian systems, this caused us to misplace asdl.c's writes to asdl_seq->size, since Py_ssize_t is eight bytes, but int is four. The fix mirrors what we do in ast3, that is, manually namespace definitions to avoid linking conflicts.

Co-authored-by: hauntsaninja <>
@gvanrossum
Copy link
Member

gvanrossum commented Dec 30, 2020 via email

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants