Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested if-while-return fails to decompile in 2.6 #284

Closed
trengri opened this issue Aug 7, 2019 · 5 comments
Closed

Nested if-while-return fails to decompile in 2.6 #284

trengri opened this issue Aug 7, 2019 · 5 comments
Labels
Control Flow Problem has to do with bad control-flow detection Python 2.6

Comments

@trengri
Copy link
Contributor

trengri commented Aug 7, 2019

The following code cannot be decompiled when compiled with Python 2.6.

def f():
  if True:
    while True:
      return

When compiled with Python 2.7, it is decompiled fine.
Link to the bytecode: https://ufile.io/st8qz15a

I understand this must be a control flow issue. Are you interested in other 2.6 control flow bug reports?

@trengri trengri changed the title Nested if-while-return failes to decompile in 2.6 Nested if-while-return fails to decompile in 2.6 Aug 7, 2019
@rocky rocky closed this as completed in 55783c2 Aug 9, 2019
@rocky
Copy link
Owner

rocky commented Aug 9, 2019

Bugs are bugs. It not about interest so much as having the time to address bugs. There are many more control-flow bugs in 3.7 and 3.8, and those are harder to address.

Probably a better use of my time is to address the larger problem handling control flow in more comprehensive way.

@rocky rocky added Control Flow Problem has to do with bad control-flow detection Python 2.6 labels Aug 9, 2019
@trengri
Copy link
Contributor Author

trengri commented Aug 9, 2019

Just noticed that "while 1" also fails to decompile:

def f():
  if True:
    while 1:
      return

Sorry for not reporting it initially, didn't think that "while True" and "while 1" produce different bytecode.

@rocky
Copy link
Owner

rocky commented Aug 9, 2019

@trengri
Copy link
Contributor Author

trengri commented Aug 10, 2019

Looking at your patch it was easy.
parse26.py has the following grammar rule:
while1stmt ::= SETUP_LOOP returns COME_FROM
I added this one:
while1stmt ::= SETUP_LOOP returns
And it did the trick.

It could be that this fix can cause problems as I lack compiler background and don't understand the logic by looking at reduction rules. It was your patch that helped, not uncompyle2 tree or uncompile6 traces.
I'm reading the trace from top to bottom and could not figure out how the parser works. The line numbers go out of order, it is difficult to tell what things are normal, what are not, what to expect, etc.

Do we need to worry about "Reduce and invalid by check" messages as it looks like they are printed even if decompile is sucessfull?

Not sure if I understand correctly the following sentence in Fixing Issue #151

> But token number 3 in parenthesis is the last token that we have seen. In particular we have seen SETUP_LOOP LOAD_FAST POP_JUMP_IF_FALSE."

Why it is the last? There are another tokens in the trace...
Got it. Just need to read it once again.

Also notice that the returns starts on line three, token 9 which encompases things that should be in testexpr

Maybe offset 9 (token 10)? Or am I missing something?

To fix the bug #151, you changed 'iflaststmtl' rule, but 'iflaststmtl' is not mentioned anywhere in the grammar reductions. It would be nice if you describe how did you come to 'iflaststmtl'.

@rocky
Copy link
Owner

rocky commented Aug 11, 2019

Contragulations! This indeed is the proper fix.

It could be that this fix can cause problems as I lack compiler background and don't understand the logic by looking at reduction rules.

There are not ill side effects that I can see. I'll explain below.

It was your patch that helped, not uncompyle2 tree or uncompile6 traces.

Ok. If you want to add a wiki entry on how you figured this out, that'd be great!

Let me explain what's going on here. COME_FROM is this artifical instruction that was put in to help detect the control flow structures.

Previously we got this:

--- This code section failed: ---

   2       0  LOAD_GLOBAL           0  'True'
           3  JUMP_IF_FALSE        11  'to 17'
         6_0  THEN                     18
           6  POP_TOP          

   3       7  SETUP_LOOP            8  'to 18'

   4      10  LOAD_CONST               None
          13  RETURN_VALUE     
          14  JUMP_FORWARD          1  'to 18'
        17_0  COME_FROM             3  '3'
          17  POP_TOP          
        18_0  COME_FROM             7  '7'

If there were code after that while 1 statement, I suppose there would be a jump to offset 14 which appears after the RETURN. But in this code there is no jump to 14 anywhere.

What I'm trying to get across here, is not just that this happens to work in this particular situation, but also that it does model accurately the valid bytecode that is produced.

A slight change is that instead of adding that rule, change change the one tht has the COME_FROM so that it is optional. So instead of COME_FROM there is come_from_opt. But it is not expected that'd you'd have known this.

As to your questions...

Do we need to worry about "Reduce and invalid by check" messages as it looks like they are printed even if decompile is sucessfull?

No. In the grammar trace not every reduction that occurs is ultimately used. Only some of them.

In some cases, you can't stop an invalid rule from triggering before the reduction. So in the reduction there are additional checks. For example, in a key/value pair in say
a dictionary literal, you might have LOAD_CONST, expr. But that LOAD_CONST needs to load a string value and any other value would be wrong. So here we would add a check that the LOAD_CONST is loading a string. And you would see that message if it were loading something other than a string, like say a number.

Also notice that the returns starts on line three, token 9 which encompases things that should be in testexpr

Maybe offset 9 (token 10)? Or am I missing something?

I did mean token 9. The text was indeed confusing. I have since updated it.

The last piece of work that should be done here is to write a test program that has the while 1 and while True loops to insure in the future we don't regress. Would you be up for that and put in a PR for that? Thanks.

rocky added a commit that referenced this issue Aug 11, 2019
while 1 bug for 2.6
rocky added a commit that referenced this issue Aug 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Control Flow Problem has to do with bad control-flow detection Python 2.6
Projects
None yet
Development

No branches or pull requests

2 participants