Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tentative fix for #439 #451

Merged
merged 3 commits into from
Apr 17, 2023
Merged

Tentative fix for #439 #451

merged 3 commits into from
Apr 17, 2023

Conversation

andrem-eberle
Copy link
Contributor

I altered the code on scanner.py a bit, to separate CONST from NAME vars, in fact it seems it already had this part, but was garbled outside the loop?

@rocky
Copy link
Owner

rocky commented Apr 17, 2023

This looks good. Since this is so simple, would you do just a little more and include a test for this?

One place a check could be added is to the existing test test/simple_source/expression/05_long_literals.py

Additional code might be:

# Check that we can distinguish names from strings in literal collections, e.g. lists.
# The list has to have more than 4 items go get accumulated in a collection
a = ["y", 'Exception', "x", Exception, "z"]
assert a[1] == "Exception"
assert a[3] == Exception

To update the existing bytecode in test/ run:

./add-test.py -r test/simple_source/expression/05_long_literals.py

@andrem-eberle
Copy link
Contributor Author

Done and done.



# Check that we can distinguish names from strings in literal collections, e.g. lists.
# The list has to have more than 4 items go get accumulated in a collection
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

go get -> to get

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

for j in range(collection_start, i):
if tokens[j] == "LOAD_CONST":
Copy link
Owner

@rocky rocky Apr 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now realize that while this is necessary, it probably is not sufficient. There may be constants that are code blocks.

However we can start at this and if you are up for it see if we can find those kinds of things that use LOAD_CONST with non-string items in another PR.

Copy link
Owner

@rocky rocky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM and thanks. There still may be some edge cases to work out which is noted in the comment.

@rocky rocky merged commit 2f650a6 into rocky:master Apr 17, 2023
@rocky
Copy link
Owner

rocky commented Apr 17, 2023

@andrem-eberle Actually I just tried this out and this has problems.

The bottom of the decompilation looks like this:

    a = [
     "'y'", "'Exception'", "'x'", 'Exception', "'z'"]

There are extra quotes. Also, the assert statements were removed by the compiler removed because it figured these are universally true.

To get back the assert put the assert with the == inside a function, and then Python won't be able to figure this out.

As for the extra quotes, that is somewhere in semantics actions.

@andrem-eberle
Copy link
Contributor Author

This is odd, I can't seem to reproduce the extra quotes here, using latest xdis version and the forked uncompyle6. I am using the pip version of spark, could it be related?

@rocky
Copy link
Owner

rocky commented Apr 18, 2023

Probably not spark related. And the problem is not in xdis either. I tried disassembling to check.

I also tried running from Python 3.8 (and not 3.8).

When I get a chance I will try to see where I am getting the extra quotes. However in disassembly I can definitely see that the assert statements are gone in the bytecode.

@andrem-eberle
Copy link
Contributor Author

andrem-eberle commented Apr 18, 2023

Ah I see. Sorry I have been focusing on 2.7, forgot about 3.x entirely. It does decompile wrong for code compiled with 3.8. In fact, the else block is decompiled wrong here in 3.8 (the values is inserted into the else), which means the test doesn't fail anyway (even with forced asserts), since it never gets into the else. The Exception test is also inside the else in the decompiled version:

if sys.version < (3, 0):
    pass
else:
    values = {'value1':a + 1,
     'value2':2,
     'value3':3,
...

I will look into the extra quotes.

@andrem-eberle
Copy link
Contributor Author

andrem-eberle commented Apr 18, 2023

Well the cause is the str.__repr__() method, it adds the quotes in 3.8 but not in 2.7 (which is why I wasn't seeing this error in 2.7).

One option is to check for python source version in n_actions.py and use repr() or str(), I did some test here and it worked. Is this kind of solution acceptable for you? If so I will add another pull request.

@rocky
Copy link
Owner

rocky commented Apr 18, 2023

Well the cause is the str.repr() method, it adds the quotes in 3.8 but not in 2.7 (which is why I wasn't seeing this error in 2.7).

One option is to check for python source version in n_actions.py and use repr() or str(), I did some test here and it worked. Is this kind of solution acceptable for you? If so I will add another pull request.

Yes, this is good - thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants