Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Parsing bug when multiplying a negative number #1068

Open
mike919192 opened this issue May 12, 2024 · 7 comments
Open

[BUG] Parsing bug when multiplying a negative number #1068

mike919192 opened this issue May 12, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@mike919192
Copy link

Describe the bug
Multiplying by a negative number is being parsed incorrectly. Putting the negative number in parenthesis is a workaround to parse it correctly.

To Reproduce
This cpp2 code:

#include <iostream>
#include <string>

main: () = {
    
    std::cout << 1 * -1 << name() << "\n";
    std::cout << 1 * (-1) << name() << "\n";
}

Compiles to this cpp code


//=== Cpp2 type declarations ====================================================


#include "cpp2util.h"



//=== Cpp2 type definitions and function declarations ===========================

#include <iostream>
#include <string>

auto main() -> int;

//=== Cpp2 function definitions =================================================


auto main() -> int{

    std::cout << *cpp2::impl::assert_not_null(1) - 1 << name() << "\n";
    std::cout << 1 * (-1) << name() << "\n";
}

https://godbolt.org/z/KMc6sxoPe

Additional context
This is present in the head, but not in the release 0.7.0

@mike919192 mike919192 added the bug Something isn't working label May 12, 2024
@sookach
Copy link
Contributor

sookach commented May 16, 2024

I'll take a look at this issue.

@sookach
Copy link
Contributor

sookach commented May 17, 2024

Sorry for the late response, and thank you for the reporting this bug. Not sure if you could already tell, but essentially we're parsing the first expression as a dereference of '1' minus '1' which is not correct. The reason the next line doesn't have the same mistake is because the parser has a specific rule for disallowing a postfix star to be treated as a dereference if followed by a left parenthesis as below:

// * and & can't be unary operators if followed by a (, identifier, or literal

The reason it works in version 0.7.0 but not head is because of commit 5663493 which was published shortly after the release.

@hsutter From what I can tell, in order to fix this issue we would need to change the parser behavior (which would break the code that prompted the original commit).

We can't just add:

MINUS LITERAL
PLUS LITERAL

to the list of token(s) that can't follow a unary star, because even though it would fix this specific issue, that rule would prevent valid binary expressions involving a dereference, and it would also require

EXCLAIM+ LITERAL

to be added to the list of prohibited successor token(s) which would introduce an unbounded lookahead since any amount of exclaims can follow in each other in sequence.

Curious what you think the best course of action is, since it seems we will need to change the language's behavior.

@hsutter
Copy link
Owner

hsutter commented May 17, 2024

Edited to add: I think the following may also address the way the first line parses, because then like the second line it will see interpreting it as a postfix operator doesn't work and backtrack and try multiplication? Need to try it.

Thanks! I'm just in the middle of another bit where I'm not in a compilable state, but I would try this in parse.h:5798 (this is notes for myself, or in case someone wants to try it feel free):

-           //  * and & can't be unary operators if followed by a (, identifier, or literal
+           //  * and & can't be unary operators if followed by a (, identifier, prefix operator, or literal
            if (
                (
                    curr().type() == lexeme::Multiply
                    || curr().type() == lexeme::Ampersand
                    )
                && peek(1)
                && (
                    peek(1)->type() == lexeme::LeftParen
                    || peek(1)->type() == lexeme::Identifier
                    || is_literal(peek(1)->type())
+                   || is_prefix_operator(*peek(1))
                    )
                )
            {
                break;
            }

@sookach
Copy link
Contributor

sookach commented May 18, 2024

Thanks @hsutter. So, the change you propose does indeed fix this issue, but as I suspected:

that rule would prevent valid binary expressions involving a dereference

it introduces another issue. Consider the following code:

x := new<int>(1);
std::cout << x* - 1 << "\n";

this now translates to

auto x {cpp2_new<int>(1)}; 
std::cout << cpp2::move(x) * -1 << "\n";

whereas before it would translate correctly.

@hsutter
Copy link
Owner

hsutter commented May 18, 2024

Ah, good point. I'll think about this more. Thanks again for the report and the notes!

@Green-Sky
Copy link

got another example (beside #1216) that breaks on current master:

main: () = {
    cx : i32 = 10;
    _ = float(cx); // works fine
    _ = PI / float(cx); // works fine too
    _ = float(PI) / float(cx); // works
    _ = PI * cx; // works fine
    _ = float(PI) * cx; // works

    // break with: error: invalid statement encountered inside a compound-statement
    _ = PI * float(cx);
    _ = float(PI) * float(cx);
}

@sookach
Copy link
Contributor

sookach commented Aug 20, 2024

got another example (beside #1216) that breaks on current master:

Thank you for the report.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants