Blank line within an indented block of code is wrongly parsed on Windows #2725

Open
Vipul-Cariappa opened this issue Jun 4, 2024 · 3 comments

Comments

@Vipul-Cariappa
Contributor

Windows represents a new line with a carriage return (\r) followed by a line feed (\n), i.e. \r\n is a new line. Other systems use only a single line feed character.
A problem arises when there is a blank line inside an indented block of code. If the blank line is indented with the correct number of spaces or tabs, it does not cause any problems. But if it is completely blank, with no spaces or indentation, the parser treats it as the end of the indented code block even though it is not. This problem only occurs with the Windows new line representation.
I am attaching two files to test this. If you open them in VS Code or Windows Notepad, the editor shows the file's new line format in the bottom-right corner. If it reads LF, the file uses \n to represent a new line; if it reads CRLF, it uses \r\n.
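The line-ending format can also be checked without an editor. The following is a minimal sketch that classifies raw file bytes; the helper name `detect_line_endings` is made up for illustration and is not part of LPython.

```python
from pathlib import Path


def detect_line_endings(data: bytes) -> str:
    """Classify raw bytes as 'CRLF', 'LF', 'mixed', or 'none'."""
    crlf = data.count(b"\r\n")
    # Every CRLF also contains an LF, so subtract to count bare LFs only.
    lf = data.count(b"\n") - crlf
    if crlf and lf:
        return "mixed"
    if crlf:
        return "CRLF"
    if lf:
        return "LF"
    return "none"


def detect_file(path: str) -> str:
    # Read in binary mode so Python does not translate the line endings.
    return detect_line_endings(Path(path).read_bytes())
```

For the attached files, `detect_file("main_unix.py")` should report LF and `detect_file("main_win.py")` should report CRLF, assuming they are saved under those names.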

main_unix.py uses only \n to represent a new line and works correctly, but main_win.py, which uses \r\n, gives an error.
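The failure mode described above can be illustrated with a toy indentation scanner. This is only a sketch under the assumption that a scanner splits on \n and skips lines that are exactly empty; it is not LPython's actual tokenizer, and `naive_indent_levels` is a hypothetical name.

```python
def naive_indent_levels(source: str) -> list[int]:
    """Return the indentation width of every line the toy scanner keeps."""
    levels = []
    # Splitting on LF only: a CRLF blank line arrives here as "\r".
    for line in source.split("\n"):
        if line == "":
            # Only a truly empty line is recognised as blank and skipped.
            continue
        # A lone "\r" is non-empty and has zero leading spaces or tabs,
        # so it is recorded as a line at indentation 0.
        levels.append(len(line) - len(line.lstrip(" \t")))
    return levels


lf_source = "def f():\n    a\n\n    b\n"
crlf_source = lf_source.replace("\n", "\r\n")

print(naive_indent_levels(lf_source))    # [0, 4, 4]
print(naive_indent_levels(crlf_source))  # [0, 4, 0, 4]
```

With CRLF input, the blank line shows up as a line at indentation 0 in the middle of the block, which a parser would read as the block ending. That matches the reported symptom, where the declaration of `i` is no longer visible to the `for` loop.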

Contents of file:

# i32 = int # uncomment to make it python compatible

def callme() -> i32:
    i: i32

    for i in range(3):
        print("Hello!!!")

    return i

if __name__ == "__main__":
    callme()

Output:

(lp) C:\Users\vipul\Documents\Workspace\lpython>.\src\bin\lpython.exe --jit .\tmp\main_unix.py
Hello!!!
Hello!!!
Hello!!!

(lp) C:\Users\vipul\Documents\Workspace\lpython>.\src\bin\lpython.exe --jit .\tmp\main_win.py
semantic error: Variable 'i' not declared
 --> .\tmp\main_win.py:6:9
  |
6 |     for i in range(3):
  |         ^


Note: Please report unclear or confusing messages as bugs at
https://github.com/lcompilers/lpython/issues.

main_unix.py.txt
main_win.py.txt

@nikabot

nikabot commented Jun 8, 2024

I think we need to handle this in the tokeniser.
See the initial handling of CRLF in the tokeniser: 4393893 (#1463)

@Vipul-Cariappa
Contributor Author

Long time no see, @nikabot (mugiwara)

I have provided a temporary fix at e65cfe1. But yes, the permanent fix should be to update the tokenizer.
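One common stopgap for this class of problem is to normalize line endings before the source reaches the tokenizer. The sketch below shows that general idea in Python. It is an assumption for illustration and does not claim to reflect what commit e65cfe1 does; `normalize_newlines` is a hypothetical helper.

```python
def normalize_newlines(source: str) -> str:
    """Convert CRLF and lone CR line endings to LF."""
    # Replace CRLF first so the lone-CR pass does not double the newlines.
    return source.replace("\r\n", "\n").replace("\r", "\n")


crlf_source = "def f():\r\n    a\r\n\r\n    b\r\n"
print(repr(normalize_newlines(crlf_source)))
# 'def f():\n    a\n\n    b\n'
```

The trade-off is that source positions reported in diagnostics then refer to the normalized text. Handling \r\n inside the tokenizer avoids that, which fits the conclusion that the tokenizer is the right place for the permanent fix.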

@nikabot

nikabot commented Jun 9, 2024

Yup! (Crocodile!)
