Skip to content

Fix #139#141

Draft
Protowalker wants to merge 1 commit intoendbasic:masterfrom
Protowalker:shebang
Draft

Fix #139#141
Protowalker wants to merge 1 commit intoendbasic:masterfrom
Protowalker:shebang

Conversation

@Protowalker
Copy link
Copy Markdown

No description provided.

@Protowalker
Copy link
Copy Markdown
Author

Ah shoot, I have to fix the history so it doesn't have the old stuff

@jmmv
Copy link
Copy Markdown
Collaborator

jmmv commented Dec 9, 2021

While this works, I would prefer to make the parser stricter and only recognize the #! pair at the very beginning of the stream, and then ignore the first line altogether.

And also needs tests:

  • At least a unit test for the parser.
  • I think this will require an integration test to make sure that 1) the shebang is skipped, and 2) that the interpreter actually reads the program when it's executed via the shebang. Might be tricky to put that test in place though...
  • And also tests to make sure that #! isn't ever recognized if typed within the REPL.

@Protowalker
Copy link
Copy Markdown
Author

In that case I'll take a new approach and make #! its own token, then throw an error in the parser if it's anywhere other than the start of a file stream. Also yes, I did forget to write tests here 😅 it's a draft for a reason after all!

@jmmv
Copy link
Copy Markdown
Collaborator

jmmv commented Dec 9, 2021

Do you think a separate token is worth it? I think just scanning the stream for #! at the beginning and skipping until the first EOL would be easier and keep the rest of the lexer and parser oblivious that this happened.

And I think I hadn't noticed this was a draft! Surprised to see that they show up by default in the PRs list.

@Protowalker
Copy link
Copy Markdown
Author

Protowalker commented Dec 9, 2021

Do you think a separate token is worth it? I think just scanning the stream for #! at the beginning and skipping until the first EOL would be easier and keep the rest of the lexer and parser oblivious that this happened.

My reasoning for a separate token (and also catching it in the tokenizer) was that it would let us give better error messages (+ lexer-time errors instead of runtime errors), but it might also not be worth it. It's up to you. If we want to do this at the time of lexing it would require that #142 be done or a similar solution, though. It would also requiring passing more info through to the lexer, since it would have to know that we're not in the intrepeter.

And I think I hadn't noticed this was a draft! Surprised to see that they show up by default in the PRs list.

Haha, yup, this and #142 are both drafts. Just made the PRs so it would be easier to communicate as I'm working on them.

@jmmv
Copy link
Copy Markdown
Collaborator

jmmv commented Dec 9, 2021

But for line numbers, we'd just need to start the counter at 2 instead of 1 inside the spans when #! is present, right?

The thing is that the #! is a Unix thing bolted onto the language, so that's why I think it should not be part of the lexer/parser and it should just be something that happens "out of band" before the lexer even sees the stream. I'm guessing, but haven't checked, that this is how #! is also added to many other scripting languages that don't have #-style comments in them.

@Protowalker
Copy link
Copy Markdown
Author

That seems a fair argument. The only thing I'm thinking is potential end goals of the project -- if the idea is that people who are new to programming can use it to learn, they might see an example on line that starts with #!/usr/bin/bash and wonder what it means, so they go into their interpreter and type it in only to get unexpected token #!, leaving them confused as to what they did wrong.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants