Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve some lexer error messages #5108

Merged
merged 5 commits into from
Jun 24, 2024

Conversation

sellout
Copy link
Contributor

@sellout sellout commented Jun 20, 2024

Overview

The message mentioned in #5060 was incorrect. This also uses the passed-in id for that message and related messages (removing the reliance on ANSI colors for conveying where the error is). And it adds a suggestion on how to avoid the error.

It also does some minor adjustment of highlighting – styling individual identifiers rather than a list of them.

Fixes #5060.

@sellout sellout requested a review from ChrisPenner June 20, 2024 17:20
@ceedubs
Copy link
Contributor

ceedubs commented Jun 20, 2024

Thanks!

Should the example in #5060 (or something like it) be added to a transcript to avoid a regression?

The message mentioned in unisonweb#5060 was incorrect. This also uses the passed-
in `id` for that message and related messages (removing the reliance on
ANSI colors for conveying where the error is). And it adds a suggestion
on how to avoid the error.

It also does some minor adjustment of highlighting – styling individual
identifiers rather that a list of them.

Fixes unisonweb#5060
@sellout
Copy link
Contributor Author

sellout commented Jun 20, 2024

There were two transcripts that already tested this – I just forgot to commit the changed output.

And I accidentally rebased on trunk. C’est la vie.

@@ -343,10 +343,12 @@ use.keyword.in.namespace = 1

Loading changes detected in scratch.u.

The identifier used here isn't allowed to be a reserved keyword:
The identifier namespace used here is a reserved keyword:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I like the idea of including the text here so we don't just rely on colors. But I could see this getting confusing if it doesn't include quotes or backticks or something around the identifier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right. I had originally hoped to modify style to wrap Code in something, but it requires not-insignificant changes to support anything other than ANSI colors there.

It’s probably good English to change

The identifier namespace used here is a reserved keyword:

to

The identifier, namespace, used here is a reserved keyword:

And I maybe defining a function like

styleCode code = "" <> style Code code <> ""

to use in the interim would be good enough? (It avoids hardcoding the quotes everywhere, but also doesn’t require generalizing all the ColorText stuff.)

The identifier, ‘namespace’, used here is a reserved keyword:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel qualified to answer that. @aryairani or @mitchellwrosen WDYT?

Copy link
Member

@mitchellwrosen mitchellwrosen Jun 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar enough to know about generalizing ColorText / SyntaxText / whatever to support a different solution, but that styleCode combinator does look fine and good.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I would use backticks if anything, like in backticked.

Also we have other places where we rely on color that gets lost in transcript/non-ansi output, and I'm not sure what to do about it. :-\ Once upon a time, we scrapped having separate renderers depending on the terminal mode, but long term we could look at that again.

Copy link
Contributor Author

@sellout sellout Jun 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I would use backticks if anything, like in backticked.

Done. The reason I prefer Unicode quotes is because you often need to distinguish between literal backticks in the code vs backticks used to surround code. E.g., in this particular error message, it says “You can avoid this problem […] by […] wrapping it in backticks (like `namespace` ).” where we want those backticks to be considered part of the code (and they are colorized as such in contexts that allow it). But this is exactly why we have these style abstractions. If you decide to switch the way it’s quoted down the road, you just change quoteCode and not dozens of string literals.

Also we have other places where we rely on color that gets lost in transcript/non-ansi output, and I'm not sure what to do about it. :-\ Once upon a time, we scrapped having separate renderers depending on the terminal mode, but long term we could look at that again.

Yeah, it’s tough. Apple’s pasteboard lets you store multiple versions of the content depending on where you paste to, but that doesn’t work (at least not easily) from the terminal, and it also relies on the receiving app having pretty sophisticated pasteboard handling. I think you mostly need to accept that anything other than text is going to be tough. Another approach could be to use ANSI escapes to hide the quotes in the terminal, so that copying the text still results in them being in the clipboard. But, yeah, a conversation for another time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, interesting.

This both colorizes and wraps code in backticks, in order to separate it from surrounding context.
@@ -343,10 +343,12 @@ use.keyword.in.namespace = 1

Loading changes detected in scratch.u.

The identifier used here isn't allowed to be a reserved keyword:
The identifier, `namespace`, used here is a reserved keyword:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think breaking up "The identifier used here" with commas is weird / possibly a mistake.

Maybe:

The identifier namespace used here is a reserved keyword

or

The identifier used here, namespace, is a reserved keyword

While we're in here, could you add cases in the transcript for the ones that didn't have coverage, so we can scrutinize their commas too? 🙏
ReservedSymbolyId, InvalidSymbolyId, InvalidWordyId.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think breaking up "The identifier used here" with commas is weird / possibly a mistake.

You’re right – I thought appositives always needed to be offset with commas, but not if there is already some other punctuation (the backticks in this case).

While we're in here, could you add cases in the transcript for the ones that didn't have coverage, so we can scrutinize their commas too? 🙏 ReservedSymbolyId, InvalidSymbolyId, InvalidWordyId.

InvalidWordyId isn’t even used, so I removed that case. The other two I’m not sure are reachable. E.g., the transcript case that is ostensibly a ReservedSymbolyId error actually fails differently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Another variant gave another error:

(!) a b = 1
  I got confused here:
  
      1 | (!) a b = 1
  
  
  I was surprised to find a ( here.
  I was expecting one of these instead:
  
  * ability
  * newline or semicolon
  * type
  * use

Not sure what's going on there.

Copy link
Contributor

@aryairani aryairani Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(whoops this was with the wrong verson)

Copy link
Contributor

@aryairani aryairani Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disregard about [the message having been generated] with the wrong verson, it looks the same in this branch.

Appositives only need to be offset by commas if there isn’t already some other punctuation.
@aryairani aryairani merged commit 6798520 into unisonweb:trunk Jun 24, 2024
20 checks passed
@sellout sellout deleted the lexer-error-messages branch June 25, 2024 00:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Confusing error message about reserved keyword
4 participants