Skip to content

Mark error location across the entire length #9130

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Aug 12, 2022
Merged

Conversation

faho
Copy link
Member

@faho faho commented Aug 9, 2022

Description

We currently only mark the start of errors with a ^ caret. This changes it so that we mark the start and end with a caret, with a run of ~ tildes in between.

That makes it easier to see the extent of the erroneous syntax, and is familiar from e.g. gcc.

The output looks like

checks/set.fish (line 471): for: a,b: invalid variable name. See `help identifiers`
    for a,b in y 1 z 3
        ^~^

checks/cmdsub-limit.fish (line 67): Too much data emitted by command substitution so it was discarded
    echo this will fail (string repeat --max 513 b) to output anything
                        ^~~~~~~~~~~~~~~~~~~~~~~~~~^

Note that this exposes the error location in a way that it wasn't before, so it's likely to shake out mistakes and weirdnesses. This is why I've elected to keep the tests broken for now, so we can easily see what's changed and what needs improvement.

TODOs:

  • Changes to fish usage are reflected in user documentation/manpages.
  • Tests have been added for regressions fixed
  • User-visible changes noted in CHANGELOG.rst

@faho faho added this to the fish 3.6.0 milestone Aug 9, 2022
@faho
Copy link
Member Author

faho commented Aug 9, 2022

One big issue: If a command substitution errors, this will show both errors and blame the outermost one in the last error:

set fish_read_limit 512
echo this will fail (string repeat -n 2 (string repeat --max 300 b)) to output anything
echo this will fail (string repeat -n 2 (string repeat --max 550 b)) to output anything

results in

checks/cmdsub-limit.fish (line 76): Too much data emitted by command substitution so it was discarded
    echo this will fail (string repeat -n 2 (string repeat --max 300 b)) to output anything
                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^

for the first and

    checks/cmdsub-limit.fish (line 1): Too much data emitted by command substitution so it was discarded
    string repeat -n 2 (string repeat --max 550 b)
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~^
    in command substitution
    \tcalled on line 77 of file checks/cmdsub-limit.fish
    checks/cmdsub-limit.fish (line 77): Too much data emitted by command substitution so it was discarded
    echo this will fail (string repeat -n 2 (string repeat --max 550 b)) to output anything
                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^

for the second. Doing this right requires threading the error through, and then removing the outer errors because the inner substitution failed and it's already clear where the error is.


Even with that, I'd argue this is still nicer, because the current error is

    echo this will fail (string repeat -n 2 (string repeat --max 550 b)) to output anything
                        ^

@faho
Copy link
Member Author

faho commented Aug 9, 2022

And this is very misleading:

    fish: Unexpected end of string, square brackets do not match
    echo f[oo # not valid, no matching ]
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~^

This isn't because it detects that ], it's a coincidence, it happens to be the last character.

We simply keep iterating our token cursor over the comment and fall off the end.

@krobelus
Copy link
Contributor

krobelus commented Aug 9, 2022

Doing this right requires threading the error through, and then removing the outer errors because the inner substitution failed and it's already clear where the error is.

Yeah. It's okay to leave this for later.

This isn't because it detects that ], it's a coincidence, it happens to be the last character.

Right, this is a pathological case. Probably we should rewrite this test case to something like

	    fish: Unexpected end of string, square brackets do not match
	    echo f[oo # not valid, no matching ] to close list index 
	          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^

which should still capture the original intent of this test

@faho
Copy link
Member Author

faho commented Aug 9, 2022

Right, this is a pathological case. Probably we should rewrite this test case to something like

Honestly I'm skipping the squigglies for unterminated things.

By definition, the closing character could be anywhere, and is missing. So we just mark what started it.

I don't believe marking to the end of the line (or even the entire file?) is helpful.

So you'll get the old

fish: Unexpected end of string, square brackets do not match
echo f[oo # not valid, no matching ]
      ^

@krobelus
Copy link
Contributor

krobelus commented Aug 9, 2022

Honestly I'm skipping the squigglies for unterminated things.

Of course, that's much better. We even allow newlines before terminators

@faho
Copy link
Member Author

faho commented Aug 9, 2022

Okay, I added the unsquiggling but it's awkward, because we now need to define the error-length for tokenizer errors.

@faho faho force-pushed the error-length branch 2 times, most recently from 10dc597 to e0f8de7 Compare August 11, 2022 16:07
@faho
Copy link
Member Author

faho commented Aug 11, 2022

Okay, I believe I've now fixed all the immediate problems. The test suite is passing again.

Note that a bunch of errors aren't altered by this but could be, and some cases aren't ideal (but acceptable).

And littlecheck doesn't check whitespace, so you see a lot of

echo $foo[d]
#CHECKERR: {{.*}}expansion.fish (line {{\d+}}): Invalid index value
#CHECKERR: echo $foo[d]
#CHECKERR: ^

even tho the caret is right smack on the "d", which is correct.

Some more errors we could improve:

if $status; echo foo; end'

echo $foo[wrongindex]

mark just one char instead of the full variable.

fish -c 'echo "$abc["'

doesn't print a caret at all!

(yes, this is detect_errors instead of the proper parser again!)

not time true&

marks the entire statement.

set -l p ")"; echo $$p

marks just the $, could mark $$p?

But those are improvements for later. I'm looking to merge this soon to get feedback and work on it.

Also I'm going to be changing littlecheck so it doesn't escape quotes anymore. Nothing worse than

""
^^

looking wrong because littlecheck displays it as

\"\"
^^

After that's done, I dunno, maybe introduce coloring in the error messages? So you get the erroneous part in red?

@faho
Copy link
Member Author

faho commented Aug 11, 2022

fish -c 'echo "$abc["'

Figured this one out, it's not detect_errors, it's that it skipped the caret if it was meant to be past the end (of the entire script). So

fish -c 'echo "$abc["; true'

would print it (on the ;, but still). As a workaround/robustness measure I've made it so we clamp the caret position at the end of the script. This has a slight inconsistency in that

echo "$abc["

will blame the ", while

echo "$abc["; true

will blame the semicolon, but that's not too bad. And at least it shows the issue, instead of quietly pushing it under the rug (by not showing the caret line).

@faho faho changed the title Error length Mark error location across the entire length Aug 11, 2022
@faho
Copy link
Member Author

faho commented Aug 12, 2022

This exercise is turning up a few broken error messages.

E.g.

fish --no-config -c 'echo notprinted; echo foo; a=b'

doesn't print a caret at all.

(the commit message says it fails to print the error, that was probably an editing artifact, I can no longer reproduce it - at one point I got just "fish:" without it explaining what was wrong)

faho added 11 commits August 12, 2022 18:04
This makes it so instead of marking the error location with a simple
`^`, we mark it with a caret, then a run of `~`, and then an ending `^`.

This makes it easier to see where exactly an error occured, e.g. which
command substitution was meant.

Note: Because this uses error locations that haven't been exposed like
that, it's likely to shake out weirdnesses and inaccuracies. For that
reason I've not adjusted the tests yet.
This makes the awkward case

	    fish: Unexpected end of string, square brackets do not match
	    echo f[oo # not valid, no matching ]
	          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~^

(that `]` is simply the last character on the line, it's firmly in a comment)

less awkward by only marking the starting brace.

The implementation here is awkward mostly because the tok_t
communicates two things: The error location and how to carry on.

So we need to store the error length separately, and this is the first
time we've done so.

It's possible we can make this simpler.
Fixes error location for unknown commands
This used the decorated statement offset when the expansion errors
refer to the command without decoration.
These were correct, but littlecheck escapes quotes!
This skipped printing a "^" line if the start or length of the error
was longer than the source.

That seems like the correc thing at first glance, however it means
that the caret line isn't skipped *if the file goes on*.

So, for example

```fish
echo "$abc["
```

by itself, in a file or via `fish -c`, would not print an error, but

```fish
echo "$abc["
true
```

would. That's not a great way to print errors.

So instead we just.. imagine the start was at most at the end.

The underlying issue why `echo "$abc["` causes this is that `wcstol`
didn't move the end pointer for the index value (because there is no
number there). I'd fix this, but apparently some of
our recursive variable calls absolutely rely on this position value.
This checked specifically for "| and" and "a=b" and then just gave the
error without a caret at all.

E.g. for a /tmp/broken.fish that contains

```fish
echo foo

echo foo | and cat
```

This would print:

```
/tmp/broken.fish (line 3): The 'and' command can not be used in a pipeline
warning: Error while reading file /tmp/broken.fish
```

without any indication other than the line number as to the location
of the error.

Now we do

```
/tmp/broken.fish (line 3): The 'and' command can not be used in a pipeline
echo foo | and cat
           ^~^
warning: Error while reading file /tmp/broken.fish
```

Another nice one:

```
fish --no-config -c 'echo notprinted; echo foo; a=b'
```

failed to give the error message!

(Note: Is it really a "warning" if we failed to read the one file we
wer told to?)

We should check if we should either centralize these error messages
completely, or always pass them and remove this "code" system, because
it's only used in some cases.
@faho faho merged commit 8d74160 into fish-shell:master Aug 12, 2022
@faho
Copy link
Member Author

faho commented Aug 12, 2022

Alright, merged for now, let's see if anything else turns up.

This seems to be working much better than I expected.

@faho faho deleted the error-length branch September 7, 2022 11:40
@faho faho removed the RFC label Sep 22, 2022
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 25, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants