Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Reporting Multiple Compiler Errors #139

Merged
merged 11 commits into from
Aug 12, 2021
Merged

Conversation

FrobtheBuilder
Copy link
Contributor

Make the Bebop compiler report multiple errors.
image
These include validation errors and parsing errors. I had to rework the parser significantly to give it the ability to recover upon invalid input. It's not perfect, as it doesn't retain context well while it is while recovering, causing occasional spurious "not a top level definition!" errors following a syntax error, especially in unions. But it works really well other than that.
When the parser encounters an unexpected token, it attempts to skip forward to the next token that could possibly be valid within the closest context possible. So say, within a definition it will attempt to skip to a token that should start a new field, or if it can't find one, it will skip to the next top level definition. In the worst case, if an exception is thrown all the way to the top level Parse method, it will skip to the end of the current file. I also moved the test schemas around since I wanted to add bebop.json project files for both the valid and invalid schemas, and the compiler always searches recursively so they needed to be separated into different directories.

@andrewmd5 andrewmd5 changed the title Multiple errors Support for Reporting Multiple Compiler Errors Aug 11, 2021
"Oops All Failures": {
"commandName": "Project",
"commandLineArgs": "--check",
"workingDirectory": "C:\\Users\\Frob\\Projects\\bebop\\Laboratory\\Schemas\\ShouldFail"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be relative.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I would have done that, but I'm not sure exactly what it's relative TO and I couldn't find any info on it. Can you enlighten me?

@FrobtheBuilder
Copy link
Contributor Author

What is the purpose of "DummyDefinition?"

Mainly a contrivance I created so I wouldn't have to make definitions nullable in a bunch of places, but I can do that instead if you'd prefer.

Copy link
Contributor

@lynn lynn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I only have some small comments.

My one meta-comment is that there is a bit of this:

bool errored = false;
while (!Eat(TokenKind.CloseBrace)) {
  if (errored && unrecoverable) {
    CancelScope();
    return null;
  }
  errored = false;
  try {
    parse_some_stuff;
    if (wrong) throw new SomeError();
  } catch (SpanException e) {
    _errors.Add(e);
    errored = true;
    SkipUntil(whatever);
    continue;
  }
}

It looks like errored can be eliminated to simplify the control flow a little:

while (!Eat(TokenKind.CloseBrace)) {
  try {
    parse_some_stuff;
    if (wrong) throw new SomeError();
  } catch (SpanException e) {
    _errors.Add(e);
    SkipUntil(whatever);
    if (unrecoverable) {
      CancelScope();
      return null;
    }
    continue;
  }
}

(I would actually be happy if we could eliminate try-catch from the parsing logic entirely, because it feels weird how SpanExceptions are both "things we throw" and "things we make a careful list of" — but then when we do throw one, we're actually immediately catching and putting it in the list… But, I see how it's useful if you have a bunch of parsing steps that could go wrong and you want to recover from them all in the same way. If only C# had monads, or something)

Comment on lines 132 to 133
public InvalidUnionBranchException(Definition definition)
: base($"The definition '{definition.Name}' cannot be used as a union branch. Valid union branches are messages and structs.", definition.Span, 113)
public InvalidUnionBranchException(Definition? definition)
: base($"The definition '{definition?.Name ?? "null"}' cannot be used as a union branch. Valid union branches are messages and structs.", definition?.Span ?? new Span(), 113)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a good reason for this parameter to become nullable: an error message like The definition 'null' cannot be used as a union branch. with an empty span is not useful. And at the call site, it doesn't look like definition can be null.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. There was a point where I had a call to that without an argument available but it's gone now.

@@ -20,6 +20,8 @@ namespace Core.Parser
{
public class SchemaParser
{
private readonly HashSet<TokenKind> _topLevelDefinitionKinds = new() { TokenKind.Enum, TokenKind.Struct, TokenKind.Message, TokenKind.Union };
private readonly HashSet<TokenKind> _universalFollowKinds = new() { TokenKind.Enum, TokenKind.Struct, TokenKind.Message, TokenKind.Union, TokenKind.EndOfFile };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "universal follow kinds" mean? Does this mean: if we get confused, we can resume parsing from any token-kind in this set? Maybe it is worthy of a small doc comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's the bottom level follow set for pretty much everything. I could add a comment.

Comment on lines 178 to 187
additionalTokens ??= new();
ConsumeBlockComments();
if (CurrentToken.Kind != kind)
{
_errors.Add(new UnexpectedTokenException(kind, CurrentToken, hint));
while (_index < _tokens.Count - 1 && CurrentToken.Kind != kind && !additionalTokens.Contains(CurrentToken.Kind))
{
_index++;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be DRY-er if we just call ExpectAndSkip(new HashSet<TokenKind>{ kind }, additionalTokens, hint), performance be damned (it only affects the very first if, anyway).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah the problem there is that will add another error to the log. I could probably just use that SkipUntil method I added though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait nevermind I thought you were looking at a different part.

{
additionalTokens ??= new();
ConsumeBlockComments();
if (kinds.All(kind => kind != CurrentToken.Kind))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just if (kinds.Contains(CurrentToken.Kind)) return;, then un-indent the lines below.

Comment on lines 210 to 216
private void SkipUntil(HashSet<TokenKind> kinds)
{
// Always advance by one.
if (_index < _tokens.Count - 1)
{
_index++;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then IMO this should be called SkipCurrentAndThenSkipUntil.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it should. It works like that to prevent infinite loops, since it's possible that the current token isn't even potentially valid.

@FrobtheBuilder FrobtheBuilder merged commit 10143cc into master Aug 12, 2021
@andrewmd5 andrewmd5 deleted the multiple-errors branch July 18, 2022 15:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants