Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

16g OOP: Error reporting #41

Merged
merged 27 commits into from
May 4, 2022
Merged

16g OOP: Error reporting #41

merged 27 commits into from
May 4, 2022

Conversation

ngjunsiang
Copy link
Contributor

@ngjunsiang ngjunsiang commented May 3, 2022

Earlier, we broke error reporting when I decided to avoid the use of tokens in the resolver and interpreter. Error messages still appear, but we have no idea where in the code they are coming from.

Now that the code is better organized and more clearly structured, let's restore it.

How are we going to report errors from the resolver and interpreter without tokens? We will probably still need them, but they can't be a part of the resolution/execution path.

@ngjunsiang
Copy link
Contributor Author

Let's start simple: associate each Expr with a token that best represents its position in the code, only for error reporting purposes:
https://github.com/nyjc-computing/pseudo/blob/5defa2a9d2710099924850e9a9e261b899760598/lang.py#L144-L172

Since each Expr is going to need this, we put the code in the Expr parent class.

@ngjunsiang
Copy link
Contributor Author

ngjunsiang commented May 3, 2022

Let's start simple: associate each Expr with a token that best represents its position in the code, only for error reporting purposes:
https://github.com/nyjc-computing/pseudo/blob/5defa2a9d2710099924850e9a9e261b899760598/lang.py#L144-L172

Since each Expr is going to need this, we put the code in the Expr parent class.

We add a token parameter to each Expr subclass's init: [3e5edde]

... except for Literals, which are a little special: their init comes directly from TypedValue. I want to preserve this equivalence between Literals and TypedValues, so I don't just want Literals to start implementing their own init.

Instead, let's make TypedValue multiple-inheritance-friendly by enabling super()-chaining. This means that TypedValue will pass remaining arguments to the next superclass:
https://github.com/nyjc-computing/pseudo/blob/41a33e45e23fb8ba19b2d37320720315573311df/lang.py#L71-L80

Time to start adding tokens to Exprs.

@ngjunsiang
Copy link
Contributor Author

In the parser, we use a helper function to make Exprs. Let's upgrade it to pass tokens: [8cec6ef]

@ngjunsiang
Copy link
Contributor Author

And then we add tokens wherever we can: [7f5d41c]

@ngjunsiang
Copy link
Contributor Author

Refactoring

Let's take the chance to do some refactoring where it will make token association easier.

Assign expressions

Assign expressions (<name> <- <expression>) appear in two places:

  • Assign statements
  • FOR loops (FOR <name> <- <INTEGER> TO <INTEGER>)

The difference between them is that the Assign statement expects a line break after the Assign expression, while the FOR statement expects the keyword TO after the Assign expression.

Let's refactor them with an assignment() expression:
https://github.com/nyjc-computing/pseudo/blob/16e5494a0ccbb5d11944010b26222a03f7679fc2/lang.py#L208-L213

We'll use this in an ExprStmt in place of an Assign Stmt. One less Stmt, one more use of ExprStmt! In the meantime, the Assign Expr clashes with the Assign Stmt, so let's disable the latter by commenting it out: [9b0a410]

Upgrade the makeExpr() helper in parser.py to recognise Assign Exprs: [6f4a739]

Add an assignment() parser, and use it in assignStmt():
https://github.com/nyjc-computing/pseudo/blob/1fbdd2c22ce29e7eca14b27ff0076c08953f2a42/parser.py#L173-L177
https://github.com/nyjc-computing/pseudo/blob/c579baeb851f6723a385e204a2ea96dd962c53c3/parser.py#L223-L226

And also in forStmt(): [95d7387]

@ngjunsiang
Copy link
Contributor Author

Bugfix

Using the chance to fix some bugs in our Binary Exprs: [de7f460]

Our value() parser is looking quite complex now; it handles 4 expression types! Let's split off some responsibility into separate parsers - literal(), unary(), and nameExpr()`:
https://github.com/nyjc-computing/pseudo/blob/27a5bc8dbea188414b0008242564d885b475f035/parser.py#L73-L129

@ngjunsiang
Copy link
Contributor Author

Resolving Assign Exprs

Now we add a resolver for the new Assign Expr:
https://github.com/nyjc-computing/pseudo/blob/fd30727ab320f3b016eb4c1c5d0a6bdbaabd9f27/resolver.py#L68-L71

And use it in the resolve() entry point:
https://github.com/nyjc-computing/pseudo/blob/fd30727ab320f3b016eb4c1c5d0a6bdbaabd9f27/resolver.py#L121-L122

And update our Assign ExprStmt verifier:
https://github.com/nyjc-computing/pseudo/blob/6f73d763e38e5928f8ac1a81a4d60a7abaf28c1f/resolver.py#L227-L228

Then we can remove our old verifyAssign() verifier: [32f8e49]

@ngjunsiang
Copy link
Contributor Author

Interpreting Assign Exprs

All the code is there already. We convert our old execAssign() into evalAssign(), and add one more rule to evaluate(): [9f7ef84]

Net change: We remove Assign Stmt, and add Assign Expr. Did we really gain anything? Yes, primarily in the simplification of the FOR Stmt. It is now easier to assign tokens to its various Exprs.

@ngjunsiang
Copy link
Contributor Author

Bugfixes

First we resolve the attribute collision between the token attribute and token() method in Expr: [782a99e]

Next, we have to wrap our init and incr Exprs in an ExprStmt, since they are also Stmts in while and repeat loops: [615980a]

And then a little refactoring, as a treat, to group our ExprStmts together in the resolver, since they use the same entry point: [15ce968]

@ngjunsiang
Copy link
Contributor Author

ngjunsiang commented May 4, 2022

Error reporting with tokens

Finally, we check that tokens are inserted into the errors: [092a243]

And make a minor improvement in the missing-word errors reported by the parser: [bef432d]

@ngjunsiang
Copy link
Contributor Author

ngjunsiang commented May 4, 2022

Result:

[Line 3]     OPENFILE data.txt FOR WRITE
                          ^
'.': Expected FOR after file identifier

Our line and column info is back!

This is going to need extensive testing and review of each possible kind of error, to ensure that the error messages are helpful. That is an eternal work in progress, and too long for the rest of this chapter. We'll deal with other suboptimal error messages as and when they crop up.

@ngjunsiang ngjunsiang merged commit a188d0c into main May 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant