Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libexpr: would you like to accept patches that improve error handling in the parser? #8584

Open
inclyc opened this issue Jun 26, 2023 · 8 comments
Labels
feature Feature request or proposal language The Nix expression language; parser, interpreter, primops, evaluation, etc

Comments

@inclyc
Copy link
Member

inclyc commented Jun 26, 2023

Is your feature request related to a problem? Please describe.

Nix community needs a language server that could be used for "goto definition" for lambdas nixpkgs, or "goto definition" for packages. I'm writting a parser that based on this official one that suitable for LSPs (that is, preserve comments, better error handling, and preserve ranges). Would you like to accept patches that add some error handling logic, instead of throwing nix::ParseError directly in the parser? This is critical to auto-completion for nixpkgs option system, e.g.

{
   foo.|
  # ^ don't raise a nix::ParseError directly, instead, tolerate this error and go-on parsing
}

Describe the solution you'd like

I have implemented these features in the repo (partial) https://github.com/nix-community/nixd . Basically just extending the syntax productions. e.g.

binds
  : binds attrpath '=' expr ';' { { $$ = $1; addAttr($$, std::move(*$2), $4, makeCurPos(@2, data), *data); delete $2;  data->locations[$$] = CUR_POS; } }
+ | binds attrpath '=' expr error { { $$ = $1; addAttr($$, std::move(*$2), $4, makeCurPos(@2, data), *data); delete $2;  data->locations[$$] = CUR_POS; } }
+ | binds attrpath '=' error { { $$ = $1; addAttr($$, std::move(*$2), nullptr, makeCurPos(@2, data), *data); delete $2;  data->locations[$$] = CUR_POS; } }
+ | binds attrpath error { { $$ = $1; addAttr($$, std::move(*$2), nullptr, makeCurPos(@2, data), *data); delete $2;  data->locations[$$] = CUR_POS; } }
  | binds INHERIT attrs ';'

So far it can recover multiple errors like missing semi-colon, or missing = for binds:

Screenshot_20230626_232226

However currently nix parser would only report the first issue:

$ nix-instantiate --parse ./attr-error.nix
error: syntax error, unexpected '=', expecting ';'

       at /home/lyc/workspace/CS/OS/NixOS/nixd/test-workspace/attr-error.nix:4:7:

            3|   x =
            4|     y = 2;
             |       ^
            5|

Would you like to accept these patches?

Describe alternatives you've considered

Maybe this is not suitable for nix library here.

Priorities

Add 👍 to issues you find important.

@inclyc inclyc added the feature Feature request or proposal label Jun 26, 2023
@fricklerhandwerk fricklerhandwerk added the language The Nix expression language; parser, interpreter, primops, evaluation, etc label Jun 26, 2023
@roberth
Copy link
Member

roberth commented Jun 26, 2023

I can't give a direct answer to your question, but these are my thoughts so far:

It must not regress; neither performance nor the quality of error messages.

It would be good to know for sure that this parser can be adapted sufficiently for your use case. If you need to switch away from this parser anyway, there's no point in proceeding with this.

Tree-sitter is supposed to be tailored to your use case. Changes to the Nix grammar and/or parser are rare.

@inclyc
Copy link
Member Author

inclyc commented Jun 26, 2023

It must not regress; neither performance nor the quality of error messages.

For performance issue: error handling only affects what Nix will do when it encounters an error. If no errors occur, it should behave as usual.

Quality of error messages: When designing compilers/interpreters, I think we usually try to guess what the user is missing (for example, the semicolon at the end of a statement in C/C++). The compiler can handle this error and continue parsing the remaining parts, so we can provide more error information (see my screenshot for reference).

It would be good to know for sure that this parser can be adapted sufficiently for your use case.

I intend to maintain a fork of the parser to support ranges (as it appears that Nix only preserves the beginning of AST nodes, not their endings) and to preserve comments. This may have an impact on performance and not suitable for upstreaming here.

Tree-sitter is supposed to be tailored to your use case.

IMHO, this approach is not practical because the data structures in the forked parser have to be shared with libexpr, including nix::ExprSelect, nix::ExprVar, ... (all AST nodes inherited from nix::Expr). Because we need to evaluate the parsed tree. (Is it possible to evaluate the tree generated by tree-sitter?)

Evaluation is necessary for dynamic bindings, dynamic envs, etc. e.g. with pkgs; [ ]

Changes to the Nix grammar and/or parser are rare

As mentioned above, this will not change the grammar.

@roberth
Copy link
Member

roberth commented Jun 26, 2023

For performance issue: error handling only affects what Nix will do when it encounters an error. If no errors occur, it should behave as usual.

I want to believe this, but I think we should measure it.

I think we usually try to guess what the user is missing

Usually when this happens there will be cases where the tool assumes too much and produces nonsense errors instead of raising the first, valid error.
Nix is generally quick enough that you can fix one error and re-run the evaluation to get the next one.
The terminal is different is this regard, as the errors are further removed from the relevant source location. In an IDE you bring those closer together, so the user might be able to see at a glance where the parser went wrong, and they don't have to scroll up their terminal for a mile.
If we could bail on the first error for CLI use, that would be better.

One possible option is to put all the extra rules in their own section behind an ifdef. That would also settle the performance question.

Is it possible to evaluate the tree generated by tree-sitter?

You'd have to construct Exprs from the tree-sitter result, and you may have to re-do the indented string handling perhaps (not sure). Basically whatever happens in the current parser definition, you'd have to reimplement, but with the tree-sitter output as the input.

@inclyc
Copy link
Member Author

inclyc commented Jun 26, 2023

Thanks for quick response @roberth ❤️ .

High-level, IMHO:

Generally I want to see a consistent software ecosystem with shared components for nix tooling. Such as formatter, linter, and LSPs, however there are many nix parser implementation and they are different from this one in some details. (GLR parser with ambiguous grammar).

e.g some parser could not handle ExprPos

__curPos.

and some lexer do not look ahead for paths.

I don't know if the community agrees with this idea, but I believe it will benefit on developer experience & new users exploring Nix language.


And reply:

they don't have to scroll up their terminal for a mile.

Frankly this is true in C++ world. We definitely want to avoid this in Nix.

Nix is generally quick enough that you can fix one error and re-run the evaluation to get the next one.

I admit that nix itself is fast enough for re-runs, so I filed this issue here. Maybe "recovery from error" is not very suitable for CLIs, but libexpr is a library isn't it? Maybe we can provide different feature set for the parser, some for CLIs, and some for language tooling.

You'd have to construct Exprs from the tree-sitter result, and you may have to re-do the indented string handling perhaps (not sure). Basically whatever happens in the current parser definition, you'd have to reimplement, but with the tree-sitter output as the input.

For implementation details, I would like to use a slightly modified parser in this repo (this is already implemented and landed in nixd), instead of reimpl many funny stuffs happend in the parser. Maintaining a tree-transformer is definitely more complicate than maintaining a slightly modified fork.

@Ericson2314
Copy link
Member

I get that reusing as much of this C++ codebase as possible is the essence of your approach, but I must admit if that the only way to write a correct language server, it means we've failed to adequately specify the Nix language.

I'm happy you are using the codebase in exciting news ways — I do want that to be an option, but I don't want it to be the only option.

@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-07-24-nix-team-meeting-minutes-75/31112/1

@thufschmitt
Copy link
Member

Discussed during the Nix team meeting.

As @roberth said, we can accept such contributions if:

  1. We have proof that they don't degrade the parser performances
  2. They don't add too much complexity to the parser code

@thufschmitt
Copy link
Member

On a side note, there was a similar discussion on nickel, and the outcome was that trying to directly enrich the grammar (lalrpop in that case) would be a lot of work for some debatable results.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Feature request or proposal language The Nix expression language; parser, interpreter, primops, evaluation, etc
Projects
None yet
Development

No branches or pull requests

6 participants