Make error recovery in YAML adapters better #194
Comments
We first need to be clear about ikatyang/tree-sitter-yaml#15 before we proceed any further. |
We are at a point where we need, if possible, to work around this the best way we can. I recap the issue and the workarounds discussed so far here, hoping we can at least mitigate it. Some references and specifically this comment
Recap from my notes: The issue to recap, e.g.:
(this typically happens while editing). Passing the above to the parser currently results in just an error with no nodes (both with and without the empty plugin):
A "similar" JSON with an error results instead in a "good" parse result, with error annotations but with the tree populated. Looking at the tree-sitter YAML parse result, it seems that it actually provides a parse tree (for what it manages, i.e. up to the location of the error), but "wrapped" in an ERROR node:
while I guess CstVisitor "stops" when encountering the apidom/packages/apidom-parser-adapter-yaml-1-2/src/syntactic-analysis/visitors/CstVisitor.ts Lines 614 to 629 in 2984f59
We would like to get whatever we can instead of nothing.
Possible workarounds
1. creating surrogate document from top level Error
This seems possibly the easiest way to go without relying on intervention by the tree-sitter-yaml maintainer (or our intervention on that code). We would not get the entire tree, but at least we would get a partial result, which would suffice to overcome at least our issues in completion scenarios. We discussed it and I guess we originally agreed that it can be implemented, something like replacing the top level
You have elaborated about this in this comment in slack and related replies, and I am not sure I get the conclusion. In this comment you say:
I am not sure I understand your point, what would be exactly the problem, and would it stop us from obtaining a parse tree up to the point of the error? I see the parsed tree structure stays the same (for the parsed part) as the result of a successful parsing of good YAML, e.g. parsing:
produces:
So question here is if we can proceed with the plan. You also added this comment in slack:
If I get this correctly, it's about getting some result for the remaining part of the doc (after the error), but I'm not sure what the result would be. Can you elaborate a bit more on it?
2. relying on workaround to be implemented in YAML grammar
I am referring to this comment:
I guess this wouldn't solve the situation above (partial key with no Also not sure about your comment to his proposal:
I believe this is related to what discussed above, but we don't have the workaround yet, correct? |
As the grammar doesn't use the internal tree-sitter lexer, in many cases it generates a CST that is nonsensical. There are key-value pairs without parent mappings (flow or block), etc. That is exactly what the tree-sitter internal lexer is for: to probe paths of what the author most probably intended with an incorrect document, and to provide the CST of this most probable correct version in case of error. If we were to try to create some smart algorithm to compensate for this, I'd rather fork the YAML grammar and reimplement it in the proper way.
If you want to obtain the incorrect CST -> AST -> ApiDOM tree computed from the children of the top level error, this can be done by converting the top level Error to a surrogate Document node and moving all Error node children to become children of the surrogate Document node. This will result in a ParseResultElement containing improperly constructed ApiDOM (having a MemberElement directly as an item in an ArrayElement without a wrapping ObjectElement, etc.), and our tooling will most probably fail when operations are performed on ApiDOM like this.
We need to accept that tree-sitter-yaml is dead. The author doesn't have any interest in the grammar, and the grammar itself is written in a way that causes all these issues, which we wouldn't need to handle if the grammar was written in the standard way, using the internal mechanisms of tree-sitter.
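To make the surrogate Document idea concrete, here is a minimal sketch. The `CstNode` shape, the `isSurrogate` flag and the `toSurrogateDocument` helper are assumptions made up for illustration; they are not the actual tree-sitter or ApiDOM adapter types.

```typescript
// Hypothetical, simplified CST node shape (NOT the real tree-sitter/ApiDOM types).
interface CstNode {
  type: string;
  text?: string;
  children: CstNode[];
  isSurrogate?: boolean;
}

// If the parse result is wrapped in a top-level ERROR node, move its children
// under a surrogate "document" node so that downstream visitors have something
// to traverse. Successful parses are returned untouched.
function toSurrogateDocument(root: CstNode): CstNode {
  if (root.type !== 'ERROR') {
    return root;
  }
  return {
    type: 'document',
    children: root.children,
    isSurrogate: true, // lets later phases know the tree may be malformed
  };
}

// Illustrative input: an ERROR root holding whatever the parser managed to build.
const errorRoot: CstNode = {
  type: 'ERROR',
  children: [{ type: 'block_mapping_pair', text: 'info: {}', children: [] }],
};

const surrogate = toSurrogateDocument(errorRoot);
```

As noted above, the children may lack their expected parents (e.g. pairs without a mapping), so anything consuming the surrogate document would still have to tolerate a malformed tree.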
In some cases, the grammar stops parsing the source string and just consumes it. The solution here would be to create a surrogate Error that contains the consumed string as a child and compute the source maps. Then, to represent this as ApiDOM, we'd probably transform Error nodes like this into a StringElement and an Annotation(error=true).
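A rough sketch of that workaround follows, assuming we know the offset at which the grammar started consuming the source. All names here (`SurrogateError`, `positionAt`, `surrogateErrorFor`, `toStringElement`) and the plain-object element shape are hypothetical stand-ins, not the real ApiDOM classes.

```typescript
// Zero-based position, loosely modelled on how tree-sitter reports points.
interface Position {
  row: number;
  column: number;
  char: number;
}

// Hypothetical surrogate node wrapping the text the grammar merely consumed.
interface SurrogateError {
  type: 'ERROR';
  text: string;
  startPosition: Position;
  endPosition: Position;
  isSurrogate: true;
}

// Compute row/column for a character offset by counting newlines before it.
function positionAt(source: string, offset: number): Position {
  let row = 0;
  let lineStart = 0;
  for (let i = 0; i < offset; i += 1) {
    if (source[i] === '\n') {
      row += 1;
      lineStart = i + 1;
    }
  }
  return { row, column: offset - lineStart, char: offset };
}

// Build the surrogate Error for everything from `consumedFrom` to the end.
function surrogateErrorFor(source: string, consumedFrom: number): SurrogateError {
  return {
    type: 'ERROR',
    text: source.slice(consumedFrom),
    startPosition: positionAt(source, consumedFrom),
    endPosition: positionAt(source, source.length),
    isSurrogate: true,
  };
}

// Plain-object stand-in for "StringElement + Annotation(error=true)".
function toStringElement(error: SurrogateError) {
  return {
    element: 'string',
    content: error.text,
    sourceMap: { start: error.startPosition, end: error.endPosition },
    annotations: [{ element: 'annotation', classes: ['error'] }],
  };
}
```

For example, `toStringElement(surrogateErrorFor(source, offset))` would yield a string element carrying the unparsed remainder together with its source map.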
I have no idea what effect this will have on overall grammar behavior, it can possibly make the behavior of the grammar more tree-sitter like, but it's just theory... Author will not be doing this modification as he has no time and there has not been an answer from him since March 2021.
This was addressing the issue with the grammar consuming the rest of the source string. We can solve that on our side with a workaround. I'm also suggesting here that making ad-hoc partial fixes is not the way to go; using the internal tree-sitter lexer is.
Given all that's been discussed here and elsewhere, I think it's time to either produce our own grammar by forking https://github.com/ikatyang/tree-sitter-yaml or switch to a different underlying parser for YAML. We're currently using a dead YAML grammar that's not giving us the key benefits of tree-sitter, and the maintainer went completely silent. Author/Maintainer also said this:
So, according to him, there are only two cases where the external lexer is needed. The rest can be handled by the tree-sitter lexer. To achieve that we have to make modifications in scanner.cc |
bumped on this one, moved to |
@frantuma @char0n
asyncapi: 2.4.0
info:
version: '1.0.0'
title: Something # Badly indented
The error message is the entire contents... How can we enhance this experience? Other YAML parsers will give us the coordinates, so something like "Bad indentation on Line 4" |
Here is another fragment that doesn't work in the newest version:
openapi: 3.1.0
info:
summary: Update an existing pet
desc
title: test title |
I've just issued #2881 to provide maximum error recovery for YAML parsing. For this I retained support for Node.js@18 and Node.js@20 (Node.js env), but used older versions of I also did deep research of What we need to do with The external scanner/lexer is a classic very complex finite state machine, but any good C++ developer should be able to make changes there to compensate for tree-sitter evolution.
The following YAML fragments and error recovery are now supported:
asyncapi: 2.4.0
info:
version: '1.0.0'
title: Something # Badly indented
openapi: 3.1.0
info:
summary: Update an existing pet
desc
title: test title
Regarding the first example, we can provide a better error message by inspecting whether the error message contains the entire YAML fragment, which means it's a YAML syntax error and not a semantic linting error. |
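The check described above could look roughly like this. It is only a sketch: the `ErrorAnnotation` shape, the `describeYamlError` name and the wording of the replacement message are assumptions for illustration, not the actual ApiDOM or editor API.

```typescript
// Hypothetical annotation shape; `line` is zero-based and optional.
interface ErrorAnnotation {
  message: string;
  line?: number;
}

// If the error message carries the whole YAML fragment, treat it as a syntax
// error and replace it with a short message; otherwise keep the message as-is
// (e.g. a semantic linting error).
function describeYamlError(source: string, annotation: ErrorAnnotation): string {
  const normalizedSource = source.trim();
  const normalizedMessage = annotation.message.trim();
  const containsWholeFragment =
    normalizedSource.length > 0 && normalizedMessage.includes(normalizedSource);

  if (!containsWholeFragment) {
    return annotation.message;
  }
  return annotation.line === undefined
    ? 'YAML syntax error'
    : `YAML syntax error near line ${annotation.line + 1}`;
}

// Illustrative fragment; the indentation here is made up for the example.
const fragment = "asyncapi: 2.4.0\ninfo:\n  version: '1.0.0'\n title: Something\n";
const friendly = describeYamlError(fragment, { message: fragment, line: 3 });
```

The line number is only reported when the caller can supply one, for instance from the start position of the ERROR node.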
The following YAML fragment produces an ERROR node with children of valid map, sequence or scalar.
Ref tree-sitter/tree-sitter#2339
Ref ikatyang/tree-sitter-yaml#50
Ref swagger-api/swagger-editor#4217