-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Simpler AST #355
Comments
There are several things here to comment on. Firstly, there's no question that the parser should be easier to call, so nobody has to make up a dummy Root.t and so on. In general, I think the AST from Model.Comment should contain enough information to recover the original text - that's fundamentally what ocamlformat needs. It's not missing much
I'm actually quite happy that it currently generates more warnings than octavius - providing that we can produce an AST in all cases octavius could. It'd be handy to feed those warnings back to ocamlformat in a useful way to be reported - it'd be neat if merlin could also do this! This is also why I'm not keen on leaving the references as strings, since warnings in parsing them couldn't be passed back to ocamlformat - essentially we keep all of the input validation happening at the same point. The raw markup is an issue where we're currently not doing what the ocamldoc manual says which is that when the backend isn't specified it gets treated as latex. That aside, we still need to have a way to distinguish between the backend being specified and odoc picking the default. There's also an issue with unrecognised tags, which again should contain enough info to recover the original text. |
As a general rule it is always better to keep the semantic representation of something separate from its syntactic representation. Then the syntactic representation can be kept as close to the actual syntax as possible without adding unnecessary details to the semantic representation. In this case, that means that This requires making syntactic versions of |
Oh, and the transformations that @jonludlam suggests should be applied to the |
I quite agree that it would be nice to separate the semantic and syntactic types - ocamlformat really is only interested in the syntax (I think!) but merlin will want the semantic tree. However, from a pragmatic point of view the semantic and syntactic representations are currently extremely similar, to the point that if you don't look carefully to spot the differences you might overlook them entirely. I'm worried that exposing both of these right now will be quite confusing. It's conceivable that they will diverge further though - they're are certainly more different now than when the split was first introduced as restrictions have moved from @lpw25 your point about syntactic types for the paths is quite interesting. That would indeed allow ocamlformat to reflow long references, which would be pretty useful. Right now it has essentially no choice but to treat them atomically. |
In my experience this happens with every tool that looks vaguely like a compiler: at first some of the semantic types are so close to the syntactic ones that people don't keep them separate, then over time the demands on the type diverge until eventually someone has to bite the bullet and disentangle the two use cases into separate types. Since separating a type into two tends to be a bit of a pain, these days I advocate for always keeping them separate from the beginning as a matter of principle. |
I agree that we may have to separate the types in the long term (and we already almost do, but the syntactic type is not public). However, I suggest to avoid this for now, until it is either needed by odoc itself (and not ocamlformat), or the parser becomes very stable. I think it's best to continue thinking of the parser and its data types as internal, even if they are used in ocamlformat. This is to speed odoc development. So, we will "bite the bullet" when odoc needs us to. That can break ocamlformat later, but I would recommend to account for that by communicating closely with the ocamlformat developers. In the worst case, if we cause major breakage, ocamlformat can vendor an older version of the odoc parser, until there is time to adapt ocamlformat to a newer one. Given that, I agree with @jonludlam's suggestion about modifying |
Hi,
First of all, thanks for this great library !
I'm trying to use Odoc in OCamlformat to format documentation comments.
It's been great so far but I ran into a few problems with the representation of comments (
Model.Comment
):containing_definition
argument that is hard to understand.References (see Reference.deconstruct #354)
I proposed in #351 to expose
Parser_.Ast
.This AST has some advantages:
non_link_inline_element
, etc..)It could then be simplified by moving some code from the parser to the
semantics
pass that buildModel.Comment
:-
or+
vsul
orol
)This information could also be in
Model.Comment
.Raw_markup of string option * string
)Alternatively, most of these could be fixed without exposing
Parser_.Ast
and only modifyingModel.Comment
.But I think this AST will be much easier to change in the future to allow more use cases.
We will probably receive issues about documentation comments on OCamlformat and Odoc will be used by a lot of projects.
I'll be happy to hear your opinions on this.
The text was updated successfully, but these errors were encountered: