Odoc parser opam package #685

jonludlam · 2021-06-09T23:29:55Z

No description provided.

While the opam package has been split into odoc_parser and odoc, the esy build remains one. Package build and install steps have been amended for this.

jonludlam · 2021-06-09T23:30:16Z

Need to figure out how to fix opam-dune-lint

Julow

I've been able to fix opam-dune-lint at the cost of increasing Dune's version to 2.8: jonludlam#36

Julow · 2021-06-10T08:39:48Z

odoc-parser.opam

+maintainer: "Jon Ludlam <jon@recoil.org>"
+dev-repo: "git+https://github.com/ocaml/odoc.git"
+
+synopsis: "Parser for OCamldoc"


Should be "Odoc" right ? Same in the description below.
It's actually parsing an unspecified language that looks like ocamldoc.

I think of ocamldoc as referring to the language and the tool. The sense I'm trying to convey here is that it's the parser for the language - though you're right, there are extensions beyond what the original ocamldoc language specified.

I would argue we should transition to "the odoc markup language" (or a more catchy name). We already broke compat in various ways, and the distance is only going to grow with time.

Julow · 2021-06-10T08:46:59Z

src/parser/loc.ml

+        pos_cnum = span.end_.column;
+      }
+  in
+  Location.{ loc_start; loc_end; loc_ghost = false }


Shouldn't we continue to use a custom location type ?

compiler-libs is a big dependency and is used for just that

we already have custom types for that (span and with_location)

pos_bol is set to 0 because we don't have the value

it's not used

The intention here is a convenience for users of the library, though it needs a bit more work. When we first call Odoc_parser.parse_comment we pass in a Lexing.position to indicate the start position of the comment. The intent here is to be able to turn our internal Loc.t types back into the original type, more-or-less. The use of Location.t as a return type is because it conveniently represents a span between two Lexing.position values. While it does bring in a dependency on compiler-libs, I think it's pretty unlikely anyone using this library isn't already using compiler-libs, though we could always return a pair of Lexing.positions instead. However, in order to do this correctly we need to stash the original Lexing.position away and keep it for this operation. I'll give this a try and see how it looks.

As an example of where this function would be useful, see here: https://github.com/janestreet/ppx_js_style/blob/master/src/ppx_js_style.ml#L393-L432
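A minimal sketch of the "pair of Lexing.positions" alternative mentioned above, which avoids the compiler-libs dependency. The `point` and `span` records here are local stand-ins modelled on the fields visible in the diff, and `position_of_point` and `positions_of_span` are hypothetical names, not part of the PR. As in the diff, `pos_bol` is left at 0 because the original value is not available.

```ocaml
(* Local stand-ins for the parser's location types (assumed shape). *)
type point = { line : int; column : int }
type span = { file : string; start : point; end_ : point }

(* Hypothetical helper: turn one point into a Lexing.position.
   pos_bol is 0 because the beginning-of-line offset is not tracked,
   so pos_cnum carries the column directly. *)
let position_of_point ~file (p : point) : Lexing.position =
  { Lexing.pos_fname = file;
    pos_lnum = p.line;
    pos_bol = 0;
    pos_cnum = p.column }

(* Hypothetical helper: return the start and end of a span as a pair of
   Lexing.position values, using only the standard library. *)
let positions_of_span (s : span) : Lexing.position * Lexing.position =
  (position_of_point ~file:s.file s.start,
   position_of_point ~file:s.file s.end_)
```

This does not yet account for the original start position passed to the parser; restoring absolute positions would need that value to be stored, as noted above.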

Julow · 2021-06-10T17:26:26Z

src/parser/ast.ml

 type nestable_block_element =
  [ `Paragraph of inline_element with_location list
-  | `Code_block of string
+  | `Code_block of string option * string


Some comments from #678 (comment)

Should we parse the "language" field ? (the first word)
We might want to use it for syntax highlighting. This feature is targeted at Mdx, which uses this field.

Both arguments should have a location attached.

Good question. I've currently done the minimum possible here in the parser, in the spirit of how references are parsed, though this and the stripping of whitespace both seem reasonable to me.

and I agree with the location comment, will fix, thanks!
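For illustration, a rough sketch of what parsing the "language" field could look like, assuming the language is the first whitespace-delimited word of the metadata and that surrounding whitespace is stripped. `split_meta` is a hypothetical helper, not something from the PR, and it ignores locations entirely.

```ocaml
(* Hypothetical helper: split code-block metadata into an optional language
   tag (the first word) and the optional remaining text. Surrounding
   whitespace is stripped from both parts. *)
let split_meta (meta : string) : string option * string option =
  let meta = String.trim meta in
  if meta = "" then (None, None)
  else
    match String.index_opt meta ' ' with
    | None -> (Some meta, None)
    | Some i ->
        let lang = String.sub meta 0 i in
        let rest =
          String.trim (String.sub meta (i + 1) (String.length meta - i - 1))
        in
        (Some lang, if rest = "" then None else Some rest)
```

With the metadata from the test case in this PR, `split_meta "ocaml env=f1 version>=4.06 "` would give `(Some "ocaml", Some "env=f1 version>=4.06")`.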

Julow · 2021-06-10T17:35:45Z

src/parser/lexer.mll

 let raw_markup_target =
  ([^ ':' '%'] | '%'+ [^ ':' '%' '}'])* '%'*
+let code_block_meta =
+  ([^ '['])*


This parser doesn't allow arbitrary strings. I propose this syntax:

Optionally wrap with curly braces: {{@ocaml foo=[]}[ ... ]}.
Closing brace can be escaped \}, no balancing because it would clash with the other escape.
Not using quotes because it wouldn't be clear where they are allowed, this is consistent with other syntaxes (eg. links), I also think quotes would be more common in the payload.

This PR just does what we agreed in the dev meeting. Since the first user of this field will be mdx, do you happen to know whether this is going to be an immediate problem? That is, whether there are any square brackets in any of the current uses of this field in md files?

I think mdx only need to be able to read file names, "relation", package names, commas, version numbers and pre-defined strings. Escaping isn't needed by Mdx for now.

IMO, we should always think about escaping when writing a language that accepts arbitrary strings unless we are fine with writing "accept any string that doesn't contain '['" in the doc.
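To make the escaping proposal concrete, here is a small sketch of reading metadata up to the first unescaped closing brace, with `\}` standing for a literal `}` and no brace balancing. It is a plain string scanner rather than an ocamllex rule, and `read_meta` is a hypothetical name used only for illustration.

```ocaml
(* Hypothetical helper: read metadata from [s] starting at index [start],
   stopping at the first unescaped '}'. The sequence "\}" is unescaped to
   "}". Returns the metadata and the index just past the closing brace,
   or None if no closing brace is found. *)
let read_meta (s : string) (start : int) : (string * int) option =
  let buf = Buffer.create 16 in
  let n = String.length s in
  let rec go i =
    if i >= n then None
    else
      match s.[i] with
      | '}' -> Some (Buffer.contents buf, i + 1)
      | '\\' when i + 1 < n && s.[i + 1] = '}' ->
          Buffer.add_char buf '}';
          go (i + 2)
      | c ->
          Buffer.add_char buf c;
          go (i + 1)
  in
  go start
```

Under this scheme square brackets need no special treatment inside the metadata, so the `foo=[]` case from the proposed syntax is accepted as-is.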

Julow · 2021-06-10T17:38:02Z

src/parser/test/test.ml

+        {|
+        ((output
+          (((f.ml (1 0) (1 46))
+            (code_block "ocaml env=f1 version>=4.06 " "code goes here"))))


Trailing whitespace should be removed.

- The parser package can't run its tests because of a circular dependency with ppx_expect.
- Run the tests from Odoc's build script. This is the build script that Dune would have generated.

However:
- Increase the required Dune version to 2.8, which breaks esy.

jonludlam · 2021-06-23T17:27:40Z

I think we've spent enough time struggling with the "parser library and odoc in the same repo" problem. I propose to put the parser into github.com/ocaml-doc/odoc-parser to sidestep the issue. We've discussed this possibility before on its own merits and it's never been obvious whether it's the right thing to do or not, but we didn't really take into account the problems we're encountering. While the problems are temporary and I'm sure they can be worked around in due course, they either stop us merging useful changes while we figure out some new fix, or we end up turning off testing. By separating the repositories we fix these issues in the 'common' way and thus are much more likely to stay on the beaten path and spend less time searching for fixes.

We can always remerge them later if we like!

jonludlam added 10 commits June 9, 2021 22:49

Odoc-parser opam package

b8f1bc1

Remove esy.lock

2317824

Fix esy build and install

8af7771

While the opam package has been split into odoc_parser and odoc, the esy build remains one. Package build and install steps have been amended for this.

Add support for code blocks with metadata

24e035c

Add Odoc_parser.Warnings.to_loc

fb230d8

Sanitise Odoc_parser interface some more

9ac901a

More tidying of the Odoc_parser interface

11ab8a5

Rename Odoc_parser.Location to Loc

5dd46c7

github actions fix

bbcc7d1

Fix dune files pointing at odoc_parser rather than odoc-parser

5c8478f

Fix stray reference to Octavius

744baba

Julow reviewed Jun 10, 2021

Julow mentioned this pull request Jun 10, 2021

Overrides for dune.2.8.0 esy-ocaml/esy-opam-override#87

Closed

Julow and others added 2 commits June 10, 2021 23:38

Update esy dune to 2.8

1494842

jonludlam closed this Jul 3, 2021

3 participants