Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compatibility with odoc-parser.2.3.0 #1184

Merged
merged 4 commits into from
Oct 22, 2023

Conversation

panglesd
Copy link
Collaborator

This PR makes ocaml-lsp compatible with the soon release odoc-parser.2.3.0

odoc-parser adds two new constructs:

  • "Outputs" of code blocks. This is meant to be generated by the code blocks and promoted by mdx.
  • Tables

Tables are tricky, as odoc tables are allowed to have some block content inside ("nestable" block contents) while markdown is not.
This means that when translating odoc tables into cmarkit tables, we need to turn blocks into inlines. This issue is responsible for most of the code of the PR!

@panglesd panglesd marked this pull request as draft September 21, 2023 11:05
@panglesd
Copy link
Collaborator Author

I'm converting to draft as this should only be released when odoc-parser 2.3.0 is released.

panglesd added a commit to panglesd/ocaml-lsp that referenced this pull request Sep 21, 2023
Signed-off-by: Paul-Elliot <peada@free.fr>
crackcomm pushed a commit to crackcomm/ocaml-lsp that referenced this pull request Oct 2, 2023
Signed-off-by: Paul-Elliot <peada@free.fr>
@jonludlam
Copy link
Member

odoc-parser.2.3.0 is now released.

I'd like to suggest that the parser is vendored in order to increase compatibility. odoc-parser is also vendored by ocamlformat and mdx for similar reasons. Right now we can't co-install odoc and ocaml-lsp-server because of this.

Copy link
Contributor

@Julow Julow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is not a draft and is ready to be reviewed.

@@ -165,6 +212,68 @@ let rec nestable_block_element_to_block
let meta = loc_to_meta location in
Block.Ext_math_block (code_block, meta)

and nestable_block_element_to_inlines
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function might be a lot of maintenance and hurts the content.
Would it be reasonable to use raw HTML tags for tables ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this function is not ideal...

An advantage of not using proper table compared to raw html is that the output might be more readable even without translating it to html. Also, we get cmarkit type guaranty.

Raw html tags could be an idea, but their start-end semantics not that simple (https://spec.commonmark.org/0.30/#html-blocks).

In particular, it seems to break out of the containing block. For instance,

- first item
  + first subitem
  + second subitem

renders by github as:

  • first item
    • first subitem
    • second subitem

while with html inside (the blank line is the end condition fot the uninterpreted html)

- first item
  + first subitem
<table><tr><td>

useless table
</td></tr></table>

  + second subitem

renders as

  • first item
    • first subitem

useless table

  • second subitem

@voodoos would you like me to investigate the "raw html tag" solution?

Copy link
Collaborator

@voodoos voodoos Oct 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In particular, it seems to break out of the containing block

So in the case of tables the content would break out of the table ?
It looks like raw html blocks are only intended to be used at the top level...

Would using only raw html (for the entire table, not only the cell content) be an option ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in the case of tables the content would break out of the table ?

What I meant is that the whole table would break out of its container block (recall that all of this is because, in odoc, tables are allowed to be nested in blocks, while it is not allowed in markdown).

For instance, in the example above, I had in mind a table inside a list (which is possible in odoc).

Julow suggested (IIUC) to render in this case the table, not by using markdown tables, but by embedding raw <table>/<tr>/<td> before the table/line/cell, and the corresponding closing tag after.
The markdown example above was thus the translation of a (nested) list containing a table, using this method.

However, we can see, from how github renders it, that in fact the table is not interpreted as inside the list, but "toplevel", and thus also breaks the rest of the list.
(I included a nested list to make it more visible that the included html breaks the flow of the list interpretation, but it might have created noise!)

It seems that cmarkit type system won't enforce constraint on raw html embedding, which is frightening!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that's not suitable since it would also break more visibly things like numbered lists (see below).
So far I think the more convincing workaround is your current proposition.

- first item
  1. first subitem
<table><tr><td>

useless table
</td></tr></table>

  2. second subitem
  • first item
    1. first subitem

useless table

  1. second subitem

Copy link
Collaborator Author

@panglesd panglesd Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: The example was wrong, it is in fact possible to have html inside list items in commonmark.

(With the help of cmarkit) I could generate the right markdown for the example above:

So:

1. first item
   1. first subitem
      <table><tr><td>
      
      useless table
      </td></tr></table>
      
   2. second subitem

renders as

  1. first item
    1. first subitem

      useless table

    2. second subitem

So I think the suggestion to use html is still valid, and the discussion open again. Speaking of tables, let me use one to summarize the differences:

Current approach Html embedding
Source 🟠 Source is not very readable due to tables not properly formatted, and blocks turned into inlines 🟠 Source is not very readable, due to html tags and blank lines added all over the place
Faithful rendering 🔴 The content is modified 🟢 The content is preserved
Compatibility 🟠 The renderer needs to know about tables, which is not in the commonmark standard 🟠 The parser needs to follow the commonmark standard, which is different from the original one if I understand correctly
Rendering target 🟢 No constraint 🔴 Has to be HTML/markdown, otherwise (eg in latex) embedded html will certainly be ignored
Scariness 🟢 Not scary, a problem won't leak outside of the table 🟠 Html embedding is a litlle bit scary as making a mistake can break the whole output...
Maintainability 🔴 Adds maintenance 🟢 Does not add maintenance

If the "rendering target" can handle embedded html, maybe the html embedding approach is better? The "content is preserved" is quite a strong benefit, I guess...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the specification, LSP clients that have the capability to parse markdown should follow the Github Flavored Markdown Specification.

That plays in favor of Html but they also warn that:

Please Note that clients might sanitize the return markdown. A client could
decide to remove HTML from the markdown to avoid script execution.

Copy link
Collaborator

@voodoos voodoos Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And there is more: the client can advertise which tags will actually be allowed in markdown with the "MarkdownClientCapabilities.allowedTags" capability.

If we want to respect the protocol we need to have at least two fallbacks:

  • A text-only version when markdown is not allowed. (Right now ocaml-lsp simply prints the raw comment which is okay)
  • A markdown without html when required tags are not allowed

And we could do better by using raw html when the client accepts it.

In any case we need to keep the current version, and I'm not sure having the additional html one is really worth it ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case we need to keep the current version, and I'm not sure having the additional html one is really worth it ?

I agree with this analysis, and I do not think it is worth it: in the current approach, the content is only hurt in the case of table/lists inside tables, which I think will happen very seldomly.

If it turns out a problem, we can add the html approach later.

CHANGES.md Outdated Show resolved Hide resolved
@voodoos voodoos marked this pull request as ready for review October 10, 2023 14:46
panglesd added a commit to panglesd/ocaml-lsp that referenced this pull request Oct 16, 2023
Signed-off-by: Paul-Elliot <peada@free.fr>
@panglesd
Copy link
Collaborator Author

The test is failing because the test depend on ocamlformat 0.24.1, which depends on odoc-parser >= "2.0.0" & < "2.3.0". So the dependencies and test dependencies are not co-installable!

The solutions are:

  • Switch to a higher ocamlformat version where odoc-parser has been vendored, or
  • vendor odoc-parser.

In any case, I guess it would make more sense in another PR?

@voodoos
Copy link
Collaborator

voodoos commented Oct 16, 2023

I'd like to suggest that the parser is vendored in order to increase compatibility

I am not fond of vendoring, but this looks like a legitimate use case. As we see with this PR, reducing co-instability with ocamlformat which is a commonly used tool along ocaml-lsp might impact a lot of users.

In any case, I guess it would make more sense in another PR?

I think we should do the vendoring in that PR, rename it Upgrade to odoc-parser 2.3.0 and allow the CI to run correctly.

@panglesd
Copy link
Collaborator Author

I think we should do the vendoring in that PR

Then I'll do that!

Just to be sure, would a vendoring in the style of dune or mdx, with a vendor directory and an update-odoc-parser.sh script downloading the sources from a given hash, be good for this project? Or (as I'm seeing a submodule directory) would you rather have the vendoring made with git submodules?

@voodoos
Copy link
Collaborator

voodoos commented Oct 16, 2023

Just to be sure, would a vendoring in the style of dune or mdx, with a vendor directory and an update-odoc-parser.sh script downloading the sources from a given hash, be good for this project? Or (as I'm seeing a submodule directory) would you rather have the vendoring made with git submodules?

@rgrinberg do you have a preference ?

@rgrinberg
Copy link
Member

Just copying the part that you need is good enough for me. Submodules are more useful when you're also developing the vendored component in conjunction with this repo. I don't have the intention to do that, and I assume you two don't either.

@rgrinberg
Copy link
Member

By the way, isn't this a breaking change in odoc-parser? Shouldn't there be a major version bump.

I also don't see anything in the change log regarding the breaking change.

@panglesd
Copy link
Collaborator Author

panglesd commented Oct 17, 2023

The odoc parser was in a separate repo (https://github.com/ocaml-doc/odoc-parser, the README needs to be updated to reflect this) but it was merged back in the odoc repository in ocaml/odoc#973.

Currently, the version number from odoc and odoc-parser are tied together (so, 2.3.0), but I cannot find a reason for this in the past discussions. I can only find:

We're in the happy position that the current releases of odoc-parser and odoc are such that we can coalesce the version numbers so we'll have 2.3.0 as the next release of both, which is one less thing to worry about.

but I agree that it's not exactly true, since odoc-parser needed a major bump. There is also ocaml/odoc#973 (comment) which opens up to untie the odoc and odoc-parser versions later, if I understand correctly.

panglesd added a commit to panglesd/ocaml-lsp that referenced this pull request Oct 17, 2023
Signed-off-by: Paul-Elliot <peada@free.fr>
@panglesd panglesd force-pushed the odoc-parser-compatibility branch 2 times, most recently from 0d3258b to 56796c7 Compare October 17, 2023 07:55
Signed-off-by: Paul-Elliot <peada@free.fr>
Signed-off-by: Paul-Elliot <peada@free.fr>
Comment on lines 3 to 5
(library
(name odoc_parser)
(libraries astring result camlp-streams))
Copy link
Collaborator Author

@panglesd panglesd Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In dune's vendoring, all the dune files I took inspiration from (such as this one) have (wrapped false).

I'm not sure if it's needed, but it still compiles without it..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not needed. It's needed under the following circumstance: the code is being vendored in a library so we need to avoid name collisions with the original library if we link it to the same binary.

In this case, we're using this odoc parser only in ocamllsp, so we can drop it.

@panglesd
Copy link
Collaborator Author

The vendoring is ready for review. The compatiblity patch needs a new conclusion to the thread started by #1184 (comment) !

dune-project Outdated
@@ -56,7 +56,9 @@ possible and does not make any assumptions about IO.
ordering
dune-build-info
spawn
(odoc-parser (and (>= 2.0.0) (< 2.3.0)))
astring
result
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need this shim. we don't support versions of OCaml that don't have the result module.

@panglesd panglesd force-pushed the odoc-parser-compatibility branch 2 times, most recently from 4df27bb to db23285 Compare October 18, 2023 06:28
@panglesd
Copy link
Collaborator Author

I opened ocaml-opam/opam-repository-mingw#12 for the windows CI failure.

flake.nix Outdated Show resolved Hide resolved
flake.nix Outdated Show resolved Hide resolved
Signed-off-by: Paul-Elliot <peada@free.fr>
@rgrinberg rgrinberg added this to the 1.17.0 milestone Oct 22, 2023
_
Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
@rgrinberg rgrinberg merged commit c6eafd8 into ocaml:master Oct 22, 2023
8 of 9 checks passed
voodoos added a commit to voodoos/opam-repository that referenced this pull request Dec 18, 2023
CHANGES:

## Fixes

- Fix missing super & subscripts in markdown documentation. (ocaml/ocaml-lsp#1170)

- Do not invoke dune at all if `--fallback-read-dot-merlin` flag is on. (ocaml/ocaml-lsp#1173)

- Fix semantic highlighting of infix operators that contain '.'. (ocaml/ocaml-lsp#1186)

- Disable highlighting unit as an enum member to fix comment highlighting bug. (ocaml/ocaml-lsp#1185)

- Improve type-on-hover and type-annotate efficiency by only formatting the type
  of the first enclosing. (ocaml/ocaml-lsp#1191, ocaml/ocaml-lsp#1196)

- Fix the encoding of URI's to match how vscode does it (ocaml/ocaml-lsp#1197)

- Fix parsing of completion prefixes (ocaml/ocaml-lsp#1181)

## Features

- Compatibility with Odoc 2.3.0, with support for the introduced syntax: tables,
  and "codeblock output" (ocaml/ocaml-lsp#1184)

- Display text of references in doc strings (ocaml/ocaml-lsp#1166)

- Add mark/remove unused actions for open, types, for loop indexes, modules,
  match cases, rec, and constructors (ocaml/ocaml-lsp#1141)

- Offer auto-completion for the keyword `in` (ocaml/ocaml-lsp#1217)
voodoos added a commit to voodoos/ocaml-lsp that referenced this pull request Jan 29, 2024
commit ccedef9
Author: Ulysse Gérard <thevoodoos@gmail.com>
Date:   Wed Jan 3 17:42:08 2024 +0100

    fmt

commit 1c3ebfd
Author: Ulysse Gérard <thevoodoos@gmail.com>
Date:   Wed Jan 3 17:41:52 2024 +0100

    rename: support project-wide but unsafe renaming

commit 7105c12
Author: Ulysse Gérard <thevoodoos@gmail.com>
Date:   Thu Nov 30 16:22:28 2023 +0100

    wip: make rename project wide

commit ba1d99a
Author: Ulysse Gérard <thevoodoos@gmail.com>
Date:   Thu Nov 30 14:49:28 2023 +0100

    Notify the user when the index is out of date

commit ef50a83
Author: Ulysse Gérard <thevoodoos@gmail.com>
Date:   Wed Nov 29 19:00:28 2023 +0100

    wip: get desync info from merlin

commit 5265d22
Author: Ulysse Gérard <thevoodoos@gmail.com>
Date:   Wed Nov 22 11:12:48 2023 +0100

    Handle new `unit_name` config directive

commit 2367339
Author: Ulysse Gérard <thevoodoos@gmail.com>
Date:   Thu Oct 26 15:50:49 2023 +0200

    Fix merlin config merging

commit 55ab533
Author: Ulysse Gérard <thevoodoos@gmail.com>
Date:   Wed Oct 25 16:33:38 2023 +0200

    config: accept index_file directive

commit e353220
Author: Ulysse Gérard <thevoodoos@gmail.com>
Date:   Fri Sep 29 14:23:32 2023 +0200

    Apply compiler libs changes

commit 69fea3a
Author: Ulysse Gérard <thevoodoos@gmail.com>
Date:   Tue Jun 13 17:42:00 2023 +0200

    Support project wide occurrences

commit d3d8de5
Author: Ulysse <5031221+voodoos@users.noreply.github.com>
Date:   Mon Dec 18 16:34:42 2023 +0100

    Prepare release 1.17.0 (ocaml#1219)

    * chore: restore lost version number in changes
         (erased in c8c1096)

    * chore: bump version number in changes before release

commit f9de892
Author: Ulugbek Abdullaev <ulugbekna@gmail.com>
Date:   Mon Dec 11 19:51:35 2023 +0100

    Completion for `in` keyword (ocaml#1217)

    feat(auto-completion): auto-complete `in` (in a hacky way) in .ml files

commit 9dc8de2
Author: Jules Aguillon <jules@j3s.fr>
Date:   Mon Nov 27 04:29:17 2023 +0100

    Upgrade OCamlformat to 0.26.1 (ocaml#1177)

commit 2ffaea0
Author: Rudi Grinberg <me@rgrinberg.com>
Date:   Sun Nov 26 20:29:02 2023 -0600

    chore: move dune-release to own shell (ocaml#1215)

    Signed-off-by: Rudi Grinberg <me@rgrinberg.com>

commit 17eec24
Author: Rudi Grinberg <me@rgrinberg.com>
Date:   Sun Nov 26 20:07:08 2023 -0600

    chore: update flakes (ocaml#1214)

    Signed-off-by: Rudi Grinberg <me@rgrinberg.com>

commit 73bf594
Author: Rudi Grinberg <me@rgrinberg.com>
Date:   Sun Nov 26 19:51:42 2023 -0600

    Completion edge cases (ocaml#1212)

    * Tests passing

    * Added support for whitespace in completion
    The solution here is to change all whitespace to spaces for ease of regex matching(all whitespace is equivelent semantically) and then remove all spaces from the prefix that's passed to merlin.

    Co-authored-by: faldor20 <eli.jambu@gmail.com>
    Co-authored-by: faldor20 <eli.jambu@yahoo.com>

commit 3b3b9bc
Author: Rudi Grinberg <me@rgrinberg.com>
Date:   Sun Nov 26 18:58:33 2023 -0600

    fix: hack to workaround dune bug (ocaml#1213)

    Signed-off-by: Rudi Grinberg <me@rgrinberg.com>

commit c6eafd8
Author: panglesd <peada@free.fr>
Date:   Sun Oct 22 09:32:24 2023 +0200

    Compatibility with odoc-parser.2.3.0 (ocaml#1184)

    * compatibility with odoc-parser.2.3.0

    Signed-off-by: Paul-Elliot <peada@free.fr>

commit 6f78852
Author: Rudi Grinberg <me@rgrinberg.com>
Date:   Tue Oct 17 23:36:08 2023 -0600

    fix(test): correctly qualify pp dep (ocaml#1203)

    Signed-off-by: Rudi Grinberg <me@rgrinberg.com>

commit 6bda990
Author: Bao Zhiyuan <71200607+bzy-debug@users.noreply.github.com>
Date:   Wed Oct 18 13:35:39 2023 +0800

    Fix uri to_string (ocaml#1197)

    The current implementation is not the same as uri.ts

    see microsoft/vscode-uri@96acdc0/src/uri.ts#L601

    the ts code just checks if authority is empty, while ocaml code checks
    if authority is "file"

    Co-authored-by: Bao Zhiyuan <baozhiyuan@idea.edu.cn>

commit e0c4df3
Author: Rudi Grinberg <me@rgrinberg.com>
Date:   Mon Oct 16 22:36:44 2023 -0600

    fix: odoc-parser constraints (ocaml#1204)

    we aren't compatible with 2.3.0

    Signed-off-by: Rudi Grinberg <me@rgrinberg.com>

commit eca77cf
Author: Rudi Grinberg <me@rgrinberg.com>
Date:   Mon Oct 16 14:58:31 2023 -0600

    chore: more CHANGES formatting (ocaml#1202)

    Signed-off-by: Rudi Grinberg <me@rgrinberg.com>

commit 208683e
Author: Rudi Grinberg <me@rgrinberg.com>
Date:   Mon Oct 16 12:25:13 2023 -0600

    chore: better formatting in CHANGES (ocaml#1201)

    Signed-off-by: Rudi Grinberg <me@rgrinberg.com>

commit ce2e46b
Author: Rudi Grinberg <me@rgrinberg.com>
Date:   Mon Oct 16 11:21:19 2023 -0600

    chore: update flakes (ocaml#1195)

    Signed-off-by: Rudi Grinberg <me@rgrinberg.com>

commit 841fe5c
Author: Rudi Grinberg <me@rgrinberg.com>
Date:   Mon Oct 16 11:16:05 2023 -0600

    refactor: fix against newer versions of dune (ocaml#1200)

    removes the incomplete pattern match warning on the progress

    Signed-off-by: Rudi Grinberg <me@rgrinberg.com>

commit 672ca1b
Author: Ulysse <5031221+voodoos@users.noreply.github.com>
Date:   Wed Oct 4 00:14:05 2023 +0200

    Only return first result for type_enclosing. (ocaml#1196)

    in `type-annotate` code action.
nberth pushed a commit to nberth/opam-repository that referenced this pull request Jun 18, 2024
CHANGES:

## Fixes

- Fix missing super & subscripts in markdown documentation. (ocaml/ocaml-lsp#1170)

- Do not invoke dune at all if `--fallback-read-dot-merlin` flag is on. (ocaml/ocaml-lsp#1173)

- Fix semantic highlighting of infix operators that contain '.'. (ocaml/ocaml-lsp#1186)

- Disable highlighting unit as an enum member to fix comment highlighting bug. (ocaml/ocaml-lsp#1185)

- Improve type-on-hover and type-annotate efficiency by only formatting the type
  of the first enclosing. (ocaml/ocaml-lsp#1191, ocaml/ocaml-lsp#1196)

- Fix the encoding of URI's to match how vscode does it (ocaml/ocaml-lsp#1197)

- Fix parsing of completion prefixes (ocaml/ocaml-lsp#1181)

## Features

- Compatibility with Odoc 2.3.0, with support for the introduced syntax: tables,
  and "codeblock output" (ocaml/ocaml-lsp#1184)

- Display text of references in doc strings (ocaml/ocaml-lsp#1166)

- Add mark/remove unused actions for open, types, for loop indexes, modules,
  match cases, rec, and constructors (ocaml/ocaml-lsp#1141)

- Offer auto-completion for the keyword `in` (ocaml/ocaml-lsp#1217)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants