Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: allow utf8 encoding #919

Merged
merged 1 commit into from
Nov 17, 2022
Merged

Conversation

rgrinberg
Copy link
Member

Signed-off-by: Rudi Grinberg me@rgrinberg.com

ps-id: ac0de96a-2c5f-4c90-b1af-40acdc1a8344

@rgrinberg rgrinberg added this to the 1.15.0 milestone Nov 16, 2022
@rgrinberg rgrinberg force-pushed the ps/rr/feature__allow_utf8_encoding branch 2 times, most recently from f48f53d to bfd0f6e Compare November 16, 2022 03:48
@ulugbekna
Copy link
Collaborator

Will review by Friday

Copy link
Collaborator

@ulugbekna ulugbekna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me except for nits

CHANGES.md Outdated Show resolved Hide resolved
(change : TextDocumentContentChangeEvent.t) =
(* Changes can only be applied using utf16 offsets *)
let version =
match version with
| None -> t.version + 1
| None -> t.document.version + 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slightly offtopic but since we're on this: the version bump happens for each change event within a single textDocument/didChange notification (vs. once per notification). Is this what we want? I don't recall the version being useful to either client or server, but still

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point. i got rid of this weirdness

Comment on lines 152 to 161
String.concat
~sep:""
[ String.sub t.text ~pos:0 ~len:start_offset
[ String.sub t.document.text ~pos:0 ~len:start_offset
; change.text
; String.sub
t.text
t.document.text
~pos:end_offset
~len:(String.length t.text - end_offset)
~len:(String.length t.document.text - end_offset)
]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also offtopic a bit: Given how many times and how often a document is edited, should we could use a data structure more specialized for updates (tree/rope?) or at least do the string concat manually and avoid two copies that happen because of String.sub calls?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Batching updates sounds useful but a rope wouldn't help much since we'd need to convert to a string for merlin anyway.

@@ -2,7 +2,7 @@ open Types

type t

val make : DidOpenTextDocumentParams.t -> t
val make : encoding:[ `UTF8 | `UTF16 ] -> DidOpenTextDocumentParams.t -> t
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Would it make sense to specify that it's "position encoding" rather than "encoding"?
  2. Since position encoding is specified for all documents (ie docs don't vary by their pos encoding), should we perhaps not keep encoding for each document? Would it be better if the lsp library client passes the pos encoding to apply_content_change? I just think it's a bit of misrepresentation of the domain that every document can have different pos encoding, but I agree there isn't obvious better solution.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I suppose so. It really only applies to updates anyway
  2. This would be better if the encoding could be changed between updates but this isn't the case. So neither reflect the domain very well.

What would be a better long term fix is to general Document_store and promote it into the lsp library.

let find_offset ~utf8 ~utf16_range:range =
exception Outside

let rec find_nth_nl str nth pos len =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be defined within find_nth_nl definition below as an auxiliary fn it is?

@ulugbekna
Copy link
Collaborator

Also, do I understand correctly that the text document encoding is still only utf-8 and is passed in the rpc message header, while this new encoding negotiation is only for the encoding that's used for denoting positions in the text document?

@rgrinberg
Copy link
Member Author

Also, do I understand correctly that the text document encoding is still only utf-8 and is passed in the rpc message header, while this new encoding negotiation is only for the encoding that's used for denoting positions in the text document?

Yeah. The summary is:

  1. the encoding for protocol packets is always utf8
  2. we always store everything internally as utf8
  3. only character offsets are interpreted as utf8 or utf16 depending on the client.

@rgrinberg rgrinberg force-pushed the ps/rr/feature__allow_utf8_encoding branch from bfd0f6e to 4000f49 Compare November 17, 2022 21:28
Signed-off-by: Rudi Grinberg <me@rgrinberg.com>

ps-id: ac0de96a-2c5f-4c90-b1af-40acdc1a8344
@rgrinberg rgrinberg force-pushed the ps/rr/feature__allow_utf8_encoding branch from 4000f49 to 4ffee3c Compare November 17, 2022 21:45
@rgrinberg rgrinberg merged commit d1bd355 into master Nov 17, 2022
ulugbekna added a commit to ulugbekna/opam-repository that referenced this pull request Dec 16, 2022
CHANGES:

## Fixes

- Fix document syncing for ranges that span an entire line (ocaml/ocaml-lsp#927)

- Respect the client's completion item resolve and preSelect capabilities
  (ocaml/ocaml-lsp#925, ocaml/ocaml-lsp#936)

- Disable polling for dune's watch mode on Windows and OCaml 4.14.0 (ocaml/ocaml-lsp#935)

## Features

- Semantic highlighting support is enabled by default (ocaml/ocaml-lsp#933)

- Re-enable `ocamlformat-rpc` for formatting code snippets (but not files) (ocaml/ocaml-lsp#920, ocaml/ocaml-lsp#939)

  One needs to have either `ocamlformat` version > 0.21.0 or, otherwise,
  `ocamlformat-rpc` package installed.

- Add custom ocamllsp/hoverExtended request (ocaml/ocaml-lsp#561)

- Support utf-8 position encoding clients (ocaml/ocaml-lsp#919)

- Upgrade to merlin 4.7 and use merlin's `verbosity=smart` by default, which
  allows unwrapping module alias types (ocaml/ocaml-lsp#942)

## Fixes

- Fix semantic highlighting of "long identifiers," e.g., `Foo.Bar.x` (ocaml/ocaml-lsp#932)
ulugbekna added a commit to ulugbekna/opam-repository that referenced this pull request Dec 16, 2022
CHANGES:

## Fixes

- Fix document syncing for ranges that span an entire line (ocaml/ocaml-lsp#927)

- Respect the client's completion item resolve and preSelect capabilities
  (ocaml/ocaml-lsp#925, ocaml/ocaml-lsp#936)

- Disable polling for dune's watch mode on Windows and OCaml 4.14.0 (ocaml/ocaml-lsp#935)

## Features

- Semantic highlighting support is enabled by default (ocaml/ocaml-lsp#933)

- Re-enable `ocamlformat-rpc` for formatting code snippets (but not files) (ocaml/ocaml-lsp#920, ocaml/ocaml-lsp#939)

  One needs to have either `ocamlformat` version > 0.21.0 or, otherwise,
  `ocamlformat-rpc` package installed.

- Add custom ocamllsp/hoverExtended request (ocaml/ocaml-lsp#561)

- Support utf-8 position encoding clients (ocaml/ocaml-lsp#919)

- Upgrade to merlin 4.7 and use merlin's `verbosity=smart` by default, which
  allows unwrapping module alias types (ocaml/ocaml-lsp#942)

## Fixes

- Fix semantic highlighting of "long identifiers," e.g., `Foo.Bar.x` (ocaml/ocaml-lsp#932)
ulugbekna added a commit to ulugbekna/opam-repository that referenced this pull request Dec 16, 2022
CHANGES:

## Features

- Add support for OCaml 5.0

- Semantic highlighting support is enabled by default (ocaml/ocaml-lsp#933)

- Re-enable `ocamlformat-rpc` for formatting code snippets (but not files) (ocaml/ocaml-lsp#920, ocaml/ocaml-lsp#939)

  One needs to have either `ocamlformat` version > 0.21.0 or, otherwise,
  `ocamlformat-rpc` package installed.

- Add custom ocamllsp/hoverExtended request (ocaml/ocaml-lsp#561)

- Support utf-8 position encoding clients (ocaml/ocaml-lsp#919)

- Upgrade to merlin 4.7 and use merlin's `verbosity=smart` by default, which
  allows unwrapping module alias types (ocaml/ocaml-lsp#942)

## Fixes

- Fix document syncing for ranges that span an entire line (ocaml/ocaml-lsp#927)

- Respect the client's completion item resolve and preSelect capabilities
  (ocaml/ocaml-lsp#925, ocaml/ocaml-lsp#936)

- Disable polling for dune's watch mode on Windows and OCaml 4.14.0 (ocaml/ocaml-lsp#935)

- Fix semantic highlighting of "long identifiers," e.g., `Foo.Bar.x` (ocaml/ocaml-lsp#932)
ulugbekna added a commit to ulugbekna/opam-repository that referenced this pull request Jan 7, 2023
CHANGES:

## Features

- Enable [semantic highlighting](https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#textDocument_semanticTokens)
  support by default (ocaml/ocaml-lsp#933)

- Support connecting over pipes and socket. Pipes on Windows aren't yet
  supported (ocaml/ocaml-lsp#946)

  [More](https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#implementationConsiderations)
  about communication channels in LSP specification.

- Re-enable `ocamlformat-rpc` for formatting code snippets (but not files and
  not on Windows) (ocaml/ocaml-lsp#920, ocaml/ocaml-lsp#939)

  One needs to have installed either `ocamlformat` package version > 0.21.0 or,
  otherwise, `ocamlformat-rpc` package. Note that previously `ocamlformat-rpc`
  came in a standalone OPAM package, but since `ocamlformat` version > 0.21.0,
  it comes within `ocamlformat` package.

- Add custom
  [`ocamllsp/hoverExtended`](https://github.com/ocaml/ocaml-lsp/blob/e165f6a3962c356adc7364b9ca71788e93489dd0/ocaml-lsp-server/docs/ocamllsp/hoverExtended-spec.md#L1)
  request (ocaml/ocaml-lsp#561)

- Support utf-8 position encoding clients (ocaml/ocaml-lsp#919)

  [More](https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#position) about position encoding in LSP specification.

- Show unwrapped module alias types on hovering over module names. This is due
  to upgrading to merlin 4.7 and using merlin's `verbosity=smart` by default
  (ocaml/ocaml-lsp#942)

## Fixes

- Respect the client's completion item resolve and preSelect capabilities
  (ocaml/ocaml-lsp#925, ocaml/ocaml-lsp#936)

- Disable polling for dune's watch mode on Windows and OCaml 4.14.0 (ocaml/ocaml-lsp#935)

- Fix semantic highlighting of "long identifiers," e.g., `Foo.Bar.x` (ocaml/ocaml-lsp#932)

- Fix syncing of document contents:

  - For ranges that span an entire line (ocaml/ocaml-lsp#927)
  - Previously, whole line edits would incorrectly eat the newline characters (ocaml/ocaml-lsp#971)
ulugbekna added a commit to ulugbekna/opam-repository that referenced this pull request Jan 7, 2023
CHANGES:

## Features

- Add support for OCaml 5.0

- Enable [semantic highlighting](https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#textDocument_semanticTokens)
  support by default (ocaml/ocaml-lsp#933)

- Support connecting over pipes and socket. Pipes on Windows aren't yet
  supported (ocaml/ocaml-lsp#946)

  [More](https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#implementationConsiderations)
  about communication channels in LSP specification.

- Re-enable `ocamlformat-rpc` for formatting code snippets (but not files and
  not on Windows) (ocaml/ocaml-lsp#920, ocaml/ocaml-lsp#939)

  One needs to have installed either `ocamlformat` package version > 0.21.0 or,
  otherwise, `ocamlformat-rpc` package. Note that previously `ocamlformat-rpc`
  came in a standalone OPAM package, but since `ocamlformat` version > 0.21.0,
  it comes within `ocamlformat` package.

- Add custom
  [`ocamllsp/hoverExtended`](https://github.com/ocaml/ocaml-lsp/blob/e165f6a3962c356adc7364b9ca71788e93489dd0/ocaml-lsp-server/docs/ocamllsp/hoverExtended-spec.md#L1)
  request (ocaml/ocaml-lsp#561)

- Support utf-8 position encoding clients (ocaml/ocaml-lsp#919)

  [More](https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#position) about position encoding in LSP specification.

- Show unwrapped module alias types on hovering over module names. This is due
  to upgrading to merlin 4.7 and using merlin's `verbosity=smart` by default
  (ocaml/ocaml-lsp#942)

## Fixes

- Respect the client's completion item resolve and preSelect capabilities
  (ocaml/ocaml-lsp#925, ocaml/ocaml-lsp#936)

- Disable polling for dune's watch mode on Windows and OCaml 4.14.0 (ocaml/ocaml-lsp#935)

- Fix semantic highlighting of "long identifiers," e.g., `Foo.Bar.x` (ocaml/ocaml-lsp#932)

- Fix syncing of document contents:

  - For ranges that span an entire line (ocaml/ocaml-lsp#927)
  - Previously, whole line edits would incorrectly eat the newline characters (ocaml/ocaml-lsp#971)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants