Generate Rust types from the official specification #151

Closed
ebkalderon opened this issue Jan 18, 2023 · 9 comments
Labels
feature-request Request for new features or functionality rust

@ebkalderon

ebkalderon commented Jan 18, 2023

Introduction

It would be awesome if this project also supported generating Rust types from the official spec.

This ticket was created in response to a brief discussion with @brettcannon over at ebkalderon/tower-lsp#361 (comment) and is intended to track future developments. Happy to get some productive dialog going! 😄

Background

Considering that installing and running a Python package as a prerequisite from a cargo build would be pretty messy, I would assume this effort would likely entail the creation of a pure Rust equivalent that can consume the same spec. Consider the following setup:

  • An lsprotocol-codegen crate would process the spec and generate a Rust file containing type and optionally trait definitions.
    • Provided as a library intended to be used from a Cargo build.rs script.
    • Essentially amounts to a simple generate() function that takes an input spec and some configuration switches.
    • Generates an lsprotocol.rs file in $OUT_DIR that can be included in a Rust project with:
      include!(concat!(env!("OUT_DIR"), "/lsprotocol.rs"));
    • Could also ship with a src/bin.rs that wraps this library and provides a CLI interface, should others prefer to use that.
    • See other popular build helper crates for prior art on this interface, e.g. prost-build (Protobuf), tonic-build (gRPC), and dbus-codegen (D-Bus).
    • Consider using schemafy for processing the JSON Schema (draft 4) file in this repo.
  • An lsprotocol crate would export native Rust types to the user.
    • Main crate that most downstream users would consume.
    • These types would be generated in advance by lsprotocol-codegen and vendored directly into the repository.
    • Updated periodically on an as-needed basis (perhaps automatically using a GitHub CI job) and then published to Crates.io.
    • Generated types would implement most common std traits (Clone, Debug, Default, Eq, PartialEq, etc).
    • Generated types would implement serde::Serialize and serde::Deserialize for easy (de)serialization to and from JSON using serde_json.
    • See the community de-facto standard Rust crate lsp-types for prior art.
      • Something with an API that looks very similar to this would be amazing! Dumb structs with all pub fields, enums used to represent multiple states, and everything implements Serialize and Deserialize.
      • The included Notification and Request traits are very handy for generic code. Would love to have that here too.
      • Not sure if the helper macros are essential to replicate. I rarely use them, personally, but some may find them nice.
      • Please note that the structs, enums, traits, and documentation in this particular crate are all updated by hand, at present. As such, a source of auto-generated Rust types would be wonderful to have (less manual labor for maintainers while delivering quicker updates to downstream users).
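
To make the Notification and Request point concrete, here is a minimal sketch of what such traits could look like. It is loosely modeled on the shape used by lsp-types, but the serde bounds are left out so the example has no dependencies, and the helper functions `method_of` and `notification_method_of` are hypothetical names added for illustration.

```rust
/// Ties a request marker type to its method name, parameters, and result.
/// Illustrative only: a real version would likely add serde bounds.
pub trait Request {
    type Params;
    type Result;
    const METHOD: &'static str;
}

/// Same idea for notifications, which have no result.
pub trait Notification {
    type Params;
    const METHOD: &'static str;
}

/// Uninhabited marker type for the `shutdown` request.
pub enum Shutdown {}

impl Request for Shutdown {
    type Params = ();
    type Result = ();
    const METHOD: &'static str = "shutdown";
}

/// Uninhabited marker type for the `exit` notification.
pub enum Exit {}

impl Notification for Exit {
    type Params = ();
    const METHOD: &'static str = "exit";
}

/// Generic code can look up the method name without knowing the concrete type.
pub fn method_of<R: Request>() -> &'static str {
    R::METHOD
}

pub fn notification_method_of<N: Notification>() -> &'static str {
    N::METHOD
}
```

Generic routing code can then be written once against `R: Request` rather than once per message type.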

Ideally, the above Rust crates would share the same spec file as the Python project so the codegen for both could be tested for correctness in CI.
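
As a rough sketch of the lsprotocol-codegen flow described above, the following shows a `generate()` function that renders enums to Rust source and writes an lsprotocol.rs file into an output directory. Everything here is an assumption made for illustration: `EnumSpec` is a heavily simplified stand-in for the real spec, `render_enum` is a hypothetical helper, and the input is built in memory rather than parsed from the JSON model.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical, heavily simplified stand-in for one enumeration in the spec.
pub struct EnumSpec {
    pub name: String,
    pub variants: Vec<(String, i64)>,
}

/// Renders one integer enum as Rust source with a few common derives.
pub fn render_enum(spec: &EnumSpec) -> String {
    let mut out = String::new();
    out.push_str("#[derive(Clone, Copy, Debug, PartialEq, Eq)]\n");
    out.push_str(&format!("pub enum {} {{\n", spec.name));
    for (variant, value) in &spec.variants {
        out.push_str(&format!("    {} = {},\n", variant, value));
    }
    out.push_str("}\n");
    out
}

/// Writes `lsprotocol.rs` into `out_dir` and returns the path of the file.
pub fn generate(specs: &[EnumSpec], out_dir: &Path) -> io::Result<PathBuf> {
    let source = specs
        .iter()
        .map(render_enum)
        .collect::<Vec<_>>()
        .join("\n");
    let path = out_dir.join("lsprotocol.rs");
    fs::write(&path, source)?;
    Ok(path)
}
```

In a build.rs script, `out_dir` would come from the `OUT_DIR` environment variable that Cargo sets for build scripts, and the resulting file could be pulled in with the `include!` line shown earlier.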

Open Questions

One notable open question is how we should best handle "proposed" features. It would be nice if downstream users of lsprotocol could opt into certain not-quite-standardized features in advance, provided they explicitly agree to the API instability when switching this on. The de-facto community standard crate lsp-types does precisely this (as does my own downstream LSP library for Rust tower-lsp) using an off-by-default proposed = [] Cargo feature which, if enabled at compile-time, would activate these types in the public API.

If we choose to offer the same thing here, it would be fairly trivial on the surface: the lsprotocol crate would expose Rust types for the entire superset of the LSP specification, marking certain types #[cfg(feature = "proposed")] and offering users an off-by-default proposed = [] feature they can choose to enable if they wish. This would fall in line with other popular LSP crates used in the community today.
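
A minimal sketch of that gating, assuming the crate declares `proposed = []` under `[features]` in its Cargo.toml. `ProposedThing` is a made-up placeholder rather than a real protocol type, and `proposed_enabled` is a hypothetical helper added only to make the switch observable.

```rust
/// Always available: part of the stable protocol surface.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct Position {
    pub line: u32,
    pub character: u32,
}

/// Compiled only when the crate is built with the `proposed` feature enabled.
/// `ProposedThing` is a placeholder name, not a real protocol type.
#[cfg(feature = "proposed")]
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct ProposedThing {
    pub position: Position,
}

/// Reports at compile time whether the `proposed` feature is active.
pub fn proposed_enabled() -> bool {
    cfg!(feature = "proposed")
}
```

With the feature off, `ProposedThing` does not exist in the public API at all, so downstream code cannot depend on it by accident.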

However, what remains to be seen is how fancy we'd like to be with lsprotocol-codegen. Consider the following questions:

  1. Does the current spec file used in this repo include proposed features at all? Is this something the equivalent Python lsprotocol package supports today? Is this something we even want to support?
  2. If we do want to support proposed protocol features in this crate, how should lsprotocol-codegen implement said support?
    • Presumably, lsprotocol-codegen would output a single lsprotocol.rs file containing all types with some marked #[cfg(feature = "proposed")], ready to be vendored into this repo as lsprotocol/src/lib.rs and published directly to Crates.io as the lsprotocol crate.
    • Should we add a config switch to lsprotocol_codegen::generate() and the CLI interface (if we choose to provide one) for the user to change the "proposed" feature name string in the output #[cfg]s to something else?
    • Should we support a config switch for not including types for proposed features in the lsprotocol.rs output entirely?
    • I don't think either of the above two bullets are at all necessary for an MVP, but may be nice to have down the road.
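
The two optional switches in the bullets above could be sketched roughly as follows. The `Config` and `TypeSpec` types and their field names are hypothetical, and the output is reduced to unit structs to keep the example short.

```rust
/// Hypothetical generator configuration; field names are illustrative.
pub struct Config {
    /// Cargo feature name used in emitted `#[cfg(feature = "...")]` attributes.
    pub proposed_feature: String,
    /// When false, proposed types are left out of the output entirely.
    pub include_proposed: bool,
}

impl Default for Config {
    fn default() -> Self {
        Config {
            proposed_feature: "proposed".to_string(),
            include_proposed: true,
        }
    }
}

/// Simplified stand-in for a type described by the spec.
pub struct TypeSpec {
    pub name: String,
    pub proposed: bool,
}

/// Renders every type, gating proposed ones behind the configured feature.
pub fn render(types: &[TypeSpec], config: &Config) -> String {
    let mut out = String::new();
    for ty in types {
        if ty.proposed {
            if !config.include_proposed {
                continue; // skip proposed types altogether
            }
            out.push_str(&format!(
                "#[cfg(feature = \"{}\")]\n",
                config.proposed_feature
            ));
        }
        out.push_str(&format!("pub struct {};\n", ty.name));
    }
    out
}
```

The defaults reproduce the single-file, `proposed`-gated output described in the first bullet, so the switches stay out of the way for an MVP.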

Another open question is regarding specification version support in general. Do we support the latest version of the LSP spec only? Or do we want to be able to perform codegen for all available versions of the spec? I presume the former, but I think it's good to get solid confirmation on this.

Versioning

Crates published to Crates.io are expected to adhere to semantic versioning as strictly as possible. This is quite significant for us because adding new pub fields to an existing struct or new variants to an existing enum is considered a breaking API change, unless those types are explicitly marked #[non_exhaustive] (reference), at which point users are forced to include a _ => fallthrough case or wildcard .. when matching on or destructuring those types.

We should not annotate every single type in the generated code with #[non_exhaustive], of course, since this would severely hamper the crate's usability. Downstream code would not be able to directly construct instances of any of the generated structs, even with ..Default::default() syntax, even though all fields are pub (see bug report rust-lang/rust#70564). We should carefully consider whether or not to mark certain types #[non_exhaustive] and when it would be most applicable to do so.
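
To illustrate the trade-off, here is a small sketch with one `#[non_exhaustive]` enum and one `#[non_exhaustive]` struct. The type and field names are only illustrative, and `describe` and `ClientInfo::new` are hypothetical helpers. Note that the restrictions apply to other crates only; inside the defining crate the attribute has no effect, so this single-file example merely shows the shape downstream code would have to take.

```rust
/// New variants can be added later without a semver-breaking change.
#[non_exhaustive]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TraceValue {
    Off,
    Messages,
    Verbose,
}

/// Downstream crates must include the `_` arm. Inside the defining crate it
/// is optional, but it is written here to mirror downstream code.
pub fn describe(value: TraceValue) -> &'static str {
    match value {
        TraceValue::Off => "off",
        TraceValue::Messages => "messages",
        _ => "other",
    }
}

/// New fields can be added later, at the cost of struct-literal construction.
#[non_exhaustive]
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct ClientInfo {
    pub name: String,
    pub version: Option<String>,
}

impl ClientInfo {
    /// Downstream crates cannot write `ClientInfo { .. }` literals for a
    /// `#[non_exhaustive]` struct, so a constructor has to be provided.
    pub fn new(name: impl Into<String>) -> Self {
        ClientInfo {
            name: name.into(),
            version: None,
        }
    }
}
```

Every `#[non_exhaustive]` struct would need a constructor or builder like this, which is part of why applying the attribute everywhere would hurt usability.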

Thankfully, serde seems to support serializing from and deserializing to #[non_exhaustive] types out of the box, so this should not be a factor in deciding whether or not to apply this attribute in the lsprotocol-codegen output.

karthiknadig self-assigned this Jan 20, 2023
karthiknadig added the feature-request label (Request for new features or functionality) and removed the triage-needed label Jan 20, 2023
@karthiknadig
Member

It is wonderful to see interest in this.

Does the current spec file used in this repo include proposed features at all?

@dbaeumer Does the LSP model JSON contain proposed features?

Is this something the equivalent Python lsprotocol package supports today? Is this something we even want to support?

The Python package does not explicitly enable or disable proposed features. We could add support for this in the Rust code.

If we do want to support proposed protocol features in this crate, how should lsprotocol-codegen implement said support?

I like the first idea: lsprotocol-codegen would output a single lsprotocol.rs file containing all types with some marked #[cfg(feature = "proposed")], ready to be vendored into this repo as lsprotocol/src/lib.rs and published directly to Crates.io as the lsprotocol crate.

Do we support the latest version of the LSP spec only? Or do we want to be able to perform codegen for all available versions of the spec?

The LSP model in this repo was generated with 3.17.0. We currently generate from the latest only. Before this we did not have a JSON model for the LSP spec.

@ebkalderon
Author

I don't think the current 3.17.0 specification includes any proposed features, but it would be awesome if the JSON model and the generator crate had support for this going forward. Previous versions did have some proposed features (inline value and inlay hints for a while, I think), so having a proposed feature present in the lsprotocol crate from the beginning would be nice, even if it's not used at the moment.

@dbaeumer
Member

Yes, the model has support for proposed features. See the proposed properties in the meta model itself: https://github.com/microsoft/vscode-languageserver-node/blob/56c23c557e3568a9f56f42435fd5a80f9458957f/tools/src/metaModel.ts#L1

@karthiknadig
Member

@ebkalderon I have started work on this. I created a meta item to track various bits of work. #164

@karthiknadig
Member

karthiknadig commented Feb 16, 2023

Just created a PR with some code generation to test out the plugin and set up some linting with Rust. With #166 merged, you should be able to contribute to code generation.

#166 creates some integer enums from the spec; it does not yet have code to include doc strings and other bits.

@ebkalderon
Author

That's awesome! Really glad to see a pull request land on this. 😄

@ebkalderon
Author

Love what I'm seeing in #166, though I think we should probably remove the Cargo.lock from the repository (Cargo FAQ).

I'm not familiar with this particular nuance of the specification, but does LSP include the full -32000 to -32099 range of implementation-defined error codes as defined in JSON-RPC 2.0? If so, then perhaps an ErrorCode data structure like this would be more semantically correct:

pub enum ErrorCode {
    ParseError,
    InvalidRequest,
    MethodNotFound,
    InvalidParams,
    InternalError,
    ServerNotInitialized,
    UnknownErrorCode,
    Other(i64),
}

I used macro_rules! in this example just to demonstrate the concept, but presumably the Python code generator would output the complete code block instead: Playground link
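
For readers without access to the playground, a hand-expanded sketch of the same idea might look like the following. The numeric values are the ones defined by JSON-RPC 2.0 and the LSP specification; the `code()` method and the `From<i64>` impl are one possible design, not necessarily what the generator would emit.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ErrorCode {
    ParseError,
    InvalidRequest,
    MethodNotFound,
    InvalidParams,
    InternalError,
    ServerNotInitialized,
    UnknownErrorCode,
    /// Any code without a dedicated variant, including custom server errors.
    Other(i64),
}

impl ErrorCode {
    /// Returns the integer sent over the wire for this error code.
    pub const fn code(self) -> i64 {
        match self {
            ErrorCode::ParseError => -32700,
            ErrorCode::InvalidRequest => -32600,
            ErrorCode::MethodNotFound => -32601,
            ErrorCode::InvalidParams => -32602,
            ErrorCode::InternalError => -32603,
            ErrorCode::ServerNotInitialized => -32002,
            ErrorCode::UnknownErrorCode => -32001,
            ErrorCode::Other(code) => code,
        }
    }
}

impl From<i64> for ErrorCode {
    /// Maps known integers to their variants and keeps everything else as `Other`.
    fn from(code: i64) -> Self {
        match code {
            -32700 => ErrorCode::ParseError,
            -32600 => ErrorCode::InvalidRequest,
            -32601 => ErrorCode::MethodNotFound,
            -32602 => ErrorCode::InvalidParams,
            -32603 => ErrorCode::InternalError,
            -32002 => ErrorCode::ServerNotInitialized,
            -32001 => ErrorCode::UnknownErrorCode,
            other => ErrorCode::Other(other),
        }
    }
}
```

Because unknown integers round-trip through `Other`, no error code received from a peer is lost during conversion.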

Any thoughts on this?

@karthiknadig
Member

@ebkalderon Enum types that can be extended have "supportsCustomValues": true in the spec. So ErrorCode, LSPErrorCode, CodeActionKinds, etc. have supportsCustomValues set to true.
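
One way a generator could honor that flag for string-valued enums is to emit a catch-all variant, sketched below. This is an assumption about a possible output shape, not what #166 produces; only a few of the known code action kinds are listed, and the `Custom` variant and `as_str` method are hypothetical names.

```rust
/// String enum with `supportsCustomValues` set: known values get variants,
/// anything else is preserved in `Custom`.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum CodeActionKind {
    QuickFix,
    Refactor,
    Source,
    /// Any value the spec does not name explicitly.
    Custom(String),
}

impl CodeActionKind {
    /// Returns the string sent over the wire.
    pub fn as_str(&self) -> &str {
        match self {
            CodeActionKind::QuickFix => "quickfix",
            CodeActionKind::Refactor => "refactor",
            CodeActionKind::Source => "source",
            CodeActionKind::Custom(value) => value.as_str(),
        }
    }
}

impl From<&str> for CodeActionKind {
    fn from(value: &str) -> Self {
        match value {
            "quickfix" => CodeActionKind::QuickFix,
            "refactor" => CodeActionKind::Refactor,
            "source" => CodeActionKind::Source,
            other => CodeActionKind::Custom(other.to_string()),
        }
    }
}
```

Enums without the flag could stay closed, with no catch-all variant, so unknown values fail to parse instead of being silently accepted.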

@karthiknadig
Member

Published an alpha version of lsprotocol: https://crates.io/crates/lsprotocol
