Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: Starlark language server project proposal #1

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

cgrindel
Copy link
Owner

  • Proposal for Kickstarter-like project to implement Starlark language server

Copy link

@sluongng sluongng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments based on my personal experience of trying to go down this road once before and realizing that it was too difficult to walk alone. 🥲

As noted in bazelbuild/bazel#18703, Buck2 is very attractive to me because it includes a Starlark LSP with native rules supported out of the box. The developer experience is much better.

We have also had a recent thread in Slack that an OSS Starlark LSP is much desired by the community. So I hope this project gets funded 🤝

docs/project_proposal.md Show resolved Hide resolved
- code lens (provide contextual information and actions)
- code folding
- rename variables/functions

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some org such as SourceGraph could be interested in LSIF / SCIP integration.

Especially in bigger repositories where you have to index a lot of files to support "Find references", having the ability to pre-seed the LSP index with LSIF might be necessary.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jetbrain folks could be interested in including a BSP implementation as well.

| Why | Fancy features like go to definition, hover information and auto complete in your favorite editor/IDE. (see [Language Server Protocol] for more details) |
| Who | [Chuck Grindel] |
| When | As soon as we meet our funding goals. |
| Language | Implement the server using [Go]. |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are 3 reasonable choices for implementation:

  • Java, as part of Bazel JVM daemon

  • Go, as proposed here

  • Rust

It would be interesting to have a discussion regarding pros/cons of each approach. I am not convinced that Go is the best solution here given the Java approach is more tightly integrated with Bazel and Rust offers better performance and a more advanced parser (supporting type hint, which gona be needed to provide Code Completion down the line).

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps I should update the proposal to include my reasoning for using Go. My plan is to leverage Starlark Go and LSP in Go. Plus, it could be easier to leverage code from buildifier and bulldozer, given that they are implemented in Go.

About using Java, the only real pro would be to leverage existing Bazel code. However, I am leary of trying to hook into Bazel internals. It feels brittle.

About using Rust, TBH, I did not consider it. My focus has primarily been on Go to leverage existing libraries as I stated above. Not having done any implementation in Rust, I don't have a good feel for the benefits.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rust starlark implementation (by Meta) already included an LSP implementation that is mostly functional https://github.com/facebookexperimental/starlark-rust/tree/main/starlark/src/lsp. This is different from Buck2's own starlark LSP, which is extended this LSP to provide Buck2's native rules and builtin functions. So you could already use this LSP implementation today to navigate within 1 file in a Bazel project.

The LSP is also built on top of the same library that made rust-analyzer, the canonical LSP for rust ecosystem and battle-tested by Meta.

So I suspect you would get a much higher ROI by using Rust over Go in this regard.

About using Java, the only real pro would be to leverage existing Bazel code. However, I am leary of trying to hook into Bazel internals. It feels brittle.

I would say the same for buildifier and Starlark Go code base having worked with them in the past.
However, there is no denying that the source of truth IS Bazel's Java.


Please feel free to pick which ever is more comfortable to you. I think they are all good approaches with different tradeoffs.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sluongng Ah. I was not aware of the Rust Starlark LSP. I will check it out.

docs/project_proposal.md Outdated Show resolved Hide resolved

1. Initial server implementation plus support for go to definition and find references.
1. Support hover information.
1. Support auto completion.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above, if implemented now, autocompletion in Starlark would be, at best, the same with Python/JS without using type hinting: fuzzy and inaccurate.

You would only get better, more accurate completions if type hint is supported in current Bazel's Java Starlark implementation, similar to how the Rust implementation supports.

So the cost of implementing this phase should need to be accounted for early so that we could pick the right implementation with the lowest friction to accomplish the goal.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be challenging as well to provide Bazel's native rules autocompletion for different Bazel versions.

There isn't an API today in Bazel daemon that you could query for all the native rules definitions/attributes and use that as a basis for completion / displaying documentation on hovering. So if the implementation is independent of Bazel JVM daemon, I could only see this being hardcoded to a small subset of Bazel versions.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been a little worried that my estimates for auto-completion is low. 😬

There will be some issues with native rules. I think building in a database of signatures by Bazel version may be the only way forward. In fact, my estimate for getting this right for multiple Bazel versions is low.

docs/project_proposal.md Show resolved Hide resolved
@sluongng
Copy link

This is a separate point, but I would suggest including anonymized telemetry data by default.

As a free open-source project, it would help drive future funding and feature decisions related to the project.
There should be an opt-out config knob that lets organizations who are sensitive to data collection disable it if they wish to.

I think this should be established as early as phase 1 initial release.
So you might be interested to include some server-side ops cost to collect, store and analyze the telemetry data (i.e. Honeycomb/Datadog free-tier).


- linting
- formatting
- code lens (provide contextual information and actions)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I tried to do this, one of the main roadblocks was resolving third party rulesets to the correct versions in a non-blocking way.
As far as I understand it, one would need to either:

  • Have their own execution engine for repository rules, or
  • Rely on bazel sync calls to resolve workspace dependencies.

The second approach is the obvious one, but it has the drawback of blocking bazel for potentially long stretches. So, say that one:

  • opens a bazel workspace in VScode -> initial bazel sync starts.
  • Makes a small edit to one file
  • Wants to run bazel test //... to test everything because they have remote execution and they expect only one test to be run.

In that case, would we block until the bazel sync is complete? Preempt the sync command? Something else?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just like intellij needs to index a Maven project on startup(including download the maven dependencies), we gonna have to index the Bazel project with the LSP to provide a responsive UX (read "low latency").

This index could include:

  • Prefetching external deps onto disk (using bazel query //<target>)
  • Warm up Bazel JVM daemon
  • Build up reference map in-memory

with future configurations that (1) allow doing those lazily or in a background async process and (2) do things in smaller, fine-grained chunks.

But I think it's reasonable for an early implementation to assume a warm Bazel JVM with all external deps already available on disk. We could iterate over time based on the telemetry data I proposed above.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I planned to accept the cost of indexing the world for the first pass of the language server.

🤔 Thinking out loud, can we leverage the Bazel cache to cache the indexing results? If we wrote an aspect that defines actions to do the indexing, the results could be cached. The downside is that the indexing actions would be blocked or block the user's activity. I know that rules_xcodeproj handles this by creating separate output paths for indexing and more mainline activity.

@cgrindel
Copy link
Owner Author

I would suggest including anonymized telemetry data by default.

I thought about adding this to the proposal. I was worried that folks might be scared off and or might not want to help pay for it as part of the Kickstarter.

@manapointer
Copy link

manapointer commented Jul 1, 2023

Found this thread and would love to volunteer for this effort! I've started work on something similar, albeit in Rust, using the libraries underlying rust-analyzer (e.g. their LSP implementation, rowan for representing ASTs, salsa for incremental computation etc).

One thing that dissuaded me from continuing on the project was around macro support (e.g. hover for documentation and autocomplete for attributes), especially the pattern of wrapping rules with macros which from what I can tell is used by most popular rules libraries. I was planning on following Stardoc's lead and generating documentation for rules by carrying out a fake evaluation of build files and recording invocations to rule(), but that doesn't work in the case of macros because rule() doesn't actually get invoked when evaluating macro (function) definitions.

@cgrindel
Copy link
Owner Author

cgrindel commented Jul 2, 2023

@manapointer Were you working on the Meta rust language server or do you have your own?

@manapointer
Copy link

manapointer commented Jul 2, 2023

I was working on my own - got as far as starting work on Starlark type inference/initial IDE features like goto defs within a single file before putting the project down for a bit

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants