An oft-requested feature is an improved Tiltfile editing experience in IDEs/editors.
Specifically, for the scope of this spec, we'll consider two high-level aspects:
- Syntax highlighting
- Advanced IDE features (autocomplete, go to definition, etc.)
Explicitly out of scope here is any kind of integration with tilt up, as was done for the vscode-tilt-status extension.
The focus here is on language-level (Starlark) static analysis.
Our primary target is Visual Studio Code due to its ubiquity and robust extension system. Additionally, it's available as a target for embedding on the web, which makes it attractive to enable in-app Tiltfile editing in the future. (Currently, this is not strictly a requirement or planned, but it has surfaced in product conversations multiple times and a POC was built in summer 2021.)
When making technical decisions, however, it's important to consider the broader IDE ecosystem to enable additional extensions for other IDEs & editors while avoiding creating an unsustainable maintenance burden. More concretely, we should write as little vendor-specific code as possible.
The secondary target is the JetBrains family of IDEs (IntelliJ, GoLand, PyCharm, etc). Note that these are referred to uniformly under the umbrella of "IntelliJ" throughout the remainder of this document, as this is the underlying base for all JetBrains IDEs.
All other targets, including popular terminal-based editors like vim or Emacs, are excluded for the moment.
However, whenever possible, we should make choices that keep broad editor support feasible in the future, without bikeshedding or building things we don't yet need (YAGNI).
The TextMate Grammar format is popularly used as an interchange format for language syntax highlighting.
There is a robust, official TextMate Grammar for Starlark available in the syntaxes/ subdirectory of the vscode-bazel extension.
VSCode uses this natively as outlined in the VSCode Extension Syntax Highlighting Guide.
IntelliJ has native support for loading TextMate bundles. However, this is not the mechanism extensions are expected to use; JetBrains has their own codegen tool, Grammar-Kit, to convert a BNF to a compatible parser.
Given that an official TextMate grammar exists and that VSCode requires one anyway, the TextMate grammar is sufficient for now. As we won't have a native IntelliJ extension right now, we can provide instructions for IntelliJ users to load the TextMate bundle. In the future, if we develop an IntelliJ extension, we can investigate programmatically registering TextMate bundles or alternative approaches.
Historically, providing advanced IDE features has meant a lot of code duplication and maintenance, as there was no standard and IDEs are written in a variety of languages.
Currently, there are no Starlark (or derivatives, e.g. Bazel) open-source IDE extensions using either LSP or vendor-specific techniques.
In recent years, Microsoft has championed the Language Server Protocol, which is attractive for several reasons:
- Language analysis logic can be written in any language
- Async/non-blocking API (prevent editor freezes)
- Reduces vendor-specific extension code
- Capability based (do not have to support all possible features)
(These are not the only benefits, e.g. there are some nice security/stability side effects from running the LSP server as a separate process.)
VSCode has leaned heavily into using LSP for language extensions. As a result, the VSCode LSP client handles everything already - no bridge code between LSP<>VSCode is required.
This makes LSP extremely attractive for our use case. First, it opens the possibility of re-use of the LSP server for future IDE extensions. Additionally, and perhaps more importantly, it dramatically simplifies development of the VSCode extension in particular, as vscode-languageserver-node handles everything for the editor UX and is battle-tested.
IntelliJ does not have robust first-party LSP support. The lsp4intellij project provides a bridge between LSP<>IntelliJ for extensions (similar to vscode-languageserver-node). Its maintenance/development status is questionable as of February 2022. Additionally, there is a generic intellij-lsp plugin, so a motivated user could configure it manually to use our LSP server now. Its maintenance/development status is similarly questionable as of February 2022.
While editor support beyond VSCode is still immature, LSP is quickly growing in popularity, especially as it makes it practical to support "niche" languages such as Tiltfile-Starlark.
Even in a VSCode-only world, LSP is a desirable target because of the abstractions provided. This will reduce the amount of error-prone UI interfacing code we need to write and dramatically simplify testing.
The remainder of this document will explore nuances of developing an LSP server.
LSP is a relatively recent standard (the first public release was in 2017) for a complex topic (programming language analysis). As a result, the ecosystem is a bit chaotic: there are often multiple competing LSP servers for a given language and high-level documentation can be sparse.
Luckily, the LSP specification itself is extremely readable, and the JSON-RPC2 protocol makes introspection human-friendly.
The go.lsp.dev project provides a JSON-RPC2 server and client implementation in addition to Go structs for the LSP wire messages.
It's an exported version of the inaccessible implementation used for the official Go LSP server (gopls) from golang.org/x/tools/internal/lsp.
Note: it appears to have diverged at this point; stewardship and affiliation (if any) to Go team is unclear as of February 2022.
Developing an LSP in Go is a natural choice:
- Possible to integrate directly into Tilt (tilt lsp command), eliminating extra dependencies
- Tilt team familiarity with Go
- Official Starlark parser implementation exists
  - See caveats in Starlark Parsing section below
While writing code, it's very normal for the file to be syntactically invalid. This is often at odds with language parsers, which strictly follow a formal grammar.
Microsoft wrote a fantastic overview of lessons learned from writing a fault tolerant PHP parser for use in VSCode.
The starlark-go parser stops at the first error it encounters.
Furthermore, it does this by calling panic() and then recover() rather than using more idiomatic Go error handling.
That said, the starlark-go parsing code is hand-written and roughly 1000 LOC, so creating a fault-tolerant version is not insurmountable. It's unlikely we'd be able to upstream this, but the Starlark language spec is a slow-moving target, so the ongoing maintenance burden here would be minimal.
Another approach worth exploring is to generate a Tree-sitter grammar. Tree-sitter is designed for fast parsing, incremental updates, and graceful error recovery (all important properties for an LSP server). It's written in dependency-free C, which makes bindings practical for a variety of targets, including Go and WASM.
Microsoft even uses it in the vscode-anycode extension, which provides generic LSP features using any Tree-sitter AST.
Tree-sitter also has highlighting functionality that can be used to provide semantic syntax highlighting that's more accurate than the TextMate-based regex approach. (This would also be useful for a future IntelliJ extension: see the notes in the Syntax Highlighting section.)
We should write our LSP in Go.
On the LSP protocol & communication: the LSP Go packages, while under-documented, are feature complete and real world tested via gopls.
(Note: this is specifically in reference to the protocol & JSON-RPC2 communication aspects, not the actual language analysis logic in gopls.)
On Starlark analysis: there are no viable Starlark parsers written in another language: both the Go and Java implementations have extremely strict, hand-rolled parsers, and the Rust implementation is now maintained under the facebookexperimental org and has some syntax differences.
The most compelling, modern generic parser/AST library, Tree-sitter, is written in C, and has bindings to many languages, including Go.
We can use the existing Python Tree-sitter grammar to prototype Tiltfile LSP functionality (e.g. autocomplete).
If successful, we can adapt the Python Tree-sitter grammar to the Starlark dialect.
If unsuccessful, we can experiment with forking the starlark-go parser and making it more lenient.
A Tilt user showed off their VSCode setup in tilt-dev/tilt#4734:
- VSCode Python extension
- Tilt API definitions (api.py) converted to Python type stubs (.pyi)
- Magic import statement in Tiltfile to trigger code completion
  - Not valid Starlark; must be removed before Tiltfile is executed
This configuration as-is has two big downsides:
- Potentially incorrect/misleading syntax error reporting where Starlark and Python differ
- Necessity of magic import statement that must be manually added and then removed while editing
If we were to adopt this as a more general Tilt-sanctioned approach, it would also necessitate users having a functional Python installation, as the underlying analysis/parser code is written in Python. Beyond being an extra requirement, the end-user Python packaging and distribution ecosystem is fraught with issues, which will inevitably increase the support burden.
Furthermore, while it's likely a Tilt extension taking this approach could obviate the need for a magic import statement, it's unlikely we'd be able to adequately reconcile semantic/syntactical differences between Python and Starlark.
For example, Python uses import while Starlark has load (in addition to the Tilt-specific load_dynamic and include functions).
To handle cross-file symbol resolution properly, the underlying Python tooling would presumably need to be forked (and then maintained).