Add Gazelle Clojure plugin #84
Draft
miridius wants to merge 1 commit into
miridius added a commit that referenced this pull request on May 11, 2026
`{:k v}` was emitting as `{"k" : "v"}` (with space before colon);
buildifier canonicalizes to `{"k": "v"}` (no space).
Caught when comparing gen_srcs output against `buildifier`-formatted
BUILD files generated by the Gazelle plugin (#84) — the gen_srcs
output was the only thing buildifier wanted to rewrite.
miridius added a commit that referenced this pull request on May 11, 2026
The persistent-classloader-test `can-compile-and-GC` failed on the previous run — WeakReference still reachable. The test runs alongside the Go toolchain compile (#84 adds the Gazelle plugin), which puts extra GC pressure on the JVM. Retriggering to confirm flake.
Gazelle language plugin for Clojure, replacing `gen_srcs` with a proper
Gazelle extension: Go plugin + long-running Clojure parser subprocess
over newline-delimited JSON on stdio.
`gen_srcs` is a standalone `bazel run` target — no Gazelle lifecycle,
no directives, no incremental updates, no interop with other language
plugins. The plugin reuses the existing `gen-build/ns-rules` for rule
construction (AOT decisions, test-attr passthrough, clojure_library /
clojure_test / clojure_binary / java_library emission) so the same
Clojure code drives both gen_srcs and the plugin — Go just translates
`{:type :attrs}` specs into Gazelle `*rule.Rule`. Deps resolution
stays in Go's Resolve because Gazelle's cross-package index isn't
visible to the subprocess; per-rule static deps that don't need the
index are pre-merged Clojure-side and seeded into `depSet` so Resolve
adds only the intra-repo lookup + per-target deps_bazel overrides.
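
The seeding described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's code: `resolveDeps`, the `lookup` callback (standing in for Gazelle's cross-package index), and the example labels are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// resolveDeps seeds a set with the deps that were pre-merged on the Clojure
// side, then adds one label per required namespace that lookup can resolve.
// lookup is a hypothetical stand-in for Gazelle's cross-package index.
func resolveDeps(seed, requires []string, lookup func(ns string) (string, bool)) []string {
	depSet := make(map[string]struct{}, len(seed)+len(requires))
	for _, d := range seed {
		depSet[d] = struct{}{}
	}
	for _, ns := range requires {
		if label, ok := lookup(ns); ok {
			depSet[label] = struct{}{}
		}
	}
	out := make([]string, 0, len(depSet))
	for d := range depSet {
		out = append(out, d)
	}
	sort.Strings(out) // sorted output keeps generated BUILD files stable
	return out
}

func main() {
	index := map[string]string{"my.foo": "//src/my:foo"} // illustrative index contents
	lookup := func(ns string) (string, bool) {
		label, ok := index[ns]
		return label, ok
	}
	seed := []string{"@deps//:org_clojure_clojure"} // illustrative static dep
	fmt.Println(resolveDeps(seed, []string{"my.foo", "unknown.ns"}, lookup))
}
```

Using a set means a label that appears both in the seed and in the index lookup is emitted once.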
Verified against a large internal Clojure monorepo (~1300 BUILD
files): byte-identical to `gen_srcs` on every file gen_srcs touches.
The remaining content diffs are dirs gen_srcs ignores (`.circleci`,
`test-data`, etc.) where Gazelle additionally cleans up stale
`clojure_test` symbols from old loads — Gazelle being more thorough,
not a bug.
Plugin (gazelle/):
- `language.Language` + `LifecycleManager` + `Configurer` + `Resolver`.
- Generates `clojure_library`, `clojure_test`, `clojure_binary`,
`java_library`, and `__clj_lib` / `__clj_files` rollup targets.
- Directives: `clojure_enabled`, `clojure_deps_edn`,
`clojure_deps_repo`, `clojure_aliases`. Per-package
`clojure_deps_repo` overrides retag externally-resolved labels in
Resolve; pre-baked seed deps stay on the root tag since they're
resolved against the root deps.edn.
- Root-module-name auto-detect: reads `module(name = "...")` from
`MODULE.bazel` to canonicalize self-referencing labels
(`@<root>//foo` → `@@//foo`).
- Subdir rollup uses a bottom-up cache so `__clj_lib` membership is
O(1) per child instead of a WalkDir per directory.
- Intermediate-only directories (no direct `.clj`/`.cljs`/`.cljc`/`.js`
but `clojureSubdirPaths` non-empty) still get their `__clj_lib` /
`__clj_files` rollup so consumers of `//foo:__clj_lib` keep working
when `foo/` is an aggregator.
Failure semantics:
Parser startup, transport errors, per-file parse exceptions, walk
errors under `subdirHasClojureFiles`, malformed `deps_bazel` shapes,
unknown rule kinds, and a missing `rules_clojure` bzlmod dep all
`log.Fatalf`. A silent log-and-continue would return empty
`GenerateResult` for previously-rule-bearing packages, which Gazelle
interprets as "delete every rule" — a green run that scorches the
build graph is worse than a noisy exit. Clojure-side parse failures
travel back as tagged `{:error :file}` entries so the Go side aborts
explicitly instead of dropping the file group. The error message
includes the full exception cause-chain (class + message per link)
so the root-cause class isn't collapsed by an outer wrap.
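
As a hedged illustration of the "abort explicitly" behaviour for tagged error entries, the Go side could check each entry before building any rules. The struct below mirrors only the fields this sketch needs; its JSON tags and the `firstParseError` helper are assumptions, not the plugin's actual definitions.

```go
package main

import "fmt"

// NamespaceInfo carries only the fields needed here; the tags are assumed.
type NamespaceInfo struct {
	File  string `json:"file"`
	Error string `json:"error"`
}

// firstParseError returns a non-nil error as soon as any entry carries an
// error tag, so the caller can stop instead of emitting a partial result.
func firstParseError(dir string, infos []NamespaceInfo) error {
	for _, info := range infos {
		if info.Error != "" {
			return fmt.Errorf("clojure parse failed in %q for %s: %s", dir, info.File, info.Error)
		}
	}
	return nil
}

func main() {
	infos := []NamespaceInfo{
		{File: "core.clj"},
		{File: "broken.clj", Error: "example cause chain text"},
	}
	if err := firstParseError("src/my", infos); err != nil {
		// Per the description above, the plugin would call log.Fatalf here.
		fmt.Println("fatal:", err)
	}
}
```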
Clojure parser (`gazelle_server.clj`): long-running subprocess reusing
the existing `namespace.parse` and `gen_build` infrastructure. Init
once (deps.edn → basis is expensive); parse RPCs per-directory carry
the file list + workspace-relative subdir paths the rollup needs.
Rules stay keyword-shaped through aggregation; the wire conversion
(`rule-spec->wire`) is applied once at the response boundary so the
`__clj_lib` / `__clj_files` extraction reads `:name` / `:resources` as
keywords rather than strings. Resource-only `.clj` files paired with
a declaring `.cljs` sibling surface the cljs namespace instead of
silently dropping the group. Top-level `catch Throwable` (not
`Exception`) so AssertionError / OutOfMemoryError can't kill the
subprocess silently. The request loop validates parse requests
against an `s/keys` spec and returns a structured error envelope on
malformed input.
Wire types (`gazelle/clojureparser`):
- `InitResponse.DepsBazel` is a typed `DepsBazel{Deps map[label]
DepsBazelTarget}` rather than `map[string]interface{}`, so malformed
`:bazel` input fails at `json.Unmarshal` instead of being silently
dropped by nested type assertions in Resolve.
- `Platform` type alias used on `NamespaceInfo.Platforms` and
`Requires` keys for self-documentation.
- `Init` and `Parse` decode the response in a single pass via an
embedded `Message`-carrying envelope so `dep_ns_labels` payloads
aren't parsed twice.
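
The two ideas in this list (typed `DepsBazel` and a single-pass envelope) can be sketched together. This is an assumption-laden illustration: the `Envelope` name, all JSON tags, the `map[string]string` shape of `DepNsLabels`, and `decodeInit` are hypothetical; only the type names `DepsBazel` / `DepsBazelTarget` and the `{type:"error", message:"..."}` envelope come from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope holds the fields every response shares (names and tags assumed).
type Envelope struct {
	Type    string `json:"type"`
	Message string `json:"message,omitempty"`
}

type DepsBazelTarget struct {
	Deps []string `json:"deps"`
}

type DepsBazel struct {
	Deps map[string]DepsBazelTarget `json:"deps"`
}

// InitResponse embeds Envelope, so one json.Unmarshal call fills both the
// envelope fields and the payload.
type InitResponse struct {
	Envelope
	DepNsLabels map[string]string `json:"dep_ns_labels"` // value shape assumed
	DepsBazel   DepsBazel         `json:"deps_bazel"`
}

func decodeInit(line []byte) (*InitResponse, error) {
	var resp InitResponse
	if err := json.Unmarshal(line, &resp); err != nil {
		return nil, fmt.Errorf("decode init response: %w", err)
	}
	if resp.Type == "error" {
		return nil, fmt.Errorf("server error: %s", resp.Message)
	}
	return &resp, nil
}

func main() {
	good := `{"type":"init","deps_bazel":{"deps":{"//src/my:foo":{"deps":["//lib:extra"]}}}}`
	bad := `{"type":"init","deps_bazel":{"deps":{"//src/my:foo":"//lib:extra"}}}`

	if resp, err := decodeInit([]byte(good)); err == nil {
		fmt.Println("extra deps:", resp.DepsBazel.Deps["//src/my:foo"].Deps)
	}
	// A string where an object is expected fails at the decode boundary.
	if _, err := decodeInit([]byte(bad)); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

With `map[string]interface{}` the second payload would decode without complaint and the bad shape would only surface later, if at all, in a type assertion.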
Build / module:
- `MODULE.bazel`: `bazel_dep`s for `rules_go` (0.60.0) and `gazelle`
(0.47.0); set up `go_deps` from `//gazelle:go.mod` via the
`@gazelle//:extensions.bzl` extension. Replaces an old
WORKSPACE-based `http_archive` setup — WORKSPACE was emptied by the
bzlmod migration in #79. `gazelle` is pinned to 0.47.0 because
0.51.0 introduces a v2 module path that `go test` outside Bazel
can't resolve cleanly.
- Gazelle BUILD files use the bzlmod canonical repo names
(`@rules_go`, `@gazelle`).
- `gazelle/deps.bzl` deleted (WORKSPACE-era helper, dead under bzlmod).
- `target/` added to `.gitignore` (Go test artefacts).
- `bootstrap_gazelle_server.clj` reuses
`bootstrap_gen_build/nses-to-compile` to keep the two deploy-jar AOT
lists in sync.
- `CLOJURE_MAVEN_REPOSITORY` env var overrides `$HOME/.m2/repository`
so CI / containers can point at a pre-populated cache.
Tests:
- `gazelle/` (Go): wire-format translator (`buildRule` / `applyAttr`
with unknown-kind + nil-attr rejection); config tree walk; source
root prefix matching; subdir Clojure detection; `Imports` covering
intra-repo / AOT-fallback / multi-AOT; `mergeDepsBazelTargetDeps`;
`Configure` lifecycle including root-module-name discovery and
per-package directive overrides; subprocess lifecycle including
death-on-EOF / shutdown idempotency / malformed-JSON / error-
envelope detection; real-jar integration round-trip (gated on
`GAZELLE_INTEGRATION_TEST_REQUIRED=1` so CI fails loud on a missing
fixture instead of silently skipping).
- `test/rules_clojure/gazelle_server_test.clj`: wire read/write
round-trip, alias colon-stripping, per-platform parse, JS-only
groups, parse-failure error envelope, no-ns-form skip, resource-
only-with-cljs-sibling fallthrough, parse-before-init guard,
malformed-request validation, deterministic namespace ordering, and
`__clj_lib` / `__clj_files` rollup_rules coverage (clj-only, subdir-
only, empty, JS-only `:rules` populated).
- `test/rules_clojure/gen_build_test.clj` gains `test-path?` unit
coverage + `rollup-rules` unit tests.
Background
Bazel reads `BUILD.bazel` files to figure out what to build. In a large Clojure repo those files describe every `clojure_library` / `clojure_test` / `clojure_binary` target plus the deps between them. Hand-maintaining them at scale is impractical, so we generate them from the source tree.

We've been doing that with `gen_srcs`, a standalone `bazel run` target. You invoke it manually; it walks the repo, parses every Clojure namespace, computes deps, and writes BUILD files. It works, but it's outside the normal Bazel lifecycle: no incremental updates, no directives to control behavior per package, no way to interop with other languages also generating BUILD files in the same repo.

This PR replaces `gen_srcs` with a proper Gazelle plugin.

What's Gazelle?

Gazelle is a BUILD-file generator that runs as a Bazel build target. You write a language plugin (in Go, Gazelle's own implementation language) that tells it how to find your source files and what rules to emit. Gazelle handles the rest: directory walk, BUILD-file parsing, rule merging, cross-package dep resolution. Most modern Bazel rule sets ship a Gazelle plugin (Go, Java, Python, Rust, etc.) so users can keep their BUILD files in sync with one command.
Why a subprocess?
Gazelle plugins are Go code. But all our existing rule-construction logic (AOT decisions, test-attr passthrough, dep label generation) is in Clojure (`rules-clojure.gen-build`). Rewriting it in Go would mean two implementations of the same rules; they'd drift, and bugs in one wouldn't be caught by the other.

Instead, this plugin follows the Java Gazelle plugin architecture: the Go side is just glue. It starts a long-running Clojure subprocess and asks it questions over stdio. The Clojure side owns the rules.
    flowchart LR
        Bazel[bazel run //:gazelle] --> Plugin[Go plugin<br/>gazelle/]
        Plugin <-->|JSON lines<br/>on stdio| Server[Clojure server<br/>gazelle_server.clj]
        Server --> GenBuild[rules-clojure.gen-build<br/>shared with gen_srcs]
        Plugin --> Builds[BUILD.bazel files]

Life of a Gazelle run

Gazelle walks the repo and calls into the plugin at four hook points. Our plugin implements all four:
    sequenceDiagram
        participant G as Gazelle
        participant P as Go plugin
        participant S as Clojure server
        Note over G,S: Once per run, at startup
        G->>P: Configure(root)
        P->>S: spawn subprocess
        P->>S: {type:"init", deps_edn_path, ...}
        S-->>P: {dep_ns_labels, deps_bazel, source_paths, ...}
        Note over G,S: Once per package, bottom-up
        G->>P: Configure(pkg)
        P->>P: record per-package directives
        G->>P: GenerateRules(pkg)
        P->>S: {type:"parse", dir, files, clojure_subdir_paths}
        S->>S: read each (ns ...) form
        S->>S: call gen-build/ns-rules
        S-->>P: {namespaces, rollup_rules}
        P-->>G: clojure_library / clojure_test / __clj_lib / __clj_files
        Note over G,S: After all GenerateRules done
        G->>P: Resolve(rule)
        P->>P: intra-repo lookup + dep_ns_labels + deps_bazel
        P-->>G: rule.deps = [...]
        Note over G,S: At shutdown
        G->>P: AfterResolvingDeps
        P->>S: close stdin
        S-->>P: exits

- `Configure`: records per-package `# gazelle:` directives. On the root call, auto-discovers `deps.edn` + `MODULE.bazel` and starts the Clojure subprocess.
- `GenerateRules`: sends a `parse` request for the package and translates the returned `{kind, attrs}` specs into Gazelle `*rule.Rule` objects.
- `Resolve`: for each `clojure_library`, walks its `:require`s and fills in the rule's `:deps` against the cross-package index Gazelle built during indexing.
- `AfterResolvingDeps`: closes the subprocess's stdin so the server exits.

Wire protocol

The Clojure server speaks newline-delimited JSON on stdio. One request per line, one response per line. Two request types:
- `init`: sent once at startup. The response carries the Maven coord → Bazel label map (`dep_ns_labels`), `deps_bazel` overrides, and ignore/source paths.
- `parse`: sent once per package. The response carries the namespaces found plus the `__clj_lib` / `__clj_files` rollup rules.

A typed Go envelope embeds an optional `message` field so Init/Parse decode in a single `json.Unmarshal` pass; the same payload doesn't get parsed twice when `dep_ns_labels` is multi-MB. Malformed JSON surfaces with the `json.SyntaxError` offset and a window of bytes around it so big-response debugging isn't reduced to "first 200 bytes".

Why JSON over stdio (and not gRPC, sockets, etc.)

- […] `log.Fatalf`s immediately. No half-alive runner racing the next request.
- […] `bazel run //src/rules_clojure:gazelle_server` by hand and paste JSON at it.

Architecture: Clojure owns the rules

Rule construction lives in `rules-clojure.gen-build`, not in Go. That includes:

- […] (`:bazel/clojure_library` metadata on the `ns` form)
- Test-attr passthrough (`:bazel/clojure_test` for size/tags/timeout)
- `:require` → dep label mapping
- `clojure_binary` emission for `:bazel/clojure_binary` metadata
- `java_library` for `.js` files alongside `.cljs`
- `__clj_lib` / `__clj_files` rollup composition

Through the server, the Go side calls `gen-build/ns-rules` per basename group and `gen-build/rollup-rules` per package; both return `[{:type :clojure_library :attrs {...}} ...]` Clojure data. Go translates verbatim into Gazelle `*rule.Rule` via `buildRule` + `applyAttr`. The same Clojure code drives both `gen_srcs` and the plugin, so there is one source of truth.

What Go actually does

Three things only:
- Translate rule specs: `{:type :clojure_library :attrs {:name "core" ...}}` → `rule.NewRule("clojure_library", "core")` + `r.SetAttr("name", "core")` etc.
- For each `clojure_library` rule, walk its `:require`s and resolve each to a Bazel label:
  - Intra-repo lookup (resolves `(:require [my.foo])` to `//src/my:foo` when another package generated it).
  - `init`'s `dep_ns_labels` map (Maven coords resolved at startup).
  - `deps_bazel` (the user's `:bazel` map in `deps.edn`).

The static deps that don't need Gazelle's index (`org_clojure_clojure`, `clojure-library-args`, `ns-library-meta`, pre-resolved import-deps / gen-class-deps) are pre-merged Clojure-side and seeded into the dep set from the rule's existing `:deps`. Resolve only adds what genuinely needs the index.

Configuration

Per-package directives in BUILD-file comments: `clojure_enabled`, `clojure_deps_edn`, `clojure_deps_repo`, `clojure_aliases`.
Auto-discovered at startup:
- `deps.edn` at the workspace root (the source of dep coords).
- `MODULE.bazel` root module name, used to canonicalize self-referencing labels (`@<root>//foo` → `@@//foo`).

Important

Migrating from `gen_srcs`? `gen_srcs` takes aliases via CLI args (`:aliases [:bazel :cider :dev ...]`); the plugin reads them from a `# gazelle:clojure_aliases :bazel,:cider,:dev,...` directive in the root `BUILD.bazel`. Without this directive the plugin starts with no aliases active, which silently shrinks the basis and produces rules missing deps for namespaces that come from alias-gated artefacts (e.g. ClojureScript deps under a `:frontend` alias). Symptom: Gazelle's output is missing `@deps//:org_clojure_clojurescript` on cross-platform rules. Fix: add the directive matching your old `gen_srcs` invocation.

Failure semantics
To Gazelle, an empty `GenerateResult` for a previously rule-bearing package is indistinguishable from "delete every rule". A green run that scorches the build graph is worse than a noisy exit, so the plugin fails loud on:

- Parser startup failures and transport errors
- Per-file parse exceptions (tagged `{:error :file}` entries; the Go side `log.Fatalf`s on receipt)
- Walk errors under `subdirHasClojureFiles` (permission denied, broken symlink, etc.)
- Malformed `deps_bazel` shapes (typed at the `json.Unmarshal` boundary, not via runtime type-asserts)
- Unknown rule kinds (anything outside `clojure_library` / `clojure_test` / `clojure_binary` / `java_library` / `filegroup`)
- A missing `rules_clojure` `bazel_dep` in `MODULE.bazel`

On the Clojure side: the request loop catches `Throwable` (not just `Exception`) so `AssertionError` / `OutOfMemoryError` can't silently kill the subprocess, and the returned `:error` message includes the full exception cause-chain (class + message per link) so a wrapped `EOFException` doesn't collapse into a generic outer message.

Parse requests are validated against an `s/keys` spec at the server boundary, so a malformed wire payload returns a structured `{type:"error", message:"..."}` envelope instead of a cryptic `ClassCastException` deep in the parser pipeline.

Subdir rollup cache
Each package emits a `__clj_lib` / `__clj_files` rollup that aggregates the package's own rules plus the rollups of any Clojure-bearing subdirectories. Naively that's an O(n) `WalkDir` per package, which is quadratic across the tree.

Instead, Gazelle's bottom-up walk lets us record `hasClojureContent[rel]` for each visited package and consult it as an O(1) lookup when the parent gets generated. The plugin falls back to the on-disk walk for any subdir we somehow haven't visited yet (defensive; this shouldn't happen in normal Gazelle ordering).

Intermediate-only directories (no direct `.clj` / `.cljs` / `.cljc` / `.js` files, but Clojure-bearing subdirs) still emit their rollup rules so consumers of `//foo:__clj_lib` keep working when `foo/` is an aggregator with code only in `foo/bar/`.

Validation: byte-identical to gen_srcs
Verified against a large internal Clojure monorepo (~1300 BUILD files): the plugin produces byte-identical output to `gen_srcs` on every file `gen_srcs` touches.

The handful of remaining content diffs are in dirs `gen_srcs` ignores (`.circleci`, `test-data`, etc.), where Gazelle additionally cleans up stale `clojure_test` symbols from old `load(...)` statements. That is Gazelle being more thorough, not a bug.

Build / module changes
- `bazel_dep(name = "rules_go", version = "0.60.0")`
- `bazel_dep(name = "gazelle", version = "0.47.0")`: pinned to 0.47.0 because 0.51.0 introduces a v2 module path that `go test` outside Bazel can't resolve cleanly.
- `go_deps.from_file(go_mod = "//gazelle:go.mod")`: replaces an old WORKSPACE-based `http_archive` setup; WORKSPACE was emptied by #79.
- `gazelle/deps.bzl` deleted: WORKSPACE-era helper, dead under bzlmod.
- `target/` added to `.gitignore`: Go test artefacts.
- `bootstrap_gazelle_server.clj` reuses `bootstrap_gen_build/nses-to-compile`: keeps the two deploy-jar AOT lists in sync.
- `CLOJURE_MAVEN_REPOSITORY` env override: replaces the default under `$HOME/.m2` so CI / containers can point at a pre-populated cache.

Wire types (`gazelle/clojureparser`)

- `Runner`: […] (`exec.Cmd` + stdin/stdout pipes + a `bufio.Scanner` with a 10 MB line buffer). All methods serialize on an internal mutex; safe for concurrent use.
- `InitRequest` / `InitResponse`: […] `DepNsLabels` (Maven coord → Bazel label map), `DepsBazel` (per-target overrides), `IgnorePaths`, `SourcePaths`.
- `ParseRequest` / `ParseResponse`: […] `NamespaceInfo` per basename group plus the rollup rules for the directory.
- `RuleSpec`: the `{Kind, Attrs}` pair the Clojure server hands back. `Kind` is a closed set; `Attrs` is `map[string]interface{}` because the attr shape varies per kind.
- `NamespaceInfo`: […] `clj` / `cljs`), platforms set, rules to emit. `Error` is non-empty if the parse failed; `:require`s and `:ns` are absent for JS-only groups.
- `DepsBazel` / `DepsBazelTarget`: typed form of the user's `:bazel` map: `Deps map[label]DepsBazelTarget`, where `DepsBazelTarget{Deps []string}` lists extra labels to merge into a target's `:deps`. Other keys under `:bazel` decode silently and stay inert.

Tests
Go (`gazelle/`)

- Wire-format translator: `buildRule` / `applyAttr` happy path + unknown-kind, missing-name, non-string-name, nil-attr, and dict-valued-attr rejection.
- Root-module-name discovery from `MODULE.bazel`.
- `Imports`: intra-repo lookup, AOT-fallback for pre-existing rules, multi-AOT (one `ImportSpec` per `aot` entry, not just `aot[0]`).
- `mergeDepsBazelTargetDeps`: absent, happy path, unmatched label.
- `Configure` lifecycle: root + sub-package, extension-disable directive, alias parsing.
- Subprocess lifecycle (`lifecycle_test.go`, against a shell-script stub): `Init` rejects malformed JSON, `Init` rejects error envelopes, exchange marks the runner dead on EOF + short-circuits subsequent calls with `ErrRunnerDead`, `Shutdown` is idempotent and safe on a nil receiver.
- Real-jar integration round-trip (`TestInitRoundTrip` / `TestParseRoundTrip`): gated on `GAZELLE_INTEGRATION_TEST_REQUIRED=1` so CI fails loud on a missing fixture instead of silently skipping. Locally the skip is the right behavior because the jar may not be built yet.

Clojure (`test/rules_clojure/`)

- Wire read/write (`read-request` / `write-response`): JSON round-trip, EOF returns nil, multi-message stream.
- `strip-leading-colon`: handles `:foo` and `foo` shapes.
- `handle-parse` per-platform: `.clj`, `.cljc`, `.cljs`, JS-only, mixed clj+js basename collapse.
- `handle-parse` error paths: broken `.clj` returns tagged `:error` entry; resource-only `.clj` with no `(ns ...)` returns empty `:namespaces`; resource-only `.clj` paired with a declaring `.cljs` sibling surfaces the cljs namespace.
- `handle-request` dispatch: parse-before-init rejected with structured error; unknown request type rejected.
- `__clj_lib` / `__clj_files` rollup: clj-only, subdir-only, empty, JS-only `:rules` populated.
- Malformed-request `s/keys` spec validation.
- `rollup-rules` unit tests (`gen_build_test.clj`): empty, lib-only, subdirs-only, mixed.
- `test-path?` matrix: `.clj` / `.cljc` test files match, `.cljs` doesn't (`clojure_test` only runs on the JVM).