gen_srcs: drop data = [] / srcs = [] noise from BUILD output#98

Merged
miridius merged 1 commit into main from dave/gen-srcs-drop-empty-arrays on May 18, 2026

@miridius miridius commented May 17, 2026

Problem

gen_srcs always emits data = [] and srcs = [] on filegroups that have no subdir-data or no in-dir sources:

filegroup(
    name = "__clj_files",
    srcs = ["foo.clj"],
    data = [],            # noise
)

Buildifier doesn't strip empty list attrs and humans never write them, so every generated BUILD file carries a few attributes that mean nothing.

Solution

emit-bazel-kwargs now skips entries whose value is an empty collection before rendering:

(->> (:x kwargs)
     (remove (fn [[_ v]] (and (coll? v) (empty? v))))
     sort-kwargs-entries
     ...)

Same machinery, two fewer lines per filegroup output.
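
As a rough illustration of the filter above, here is a minimal, self-contained sketch. It is not the project's actual `emit-bazel-kwargs`: the names `empty-coll?`, `prune-kwargs`, `render-value` and `render-rule` are hypothetical, the kwargs are passed as a plain map rather than read from `(:x kwargs)`, and alphabetical ordering stands in for `sort-kwargs-entries`.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: true for an empty collection such as [] or {}.
;; Strings are not collections in Clojure, so "" is not treated as empty here.
(defn empty-coll? [v]
  (and (coll? v) (empty? v)))

;; Drop entries whose value is an empty collection, then order them.
;; Assumption: sorting by key name stands in for the real sort-kwargs-entries.
(defn prune-kwargs [kwargs]
  (->> kwargs
       (remove (fn [[_ v]] (empty-coll? v)))
       (sort-by (comp name key))))

;; Render a value as Starlark; only strings and collections of strings
;; are handled in this sketch.
(defn render-value [v]
  (if (coll? v)
    (str "[" (str/join ", " (map pr-str v)) "]")
    (pr-str v)))

;; Render a rule call with one attribute per line.
(defn render-rule [rule kwargs]
  (str rule "(\n"
       (str/join (for [[k v] (prune-kwargs kwargs)]
                   (str "    " (name k) " = " (render-value v) ",\n")))
       ")"))

(println (render-rule "filegroup"
                      {:name "__clj_files" :srcs ["foo.clj"] :data []}))
```

With the `:data []` entry in the input map, the printed rule contains only the `name` and `srcs` lines. Filtering before rendering keeps the rendering code unaware of the rule, which is why the change stays small.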

Verified

Regenerating a downstream consumer's BUILD files via bazel run //:gen_srcs wrote 996 BUILD files; the entire diff is:

911 -    data = [],
 85 -    srcs = [],

i.e. exactly the empty-attr lines, no semantic changes.

Related: #99 (independent fix surfaced by the same investigation).

`filegroup(name="…", srcs=[…], data=[])` and similar carried noise:
`data = []` / `srcs = []` declare nothing and just bloat the file.
emit-bazel-kwargs now skips entries whose value is an empty collection,
so generated BUILD files match what a human would write.

Reduces churn on the gazelle-plugin migration in particular: the bb
parser already omits empty attrs, and without this fix every banksy
filegroup with no subdir-data showed up as a diff (`-    data = [],`)
on the cross-cutting plugin PR.

Co-authored-by: Claude <noreply@anthropic.com>
@miridius miridius force-pushed the dave/gen-srcs-drop-empty-arrays branch from f146146 to 4c8998f Compare May 17, 2026 07:25
@miridius miridius marked this pull request as ready for review May 17, 2026 07:56
@miridius miridius requested a review from a team May 17, 2026 07:56
miridius added a commit that referenced this pull request May 17, 2026
The Go plugin spawns a long-lived `bb gazelle_server.bb` subprocess and
speaks newline-JSON over its stdio. The bb script bypasses tools.deps
entirely: it reads deps.edn as EDN, parses `@deps/BUILD.bazel` directly,
and caches per-jar ns scans on disk keyed by sha256 of the BUILD file +
the active `:bazel :no-aot` set.

Why
- Cold start ~30ms vs JVM ~1s. No daemon mode needed.
- 3.5–4x faster full-repo regen than `gen_srcs` on banksy (~1.15s warm
  vs ~4.5s) plus a new sub-second path-scoped mode for on-file-save
  workflows.
- edamame on jar contents catches reader-conditional / macro-heavy
  CLJS files that tools.reader silently dropped — see PR #99 for the
  matching fix in gen_srcs itself.

Key implementation details
- `scan-jar` uses `with-open` (closes the JarFile on exception),
  reduce+transients (avoids per-entry atom CAS), and `str/ends-with?` +
  `subs` for jar-entry classification (~20x faster than per-entry
  regex on banksy's 686k entries).
- `make-lazy-src-ns-resolver` emits `//src-path:leaf` for top-level
  namespaces with no parent segment — the previous `//src-path/:leaf`
  shape was an invalid Bazel label.
- `parse-deps-build` block-walks `@deps/BUILD.bazel` and uses re-seq
  for the AOT name list (re-find captured only the first ns).
- `load-or-build-cache` writes data files first, sha sentinel last; a
  crash between the two writes leaves no sha and forces a rebuild rather
  than a matching-sha + truncated-data combination. Cache key includes the
  `:bazel :no-aot` set since it participates in label selection.
- `ns->dep-label` / `class->dep-label` honour the configured
  `deps-repo-tag` rather than hardcoding `@deps//:`.
- `read-request` returns a sentinel for malformed JSON so the server
  can respond with an error envelope and continue, not die.
- `handle-init` validates that `deps.edn` exists + parses, with the
  path attached to the error.
- `find-output-base`'s on-disk cache is invalidated when the cached
  output_base's `external/` dir is missing (covers --output_base
  switches and bazel version changes).
- `clojure_binary` declares `MergeableAttrs` for `runtime_deps`,
  `args`, `jvm_flags` so user edits reconcile with regenerated
  values on re-run.
- `run_bb_test.sh` honours `GAZELLE_INTEGRATION_TEST_REQUIRED=1` —
  fails loud when `bb` is missing rather than silently skipping.
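
The write ordering described for `load-or-build-cache` in the list above (data first, sha sentinel last) can be sketched as follows. This is an illustration under assumptions, not the code in `gazelle_server.bb`: the function names `write-cache!` and `read-cache` and the file names `data.edn` and `sha` are hypothetical.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.java.io :as io])

;; Write the payload first and the sha sentinel last. If the process dies
;; between the two writes, no sentinel exists and the cache reads as a miss.
(defn write-cache! [dir sha data]
  (spit (io/file dir "data.edn") (pr-str data))
  (spit (io/file dir "sha") sha))

;; Return the cached data only when the sentinel exists and matches the
;; expected sha; otherwise return nil so the caller rebuilds.
(defn read-cache [dir sha]
  (let [sha-file (io/file dir "sha")]
    (when (and (.exists sha-file)
               (= sha (slurp sha-file)))
      (edn/read-string (slurp (io/file dir "data.edn"))))))
```

A caller would compute the key (in the commit's description, the sha256 of the BUILD file combined with the `:bazel :no-aot` set), try `read-cache`, and fall back to a rescan followed by `write-cache!` when it returns nil.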

Removed
- `RepositoryDir` and `RootModuleName` from the wire protocol. Neither
  was consumed by the bb server; the latter was documented as
  "self-label canonicalization" but no canonicalization existed
  anywhere in the codebase.
- `src/rules_clojure/gazelle_server.clj` (JVM parser superseded by bb).
- `src/rules_clojure/bootstrap_gazelle_server.clj` and its genrule
  pipeline.
- `test/rules_clojure/gazelle_server_test.clj` (JVM-side tests).

Verified on banksy: parity with `gen_srcs` modulo the cljs.spec.alpha /
cljs-branch-of-timbre / etc. additions which are bb-side correctness
wins. PR #98 (`drop empty array attrs`) and PR #99
(`honour :features for reader conditionals`) close the gen_srcs gaps
on the other side.

Co-authored-by: Claude <noreply@anthropic.com>
miridius added a commit that referenced this pull request May 17, 2026
The Go plugin spawns a long-lived `bb gazelle_server.bb` subprocess and
speaks newline-JSON over its stdio. The bb script bypasses tools.deps
entirely: it reads deps.edn as EDN, parses `@deps/BUILD.bazel` directly,
and caches per-jar ns scans on disk keyed by sha256 of the BUILD file +
the active `:bazel :no-aot` set.

Why
- Cold start ~30ms vs JVM ~1s. No daemon mode needed.
- 3.5–4x faster full-repo regen than `gen_srcs` on banksy (~1.15s warm
  vs ~4.5s) plus a new sub-second path-scoped mode for on-file-save
  workflows.
- edamame on jar contents catches reader-conditional / macro-heavy
  CLJS files that tools.reader silently dropped — see PR #99 for the
  matching fix in gen_srcs itself.

Key implementation details
- `scan-jar` uses `with-open` (closes the JarFile on exception),
  reduce+transients (avoids per-entry atom CAS), and `str/ends-with?` +
  `subs` for jar-entry classification (~20x faster than per-entry
  regex on banksy's 686k entries).
- `make-lazy-src-ns-resolver` emits `//src-path:leaf` for top-level
  namespaces with no parent segment — the previous `//src-path/:leaf`
  shape was an invalid Bazel label.
- `parse-deps-build` block-walks `@deps/BUILD.bazel` and uses re-seq
  for the AOT name list (re-find captured only the first ns).
- `load-or-build-cache` writes data files first, sha sentinel last; a
  crash between leaves no sha and forces a rebuild rather than a
  matching-sha + truncated-data combination. Cache key includes the
  `:bazel :no-aot` set since it participates in label selection.
- `ns->dep-label` / `class->dep-label` honour the configured
  `deps-repo-tag` rather than hardcoding `@deps//:`.
- `read-request` returns a sentinel for malformed JSON so the server
  can respond with an error envelope and continue, not die.
- `handle-init` validates that `deps.edn` exists + parses, with the
  path attached to the error.
- `find-output-base`'s on-disk cache is invalidated when the cached
  output_base's `external/` dir is missing (covers --output_base
  switches and bazel version changes).
- `clojure_binary` declares `MergeableAttrs` for `runtime_deps`,
  `args`, `jvm_flags` so user edits reconcile with regenerated
  values on re-run.
- `run_bb_test.sh` honours `GAZELLE_INTEGRATION_TEST_REQUIRED=1` —
  fails loud when `bb` is missing rather than silently skipping.

Removed
- `RepositoryDir` and `RootModuleName` from the wire protocol. Neither
  was consumed by the bb server; the latter was documented as
  "self-label canonicalization" but no canonicalization existed
  anywhere in the codebase.
- `src/rules_clojure/gazelle_server.clj` (JVM parser superseded by bb).
- `src/rules_clojure/bootstrap_gazelle_server.clj` and its genrule
  pipeline.
- `test/rules_clojure/gazelle_server_test.clj` (JVM-side tests).

Verified on banksy: parity with `gen_srcs` modulo the cljs.spec.alpha /
cljs-branch-of-timbre / etc. additions which are bb-side correctness
wins. PR #98 (`drop empty array attrs`) and PR #99
(`honour :features for reader conditionals`) close the gen_srcs gaps
on the other side.

Co-authored-by: Claude <noreply@anthropic.com>
miridius added a commit that referenced this pull request May 17, 2026
The Go plugin spawns a long-lived `bb gazelle_server.bb` subprocess and
speaks newline-JSON over its stdio. The bb script bypasses tools.deps
entirely: it reads deps.edn as EDN, parses `@deps/BUILD.bazel` directly,
and caches per-jar ns scans on disk keyed by sha256 of the BUILD file +
the active `:bazel :no-aot` set.

Why
- Cold start ~30ms vs JVM ~1s. No daemon mode needed.
- 3.5–4x faster full-repo regen than `gen_srcs` on banksy (~1.15s warm
  vs ~4.5s) plus a new sub-second path-scoped mode for on-file-save
  workflows.
- edamame on jar contents catches reader-conditional / macro-heavy
  CLJS files that tools.reader silently dropped — see PR #99 for the
  matching fix in gen_srcs itself.

Key implementation details
- `scan-jar` uses `with-open` (closes the JarFile on exception),
  reduce+transients (avoids per-entry atom CAS), and `str/ends-with?` +
  `subs` for jar-entry classification (~20x faster than per-entry
  regex on banksy's 686k entries).
- `make-lazy-src-ns-resolver` emits `//src-path:leaf` for top-level
  namespaces with no parent segment — the previous `//src-path/:leaf`
  shape was an invalid Bazel label.
- `parse-deps-build` block-walks `@deps/BUILD.bazel` and uses re-seq
  for the AOT name list (re-find captured only the first ns).
- `load-or-build-cache` writes data files first, sha sentinel last; a
  crash between leaves no sha and forces a rebuild rather than a
  matching-sha + truncated-data combination. Cache key includes the
  `:bazel :no-aot` set since it participates in label selection.
- `ns->dep-label` / `class->dep-label` honour the configured
  `deps-repo-tag` rather than hardcoding `@deps//:`.
- `read-request` returns a sentinel for malformed JSON so the server
  can respond with an error envelope and continue, not die.
- `handle-init` validates that `deps.edn` exists + parses, with the
  path attached to the error.
- `find-output-base`'s on-disk cache is invalidated when the cached
  output_base's `external/` dir is missing (covers --output_base
  switches and bazel version changes).
- `clojure_binary` declares `MergeableAttrs` for `runtime_deps`,
  `args`, `jvm_flags` so user edits reconcile with regenerated
  values on re-run.
- `run_bb_test.sh` honours `GAZELLE_INTEGRATION_TEST_REQUIRED=1` —
  fails loud when `bb` is missing rather than silently skipping.

Removed
- `RepositoryDir` and `RootModuleName` from the wire protocol. Neither
  was consumed by the bb server; the latter was documented as
  "self-label canonicalization" but no canonicalization existed
  anywhere in the codebase.
- `src/rules_clojure/gazelle_server.clj` (JVM parser superseded by bb).
- `src/rules_clojure/bootstrap_gazelle_server.clj` and its genrule
  pipeline.
- `test/rules_clojure/gazelle_server_test.clj` (JVM-side tests).

Verified on banksy: parity with `gen_srcs` modulo the cljs.spec.alpha /
cljs-branch-of-timbre / etc. additions which are bb-side correctness
wins. PR #98 (`drop empty array attrs`) and PR #99
(`honour :features for reader conditionals`) close the gen_srcs gaps
on the other side.

Co-authored-by: Claude <noreply@anthropic.com>
miridius added a commit that referenced this pull request May 17, 2026
The Go plugin spawns a long-lived `bb gazelle_server.bb` subprocess and
speaks newline-JSON over its stdio. The bb script bypasses tools.deps
entirely: it reads deps.edn as EDN, parses `@deps/BUILD.bazel` directly,
and caches per-jar ns scans on disk keyed by sha256 of the BUILD file +
the active `:bazel :no-aot` set.

Why
- Cold start ~30ms vs JVM ~1s. No daemon mode needed.
- 3.5–4x faster full-repo regen than `gen_srcs` on banksy (~1.15s warm
  vs ~4.5s) plus a new sub-second path-scoped mode for on-file-save
  workflows.
- edamame on jar contents catches reader-conditional / macro-heavy
  CLJS files that tools.reader silently dropped — see PR #99 for the
  matching fix in gen_srcs itself.

Key implementation details
- `scan-jar` uses `with-open` (closes the JarFile on exception),
  reduce+transients (avoids per-entry atom CAS), and `str/ends-with?` +
  `subs` for jar-entry classification (~20x faster than per-entry
  regex on banksy's 686k entries).
- `make-lazy-src-ns-resolver` emits `//src-path:leaf` for top-level
  namespaces with no parent segment — the previous `//src-path/:leaf`
  shape was an invalid Bazel label.
- `parse-deps-build` block-walks `@deps/BUILD.bazel` and uses re-seq
  for the AOT name list (re-find captured only the first ns).
- `load-or-build-cache` writes data files first, sha sentinel last; a
  crash between leaves no sha and forces a rebuild rather than a
  matching-sha + truncated-data combination. Cache key includes the
  `:bazel :no-aot` set since it participates in label selection.
- `ns->dep-label` / `class->dep-label` honour the configured
  `deps-repo-tag` rather than hardcoding `@deps//:`.
- `read-request` returns a sentinel for malformed JSON so the server
  can respond with an error envelope and continue, not die.
- `handle-init` validates that `deps.edn` exists + parses, with the
  path attached to the error.
- `find-output-base`'s on-disk cache is invalidated when the cached
  output_base's `external/` dir is missing (covers --output_base
  switches and bazel version changes).
- `clojure_binary` declares `MergeableAttrs` for `runtime_deps`,
  `args`, `jvm_flags` so user edits reconcile with regenerated
  values on re-run.
- `run_bb_test.sh` honours `GAZELLE_INTEGRATION_TEST_REQUIRED=1` —
  fails loud when `bb` is missing rather than silently skipping.

Removed
- `RepositoryDir` and `RootModuleName` from the wire protocol. Neither
  was consumed by the bb server; the latter was documented as
  "self-label canonicalization" but no canonicalization existed
  anywhere in the codebase.
- `src/rules_clojure/gazelle_server.clj` (JVM parser superseded by bb).
- `src/rules_clojure/bootstrap_gazelle_server.clj` and its genrule
  pipeline.
- `test/rules_clojure/gazelle_server_test.clj` (JVM-side tests).

Verified on banksy: parity with `gen_srcs` modulo the cljs.spec.alpha /
cljs-branch-of-timbre / etc. additions which are bb-side correctness
wins. PR #98 (`drop empty array attrs`) and PR #99
(`honour :features for reader conditionals`) close the gen_srcs gaps
on the other side.

Co-authored-by: Claude <noreply@anthropic.com>
miridius added a commit that referenced this pull request May 17, 2026
The Go plugin spawns a long-lived `bb gazelle_server.bb` subprocess and
speaks newline-JSON over its stdio. The bb script bypasses tools.deps
entirely: it reads deps.edn as EDN, parses `@deps/BUILD.bazel` directly,
and caches per-jar ns scans on disk keyed by sha256 of the BUILD file +
the active `:bazel :no-aot` set.

Why
- Cold start ~30ms vs JVM ~1s. No daemon mode needed.
- 3.5–4x faster full-repo regen than `gen_srcs` on banksy (~1.15s warm
  vs ~4.5s) plus a new sub-second path-scoped mode for on-file-save
  workflows.
- edamame on jar contents catches reader-conditional / macro-heavy
  CLJS files that tools.reader silently dropped — see PR #99 for the
  matching fix in gen_srcs itself.

Key implementation details
- `scan-jar` uses `with-open` (closes the JarFile on exception),
  reduce+transients (avoids per-entry atom CAS), and `str/ends-with?` +
  `subs` for jar-entry classification (~20x faster than per-entry
  regex on banksy's 686k entries).
- `make-lazy-src-ns-resolver` emits `//src-path:leaf` for top-level
  namespaces with no parent segment — the previous `//src-path/:leaf`
  shape was an invalid Bazel label.
- `parse-deps-build` block-walks `@deps/BUILD.bazel` and uses re-seq
  for the AOT name list (re-find captured only the first ns).
- `load-or-build-cache` writes data files first, sha sentinel last; a
  crash between leaves no sha and forces a rebuild rather than a
  matching-sha + truncated-data combination. Cache key includes the
  `:bazel :no-aot` set since it participates in label selection.
- `ns->dep-label` / `class->dep-label` honour the configured
  `deps-repo-tag` rather than hardcoding `@deps//:`.
- `read-request` returns a sentinel for malformed JSON so the server
  can respond with an error envelope and continue, not die.
- `handle-init` validates that `deps.edn` exists + parses, with the
  path attached to the error.
- `find-output-base`'s on-disk cache is invalidated when the cached
  output_base's `external/` dir is missing (covers --output_base
  switches and bazel version changes).
- `clojure_binary` declares `MergeableAttrs` for `runtime_deps`,
  `args`, `jvm_flags` so user edits reconcile with regenerated
  values on re-run.
- `run_bb_test.sh` honours `GAZELLE_INTEGRATION_TEST_REQUIRED=1` —
  fails loud when `bb` is missing rather than silently skipping.

Removed
- `RepositoryDir` and `RootModuleName` from the wire protocol. Neither
  was consumed by the bb server; the latter was documented as
  "self-label canonicalization" but no canonicalization existed
  anywhere in the codebase.
- `src/rules_clojure/gazelle_server.clj` (JVM parser superseded by bb).
- `src/rules_clojure/bootstrap_gazelle_server.clj` and its genrule
  pipeline.
- `test/rules_clojure/gazelle_server_test.clj` (JVM-side tests).

Verified on banksy: parity with `gen_srcs` modulo the cljs.spec.alpha /
cljs-branch-of-timbre / etc. additions which are bb-side correctness
wins. PR #98 (`drop empty array attrs`) and PR #99
(`honour :features for reader conditionals`) close the gen_srcs gaps
on the other side.

Co-authored-by: Claude <noreply@anthropic.com>
miridius added a commit that referenced this pull request May 17, 2026
The Go plugin spawns a long-lived `bb gazelle_server.bb` subprocess and
speaks newline-JSON over its stdio. The bb script bypasses tools.deps
entirely: it reads deps.edn as EDN, parses `@deps/BUILD.bazel` directly,
and caches per-jar ns scans on disk keyed by sha256 of the BUILD file +
the active `:bazel :no-aot` set.

Why
- Cold start ~30ms vs JVM ~1s. No daemon mode needed.
- 3.5–4x faster full-repo regen than `gen_srcs` on banksy (~1.15s warm
  vs ~4.5s) plus a new sub-second path-scoped mode for on-file-save
  workflows.
- edamame on jar contents catches reader-conditional / macro-heavy
  CLJS files that tools.reader silently dropped — see PR #99 for the
  matching fix in gen_srcs itself.

Key implementation details
- `scan-jar` uses `with-open` (closes the JarFile on exception),
  reduce+transients (avoids per-entry atom CAS), and `str/ends-with?` +
  `subs` for jar-entry classification (~20x faster than per-entry
  regex on banksy's 686k entries).
- `make-lazy-src-ns-resolver` emits `//src-path:leaf` for top-level
  namespaces with no parent segment — the previous `//src-path/:leaf`
  shape was an invalid Bazel label.
- `parse-deps-build` block-walks `@deps/BUILD.bazel` and uses re-seq
  for the AOT name list (re-find captured only the first ns).
- `load-or-build-cache` writes data files first, sha sentinel last; a
  crash between leaves no sha and forces a rebuild rather than a
  matching-sha + truncated-data combination. Cache key includes the
  `:bazel :no-aot` set since it participates in label selection.
- `ns->dep-label` / `class->dep-label` honour the configured
  `deps-repo-tag` rather than hardcoding `@deps//:`.
- `read-request` returns a sentinel for malformed JSON so the server
  can respond with an error envelope and continue, not die.
- `handle-init` validates that `deps.edn` exists + parses, with the
  path attached to the error.
- `find-output-base`'s on-disk cache is invalidated when the cached
  output_base's `external/` dir is missing (covers --output_base
  switches and bazel version changes).
- `clojure_binary` declares `MergeableAttrs` for `runtime_deps`,
  `args`, `jvm_flags` so user edits reconcile with regenerated
  values on re-run.
- `run_bb_test.sh` honours `GAZELLE_INTEGRATION_TEST_REQUIRED=1` —
  fails loud when `bb` is missing rather than silently skipping.

Removed
- `RepositoryDir` and `RootModuleName` from the wire protocol. Neither
  was consumed by the bb server; the latter was documented as
  "self-label canonicalization" but no canonicalization existed
  anywhere in the codebase.
- `src/rules_clojure/gazelle_server.clj` (JVM parser superseded by bb).
- `src/rules_clojure/bootstrap_gazelle_server.clj` and its genrule
  pipeline.
- `test/rules_clojure/gazelle_server_test.clj` (JVM-side tests).

Verified on banksy: parity with `gen_srcs` modulo the cljs.spec.alpha /
cljs-branch-of-timbre / etc. additions which are bb-side correctness
wins. PR #98 (`drop empty array attrs`) and PR #99
(`honour :features for reader conditionals`) close the gen_srcs gaps
on the other side.

Co-authored-by: Claude <noreply@anthropic.com>
miridius added a commit that referenced this pull request May 17, 2026
The Go plugin spawns a long-lived `bb gazelle_server.bb` subprocess and
speaks newline-JSON over its stdio. The bb script bypasses tools.deps
entirely: it reads deps.edn as EDN, parses `@deps/BUILD.bazel` directly,
and caches per-jar ns scans on disk keyed by sha256 of the BUILD file +
the active `:bazel :no-aot` set.

Why
- Cold start ~30ms vs JVM ~1s. No daemon mode needed.
- 3.5–4x faster full-repo regen than `gen_srcs` on banksy (~1.15s warm
  vs ~4.5s) plus a new sub-second path-scoped mode for on-file-save
  workflows.
- edamame on jar contents catches reader-conditional / macro-heavy
  CLJS files that tools.reader silently dropped — see PR #99 for the
  matching fix in gen_srcs itself.

Key implementation details
- `scan-jar` uses `with-open` (closes the JarFile on exception),
  reduce+transients (avoids per-entry atom CAS), and `str/ends-with?` +
  `subs` for jar-entry classification (~20x faster than per-entry
  regex on banksy's 686k entries).
- `make-lazy-src-ns-resolver` emits `//src-path:leaf` for top-level
  namespaces with no parent segment — the previous `//src-path/:leaf`
  shape was an invalid Bazel label.
- `parse-deps-build` block-walks `@deps/BUILD.bazel` and uses re-seq
  for the AOT name list (re-find captured only the first ns).
- `load-or-build-cache` writes data files first, sha sentinel last; a
  crash between leaves no sha and forces a rebuild rather than a
  matching-sha + truncated-data combination. Cache key includes the
  `:bazel :no-aot` set since it participates in label selection.
- `ns->dep-label` / `class->dep-label` honour the configured
  `deps-repo-tag` rather than hardcoding `@deps//:`.
- `read-request` returns a sentinel for malformed JSON so the server
  can respond with an error envelope and continue, not die.
- `handle-init` validates that `deps.edn` exists + parses, with the
  path attached to the error.
- `find-output-base`'s on-disk cache is invalidated when the cached
  output_base's `external/` dir is missing (covers --output_base
  switches and bazel version changes).
- `clojure_binary` declares `MergeableAttrs` for `runtime_deps`,
  `args`, `jvm_flags` so user edits reconcile with regenerated
  values on re-run.
- `run_bb_test.sh` honours `GAZELLE_INTEGRATION_TEST_REQUIRED=1` —
  fails loud when `bb` is missing rather than silently skipping.

Removed
- `RepositoryDir` and `RootModuleName` from the wire protocol. Neither
  was consumed by the bb server; the latter was documented as
  "self-label canonicalization" but no canonicalization existed
  anywhere in the codebase.
- `src/rules_clojure/gazelle_server.clj` (JVM parser superseded by bb).
- `src/rules_clojure/bootstrap_gazelle_server.clj` and its genrule
  pipeline.
- `test/rules_clojure/gazelle_server_test.clj` (JVM-side tests).

Verified on banksy: parity with `gen_srcs` modulo the cljs.spec.alpha /
cljs-branch-of-timbre / etc. additions which are bb-side correctness
wins. PR #98 (`drop empty array attrs`) and PR #99
(`honour :features for reader conditionals`) close the gen_srcs gaps
on the other side.

Co-authored-by: Claude <noreply@anthropic.com>
miridius added a commit that referenced this pull request May 17, 2026
The Go plugin spawns a long-lived `bb gazelle_server.bb` subprocess and
speaks newline-JSON over its stdio. The bb script bypasses tools.deps
entirely: it reads deps.edn as EDN, parses `@deps/BUILD.bazel` directly,
and caches per-jar ns scans on disk keyed by sha256 of the BUILD file +
the active `:bazel :no-aot` set.

Why
- Cold start ~30ms vs JVM ~1s. No daemon mode needed.
- 3.5–4x faster full-repo regen than `gen_srcs` on banksy (~1.15s warm
  vs ~4.5s) plus a new sub-second path-scoped mode for on-file-save
  workflows.
- edamame on jar contents catches reader-conditional / macro-heavy
  CLJS files that tools.reader silently dropped — see PR #99 for the
  matching fix in gen_srcs itself.

Key implementation details
- `scan-jar` uses `with-open` (closes the JarFile on exception),
  reduce+transients (avoids per-entry atom CAS), and `str/ends-with?` +
  `subs` for jar-entry classification (~20x faster than per-entry
  regex on banksy's 686k entries).
- `make-lazy-src-ns-resolver` emits `//src-path:leaf` for top-level
  namespaces with no parent segment — the previous `//src-path/:leaf`
  shape was an invalid Bazel label.
- `parse-deps-build` block-walks `@deps/BUILD.bazel` and uses re-seq
  for the AOT name list (re-find captured only the first ns).
- `load-or-build-cache` writes data files first, sha sentinel last; a
  crash between leaves no sha and forces a rebuild rather than a
  matching-sha + truncated-data combination. Cache key includes the
  `:bazel :no-aot` set since it participates in label selection.
- `ns->dep-label` / `class->dep-label` honour the configured
  `deps-repo-tag` rather than hardcoding `@deps//:`.
- `read-request` returns a sentinel for malformed JSON so the server
  can respond with an error envelope and continue, not die.
- `handle-init` validates that `deps.edn` exists + parses, with the
  path attached to the error.
- `find-output-base`'s on-disk cache is invalidated when the cached
  output_base's `external/` dir is missing (covers --output_base
  switches and bazel version changes).
- `clojure_binary` declares `MergeableAttrs` for `runtime_deps`,
  `args`, `jvm_flags` so user edits reconcile with regenerated
  values on re-run.
- `run_bb_test.sh` honours `GAZELLE_INTEGRATION_TEST_REQUIRED=1` —
  fails loud when `bb` is missing rather than silently skipping.

Removed
- `RepositoryDir` and `RootModuleName` from the wire protocol. Neither
  was consumed by the bb server; the latter was documented as
  "self-label canonicalization" but no canonicalization existed
  anywhere in the codebase.
- `src/rules_clojure/gazelle_server.clj` (JVM parser superseded by bb).
- `src/rules_clojure/bootstrap_gazelle_server.clj` and its genrule
  pipeline.
- `test/rules_clojure/gazelle_server_test.clj` (JVM-side tests).

Verified on banksy: parity with `gen_srcs` modulo the cljs.spec.alpha /
cljs-branch-of-timbre / etc. additions which are bb-side correctness
wins. PR #98 (`drop empty array attrs`) and PR #99
(`honour :features for reader conditionals`) close the gen_srcs gaps
on the other side.

Co-authored-by: Claude <noreply@anthropic.com>
miridius added a commit that referenced this pull request May 17, 2026
The Go plugin spawns a long-lived `bb gazelle_server.bb` subprocess and
speaks newline-JSON over its stdio. The bb script bypasses tools.deps
entirely: it reads deps.edn as EDN, parses `@deps/BUILD.bazel` directly,
and caches per-jar ns scans on disk keyed by sha256 of the BUILD file +
the active `:bazel :no-aot` set.

Why
- Cold start ~30ms vs JVM ~1s. No daemon mode needed.
- 3.5–4x faster full-repo regen than `gen_srcs` on banksy (~1.15s warm
  vs ~4.5s) plus a new sub-second path-scoped mode for on-file-save
  workflows.
- edamame on jar contents catches reader-conditional / macro-heavy
  CLJS files that tools.reader silently dropped — see PR #99 for the
  matching fix in gen_srcs itself.

Key implementation details
- `scan-jar` uses `with-open` (closes the JarFile on exception),
  reduce+transients (avoids per-entry atom CAS), and `str/ends-with?` +
  `subs` for jar-entry classification (~20x faster than per-entry
  regex on banksy's 686k entries).
- `make-lazy-src-ns-resolver` emits `//src-path:leaf` for top-level
  namespaces with no parent segment — the previous `//src-path/:leaf`
  shape was an invalid Bazel label.
- `parse-deps-build` block-walks `@deps/BUILD.bazel` and uses re-seq
  for the AOT name list (re-find captured only the first ns).
- `load-or-build-cache` writes data files first, sha sentinel last; a
  crash between leaves no sha and forces a rebuild rather than a
  matching-sha + truncated-data combination. Cache key includes the
  `:bazel :no-aot` set since it participates in label selection.
- `ns->dep-label` / `class->dep-label` honour the configured
  `deps-repo-tag` rather than hardcoding `@deps//:`.
- `read-request` returns a sentinel for malformed JSON so the server
  can respond with an error envelope and continue, not die.
- `handle-init` validates that `deps.edn` exists + parses, with the
  path attached to the error.
- `find-output-base`'s on-disk cache is invalidated when the cached
  output_base's `external/` dir is missing (covers --output_base
  switches and bazel version changes).
- `clojure_binary` declares `MergeableAttrs` for `runtime_deps`,
  `args`, `jvm_flags` so user edits reconcile with regenerated
  values on re-run.
- `run_bb_test.sh` honours `GAZELLE_INTEGRATION_TEST_REQUIRED=1` —
  fails loud when `bb` is missing rather than silently skipping.

Removed
- `RepositoryDir` and `RootModuleName` from the wire protocol. Neither
  was consumed by the bb server; the latter was documented as
  "self-label canonicalization" but no canonicalization existed
  anywhere in the codebase.
- `src/rules_clojure/gazelle_server.clj` (JVM parser superseded by bb).
- `src/rules_clojure/bootstrap_gazelle_server.clj` and its genrule
  pipeline.
- `test/rules_clojure/gazelle_server_test.clj` (JVM-side tests).

Verified on banksy: parity with `gen_srcs` modulo the cljs.spec.alpha /
cljs-branch-of-timbre / etc. additions which are bb-side correctness
wins. PR #98 (`drop empty array attrs`) and PR #99
(`honour :features for reader conditionals`) close the gen_srcs gaps
on the other side.

Co-authored-by: Claude <noreply@anthropic.com>
miridius added a commit that referenced this pull request May 17, 2026
The Go plugin spawns a long-lived `bb gazelle_server.bb` subprocess and
speaks newline-JSON over its stdio. The bb script bypasses tools.deps
entirely: it reads deps.edn as EDN, parses `@deps/BUILD.bazel` directly,
and caches per-jar ns scans on disk keyed by sha256 of the BUILD file +
the active `:bazel :no-aot` set.

Why
- Cold start ~30ms vs JVM ~1s. No daemon mode needed.
- 3.5–4x faster full-repo regen than `gen_srcs` on banksy (~1.15s warm
  vs ~4.5s) plus a new sub-second path-scoped mode for on-file-save
  workflows.
- edamame on jar contents catches reader-conditional / macro-heavy
  CLJS files that tools.reader silently dropped — see PR #99 for the
  matching fix in gen_srcs itself.

Key implementation details
- `scan-jar` uses `with-open` (closes the JarFile on exception),
  reduce+transients (avoids per-entry atom CAS), and `str/ends-with?` +
  `subs` for jar-entry classification (~20x faster than per-entry
  regex on banksy's 686k entries).
- `make-lazy-src-ns-resolver` emits `//src-path:leaf` for top-level
  namespaces with no parent segment — the previous `//src-path/:leaf`
  shape was an invalid Bazel label.
- `parse-deps-build` block-walks `@deps/BUILD.bazel` and uses re-seq
  for the AOT name list (re-find captured only the first ns).
- `load-or-build-cache` writes data files first, sha sentinel last; a
  crash between the two writes leaves no sha and forces a rebuild,
  rather than leaving a matching sha next to truncated data. The cache
  key includes the `:bazel :no-aot` set since it participates in label
  selection.
- `ns->dep-label` / `class->dep-label` honour the configured
  `deps-repo-tag` rather than hardcoding `@deps//:`.
- `read-request` returns a sentinel for malformed JSON so the server
  can respond with an error envelope and continue, not die.
- `handle-init` validates that `deps.edn` exists + parses, with the
  path attached to the error.
- `find-output-base`'s on-disk cache is invalidated when the cached
  output_base's `external/` dir is missing (covers --output_base
  switches and bazel version changes).
- `clojure_binary` declares `MergeableAttrs` for `runtime_deps`,
  `args`, `jvm_flags` so user edits reconcile with regenerated
  values on re-run.
- `run_bb_test.sh` honours `GAZELLE_INTEGRATION_TEST_REQUIRED=1` —
  fails loud when `bb` is missing rather than silently skipping.

Removed
- `RepositoryDir` and `RootModuleName` from the wire protocol. Neither
  was consumed by the bb server; the latter was documented as
  "self-label canonicalization" but no canonicalization existed
  anywhere in the codebase.
- `src/rules_clojure/gazelle_server.clj` (JVM parser superseded by bb).
- `src/rules_clojure/bootstrap_gazelle_server.clj` and its genrule
  pipeline.
- `test/rules_clojure/gazelle_server_test.clj` (JVM-side tests).

Verified on banksy: parity with `gen_srcs` modulo the cljs.spec.alpha /
cljs-branch-of-timbre / etc. additions which are bb-side correctness
wins. PR #98 (`drop empty array attrs`) and PR #99
(`honour :features for reader conditionals`) close the gen_srcs gaps
on the other side.

Co-authored-by: Claude <noreply@anthropic.com>
@miridius miridius merged commit 4a8f483 into main May 18, 2026
1 check passed
miridius added a commit that referenced this pull request May 18, 2026
Bugs fixed:
- parse-ns-form / read-ns-from-jar-entry now use edamame/parse-string-all
  so a leading (set! *warn-on-reflection* true) before (ns ...) no longer
  hides the namespace.
- gen-dir's rollup-rules :lib-deps now uses emitted rule :name attrs
  rather than path basenames, so an ns-binary-meta :name override still
  produces valid deps.
- gazelle/clojureparser.Runner.Shutdown sets dead=true to short-circuit a
  subsequent Parse that would otherwise race a closed stdin.
- receive() copies the scanner buffer before returning so callers can
  hold the slice safely.
- find-output-base throws on non-zero bazel info exit instead of
  returning empty and propagating a misleading "@deps/BUILD.bazel not
  found under /external" later.
- handle-parse throws when no source-path matches rel-dir (was silently
  emitting rules with resource_strip_prefix="").
- Cache transit-read is wrapped so a corrupt cache file (truncated
  transit, partial-write from a killed prior run) triggers a clean
  rebuild instead of crashing handle-init.
- applyAttr rejects non-integer floats for int Bazel attrs (would have
  silently truncated).
- ClojureExtensionDirective renamed to ClojureEnabledDirective so the
  Go-side name matches the user-facing directive value.

Test improvements:
- parse-deps-build-multi-aot-entries now builds a real jar so all three
  AOT namespaces actually appear in clj-ns->label (was a smoke test).
- resolve-deps-build-override + probe-bzlmod-deps-build extracted as
  testable helpers; new tests for canonical / apparent / missing
  branches and the override existence checks.
- New tests for ApparentLoads (remapped module, missing module fatal,
  Loads/ApparentLoads delegation) and subdirHasClojureFiles walkErr
  fatal path.
- TestImportsRuleNsHit now asserts ImportSpec.Lang.
- TestGenerateRulesFatalsOnParserDeath pins the actual fatal message.
- Exception-chain ExecutionException-without-cause now asserts the
  wrapper message is preserved.
- fatal-error-detection JDK-class-hierarchy tautology removed.

Code quality:
- Dead `resolved` atom in resolve-ns-deps dropped.
- basename / file-ext delegate to babashka.fs/strip-ext and /extension.
- rule-spec->wire uses update-keys.
- handle-parse threads rel-dir into resolve-ns-deps so the
  unresolved-requires warning shows which directory.

Comments and docs:
- Go comments referencing rule construction now point at
  gazelle_server.bb's ns-rules, not gen-build / gen_build.clj.
- DepsEdn() docstring corrected (absolute path, not workspace-relative).
- Platform-keys cross-reference fixed to Platform* constants.
- Cache key docstring at top of gazelle_server.bb names all three
  inputs (BUILD content + format version + no-aot set).
- "pick the platform-appropriate one and stop" replaced with the actual
  behaviour (both labels emitted, map semantics dedupe).
- emdashes in docstrings / println output replaced with parens per
  project convention.
- "Bazel built-ins" comment for java_library qualified.
- clojure_test rule kind declares MergeableAttrs for env/tags/jvm_flags/
  size/timeout so user edits aren't clobbered.

Test infrastructure:
- bb tests guard their entry point with (when (= *file* ...)) so
  load-file callers don't trigger System/exit.
- CircleCI installs babashka and sets GAZELLE_INTEGRATION_TEST_REQUIRED=1
  so the bb-side integration tests can't silently no-op in CI.

PR description (#100):
- Drop "13 bb-side unit tests" (actual count is 42).
- Clarify relation to #98 / #99 (this PR is standalone; parity claims
  assume those land).
- "Alternative to gen_srcs" rather than "Replaces".