Skip to content

feat: honor spec.director.shard.{by,healthy} in cluster VCL (#36)#41

Merged
jensens merged 1 commit intomainfrom
feat/shard-by-healthy-plumbing
Apr 18, 2026
Merged

feat: honor spec.director.shard.{by,healthy} in cluster VCL (#36)#41
jensens merged 1 commit intomainfrom
feat/shard-by-healthy-plumbing

Conversation

@jensens
Copy link
Copy Markdown
Member

@jensens jensens commented Apr 18, 2026

Summary

spec.director.shard.by (HASH|URL) and spec.director.shard.healthy (CHOSEN|ALL) were persisted on the CR and defaulted by the webhook, but the generator hardcoded by=URL in vcl_recv.vcl.tmpl:20 and omitted healthy entirely — so users setting these fields saw no effect.

Fix

Thread both values through TemplateData (ClusterShardBy / ClusterShardHealthy) and emit them in the cluster-peer director's request-time .backend() call:

set req.backend_hint = <director>.backend(by=<by>[, healthy=<hl>]);
  • by defaults to URL — preserves pre-fix behaviour.
  • healthy, when unset, is omitted so Varnish uses its own default.

Tests

  • TestGenerate_ClusterShardBy_DefaultsToURL — default path, no healthy= leaked.
  • TestGenerate_ClusterShardBy_HonorsSpecOverridespec.director.shard.by: HASH lands.
  • TestGenerate_ClusterShardHealthy_HonorsSpecOverridespec.director.shard.healthy: ALL lands alongside by=.

Full go test ./... green.

Out of scope

Per-backend directors (default: shard) leave by/healthy to the user's snippet — they are request-time .backend() arguments and there is no operator-emitted default routing for multi-backend configs. Users who want per-backend by/healthy must pass them in their own vcl_recv snippet, e.g.:

set req.backend_hint = plone.backend(by=HASH, healthy=ALL);

Closes #36

🤖 Generated with Claude Code

Previously spec.director.shard.by and spec.director.shard.healthy
were persisted on the CR and defaulted by the webhook, but the
generator hardcoded by=URL in vcl_recv and omitted healthy
entirely — so users setting these fields saw no effect.

Thread both through TemplateData and into the cluster-peer
director's .backend() call at request time:

  set req.backend_hint = <director>.backend(by=<by>[, healthy=<hl>]);

- by defaults to URL (matches pre-fix behaviour).
- healthy, when unset, is omitted so Varnish uses its own default.

Per-backend directors (default: shard) still leave by/healthy to the
user's snippet — they're request-time .backend() args and there is
no operator-emitted default routing for multi-backend configs.

Closes #36
@jensens jensens force-pushed the feat/shard-by-healthy-plumbing branch from f5a8e03 to b191b06 Compare April 18, 2026 13:32
@jensens jensens merged commit 0f1b0ff into main Apr 18, 2026
8 checks passed
@jensens jensens deleted the feat/shard-by-healthy-plumbing branch April 18, 2026 13:40
jensens added a commit that referenced this pull request Apr 18, 2026
…ting

Critical, blocking-for-v0.4.2 issues raised by the end-to-end code
review of the combined #40 + #41 + #42 changeset:

- C1: cmd/operator/main.go never read POD_IP, so the OperatorIP field
  that #40 plumbed through the generator was always empty in
  production. Added os.Getenv("POD_IP") at reconciler construction
  so the PURGE ACL actually contains the operator pod IP and
  invalidation-proxy requests pass the ACL.

- C2: the BAN handler emitted the deprecated bare ban() form — the
  project's own architecture docs explicitly require std.ban().
  Switched to std.ban() and added a std.ban_error() check that
  returns synth(400) with the compile error when the X-Vinyl-Ban
  expression is malformed. Previously a bad expression silently
  returned 200 with no invalidation. Regression-guard assertion
  locks in the shape.

- I2: ban-lurker-friendly x-url / x-host headers were only emitted
  when HasXkey. With BAN enabled but xkey disabled, users writing
  ban("obj.http.x-url ~ ...") would find the header missing and
  fall back to req.* bans that accumulate O(n*m). Widened the gate
  to {{- if or .HasXkey .HasBAN }} in both vcl_backend_response and
  vcl_deliver.

- I3: trim on the BAN block (cosmetic — no stray blank line).

- I4: docs/sources/reference/vinylcache-spec.md updated:
  * shard.by/healthy note clarifies cluster-peer is auto, per-backend
    is user-snippet.
  * New invalidation rows for ban.enabled, ban.allowedSources,
    ban.rateLimitPerMinute (latter explicitly marked inert).
  * Security note + usage example for the X-Vinyl-Ban header.

- M1/M5: BAN-handler test now extracts the handler block and asserts
  on that substring (was: flat-file substring matches that could
  match noise elsewhere). Added TestGenerate_BAN_BanLurkerHeadersEmitted
  and TestGenerate_BAN_RateLimit_CurrentlyInert to lock in the
  deferred rate-limit state.

- E2E: vcl-validation cleanup timeout bumped from default (30s) to
  60s — the flake that reappeared in #42's CI run was the same
  symptom this test hit before the v0.4.0 release.

Refs #18 #36 #39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Plumb shard.by / shard.healthy through to per-backend director VCL

1 participant