Add VelocityHealthCheck for Kubernetes readiness probe (follow-up to spike #35329) #35602

@dsolistorres

Description

Spike #35329 identified that the Velocity engine can enter a broken state where the global macro library fails to load while engine.init() reports success (see companion ticket #35601 for the underlying fix). There is currently no health check that would catch this state before K8s starts routing traffic to the pod.

dotCMS already has a health-check framework with readiness/liveness separation — see dotCMS/src/main/java/com/dotcms/health/api/HealthCheck.java and the existing CDI checks at dotCMS/src/main/java/com/dotcms/health/checks/cdi/{DatabaseHealthCheck, ElasticsearchHealthCheck, CacheHealthCheck}.java. A VelocityHealthCheck slots in alongside these.

The probe is simple: evaluate a tiny VTL string that invokes #renderMarks($null) and check whether the output still contains the literal string #renderMarks. An unresolved macro call is rendered verbatim, so finding the literal in the output means the macro was not resolved. The proof-of-concept logic already exists in the spike repro at VelocityUtil.verifyMacroLibraryLoaded() on branch spike-35329-velocity-repro and should be moved into the new health check.
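
The probe described above can be sketched without any dotCMS or Velocity dependency. This is only an illustration, not the code from the spike branch: MacroProbe and its renderer parameter are hypothetical names, and the UnaryOperator stands in for whatever call evaluates a template through VelocityUtil.getEngine(). Treating a thrown exception or a null result as "not loaded" is an assumption that the ticket does not specify.

```java
import java.util.function.UnaryOperator;

/** Hypothetical helper: decides whether the global macro library resolved. */
final class MacroProbe {
    /** Probe template named in the ticket. */
    static final String PROBE_TEMPLATE = "#renderMarks($null)";
    /** If this literal survives rendering, the macro was not resolved. */
    static final String UNRESOLVED_MARKER = "#renderMarks";

    private MacroProbe() {}

    /**
     * @param renderer stand-in for the engine: takes template text, returns rendered output
     * @return true when the rendered output no longer contains the macro call
     */
    static boolean macroLibraryLoaded(UnaryOperator<String> renderer) {
        final String output;
        try {
            output = renderer.apply(PROBE_TEMPLATE);
        } catch (RuntimeException e) {
            // Assumption: a rendering failure counts as "library not usable".
            return false;
        }
        // Assumption: a null result is also treated as a failure.
        return output != null && !output.contains(UNRESOLVED_MARKER);
    }
}
```

A renderer that echoes its input models the broken state, because the unresolved call comes back verbatim; a renderer that returns any text without the marker models a healthy engine.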

Acceptance Criteria

  • New class com.dotcms.health.checks.cdi.VelocityHealthCheck extends HealthCheckBase, registered via CDI alongside the other CDI health checks.
  • isReadinessCheck() returns true; isLivenessCheck() returns false (one-time init failures should not cause pod restart loops).
  • getName() returns "velocity".
  • getOrder() returns a value that places this check after the database and cache checks (they are prerequisites for Velocity).
  • performCheck() evaluates the probe template #renderMarks($null) through VelocityUtil.getEngine(). Returns DOWN when output contains literal #renderMarks; UP otherwise.
  • Probe completes in under 50 ms per call (acceptable for a K8s probe cadence of 5–10 s).
  • /readyz endpoint (already wired up in HealthProbeServlet) reflects DOWN when the macro library failed to load.
  • Unit test: with a mock engine where #renderMarks is registered, returns UP.
  • Unit test: with a mock engine that renders the probe input verbatim, returns DOWN.
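
The criteria above can be read as the following contract. This is a standalone sketch under stated assumptions, not the real class: it deliberately does not extend HealthCheckBase, because that class's signatures are not shown in this ticket. ProbeStatus, VelocityHealthCheckSketch, the renderer parameter, and both ASSUMED_* order values are hypothetical stand-ins.

```java
import java.util.function.UnaryOperator;

/** Stand-in for the framework's result type; the real type may differ. */
enum ProbeStatus { UP, DOWN }

/** Sketch of the acceptance criteria; does not use the real dotCMS health API. */
final class VelocityHealthCheckSketch {
    // Hypothetical order values; the real database and cache checks define their own.
    static final int ASSUMED_DATABASE_ORDER = 10;
    static final int ASSUMED_CACHE_ORDER = 20;

    private static final String PROBE_TEMPLATE = "#renderMarks($null)";
    private static final String UNRESOLVED_MARKER = "#renderMarks";

    /** Stand-in for evaluating a template through the Velocity engine. */
    private final UnaryOperator<String> renderer;

    VelocityHealthCheckSketch(UnaryOperator<String> renderer) {
        this.renderer = renderer;
    }

    String getName() { return "velocity"; }

    /** Sorts after both prerequisite checks. */
    int getOrder() { return Math.max(ASSUMED_DATABASE_ORDER, ASSUMED_CACHE_ORDER) + 10; }

    /** Gates traffic only; a one-time init failure should not restart the pod. */
    boolean isReadinessCheck() { return true; }
    boolean isLivenessCheck() { return false; }

    /** DOWN when the macro call comes back verbatim, UP otherwise. */
    ProbeStatus performCheck() {
        String output = renderer.apply(PROBE_TEMPLATE);
        return output.contains(UNRESOLVED_MARKER) ? ProbeStatus.DOWN : ProbeStatus.UP;
    }
}
```

The two unit tests in the list map onto this shape directly: construct the check with a renderer that returns output without the marker and expect UP, then with a renderer that echoes its input and expect DOWN.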

Additional Context

  • Companion ticket: #35601, "Velocity: fail loud when global macro library load fails (follow-up to spike #35329)", covering the velocimacro.library.fail-on-missing flag. Together the two tickets form defense-in-depth: the fail-loud flag prevents the broken state on most failures, and the health check keeps traffic away from any pod that still slips through.
  • Cloud team: Confirm with the cloud team whether additional steps are required to use the new VelocityHealthCheck in the cloud environment infrastructure.
  • File added (expected): dotCMS/src/main/java/com/dotcms/health/checks/cdi/VelocityHealthCheck.java plus tests.

Metadata

  • Status: In Review
  • Milestone: none
  • Development: no branches or pull requests