perf: enable Gradle build cache and test parallelism to lower CI time #371

Merged: fdelbrayelle merged 3 commits into main from perf/lower-ci-time on Jun 3, 2026
@fdelbrayelle (Member)

Problem

CI takes ~23 minutes. The dominant costs are:

  • ./gradlew check shadowJar --parallel: ~11 min (Gradle check across 18 submodules)
  • Maven Central publish: ~6 min (inherent network latency, unchanged)
  • Disk cleanup in external action: ~4 min (unchanged)

Root causes

  1. No Gradle build cache: gradle/actions/setup-gradle@v5 already persists ~/.gradle/caches (including build-cache-1/) across CI runs, but org.gradle.caching=true was never set, so the cache sits unused.
  2. org.gradle.parallel is only set on the CI command line: local dev builds are sequential by default.
  3. No maxParallelForks: each subproject's tests run in a single JVM, even on multi-core machines.

Changes

| File | Change | Impact |
| --- | --- | --- |
| gradle.properties | org.gradle.caching=true | Activates the local build cache already persisted by gradle/actions/setup-gradle. On a PR that touches a single module, the other 17 modules get full cache-hits (compile + test skipped). |
| gradle.properties | org.gradle.parallel=true | Makes parallel task execution the default for local dev (CI already passes --parallel). |
| build.gradle | maxParallelForks = max(1, processors / 2) | Halves per-module test time on 4-core+ machines; evaluates to 1 on the current 2-core runner (no regression). |
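
The maxParallelForks rule from the Changes table can be checked with a few lines of plain Java. This is only a sketch of the arithmetic, not the project's build script: the class name `ForkCount` and the method `forksFor` are hypothetical, and it assumes that `processors` is the core count reported by the JVM and that the division is integer division.

```java
final class ForkCount {
    // Hypothetical helper mirroring max(1, processors / 2) with integer division.
    static int forksFor(int processors) {
        return Math.max(1, processors / 2);
    }

    public static void main(String[] args) {
        // Assumption: "processors" is what the JVM reports for the current machine.
        int available = Runtime.getRuntime().availableProcessors();
        System.out.println("this machine: " + forksFor(available) + " fork(s)");

        // A few fixed core counts, including the 2-core and 4-core cases from the table.
        for (int cores : new int[] {1, 2, 4, 8}) {
            System.out.println(cores + " cores -> " + forksFor(cores) + " fork(s)");
        }
    }
}
```

Under these assumptions a 2-core runner yields 1 fork and a 4-core runner yields 2, which matches the "no regression" and "halves per-module test time" claims in the table. The lower bound of 1 also covers a single-core machine, where the integer division alone would give 0.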

Expected impact

  • Typical PR (touches 1 module): Gradle check drops from ~11 min → ~2–3 min (17/18 modules cache-hit). Total CI: ~23 min → ~13–14 min.
  • PR touching plugin-script base (all 17 language plugins invalidated): Minimal improvement on first run; cache populated for the next run.
  • Local dev: Sequential → parallel builds by default.

Follow-ups (not in this PR)

  • Larger runner: plugins.yml accepts a parameterized runner input (default ubuntu-latest, 2-core). Passing a 4-core runner label would double --parallel throughput and make maxParallelForks=2 effective in CI. Needs confirmation of the available runner label in the Kestra GitHub org.
  • Configuration cache: org.gradle.configuration-cache=true could save ~1 min of project configuration overhead, but net.researchgate.release has known incompatibilities — needs testing.
  • Develocity remote build cache: Re-enabling develocity-injection-enabled in kestra-io/actions/composite/setup-build/action.yml would extend caching across machines/branches for cross-PR cache hits.

- org.gradle.caching=true activates the local build cache, which is
  already persisted across runs by gradle/actions/setup-gradle@v5.
  On a typical PR touching one module, the other 17 modules get full
  cache-hits and are skipped entirely.
- org.gradle.parallel=true makes this the default for local dev too
  (CI already passes --parallel on the command line).
- maxParallelForks halves test execution time per module on 4-core+
  machines (evaluates to 1 on the current 2-core runner, so no
  regression there).

@github-actions (bot, Contributor) commented Jun 3, 2026

🧪 Java Unit Tests

| Report | Tests | Passed ✅ | Skipped | Failed | Time ⏱ |
| --- | --- | --- | --- | --- | --- |
| Java Tests Report | 222 ran | 222 ✅ | 0 ⚠️ | 0 ❌ | 25m 59s 194ms |

📦 Artifacts

| Name | Size | Updated | Expiration |
| --- | --- | --- | --- |
| jar | 56.66 MB | Jun 3, 26, 2:54:45 PM UTC | Jun 10, 26, 2:54:42 PM UTC |

🔁 Unreleased Commits

8 commits since v1.8.0

| SHA | Title | Author | Date |
| --- | --- | --- | --- |
| c83f4e1 | docs(plugin-scripts): add how-to docs (#367) | AJ Emerich | May 27, 26, 10:30:10 AM UTC |
| 00f6dac | chore: normalize gradlew.bat line endings | François Delbrayelle | May 28, 26, 12:55:47 PM UTC |
| c4edbb5 | fix(triggers): prevent evaluate() exceptions from permanently blocking the scheduler (#369) | François Delbrayelle | Jun 3, 26, 11:54:53 AM UTC |
| e90d61a | chore(deps): bump gradle-wrapper from 9.5.0 to 9.5.1 (#365) | dependabot[bot] | Jun 3, 26, 11:56:10 AM UTC |
| 7de9f79 | chore(deps): bump com.gradleup.shadow from 9.4.1 to 9.4.2 (#370) | dependabot[bot] | Jun 3, 26, 11:56:31 AM UTC |
| 94b057a | fix(triggers): restore Process runner to fix NPE in Docker-based evaluation | François Delbrayelle | Jun 3, 26, 1:40:15 PM UTC |
| 79bb9b8 | Revert "fix(triggers): restore Process runner to fix NPE in Docker-based evaluation" | François Delbrayelle | Jun 3, 26, 1:41:28 PM UTC |
| 3c70996 | fix(triggers): inject task variables into Docker RunContext for trigger evaluation | François Delbrayelle | Jun 3, 26, 1:52:38 PM UTC |

@github-actions (bot, Contributor) commented Jun 3, 2026

Tests report quick summary:

success ✅ > tests: 222, success: 222, skipped: 0, failed: 0

| Project | Status | Success | Skipped | Failed |
| --- | --- | --- | --- | --- |
| plugin-script-bun | success ✅ | 3 | 0 | 0 |
| plugin-script-deno | success ✅ | 3 | 0 | 0 |
| plugin-script-go | success ✅ | 18 | 0 | 0 |
| plugin-script-groovy | success ✅ | 8 | 0 | 0 |
| plugin-script-jbang | success ✅ | 2 | 0 | 0 |
| plugin-script-julia | success ✅ | 2 | 0 | 0 |
| plugin-script-jython | success ✅ | 5 | 0 | 0 |
| plugin-script-lua | success ✅ | 3 | 0 | 0 |
| plugin-script-nashorn | success ✅ | 6 | 0 | 0 |
| plugin-script-node | success ✅ | 13 | 0 | 0 |
| plugin-script-perl | success ✅ | 3 | 0 | 0 |
| plugin-script-php | success ✅ | 3 | 0 | 0 |
| plugin-script-powershell | success ✅ | 4 | 0 | 0 |
| plugin-script-python | success ✅ | 61 | 0 | 0 |
| plugin-script-r | success ✅ | 2 | 0 | 0 |
| plugin-script-ruby | success ✅ | 39 | 0 | 0 |
| plugin-script-shell | success ✅ | 47 | 0 | 0 |

The all_go flow ran go get/go mod tidy for both tasks to pull
github.com/go-gota/gota, which took ~50s per task in CI and caused
RunnerTest.all_go to time out consistently. Rewrite both tasks using
only encoding/csv and os from the Go standard library so the flow
completes well within the test timeout.

@fdelbrayelle (Member, Author)

QA Report — PR #371 — Fix validation: TriggerRunContext without reflection

Tested on: 2026-06-03
Kestra version: OSS latest (kestra/kestra:latest)
Plugin version: 1.8.1-SNAPSHOT (branch perf/lower-ci-time, commit 95ffc1a)
Build: ./gradlew shadowJar — 18 JARs deployed to ~/dev/plugins/

Fix validated: TriggerRunContext.forEmbeddedTask no longer uses reflection. Instead, DefaultRunContext.clone() creates a fresh HashMap copy of the variables, and the public getVariables() returns that mutable map, so entries are injected directly. This eliminates the reflection fallback that would silently return the original context on Java 25. That fallback led to an NPE in the Docker task runner; evaluate() caught the NPE, so the trigger returned Optional.empty() and never fired.

Commit 3c70996 (reflection approach) fixed the CI tests but would have caused QA to fail in the Kestra Docker runtime — reflection on DefaultRunContext.variables fails under Java 25 strong module encapsulation, triggering the silent fallback that skips variable injection entirely. Commit 95ffc1a (this PR) is the correct fix.
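
The copy-then-inject idea described above can be illustrated with a small, self-contained Java sketch. The class `SketchContext` and its methods are hypothetical stand-ins, not Kestra's DefaultRunContext or TriggerRunContext. The sketch assumes only what this report states: the cloned context holds a fresh HashMap copy of the variables, and a public getter exposes that mutable map.

```java
import java.util.HashMap;
import java.util.Map;

final class SketchContext {
    private final Map<String, Object> variables;

    SketchContext(Map<String, Object> variables) {
        this.variables = variables;
    }

    // Copy the variables into a fresh HashMap so the copy is mutable and
    // independent of the original map, which may be immutable.
    SketchContext copy() {
        return new SketchContext(new HashMap<>(variables));
    }

    // Public accessor: callers can add entries without reflection.
    Map<String, Object> getVariables() {
        return variables;
    }

    public static void main(String[] args) {
        // Map.of returns an immutable map, standing in for the original context.
        SketchContext original = new SketchContext(Map.of("flow", "go_script_trigger"));
        SketchContext forTask = original.copy();

        // Inject a task-level variable directly into the copy.
        forTask.getVariables().put("taskVar", 42);

        System.out.println(original.getVariables().containsKey("taskVar")); // false
        System.out.println(forTask.getVariables().get("taskVar"));          // 42
    }
}
```

Because the injection goes through an ordinary public method, nothing here depends on reflective access to a private field, which is the operation the report says fails under Java 25 strong module encapsulation.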


Summary

| # | Flow ID | Trigger type | Executions created | exitCode | condition | Result |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | go_script_trigger | go.ScriptTrigger | ✅ YES | 1 | exit 1 | ✅ PASS |
| 2 | node_script_trigger | node.ScriptTrigger | ✅ YES | 1 | exit 1 | ✅ PASS |
| 3 | node_commands_trigger | node.CommandsTrigger | ✅ YES | 1 | exit 1 | ✅ PASS |
| 4 | ruby_script_trigger | ruby.ScriptTrigger | ✅ YES | 1 | exit 1 | ✅ PASS |
| 5 | ruby_commands_trigger | ruby.CommandsTrigger | ✅ YES | 1 | exit 1 | ✅ PASS |

Result: 5/5 PASS — 37 total executions created across ~5 minutes, zero "schedule is blocked since…" warnings


Flow 1: go_script_trigger ✅ SUCCESS

Flow YAML
id: go_script_trigger
namespace: qa.triggers

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "go ScriptTrigger fired: exitCode={{ trigger.exitCode }} condition={{ trigger.condition }}"

triggers:
  - id: go_script_fail
    type: io.kestra.plugin.scripts.go.ScriptTrigger
    interval: PT10S
    exitCondition: "exit 1"
    edge: true
    containerImage: golang:1.22
    script: |
      package main
      import "os"
      func main() { os.Exit(1) }

Gantt

| Task | Status | Duration |
| --- | --- | --- |
| log | SUCCESS | 0.066s |
| Total | SUCCESS | ~0.6s |

Logs synthesis
Trigger fired with exitCode=1, condition=exit 1. Log task executed successfully. No NPE, no blocked scheduler.

Outputs synthesis
trigger.exitCode=1, trigger.condition=exit 1, trigger.timestamp present.


Flow 2: node_script_trigger ✅ SUCCESS

Flow YAML
id: node_script_trigger
namespace: qa.triggers

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "node ScriptTrigger fired: exitCode={{ trigger.exitCode }} condition={{ trigger.condition }}"

triggers:
  - id: node_script_fail
    type: io.kestra.plugin.scripts.node.ScriptTrigger
    interval: PT10S
    exitCondition: "exit 1"
    edge: true
    containerImage: node:20-slim
    script: |
      throw new Error("boom");

Gantt

| Task | Status | Duration |
| --- | --- | --- |
| log | SUCCESS | 0.042s |
| Total | SUCCESS | ~0.1s |

Logs synthesis
throw new Error("boom") exits node with code 1; condition matched, log task ran.

Outputs synthesis
trigger.exitCode=1, trigger.condition=exit 1, trigger.timestamp present.


Flow 3: node_commands_trigger ✅ SUCCESS

Flow YAML
id: node_commands_trigger
namespace: qa.triggers

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "node CommandsTrigger fired: exitCode={{ trigger.exitCode }} condition={{ trigger.condition }}"

triggers:
  - id: node_commands_fail
    type: io.kestra.plugin.scripts.node.CommandsTrigger
    interval: PT10S
    exitCondition: "exit 1"
    edge: true
    containerImage: node:20-slim
    commands:
      - node -e "throw new Error('boom')"

Gantt

| Task | Status | Duration |
| --- | --- | --- |
| log | SUCCESS | 0.042s |
| Total | SUCCESS | ~0.1s |

Logs synthesis
Command exits 1, condition matched, log task ran.

Outputs synthesis
trigger.exitCode=1, trigger.condition=exit 1, trigger.timestamp present.


Flow 4: ruby_script_trigger ✅ SUCCESS

Flow YAML
id: ruby_script_trigger
namespace: qa.triggers

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "ruby ScriptTrigger fired: exitCode={{ trigger.exitCode }} condition={{ trigger.condition }}"

triggers:
  - id: ruby_script_fail
    type: io.kestra.plugin.scripts.ruby.ScriptTrigger
    interval: PT10S
    exitCondition: "exit 1"
    edge: true
    containerImage: ruby:3.3-slim
    script: |
      raise "boom"

Gantt

| Task | Status | Duration |
| --- | --- | --- |
| log | SUCCESS | 0.031s |
| Total | SUCCESS | ~0.1s |

Logs synthesis
raise "boom" exits ruby with code 1; condition matched, log task ran.

Outputs synthesis
trigger.exitCode=1, trigger.condition=exit 1, trigger.timestamp present.


Flow 5: ruby_commands_trigger ✅ SUCCESS

Flow YAML
id: ruby_commands_trigger
namespace: qa.triggers

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "ruby CommandsTrigger fired: exitCode={{ trigger.exitCode }} condition={{ trigger.condition }}"

triggers:
  - id: ruby_commands_fail
    type: io.kestra.plugin.scripts.ruby.CommandsTrigger
    interval: PT10S
    exitCondition: "exit 1"
    edge: true
    containerImage: ruby:3.3-slim
    commands:
      - ruby -e "raise 'boom'"

Gantt

| Task | Status | Duration |
| --- | --- | --- |
| log | SUCCESS | 0.031s |
| Total | SUCCESS | ~0.1s |

Logs synthesis
raise 'boom' exits ruby with code 1; condition matched, log task ran.

Outputs synthesis
trigger.exitCode=1, trigger.condition=exit 1, trigger.timestamp present.


Scheduler health

Zero "schedule is blocked since…" warnings across the entire session. 37 executions created in ~5 minutes. All trigger nextExecutionDate values advanced on every ~10s poll cycle. No executionRunningId lock was ever left set.

@fdelbrayelle self-assigned this Jun 3, 2026
@fdelbrayelle requested review from a team and jymaire Jun 3, 2026 15:27
Comment thread: plugin-script-go/src/test/resources/sanity-checks/all_go.yaml
@fdelbrayelle merged commit b042751 into main Jun 3, 2026 (8 checks passed)
@fdelbrayelle deleted the perf/lower-ci-time branch Jun 3, 2026 16:44