Task Summary
The amber CI job currently runs every Scala test in WorkflowExecutionService (66 spec files) inside a single matrix entry that always installs both Scala and Python dependencies. Only a handful of tests actually need Python at runtime (they spawn Python UDF workers via the e2e harness); the rest are pure-Scala unit tests that pay for the Python install on every run. Running everything in one job also conflates "needs Python" failures with engine-internal regressions.
Split into two jobs, incrementally:
- Add a class-level ScalaTest tag annotation @IntegrationTest (FQN org.apache.texera.amber.tags.IntegrationTest) under amber/src/test/scala/.... ScalaTest will pick this up via its tag annotation machinery, so no per-test taggedAs(...) is required.
- Introduce a new amber-integration job in build.yml that mirrors the existing amber job's setup (JDK + sbt + Postgres) plus Python and runs only tests tagged IntegrationTest: sbt 'WorkflowExecutionService/testOnly * -- -n org.apache.texera.amber.tags.IntegrationTest'.
- Modify the existing amber job to skip the same tag (-l ...) and drop its Python setup.
- Wire run_amber_integration through precheck / required-checks.yml so the new job is gated identically to amber.
- As the first migration, annotate engine/e2e/ReconfigurationSpec.scala (5 tests, the only e2e spec that actually spawns Python UDFs). Other e2e specs (DataProcessingSpec, PauseSpec, BatchSizePropagationSpec, PythonWorkflowWorkerSpec) can move in follow-up PRs as they are reviewed.
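For reference, the tag annotation in the first step could look roughly like the sketch below. It follows the pattern in ScalaTest's tagging documentation, where tag annotations are written in Java so that they are visible through reflection at runtime. The package name comes from the FQN above; the exact file location and the choice of targets are assumptions, not taken from the repository.

```java
// Sketch only: the file location under the test sources is left unspecified here.
package org.apache.texera.amber.tags;

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

import org.scalatest.TagAnnotation;

// @TagAnnotation marks this annotation as a ScalaTest tag; the tag name is the
// annotation's fully qualified name, which is what -n and -l take on the command line.
@TagAnnotation
// Runtime retention is needed so ScalaTest can see the annotation via reflection.
@Retention(RetentionPolicy.RUNTIME)
// TYPE allows annotating a whole spec class (assumed to be the intended usage here).
@Target({ElementType.METHOD, ElementType.TYPE})
public @interface IntegrationTest {}
```

Placing @IntegrationTest on a spec class tags every test in that class, which is what lets the -n and -l filters select or skip the whole spec without touching individual test cases.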
Task Type