What happened?
Today every Docker image (texera-access-control-service, texera-config-service, texera-file-service, texera-workflow-compiling-service, texera-workflow-computing-unit-managing-service, texera-dashboard-service, texera-workflow-execution-coordinator, texera-workflow-execution-runner, texera-agent-service) ships the same monolithic root LICENSE-binary and NOTICE-binary containing the union of every third-party dependency across all services, frontend, agent-service, and Python.
The issue: the LICENSE in each image overstates what is actually bundled in that image. For example, texera-access-control-service ships only 116 third-party jars but its /texera/LICENSE references ~860 entries spanning frontend Angular packages, Python packages, agent-service npm packages, and jars only relevant to other services. ASF's licensing guidance is that the LICENSE in a binary distribution must describe the contents of that distribution.
How to reproduce?
- Count the third-party entries listed in the image's LICENSE:
docker run --rm --entrypoint sh ghcr.io/apache/texera-access-control-service:latest -c 'cat /texera/LICENSE | grep "^ - " | wc -l'
- Compare with the number of files the image actually bundles under lib/:
docker run --rm --entrypoint sh ghcr.io/apache/texera-access-control-service:latest -c 'ls /texera/lib/ | wc -l'
- The LICENSE references many more third-party components than the image actually bundles.
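The two counts above could be folded into one check. The following is a minimal sketch, not part of the Texera build: the function names are hypothetical, and it assumes (as the reproduction commands do) that each LICENSE entry is a line starting with " - " and that the bundled jars sit directly under a single lib/ directory. It is only a rough heuristic, since images that also bundle Python or npm packages would need those counted as well.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names exist in the Texera repository.

# Count LICENSE entries: lines starting with " - ", the same pattern
# the reproduction command greps for.
count_license_entries() {
  grep -c '^ - ' "$1"
}

# Count the files directly under a lib/ directory.
# tr strips the padding some wc implementations add.
count_bundled_files() {
  ls -1 "$1" | wc -l | tr -d ' '
}

# Print both counts; succeed only when the LICENSE does not list
# more entries than there are bundled files.
check_license_overstatement() {
  license_file=$1
  lib_dir=$2
  listed=$(count_license_entries "$license_file")
  bundled=$(count_bundled_files "$lib_dir")
  echo "listed=$listed bundled=$bundled"
  [ "$listed" -le "$bundled" ]
}
```

With the functions defined inside a container of the image, calling check_license_overstatement /texera/LICENSE /texera/lib would return a non-zero status whenever the LICENSE lists more entries than lib/ contains, which is the situation reported here.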
Expected behavior
Each Docker image's /texera/LICENSE should describe only the third-party components actually bundled in that image:
- Standalone Scala services: only the jars under their lib/.
- WorkflowExecutionService images (dashboard, coordinator, runner): the amber/ jars, plus Python deps for coordinator and runner, plus the frontend Angular bundle for the dashboard.
- texera-agent-service: only the bun-installed npm packages.
Version
1.1.0-incubating (Pre-release/Master)