Add LICENSE-binary for binary distributions (Docker images and dist zips) #4395

@bobbai00

Task Summary

Binary distributions (Docker images, sbt-native-packager dist zips) bundle ~450 third-party dependency jars alongside Texera's own jars. Per ASF policy, the LICENSE file shipped with a binary distribution must describe all bundled contents, not just the source release.

Currently, binary distributions copy the repo-root LICENSE, which only covers vendored source code (mbknor, Angular formly, TypeFox, SVGRepo). It does not account for the hundreds of third-party jars in lib/.

What needs to be done:

  • Add a LICENSE-binary file that lists all non-Apache-2.0 bundled dependencies, grouped by license (MIT, BSD, EPL, MPL, CDDL, etc.), with the full text of each license placed in the licenses/ directory.
  • Add a tools/licensing/collect_binary_licenses.sh helper script (modeled after Flink's) that extracts META-INF/LICENSE and META-INF/NOTICE from each bundled jar for review.
  • Add a tools/licensing/check_binary_deps.sh script and CI workflow that compares actual bundled jars against a known list (known-binary-deps.txt) and fails if a new dependency is added without updating LICENSE-binary.
  • Wire LICENSE-binary into Dockerfiles and dist zips so it replaces the repo-root LICENSE in binary artifacts.
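
To make the third item concrete, the comparison at the heart of check_binary_deps.sh could look roughly like the sketch below. This is only an illustration under assumptions that the issue does not state: that known-binary-deps.txt holds one jar file name per line (blank lines and `#` comments ignored), that all bundled jars sit directly in a single lib/ directory, and that a stale entry in the list should only warn. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: detect bundled jars that are missing from a reviewed list.
# Assumed (not specified in the issue): the known list has one jar file
# name per line, and '#' starts a comment line.

# Print the sorted file names of every *.jar directly inside directory $1.
list_bundled_jars() (
  for jar in "$1"/*.jar; do
    # The -f test also skips the literal pattern when nothing matches.
    if [ -f "$jar" ]; then
      basename "$jar"
    fi
  done | LC_ALL=C sort -u
)

# Compare the jars in directory $1 against the known list in file $2.
# Exit status: 0 if every bundled jar is listed, 1 if a new jar appears,
# 2 if the temporary directory cannot be created.
check_binary_deps() (
  lib_dir="$1"
  known_list="$2"
  tmp="$(mktemp -d)" || exit 2

  list_bundled_jars "$lib_dir" > "$tmp/actual"
  # Drop blank lines and comments, then sort the same way as the jar list.
  grep -v -e '^[[:space:]]*$' -e '^#' "$known_list" \
    | LC_ALL=C sort -u > "$tmp/expected"

  # comm -23: lines only in the first file; comm -13: only in the second.
  added="$(LC_ALL=C comm -23 "$tmp/actual" "$tmp/expected")"
  removed="$(LC_ALL=C comm -13 "$tmp/actual" "$tmp/expected")"
  rm -rf "$tmp"

  status=0
  if [ -n "$removed" ]; then
    # Stale entries only warn in this sketch; a real check might fail here too.
    printf 'WARNING: listed but no longer bundled:\n%s\n' "$removed" >&2
  fi
  if [ -n "$added" ]; then
    printf 'ERROR: bundled jars missing from %s:\n%s\n' "$known_list" "$added" >&2
    echo 'Update LICENSE-binary and the known list before merging.' >&2
    status=1
  fi
  exit "$status"
)
```

A CI step could then source this file and run `check_binary_deps <lib dir> <path to known-binary-deps.txt>`, failing the job on a non-zero exit status. Comparing exact file names means a version bump also trips the check, which may or may not be the desired level of strictness.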

Priority

P1 – High

Task Type

  • Code Implementation
  • Documentation
  • Refactor / Cleanup
  • Testing / QA
  • DevOps / Deployment
