Auto-generate per-module NOTICE-binary from jars' META-INF/NOTICE #4674

@bobbai00

What happened?

After #4668 lands, the per-module NOTICE-binary files describe each Docker image's bundled third-party content, but they're hand-curated subsets of the previously curated root NOTICE-binary. Hand-curated NOTICE files rot fast: every dependency bump silently lets the committed content drift from what the jars' META-INF/NOTICE files actually carry.

ASF compliance under Apache-2.0 §4(d) requires reproducing the attribution notices in every Apache-2.0 dep's bundled NOTICE file. Those notices live in each jar's META-INF/NOTICE. The right source of truth is the jars themselves.

Proposed change

Add a generator that produces each <module>/NOTICE-binary from the actual bundled jars:

  1. Walks the module's lib/ dir.
  2. For each jar, extracts every META-INF/NOTICE-style file.
  3. Dedupes by content hash so jars sharing an upstream NOTICE collapse into one block.
  4. Emits one block per unique blob with a synthesized project heading + the verbatim upstream content.
  5. Accepts an optional --extras input for non-jar attributions (Apache-2.0 Python wheels such as aiohttp and Matplotlib, which don't ship a NOTICE inside any jar).
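
As a rough illustration of steps 1 through 4, here is a minimal Python sketch, not a proposed final implementation. The function names (`is_notice_entry`, `collect_notices`, `render`), the SHA-256 hash, the heading format, and the choice to build the heading from jar file names are all assumptions made for this example; the real generator would presumably synthesize a proper project heading. The `--extras` handling from step 5 is left out.

```python
import hashlib
import zipfile
from pathlib import Path


def is_notice_entry(name: str) -> bool:
    """True for META-INF/NOTICE and variants such as META-INF/NOTICE.txt."""
    upper = name.upper()
    if not upper.startswith("META-INF/"):
        return False
    base = upper.rsplit("/", 1)[-1]
    return base == "NOTICE" or base.startswith("NOTICE.")


def collect_notices(lib_dir: Path) -> dict:
    """Map content hash -> (jar names carrying that blob, NOTICE text)."""
    blobs = {}
    # Step 1: walk the module's lib/ dir in a stable order.
    for jar in sorted(lib_dir.rglob("*.jar")):
        with zipfile.ZipFile(jar) as zf:
            # Step 2: extract every META-INF/NOTICE-style file.
            for name in sorted(zf.namelist()):
                if not is_notice_entry(name):
                    continue
                raw = zf.read(name)
                # Step 3: dedupe by content hash (SHA-256 is an assumption).
                digest = hashlib.sha256(raw).hexdigest()
                text = raw.decode("utf-8", errors="replace")
                jars, _ = blobs.setdefault(digest, ([], text))
                if jar.name not in jars:
                    jars.append(jar.name)
    return blobs


def render(blobs: dict) -> str:
    """Step 4: one block per unique blob, heading plus upstream content."""
    parts = []
    for jars, text in sorted(blobs.values(), key=lambda b: b[0][0]):
        # Hypothetical heading format; a real one would name the project.
        heading = f"=== NOTICE from {', '.join(jars)} ==="
        parts.append(f"{heading}\n\n{text.strip()}\n")
    return "\n".join(parts)
```

Sorting the jars and the rendered blocks keeps the output deterministic, which matters if the generated file is later compared against a committed copy.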

Then add a CI check that regenerates <module>/NOTICE-binary against the freshly-built dist lib/ and diffs the result against the committed file. On drift, the build fails and prints a one-line fix-up command.
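
The comparison step of such a check could look roughly like the following Python sketch. It assumes the regenerated text is already available as a string; `check_drift` and its parameters are hypothetical names, and the fix-up command is passed in by the caller because the issue does not specify it.

```python
import difflib
import sys


def check_drift(committed: str, regenerated: str, path: str, fixup: str) -> int:
    """Return 0 if the committed file matches the regenerated text, else 1."""
    if committed == regenerated:
        return 0
    # Show reviewers exactly what drifted.
    diff = difflib.unified_diff(
        committed.splitlines(keepends=True),
        regenerated.splitlines(keepends=True),
        fromfile=f"{path} (committed)",
        tofile=f"{path} (regenerated)",
    )
    sys.stderr.writelines(diff)
    # One-line hint; the actual command is whatever the project settles on.
    print(f"{path} is out of date. To fix, run: {fixup}", file=sys.stderr)
    return 1
```

A CI job would pass the return value to `sys.exit`, so any drift fails the build.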

Version

1.1.0-incubating (Pre-release/Master)

Depends on

This change requires #4668 to land first (which introduces the per-module NOTICE-binary files in the first place).
