
fix(copyright): improve zlib verification attribution #832

Merged
mstykow merged 2 commits into main from verify/autotools-zlib
May 1, 2026

Conversation

@mstykow
Owner

@mstykow mstykow commented May 1, 2026

Summary

  • improve generic author extraction for plaintext contributor rosters and for wrapped "written on top of ... by" prose, and trim prose tails after contact-bearing attributions
  • verify madler/zlib with compare-outputs --profile common, rerun narrow copyright golden coverage, and record the benchmark snapshot in docs/BENCHMARKS.md
  • regenerate docs/benchmarks/scan-duration-vs-files.svg after adding the zlib row
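
For readers unfamiliar with the roster case, here is a minimal sketch of the idea, not the code from this PR. The function names `extract_roster_authors` and `looks_like_person`, the five-word limit, and the capitalization rule are all assumptions made for illustration; the shared detector pipeline may use different rules.

```rust
/// Hypothetical helper (not Provenant's API): pull one author per line out of
/// a plaintext contributor roster, skipping headings and running prose.
fn extract_roster_authors(text: &str) -> Vec<String> {
    text.lines()
        // Drop indentation and simple list markers such as "-" or "*".
        .map(|line| {
            line.trim()
                .trim_start_matches(|c: char| c == '-' || c == '*')
                .trim()
        })
        .filter(|line| looks_like_person(line))
        .map(String::from)
        .collect()
}

/// Assumed heuristic: a short line whose non-contact words are all
/// capitalized, and which is not a heading ending in ':'.
fn looks_like_person(line: &str) -> bool {
    if line.ends_with(':') {
        return false;
    }
    let words: Vec<&str> = line.split_whitespace().collect();
    if words.is_empty() || words.len() > 5 {
        return false;
    }
    // Ignore angle-bracketed contact tokens when checking capitalization.
    let name_words: Vec<&str> = words
        .iter()
        .copied()
        .filter(|w| !w.starts_with('<'))
        .collect();
    !name_words.is_empty()
        && name_words
            .iter()
            .all(|w| w.chars().next().map_or(false, char::is_uppercase))
}
```

With these assumed rules, a heading line and a lowercase sentence are skipped, while `Jane Doe <jane@example.org>` and `- John Q Public` each yield one author.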

Issues

  • Covers: madler/zlib verification work from docs/implementation-plans/package-detection/PARSER_VERIFICATION_SCORECARD.md

Scope and exclusions

  • Included:
    • generic copyright/author heuristic fixes in the shared detector pipeline
    • focused regression tests for the new author-extraction paths
    • zlib compare-output verification and benchmark documentation update
  • Explicit exclusions:
    • no scorecard status change, because the Autotools row is already marked verified
    • no parser or golden fixture updates were needed

Intentional differences from Python

  • Provenant keeps the more specific LicenseRef-scancode-info-zip-2009-01 AND Zlib classification on contrib/minizip/unzip.c instead of collapsing it to plain Zlib

Follow-up work

  • Created or intentionally deferred:
    • no new follow-up item created; the remaining compare tail is limited to legacy attribution/doc differences outside this benchmark row’s recorded end-state advantages

Expected-output fixture changes

  • Files changed: none
  • Why the new expected output is correct: shared detector changes passed test_golden_authors and test_golden_ics, so no owned fixture drift required an update

Verification

  • cargo test plaintext_roster_lines_extract_individual_authors --lib
  • cargo test written_on_top_of_line_extracts_author --lib
  • cargo test refine_author_truncates_trailing_prose_after_contact --lib
  • cargo test written_by_author_email_for_project_is_extracted --lib
  • cargo test --features golden-tests test_golden_authors --test copyright_golden
  • cargo test --features golden-tests test_golden_ics --test copyright_golden
  • cargo run --manifest-path xtask/Cargo.toml --bin compare-outputs -- --repo-url https://github.com/madler/zlib.git --repo-ref f9dd6009be3ed32415edf1e89d1bc38380ecb95d --profile common
  • cargo run --manifest-path xtask/Cargo.toml --bin generate-benchmark-chart
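
As a rough illustration of what a test such as refine_author_truncates_trailing_prose_after_contact exercises, the sketch below cuts an author string right after its first angle-bracketed email. The helper name `truncate_after_contact` and the choice of `<...@...>` as the only contact form are assumptions; the actual refinement logic in the detector is not reproduced here.

```rust
/// Hypothetical helper (not Provenant's API): keep an author string up to and
/// including its first angle-bracketed email, dropping any prose after it.
fn truncate_after_contact(author: &str) -> &str {
    if let Some(open) = author.find('<') {
        if let Some(close_rel) = author[open..].find('>') {
            // '<' and '>' are single-byte ASCII, so these offsets are valid
            // char boundaries for slicing.
            let end = open + close_rel + 1;
            let contact = &author[open + 1..end - 1];
            // Only treat the bracketed token as a contact if it looks like an email.
            if contact.contains('@') {
                return author[..end].trim_end();
            }
        }
    }
    // No recognizable contact: leave the text alone apart from trailing spaces.
    author.trim_end()
}
```

Under these assumptions, `Jane Doe <jane@example.org> for the demo project` is reduced to `Jane Doe <jane@example.org>`, and a string without a bracketed email is returned unchanged apart from trailing whitespace.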

Compare artifacts

  • Final compare run: .provenant/compare-runs/20260501T105917Z-zlib-47314/
  • Prior triage run: .provenant/compare-runs/20260501T104122Z-zlib-6507/

mstykow added 2 commits May 1, 2026 13:06
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>
@mstykow mstykow merged commit ef3cad8 into main May 1, 2026
15 checks passed
@mstykow mstykow deleted the verify/autotools-zlib branch May 1, 2026 12:12
