fix(arch): verify Arch Linux compare targets#772

Merged
mstykow merged 3 commits into main from verify/arch-linux-parser on Apr 23, 2026

mstykow commented Apr 23, 2026

Summary

  • Verify Arch Linux row 22 end-to-end with compare-outputs --profile common on archlinux/packaging/packages/pacman, archlinux/packaging/packages/grep, and the local .PKGINFO sample at testdata/arch/pkginfo/basic.
  • Fix shared maintainer author extraction so maintainer-style comments keep the person identity without a synthetic Maintainers prefix.
  • Record the verified Arch runs in docs/BENCHMARKS.md, regenerate docs/benchmarks/scan-duration-vs-files.svg, and mark row 22 as verified in the scorecard.
  • Compare artifacts reviewed:
    • .provenant/compare-runs/20260423T104715Z-pacman-79411
    • .provenant/compare-runs/20260423T105234Z-grep-86472
    • .provenant/compare-runs/20260423T105313Z-basic-87766
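The maintainer normalization described above can be illustrated with a minimal sketch. This is not Provenant's actual implementation: the function name `maintainer_identity` and the set of accepted labels are assumptions made for illustration. The idea is to drop the comment marker and the label, and return only the person identity that follows.

```rust
/// Hypothetical helper (not Provenant's real API): extract the person
/// identity from a maintainer-style comment line such as
/// `# Maintainer: Name <email>`.
/// Returns `None` when the line is not a maintainer-style comment.
fn maintainer_identity(line: &str) -> Option<&str> {
    // Require a leading `#`, then skip any further `#` and whitespace.
    let body = line
        .trim()
        .strip_prefix('#')?
        .trim_start_matches('#')
        .trim_start();

    // Split at the first colon: label on the left, identity on the right.
    let (label, rest) = body.split_once(':')?;
    let label = label.trim();

    // Assumed label set; the real extractor may accept different labels.
    let is_maintainer = label.eq_ignore_ascii_case("maintainer")
        || label.eq_ignore_ascii_case("maintainers")
        || label.eq_ignore_ascii_case("contributor");

    // Return the identity as written, with no synthetic prefix added.
    let identity = rest.trim();
    if is_maintainer && !identity.is_empty() {
        Some(identity)
    } else {
        None
    }
}
```

Because the sketch returns a slice of the original line, non-ASCII names pass through unchanged and nothing is prepended to the identity.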

Issues

  • Covers: parser verification scorecard row 22 (Arch Linux)

Scope and exclusions

  • Included:
    • Arch source-package verification for pacman and grep
    • Arch built-package verification for .PKGINFO
    • Shared maintainer author normalization used by PKGBUILD-style comment lines
    • Benchmark row and chart regeneration for the recorded Arch runs
  • Explicit exclusions:
    • No new PKGBUILD parser surface beyond the existing compare-driven verification work
    • No unrelated benchmark backfills or scorecard rewrites outside row 22

Intentional differences from Python

  • On archlinux/packaging/packages/grep, Provenant keeps the real PKGBUILD maintainer spelling Sébastien Luttringer <seblu@archlinux.org> instead of degrading it to ASCII.
  • On the same PKGBUILD, Provenant preserves the trailing slash in https://www.gnu.org/software/grep/ because that is the literal file text.
  • Provenant keeps GPL-3.0-or-later on that PKGBUILD and rejects ScanCode's extra LGPL-2.0-or-later, which comes from a low-coverage 16.67% sequence match spanning lines 10-11 rather than a real second declared license.
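The license decision above can be pictured as a coverage filter over candidate matches. The PR does not say how Provenant makes this decision internally, so the following is only a sketch under stated assumptions: the `LicenseMatch` struct, the `declared_licenses` function, and the idea of a single `min_coverage` threshold are all hypothetical.

```rust
/// Hypothetical, simplified view of a license match (not Provenant's type).
#[derive(Debug, Clone, PartialEq)]
struct LicenseMatch {
    /// License expression the match points to.
    expression: String,
    /// Percentage of the rule text that matched, in the range 0.0..=100.0.
    coverage: f64,
    /// First and last matched line in the scanned file.
    start_line: usize,
    end_line: usize,
}

/// Keep only matches whose coverage reaches `min_coverage`.
/// The threshold is an assumption for illustration; the real policy
/// may weigh other signals as well.
fn declared_licenses(matches: &[LicenseMatch], min_coverage: f64) -> Vec<&str> {
    matches
        .iter()
        .filter(|m| m.coverage >= min_coverage)
        .map(|m| m.expression.as_str())
        .collect()
}
```

Under any threshold above 16.67, a match like the LGPL-2.0-or-later one on lines 10-11 would be dropped, while a full-coverage GPL-3.0-or-later match would be kept.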

Expected-output fixture changes

  • Files changed: none
  • Why the new expected output is correct: no parser .expected.json or golden YAML fixtures changed. After the maintainer normalization fix, running cargo run --manifest-path xtask/Cargo.toml --bin update-copyright-golden -- authors --list-mismatches --show-diff --filter maintainer reported "would update 0 file(s)".

mstykow added 3 commits April 23, 2026 13:00
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>
mstykow merged commit 3f8b05b into main on Apr 23, 2026
15 checks passed
mstykow deleted the verify/arch-linux-parser branch on April 23, 2026 at 11:20