
fix(nuget): narrow legacy project.json recognition for GitLab verification #830

Merged: mstykow merged 8 commits into main from verify/gitlab-ruby-pr826 on May 1, 2026


@mstykow (Owner) commented Apr 30, 2026

Summary

  • verify gitlabhq/gitlabhq with the maintained compare-outputs workflow in the context of PR 826's Ruby changes, and record the reviewed benchmark snapshot in docs/BENCHMARKS.md
  • stop treating generic GitLab export project.json fixtures as legacy NuGet manifests by requiring legacy NuGet shape keys before parser extraction or fallback rows
  • align the NuGet improvement docs and generated supported-formats matrix with the narrowed legacy project.json behavior
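The recognition rule in the second bullet can be sketched roughly as follows. This is a minimal illustration, not the code in src/parsers/nuget/project_json.rs: the key list, the `Recognition` enum, and `classify_project_json` are all hypothetical names, and the sketch takes already-extracted top-level keys rather than parsing JSON itself.

```rust
/// Assumed set of legacy NuGet shape keys. The real parser may require
/// a different set; this list is illustrative only.
const LEGACY_SHAPE_KEYS: [&str; 4] = ["dependencies", "frameworks", "runtimes", "supports"];

/// Hypothetical outcome of looking at a file named project.json.
#[derive(Debug, PartialEq)]
enum Recognition {
    /// Valid JSON with legacy shape: run normal package extraction.
    Extract,
    /// Malformed JSON that still mentions a legacy key: emit a fallback row.
    Fallback,
    /// No legacy shape at all: not a NuGet manifest, emit nothing.
    Skip,
}

/// `parsed_keys` is `Some(top-level keys)` when the JSON parsed, `None` otherwise.
/// `raw` is the file content, consulted only when parsing failed.
fn classify_project_json(parsed_keys: Option<&[&str]>, raw: &str) -> Recognition {
    match parsed_keys {
        Some(keys) if keys.iter().any(|k| LEGACY_SHAPE_KEYS.contains(k)) => Recognition::Extract,
        Some(_) => Recognition::Skip,
        None => {
            // Cheap textual probe for a quoted legacy key in unparseable content.
            let probable = LEGACY_SHAPE_KEYS
                .iter()
                .any(|k| raw.contains(&format!("\"{k}\"")));
            if probable {
                Recognition::Fallback
            } else {
                Recognition::Skip
            }
        }
    }
}
```

Under these assumptions, a GitLab export fixture whose top-level keys are unrelated to NuGet classifies as `Skip`, while a truncated manifest that still contains `"dependencies"` classifies as `Fallback`.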

Issues

  • Covers: GitLab Ruby-lane verification for the PR 826 static-resolution work
  • Closes:

Scope and exclusions

  • Included:
    • src/parsers/nuget/project_json.rs
    • focused regression coverage in src/parsers/nuget/nuget_test.rs
    • docs/BENCHMARKS.md and docs/benchmarks/scan-duration-vs-files.svg
    • docs/improvements/nuget-parser.md and generated docs/SUPPORTED_FORMATS.md
  • Explicit exclusions:
    • no broader parser behavior changes outside legacy project.json recognition
    • no benchmark claims beyond the recorded gitlabhq/gitlabhq compare snapshot

Intentional differences from Python

  • legacy NuGet project.json support now requires legacy-manifest shape rather than filename alone, which keeps generic GitLab export fixtures out of NuGet output while preserving fallback rows for malformed but still probable legacy NuGet manifests
  • the GitLab benchmark explicitly validates the Ruby slice of PR 826: Provenant resolves real gem versions from local Ruby constants where ScanCode still leaves ::VERSION placeholders
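To make the Ruby point concrete, here is a simplified sketch of resolving a `::VERSION` reference from a local constant. It is not the static-resolution code from PR 826, which this description does not show: `find_version_literal` and `resolve_gem_version` are hypothetical helpers, and the sketch only handles a single-line `VERSION = "..."` assignment.

```rust
/// Find a `VERSION = "x.y.z"` (or single-quoted) assignment in Ruby source
/// and return the literal between the quotes. Hypothetical helper.
fn find_version_literal(ruby_source: &str) -> Option<&str> {
    for line in ruby_source.lines() {
        let line = line.trim();
        // Require the bare constant name followed by `=`; this skips names
        // such as VERSION_SUFFIX.
        let Some(rest) = line.strip_prefix("VERSION") else { continue };
        let Some(rest) = rest.trim_start().strip_prefix('=') else { continue };
        let rest = rest.trim_start();
        let Some(quote) = rest.chars().next() else { continue };
        if quote != '"' && quote != '\'' {
            continue;
        }
        // Both quote characters are one byte, so slicing at 1 is safe.
        let body = &rest[1..];
        if let Some(end) = body.find(quote) {
            return Some(&body[..end]);
        }
    }
    None
}

/// If the declared version is a `Something::VERSION` placeholder, look it up
/// in the given version file; otherwise treat it as a quoted literal.
fn resolve_gem_version<'a>(declared: &'a str, version_file: &'a str) -> Option<&'a str> {
    if declared.ends_with("::VERSION") {
        find_version_literal(version_file)
    } else {
        Some(declared.trim_matches(|c: char| c == '"' || c == '\''))
    }
}
```

With a version file containing `VERSION = "1.2.3"`, a declared version of `MyGem::VERSION` resolves to `1.2.3` instead of remaining a placeholder; `MyGem` is an invented module name.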

Follow-up work

  • Created or intentionally deferred:
    • reviewed compare artifacts:
      • .provenant/compare-runs/20260430T174411Z-gitlabhq-45077
      • .provenant/compare-runs/20260430T194138Z-gitlabhq-85785
    • no new goldens were required for this change; the focused parser and golden checks passed unchanged
    • validations run:
      • cargo test test_project_json_
      • cargo test --features golden-tests test_golden_project_json
      • cargo run --manifest-path xtask/Cargo.toml --bin generate-benchmark-chart
      • cargo run --manifest-path xtask/Cargo.toml --bin generate-supported-formats

mstykow and others added 8 commits April 30, 2026 22:09
Avoid false positive NuGet package extraction from generic GitLab export fixtures while preserving fallback rows for malformed legacy manifests.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>

Align the supported-formats and improvement docs with the narrowed legacy project.json parser so the user-facing NuGet surface matches the actual recognition rules.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>

Record the reviewed compare-outputs result for gitlabhq/gitlabhq and refresh the benchmark chart summary with the new same-host timing data.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>

Drop generic field-label, translation-placeholder, enum-blob, and (c)-prefixed code-expression noise in the refiner while preserving validated maintainer, structured-credits, obfuscated-contact, and real holder/copyright detections.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>

Tighten code-call and generic-label filtering so valid collective holders, lowercase company suffixes, and parenthetical variants survive the new noise cleanup without reopening the repo-specific junk buckets that motivated the refiner pass.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>

Preserve lowercase company suffix holders from parsed token spans and recover lowercase hyphenated names from copyright-plus-URL forms so real project and company identities survive detector cleanup without broad holder backfill regressions.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>

Add regression coverage for collective holders, lowercase company suffixes, lowercase handle emails, lowercase hyphenated project names, and detector token-span holder recovery so the branch-local CI breakage stays fixed under future detector and refiner changes.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>

Update the owned copyright golden expectations to reflect the new detector/refiner end state: junk code fragments stay holder-free, malformed (c) fixtures drop out, and Project Mayo keeps the improved holder while preserving the shorter copyright variant that still survives extraction.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>
@mstykow mstykow merged commit 8111e7b into main May 1, 2026
15 checks passed
@mstykow mstykow deleted the verify/gitlab-ruby-pr826 branch May 1, 2026 08:15