Skip to content

Add fuzz tests and fix config parsing panics#3010

Merged
michaeldwan merged 1 commit into
md/gh-releases-multi-file-sourcesfrom
config-fuzz-testing
May 8, 2026
Merged

Add fuzz tests and fix config parsing panics#3010
michaeldwan merged 1 commit into
md/gh-releases-multi-file-sourcesfrom
config-fuzz-testing

Conversation

@markphelps
Copy link
Copy Markdown
Collaborator

This PR adds Go fuzz tests for config parsing and fixes two categories of crashes found during fuzzing.

Changes

Fuzz tests (pkg/config/fuzz_test.go)

  • FuzzParseBytes — fuzzes raw YAML → configFile parsing
  • FuzzFromYAML — fuzzes YAML → Config struct conversion
  • FuzzConfigComplete — fuzzes the full path including CUDA resolution, requirements loading, and environment parsing

Bug fixes

1. Panic on invalid CUDA version strings
version.Matches(), Equal(), Greater(), and StripPatch() called MustVersion(), which panics on invalid input like 11.A. Replaced with graceful NewVersion error handling.

Also added minor version validation to validateCudaVersion() so invalid versions are rejected early with a clean error.

2. Stack overflow from yaml v4 error wrapping
The yaml v4 library's LoadErrors type creates a self-referential error cycle when custom UnmarshalYAML implementations wrap the callback error with %w. Any subsequent errors.Is call triggers infinite recursion.

Fixed by using %v instead of %w when wrapping yaml parse errors, with an explanatory comment. Filed upstream: yaml/go-yaml#345

Regression tests

  • TestInvalidVersionsDoNotPanic — verifies graceful handling of invalid version strings
  • TestInvalidWeightsSourceDoesNotStackOverflow — verifies errors.Is doesn't crash on yaml parse errors
  • Fuzz corpus entry for the 11.A case

Add Go fuzz tests for config parsing (FuzzParseBytes, FuzzFromYAML,
FuzzConfigComplete) and fix two categories of crashes found:

1. version.Matches/Equal/Greater panicked on invalid version strings
   like "11.A". Replace MustVersion with graceful NewVersion handling.

2. yaml v4 LoadErrors created infinite recursion when errors.Is
   traversed wrapped errors. Use %v instead of %w when wrapping yaml
   parse errors to break the self-referential cycle.

Also validate CUDA minor version in validateCudaVersion().
@markphelps markphelps requested a review from a team as a code owner May 8, 2026 15:00
@markphelps markphelps changed the base branch from main to md/gh-releases-multi-file-sources May 8, 2026 15:07
@ask-bonk
Copy link
Copy Markdown
Contributor

ask-bonk Bot commented May 8, 2026

@markphelps Bonk workflow was cancelled.

View workflow run · To retry, trigger Bonk again.

@markphelps
Copy link
Copy Markdown
Collaborator Author

/bonk review

Copy link
Copy Markdown
Contributor

@ask-bonk ask-bonk Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Add fuzz tests and fix config parsing panics

Excellent PR! The changes are well-structured, thoroughly tested, and fix real crashes found through fuzzing.

Summary of Changes

✅ Fuzz tests (pkg/config/fuzz_test.go)

  • Three well-designed fuzz targets covering different layers: raw parsing, struct conversion, and full completion
  • Good seed corpus with valid and edge-case YAML inputs

✅ Bug fix: Invalid CUDA version handling (pkg/util/version/version.go)

  • Replaces panic-prone MustVersion() with graceful error handling in Equal(), EqualMinor(), Greater(), Matches(), and StripPatch()
  • Sensible fallback behavior: returns false for comparisons and original string for StripPatch()
  • Clear error messages in NewVersion() for debugging

✅ Bug fix: YAML v4 stack overflow (pkg/config/parse.go:38)

  • Changes %w to %v when wrapping yaml parse errors
  • Prevents infinite recursion in errors.Is() traversals due to yaml v4's buggy LoadErrors.Is()
  • Excellent explanatory comment with upstream issue link

✅ Additional validation (pkg/config/config.go:436-439)

  • Validates minor version is numeric in validateCudaVersion()
  • Catches invalid versions like 11.A early with clean error messages

✅ Regression tests

  • TestInvalidVersionsDoNotPanic - comprehensive coverage of version helpers
  • TestInvalidWeightsSourceDoesNotStackOverflow - verifies yaml error handling
  • Fuzz corpus entry for the 11.A case

Code Quality

  • Follows existing Go patterns and conventions
  • Proper use of testify assertions (require/assert)
  • Clear comments explaining rationale for non-obvious changes
  • All new tests pass

Suggestions (non-blocking)

The PR is ready to merge as-is. Nice work on catching these edge cases with fuzzing and providing solid fixes!

@ask-bonk
Copy link
Copy Markdown
Contributor

ask-bonk Bot commented May 8, 2026

Posted an approving review on PR #3010. The PR is well-structured with:

  1. Fuzz tests for config parsing at multiple layers
  2. Two critical bug fixes:
    • Invalid CUDA version strings now handled gracefully (no panics)
    • YAML v4 error wrapping fixed to prevent stack overflow
  3. Comprehensive regression tests covering both fixes
  4. Early validation for malformed CUDA versions

All tests pass and the code follows Go best practices. The PR is ready to merge.

github run

@michaeldwan michaeldwan merged commit 09ba90b into md/gh-releases-multi-file-sources May 8, 2026
74 of 77 checks passed
@michaeldwan michaeldwan deleted the config-fuzz-testing branch May 8, 2026 16:25
michaeldwan added a commit that referenced this pull request May 8, 2026
* Support multi-source weights and HTTPS weight sources

Weights can now declare multiple sources that merge into a single target
directory (last-in-wins for path conflicts). This covers the common case
of pulling .pth files from multiple GitHub release URLs into one weights
dir.

Config: source field accepts an object or array via custom unmarshalers.
Lockfile: Source -> Sources[] with ImportedAt moved to entry level.
WeightSpec: URI/Include/Exclude -> Sources[]SourceSpec.
New HTTPSource for https:// and http:// URIs with ETag fingerprinting.
Multi-source merge in resolveInventory with per-file owner tracking.

* Address review feedback on multi-source weights

- Reject HTTP URIs with embedded credentials in NewHTTPSource and
  normalizeHTTPURI; the URI is recorded in weights.lock and would
  leak via git. Extract parseHTTPURI helper so the rule lives in
  one place.
- Treat weak ETags (W/"...") as no ETag — they don't promise content
  identity per RFC 7232 §2.3. Document ETag as a cache hint, not a
  fingerprint.
- Log HEAD failures and weak-ETag fallbacks at debug level.
- Add MarshalYAML on WeightSourceList for symmetry with MarshalJSON;
  without it a single-source weight round-trips through YAML as a
  one-element sequence instead of a mapping.
- Add a "source is required" check in validateWeights so users see
  an actionable error instead of the schema's "must be a mapping".
- Surface multi-source path collisions at info level — silent
  shadowing is easy to miss in cog.yaml diffs.
- Show "(+N more)" indicator in cog weights status for multi-source
  weights so the truncation is visible.
- Render multi-source plan output on multiple lines instead of
  joining URIs with commas (URLs can contain commas).
- Drop the unused CombineFingerprints helper and merged.Fingerprint
  field; per-source fingerprints are tracked separately and used
  wherever drift matters.

* Cover multi-source weight paths with tests

Every existing weight test instantiated a single-element source list,
leaving the multi-source merge, override, and per-source fingerprint
paths exercised only in production. Add tests for:

- merging disjoint files from two sources into one inventory and
  resolving overlapping paths last-in-wins (table-driven, since the
  two cases share setup shape)
- per-source fingerprint drift (mutating one source leaves the
  other's fingerprint stable)
- PlanImport surfacing drift on a specific source index
- YAML and JSON parsing of the array form of source:
- MarshalYAML/MarshalJSON shape contract (mapping for single,
  sequence/array for multi) with round-trip through unmarshal
- validateWeights accepting the multi-source form and attributing
  pattern errors to the right source index

The new tests live alongside the existing single-source tests they
parallel — multi-source isn't a separate concept, just a different
length of the source list.

* fix: add fuzz tests and fix config parsing panics (#3010)

Add Go fuzz tests for config parsing (FuzzParseBytes, FuzzFromYAML,
FuzzConfigComplete) and fix two categories of crashes found:

1. version.Matches/Equal/Greater panicked on invalid version strings
   like "11.A". Replace MustVersion with graceful NewVersion handling.

2. yaml v4 LoadErrors created infinite recursion when errors.Is
   traversed wrapped errors. Use %v instead of %w when wrapping yaml
   parse errors to break the self-referential cycle.

Also validate CUDA minor version in validateCudaVersion().

* Test weak ETag fallback to SHA256 in HTTP source

---------

Co-authored-by: Mark Phelps <209477+markphelps@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants