Fix libyaml_emitter_fuzzer: inverted copy_event return value check #15096

Merged
DavidKorczynski merged 1 commit into google:master from OwenSanzas:fix/libyaml-emitter-fuzzer
Mar 7, 2026

@OwenSanzas
Contributor

Summary

The libyaml_emitter_fuzzer has an inverted return value check that makes the entire emitter path dead code.

Bug 1: Inverted logic in copy_event check

copy_event() wraps yaml_*_event_initialize() functions, which return 1 on success, 0 on failure (standard libyaml convention). However, the check on line 250:

if (copy_event(&events[event_number++], &event)) {
    yaml_event_delete(&event);
    goto delete_parser;   // error path
}

enters the error path when copy_event succeeds (returns 1). For every valid YAML input, the fuzzer short-circuits on the very first event. The following code is never reached:

  • yaml_emitter_emit() — the entire emitter path
  • The second parser loop (re-parse emitted output and compare events)
  • The events_equal() comparison logic

Roughly 50% of the fuzzer's logic is dead code.

Fix: if (copy_event(...)) → if (!copy_event(...))
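The effect of the inverted check can be sketched in isolation. This is a minimal illustration, not the harness code: `mock_copy` is a hypothetical stand-in for copy_event() that follows the same convention (1 on success, 0 on failure), and the two loop functions are invented here to count how many events get past the check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for copy_event(): returns 1 on success and
 * 0 on failure, mirroring the libyaml convention described above. */
static int mock_copy(int *dst, const int *src) {
    if (dst == NULL || src == NULL)
        return 0;
    *dst = *src;
    return 1;
}

/* Inverted check: a successful copy is treated as an error, so the
 * loop bails out on the very first event. */
static size_t passed_inverted_check(const int *in, int *out, size_t n) {
    size_t passed = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (mock_copy(&out[i], &in[i]))
            break;              /* taken on success: the bug */
        passed++;
    }
    return passed;
}

/* Fixed check: only a failed copy (return value 0) leaves the loop. */
static size_t passed_fixed_check(const int *in, int *out, size_t n) {
    size_t passed = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!mock_copy(&out[i], &in[i]))
            break;              /* taken on failure only */
        passed++;
    }
    return passed;
}
```

With three valid inputs, the inverted variant lets zero events through while the fixed variant lets all three through, which is the same shape as the emitter path never being reached.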

Bug 2: Post-increment causes UB on failure path

event_number++ was inside the copy_event() call as a post-increment. When copy_event fails (returns 0 — rare, OOM only), event_number has already been incremented, but events[old_number] was not initialized. The cleanup loop then calls yaml_event_delete() on an uninitialized event — undefined behavior.

Fix: Move event_number++ to after the success check.
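Why the position of the increment matters can be shown with a small sketch. The names here (`slot_init`, the `fail` flag, the two counting functions) are hypothetical and only model the pattern; the real harness works on yaml_event_t values, not ints.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical initializer using the libyaml convention: 1 on success,
 * 0 on failure. `fail` simulates an out-of-memory style failure, in
 * which case the slot is left untouched. */
static int slot_init(int *slot, int value, int fail) {
    if (fail)
        return 0;
    *slot = value;
    return 1;
}

/* Buggy shape: the index advances inside the call expression, so a
 * failed initialization is still counted. The return value is the
 * count a cleanup loop would later walk. */
static size_t count_with_post_increment(int *slots, int fail) {
    size_t n = 0;
    assert(slots != NULL);
    if (!slot_init(&slots[n++], 42, fail))
        return n;   /* n == 1 although slots[0] was never set */
    return n;
}

/* Fixed shape: advance the index only after the success check. */
static size_t count_with_late_increment(int *slots, int fail) {
    size_t n = 0;
    assert(slots != NULL);
    if (!slot_init(&slots[n], 42, fail))
        return n;   /* n == 0: cleanup skips the unset slot */
    n++;
    return n;
}
```

On the failure path the first variant reports one slot to clean up even though nothing was initialized, while the second reports zero.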

Bug 3: NULL buffer passed to yaml_parser_set_input_string

When the emitter produces no output (e.g., empty YAML stream), out.buf remains NULL. The second parser loop passes this NULL to yaml_parser_set_input_string(), which has assert(input). This was masked by Bug 1 since the second parser loop was never reached.

Fix: Add if (!out.buf || out.size == 0) goto error; before the second parser loop.
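The guard can be sketched as follows. This is an illustration under assumptions: `struct out_buffer` only models the `buf` and `size` fields named above, and `consume_input` is a hypothetical consumer that asserts on a NULL pointer in the way yaml_parser_set_input_string() is described as doing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical output buffer with the two fields referenced above. */
struct out_buffer {
    unsigned char *buf;
    size_t size;
};

/* Hypothetical consumer that asserts its input pointer is non-NULL. */
static size_t consume_input(const unsigned char *input, size_t size) {
    assert(input);
    return size;
}

/* Returns 1 if the buffer was handed to the consumer, 0 if the guard
 * skipped it because nothing was emitted. */
static int reparse_if_nonempty(const struct out_buffer *out) {
    assert(out != NULL);
    if (!out->buf || out->size == 0)
        return 0;   /* same condition as the guard in the fix */
    consume_input(out->buf, out->size);
    return 1;
}
```

An empty buffer never reaches the asserting call, so the assertion can no longer fire on empty emitter output.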

Coverage comparison (60 seconds, AddressSanitizer, libFuzzer fork mode)

| Metric   | Original | Fixed  | Change |
|----------|----------|--------|--------|
| Edges    | 166      | 2479   | +1393% |
| Features | 437      | 10285  | +2253% |
| Corpus   | 135      | 2258   | +1573% |

The ~15x edge coverage increase confirms that the emitter, write handler, and re-parser code paths were entirely dead in the original fuzzer.
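For reference, the percentage and the "~15x" figure for the Edges row can be reproduced with two small helpers. The function names are made up for this sketch; the inputs are the numbers reported above.

```c
#include <assert.h>

/* Relative increase from `before` to `after`, in percent. */
static double percent_increase(double before, double after) {
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}

/* Growth factor: 2.0 would mean "twice as many". */
static double growth_factor(double before, double after) {
    assert(before > 0.0);
    return after / before;
}
```

Calling `percent_increase(166.0, 2479.0)` gives a value a little above 1393, and `growth_factor(166.0, 2479.0)` gives a value just under 15, consistent with the Edges row and the "~15x" summary.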

copy_event() returns 1 on success and 0 on failure (standard libyaml
convention). However, line 250 checks `if (copy_event(...))` which
enters the error path on SUCCESS and continues on FAILURE. This means
every valid YAML input triggers the error path on the very first event,
making the entire emitter path dead code:

- yaml_emitter_emit() is never called
- The second parser loop (re-parse emitted output) is never reached
- The event comparison logic is never executed

Roughly 50% of the fuzzer's logic has never been tested.

Additionally:
- event_number++ was inside the copy_event call (post-increment),
  so on failure the index advances and cleanup deletes an
  uninitialized event (UB). Moved increment after success check.
- Added NULL check for out.buf before passing to
  yaml_parser_set_input_string(), which asserts on NULL input.
  This was another latent bug masked by the same logic inversion.

Coverage comparison (60s, AddressSanitizer, libFuzzer fork mode):

| Metric   | Original | Fixed  | Change     |
|----------|----------|--------|------------|
| Edges    | 166      | 2479   | **+1393%** |
| Features | 437      | 10285  | **+2253%** |
| Corpus   | 135      | 2258   | +1573%     |

The ~15x edge coverage increase confirms the emitter, write handler,
and re-parser code paths were entirely dead in the original.

github-actions Bot commented Mar 6, 2026

OwenSanzas is a new contributor to projects/libyaml. The PR must be approved by known contributors before it can be merged. The past contributors are: arthurscchan, hunsche, perlpunk, ingydotnet, alex, rjotwani, cvediver, Dor1s, ssbr (unverified), sigmavirus24 (unverified), inferno-chromium (unverified), mikea (unverified)

Collaborator

@DavidKorczynski DavidKorczynski left a comment

Nice! Do you have an estimate of how much overall coverage is gained from this? One part is the coverage of the harness itself, but the project overall has very high coverage already: https://storage.googleapis.com/oss-fuzz-coverage/libyaml/reports/20260304/linux/src/report.html

@OwenSanzas
Contributor Author

Thanks! I ran llvm-cov with the developer-provided seed corpus (libyaml/examples/*), coverage sanitizer, 60 seconds each:

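| Source file | Original | Fixed  | OSS-Fuzz (all fuzzers) |
|-------------|----------|--------|------------------------|
| api.c       | 14.12%   | 47.00% | 74.62%                 |
| emitter.c   | 0.00%    | 70.84% | 88.40%                 |
| parser.c    | 3.03%    | 81.49% | 93.60%                 |
| reader.c    | 89.27%   | 89.70% | 94.85%                 |
| scanner.c   | 3.50%    | 78.21% | 95.56%                 |
| writer.c    | 0.00%    | 81.01% | 88.61%                 |
| TOTAL       | 8.58%    | 71.76% | 89.80%                 |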

The original emitter fuzzer hits 0% of emitter.c and writer.c — the inverted check kills the emitter path entirely. Other fuzzers already cover most of this code, so the project-level gain is probably small, but at least the emitter fuzzer is now actually doing its job.

Collaborator

@DavidKorczynski DavidKorczynski left a comment

thanks for the clarifications

@DavidKorczynski DavidKorczynski merged commit d94076d into google:master Mar 7, 2026
17 checks passed
@OwenSanzas
Contributor Author

Thanks for the approval!

OwenSanzas added a commit to OwenSanzas/all-you-need-is-a-fuzzing-brain.github.io that referenced this pull request Apr 30, 2026
…ter, #15096)

Per user direction we cannot use the OpenSSL provider example because
that paper is still under review. Switched to a publicly-visible,
already-merged OSS-Fuzz PR:

  google/oss-fuzz#15096
  Fix libyaml_emitter_fuzzer: inverted copy_event return value check

The bug is a single inverted boolean: copy_event() follows the libyaml
convention 'success returns 1', but the harness treats 1 as failure
and short-circuits to the error-cleanup goto on every successful copy.
Result: ~50% of the harness (the entire emitter pipeline plus the
second re-parser loop) is dead code. Fix is one '!' character. After
merging, edge coverage went from 166 to 2479 (+1393%, ~15x).

Maps cleanly to P2.4 (Return value handling). Cited PR is public and
linkable.
2 participants