OSS-Fuzz/ClusterFuzz finding 66812 #839

Closed
@hartwork

Issue

This is the public side of the soon-to-be-public (currently access-protected) OSS-Fuzz Expat finding:
Issue 66812: expat:xml_parse_fuzzer_UTF-16: Timeout in xml_parse_fuzzer_UTF-16.

The three key (access protected) contained links are:

The reproducer (original attached as file clusterfuzz-testcase-minimized-xml_parse_fuzzer_UTF-16-5187173185814528-timeout-original.xml.txt) is essentially this two-file case:

File one.xml:

<!DOCTYPE doc SYSTEM "two.dtd">
<doc>&g1;</doc>

File two.dtd:

<!ENTITY % p1 '%p1;'>
<!ENTITY g1 '%p1;'>
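The self-reference in two.dtd can be spotted mechanically. The following is a minimal Python sketch for illustration only: it is not how Expat detects the problem, it scans the DTD text with simple regular expressions rather than a real parser, and the function name and patterns are hypothetical.

```python
import re

# Matches parameter entity declarations such as: <!ENTITY % p1 '%p1;'>
PE_DECL = re.compile(r"<!ENTITY\s+%\s+(\w+)\s+(['\"])(.*?)\2\s*>")
# Matches parameter entity references such as: %p1;
PE_REF = re.compile(r"%(\w+);")


def directly_recursive_parameter_entities(dtd_text):
    """Return names of parameter entities whose own value references them."""
    seen = set()
    recursive = []
    for name, _quote, value in PE_DECL.findall(dtd_text):
        if name in seen:
            continue  # a repeated declaration of the same name is ignored
        seen.add(name)
        if name in PE_REF.findall(value):
            recursive.append(name)  # a -> a, i.e. direct recursion
    return recursive


two_dtd = """<!ENTITY % p1 '%p1;'>
<!ENTITY g1 '%p1;'>"""

print(directly_recursive_parameter_entities(two_dtd))  # ['p1']
```

The sketch only catches the direct case (a -> a); an indirect cycle such as a -> b -> a would need a walk over the whole reference graph.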

The original's SHA256 sum is 6b870c78cff9efe41f1060277f434da98b9f78e9159baf4cb95f034e890ac087.

The regression link effectively links to be47f6d...716fd10 (which makes good sense since these changes increase fuzzing coverage).

Analysis

ClusterFuzz uncovered two things at once here:

  • Direct recursion of parameter entities (a -> a, in contrast to indirect recursion a -> b -> a; reference syntax %name;) was previously not detected in the external subset (file two.dtd in the example above). It is forbidden by the XML spec and was also causing undefined behavior at runtime:
    # ./fuzz/xml_parse_fuzzer_UTF-8 [..]/clusterfuzz-testcase-minimized-xml_parse_fuzzer_UTF-16-5187173185814528-timeout-original.xml.txt
    INFO: Running with entropic power schedule (0xFF, 100).
    INFO: Seed: 3441536382
    INFO: Loaded 1 modules   (21351 inline 8-bit counters): 21351 [0x558df02c6000, 0x558df02cb367), 
    INFO: Loaded 1 PC tables (21351 PCs): 21351 [0x558df02cb368,0x558df031e9d8), 
    ./fuzz/xml_parse_fuzzer_UTF-8: Running 1 inputs 1 time(s) each.
    Running: [..]/clusterfuzz-testcase-minimized-xml_parse_fuzzer_UTF-16-5187173185814528-timeout-original.xml.txt
    [..]/expat/lib/xmlparse.c:6273:46: runtime error: applying zero offset to null pointer
    SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior [..]/expat/lib/xmlparse.c:6273:46 in 
    Pull request #841, "Reject direct parameter entity recursion (part of #839)", addresses that problem.
  • The way that the fuzzing code uses external parsers (created via function XML_ExternalEntityParserCreate) caused a timeout. That timeout revealed that, due to the lack of any direct input bytes from the parent parser, the amplification ratio calculated by accountingGetCurrentAmplification was constantly reported as 1.0 and hence had little chance of stopping billion laughs attacks in practice. Pull request #842, "[CVE-2024-28757] Prevent billion laughs attacks in isolated external parser (part of #839)", addresses that problem.
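To make the second point concrete, here is a simplified Python model of an amplification ratio. It assumes the ratio is computed as (direct + indirect) / direct with a fallback of 1.0 when no direct bytes were counted; that formula, the function name, and the byte counts are assumptions for illustration and not a copy of Expat's accountingGetCurrentAmplification.

```python
def amplification(direct_bytes, indirect_bytes):
    """Simplified model (assumption): output bytes relative to direct input bytes."""
    if direct_bytes == 0:
        # Nothing to divide by, so the ratio is reported as neutral.
        return 1.0
    return (direct_bytes + indirect_bytes) / direct_bytes


# A parser that read 100 bytes itself and produced 10,000 bytes via entities.
print(amplification(100, 10_000))    # 101.0

# An isolated external parser: all bytes count as indirect, none as direct.
print(amplification(0, 10_000_000))  # 1.0
```

Under this model, any limit expressed as a maximum ratio can never trigger in the second case, no matter how large the indirect byte count grows.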

Two related side notes:

  • ClusterFuzz leaked part of this finding (the recursion aspect) to the public corpus a few days ago as case f9b6ba558667913f4554395e039c01f6d8217b43, which later disappeared from the public corpus again.

  • Statement…

    If the same entity is declared more than once, the first declaration encountered is binding;

    …in section 4.2 Entity Declarations of the XML spec is worth noting, emphasis mine.
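The "first declaration is binding" rule can be observed with Python's standard-library Expat binding. This is a small sketch, assuming internal general entities behave the same way as the entities discussed above; the helper name expand and the sample document are made up for illustration.

```python
import xml.parsers.expat


def expand(document):
    """Parse a document and return its character data with entities expanded."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # Character data may arrive in several pieces, so collect and join them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return "".join(chunks)


doc = """<!DOCTYPE doc [
  <!ENTITY e 'first'>
  <!ENTITY e 'second'>
]>
<doc>&e;</doc>"""

print(expand(doc))  # first
```

The second declaration of e is silently ignored, so the reference resolves to the first value.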

CC @catenacyber @RMJ10 @Snild-Sony
