Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Binary reader for BJData creates incorrect SAX events #3503

Closed
nlohmann opened this issue May 20, 2022 · 2 comments · Fixed by #3505
Closed

Binary reader for BJData creates incorrect SAX events #3503

nlohmann opened this issue May 20, 2022 · 2 comments · Fixed by #3505
Assignees
Labels
aspect: binary formats BSON, CBOR, MessagePack, UBJSON kind: bug release item: 🐛 bug fix solution: proposed fix a fix for the issue has been proposed and waits for confirmation

Comments

@nlohmann
Copy link
Owner

The BJData reader seem to open an object with fixed size of 3 which is never closed, yielding a problem in the SAX parser. I fuzzed this branch (together with the assertions from #3498) and got the following failing input: crash2.bjdata.zip

This generates the following SAX events:

<array>
    <array>
        <object>
            <key key="��" />
            <array>
                <number_integer val="27739" />
                <boolean val="true" />
                <object size="3">                   <!-- open an object with size 3 -->
                    <key key="_ArraySize_" />
                    <array size="2">
                        <number_integer val="6689" />
                        <number_integer val="0" />
                    </array>                        <!-- the first object entry -->
                    <array size="0">                <!-- ERROR: a value without key -->
                    </array>
                    <boolean val="true" />          <!-- ERROR: a value without key -->
                    <boolean val="true" />          <!-- ERROR: a value without key -->
                    <boolean val="true" />          <!-- ERROR: a value without key -->
                    <boolean val="true" />          <!-- ERROR: a value without key -->
                </array>                            <!-- ERROR: closing array (not object) -->
                <key key="" />
                <number_unsigned val="30069" />
                <parse_error id="71" token="<end of file>" />

Again, for line

if (JSON_HEDLEY_UNLIKELY(!sax->start_object(3) || !sax->key(key) || !sax->start_array(dim.size())))

we need to check

  • Why is the size 3 hard-coded?
  • Where to add the required sax->end_object() call?

Originally posted by @nlohmann in #3502 (comment)

@nlohmann
Copy link
Owner Author

I just ran the fuzzer again and wanted to add another (shorter) example: minimized-from-48f6d80fc879d78470bda5e99e4a1213d4d963a1.zip

<object>
    <key key="N" />
    <array>
        <object size="3">
            <key key="_ArraySize_" />
            <array size="2">
                <number_integer val="9216" />
                <number_integer val="792633534417207296" />
            </array>
            <array size="0">. <!-- expected a key here -->
            </array>
        </array>              <!-- expected a end_object() here -->

@fangq
Copy link
Contributor

fangq commented May 20, 2022

hi @nlohmann, sorry for the late reply

Why is the size 3 hard-coded?

it is sort of a "specification" I want to ensure a bi-directional mapping between a bjdata ND array to a JSON object, as described here.

Where to add the required sax->end_object() call?

it is added at the end of this block. This if-branch uses the bit-8 of size_and_type.second (type marker) to tell if an object-based ND array has been partially written, and continue finishing it in the above block. However, looks like this is not robust, and does not always enter this branch even an object has been partially written.

I will work on a more robust way to implement this.

fangq added a commit to NeuroJSON/json that referenced this issue May 22, 2022
@nlohmann nlohmann added solution: proposed fix a fix for the issue has been proposed and waits for confirmation release item: 🐛 bug fix labels May 22, 2022
@nlohmann nlohmann added this to the Release 3.11.0 milestone May 22, 2022
@nlohmann nlohmann self-assigned this May 22, 2022
nlohmann pushed a commit that referenced this issue May 23, 2022
* Prevent ndarray size vector from recursive use, fix #3503

* fix ci error

* complete coverage

* add missing coverage

* fix style issue in added test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aspect: binary formats BSON, CBOR, MessagePack, UBJSON kind: bug release item: 🐛 bug fix solution: proposed fix a fix for the issue has been proposed and waits for confirmation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants