DRIVERS-1621: Prohibit null bytes in document field names and regex components (#1051)

* DRIVERS-1621: Prohibit null bytes in document field names and regex components

* Clarify "encoding BSON" for prose tests

* Fix RST syntax error

* Clarify type-specific rules for parseErrors

* Do not require drivers to validate parseErrors strings as regular JSON

Co-authored-by: Kevin Albertson <kevin.albertson@10gen.com>
jmikola and kevinAlbs committed Sep 2, 2021
1 parent 1ff5fac commit c64fc79
Showing 4 changed files with 98 additions and 27 deletions.
96 changes: 72 additions & 24 deletions source/bson-corpus/bson-corpus.rst
@@ -9,8 +9,8 @@ BSON Corpus
:Status: Approved
:Type: Standards
:Minimum Server Version: N/A
:Last Modified: July 20, 2017
:Version: 2.0
:Last Modified: September 2, 2021
:Version: 2.1

.. contents::

@@ -140,7 +140,7 @@ additional assertions. For each case, keys include:
JSON document. Because this is itself embedded as a *string* inside a JSON
document, characters like quote and backslash are escaped. It may be
present for deprecated types and is the Canonical Extended JSON
representation of ``converted_bson`.
representation of ``converted_bson``.

* ``lossy`` (optional) -- boolean; present (and true) iff ``canonical_bson``
can't be represented exactly with extended JSON (e.g. NaN with a payload).
@@ -167,15 +167,6 @@ be encoded to the ``bson_type`` under test. For each case, keys include:
* ``string``: a text or numeric representation of an input that can't be
parsed to a valid value of the given type.

Drivers MUST parse the extended JSON input using a regular JSON parser
(not an extended JSON one) and verify the input is parsed successfully.
This serves to verify that the parse error test cases test extended
JSON-specific error conditions and that they do not have,
for example, unintended spelling errors.

Drivers SHOULD parse the extended JSON input using the extended JSON parser
and verify the parsing produces an extended JSON parse error.

Extended JSON encoding, escaping and ordering
---------------------------------------------

@@ -314,21 +305,48 @@ manner.
Testing parsing errors
----------------------

The interpretation of ``parseErrors`` is type-specific. For example,
helpers for creating Decimal128 values may parse strings to convert them
to binary Decimal128 values. The ``parseErrors`` cases are strings that
will *not* convert correctly.
The interpretation of ``parseErrors`` is type-specific. The structure of test
cases within ``parseErrors`` is described in `Parse error case keys`_.

The documentation for a type (if any) will specify how to use these
cases for testing.
Drivers SHOULD test that each case results in a parsing error (e.g. parsing
Extended JSON, constructing a language type). Implementations MAY test
assertions in an implementation-specific manner.

For type "0x00" (i.e. top-level documents), the ``parseErrors`` entries have a
``description`` field and an ``string`` field. Parsing the ``string`` field
as Extended JSON MUST result in an error.

Drivers SHOULD test that each case results in a parse error.
Implementations MAY test assertions in an implementation-specific
manner.
Top-level Document (type 0x00)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For type "0x00" (i.e. top-level documents), the ``string`` field contains input
for an Extended JSON parser. Drivers MUST parse the Extended JSON input using an
Extended JSON parser and verify that doing so yields an Extended JSON parsing
error.

Drivers SHOULD also parse the Extended JSON input using a regular JSON parser (not
an Extended JSON one) and verify the input is parsed successfully. This serves
to verify that the ``parseErrors`` test cases are testing Extended JSON-specific
error conditions and that they do not have, for example, unintended syntax
errors.

Note: due to the generic nature of these tests, they may also be used to test
Extended JSON parsing errors for various BSON types appearing within a document.
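
The two checks for type 0x00 can be sketched as follows. This is a minimal illustration, not a driver implementation: ``contains_null_byte_key`` is a hypothetical stand-in for one condition that an Extended JSON parser would reject, and the input is the "Null byte in document key" case as it appears after the outer test file has been decoded.

```python
import json

# The test file stores "{\"a\\u0000\": 1 }"; once the outer JSON file is
# decoded, the driver sees the text {"a\u0000": 1 } shown here.
case_string = '{"a\\u0000": 1 }'

# SHOULD: a regular JSON parser accepts the input, which shows the case is
# well-formed JSON and fails only for Extended JSON-specific reasons.
parsed = json.loads(case_string)


def contains_null_byte_key(value):
    """Hypothetical helper: True if any object key, at any depth, holds a null byte."""
    if isinstance(value, dict):
        return any(
            "\x00" in key or contains_null_byte_key(child)
            for key, child in value.items()
        )
    if isinstance(value, list):
        return any(contains_null_byte_key(child) for child in value)
    return False


# MUST: an Extended JSON parser has to reject the same input. This stand-in
# only detects the null-byte-in-key condition, not every parse error.
assert contains_null_byte_key(parsed)
```

A real test would replace the final assertion with a call to the driver's own Extended JSON parser and expect it to raise.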


Binary (type 0x05)
~~~~~~~~~~~~~~~~~~

For type "0x05" (i.e. binary), the rules for handling ``parseErrors`` are the
same as those for `Top-level Document (type 0x00)`_.


Decimal128 (type 0x13)
~~~~~~~~~~~~~~~~~~~~~~

For type "0x13" (i.e. Decimal128), the ``string`` field contains input for a
Decimal128 parser that converts string input to a binary Decimal128 value (e.g.
Decimal128 constructor). Drivers MUST assert that these strings cannot be
successfully converted to a binary Decimal128 value and that parsing the string
produces an error.


Deprecated types
----------------
@@ -338,6 +356,29 @@ Implementations MAY ignore or modify them to match legacy treatment of
deprecated types. The ``converted_bson`` and ``converted_extjson`` fields MAY
be used to test conversion to a standard type or MAY be ignored.

Prose Tests
===========

The following tests have not yet been automated, but MUST still be tested.

1. Prohibit null bytes in null-terminated strings when encoding BSON
--------------------------------------------------------------------

The BSON spec uses null-terminated strings to represent document field names and
regex components (i.e. pattern and flags/options). Drivers MUST assert that null
bytes are prohibited in the following contexts when encoding BSON (i.e. creating
raw BSON bytes or constructing BSON-specific type classes):

* Field name within a root document
* Field name within a sub-document
* Pattern for a regular expression
* Flags/options for a regular expression

Depending on how drivers implement BSON encoding, they MAY expect an error when
constructing a type class (e.g. BSON Document or Regex class) or when encoding a
language representation to BSON (e.g. converting a dictionary, which might allow
null bytes in its keys, to raw BSON bytes).
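
The four contexts listed above all reduce to one check at the point where a null-terminated string is written. The sketch below is a toy encoder under stated assumptions: the function names are hypothetical, only int32 values and a bare regex element are handled, and a real driver may instead raise from a type constructor.

```python
import struct


def encode_cstring(text: str) -> bytes:
    """Encode a BSON cstring, rejecting embedded null bytes."""
    if "\x00" in text:
        raise ValueError("null byte is not allowed in a null-terminated string")
    return text.encode("utf-8") + b"\x00"


def encode_int32_document(fields: dict) -> bytes:
    """Hypothetical encoder for documents whose values are all int32."""
    body = b""
    for name, value in fields.items():
        # Element layout: type byte 0x10, cstring field name, little-endian int32.
        body += b"\x10" + encode_cstring(name) + struct.pack("<i", value)
    # Total length covers the 4-byte length prefix and the trailing 0x00.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


def encode_regex_element(name: str, pattern: str, flags: str) -> bytes:
    """Hypothetical encoder for one regex element (type 0x0B)."""
    return (
        b"\x0B"
        + encode_cstring(name)
        + encode_cstring(pattern)
        + encode_cstring(flags)
    )
```

With this sketch, ``encode_int32_document({"x\x00": 1})`` and ``encode_regex_element("a", "b\x00", "i")`` both raise ``ValueError``, while ``encode_int32_document({"x": 1})`` encodes normally.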

Implementation Notes
====================

@@ -456,6 +497,13 @@ assertions. This makes for easier and safer test case development.
Changes
=======

Version 2.1 - September 2, 2021

* Add spec and prose tests for prohibiting null bytes in null-terminated strings
within document field names and regular expressions.

* Clarify type-specific rules for ``parseErrors``.

Version 2.0 - May 26, 2017

* Revised to be consistent with Extended JSON spec 2.0: valid case fields
4 changes: 4 additions & 0 deletions source/bson-corpus/tests/document.json
@@ -51,6 +51,10 @@
{
"description": "Invalid subdocument: bad string length in field",
"bson": "1C00000003666F6F001200000002626172000500000062617A000000"
},
{
"description": "Null byte in sub-document key",
"bson": "150000000378000D00000010610000010000000000"
}
]
}
4 changes: 2 additions & 2 deletions source/bson-corpus/tests/regex.json
@@ -54,11 +54,11 @@
],
"decodeErrors": [
{
"description": "embedded null in pattern",
"description": "Null byte in pattern string",
"bson": "0F0000000B610061006300696D0000"
},
{
"description": "embedded null in flags",
"description": "Null byte in flags string",
"bson": "100000000B61006162630069006D0000"
}
]
21 changes: 20 additions & 1 deletion source/bson-corpus/tests/top.json
@@ -79,6 +79,10 @@
{
"description": "Document truncated mid-key",
"bson": "1200000002666F"
},
{
"description": "Null byte in document key",
"bson": "0D000000107800000100000000"
}
],
"parseErrors": [
@@ -241,7 +245,22 @@
{
"description": "Bad DBpointer (extra field)",
"string": "{\"a\": {\"$dbPointer\": {\"a\": {\"$numberInt\": \"1\"}, \"$id\": {\"$oid\": \"56e1fc72e0c917e9c4714161\"}, \"c\": {\"$numberInt\": \"2\"}, \"$ref\": \"b\"}}}"
},
{
"description" : "Null byte in document key",
"string" : "{\"a\\u0000\": 1 }"
},
{
"description" : "Null byte in sub-document key",
"string" : "{\"a\" : {\"b\\u0000\": 1 }}"
},
{
"description": "Null byte in $regularExpression pattern",
"string": "{\"a\" : {\"$regularExpression\" : { \"pattern\": \"b\\u0000\", \"options\" : \"i\"}}}"
},
{
"description": "Null byte in $regularExpression options",
"string": "{\"a\" : {\"$regularExpression\" : { \"pattern\": \"b\", \"options\" : \"i\\u0000\"}}}"
}

]
}
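
To see why the "Null byte in document key" ``decodeErrors`` case in ``top.json`` is invalid, the bytes can be walked by hand. The decoder below is a hypothetical, deliberately small sketch that understands only int32 elements; real drivers may detect and report the problem differently.

```python
import struct


def decode_int32_document(data: bytes) -> dict:
    """Hypothetical strict decoder for documents containing only int32 elements."""
    if len(data) < 5:
        raise ValueError("document too short")
    (declared,) = struct.unpack_from("<i", data, 0)
    if declared != len(data):
        raise ValueError("declared length does not match byte count")
    if data[-1] != 0:
        raise ValueError("missing document terminator")

    pos, end = 4, len(data) - 1  # 'end' is the index of the final 0x00
    fields = {}
    while pos < end:
        type_byte = data[pos]
        pos += 1
        if type_byte == 0x00:
            raise ValueError("terminator found before end of document")
        if type_byte != 0x10:
            raise ValueError("this sketch only supports int32 (0x10)")
        nul = data.find(b"\x00", pos, end)
        if nul == -1:
            raise ValueError("unterminated field name")
        name = data[pos:nul].decode("utf-8")
        pos = nul + 1
        if pos + 4 > end:
            raise ValueError("truncated int32 value")
        (fields[name],) = struct.unpack_from("<i", data, pos)
        pos += 4
    return fields


# Bytes from the test case: the key was meant to be "x\x00", so an extra
# 0x00 follows the byte that a decoder treats as the end of the key "x".
bad = bytes.fromhex("0D000000107800000100000000")
```

In this sketch the decoder reads the key as ``x``, consumes the next four bytes as the int32 value, and then meets a ``0x00`` one byte before the declared end of the document, so it raises ``ValueError``.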
