fix(NODE-6124): utf8 validation is insufficiently strict #680
Note: this PR description is copied over from the 6.x branch fix.
Description
Outside of web, our toUTF8 function was insufficiently strict and allowed overlong encodings.
What is changing?
Change our functionality to use the JS TextDecoder to double-check UTF-8 input when a replacement character is detected.
Is there new documentation needed for these changes?
No.
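For reference, the double check described under "What is changing?" could look roughly like the sketch below. It is not the library's actual code: decodeUtf8Strict is a hypothetical helper, the lenient first pass uses a non-fatal TextDecoder as a stand-in for the existing decoder, and a plain Error stands in for BSONError. The strict pass only runs when the lenient result contains U+FFFD, so input that legitimately encodes U+FFFD still decodes.

```typescript
// Hypothetical helper, written only to illustrate the "double check" idea.
function decodeUtf8Strict(bytes: Uint8Array): string {
  // Lenient pass: malformed sequences are replaced with U+FFFD, never thrown.
  const lenient = new TextDecoder('utf-8', { fatal: false }).decode(bytes);
  if (!lenient.includes('\uFFFD')) {
    // No replacement character, so the input contained no malformed bytes.
    return lenient;
  }
  // U+FFFD is present: it was either encoded in the input (EF BF BD)
  // or produced by malformed bytes. A fatal decoder tells the two apart.
  try {
    return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
  } catch {
    // Stand-in for the BSONError mentioned in this PR.
    throw new Error('Invalid UTF-8 string in BSON document');
  }
}
```

The strict decoder is only consulted on the rare path where a replacement character shows up, which keeps the common case to a single decode.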
What is the motivation for this change?
A drivers-wide initiative to make UTF-8 validation strict and consistent.
UTF-8 validation now throws a BSONError on overlong encodings in Node.js
Specifically, this affects deserialize when utf8 validation is enabled, which is the default.
An overlong encoding is when the number of bytes in an encoding is inflated by padding the code point with leading 0s (see here for more information).
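As a concrete illustration of an overlong encoding, the sketch below pads the ASCII character "/" (U+002F, normally the single byte 0x2F) into the two-byte form 0xC0 0xAF and checks both forms with a fatal TextDecoder. The helper names overlongTwoByte and isValidUtf8 are hypothetical and exist only for this example; they are not part of the library.

```typescript
// Hypothetical helper: builds the invalid, padded 2-byte form of an ASCII
// code point using the 110xxxxx 10xxxxxx layout with leading zero bits.
function overlongTwoByte(codePoint: number): Uint8Array {
  if (codePoint < 0 || codePoint > 0x7f) {
    throw new RangeError('expected an ASCII code point');
  }
  return new Uint8Array([0xc0 | (codePoint >> 6), 0x80 | (codePoint & 0x3f)]);
}

// Hypothetical helper: true when a fatal (strict) decoder accepts the bytes.
function isValidUtf8(bytes: Uint8Array): boolean {
  try {
    new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return true;
  } catch {
    return false;
  }
}

const shortest = new Uint8Array([0x2f]); // "/" in its shortest form
const overlong = overlongTwoByte(0x2f);  // [0xc0, 0xaf], same code point padded

console.log(isValidUtf8(shortest)); // true
console.log(isValidUtf8(overlong)); // false
```

Both byte sequences carry the same code point bits, but only the shortest form is valid UTF-8, which is why a strict validator must reject the padded one.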
Double check the following
npm run check:lint script
type(NODE-xxxx)[!]: description
feat(NODE-1234)!: rewriting everything in coffeescript