encoding/json: mangled unmarshal string result [1.14 backport] #38106

gopherbot opened this issue Mar 27, 2020 · 4 comments

@gopherbot gopherbot commented Mar 27, 2020

@dsnet requested issue #38105 to be considered for backport to the next 1.14 minor release.

@gopherbot, please open a backport issue for 1.14.

@mvdan mvdan commented Apr 16, 2020

The CL fixing this issue has been available since March, but it has proven difficult to get the change properly approved, so any help is appreciated. It would be a shame for the fix to get pushed back once again, this time to 1.14.4.

@dmitshur dmitshur commented May 6, 2020

Approved as this is a Go 1.14-specific regression with no workaround.

@gopherbot gopherbot commented May 8, 2020

Change mentions this issue: [release-branch.go1.14] encoding/json: don't mangle strings in an edge case when decoding

@gopherbot gopherbot commented May 27, 2020

Closed by merging 846c00e to release-branch.go1.14.

@gopherbot gopherbot closed this May 27, 2020
gopherbot pushed a commit that referenced this issue May 27, 2020
…e case when decoding

The added comment contains some context. The original optimization
assumed that each call to unquoteBytes (or unquote) followed its
corresponding call to rescanLiteral. Otherwise, unquoting a literal
might use d.safeUnquote from another re-scanned literal.

Unfortunately, this assumption is wrong. When decoding {"foo": "bar"}
into a map[T]string where T implements TextUnmarshaler, the sequence of
calls would be as follows:

	1) rescanLiteral "foo"
	2) unquoteBytes "foo"
	3) rescanLiteral "bar"
	4) unquoteBytes "foo" (for UnmarshalText)
	5) unquoteBytes "bar"

Note that the call to UnmarshalText happens in literalStore, which
repeats the work to unquote the input string literal. But, since that
happens after we've re-scanned "bar", we're using the wrong safeUnquote
field value.

In the added test case, the second string had a non-zero number of safe
bytes, and the first string had none since it was all non-ASCII. Thus,
"safely" unquoting a number of the first string's bytes could cut a rune
in half, and thus mangle the runes.
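
To illustrate what "cutting a rune in half" means, here is a small sketch.
It is not taken from the change itself; the helper cutIsValid is a
hypothetical name used only for this example. It checks whether a string
is still valid UTF-8 on both sides of a byte offset.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// cutIsValid (hypothetical helper) reports whether both halves of s are
// still valid UTF-8 after cutting s at byte offset n.
func cutIsValid(s string, n int) bool {
	return utf8.ValidString(s[:n]) && utf8.ValidString(s[n:])
}

func main() {
	key := "开源" // two runes, three bytes each in UTF-8

	fmt.Println(len(key), utf8.RuneCountInString(key))

	// Offset 3 is a rune boundary, so both halves stay valid.
	fmt.Println(cutIsValid(key, 3))

	// Offset 5 is the number of "safe" ASCII bytes in "12345开源".
	// Applied to the key, it lands inside the second rune.
	fmt.Println(cutIsValid(key, 5))
}
```

An offset that is correct for one literal carries no meaning for another,
which is why reusing it can split a multi-byte rune.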

A rather simple fix, without a full revert, is to only allow one use of
safeUnquote per call to unquoteBytes. Each call to rescanLiteral when
we have a string is soon followed by a call to unquoteBytes, so it's no
longer possible for us to use the wrong index.
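
The idea can be sketched with a toy model. Everything below is an
assumption made for illustration: toyDecoder, rescan, unquote and
secondUnquote are hypothetical names, not the real decoder, and the
ordering of calls inside secondUnquote is a simplification of this sketch
rather than a claim about the real call order. The model only shows how a
cached prefix length goes stale, and how consuming it on first use avoids
that.

```go
package main

import "fmt"

// toyDecoder is a hypothetical stand-in for the decoder state. It models
// only the cached "safe prefix" length discussed above.
type toyDecoder struct {
	safeUnquote int  // leading bytes the last rescan found safe to copy verbatim
	oneShot     bool // when true, the cached value is consumed by the first unquote
}

// rescan counts the leading ASCII bytes of lit and caches the result.
func (d *toyDecoder) rescan(lit string) {
	n := 0
	for n < len(lit) && lit[n] < 0x80 {
		n++
	}
	d.safeUnquote = n
}

// unquote copies the cached prefix verbatim, then decodes the remainder
// rune by rune. Invalid bytes in the remainder become U+FFFD.
func (d *toyDecoder) unquote(lit string) string {
	n := d.safeUnquote
	if d.oneShot {
		d.safeUnquote = 0 // allow only one use per rescan
	}
	if n > len(lit) {
		n = len(lit)
	}
	out := []byte(lit[:n])
	for _, r := range lit[n:] {
		out = append(out, string(r)...)
	}
	return string(out)
}

// secondUnquote re-scans and unquotes val, then unquotes key again without
// a fresh re-scan, and returns the key as seen by that last call.
func secondUnquote(key, val string, oneShot bool) string {
	d := &toyDecoder{oneShot: oneShot}
	d.rescan(key)
	d.unquote(key)
	d.rescan(val)
	d.unquote(val)
	return d.unquote(key)
}

func main() {
	key, val := "开源", "12345开源"
	fmt.Printf("shared index:   %q\n", secondUnquote(key, val, false))
	fmt.Printf("one-shot index: %q\n", secondUnquote(key, val, true))
}
```

With the shared index, the last unquote copies five bytes of the key
verbatim and splits its second rune. With the one-shot index, the cached
value is already zero by then, so the key is decoded from its first byte
and comes back intact.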

Also add a test case from #38126, which is the same underlying bug, but
affecting the ",string" option.

Before the fix, the test would fail, just like in the original two issues:

	--- FAIL: TestUnmarshalRescanLiteralMangledUnquote (0.00s)
	    decode_test.go:2443: Key "开源" does not exist in map: map[开���:12345开源]
	    decode_test.go:2458: Unmarshal unexpected error: json: invalid use of ,string struct tag, trying to unmarshal "\"aaa\tbbb\"" into string
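
To check whether a given toolchain is affected, a standalone reproduction
along these lines may help. It is a sketch modelled on the failure output
above, not a copy of the added test: the types textKey and stringTagged and
the two helper functions are hypothetical names chosen for this example.
Only json.Marshal and json.Unmarshal come from the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// textKey is a hypothetical map key type implementing
// encoding.TextUnmarshaler, which sends key decoding through UnmarshalText.
type textKey string

func (k *textKey) UnmarshalText(b []byte) error {
	*k = textKey(b)
	return nil
}

// stringTagged is a hypothetical struct using the ",string" option.
type stringTagged struct {
	F1 string `json:"F1,string"`
}

// keyRoundTrips decodes an all-non-ASCII key next to a value with an ASCII
// prefix, and reports whether the key survived intact.
func keyRoundTrips() (bool, error) {
	var m map[textKey]string
	if err := json.Unmarshal([]byte(`{"开源":"12345开源"}`), &m); err != nil {
		return false, err
	}
	_, ok := m["开源"]
	return ok, nil
}

// stringTagRoundTrips marshals and unmarshals a ",string" field whose value
// needs escaping, and reports whether the value survived intact.
func stringTagRoundTrips() (bool, error) {
	in := stringTagged{F1: "aaa\tbbb"}
	b, err := json.Marshal(in)
	if err != nil {
		return false, err
	}
	var out stringTagged
	if err := json.Unmarshal(b, &out); err != nil {
		return false, err
	}
	return out == in, nil
}

func main() {
	ok, err := keyRoundTrips()
	fmt.Println("map key intact:", ok, err)

	ok, err = stringTagRoundTrips()
	fmt.Println(",string round trip:", ok, err)
}
```

On a toolchain that includes the fix, both checks are expected to report
true with a nil error.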

Fixes #38106.
For #38105.
For #38126.

Change-Id: I761e54924e9a971a4f9eaa70bbf72014bb1476e6
Run-TryBot: Daniel Martí <>
TryBot-Result: Gobot Gobot <>
Reviewed-by: Joe Tsai <>
(cherry picked from commit 55361a2)
Run-TryBot: Dmitri Shuralyov <>
Reviewed-by: Daniel Martí <>
@urso urso mentioned this issue May 29, 2020
5 of 5 tasks complete