Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: length greater than capacity in backtrace in unquoteBytes crash #34292

Open
raokrutarth opened this issue Sep 13, 2019 · 7 comments

Comments

@raokrutarth
Copy link

commented Sep 13, 2019

What version of Go are you using (go version)?

go 1.11

Does this issue reproduce with the latest release?

Not clear. Unable to reproduce issue with same release as well.

What operating system and processor architecture are you using (go env)?

x86-64, Linux. Can provide additional details if necessary.

What did you do?

Saw the below back trace during an application crash

2019-09-12T19:38:17.382003+00:00 8400S hpe-tsdbd[5589]: panic: runtime error: invalid memory address or nil pointer dereference 2019-09-12T19:38:17.382268+00:00 8400S hpe-tsdbd[5589]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x55d2d3b52165] 2019-09-12T19:38:17.382547+00:00 8400S hpe-tsdbd[5589]: goroutine 26 [running]: 2019-09-12T19:38:17.382685+00:00 8400S hpe-tsdbd[5589]: panic(0x55d2d428f500, 0x55d2d472de00) 2019-09-12T19:38:17.382813+00:00 8400S hpe-tsdbd[5589]: #011/usr/lib/go/src/runtime/panic.go:556 +0x2cf fp=0xc000e0ced0 sp=0xc000e0ce40 pc=0x55d2d38b9eaf 2019-09-12T19:38:17.382936+00:00 8400S hpe-tsdbd[5589]: runtime.panicmem() 2019-09-12T19:38:17.383063+00:00 8400S hpe-tsdbd[5589]: #011/usr/lib/go/src/runtime/panic.go:82 +0x60 fp=0xc000e0cef0 sp=0xc000e0ced0 pc=0x55d2d38b8eb0 2019-09-12T19:38:17.383199+00:00 8400S hpe-tsdbd[5589]: runtime.sigpanic() 2019-09-12T19:38:17.383322+00:00 8400S hpe-tsdbd[5589]: #011/usr/lib/go/src/runtime/signal_unix.go:390 +0x186 fp=0xc000e0cf40 sp=0xc000e0cef0 pc=0x55d2d38cffb6 2019-09-12T19:38:17.383445+00:00 8400S hpe-tsdbd[5589]: encoding/json.unquoteBytes(0xc00132ed75, 0x20, 0x328a, 0x55d2d425a7a0, 0xc00090fb10, 0x98, 0x55d2d42df980) 2019-09-12T19:38:17.383566+00:00 8400S hpe-tsdbd[5589]: #011/usr/lib/go/src/encoding/json/decode.go:1210 +0x125 fp=0xc000e0cfe8 sp=0xc000e0cf40 pc=0x55d2d3b52165 2019-09-12T19:38:17.383686+00:00 8400S hpe-tsdbd[5589]: encoding/json.(*decodeState).object(0xc00015ce60, 0x55d2d4280760, 0xc000e34240, 0x195, 0xc00015ce88, 0x7b) 2019-09-12T19:38:17.383805+00:00 8400S hpe-tsdbd[5589]: #011/usr/lib/go/src/encoding/json/decode.go:692 +0x972 fp=0xc000e0d2b0 sp=0xc000e0cfe8 pc=0x55d2d3b4d062

Backtrace showed the item array had a larger length than capacity:

#12 0x000055d2d38cffb6 in runtime.sigpanic () at /usr/lib/go/src/runtime/signal_unix.go:390
g = 0xc0001c8780
#13 0x000055d2d3b52165 in encoding/json.unquoteBytes (ok=, s=..., t=...) at /usr/lib/go/src/encoding/json/decode.go:1229
c =
size =
b =
r = 30
w =
#14 0x000055d2d3b4d062 in encoding/json.(*decodeState).object (d=0xc00015ce60, v=..., ~r1=...) at /usr/lib/go/src/encoding/json/decode.go:692
destring = false
key = {array = 0xc00132ed15 "vsx_virtual_gw_mac_v6":{"type":{"min":0,"key":{"maxLength":17,"minLength":17,"type":"string"}}},"mgmd_querier_max_response_time":{"type":{"max":2,"min":0,"key":{"type":"string","enum":["set",["igmp",""..., len = 21, cap = 13035}
subv = {typ = 0x55d2d42df980, ptr = 0xc000839d70, flag = 409}
item = {array = 0xc00132ed75 ""mgmd_querier_max_response_time":{"type":{"max":2,"min":0,"key":{"type":"string","enum":["set",["igmp","mld"]]},"value":{"minInteger":10,"maxInteger":128,"type":"integer"}}},"vsx_peer_mac":{"type":{"m"..., len = 824636187520, cap = 12939}
start = 6480
mapElem = {typ = 0x55d2d42df980, ptr = 0xc000839d70, flag = 409}
u = {tab = 0x55d2d38ea224 <runtime.raise+20>, data = }
#15 0x000055d2d3b4b06f in encoding/json.(*decodeState).value (d=0xc00015ce60, v=..., ~r1=...) at /usr/lib/go/src/encoding/json/decode.go:383
err =
#16 0x000055d2d3b4ddf0 in encoding/json.(*decodeState).object (d=0xc00015ce60, v=..., ~r1=...) at /usr/lib/go/src/encoding/json/decode.go:781
err =
destring = false
key = {array = 0xc001327ab3 "columns":{"ospfv3_hello_interval":{"type":{"min":0,"key":{"minInteger":1,"maxInteger":65535,"type":"integer"}}},"mac_auth_statistics":{"type":{"max":2,"min":0,"key":"string","value":"integer"}},"q_pro"..., len = 7, cap = 42317}
subv = {typ = 0x55d2d4280760, ptr = 0xc000e34240, flag = 405}
item = {array = 0xc001327ab2 ""columns":{"ospfv3_hello_interval":{"type":{"min":0,"key":{"minInteger":1,"maxInteger":65535,"type":"integer"}}},"mac_auth_statistics":{"type":{"max":2,"min":0,"key":"string","value":"integer"}},"q_pr"..., len = 9, cap = 42318}
start =
mapElem = {typ = 0x0, ptr = 0x0, flag = 0}
u = {tab = 0x55d2d38ea224 <runtime.raise+20>, data = }

More specifically

item = {array = 0xc00132ed75 ""mgmd_querier_max_response_time":{"type":{"max":2,"min":0,"key":{"type":"string","enum":["set",["igmp","mld"]]},"value":{"minInteger":10,"maxInteger":128,"type":"integer"}}},"vsx_peer_mac":{"type":{"m"..., len = 824636187520, cap = 12939}

Any chance the way in which item is read in https://github.com/golang/go/blob/release-branch.go1.11/src/encoding/json/decode.go#L692 can return a corrupted array of bytes?

What did you expect to see?

Length should be smaller than capacity.

What did you see instead?

Length was 824636187520 and capacity was 12939.

@randall77

This comment has been minimized.

Copy link
Contributor

commented Sep 13, 2019

This will be hard to debug without a reproducible case. A complete stack trace would also help (not just the encoding/json part, so we can see where the data came from).

Could you run your application with the -race flag? A data race on the source byte array might manifest like this.

@raokrutarth

This comment has been minimized.

Copy link
Author

commented Sep 14, 2019

Do you mind pointing me to an explanation of a data race? Not sure encoding/json has such a case

@raokrutarth

This comment has been minimized.

Copy link
Author

commented Sep 14, 2019

@odeke-em

This comment has been minimized.

Copy link
Member

commented Sep 14, 2019

Hello @raokrutarth, thank you for filing this issue and welcome to the Go project!

Thank you @randall77 for the stewarding!

@raokrutarth in deed, as Keith requested in #34292 (comment), a complete stack trace would be just the output you'd get from go run main.go for example for this program https://play.golang.org/p/EHrhjS2hhOl or inlined

package main

func hey(i int) {
	if i <= 0 {
		panic("panicked on hey")
	}
	there(i - 1)
}

func there(i int) {
	if i <= 0 {
		panic("panicked on hey")
	}
	hey(i - 1)
}

func main() {
	hey(4)
}

its full stacktrace will be composed of crumbs from both hey and there as per

$ go run main.go
panic: panicked on hey

goroutine 1 [running]:
main.hey(0x0, 0x0)
	/tmp/sandbox826229368/prog.go:5 +0x80
main.there(0x1, 0x0)
	/tmp/sandbox826229368/prog.go:14 +0x40
main.hey(0x2, 0x3528)
	/tmp/sandbox826229368/prog.go:7 +0x40
main.there(0x3, 0x0)
	/tmp/sandbox826229368/prog.go:14 +0x40
main.hey(0x4, 0x430070)
	/tmp/sandbox826229368/prog.go:7 +0x40
main.main()
	/tmp/sandbox826229368/prog.go:18 +0x20

aka
Screen Shot 2019-09-13 at 9 44 46 PM

so please paste that output instead that of a GDB backtrace, the output from the crash.

Full backtrace https://pastebin.com/xGHtHbby

That's a GDB backtrace, we'd like a Go stacktrace from directly running go run main.go that triggers the panic you reported. A minimal code snippet of isolated code would be helpful to trigger the panic could help us troubleshoot better and faster.

Do you mind pointing me to an explanation of a data race? Not sure encoding/json has such a case

For an explanation of a data race, I'd say:

A data race is a condition involving concurrent accesses to shared data
in which at least one access is a write.

Here is an in-depth and exciting explanation from Dmitry Vyukov https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong.
Keith's own work in this field has been quite extensive and pioneering, and inspired many parts of Go so that was a great and well timed question :)

Here is a runnable example of a data race that'll cause your code to panic at runtime

package main

import "sync"

func withMap(m map[string]int, i int) {
	if i&1 == 0 {
		m["key"] = 10
	} else {
		v := m["key"]
		if v == 0 {
			m["key"] = 11
		}
	}
}

func main() {
	var waitForAllGs sync.WaitGroup
	defer waitForAllGs.Wait()

	// A data race occurs when there are concurrent accesses
	// to shared data and at least one of the accesses is a write.
	shared := make(map[string]int)

	n := 10
	pauseTillAccess := make(chan bool)

	for i := 0; i < n; i++ {
		waitForAllGs.Add(1)
		go func(id int) {
			defer waitForAllGs.Done()
			pauseTillAccess <- true
			withMap(shared, id)
		}(i)
	}

	for i := 0; i < n; i++ {
		<-pauseTillAccess
	}
}

Hope that helps, please don't hesitate to ask for more explanations.

In regards to what Keith meant by running with -race, if you do

go run -race main.go

or when building your binary

go build -race -o main main.go

which will use our builtin race detector and that will help catch data races at runtime.

Thank you.

@odeke-em odeke-em changed the title length greater than capacity in backtrace in encoding/json's unquoteBytes crash encoding/json: length greater than capacity in backtrace in unquoteBytes crash Sep 14, 2019
@mvdan

This comment has been minimized.

Copy link
Member

commented Sep 14, 2019

Have you tried your program on Go 1.13? I don't think we'd fix a json panic in 1.11 at this point; see https://github.com/golang/go/wiki/Go-Release-Cycle#release-maintenance. Following that doc, not even security issues would get fixed in a 1.11 bugfix release.

@raokrutarth

This comment has been minimized.

Copy link
Author

commented Sep 16, 2019

@odeke-em Here's the stacktrace https://pastebin.com/ZfLHtFHu

@mvdan I'm not in a position to upgrade the application and the issue is not yet reproducible. Once we have ball park idea about the root cause, we might be able to come up with a reproducible test.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Sep 16, 2019

If this only happened once, it was most likely some sort of memory error, either corruption due to cgo or unsafe code or a race condition, or a hardware error.

We aren't going to be able to find a root cause if we can't reproduce it.

To answer your question:

Any chance the way in which item is read in https://github.com/golang/go/blob/release-branch.go1.11/src/encoding/json/decode.go#L692 can return a corrupted array of bytes?

No. The only way it could would be if there is pre-existing memory corruption.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.