DecodeVarint: Fix panic on truncated input; Add tests #12

evanj · 2021-11-04T21:17:56Z

Replace references to cb.len with cb.Len() to ensure we are checking
the difference from cb.len - cb.index. When some values have already
been parsed, DecodeVarint will be called with cb.index > 0, which
means the existing bounds checks were wrong

Commit 775411c added faster decoding of varints, but omitted some
of the bounds checks. This means that using this library to parse
truncated inputs now panics, instead of returning an error.

Add a fuzz test to ensure this continues to work. Add a deterministic
unit test extracted from the fuzz test as an easy to understand
example.

Replace references to cb.len with cb.Len() to ensure we are checking the difference from cb.len - cb.index. When some values have already been parsed, DecodeVarint will be called with cb.index > 0, which means the existing bounds checks were wrong Commit 775411c added faster decoding of varints, but omitted some of the bounds checks. This means that using this library to parse truncated inputs now panics, instead of returning an error. Add a fuzz test to ensure this continues to work. Add a deterministic unit test extracted from the fuzz test as an easy to understand example.

evanj · 2021-11-04T21:19:59Z

src/codec/decode.go

@@ -45,15 +45,15 @@ func init() {
 //
 // This implementation is inlined from https://github.com/dennwc/varint to avoid the call-site overhead
 func (cb *Buffer) DecodeVarint() (uint64, error) {
-	if cb.len == 0 {
+	if cb.Len() == 0 {


I'm ... actually not 100% sure these check replacements are correct. They do pass the unit tests, and the fuzz test does fail if you remove any of them, which seems ... good?

richardartoul · 2021-11-05T22:43:29Z

@evanj I'm approving this so C.I runs, but lets ask @kevinconaway to review it

kevinconaway · 2021-11-08T15:43:01Z

Will you please help me understand how the length is changing?

evanj · 2021-11-10T14:23:55Z

The length isn't changing. It is just that the existing code only checks cb.len, while the remaining space in the buffer is actually cb.len - cb.index (which is what gets returned by cb.Len()). The way the buffer gets used is the entire protocol buffer message is put in in, then cb.index is advanced towards the end.

For correctly encoded data, it doesn't matter: the current encoder will never "run off the end". For example, even in the "buffer has > 10 space" branch, if the Varint is terminated, it will return early: https://github.com/richardartoul/molecule/blob/master/src/codec/decode.go#L59-L62

The problem is when the varint is incorrectly encoded, this causes it to run off the end of the buffer.

As a simple example, consider the protocol buffer message:

payload := &simple.Simple{Int64: 42}

This encodes as: 0x20 0x2a. This gets decoded with two calls to DecodeVarint:

DecodeVarint index=0 len=2
DecodeVarint index=1 len=2

If you modify the payload to be 0x20 0xff then it runs off the end:

DecodeVarint index=0 len=2
DecodeVarint index=1 len=2
panic: runtime error: index out of range [2] with length 2 [recovered]

goroutine 35 [running]:
testing.tRunner.func1.2({0x13812e0, 0xc00014e168})
	/usr/local/Cellar/go/1.17.2/libexec/src/testing/testing.go:1209 +0x24e
testing.tRunner.func1()
	/usr/local/Cellar/go/1.17.2/libexec/src/testing/testing.go:1212 +0x218
panic({0x13812e0, 0xc00014e168})
	/usr/local/Cellar/go/1.17.2/libexec/src/runtime/panic.go:1038 +0x215
github.com/richardartoul/molecule/src/codec.(*Buffer).DecodeVarint(0xc000133f10)
	/Users/evan.jones/molecule/src/codec/decode.go:170 +0x9bd
github.com/richardartoul/molecule.MessageEach(0xc000133f10, 0x13bb4a8)
	/Users/evan.jones/molecule/molecule.go:32 +0x105
github.com/richardartoul/molecule/tests.TestMessageEach(0xc00020a1a0)
	/Users/evan.jones/molecule/tests/molecule_test.go:145 +0x28f
testing.tRunner(0xc00020a1a0, 0x13bb4b0)
	/usr/local/Cellar/go/1.17.2/libexec/src/testing/testing.go:1259 +0x102
created by testing.(*T).Run
	/usr/local/Cellar/go/1.17.2/libexec/src/testing/testing.go:1306 +0x35a
FAIL	github.com/richardartoul/molecule/tests	0.719s

This happens because the code assumed the buffer had 2 bytes in it, from the following code, but it actually only had 1:

https://github.com/richardartoul/molecule/blob/master/src/codec/decode.go#L162-L164

kevinconaway · 2021-11-11T16:24:34Z

I understand, thanks. Would you please post a benchstat diff?

evanj · 2021-11-12T14:56:33Z

I ran go test -bench=. 10 times for both branches and fed the input into benchstat. This is from my MBP. It seems to basically be "no change" if I am understanding this correctly?

name                                         old time/op    new time/op    delta
Molecule/standard_unmarshal-8                  1.37µs ± 5%    1.39µs ± 9%   ~     (p=0.837 n=10+10)
Molecule/unmarshal_all-8                        162ns ±10%     160ns ± 4%   ~     (p=0.712 n=10+8)
Molecule/unmarshal_loop-8                       150ns ± 7%     154ns ± 5%   ~     (p=0.297 n=9+9)
Molecule/unmarshal_single_with_molecule-8       117ns ±10%     118ns ± 7%   ~     (p=0.374 n=9+9)
Molecule/unmarshal_multiple_with_molecule-8     762ns ± 7%     785ns ±10%   ~     (p=0.400 n=9+10)
Simple-8                                        280ns ± 8%     289ns ±13%   ~     (p=0.368 n=9+10)
Packing-8                                       180µs ±24%     175µs ± 9%   ~     (p=0.912 n=10+10)

name                                         old alloc/op   new alloc/op   delta
Simple-8                                        1.00B ± 0%     1.00B ± 0%   ~     (all equal)
Packing-8                                        813B ± 3%      838B ± 9%   ~     (p=0.139 n=8+9)

name                                         old allocs/op  new allocs/op  delta
Simple-8                                         1.00 ± 0%      1.00 ± 0%   ~     (all equal)
Packing-8                                        0.00           0.00        ~     (all equal)

kevinconaway

🙇

evanj commented Nov 4, 2021

View reviewed changes

kevinconaway approved these changes Nov 12, 2021

View reviewed changes

richardartoul merged commit 2725ff2 into richardartoul:master Nov 16, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DecodeVarint: Fix panic on truncated input; Add tests #12

DecodeVarint: Fix panic on truncated input; Add tests #12

evanj commented Nov 4, 2021

evanj Nov 4, 2021

richardartoul commented Nov 5, 2021

kevinconaway commented Nov 8, 2021

evanj commented Nov 10, 2021

kevinconaway commented Nov 11, 2021

evanj commented Nov 12, 2021

kevinconaway left a comment

DecodeVarint: Fix panic on truncated input; Add tests #12

DecodeVarint: Fix panic on truncated input; Add tests #12

Conversation

evanj commented Nov 4, 2021

evanj Nov 4, 2021

Choose a reason for hiding this comment

richardartoul commented Nov 5, 2021

kevinconaway commented Nov 8, 2021

evanj commented Nov 10, 2021

kevinconaway commented Nov 11, 2021

evanj commented Nov 12, 2021

kevinconaway left a comment

Choose a reason for hiding this comment