feat[test]: implement `abi_decode` spec test #4095

charles-cooper · 2024-06-05T14:42:58Z

What I did

add a spec-based differential fuzzer test for abi_decode

How I did it

How to verify it

Commit message

this commit implements a spec-based differential fuzzer for
`abi_decode`.

it introduces several components:

- a "spec" implementation of `abi_decode`, which is how vyper's
  abi_decode should behave on a given payload, implemented in python

- a hypothesis strategy to draw vyper types

- a hypothesis strategy to create valid data for a given vyper type

- a hypothesis strategy to _mutate_ a given payload which is designed
  to introduce faults in the decoder. testing indicated splicing
  pointers into the payload - either valid pointers or "nearly" valid
  pointers - had the highest success rate for finding bugs in the
  decoder. the intuition here is that the most difficult part of the
  decoder is validating out-of-bound pointers in the payload, so
  pointers represent "semantically high-value" data to the fuzzer.

- some hypothesis tuning to ensure a good distribution of types

over several days of testing+tuning, this fuzzer independently found
the bugs fixed in 44bb281ccaa and 21f7172274e (which were originally
found by manual review).
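
To make the "spec" idea concrete, here is a minimal sketch of a reference decoder for one type only, a top-level `Bytes[bound]` value. It is not the spec implementation from this PR. The names `spec_decode_bytes`, `encode_bytes` and `DecodeError` are hypothetical, and the sketch assumes the value is wrapped in a one-element tuple (so the first word is an offset) and ignores padding rules. The point it illustrates is that every pointer and length read from the payload is bounds-checked before use.

```python
WORD = 32


class DecodeError(Exception):
    """Raised when the payload is not a valid encoding (hypothetical name)."""


def _read_word(payload: bytes, start: int) -> int:
    # Reject any read that is not fully inside the payload.
    if start < 0 or start + WORD > len(payload):
        raise DecodeError(f"word at offset {start} is out of bounds")
    return int.from_bytes(payload[start:start + WORD], "big")


def spec_decode_bytes(payload: bytes, bound: int) -> bytes:
    """Decode a payload holding a single ABI-encoded Bytes[bound] value."""
    offset = _read_word(payload, 0)       # head: pointer to the tail
    length = _read_word(payload, offset)  # tail: length word
    if length > bound:
        raise DecodeError("length exceeds the declared bound")
    data_start = offset + WORD
    if data_start + length > len(payload):
        raise DecodeError("data runs past the end of the payload")
    return payload[data_start:data_start + length]


def encode_bytes(data: bytes) -> bytes:
    """Encode `data` as (offset, length, zero-padded data)."""
    padded = data + b"\x00" * (-len(data) % WORD)
    return WORD.to_bytes(WORD, "big") + len(data).to_bytes(WORD, "big") + padded
```

A differential test would then run the same payload through this function and through the compiled contract, and require that both either return the same value or both fail.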

Description for the changelog

Cute Animal Picture


cyberthirst · 2024-06-09T07:57:05Z

tests/functional/builtins/codegen/test_abi_decode_fuzz.py

+
+    # add, edit, delete, word, splice, flip
+    possible_actions = "adwww"
+    actions = draw(st.lists(st.sampled_from(possible_actions), max_size=MAX_MUTATIONS))


this can also generate an empty list, no? Don't we want min_size=1?

We want to test valid payloads too, we don't have to mutate the payload

ok, i assumed we'd test valid payloads separately. we could enumerate all type combinations up to a certain depth
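
As a rough illustration of the enumeration idea, the sketch below lists type strings up to a given nesting depth. It is not code from this PR: `LEAVES`, `enumerate_types` and the fixed `DynArray` bound of 2 are assumptions chosen for the example. Tuples are skipped as `DynArray` elements, in line with the language restriction mentioned elsewhere in this review.

```python
from itertools import product

# Hypothetical leaf types, chosen only to keep the example small.
LEAVES = ["uint256", "bool", "Bytes[32]"]


def enumerate_types(depth: int) -> list[str]:
    """Return every type string whose nesting depth is at most `depth`."""
    if depth == 0:
        return list(LEAVES)
    inner = enumerate_types(depth - 1)
    out = list(inner)
    for t in inner:
        # Tuples are not allowed as DynArray elements, so skip them.
        if not t.startswith("("):
            out.append(f"DynArray[{t}, 2]")
    for a, b in product(inner, repeat=2):
        out.append(f"({a}, {b})")
    # Shallower types are rebuilt at each level, so drop duplicates
    # while keeping the first-seen order.
    return list(dict.fromkeys(out))
```

Even with three leaves and only pairs as tuples, the list grows from 3 types at depth 0 to 15 at depth 1, so exhaustive enumeration is only practical for shallow depths.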

cyberthirst · 2024-06-09T07:59:49Z

tests/functional/builtins/codegen/test_abi_decode_fuzz.py

+        st_byte = st_any_byte
+
+    # add, edit, delete, word, splice, flip
+    possible_actions = "adwww"


we deliberately don't use the rest of the actions?

Yea, I was going to clean it up a bit. The edit, splice and flip actions turned out to not be very useful.

cyberthirst · 2024-06-09T08:00:46Z

tests/functional/builtins/codegen/test_abi_decode_fuzz.py

+
+
+@st.composite
+def _mutate(draw, payload, max_mutations=MAX_MUTATIONS):


max_mutations unused

cyberthirst · 2024-06-09T08:03:26Z

tests/functional/builtins/codegen/test_abi_decode_fuzz.py

+static_leaf_ctors = [t for t in leaf_ctors if t._is_prim_word]
+dynamic_leaf_ctors = [BytesT, StringT]
+
+MAX_MUTATIONS = 33


what is the reasoning behind using 33?
why is it ok to use 33 for all payloads, irrespective of their lengths? ie why do we assume this works well both for short and long payloads?

It's about editing, you can add or delete one word (plus change)

we bias towards the w action, which is word-level

overall, i'm not sure about the byte-level mutation. if it happens on the length word, it will likely create an invalid length. if it happens on a pointer, it will likely create an invalid pointer (eg points to a too large address)

given the fairly low MAX_MUTATIONS if a bad byte mutation happens, I think there's a low probability of offsetting it with a "good" mutation

like normal fuzzers do it on this level, but i think that the mutation depth there is much higher

i think it's mostly useful for editing the "data" portion of the payload, like it produces off-by-ones (expected length is 32 but data is only actually 31)
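
The trade-off discussed in this thread can be sketched without hypothesis. The code below is an illustration only: it uses `random.Random` in place of hypothesis's `draw`, and the helper names `mutate` and `draw_actions`, as well as the word-aligned splice position, are assumptions rather than the PR's actual behavior. The action string and the value of `MAX_MUTATIONS` are taken from the diff above.

```python
import random

WORD = 32
MAX_MUTATIONS = 33
# "w" appears three times, so it is picked about 60% of the time.
POSSIBLE_ACTIONS = "adwww"


def draw_actions(rng: random.Random) -> str:
    """Pick between 0 and MAX_MUTATIONS actions; 0 leaves the payload valid."""
    n = rng.randint(0, MAX_MUTATIONS)
    return "".join(rng.choice(POSSIBLE_ACTIONS) for _ in range(n))


def mutate(payload: bytes, actions: str, rng: random.Random) -> bytes:
    ret = bytearray(payload)
    for action in actions:
        if action == "a":
            # byte-level: insert one random byte (shifts everything after it)
            ret.insert(rng.randint(0, len(ret)), rng.randrange(256))
        elif action == "d" and ret:
            # byte-level: delete one byte, e.g. 32 bytes of data become 31
            ret.pop(rng.randrange(len(ret)))
        elif action == "w" and len(ret) >= WORD:
            # word-level: overwrite one whole word (assumed word-aligned here)
            ix = rng.randrange(len(ret) // WORD) * WORD
            ret[ix:ix + WORD] = rng.randrange(2**256).to_bytes(WORD, "big")
    return bytes(ret)
```

In this sketch a single "a" or "d" changes the payload length by one byte, which is how the off-by-one cases arise, while "w" keeps the length fixed and only changes what a head, length, or data word contains.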

cyberthirst · 2024-06-09T08:06:08Z

tests/functional/builtins/codegen/test_abi_decode_fuzz.py

+
+    if t in (BytesT, StringT):
+        # arbitrary max_value
+        bound = draw(st.integers(min_value=1, max_value=1024))


isn't 1024 unnecessarily large so it just slows down the tests? what cases might not be covered by eg 512?

cyberthirst · 2024-06-09T08:50:01Z

i think that for the sake of completeness, we should also test with dirty memory

let's consider that a ptr points outside the buffer to
a) dirty memory
b) clean memory

we currently only test for b) and require that the spec matches the implementation

the spec should always raise when ptr points outside the buffer. but what if the implementation doesn't revert when pointing outside the buffer to dirty memory?

cyberthirst · 2024-06-09T09:00:03Z

tests/functional/builtins/codegen/test_abi_decode_fuzz.py

+            ret.pop(ix)
+        elif action == "w":
+            # splice word
+            st_uint256 = st.integers(min_value=0, max_value=2**256 - 1)


i'm not sure how useful this range is. the interesting values will be on its boundaries but we can sample those for "cheaper"

in most cases we'll just sample some gigantic random number and i'd argue it will just hit the same code path each time (and for ptrs will most likely just oog)
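
One way to sample the boundaries more cheaply is to build a small pool of values around the payload size and draw spliced words from it. The helper below is a hypothetical sketch, not part of the PR; `interesting_words` and its choice of candidates are assumptions made for illustration.

```python
WORD = 32


def interesting_words(payload_len: int) -> list[int]:
    """Candidate 256-bit values that sit on or near decoder boundaries."""
    candidates = {0, 1, WORD - 1, WORD, WORD + 1, 2**255, 2**256 - 1}
    # Values at, just before, and just after the end of the payload and
    # the start of its last word: "nearly valid" pointers and lengths.
    for base in (payload_len, payload_len - WORD):
        for delta in (-1, 0, 1):
            candidates.add(base + delta)
    # Word-aligned offsets inside the payload look like valid pointers.
    candidates.update(range(0, payload_len, WORD))
    # Keep only values that fit in a single unsigned 256-bit word.
    return sorted(v for v in candidates if 0 <= v < 2**256)
```

Drawing from this pool alongside the full uint256 range would keep some random coverage while spending most examples on values the bounds checks actually distinguish.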

cyberthirst · 2024-06-09T09:07:20Z

i think it's important to include structs - one of the original errors was incorrect validation of structs within dynarrays

charles-cooper · 2024-06-09T11:07:33Z

I mean structs share the same code path as tuples, but with named fields in the front-end. I don't think we are missing much by not including them

cyberthirst · 2024-06-09T12:03:55Z

we skip them for dynarrays, right?

charles-cooper · 2024-06-09T12:07:34Z

Ah right, those are disallowed by the language semantics. Ok let's add them


charles-cooper added 6 commits June 5, 2024 10:42

wip - implement abi-decode spec

674bac6

spec entry point

3942f49

add ABI_Bool, bytes bound check

36d1bb7

add trailing comma to TupleT repr

b68ae38

lint

88a57f7

abi_decode fuzz harness

5ebfcf9

github-advanced-security bot found potential problems Jun 5, 2024


charles-cooper added 5 commits June 5, 2024 16:10

wrap types in tuple as necessary

9949022

fix typos/variable names

28da975

bias nonzero indexes

9be35f0

fix argnames

f614a4d

fix typos, handle special cases

bc59d1a

github-advanced-security bot found potential problems Jun 5, 2024


charles-cooper added 9 commits June 5, 2024 16:48

fix some more bugs

fc8aea1

fix more bugs

6079698

add hypo notes for debugging

0191693

fix strict_slice

595ef43

fix some decoder routines

f2c2443

remove bad check

25b12ed

add padding to bytes, fix lint

dff38e0

fix typo

529361d

Revert "remove bad check"

f507991

This reverts commit 25b12ed.

github-advanced-security bot found potential problems Jun 5, 2024


charles-cooper added 7 commits June 5, 2024 17:42

more fixes

65a808c

checksum addresses

6cfd815

fix surrogate characters

9d8b6a2

decode wrapped type

c69b159

more verbose exception

4f1c9b0

fix recursion - only need to decode heads when unwrapping

16c3ac4

more verbosity

536e903

charles-cooper added 3 commits June 8, 2024 13:52

filter out contract size limit errors

ebb302b

simplify imports with import hypothesis as hp

filter out init codesize limit too

b741372

tune level of nesting

df951cd

the large + nested sarrays just end up producing contracts that break the contract size limit

cyberthirst reviewed Jun 9, 2024


charles-cooper added 9 commits June 12, 2024 06:52

Merge branch 'master' into test-abi-decode-spec

b0eb26e

add extcall fuzzer

fa282b3

fix get_contract_from_ir scope

dfcaaa3

fix harness setup errors

1cc5d9a

kludge, add assertion to match abi_decode behavior

18bea6b

Merge branch 'master' into test-abi-decode-spec

8af77ec

reduce number of examples for CI

a4fafcc

fix tuple repr

300f300

fix lint

8277522

charles-cooper changed the title ~~wip - implement abi-decode spec~~ feat[test]: implement abi_decode spec test Jun 14, 2024

github-advanced-security bot found potential problems Jun 14, 2024


tests/functional/builtins/codegen/test_abi_decode_fuzz.py

Comment on lines +342 to +343

+    # for k, v in asdict(stats).items():
+    #     event(k, v)

Code scanning / CodeQL notice: Commented-out code. This comment appears to contain commented-out code.

update _abi_decode to abi_decode, add a comment

0976fb5

cyberthirst approved these changes Jun 14, 2024


charles-cooper marked this pull request as ready for review June 14, 2024 20:16

charles-cooper merged commit 69e5c05 into vyperlang:master Jun 14, 2024
156 checks passed

charles-cooper deleted the test-abi-decode-spec branch June 14, 2024 20:31
