Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: account for recursion when stringing to avoid overflow #1315

Merged
merged 21 commits into from
Dec 21, 2023

Conversation

deelawn
Copy link
Contributor

@deelawn deelawn commented Oct 28, 2023

Addresses #1291. This is to fix the issue causing the entire interpreter process to crash, not the underlying issue that is causing it to panic in the first place.

@github-actions github-actions bot added the 📦 🤖 gnovm Issues or PRs gnovm related label Oct 28, 2023
@deelawn deelawn changed the title Bug/recursive overflow fix: account for recursion when stringing to avoid overflow Oct 28, 2023
@codecov
Copy link

codecov bot commented Oct 28, 2023

Codecov Report

Attention: 50 lines in your changes are missing coverage. Please review.

Comparison is base (71ba96d) 56.08% compared to head (aadbe3a) 56.27%.
Report is 4 commits behind head on master.

Files Patch % Lines
gnovm/pkg/gnolang/values_string.go 45.05% 41 Missing and 9 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1315      +/-   ##
==========================================
+ Coverage   56.08%   56.27%   +0.18%     
==========================================
  Files         421      422       +1     
  Lines       65436    65772     +336     
==========================================
+ Hits        36700    37010     +310     
+ Misses      25870    25869       -1     
- Partials     2866     2893      +27     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@moul moul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Your commit resolves my issue.

The only thing I'm uncertain about is when mixing slices and pointers, where your "seen" map is reinitialized each time. I'll test some more unusual cases later.

@deelawn
Copy link
Contributor Author

deelawn commented Oct 29, 2023

Looks good to me. Your commit resolves my issue.

The only thing I'm uncertain about is when mixing slices and pointers, where your "seen" map is reinitialized each time. I'll test some more unusual cases later.

Cool. Yeah there are definitely some issues due to this change -- see the failing tests for example. I'm going to take a closer look to figure out what is going wrong with those.

@deelawn
Copy link
Contributor Author

deelawn commented Nov 2, 2023

Using the %#v verb in the format string statement will also prevent the recursive loop. I am going to investigate how this is implemented in the go standard library; we may be able to use a similar implementation to avoid creating method adapters for all the type value structs that implement the Stringer interface.

@deelawn
Copy link
Contributor Author

deelawn commented Nov 3, 2023

I took a look at how the %#v verb works in the go standard library's fmt package. It is able to prevent recursion by not following pointers once the depth of the nested values exceeds zero. In simpler terms, it will not follow any nested pointer values after following the first at the top level of the data structure. That being said, I'm not sure there is a way to effectively prevent this type of recursion from happening other than by using a solution similar to the current solution as it exists in this PR.

Reference:
https://cs.opensource.google/go/go/+/refs/tags/go1.21.3:src/fmt/print.go;l=919

@deelawn deelawn marked this pull request as ready for review November 3, 2023 22:13
@deelawn deelawn requested a review from jaekwon as a code owner November 3, 2023 22:13
@thehowl
Copy link
Member

thehowl commented Nov 23, 2023

There is one way we could possibly make this more efficient.

Instead of using a map, use a []Value, with all the other values encounted during iteration. If at the beginning of each iteration in the recursive function, the value is found, then it is not printed and we avoid the stack overflow.

Of course, this is on the assumption that small maps, initialized to some good amount (ie. make([]Value, 0, 16)) should be more efficient for this case than maps, but if possible I would like you to try doing a micro-benchmark for this.

@deelawn
Copy link
Contributor Author

deelawn commented Nov 24, 2023

I can try it, but I want to understand your thought before I do. Where would efficiency be improved using a slice instead of a map? The map guarantees O(1) look ups so it seems like it would always be much faster to check if it has been encountered before rather than iterating over a slice.

@thehowl
Copy link
Member

thehowl commented Nov 24, 2023

I can try it, but I want to understand your thought before I do. Where would efficiency be improved using a slice instead of a map? The map guarantees O(1) look ups so it seems like it would always be much faster to check if it has been encountered before rather than iterating over a slice.

It's the usual: big O is useful to understand how an algorithm scales, but it is useless as an algorithm performance metric itself.

Consider the following micro benchmark:

package x

import (
	"slices"
	"testing"
)

func BenchmarkSlice(b *testing.B) {
	for i := 0; i < b.N; i++ {
		s := make([]int, 0, nItems)
		for j := 1; j <= nItems; j++ {
			s = append(s, j*2)
		}

		for j := 0; j < (nItems * 2); j++ {
			_ = slices.Contains(s, j)
		}
	}
}

func BenchmarkMap(b *testing.B) {
	for i := 0; i < b.N; i++ {
		s := make(map[int]struct{}, nItems)
		for j := 1; j <= nItems; j++ {
			s[j*2] = struct{}{}
		}

		for j := 0; j < (nItems * 2); j++ {
			_, ok := s[j]
			_ = ok
		}
	}
}

I am running the benchmark as follows, to test it with different values for nItems:

for i in $(seq 10 10 100); do echo "package x; const nItems = $i" > x.go; echo "benchmark with $i items:"; go test -bench=.; done
Here are the results on my machine
benchmark with 10 items:
goos: linux
goarch: amd64
pkg: itest
cpu: AMD Ryzen 7 PRO 4750U with Radeon Graphics
BenchmarkSlice-16       15068748                81.53 ns/op
BenchmarkMap-16          1709596               847.1 ns/op
PASS
ok      itest   3.479s
benchmark with 20 items:
goos: linux
goarch: amd64
pkg: itest
cpu: AMD Ryzen 7 PRO 4750U with Radeon Graphics
BenchmarkSlice-16        4675933               233.9 ns/op
BenchmarkMap-16           938896              1676 ns/op
PASS
ok      itest   2.954s
benchmark with 30 items:
goos: linux
goarch: amd64
pkg: itest
cpu: AMD Ryzen 7 PRO 4750U with Radeon Graphics
BenchmarkSlice-16        2402635               527.5 ns/op
BenchmarkMap-16           357722              2832 ns/op
PASS
ok      itest   2.829s
benchmark with 40 items:
goos: linux
goarch: amd64
pkg: itest
cpu: AMD Ryzen 7 PRO 4750U with Radeon Graphics
BenchmarkSlice-16        1337540               923.5 ns/op
BenchmarkMap-16           427443              3538 ns/op
PASS
ok      itest   3.347s
benchmark with 50 items:
goos: linux
goarch: amd64
pkg: itest
cpu: AMD Ryzen 7 PRO 4750U with Radeon Graphics
BenchmarkSlice-16         668139              1809 ns/op
BenchmarkMap-16           345056              5110 ns/op
PASS
ok      itest   3.523s
benchmark with 60 items:
goos: linux
goarch: amd64
pkg: itest
cpu: AMD Ryzen 7 PRO 4750U with Radeon Graphics
BenchmarkSlice-16         535975              2325 ns/op
BenchmarkMap-16           272246              5737 ns/op
PASS
ok      itest   3.418s
benchmark with 70 items:
goos: linux
goarch: amd64
pkg: itest
cpu: AMD Ryzen 7 PRO 4750U with Radeon Graphics
BenchmarkSlice-16         240764              5041 ns/op
BenchmarkMap-16           236133              6278 ns/op
PASS
ok      itest   2.808s
benchmark with 80 items:
goos: linux
goarch: amd64
pkg: itest
cpu: AMD Ryzen 7 PRO 4750U with Radeon Graphics
BenchmarkSlice-16         293373              3905 ns/op
BenchmarkMap-16           148330              8136 ns/op
PASS
ok      itest   2.484s
benchmark with 90 items:
goos: linux
goarch: amd64
pkg: itest
cpu: AMD Ryzen 7 PRO 4750U with Radeon Graphics
BenchmarkSlice-16         164458              7300 ns/op
BenchmarkMap-16           126627              8300 ns/op
PASS
ok      itest   2.431s
benchmark with 100 items:
goos: linux
goarch: amd64
pkg: itest
cpu: AMD Ryzen 7 PRO 4750U with Radeon Graphics
BenchmarkSlice-16         134197              8951 ns/op
BenchmarkMap-16           153223              9053 ns/op
PASS
ok      itest   3.767s

As you can see, they are not exactly perfect and scientific, but I think you can see the general trend: BenchmarkSlice starts off as being orders of magnitude more efficient than BenchmarkMap, but as the values grow it starts to catch up until it's about tied.

The key insight I'm trying to make: when you want to check for the "existance in a set", while maps will definitely be better than linear-search on a slice in the long run, they are not if you are initializing the values in a "fire and forget" manner and are working with relatively small arrays anyway.

If you create a slice with a reasonable cap at the beginning of ProtectedString (ie. if seen == nil, seen = make([]Value, 0, 16)) my theory is that it will likely be more performant than using the map, because 1. we skip the cost of initializing the map and its values 2. for most cases of ProtectedString, there will be at most one or two extra allocations to be done.

Finally, there is a case that we can implement better by using such a system rather than a map. Consider the following structure:

var i int = 42

type S struct { A, B *int }

var s S = &S{A: &i, B: &i}

From my understanding of how your code would work, the way that s would actually be printed, it would only print the value for s.A, because when it gets to printing s.B it has already been printed.

What we are looking for instead is to guard against recursive structures; but as you can see, s is not recursive.

Of course, regarding the performance concerns, I still invite you to do some benchmarks to confirm what I'm saying and possibly prove me wrong, but my hunch is that a slice will work better with the amount of values we are likely to see :)

To make it clear, the rough idea is that:

  1. We should have callers of String call ProtectedString passing make([]string, 0, 32) or similar (some decently-high value which should avoid us extra allocations in a majority of cases)
  2. We should call other ProtectedString methods by doing ProtectedString(append(seen, thisValue))

@deelawn
Copy link
Contributor Author

deelawn commented Nov 24, 2023

That's a great explanation. After reading it and doing some tests of my own I agree with you; it's unlikely that the depth of the stack trace will reach the point where using the map would be more performant. I'll make the necessary modifications.

@deelawn
Copy link
Contributor Author

deelawn commented Dec 4, 2023

@thehowl can you let me know what you think of the most recent changes when you have a chance?

@moul
Copy link
Member

moul commented Dec 5, 2023

Looks great, thank you @thehowl for the improvement suggestion and @deelawn for the new implementation.

There is a conflict introduced by a recent merge; @deelawn could you resolve it, please?

@deelawn
Copy link
Contributor Author

deelawn commented Dec 6, 2023

Resolved it. good to merge @moul

gnovm/pkg/gnolang/values_string.go Show resolved Hide resolved
gnovm/pkg/gnolang/values_string.go Outdated Show resolved Hide resolved
gnovm/pkg/gnolang/values_string.go Show resolved Hide resolved
gnovm/pkg/gnolang/values_string.go Outdated Show resolved Hide resolved
gnovm/pkg/gnolang/values_string.go Outdated Show resolved Hide resolved
@thehowl thehowl merged commit 29fd2ea into master Dec 21, 2023
186 checks passed
@thehowl thehowl deleted the bug/recursive-overflow branch December 21, 2023 17:46
gfanton pushed a commit to moul/gno that referenced this pull request Jan 18, 2024
…1315)

Addresses gnolang#1291. This is to fix the issue causing the entire interpreter
process to crash, not the underlying issue that is causing it to panic
in the first place.

---------

Co-authored-by: Morgan <morgan@morganbaz.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
📦 🤖 gnovm Issues or PRs gnovm related
Projects
Status: No status
Archived in project
Development

Successfully merging this pull request may close these issues.

panic: stack overflow when printing machine state with type conversion, embedded structs and loops
3 participants