Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vm: Use a preallocated fixed call frame for typical argument sizes #57

Closed
wants to merge 2 commits into from

Conversation

deanm
Copy link
Contributor

@deanm deanm commented Aug 19, 2019

For calls with 20 or less arguments use a preallocated fixed array
for the calling frame, otherwise fall back to heap allocation. This
avoids what was previously always a heap allocation at callsites.

Before:
Benchmark_Calls-4 50000 27618 ns/op

After:
Benchmark_Calls-4 50000 25948 ns/op

For calls with 20 or less arguments use a preallocated fixed array
for the calling frame, otherwise fall back to heap allocation. This
avoids what was previously always a heap allocation at callsites.

Before:
Benchmark_Calls-4        50000       27618 ns/op

After:
Benchmark_Calls-4        50000       25948 ns/op
Adds a bench mark that shows this something around a 30% difference.

Benchmark_NewVMs-4       3000000         431 ns/op
Benchmark_ReuseVMs-4     5000000         302 ns/op
@XCapsule
Copy link

approve

@savsgio
Copy link

savsgio commented Apr 13, 2020

@deanm @antonmedv any progress or news?

@antonmedv
Copy link
Member

@savsgio A little bit old PR. Looks this PR needs to be rebased. What is your use case? I think even without VM reuse Exor Is pretty fast.

// This is possible because the vm never calls back into itself / nests calls
// (expr code cannot call into a function defined in expr code), so the
// callstack is only ever a single frame.
var fixedFrame20 [20]reflect.Value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does it affect expressions without calls at all? We gonna allocate frames for every VM run. Even there is no calls. Mabe this part should be separated in another PR? And added compile time detection if it's needed to allocate fixedFrame20. Also is there any performance boost from it?

@savsgio
Copy link

savsgio commented Apr 13, 2020

@savsgio A little bit old PR. Looks this PR needs to be rebased. What is your use case? I think even without VM reuse Exor Is pretty fast.

I need to reuse the same VM in multiples executions. Now i need to evaluate some rules per each request so if i use vm.Run(...) that's generate one memory allocation (the VM generation), that reduce the performance significantly.

I tried to use the following example, but from second execution, always returns nil

rule, _ := expr.Compile("'foo' == 'foo'")
myVM := vm.NewVM(false)

fmt.Println(myVM.Run(rule, nil)) // true
fmt.Println(myVM.Run(rule, nil)) // nil
// the next n executions always returns nil
``

@savsgio
Copy link

savsgio commented Apr 13, 2020

Reducing this allocation, will increase the performance.

Benchmark_Expr-4   	12952581	        92.7 ns/op	      32 B/op	       1 allocs/op

The code of this benchmark:

func Benchmark_Expr(b *testing.B) {
	foo, _ := expr.Compile("'foo' == 'foo'")

	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		vm.Run(foo, nil)
	}
}

@antonmedv
Copy link
Member

Okay, can you make new PR based on this?

@antonmedv
Copy link
Member

Did it here: #107

@antonmedv antonmedv closed this Apr 13, 2020
@savsgio
Copy link

savsgio commented Apr 13, 2020

Nice!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants