-
Notifications
You must be signed in to change notification settings - Fork 204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
struct: store members as starlark.StringDict for faster access #401
Conversation
Wraps starlark.StringDict for storing fields. Improves the performance on the `Attr` method for accessing fields. Benchmark of changes to field storage. goos: linux goarch: amd64 pkg: go.starlark.net/starlarkstruct cpu: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz OLD: BenchmarkAttr_4-4 44931932 24.03 ns/op BenchmarkAttr_8-4 39342510 28.64 ns/op BenchmarkAttr_16-4 33596271 32.18 ns/op BenchmarkAttr_32-4 30505212 37.62 ns/op BenchmarkAttr_64-4 24424558 43.33 ns/op BenchmarkAttr_128-4 17947136 64.29 ns/op NEW: BenchmarkAttr_4-4 48772784 21.45 ns/op BenchmarkAttr_8-4 34554286 30.95 ns/op BenchmarkAttr_16-4 56185027 19.59 ns/op BenchmarkAttr_32-4 58317289 19.60 ns/op BenchmarkAttr_64-4 58073444 20.30 ns/op BenchmarkAttr_128-4 52069291 21.30 ns/op
starlarkstruct/struct.go
Outdated
@@ -268,10 +237,13 @@ func structsEqual(x, y *Struct, depth int) (bool, error) { | |||
return false, nil | |||
} | |||
|
|||
for i, n := 0, x.len(); i < n; i++ { | |||
if x.entries[i].name != y.entries[i].name { | |||
for k, xv := range x.members { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loop is now nondeterministic because EqualDepth may fail before or after an unequal field comparison, depending on iteration order. I don't see any way to make it deterministic other than by iterating over Keys(), which could increase the cost of comparison quite a lot, since it allocates.
Perhaps you could add benchmarks of these operations to testdata/benchmark.star?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right! Forgot about the comparison case. Not sure where the benchmarking setup code should live from starlark to give access to starlarkstruct packages so instead I've added a similar benchmark test under struct_test.go:
Current non-deterministic:
BenchmarkEqual_4-4 5617100 203.9 ns/op 0 B/op 0 allocs/op
BenchmarkEqual_8-4 2535235 469.3 ns/op 0 B/op 0 allocs/op
BenchmarkEqual_16-4 1643161 734.6 ns/op 0 B/op 0 allocs/op
BenchmarkEqual_32-4 743796 1568 ns/op 0 B/op 0 allocs/op
BenchmarkEqual_64-4 334785 3319 ns/op 0 B/op 0 allocs/op
BenchmarkEqual_128-4 150016 7613 ns/op 0 B/op 0 allocs/op
Fixed with range over members.Keys()
:
BenchmarkEqual_4-4 2556396 470.8 ns/op 88 B/op 2 allocs/op
BenchmarkEqual_8-4 1000000 1001 ns/op 152 B/op 2 allocs/op
BenchmarkEqual_16-4 711522 1753 ns/op 280 B/op 2 allocs/op
BenchmarkEqual_32-4 274262 3979 ns/op 536 B/op 2 allocs/op
BenchmarkEqual_64-4 121689 9411 ns/op 1048 B/op 2 allocs/op
BenchmarkEqual_128-4 51733 22085 ns/op 2072 B/op 2 allocs/op
Optimized with setErr
func to report the first error by key:
BenchmarkEqual_4-4 5697518 205.8 ns/op 0 B/op 0 allocs/op
BenchmarkEqual_8-4 2513456 477.0 ns/op 0 B/op 0 allocs/op
BenchmarkEqual_16-4 1560715 741.0 ns/op 0 B/op 0 allocs/op
BenchmarkEqual_32-4 725080 1535 ns/op 0 B/op 0 allocs/op
BenchmarkEqual_64-4 354627 3426 ns/op 0 B/op 0 allocs/op
BenchmarkEqual_128-4 148989 7502 ns/op 0 B/op 0 allocs/op
Original:
BenchmarkEqual_4-4 11487849 99.35 ns/op 0 B/op 0 allocs/op
BenchmarkEqual_8-4 6423819 185.7 ns/op 0 B/op 0 allocs/op
BenchmarkEqual_16-4 3405678 350.4 ns/op 0 B/op 0 allocs/op
BenchmarkEqual_32-4 1758501 679.3 ns/op 0 B/op 0 allocs/op
BenchmarkEqual_64-4 880608 1345 ns/op 0 B/op 0 allocs/op
BenchmarkEqual_128-4 441919 2662 ns/op 0 B/op 0 allocs/op
I tried to optimize the comparison to avoid the allocation. So now looking at least 2-3x worse with the optimized case. I've just tested happy path on equal values. Non equal comparisons have to loop through all keys now so will be affected more. I thought Attr
was the most important method but thats only from my use case.
Looked at how starlark.Dict internally does it and found a similar optimisation: #402 |
return false, nil | ||
} else if eq, err := starlark.EqualDepth(x.entries[i].value, y.entries[i].value, depth-1); err != nil { | ||
return false, err | ||
// Loop over every key returning the lowest error by key, if any. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loop must now visit every pair of elements even when the first pair differs. That's a regression. I'm not sure StringDict really makes sense as a representation for struct. The requirements are that:
- struct is compact.
- struct.f is fast.
- str(struct) uses field-name order.
- hash(struct) uses field-name order (or is order-independent).
- struct==struct compares fields in name order.
The previous representation using a name-ordered k/v array was better on all requirements except 2, and even then the advantage is only material when N>=16, which is an unusually large struct in a typical Starlark program.
I suspect the access patterns of Starlark fields in Attr are highly temporally localized: the most likely field name to be accessed is the previous one. If that is true (and it would be worth measuring), it might be possible to outperform both the existing implementation and the hash table by caching the resulting index of the binary search in a field in the struct (without a mutex!) and using it as a hint next time.
func FromStringDict() {
...
s := &Struct{
constructor: constructor,
hint: maxuint32,
}
}
func (s *Struct) Attr(name string) (starlark.Value, error) {
if hint := s.hint; hint < n && s.entries[hint].name == name {
return ... // yay the hint worked
}
// Binary search the entries.
for i < j { ...binary search...}
if i < n && s.entries[i].name == name {
s.hint = i // save the hint
return s.entries[i].value, nil
}
...
}
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I think the original binary search implementation is better. Hint idea sounds good, I'll have a look at how to validate it. Please let me know if you'd like to keep the benchmarking code and I could modify this PR to just include that? Otherwise please close! Thanks for taking the time to review.
I did try a solution based on the starlark.hashtable
here: emcfarlane#1 . It does look like it could beat both implementations but requires modifying the original hashtable, and cloning it out of the package. Just an experiment.
Closing for #403 |
Wraps starlark.StringDict for storing fields.
Improves the performance on the
Attr
method for accessing fields.I added a benchmark to check the happy path for access to a fields value.
OLD:
BenchmarkAttr_4-4 44931932 24.03 ns/op
BenchmarkAttr_8-4 39342510 28.64 ns/op
BenchmarkAttr_16-4 33596271 32.18 ns/op
BenchmarkAttr_32-4 30505212 37.62 ns/op
BenchmarkAttr_64-4 24424558 43.33 ns/op
BenchmarkAttr_128-4 17947136 64.29 ns/op
NEW:
BenchmarkAttr_4-4 48772784 21.45 ns/op
BenchmarkAttr_8-4 34554286 30.95 ns/op
BenchmarkAttr_16-4 56185027 19.59 ns/op
BenchmarkAttr_32-4 58317289 19.60 ns/op
BenchmarkAttr_64-4 58073444 20.30 ns/op
BenchmarkAttr_128-4 52069291 21.30 ns/op