Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ast/term: don't sort an object's keys slice in-place #3823

Merged
merged 7 commits into from
Sep 27, 2021

Conversation

srenatus
Copy link
Contributor

@srenatus srenatus commented Sep 24, 2021

Two different commits, two different approaches to fix the underlying problem that during evaluation, we may sort an object's key slice; and that could happen within the function passed to the object's Iter() method, causing the weird situation that on subsequent invocations of the passed function, the same key would be presented.

This example shows how it's possible, and I believe that this is what happened in #3819.

The partial set memo work that seemed to be the cause of the problem must have changed one thing: before, the document under data.external["test.target"] would not have been Compare'd to anything, and afterwards, it would. (I haven't uncovered the complete code flow yet...)


This was introduced in 2381824, and there's a TODO comment hinting at this being not ideal. I suppose we want to contemplate which way to go here (squash and merge, or drop the second commit doing the sort at insertion-time).


Fixes #3819.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
This is an alternative to the previous commit: instead of sorting
on each call including Compare() or String(), we'll sort the key
slice on insertion of a new value.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
"foo": "bar",
"fizz": "buzz",
}
globals = {"fizz": "buzz", "foo": "bar"}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm not sure what to make of this one. I hadn't expected changes to how source files are formatted coming from this 🤔

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
@srenatus srenatus marked this pull request as ready for review September 24, 2021 14:19
As observed in the original issue, the failure does not happen on every
run. To observe the added test case _failing_ on branch main, use a test
call like

    go test -v ./topdown -run TestRego/partialsetdoc:_object_sort_while_iter -count=10 -v

The added `sort_bindings: true` is not required for the topdown eval,
but when executing this on the wasm engine, the (unspecified, non-
guaranteed) order is reversed.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
@srenatus
Copy link
Contributor Author

srenatus commented Sep 27, 2021

Running some benchmarks with this:

As is to be expected, (obj).String() (which used to sort before printing) got faster, since it's got a lot less to do:

name                               old time/op    new time/op    delta
ObjectString/5/String()-16            893ns ± 7%     564ns ± 2%  -36.78%  (p=0.008 n=5+5)
ObjectString/50/String()-16          10.4µs ± 2%     4.6µs ± 5%  -55.76%  (p=0.008 n=5+5)
ObjectString/500/String()-16          170µs ± 3%      58µs ± 2%  -66.09%  (p=0.008 n=5+5)
ObjectString/5000/String()-16        2.40ms ± 6%    0.60ms ± 5%  -74.90%  (p=0.008 n=5+5)

name                               old alloc/op   new alloc/op   delta
ObjectString/5/String()-16             280B ± 0%      208B ± 0%  -25.71%  (p=0.008 n=5+5)
ObjectString/50/String()-16          2.26kB ± 0%    1.82kB ± 0%  -19.43%  (p=0.008 n=5+5)
ObjectString/500/String()-16         41.8kB ± 0%    37.6kB ± 0%   -9.87%  (p=0.008 n=5+5)
ObjectString/5000/String()-16         409kB ± 0%     368kB ± 0%  -10.01%  (p=0.008 n=5+5)

name                               old allocs/op  new allocs/op  delta
ObjectString/5/String()-16             21.0 ± 0%      19.0 ± 0%   -9.52%  (p=0.008 n=5+5)
ObjectString/50/String()-16             159 ± 0%       157 ± 0%   -1.26%  (p=0.008 n=5+5)
ObjectString/500/String()-16          1.52k ± 0%     1.51k ± 0%   -0.13%  (p=0.008 n=5+5)
ObjectString/5000/String()-16         11.0k ± 0%     11.0k ± 0%   -0.02%  (p=0.008 n=5+5)

This is "paid for" by extra work during object creation, like the added benchmark shows:

name                                         old time/op    new time/op    delta
ObjectConstruction/random_keys/5-16            3.23µs ±25%    3.44µs ±26%      ~     (p=0.247 n=10+10)
ObjectConstruction/random_keys/50-16           25.8µs ± 6%    30.1µs ± 9%   +16.65%  (p=0.000 n=10+10)
ObjectConstruction/random_keys/500-16           254µs ± 7%     338µs ± 7%   +33.06%  (p=0.000 n=10+10)
ObjectConstruction/random_keys/5000-16         2.91ms ± 5%    4.26ms ± 5%   +46.45%  (p=0.000 n=10+10)
ObjectConstruction/random_keys/50000-16        36.7ms ±10%    90.5ms ±13%  +146.74%  (p=0.000 n=10+10)
ObjectConstruction/increasing_keys/5-16        4.39µs ± 5%    4.31µs ± 3%      ~     (p=0.280 n=10+10)
ObjectConstruction/increasing_keys/50-16       39.7µs ±13%    45.3µs ± 5%   +14.16%  (p=0.001 n=10+10)
ObjectConstruction/increasing_keys/500-16       413µs ± 7%     525µs ± 1%   +27.12%  (p=0.000 n=10+8)
ObjectConstruction/increasing_keys/5000-16     5.00ms ± 2%    7.46ms ± 2%   +49.26%  (p=0.000 n=8+10)
ObjectConstruction/increasing_keys/50000-16    53.7ms ± 6%   175.2ms ± 5%  +226.41%  (p=0.000 n=10+10)

name                                         old alloc/op   new alloc/op   delta
ObjectConstruction/random_keys/5-16            2.51kB ±11%    2.58kB ±13%      ~     (p=0.534 n=10+10)
ObjectConstruction/random_keys/50-16           15.3kB ± 5%    15.3kB ± 9%      ~     (p=0.565 n=10+10)
ObjectConstruction/random_keys/500-16           137kB ± 2%     137kB ± 2%      ~     (p=0.631 n=10+10)
ObjectConstruction/random_keys/5000-16         1.39MB ± 0%    1.38MB ± 1%    -0.80%  (p=0.001 n=10+10)
ObjectConstruction/random_keys/50000-16        16.5MB ± 1%    16.5MB ± 1%      ~     (p=0.218 n=10+10)
ObjectConstruction/increasing_keys/5-16        2.97kB ± 0%    2.97kB ± 0%      ~     (all equal)
ObjectConstruction/increasing_keys/50-16       19.8kB ± 0%    19.8kB ± 0%      ~     (p=0.103 n=10+10)
ObjectConstruction/increasing_keys/500-16       244kB ± 0%     244kB ± 0%      ~     (p=0.305 n=10+10)
ObjectConstruction/increasing_keys/5000-16     2.24MB ± 0%    2.24MB ± 1%      ~     (p=0.393 n=10+10)
ObjectConstruction/increasing_keys/50000-16    23.4MB ± 1%    23.8MB ± 0%    +1.57%  (p=0.000 n=10+10)

name                                         old allocs/op  new allocs/op  delta
ObjectConstruction/random_keys/5-16              47.9 ±21%      50.6 ±25%      ~     (p=0.534 n=10+10)
ObjectConstruction/random_keys/50-16              375 ± 9%       374 ±14%      ~     (p=0.591 n=10+10)
ObjectConstruction/random_keys/500-16           3.51k ± 3%     3.49k ± 4%      ~     (p=0.529 n=10+10)
ObjectConstruction/random_keys/5000-16          34.9k ± 1%     34.9k ± 1%      ~     (p=0.684 n=10+10)
ObjectConstruction/random_keys/50000-16          349k ± 0%      349k ± 0%      ~     (p=0.796 n=10+10)
ObjectConstruction/increasing_keys/5-16          65.0 ± 0%      65.0 ± 0%      ~     (all equal)
ObjectConstruction/increasing_keys/50-16          556 ± 0%       556 ± 0%      ~     (all equal)
ObjectConstruction/increasing_keys/500-16       5.53k ± 0%     5.53k ± 0%      ~     (p=1.000 n=10+10)
ObjectConstruction/increasing_keys/5000-16      55.1k ± 0%     55.1k ± 0%      ~     (p=0.685 n=10+10)
ObjectConstruction/increasing_keys/50000-16      553k ± 0%      553k ± 0%      ~     (p=0.447 n=10+10)

Note that the "increasing keys" object created in the benchmark ({"1": 1, "2": 2, "3": 3, ... }) is particularly unsuited for the binary-search-to-find-right-spot approach used when ensuring that the keys slice is sorted on (Object).Insert().

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
Copy link
Member

@tsandall tsandall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@srenatus srenatus merged commit bff5399 into open-policy-agent:main Sep 27, 2021
@srenatus srenatus deleted the sr/issue-3819 branch September 27, 2021 16:36
srenatus added a commit to srenatus/opa that referenced this pull request Oct 4, 2021
Before, it depended on the elements being passed in ordered by their rows.
Before open-policy-agent#3823, the iteration
order was the same as the row order; but with sorting the keys slice (which
determines iteration order) on creation, that was changed.

Now, we'll sort the elements within `groupIterable`.

Fixes open-policy-agent#3849.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
srenatus added a commit that referenced this pull request Oct 4, 2021
Before, it depended on the elements being passed in ordered by their rows.
Before #3823, the iteration
order was the same as the row order; but with sorting the keys slice (which
determines iteration order) on creation, that was changed.

Now, we'll sort the elements within `groupIterable`.

Fixes #3849.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
srenatus added a commit to srenatus/opa that referenced this pull request Oct 4, 2021
Before, it depended on the elements being passed in ordered by their rows.
Before open-policy-agent#3823, the iteration
order was the same as the row order; but with sorting the keys slice (which
determines iteration order) on creation, that was changed.

Now, we'll sort the elements within `groupIterable`.

Fixes open-policy-agent#3849.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
srenatus added a commit that referenced this pull request Oct 4, 2021
* format: make groupIterable sort by row

Before, it depended on the elements being passed in ordered by their rows.
Before #3823, the iteration
order was the same as the row order; but with sorting the keys slice (which
determines iteration order) on creation, that was changed.

Now, we'll sort the elements within `groupIterable`.

Fixes #3849.

Also includes:

* ast/term_bench_test: fix benchmark

Since map iteration is randomized in golang, this benchmark didn't
actually measure what was intended, but rather the presence or ab-
sence of duplicate keys.

Now, we'll create a set of keys to insert before, and either shuffle
it or use it in its increasing order: to call `(Object).Insert()` in
a loop.

* CHANGELOG/capabilities: update for v0.33.1 bugfix release

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
dolevf pushed a commit to dolevf/opa that referenced this pull request Nov 4, 2021
…nt#3823)

* ast/term: sort-on-insert for object.keys slice

Instead of sorting on each call including Compare() or String(), we'll sort the key
slice on insertion of a new value.

* testcases: add case

As observed in the original issue, the failure does not happen on every
run. To observe the added test case _failing_ on branch main, use a test
call like

    go test -v ./topdown -run TestRego/partialsetdoc:_object_sort_while_iter -count=10 -v

The added `sort_bindings: true` is not required for the topdown eval,
but when executing this on the wasm engine, the (unspecified, non-
guaranteed) order is reversed.

Fixes open-policy-agent#3819

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
Signed-off-by: Dolev Farhi <farhi.dolev@gmail.com>
dolevf pushed a commit to dolevf/opa that referenced this pull request Nov 4, 2021
* format: make groupIterable sort by row

Before, it depended on the elements being passed in ordered by their rows.
Before open-policy-agent#3823, the iteration
order was the same as the row order; but with sorting the keys slice (which
determines iteration order) on creation, that was changed.

Now, we'll sort the elements within `groupIterable`.

Fixes open-policy-agent#3849.

Also includes:

* ast/term_bench_test: fix benchmark

Since map iteration is randomized in golang, this benchmark didn't
actually measure what was intended, but rather the presence or ab-
sence of duplicate keys.

Now, we'll create a set of keys to insert before, and either shuffle
it or use it in its increasing order: to call `(Object).Insert()` in
a loop.

* CHANGELOG/capabilities: update for v0.33.1 bugfix release

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
Signed-off-by: Dolev Farhi <farhi.dolev@gmail.com>
srenatus added a commit to srenatus/opa that referenced this pull request Nov 17, 2021
Before, we'd sort a set's `keys` slice in-place, and only when it
was actually compared to another set. This side-effect of a comparison
can be unexpected.

Now, we'll keep the `keys` slice sorted by inserting every new element
into its sorted position.

Analogous to open-policy-agent#3823, where we did this with objects' `keys` slices.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
srenatus added a commit that referenced this pull request Nov 17, 2021
Before, we'd sort a set's `keys` slice in-place, and only when it
was actually compared to another set. This side-effect of a comparison
can be unexpected.

Now, we'll keep the `keys` slice sorted by inserting every new element
into its sorted position.

Analogous to #3823, where we did this with objects' `keys` slices.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
floriangasc pushed a commit to floriangasc/opa that referenced this pull request Nov 22, 2021
Before, we'd sort a set's `keys` slice in-place, and only when it
was actually compared to another set. This side-effect of a comparison
can be unexpected.

Now, we'll keep the `keys` slice sorted by inserting every new element
into its sorted position.

Analogous to open-policy-agent#3823, where we did this with objects' `keys` slices.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
@philipaconrad philipaconrad mentioned this pull request Jul 1, 2022
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

intermittent incorrect results with OPA CLI
2 participants