ARROW-12145: [Developer][Archery] Flaky: test_static_runner_from_json #9843

dianaclarke · 2021-03-29T22:47:24Z

This test assumes:

 artificial_reg, normal = RunnerComparator(contender, baseline).comparisons

When the return order could be:

 normal, artificial_reg = RunnerComparator(contender, baseline).comparisons

The return order of comparisons isn't deterministic.

See: https://issues.apache.org/jira/browse/ARROW-12145

cyb70289 · 2021-03-30T05:29:17Z

dev/archery/archery/tests/test_benchmarks.py

-    artificial_reg, normal = RunnerComparator(contender, baseline).comparisons
+    comparisons = list(RunnerComparator(contender, baseline).comparisons)
+
+    # can't assume return order


The return order is not deterministic because, internally, the benchmarks list is converted to a Python dict, which is unordered.
https://github.com/apache/arrow/blob/master/dev/archery/archery/benchmark/compare.py#L140-L145
The result of the first benchmark may be returned after the result of the second benchmark.

Starting from Python 3.7, dicts preserve insertion order:
https://docs.python.org/3.9/whatsnew/3.7.html

While dictionary order is preserved in Python 3.7+, set order isn't (to the best of my knowledge).

>>> foo = {'a': 1, 'b': 2, 'c': 3}
>>> bar = {'a': 1, 'b': 2, 'c': 3}
>>> type(foo.keys() & bar.keys())
<class 'set'>

I suspect it's the set usage here that makes the resulting order non-deterministic.

def pairwise_compare(contender, baseline):
    dict_contender = {e.name: e for e in contender}
    dict_baseline = {e.name: e for e in baseline}

    for name in (dict_contender.keys() & dict_baseline.keys()):  # <------ for name in set
        yield name, (dict_contender[name], dict_baseline[name])

I see, thanks.
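
For illustration only, here is a minimal sketch of an order-preserving variant of the function quoted above. It is not the change made in this PR, and `Bench` and `pairwise_compare_ordered` are hypothetical names. It assumes Python 3.7+, where dicts preserve insertion order, and iterates over the contender's dict instead of a set intersection.

```python
from collections import namedtuple

# Hypothetical stand-in for a benchmark object; only `name` matters here.
Bench = namedtuple("Bench", ["name", "value"])


def pairwise_compare_ordered(contender, baseline):
    """Yield (name, (contender, baseline)) pairs in contender order."""
    dict_contender = {e.name: e for e in contender}
    dict_baseline = {e.name: e for e in baseline}

    # Iterate the insertion-ordered dict rather than a set intersection,
    # keeping only names present on both sides.
    for name, bench in dict_contender.items():
        if name in dict_baseline:
            yield name, (bench, dict_baseline[name])


contender = [Bench("b", 1.0), Bench("a", 2.0), Bench("c", 3.0)]
baseline = [Bench("a", 2.0), Bench("b", 2.0)]
names = [name for name, _ in pairwise_compare_ordered(contender, baseline)]
print(names)  # ['b', 'a']
```

The trade-off is that the output order now follows the contender's input order, which a caller would have to rely on explicitly.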

cyb70289 · 2021-03-30T05:36:46Z

dev/archery/archery/tests/test_benchmarks.py

+    artificial, unchanged = comparisons[0], comparisons[1]
+    if comparisons[0].name == "FloatParsing<FloatType>":
+        artificial, unchanged = comparisons[1], comparisons[0]


This code is a bit strange and hard to reason about.
I would recommend changing the test suite to contain only one benchmark.
First test the non-regression case. Then introduce artificial regression and test the regression case.

github-actions · 2021-03-30T08:07:06Z

https://issues.apache.org/jira/browse/ARROW-12145

pitrou · 2021-03-30T12:39:29Z

dev/archery/archery/tests/test_benchmarks.py

@@ -94,10 +94,16 @@ def test_static_runner_from_json():
    archery_result['suites'][0]['benchmarks'][0]['values'][0] *= 2
    baseline = StaticBenchmarkRunner.from_json(json.dumps(archery_result))

-    artificial_reg, normal = RunnerComparator(contender, baseline).comparisons
+    comparisons = list(RunnerComparator(contender, baseline).comparisons)


Why not explicitly sort the results? Assuming the benchmark name is available somewhere in the comparisons objects, this would allow making the ordering deterministic.
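
A rough sketch of the sorting idea, assuming each comparison exposes a `name` attribute (the diff above reads `comparisons[0].name`). `Comparison`, `sorted_by_name`, and the `FloatParsing<DoubleType>` benchmark name are hypothetical and used only for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a comparison result; the real objects in
# archery may expose different attributes.
Comparison = namedtuple("Comparison", ["name", "regression"])


def sorted_by_name(comparisons):
    """Return the comparisons as a list ordered by benchmark name."""
    return sorted(comparisons, key=lambda c: c.name)


# A set stands in for a source with no guaranteed iteration order.
unordered = {
    Comparison("FloatParsing<FloatType>", False),
    Comparison("FloatParsing<DoubleType>", True),
}

# After sorting, unpacking by position is safe.
first, second = sorted_by_name(unordered)
print(first.name, second.name)
```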

dianaclarke · 2021-03-30T13:41:44Z

Thanks for the reviews, folks.

I decided to just split the test in two:

test_static_runner_from_json_not_a_regression
test_static_runner_from_json_regression
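
The shape of that split might look roughly like the sketch below. It does not use archery: `is_regression` and its `threshold` parameter are hypothetical stand-ins, and the real tests build their inputs with `StaticBenchmarkRunner.from_json`. The point is that each test inspects a single result, so ordering no longer matters.

```python
def is_regression(contender_time, baseline_time, threshold=0.05):
    """Hypothetical check: True if the contender is slower than the
    baseline by more than `threshold` (a relative fraction)."""
    change = (contender_time - baseline_time) / baseline_time
    return change > threshold


def test_not_a_regression():
    # Identical measurements: there is only one result to inspect.
    assert not is_regression(contender_time=1.0, baseline_time=1.0)


def test_regression():
    # Artificially make the contender twice as slow as the baseline.
    assert is_regression(contender_time=2.0, baseline_time=1.0)


test_not_a_regression()
test_regression()
```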

cyb70289 · 2021-03-31T01:39:15Z

Thanks @dianaclarke

This test assumes:

```
artificial_reg, normal = RunnerComparator(contender, baseline).comparisons
```

When the return order could be:

```
normal, artificial_reg = RunnerComparator(contender, baseline).comparisons
```

The return order of `comparisons` isn't deterministic.

See: https://issues.apache.org/jira/browse/ARROW-12145

Closes apache#9843 from dianaclarke/ARROW-12145

Authored-by: Diana Clarke <diana.joan.clarke@gmail.com>
Signed-off-by: Yibo Cai <yibo.cai@arm.com>

ARROW-12145: [Developer][Archery] Flaky test: test_static_runner_from…

c0a12b2

…_json

cyb70289 requested changes Mar 30, 2021

pitrou reviewed Mar 30, 2021

ARROW-12145: [Developer][Archery] Flaky test: test_static_runner_from…

961edb1

…_json

cyb70289 approved these changes Mar 31, 2021

cyb70289 closed this in a542e52 Mar 31, 2021

asfimport mentioned this pull request Mar 31, 2021

[Developer][Archery] Flaky test: test_static_runner_from_json #27967

Closed
