Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-12145: [Developer][Archery] Flaky: test_static_runner_from_json #9843

Closed
wants to merge 2 commits into from

Conversation

dianaclarke
Copy link
Contributor

@dianaclarke dianaclarke commented Mar 29, 2021

This test assumes:

 artificial_reg, normal = RunnerComparator(contender, baseline).comparisons

When the return order could be:

 normal, artificial_reg = RunnerComparator(contender, baseline).comparisons

The return order of comparisons isn't deterministic.

See: https://issues.apache.org/jira/browse/ARROW-12145

artificial_reg, normal = RunnerComparator(contender, baseline).comparisons
comparisons = list(RunnerComparator(contender, baseline).comparisons)

# can't assume return order
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return order is not determined as internally, benchmarks list is changed to pythons dict which is unordered.
https://github.com/apache/arrow/blob/master/dev/archery/archery/benchmark/compare.py#L140-L145
Result of the first benchmark may return after the result of the second benchmark.

Copy link
Member

@pitrou pitrou Mar 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Starting from Python 3.7, dicts preserve insertion order:
https://docs.python.org/3.9/whatsnew/3.7.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While dictionary order is preserved in Python 3, set order isn't (to the best of my knowledge).

>>> foo = {'a': 1, 'b': 2, 'c': 3}
>>> bar = {'a': 1, 'b': 2, 'c': 3}
>>> type(foo.keys() & bar.keys())
<class 'set'>

I suspect it's the set usage in here, that makes the resulting order non-deterministic.

def pairwise_compare(contender, baseline):
    dict_contender = {e.name: e for e in contender}
    dict_baseline = {e.name: e for e in baseline}

    for name in (dict_contender.keys() & dict_baseline.keys()):      # <------ for name in set
        yield name, (dict_contender[name], dict_baseline[name])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks.

Comment on lines 100 to 102
artificial, unchanged = comparisons[0], comparisons[1]
if comparisons[0].name == "FloatParsing<FloatType>":
artificial, unchanged = comparisons[1], comparisons[0]
Copy link
Contributor

@cyb70289 cyb70289 Mar 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is a bit strange and hard to reason.
I would recommend changing the test suite to contain only one benchmark.
First test the non-regression case. Then introduce artificial regression and test the regression case.

@github-actions
Copy link

@@ -94,10 +94,16 @@ def test_static_runner_from_json():
archery_result['suites'][0]['benchmarks'][0]['values'][0] *= 2
baseline = StaticBenchmarkRunner.from_json(json.dumps(archery_result))

artificial_reg, normal = RunnerComparator(contender, baseline).comparisons
comparisons = list(RunnerComparator(contender, baseline).comparisons)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not explicitly sort the results? Assuming the benchmark name is available somewhere in the comparisons objects, this would allow making the ordering deterministic.

@dianaclarke
Copy link
Contributor Author

Thanks for the reviews, folks.

I decided to just split the test in two:

  • test_static_runner_from_json_not_a_regression
  • test_static_runner_from_json_regression

@cyb70289 cyb70289 closed this in a542e52 Mar 31, 2021
@cyb70289
Copy link
Contributor

Thanks @dianaclarke

GeorgeAp pushed a commit to sirensolutions/arrow that referenced this pull request Jun 7, 2021
This test assumes:

```
 artificial_reg, normal = RunnerComparator(contender, baseline).comparisons
 ```

When the return order could be:

```
 normal, artificial_reg = RunnerComparator(contender, baseline).comparisons
```

The return order of `comparisons` isn't deterministic.

See: https://issues.apache.org/jira/browse/ARROW-12145

Closes apache#9843 from dianaclarke/ARROW-12145

Authored-by: Diana Clarke <diana.joan.clarke@gmail.com>
Signed-off-by: Yibo Cai <yibo.cai@arm.com>
michalursa pushed a commit to michalursa/arrow that referenced this pull request Jun 10, 2021
This test assumes:

```
 artificial_reg, normal = RunnerComparator(contender, baseline).comparisons
 ```

When the return order could be:

```
 normal, artificial_reg = RunnerComparator(contender, baseline).comparisons
```

The return order of `comparisons` isn't deterministic.

See: https://issues.apache.org/jira/browse/ARROW-12145

Closes apache#9843 from dianaclarke/ARROW-12145

Authored-by: Diana Clarke <diana.joan.clarke@gmail.com>
Signed-off-by: Yibo Cai <yibo.cai@arm.com>
michalursa pushed a commit to michalursa/arrow that referenced this pull request Jun 13, 2021
This test assumes:

```
 artificial_reg, normal = RunnerComparator(contender, baseline).comparisons
 ```

When the return order could be:

```
 normal, artificial_reg = RunnerComparator(contender, baseline).comparisons
```

The return order of `comparisons` isn't deterministic.

See: https://issues.apache.org/jira/browse/ARROW-12145

Closes apache#9843 from dianaclarke/ARROW-12145

Authored-by: Diana Clarke <diana.joan.clarke@gmail.com>
Signed-off-by: Yibo Cai <yibo.cai@arm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants