Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests preset pass manager API #2519

Merged
merged 14 commits into from Jul 12, 2019
Merged

Tests preset pass manager API #2519

merged 14 commits into from Jul 12, 2019

Conversation

1ucian0
Copy link
Member

@1ucian0 1ucian0 commented May 27, 2019

Originally posted by @rafamartinc in #2510 (comment)

Test should be obvious and easy to read. It's okey to repeat code in tests if that makes it more clear. Reading the body of a test should self-explain what is the goal of the test. And that goal should be one single and simple check.

Let's take a test in test/python/transpiler/test_preset_passmanagers.py for example:

    def test_optimization_level(self):
        """Test several backends with all optimization levels"""
        for backend in [FakeTenerife(), FakeMelbourne(), FakeRueschlikon(), FakeTokyo()]:
            for optimization_level in range(4):
                result = transpile(
                    [self._circuit],
                    backend=backend,
                    optimization_level=optimization_level
                )
                self.assertIsInstance(result, QuantumCircuit)

This test tests several fake backends with several levels. A single fail would the full test to stop. Additionally, there is no way to run a single of those combinations. It's a kind of all or nothing thing. But more importantly, it can hide issues that make it hard to debug. For example, if one of the transpile execution modified self._circuit, the next transpile would run with that circuit. So the tests depeden on each other! That's not good.

I advocate for explicit and not-in-a-loop test. They will help us in the future...

@yaelbh
Copy link
Contributor

yaelbh commented May 27, 2019

I'm not sure that test readability is in trade-off with code duplication. The issues that @rafamartinc raises are important, but in my opinion effort should be made to resolve them without duplicating code. For example, one can make the loop iterate over a list of pairs (backend, optiomization_level), where the list can be controlled by the person who runs the test. The setting of the circuit can be moved inside the loop. Not sure if this is a good solution, but it's only an example. Hopefully a good solution will be easy to read not despite but because it refrains from code duplication.

In Aer we have many code duplications in tests, with this rationale of readability and easy debug. As a result, when moved to Terra 0.8, we had to replace almost 200 instances of compile:
Qiskit/qiskit-aer@826bff9

@yehuda-naveh
Copy link

Underscoring Yael's point - it's not just the tedious work of editing 200 places for each change. It is a functional issue. If you need to introduce a change in a test, e.g., because the test failed, or for compatibility, or to enhance the test with additional functionality - it is likely that you would not always repeat the changes in the 200 instances. E.g., because doing a fast fix only to a test that failed and blocks you, or only to a test you are sure what you are doing. So that's a source of a bug in the other tests which did not get the fix. More than that, each of the tests may diverge a bit because it covers different functionality, e.g., for Tenerife someone may add an additional check which is not relevant to Melbourne. So far so good, and you could hardly stop this practice by methodology alone. But then when you need a global change to all tests because of e.g. compatibility with new version, the change needs to be implemented differently in each of the now divergent tests. Again - a sure way to introduce bugs and reduce test functionality. Just as with any other duplication in any part of the code.

So from my point of view - totally agree with Yael, need to have simple readable tests with simple to read loops and functions, clear comments, and no duplication.

And on a related note - need to make the tests random to dramatically increase coverage - especially in exponential or very large spaces, which we have a lot. Once used to it, it is simple to write random tests which are still highly readable and very clear in their intention and functionality

@1ucian0
Copy link
Member Author

1ucian0 commented May 28, 2019

For example, one can make the loop iterate over a list of pairs (backend, optiomization_level)

Can you suggest a snippet? Would I have the possibility to run a single backend/level combination?

As a result, when moved to Terra 0.8, we had to replace almost 200 instances of compile.

That's what refactor tools are for :) API changes have that kind of problems.

So from my point of view - totally agree with Yael, need to have simple readable tests with simple to read loops and functions, clear comments, and no duplication.

I have no problems with loops in the assertions, but not in the functionality to check (like in this PR). https://mtlynch.io/good-developers-bad-tests/#dare-to-violate-dry

need to make the tests random to dramatically increase coverage

You can join the discussion in #205 about that. We are moving towards having out-of-band fuzzing for that. Having non-deterministic tests in the suite is, for me, a strong and big no.

@diego-plan9
Copy link
Member

Perhaps .subTest() offers a better trade-off?

@delapuente
Copy link
Contributor

Actually, @diego-plan9 solution is the way we follow in other places and should be the path to go.

@yaelbh
Copy link
Contributor

yaelbh commented May 28, 2019

I don't think you should count on refactoring tools because as @yehuda-naveh mentioned over time there are divergences between the tests, which force you to manually refactor each instance. Overall @diego-plan9's subtest suggestion indeed looks like what we are looking for here, so the snippet is not needed anymore. If subtest turns out to be irrelevant, the point that I'm trying to convey is that effort should be made to avoid duplication. It is worth spending some time on searching an alternative before reaching the conclusion that there is no other choice.

@1ucian0
Copy link
Member Author

1ucian0 commented May 28, 2019

The option subtest solves the "stop after first fail" issue [1], but it is not possible to call an individual subtest. The reason for asking a snippet is because I think they are also unnecessarily obscure to read:

    def test_all_optimization_levels_and_all_backends(self):
        for backend in [FakeTenerife(), FakeMelbourne(), FakeRueschlikon(), FakeTokyo()]:
            for optimization_level in range(4):
                with self.subTest(optimization_level=optimization_level, backend=backend):
                    result = transpile(self.create_circuit(),
                                       backend=backend,
                                       optimization_level=optimization_level)
                    self.assertIsInstance(result, QuantumCircuit)

Lack of readability might hide problems like a possible test dependency in case of mutable circuit parameter (as the current test_no_coupling_map shows).

So, I do prefer readability over "code duplication" (where the code is literally the line to test, transpile in this case). What's your suggestion for the alternative, @yaelbh?

[1] Kinda. Notice that a problem in a single backend constructor will stop the full test. But backend construction is not what this test is testing.

@yaelbh
Copy link
Contributor

yaelbh commented May 29, 2019

I'm not knowledgeable enough about the test and the code that it tests. Just wanted to share our experience with Aer tests.

@1ucian0 1ucian0 changed the title "compacted" tests are not good practice Tests preset pass manager API May 31, 2019
@1ucian0
Copy link
Member Author

1ucian0 commented May 31, 2019

I pushed a middle ground. Where I just split the four preset pass managers we have. This file is for testing the API.

@ajavadia
Copy link
Member

ajavadia commented Jun 2, 2019

I think this latest commit is an improvement here, but I'm still not completely on board with splitting the tests, even at this limited level. It would make sense to do it like this, if and when we are testing different things for each of the optimization levels. That may happen in the future, but this is premature right now in my opinion and is just code duplication with no clear added value.

@1ucian0
Copy link
Member Author

1ucian0 commented Jun 4, 2019

Some issues refer to specific optimizations levels. They cover significantly different parts of the code.

@1ucian0
Copy link
Member Author

1ucian0 commented Jun 4, 2019

If we are going to give away simplicity, at least let's continue having independent calling. What do you think about https://github.com/1ucian0/qiskit-terra/blob/2532/test/python/transpiler/test_preset_passmanagers.py ?

@1ucian0
Copy link
Member Author

1ucian0 commented Jun 7, 2019

This PR introduces ddt, as PR #2559. Consensus about ddt is still pending. So I'm putting this one on hold.

@1ucian0 1ucian0 added the on hold Can not fix yet label Jun 7, 2019
@1ucian0
Copy link
Member Author

1ucian0 commented Jun 9, 2019

The discussion about ddt is happening here #2589

@1ucian0
Copy link
Member Author

1ucian0 commented Jun 10, 2019

#2589 was resolved and ddt was accepted. Removing on hold.

@1ucian0 1ucian0 removed the on hold Can not fix yet label Jun 10, 2019
@1ucian0 1ucian0 requested a review from lcapelluto as a code owner July 12, 2019 12:35
@1ucian0
Copy link
Member Author

1ucian0 commented Jul 12, 2019

The test were added in #2563, this PR just have some tiny renames left.

@kdk kdk merged commit 556fb35 into Qiskit:master Jul 12, 2019
lia-approves pushed a commit to edasgupta/qiskit-terra that referenced this pull request Jul 30, 2019
* expand tests

* compress back

* a new approach

* docstring

* style
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants