Shared testsuite: Skip unsupported ops tests #724
Conversation
Codecov Report
@@            Coverage Diff            @@
##           master     #724   +/-   ##
=======================================
  Coverage   95.46%   95.46%
=======================================
  Files         107      107
  Lines        6791     6791
=======================================
  Hits         6483     6483
  Misses        308      308
=======================================
Curious to hear feedback on whether or not this aligns with the aim of the shared test suite (might not require an entire code review).
Thanks @antalszava, looks great! I left some quick comments, but don't worry about implementing them all if it looks like it'll take a nontrivial amount of time.
pennylane/plugins/tests/conftest.py (Outdated)
@@ -79,6 +79,40 @@ def _skip_if(dev, capabilities):
    return _skip_if

@pytest.fixture(scope="session")
I never got comfortable with the scope. What does it mean in this setting to set the scope as "session"?
I believe the scope determines when the fixture is evaluated. E.g., scope="function" means that the fixture is re-evaluated for each test function (almost like a 'setUp'), whereas scope="session" means that the fixture is evaluated when pytest first starts, and this value is used for every test function.
Setting a larger scope can be important if the fixture is time-consuming to evaluate. On the other hand, a smaller scope might be required if the test functions mutate/affect the logic within the fixture.
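The caching behaviour can be sketched without pytest's real machinery; this toy cache (all names here are hypothetical, for illustration only) mimics how a session-scoped fixture is evaluated once and reused, while a function-scoped one is re-evaluated per test:

```python
# Toy sketch of fixture scoping semantics (not pytest's actual implementation;
# FixtureCache, expensive_setup, etc. are hypothetical names).
class FixtureCache:
    def __init__(self):
        self.session_cache = {}

    def get(self, name, factory, scope):
        if scope == "session":
            # Evaluated at most once per run, then the cached value is reused.
            if name not in self.session_cache:
                self.session_cache[name] = factory()
            return self.session_cache[name]
        # scope == "function": re-evaluated on every request.
        return factory()

calls = {"n": 0}

def expensive_setup():
    calls["n"] += 1
    return "device-capabilities"

cache = FixtureCache()
for _ in range(3):  # three tests requesting the same fixture
    cache.get("caps", expensive_setup, scope="session")
print(calls["n"])  # 1: the session-scoped factory ran only once
```

With scope="function" the factory would run three times, once per requesting test.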
pennylane/plugins/tests/conftest.py (Outdated)
@@ -79,6 +79,40 @@ def _skip_if(dev, capabilities):
    return _skip_if

@pytest.fixture(scope="session")
def skip_if_ops():
Minor point: is this a clear name? Would something like skip_if_ops_unsupported be better? It's longer, but clearer (to me at least!)
pennylane/plugins/tests/conftest.py (Outdated)
def _skip_if_ops(dev, ops):
    """Skip test if device does not support an operation."""

    for op in ops:
Minor point: could we do if op.__name__ not in dev.operations here, to save all the __name__'s below?
pennylane/plugins/tests/conftest.py (Outdated)
@@ -79,6 +79,40 @@ def _skip_if(dev, capabilities):
    return _skip_if

@pytest.fixture(scope="session")
def skip_if_ops():
Minor point: could we pass the device fixture here?
This might be a good idea. If I recall, the other fixtures do not depend on the device fixture because they must be run before device creation. This fixture could potentially be evaluated after device creation, however.
I am not sure if this will work, or if the same problem occurs: the device is only created inside a test, because it may be created with different wire classes that are specified by the fixture.
@antalszava, perhaps it might make sense to automate this, rather than using a fixture that must be copied and pasted into every test.
For example, here is an approach where we create a pytest hook in conftest.py:
def pytest_runtest_makereport(item, call):
    from _pytest.runner import pytest_runtest_makereport as orig_pytest_runtest_makereport

    tr = orig_pytest_runtest_makereport(item, call)

    if "skip_unsupported" in item.keywords:
        if call.excinfo is not None:
            if call.excinfo.type == qml.DeviceError and "not supported on device" in str(call.excinfo.value):
                tr.outcome = "skipped"
                tr.wasxfail = "reason:" + str(call.excinfo.value)

    return tr
Now, if we decorate a test class or test function with @pytest.mark.skip_unsupported, any test that raises the unsupported operation/observable exception will be automatically skipped:
@pytest.mark.skip_unsupported
@flaky(max_runs=10)
class TestGatesQubit:
    """Test qubit-based devices' probability vector after application of gates."""
This has two advantages:
- It's easy to add this feature to whole test classes or test modules with one line.
- Since you don't have to hardcode into the test which operations to skip, it generalizes to decompositions. For example, AQT doesn't support the PauliX gate, but it does support a decomposition of PauliX into RX gates. Using the approach in the PR, the integration test would be skipped if PauliX not in dev.operations (which it isn't), but this test should still be able to run, since PL will perform the decomposition.
pennylane/plugins/tests/conftest.py (Outdated)
if op not in dev.operations:
    pytest.skip(
        "Test skipped for unsupported operation {} on {} device.".format(op, dev.name)
    )
This seems like a good place to use dev.supports_operation(operation), where operation can be either a string or the class type itself. E.g.,
>>> dev.supports_operation("PauliX")
True
>>> dev.supports_operation(qml.PauliX)
True
This also avoids the issue with calling __name__ below.
pennylane/plugins/tests/conftest.py
Outdated
|
||
for obs in observables: | ||
# skip if the observable is not supported by the device | ||
if obs not in dev.observables: |
Similarly, you can use dev.supports_observable(obs) here.
I like @josh146 's solution, would have never thought about this. But would it be possible to have a command line argument which switches skipping off altogether?
A developer could then run the "strict" or "lenient" test suite version. If we understand the tests as our basic requirements for a device, and implementing all PL ops is a basic requirement for a device, the default version of the tests should be the one not accepting the skipping in my eyes...
Ooh, maybe! Maybe you could add it to the logic here: if "skip_unsupported" in item.keywords and <cli argument?>:
160a7c2 to 81de457
Thanks @josh146 @mariaschuld @trbromley for the nice comments! 😊 Turned to the more elegant solution that @josh posted. :) Unfortunately didn't come across a nice way to use CLI arguments when redefining
Looks great @antalszava!
I didn't really have an answer with respect to CLI arguments (pytest docs are good in that they are extensive, but bad in that they don't cover all use-cases!), but I had a hunch and tried it locally, and managed to get it working. I pushed this commit here: 25950dc
Essentially, the item argument in pytest_runtest_makereport allows you to inspect CLI arguments and branch based on their values 🙂 I learnt something new! You can now do
pytest pennylane/plugins/tests --device=default.qubit --skip-ops
to turn on skipping unsupported operations.
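The branching itself can be exercised outside a pytest run with stand-in objects; Config.getoption is the real pytest API for reading a registered CLI flag, but the Fake* classes and should_mark_skipped below are hypothetical stand-ins, a sketch rather than the committed implementation:

```python
# Stand-ins for the slices of pytest's API the hook touches
# (hypothetical classes, for illustration only).
class FakeConfig:
    def __init__(self, options):
        self._options = options

    def getoption(self, name):
        # Mirrors pytest's Config.getoption(name) lookup of a CLI flag.
        return self._options.get(name, False)

class FakeItem:
    def __init__(self, keywords, config):
        self.keywords = keywords  # marks applied to the test
        self.config = config      # reachable as item.config inside the hook

def should_mark_skipped(item, exc_message):
    """Downgrade a failure to a skip only when the test is marked,
    the --skip-ops flag is on, and the error is an unsupported-op error."""
    return (
        "skip_unsupported" in item.keywords
        and item.config.getoption("--skip-ops")
        and "not supported on device" in exc_message
    )

item = FakeItem({"skip_unsupported": True}, FakeConfig({"--skip-ops": True}))
print(should_mark_skipped(item, "Gate PauliX not supported on device"))  # True
```

With the flag off (or the mark absent), the same failure would be reported normally.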
@mariaschuld, does this affect the existing skip_if(dev, capabilities) fixture? In the original PR, we played around with getting @pytest.mark.capabilities("inverse_operations.true", "model.qubit") etc. to work.
It would even be nice to have marks for particular types of devices. E.g., passthru devices have specific tests marked with @pytest.mark.backprop etc.
@@ -141,6 +141,7 @@

@flaky(max_runs=10)
@pytest.mark.skip_unsupported
Minor note: if every class in this file should skip unsupported gates, you can add
pytestmark = pytest.mark.skip_unsupported
at the top of the module, and pytest will apply it to the entire module :) (pytest only recognizes the special module-level name pytestmark for this.)
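For reference, a minimal sketch of the module-level form (the test body is a placeholder; in the real suite the skip_unsupported mark is acted on by the pytest_runtest_makereport hook above):

```python
import pytest

# pytest applies this mark to every test collected from this module;
# the variable must be named exactly ``pytestmark``.
pytestmark = pytest.mark.skip_unsupported

def test_placeholder():
    assert True
```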
Thanks, great idea! Adding it.
Thanks @antalszava, the second iteration looks much cleaner!
What is the motivation for testing devices with operations that they don't support? Do we need this as a command line option? I was expecting this to be hard coded rather than an option.
parser.addoption(
    "--skip-ops",
    action="store_true",
    default=False,
I'm curious why we want the default to be False? Why do we want to test a device with operations we know it doesn't support?
I'm ambivalent here; I left it like this as that is what @antalszava's original logic did.
An argument for making the skipping optional is given here by @mariaschuld: #724 (review)
Thanks! Although I'm not convinced.
If we have a "lenient" and a "strict" mode, which mode do we use to conclude that the device is successfully passing tests? I thought the idea of the test suite was to define a standard, if a device passes the tests then it's good to work with PL.
If passing tests on strict mode is the requirement, then does that mean that there is a canonical set of gates that a device must support? That seems like a high bar, especially on hardware devices.
I thought the idea of the test suite was to define a standard, if a device passes the tests then it's good to work with PL.
I think the idea is more to (a) reduce device integration test duplication, and (b) reduce maintenance load. Using it to set a standard is a potential option we marked to explore further down the road :)
Agree with the points raised by Josh. From my point of view, if a device for some reason cannot support all operations then basically it also cannot use the shared test suite. This would simply lead to having to duplicate the tests in another repo.
Context:
A shared device test suite was added in #695 which allows specifying PennyLane devices and running test cases using them.
Certain devices might not support all the operations and observables that are used in these tests. Test cases which contain unsupported operations or observables will fail for the specific device. This becomes challenging when trying to filter which test cases failed due to unsupported ops/observables and which due to an error with the device itself.
Description of the Change:
Added a pytest_runtest_makereport hook definition to mark test cases which failed due to a DeviceError or a NotImplementedError as xfail.
Benefits:
Devices that do not support all the operations and observables can still use the shared test suite. Test cases that include unsupported ops/observables will be skipped.
Possible Drawbacks:
Passing the test suite while also skipping several tests might give developers the false impression that every basic operation was implemented. One potential aid to this could be passing the -rsx option to pytest to include messages for skipped test cases.
Related GitHub Issues:
N/A