Shared testsuite: Skip unsupported ops tests #724

Merged: 14 commits merged into master on Jul 30, 2020

Conversation

@antalszava (Contributor) commented Jul 28, 2020

Context:
A shared device test suite was added in #695 which allows specifying PennyLane devices and running test cases using them.

Certain devices might not support all of the operations and observables used in these tests. Test cases that contain unsupported operations or observables will fail for such a device, which makes it hard to tell which test cases failed because of unsupported ops/observables and which failed because of an error in the device itself.

Description of the Change:

  • Adds a custom pytest_runtest_makereport hook definition that marks test cases failing with a DeviceError or a NotImplementedError as xfail (a sketch follows this list)
  • Adds checks to the test cases in the shared test suite
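
A minimal sketch of such a hook, assuming it lives in the suite's conftest.py (the exact merged code may differ):

import pennylane as qml

def pytest_runtest_makereport(item, call):
    # build pytest's standard report first, then adjust its outcome
    from _pytest.runner import pytest_runtest_makereport as orig_makereport

    tr = orig_makereport(item, call)

    if call.excinfo is not None and call.excinfo.type in (qml.DeviceError, NotImplementedError):
        # re-label the failure as an expected failure (xfail)
        tr.wasxfail = "reason:" + str(call.excinfo.value)
        tr.outcome = "skipped"

    return tr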

Benefits:
Devices that do not support all the operations and observables can still use the shared test suite. Test cases that include unsupported ops/observables will be skipped.

Possible Drawbacks:
Passing the test suite while also skipping several tests might give developers the false impression that every basic operation was implemented.

One potential mitigation is to pass the -rsx option to pytest, which includes the reasons for skipped and xfailed test cases in the summary:
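
For example (the test path and device here follow the command shown later in this thread):

pytest pennylane/plugins/tests --device=default.qubit -rsx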

========================================================== short test summary info ==========================================================
XFAIL pennylane/plugins/tests/test_gates.py::TestGatesQubit::test_two_qubit_no_parameters[device_kwargs0-SWAP-mat1]
  reason:Gate SWAP not supported on device default.qubit

Related GitHub Issues:
N/A

@codecov bot commented Jul 28, 2020

Codecov Report

Merging #724 into master will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##           master     #724   +/-   ##
=======================================
  Coverage   95.46%   95.46%           
=======================================
  Files         107      107           
  Lines        6791     6791           
=======================================
  Hits         6483     6483           
  Misses        308      308           


@antalszava (Contributor, Author) commented:

Curious to hear feedback on whether skipping tests for unsupported operations/observables aligns with the aim of the shared test suite (this might not require an entire code review).

@trbromley (Contributor) left a comment:

Thanks @antalszava, looks great! I left some quick comments, but don't worry about implementing them all if it looks like it'll take a nontrivial amount of time.

@@ -79,6 +79,40 @@ def _skip_if(dev, capabilities):
    return _skip_if


@pytest.fixture(scope="session")
Contributor:

I never got comfortable with the scope. What does it mean in this setting to set the scope as "session"?

Member:

I believe the scope determines when the fixture is evaluated.

e.g., scope="function" means that the fixture is re-evaluated for each test function (almost like a 'setUp'), whereas scope="session" means that the fixture is evaluated once when pytest first starts, and this value is used for each test function.

Setting a larger scope can be important if the fixture is time consuming to evaluate. On the other hand, a smaller scope might be required if the test functions mutate/affect the logic within the fixture.

https://docs.pytest.org/en/stable/fixture.html#scope-sharing-fixtures-across-classes-modules-packages-or-session
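
A minimal sketch illustrating the difference between the two scopes (fixture names here are illustrative):

import pytest

@pytest.fixture(scope="function")
def fresh_list():
    # re-evaluated for every test function, much like a setUp
    return []

@pytest.fixture(scope="session")
def session_value():
    # evaluated once when the session starts; every test that
    # requests this fixture receives the same object
    return object()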

@@ -79,6 +79,40 @@ def _skip_if(dev, capabilities):
    return _skip_if


@pytest.fixture(scope="session")
def skip_if_ops():
Contributor:

Minor point: is this a clear name? Would something like skip_if_ops_unsupported work? It's longer, but clearer (to me at least!)

def _skip_if_ops(dev, ops):
    """Skip test if device does not support an operation."""

    for op in ops:
Contributor:

Minor point: Could we do if op.__name__ not in dev.operations here, to save all the __name__'s below?
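
A sketch of the fixture with both minor points applied, i.e. the clearer name and a single name lookup (illustrative, not the merged code):

import pytest

@pytest.fixture(scope="session")
def skip_if_ops_unsupported():
    def _skip_if_ops_unsupported(dev, ops):
        """Skip the test if the device does not support an operation."""
        for op in ops:
            # look up the class name once, instead of repeated __name__ calls
            if op.__name__ not in dev.operations:
                pytest.skip(
                    "Test skipped for unsupported operation {} on {} device.".format(
                        op.__name__, dev.name
                    )
                )

    return _skip_if_ops_unsupported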

@@ -79,6 +79,40 @@ def _skip_if(dev, capabilities):
    return _skip_if


@pytest.fixture(scope="session")
def skip_if_ops():
Contributor:

Minor point: could we pass the device fixture here?

Member:

This might be a good idea. If I recall, the other fixtures do not depend on the device fixture because they must be run before device creation.

This fixture could potentially be evaluated after device creation, however.

Contributor:

I am not sure if this will work, or if the same problem occurs: the device is only created inside a test, because it may be created with different wire classes that are specified by the fixture.

@josh146 (Member) left a comment:

@antalszava, perhaps it might make sense to automate this, rather than using a fixture that must be copied and pasted into every test.

For example, here is an approach where we create a pytest hook in conftest.py:

import pennylane as qml


def pytest_runtest_makereport(item, call):
    # build pytest's standard report first, then adjust its outcome
    from _pytest.runner import pytest_runtest_makereport as orig_pytest_runtest_makereport

    tr = orig_pytest_runtest_makereport(item, call)

    if "skip_unsupported" in item.keywords:
        if call.excinfo is not None:
            if call.excinfo.type == qml.DeviceError and "not supported on device" in str(call.excinfo.value):
                tr.outcome = "skipped"
                tr.wasxfail = "reason:" + str(call.excinfo.value)

    return tr

Now, if we decorate a test class or test function with @pytest.mark.skip_unsupported, any test that raises the unsupported operation/observable exception will be automatically skipped:

@pytest.mark.skip_unsupported
@flaky(max_runs=10)
class TestGatesQubit:
    """Test qubit-based devices' probability vector after application of gates.
    """

This has two advantages:

  • It's easy to add this feature to whole test classes or test modules with one line

  • Since you don't have to hardcode into the test which operations to skip, it generalizes to decompositions. For example, AQT doesn't support the PauliX gate directly, but it does support a decomposition of PauliX into RX gates. Using the approach in this PR, the integration test would be skipped because PauliX is not in dev.operations, yet the test should still be able to run, since PL will perform the decomposition.



Comment on lines 91 to 94:

if op not in dev.operations:
    pytest.skip(
        "Test skipped for unsupported operation {} on {} device.".format(op, dev.name)
    )
Member:

This seems like a good place to use dev.supports_operation(operation), where operation can be either a string or the class type itself.

e.g.,

>>> dev.supports_operation("PauliX")
True
>>> dev.supports_operation(qml.PauliX)
True

Member:

This also avoids the issue with calling __name__ below.


for obs in observables:
    # skip if the observable is not supported by the device
    if obs not in dev.observables:
Member:

Similarly, you can use dev.supports_observable(obs) here.
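
A sketch of the loop with that suggestion applied (illustrative only):

for obs in observables:
    # supports_observable accepts either the string name or the class itself
    if not dev.supports_observable(obs):
        pytest.skip(
            "Test skipped for unsupported observable {} on {} device.".format(obs, dev.name)
        )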

@mariaschuld (Contributor) left a comment:

I like @josh146's solution, I would never have thought of this. But would it be possible to have a command line argument which switches skipping off altogether?

A developer could then run the "strict" or "lenient" version of the test suite. If we understand the tests as our basic requirements for a device, and implementing all PL ops is a basic requirement, then in my eyes the default version of the tests should be the one that does not accept the skipping...

@josh146 (Member) commented Jul 29, 2020

I like @josh146's solution, I would never have thought of this. But would it be possible to have a command line argument which switches skipping off altogether?

Ooh, maybe! You could add it to the logic here:

if "skip_unsupported" in item.keywords and <cli argument?>:

@antalszava (Contributor, Author) commented Jul 29, 2020

Thanks @josh146 @mariaschuld @trbromley for the nice comments! 😊

Switched to the more elegant solution that @josh posted. :) Unfortunately, I didn't come across a nice way to use CLI arguments when redefining pytest_runtest_makereport :( This is a pytest hook that creates the report after the test has run, and in this version it marks the test as xfail (marking it as skipped doesn't seem to be trivial 🤔). One small consolation is that -rsx still outputs further details on why tests were marked as xfail.

@co9olguy removed their request for review July 29, 2020 17:57
@josh146 (Member) left a comment:

Looks great @antalszava!

I didn't really have an answer with respect to CLI arguments (pytest docs are good in that they are extensive, but bad in that they don't cover all use-cases!), but I had a hunch and tried it locally, and managed to get it working. I pushed this commit here: 25950dc

Essentially, the item argument in pytest_runtest_makereport allows you to inspect CLI arguments, and branch based on their values 🙂 I learnt something new! You can now do

pytest pennylane/plugins/tests --device=default.qubit --skip-ops

to turn on skipping unsupported operations.
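
A minimal sketch of the mechanism, assuming the --skip-ops flag is registered via pytest_addoption in conftest.py (as in the diff discussed below):

def pytest_runtest_makereport(item, call):
    from _pytest.runner import pytest_runtest_makereport as orig_pytest_runtest_makereport

    tr = orig_pytest_runtest_makereport(item, call)

    # item.config exposes the parsed command line options
    skip_ops = item.config.getoption("--skip-ops", default=False)

    if "skip_unsupported" in item.keywords and skip_ops and call.excinfo is not None:
        ...  # re-label the report as xfail, as in the hook shown earlier

    return tr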

@mariaschuld, does this affect the existing skip_if(dev, capabilities) fixture? In the original PR, we played around with getting @pytest.mark.capabilities("inverse_operations.true", "model.qubit") etc. to work.

It would even be nice to have marks for particular types of devices. E.g., passthru devices could have specific tests marked with @pytest.mark.backprop etc.

@@ -141,6 +141,7 @@


@flaky(max_runs=10)
@pytest.mark.skip_unsupported
Member:

Minor note: if every class in this file should skip unsupported gates, you can add

pytestmark = pytest.mark.skip_unsupported

at the top of the module, and pytest will apply the mark to the entire module :) (pytest only picks up module-level marks from a variable named pytestmark.)

Contributor (Author):

Thanks, great idea! Adding it.

@trbromley (Contributor) left a comment:

Thanks @antalszava, the second iteration looks much cleaner!

What is the motivation for testing devices with operations that they don't support? Do we need this as a command line option? I was expecting this to be hard coded rather than an option.

parser.addoption(
    "--skip-ops",
    action="store_true",
    default=False,
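
For context, a minimal sketch of how such a flag is registered in conftest.py (the help text here is illustrative, not necessarily the PR's wording):

def pytest_addoption(parser):
    # off by default: unsupported operations fail tests unless --skip-ops is passed
    parser.addoption(
        "--skip-ops",
        action="store_true",
        default=False,
        help="Skip tests that use operations unsupported by the device.",
    )
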
Contributor:

I'm curious why we want the default to be False.

Why do we want to test a device with operations we know it doesn't support?

Member:

I'm ambivalent here; I left it like this since that is what @antalszava's original logic did.

An argument for making the skipping optional is given here by @mariaschuld: #724 (review)

Contributor:

Thanks! Although I'm not convinced.

If we have a "lenient" and a "strict" mode, which mode do we use to conclude that a device is successfully passing the tests? I thought the idea of the test suite was to define a standard: if a device passes the tests, then it's good to work with PL.

If passing the tests in strict mode is the requirement, does that mean there is a canonical set of gates that a device must support? That seems like a high bar, especially for hardware devices.

Member:

I thought the idea of the test suite was to define a standard: if a device passes the tests, then it's good to work with PL.

I think the idea is more to (a) reduce device integration test duplication, and (b) reduce maintenance load. Using it to set a standard is a potential option we marked to explore further down the road :)

Contributor (Author):

I agree with the points raised by Josh. From my point of view, without the skipping, a device that for some reason cannot support all operations also cannot use the shared test suite at all. This would simply lead to having to duplicate the tests in another repo.

@antalszava merged commit 3a3ed68 into master Jul 30, 2020
@antalszava deleted the skip_unsupported_ops_tests branch July 30, 2020 19:04
@antalszava changed the title from "Skip unsupported ops tests" to "Shared testsuite: Skip unsupported ops tests" Jul 31, 2020