Shared testsuite: Skip unsupported ops tests #724
Conversation
Codecov Report
@@            Coverage Diff            @@
##           master     #724   +/-   ##
=======================================
  Coverage   95.46%   95.46%
=======================================
  Files         107      107
  Lines        6791     6791
=======================================
  Hits         6483     6483
  Misses        308      308
=======================================
Curious to hear feedback on whether or not this aligns with the aim of the shared test suite (might not require an entire code review).
Thanks @antalszava, looks great! I left some quick comments, but don't worry about implementing them all if it looks like it'll take a nontrivial amount of time.
pennylane/plugins/tests/conftest.py (Outdated)
@@ -79,6 +79,40 @@ def _skip_if(dev, capabilities):
    return _skip_if

@pytest.fixture(scope="session")
I never got comfortable with the scope. What does it mean in this setting to set the scope as "session"?
I believe the scope determines when the fixture is evaluated. E.g., scope="function" means that the fixture is re-evaluated for each test function (almost like a 'setUp'), whereas scope="session" means that the fixture is evaluated when pytest first starts, and this value is used for every test function.
Setting a larger scope can be important if the fixture is time-consuming to evaluate. On the other hand, a smaller scope might be required if the test functions mutate/affect the logic within the fixture.
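The caching behaviour can be sketched without pytest's real machinery; this toy cache (all names here are hypothetical, for illustration only) mimics how a session-scoped fixture is evaluated once and reused, while a function-scoped one is re-evaluated per test:

```python
# Toy sketch of fixture scoping semantics (not pytest's actual implementation;
# FixtureCache, expensive_setup, etc. are hypothetical names).
class FixtureCache:
    def __init__(self):
        self.session_cache = {}

    def get(self, name, factory, scope):
        if scope == "session":
            # Evaluated at most once per run, then the cached value is reused.
            if name not in self.session_cache:
                self.session_cache[name] = factory()
            return self.session_cache[name]
        # scope == "function": re-evaluated on every request.
        return factory()

calls = {"n": 0}

def expensive_setup():
    calls["n"] += 1
    return "device-capabilities"

cache = FixtureCache()
for _ in range(3):  # three tests requesting the same fixture
    cache.get("caps", expensive_setup, scope="session")
print(calls["n"])  # 1: the session-scoped factory ran only once
```

With scope="function" the factory would run three times, once per requesting test.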
pennylane/plugins/tests/conftest.py (Outdated)
@@ -79,6 +79,40 @@ def _skip_if(dev, capabilities):
    return _skip_if

@pytest.fixture(scope="session")
def skip_if_ops():
Minor point: is this a clear name? Would something like skip_if_ops_unsupported be better? It's longer, but clearer (to me at least!)
pennylane/plugins/tests/conftest.py (Outdated)
def _skip_if_ops(dev, ops):
    """Skip test if device does not support an operation."""

    for op in ops:
Minor point: could we do if op.__name__ not in dev.operations here, to save all the __name__'s below?
pennylane/plugins/tests/conftest.py (Outdated)
@@ -79,6 +79,40 @@ def _skip_if(dev, capabilities):
    return _skip_if

@pytest.fixture(scope="session")
def skip_if_ops():
Minor point: could we pass the device fixture here?
This might be a good idea. If I recall, the other fixtures do not depend on the device fixture because they must be run before device creation. This fixture could potentially be evaluated after device creation, however.
I am not sure if this will work, or if the same problem occurs: the device is only created inside a test, because it may be created with different wire classes that are specified by the fixture.
@antalszava, perhaps it might make sense to automate this, rather than using a fixture that must be copied and pasted into every test.
For example, here is an approach where we create a pytest hook in conftest.py:
def pytest_runtest_makereport(item, call):
    from _pytest.runner import pytest_runtest_makereport as orig_pytest_runtest_makereport

    tr = orig_pytest_runtest_makereport(item, call)

    if "skip_unsupported" in item.keywords:
        if call.excinfo is not None:
            if call.excinfo.type == qml.DeviceError and "not supported on device" in str(call.excinfo.value):
                tr.outcome = "skipped"
                tr.wasxfail = "reason:" + str(call.excinfo.value)

    return tr
Now, if we decorate a test class or test function with @pytest.mark.skip_unsupported, any test that raises the unsupported operation/observable exception will be automatically skipped:
@pytest.mark.skip_unsupported
@flaky(max_runs=10)
class TestGatesQubit:
    """Test qubit-based devices' probability vector after application of gates."""
This has two advantages:
- It's easy to add this feature to whole test classes or test modules with one line.
- Since you don't have to hardcode into the test which operations to skip, it generalizes to decompositions. For example, AQT doesn't support the PauliX gate, but it does support a decomposition of PauliX into RX gates. Using the approach in the PR, the integration test would be skipped if PauliX not in dev.operations (which it isn't), but this test should still be able to run, since PL will perform the decomposition.
pennylane/plugins/tests/conftest.py (Outdated)
if op not in dev.operations:
    pytest.skip(
        "Test skipped for unsupported operation {} on {} device.".format(op, dev.name)
    )
This seems like a good place to use dev.supports_operation(operation), where operation can be either a string or the class type itself. E.g.,
>>> dev.supports_operation("PauliX")
True
>>> dev.supports_operation(qml.PauliX)
True
This also avoids the issue with calling __name__ below.
pennylane/plugins/tests/conftest.py
Outdated
|
||
for obs in observables: | ||
# skip if the observable is not supported by the device | ||
if obs not in dev.observables: |
Similarly, you can use dev.supports_observable(obs) here.
I like @josh146 's solution, would have never thought about this. But would it be possible to have a command line argument which switches skipping off altogether?
A developer could then run the "strict" or "lenient" test suite version. If we understand the tests as our basic requirements for a device, and implementing all PL ops is a basic requirement for a device, the default version of the tests should be the one not accepting the skipping in my eyes...
Ooh, maybe! Maybe you could add it to the logic here: if "skip_unsupported" in item.keywords and <cli argument?>:
160a7c2 to 81de457
Thanks @josh146 @mariaschuld @trbromley for the nice comments! 😊 Turned to the more elegant solution that @josh posted. :) Unfortunately didn't come across a nice way to use CLI arguments when redefining
Looks great @antalszava!
I didn't really have an answer with respect to CLI arguments (pytest docs are good in that they are extensive, but bad in that they don't cover all use-cases!), but I had a hunch and tried it locally, and managed to get it working. I pushed this commit here: 25950dc
Essentially, the item argument in pytest_runtest_makereport allows you to inspect CLI arguments and branch based on their values 🙂 I learnt something new! You can now do
pytest pennylane/plugins/tests --device=default.qubit --skip-ops
to turn on skipping unsupported operations.
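The branching itself can be exercised outside a pytest run with stand-in objects; Config.getoption is the real pytest API for reading a registered CLI flag, but the Fake* classes and should_mark_skipped below are hypothetical stand-ins, a sketch rather than the committed implementation:

```python
# Stand-ins for the slices of pytest's API the hook touches
# (hypothetical classes, for illustration only).
class FakeConfig:
    def __init__(self, options):
        self._options = options

    def getoption(self, name):
        # Mirrors pytest's Config.getoption(name) lookup of a CLI flag.
        return self._options.get(name, False)

class FakeItem:
    def __init__(self, keywords, config):
        self.keywords = keywords  # marks applied to the test
        self.config = config      # reachable as item.config inside the hook

def should_mark_skipped(item, exc_message):
    """Downgrade a failure to a skip only when the test is marked,
    the --skip-ops flag is on, and the error is an unsupported-op error."""
    return (
        "skip_unsupported" in item.keywords
        and item.config.getoption("--skip-ops")
        and "not supported on device" in exc_message
    )

item = FakeItem({"skip_unsupported": True}, FakeConfig({"--skip-ops": True}))
print(should_mark_skipped(item, "Gate PauliX not supported on device"))  # True
```

With the flag off (or the mark absent), the same failure would be reported normally.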
@mariaschuld, does this affect the existing skip_if(dev, capabilities) fixture? In the original PR, we played around with getting @pytest.mark.capabilities("inverse_operations.true", "model.qubit") etc. to work.
It would even be nice to have marks for particular types of devices. E.g., passthru devices have specific tests marked with @pytest.mark.backprop etc.
@@ -141,6 +141,7 @@

@flaky(max_runs=10)
@pytest.mark.skip_unsupported
Minor note: if every class in this file should skip unsupported gates, you can add
pytestmark = pytest.mark.skip_unsupported
at the top of the module, and pytest will apply it to the entire module :) (pytest only recognizes the special module-level name pytestmark for this.)
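For reference, a minimal sketch of the module-level form (the test body is a placeholder; in the real suite the skip_unsupported mark is acted on by the pytest_runtest_makereport hook above):

```python
import pytest

# pytest applies this mark to every test collected from this module;
# the variable must be named exactly ``pytestmark``.
pytestmark = pytest.mark.skip_unsupported

def test_placeholder():
    assert True
```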
Thanks, great idea! Adding it.
Thanks @antalszava, the second iteration looks much cleaner!
What is the motivation for testing devices with operations that they don't support? Do we need this as a command line option? I was expecting this to be hard coded rather than an option.
parser.addoption(
    "--skip-ops",
    action="store_true",
    default=False,
I'm curious why we want the default to be False? Why do we want to test a device with operations we know it doesn't support?
I'm ambivalent here; I left it like this as that is what @antalszava's original logic did.
An argument for making the skipping optional is given here by @mariaschuld: #724 (review)
Thanks! Although I'm not convinced.
If we have a "lenient" and a "strict" mode, which mode do we use to conclude that the device is successfully passing tests? I thought the idea of the test suite was to define a standard, if a device passes the tests then it's good to work with PL.
If passing tests on strict mode is the requirement, then does that mean that there is a canonical set of gates that a device must support? That seems like a high bar, especially on hardware devices.
I thought the idea of the test suite was to define a standard, if a device passes the tests then it's good to work with PL.
I think the idea is more to (a) reduce device integration test duplication, and (b) reduce maintenance load. Using it to set a standard is a potential option we marked to explore further down the road :)
Agree with the points raised by Josh. From my point of view, if a device for some reason cannot support all operations then basically it also cannot use the shared test suite. This would simply lead to having to duplicate the tests in another repo.
Context:
A shared device test suite was added in #695 which allows specifying PennyLane devices and running test cases using them.
Certain devices might not support all the operations and observables that are used in these tests. Test cases which contain unsupported operations or observables will fail for the specific device. This becomes challenging when trying to filter which test cases failed due to unsupported ops/observables and which due to an error with the device itself.
Description of the Change:
Added a pytest_runtest_makereport hook definition to mark test cases which failed due to a DeviceError or a NotImplementedError as xfail.
Benefits:
Devices that do not support all the operations and observables can still use the shared test suite. Test cases that include unsupported ops/observables will be skipped.
Possible Drawbacks:
Passing the test suite while also skipping several tests might give developers the false impression that every basic operation was implemented. One potential aid to this could be passing the -rsx option to pytest to include messages for skipped test cases.
Related GitHub Issues:
N/A