
Fix parametric ops GPU compatibility (v0.19.1) #1927

Merged: 38 commits into v0.19.1-rc0 on Nov 25, 2021

Conversation

antalszava
Contributor

@antalszava antalszava commented Nov 23, 2021

Context:

With recent changes in default.qubit (#1749), the backpropagation variants of the device no longer define their own ML framework-specific implementation of parametric operations.

This means that the pipeline defined in pennylane/ops/qubit/parametric_ops.py is used by all backpropagation devices.

Some matrix definitions include scalars/constants (e.g., 0 or 1) without an explicit type. When using Torch with a GPU, such objects are placed on the CPU, resulting in an error.

One such operation is qml.RX:

import pennylane as qml
import torch

dev = qml.device("default.qubit.torch", wires=1)

@qml.qnode(dev, interface='torch')
def circuit(x, y):
    qml.RX(x, wires=0)
    qml.RZ(y, wires=0)
    return qml.expval(qml.PauliZ(0))

x = torch.tensor(0.3, device='cuda')
y = torch.tensor(0., device='cuda')

circuit(x, y)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument tensors in method wrapper___cat)

The cause of the error boils down to:

import pennylane as qml
import torch

a = torch.tensor(0.34, device='cuda')
qml.math.stack([0, a])
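# raises RuntimeError: Expected all tensors to be on the same device,
# but found at least two devices, cpu and cuda:0!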

qml.math.stack is used this way in several matrix definitions of parametric operations.

Such cases also arise when running the tests in tests/gpu/test_torch.py.

Description of the Change:

  • Changes the definition of tensor coercion in qml.math for Torch. The coercion now takes into account any Torch device that may have been provided and places the output tensors on that device. An error is raised if the input tensors use several different Torch devices;
import pennylane as qml
import torch

a = torch.tensor(0.34, device='cuda')
qml.math.stack([0, a])
tensor([0.0000, 0.3400], device='cuda:0')
  • Changes the tests for default.qubit.torch such that tests are run on the GPU, if available;
  • Re-adds the torch_device argument to default.qubit.torch to enable evaluating QNodes without input parameters on GPUs (a usage sketch follows this list).
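A minimal usage sketch of the re-added argument (the circuit is illustrative and a CUDA-capable machine is assumed):

import pennylane as qml

# Keep the device state on the GPU even when the QNode receives no input
# parameters from which the Torch device could be inferred.
dev = qml.device("default.qubit.torch", wires=1, torch_device="cuda")

@qml.qnode(dev, interface="torch")
def circuit():
    qml.Hadamard(wires=0)
    return qml.expval(qml.PauliZ(0))

circuit()  # evaluates on the GPU despite having no parameters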

Benefits:

Fixes GPU compatibility for default.qubit.torch and improves GPU testing with Torch.

Possible Drawbacks:
N/A

Related GitHub Issues:
N/A

Note:

The following test files contain cases that are run on the GPU:

  • tests/gpu/test_torch.py
  • tests/devices/test_default_qubit_torch.py
  • tests/interfaces/test_batch_torch.py

@github-actions
Contributor

Hello. You may have forgotten to update the changelog!
Please edit doc/releases/changelog-dev.md with:

  • A one-to-two sentence description of the change. You may include a small working example for new features.
  • A link back to this PR.
  • Your name (or GitHub username) in the contributors section.

@codecov

codecov bot commented Nov 23, 2021

Codecov Report

❗ No coverage uploaded for pull request base (v0.19.1-rc0@3d71f9f).
The diff coverage is n/a.


@@              Coverage Diff               @@
##             v0.19.1-rc0    #1927   +/-   ##
==============================================
  Coverage               ?   98.92%           
==============================================
  Files                  ?      218           
  Lines                  ?    16417           
  Branches               ?        0           
==============================================
  Hits                   ?    16240           
  Misses                 ?      177           
  Partials               ?        0           


@antalszava
Contributor Author

[sc-11630]

@antalszava antalszava marked this pull request as ready for review November 23, 2021 22:16
* mark the kernel test not converging as xfail

* format

* skipif

* skipif marker

* format
@@ -395,6 +393,32 @@ def test_closest_psd_matrix(self, input, fix_diagonal, expected_output):

assert np.allclose(output, expected_output, atol=1e-5)

@pytest.mark.usefixtures("skip_if_no_cvxpy_support")
Contributor Author

This is a commit cherry-picked from #1918 (merged to master). The test suite failed on this specific test case.
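For context, a minimal sketch of how such a fixture-based skip can work in pytest (hypothetical test body; the fixture would normally live in a conftest.py):

import pytest

@pytest.fixture
def skip_if_no_cvxpy_support():
    # Skip the test at setup time when cvxpy cannot be imported.
    pytest.importorskip("cvxpy")

@pytest.mark.usefixtures("skip_if_no_cvxpy_support")
def test_needs_cvxpy():
    import cvxpy  # safe: the fixture guarantees cvxpy is importable
    assert cvxpy.__name__ == "cvxpy"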

@antalszava antalszava requested a review from mlxd November 23, 2021 22:41
# Extract existing set devices, if any
device_set = set(t.device for t in tensors if isinstance(t, torch.Tensor))
if len(device_set) > 1:
    raise ValueError("Multiple Torch devices were specified, coercing is not possible.")
Member

I wonder if it makes sense to match the Torch error message?

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0!

Contributor

I agree that it would be better!

Contributor

Why not just stick them all on the GPU if at least one is on the GPU?

Contributor Author

Frankly, I'd like users to know what they are doing. I think we could make the change, but we'd be taking on the burden of providing convenience for user code that is incorrect.

Contributor Author

It seems Torch provides different behaviour for different operations. Arithmetic (e.g., addition) seems to work fine between a tensor on the GPU and a tensor on the CPU:

import torch

a = torch.tensor(0.3, device='cpu')
b = torch.tensor(0.3, device='cuda')

print(a + b)
tensor(0.6000, device='cuda:0')

But stacking doesn't:

print(torch.stack([a,b]))
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument tensors in method wrapper___cat)

As we're using _coerce_types_torch when calling qml.math.stack, I'd rather we kept raising an error for now. Good to keep this behaviour in mind though.

Contributor

Yes I think raising an error here is the way to go.
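For context, a self-contained sketch of the coercion rule agreed on above (the helper name is hypothetical; the actual logic lives in PennyLane's private Torch coercion function):

import torch

def coerce_to_common_device(values):
    # The device is read only from inputs that are already Torch tensors,
    # mirroring the excerpt above.
    device_set = set(v.device for v in values if isinstance(v, torch.Tensor))
    if len(device_set) > 1:
        raise ValueError("Multiple Torch devices were specified, coercing is not possible.")
    device = device_set.pop() if device_set else torch.device("cpu")
    # Plain scalars are then placed on the single device that was found.
    return [
        v.to(device) if isinstance(v, torch.Tensor) else torch.tensor(v, device=device)
        for v in values
    ]

Under this rule, stacking [0, a] succeeds when a lives on cuda:0 (the scalar is moved there), while mixing CPU and CUDA tensors raises the error above.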

Comment on lines +833 to +838
if interface == "torch":
    # Use convert_like to ensure that the tensor is put on the correct
    # Torch device
    z = qml.math.convert_like(qml.math.zeros([4]), theta)
else:
    z = qml.math.zeros([4], like=interface)
Member

@antalszava for code simplicity, could you replace this with just:

Suggested change
if interface == "torch":
    # Use convert_like to ensure that the tensor is put on the correct
    # Torch device
    z = qml.math.convert_like(qml.math.zeros([4]), theta)
else:
    z = qml.math.zeros([4], like=interface)
z = qml.math.convert_like(qml.math.zeros([4]), theta)

Member

but I guess this introduces overhead.

Contributor

The point of qml.math is so that places like here don't have to worry about interfaces.

Especially as this occurs in multiple different operations, maybe we should extract this functionality into qml.math.

I think this is okay to keep for now as a bug fix, but in the future, we should try to extract interface-specific behaviour away from places like this.

Contributor Author

@antalszava antalszava Nov 24, 2021

Having z = qml.math.convert_like(qml.math.zeros([4]), theta) raises an error for TensorFlow; see the run attached to the commit right before the one that added the branching (53b4111 is the commit after).
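For illustration, a small hedged example of the convert_like behaviour relied on above (a CUDA device is assumed; the values are arbitrary):

import pennylane as qml
import torch

theta = torch.tensor(0.3, device="cuda")

# convert_like returns its first argument converted to the same tensor type
# as the second; for Torch this also places it on theta's device.
z = qml.math.convert_like(qml.math.zeros([4]), theta)
print(z.device)  # cuda:0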

@mlxd
Member

mlxd commented Nov 24, 2021

Hi @antalszava, happy to report that the tests in

    tests/gpu/test_torch.py
    tests/devices/test_default_qubit_torch.py
    tests/interfaces/test_batch_torch.py

all pass. I can attach the log output if required.

Contributor

@albi3ro albi3ro left a comment

Just a couple questions and suggestions.

#####################################################
# Test matrices
#####################################################

- U = torch.tensor(
+ U = np.array(
Contributor

Why? These are Torch tests, so shouldn't all the variables be Torch tensors?

Contributor Author

We do arithmetic on it and convert to a Torch tensor within the test case.

Contributor

@rmoyard rmoyard left a comment

It looks good to me 💯 The one thing I would suggest is trying to remove the check on this line

state = state.to(self._torch_device)

and see how it behaves with the tests. We could also add a warning if the user provides a tensor on a different device (CPU or CUDA) than the PennyLane device. In my opinion, we should take the PennyLane device's choice (CPU or CUDA) as the reference, as it only has to be defined once.

torch.tensor(1 + 3j, dtype=torch.complex64, device="cuda"),
]

with pytest.raises(ValueError, match="Multiple Torch devices were specified"):
Contributor

Great test!
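A hedged reconstruction of the kind of test excerpted above (the surrounding file layout differs; a skipif guard is added here so the sketch is self-contained):

import pytest
import torch
import pennylane as qml

@pytest.mark.skipif(not torch.cuda.is_available(), reason="requires a CUDA device")
def test_coercing_multiple_torch_devices_raises():
    tensors = [
        torch.tensor(0.1, device="cpu"),
        torch.tensor(1 + 3j, dtype=torch.complex64, device="cuda"),
    ]
    with pytest.raises(ValueError, match="Multiple Torch devices were specified"):
        qml.math.stack(tensors)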

)
if cuda_params_but_not_cuda_specified or not_cuda_params_but_cuda_specified:
    warnings.warn(
Contributor

Great, I think it is now clear to the user that the Torch device specified on the PennyLane device is the one that will be used 💯
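A hedged sketch of the warning logic discussed here (variable names follow the excerpt above; the condition details and message are illustrative, not the verbatim implementation):

import warnings
import torch

def check_param_devices(params, torch_device):
    # Device types of the QNode's Torch parameters
    param_device_types = {p.device.type for p in params if isinstance(p, torch.Tensor)}
    cuda_specified = torch_device is not None and "cuda" in str(torch_device)

    cuda_params_but_not_cuda_specified = "cuda" in param_device_types and not cuda_specified
    not_cuda_params_but_cuda_specified = cuda_specified and param_device_types == {"cpu"}

    if cuda_params_but_not_cuda_specified or not_cuda_params_but_cuda_specified:
        warnings.warn(
            "The Torch device of the QNode parameters does not match the "
            "torch_device specified for the PennyLane device; the PennyLane "
            "device's setting will be used."
        )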

Contributor

@rmoyard rmoyard left a comment

It looks great and ready to me 💯

@antalszava antalszava merged commit 3b75368 into v0.19.1-rc0 Nov 25, 2021
@antalszava antalszava deleted the v0.19.1_gpu_fix branch November 25, 2021 01:44
josh146 added a commit that referenced this pull request Nov 29, 2021
* extend the changelog item of the init module deprecation (#1833)

* add sorting to `get_parameters`

* undo accidental direct push to v0.19.0-rc0

* Remove template decorator from PennyLane functions (#1808) (#1835)

* Remove template decorator.

* Correct import and changelog.

* Change interferometer.

* Update pennylane/templates/broadcast.py

Co-authored-by: antalszava <antalszava@gmail.com>

* Update doc/releases/changelog-dev.md

Co-authored-by: antalszava <antalszava@gmail.com>

* Update pennylane/templates/broadcast.py

Co-authored-by: antalszava <antalszava@gmail.com>

* Update from review.

* More import removed.

* Update parametric ops.

* Update pennylane/templates/subroutines/arbitrary_unitary.py

Co-authored-by: antalszava <antalszava@gmail.com>

Co-authored-by: antalszava <antalszava@gmail.com>

Co-authored-by: Romain <rmoyard@gmail.com>

* Update `QNGOptimizer` to handle deprecations (#1834)

* pass approx option in QNG if need be

* convert to array instead of list; add recwarn to QNG test case

* Update pennylane/optimize/qng.py

Co-authored-by: Josh Izaac <josh146@gmail.com>

* add expval(H) with two input params test case

* deprecate diag_approx keyword for QNGOptimizer

* format

* docstring

* changelog

Co-authored-by: Josh Izaac <josh146@gmail.com>

* Sorted `tape.get_parameters` (#1836)

* sorted `get_parameters`

* adapt PR count

* fix metric tensor tests

* also sort in `set_parameters`

* Fix bug to ensure qml.math.dot works in autograph mode (#1842)

* Fix bug to ensure qml.math.dot works in autograph mode

* changelog

Co-authored-by: antalszava <antalszava@gmail.com>

* Update `expand_fn` (rc) (#1838)

* expand_fn

* changes just as in master

* new lines

* Ensure that the correct version of keras is installed for TensorFlow CI (#1847)

* Add deprecation warning for default.tensor (#1851)

* add warning and test

* add changelog

* Update pennylane/beta/devices/default_tensor.py

Co-authored-by: antalszava <antalszava@gmail.com>

Co-authored-by: antalszava <antalszava@gmail.com>

* fix docstrings in hf module (#1853)

* fix hf docstrings

* Update pennylane/hf/hartree_fock.py

Co-authored-by: Romain <rmoyard@gmail.com>

Co-authored-by: Romain <rmoyard@gmail.com>

* Docstring updates of new features before the release (#1859)

* doc format

* docstring example

* Hide :orphan: from displaying in the release notes (#1863)

* Update single_dispatch.py (#1857)

* Add support for abstract tensors in AmplitudeEmbedding (#1845)

* Add support for abstract tensors in AmplitudeEmbedding

* add tests

* update changelog

* Apply suggestions from code review

Co-authored-by: Romain <rmoyard@gmail.com>

* fix

* fix

* fix

* Add batchtracer support

* Update tests/math/test_is_abstract.py

Co-authored-by: antalszava <antalszava@gmail.com>

Co-authored-by: Romain <rmoyard@gmail.com>
Co-authored-by: antalszava <antalszava@gmail.com>

* Add documentation for the hf module (#1839)

* create rst file for hf module

* Documentation Updates (#1868)

* documentation updates

* Update doc/introduction/interfaces/numpy.rst

* Apply suggestions from code review

Co-authored-by: antalszava <antalszava@gmail.com>

* scipy minimize update

* collections of qnode section re-added

* new lines

Co-authored-by: antalszava <antalszava@gmail.com>

* Update deprecated use of `qml.grad`/`qml.jacobian` in docstrings (#1860)

* autograd device requires_grad

* math.quantum example

* tracker

* tracker 2nd

* batch_transform

* batch_params

* fix batch params docstring

* update

* Update pennylane/transforms/batch_params.py

Co-authored-by: antalszava <antalszava@gmail.com>

* `qml.math.is_independent` no deprecation warnings with `qml.grad`/`qml.jacobian` (#1862)

* is_independent no deprecation warning

* is_independent no deprecation warning tests

* docstring

* formatting

* mark arrayboxes as trainable still; adjust test case (no autograd warning)

* define requires_grad=True for random args only with Autograd (not JAX) to avoid errors with JAX tracers

* add in warning checks such that warnings are for sure not emitted

* Update pennylane/transforms/batch_params.py

Co-authored-by: Christina Lee <christina@xanadu.ai>

* Update pennylane/transforms/batch_params.py

Co-authored-by: Christina Lee <christina@xanadu.ai>

* Update pennylane/transforms/batch_params.py

Co-authored-by: Christina Lee <christina@xanadu.ai>

* Update pennylane/transforms/batch_params.py

Co-authored-by: Christina Lee <christina@xanadu.ai>

* Update pennylane/transforms/batch_params.py

Co-authored-by: Christina Lee <christina@xanadu.ai>

* Update pennylane/transforms/batch_params.py

Co-authored-by: Christina Lee <christina@xanadu.ai>

Co-authored-by: Josh Izaac <josh146@gmail.com>
Co-authored-by: Romain <rmoyard@gmail.com>
Co-authored-by: Christina Lee <christina@xanadu.ai>

* change example (#1875)

* `v0.19.0` release notes (#1852)

* pl + qchem versions

* draft release notes

* example + changelog-dev.md

* changelog-dev

* batch_params output

* shorten U decomp example to print well

* need orbitals kwarg when using hf_state as positional came after kwarg

* need orbitals kwarg when using hf_state as positional came after kwarg

* trainable params

* orphan

* contributors

* notes

* HF solver example

* HF solver example

* HF solver text

* doc item for #1724

* item for 1769

* doc item for #1792

* #1804, #1802

* #1824

* pyscf pin

* PR links

* resolve changelog (#1858)

Co-authored-by: agran2018 <45397799+agran2018@users.noreply.github.com>

* Update release_notes.md

* Apply HF wording suggestion from Josh

* fix batch params docstring

* update

* merge batch_params dim_output

* new shape for the output of the batch_params example

* Update doc/releases/changelog-0.19.0.md

Co-authored-by: Josh Izaac <josh146@gmail.com>

* transforms order change as suggested

* Update doc/releases/changelog-0.19.0.md

Co-authored-by: Josh Izaac <josh146@gmail.com>

* Update doc/releases/changelog-0.19.0.md

Co-authored-by: Josh Izaac <josh146@gmail.com>

* shorten ex as suggested

* shorten insert example

* move classical jac change to improvements as suggested

* Update doc/releases/changelog-0.19.0.md

Co-authored-by: Josh Izaac <josh146@gmail.com>

* Update doc/releases/changelog-0.19.0.md

Co-authored-by: Josh Izaac <josh146@gmail.com>

* Update doc/releases/changelog-0.19.0.md

Co-authored-by: Josh Izaac <josh146@gmail.com>

* move is_independent to improvements

* remove sentence as suggested

* Update doc/releases/changelog-0.19.0.md

Co-authored-by: Josh Izaac <josh146@gmail.com>

* Update doc/releases/changelog-0.19.0.md

Co-authored-by: Josh Izaac <josh146@gmail.com>

* change the order of the improvements based on the suggestions

* rephrase qml.grad/qml.jacobian deprecation date

* Update doc/releases/changelog-0.19.0.md

* no current release tag for v0.18.0

* no -dev

* fix

* fix

* correct GateFabric api link

* JAX interface as improvement

* create_expand_fn suggestion

* requires_grad.png

* GateFabric full path

Co-authored-by: agran2018 <45397799+agran2018@users.noreply.github.com>
Co-authored-by: Josh Izaac <josh146@gmail.com>
Co-authored-by: Romain <rmoyard@gmail.com>

* Tkt rc (#1870)

* changed readme and index tkt

* kept changelog

* Changed docs image

* Deleted changelog

* Apply suggestions from code review

Co-authored-by: antalszava <antalszava@gmail.com>
Co-authored-by: Nathan Killoran <co9olguy@users.noreply.github.com>

* removed readme tkt

* Removed readme

* README.md form release candidate

Co-authored-by: Catalina Albornoz <catalina@xanadu.ai>
Co-authored-by: antalszava <antalszava@gmail.com>
Co-authored-by: Nathan Killoran <co9olguy@users.noreply.github.com>

* contributors

* Update the norm check for `QubitStateVector` in `default.qubit` (v0.19.1) (#1924)

* norm check fix

* 0.19.1 changelog

* 0.19.1 bump

* pr number

* changelog for 0.19.1 in the release notes

* torch coerce change, qml.math for Hermitian

* test changes

* readd torch_device kwarg

* format

* use torch device arg and add cuda_was_set branch

* polish

* TestApply cover with torch_device; update the def for OrbitalRotation to be compatible with CUDA

* Add notes to tests using execute; CRX + CRY convert_like; Integration tests passing

* passthru test

* more tests

* format

* coerce refine + test

* tidy

* branch off for Torch

* def self._torch_device_requested and extend check in execute

* test and further adjustments

* format

* changelog

* Mark the kernel test not converging as xfail (#1918)

* mark the kernel test not converging as xfail

* format

* skipif

* skipif marker

* format

* changelog

* lint

* todo no longer relevant

* PauliRot identity matrix consider Torch

* format

* paulirot identity matrix test

* add gpu marker

* more torch_device usage

* more fixes with finite diff; update tests

* qml.math.allclose

* Move the state to device conversion from apply_operation to the execute method because it logically belongs there better

* Move the state to device conversion within if statement, add comment; change a test to avoid raising a warning

* format

* error message; gpu marker on the coercion test in qml.math; error matching

* format

* warning

* warning test

* no print

* format

* Fix parametric ops GPU compatibility (`v0.19.1`) (#1927)

* contributors for the bug fix release

* changelog

* Update doc/development/release_notes.md

* import order

* Update tests/devices/test_default_qubit_torch.py

* resolve conflicts

* remove _apply_operation as a whole

* use f string instead of format call to make codefactor happy

* use f string instead of format call in default.qubit to make codefactor happy

* more f strings; {} instead of dict()

* format

* PauliRot Torch run on CPU too to increase coverage

* format

Co-authored-by: dwierichs <davidwierichs@gmail.com>
Co-authored-by: Romain <rmoyard@gmail.com>
Co-authored-by: Josh Izaac <josh146@gmail.com>
Co-authored-by: Maria Schuld <mariaschuld@gmail.com>
Co-authored-by: Christina Lee <christina@xanadu.ai>
Co-authored-by: soranjh <40344468+soranjh@users.noreply.github.com>
Co-authored-by: Guillermo Alonso-Linaje <65235481+KetpuntoG@users.noreply.github.com>
Co-authored-by: agran2018 <45397799+agran2018@users.noreply.github.com>
Co-authored-by: CatalinaAlbornoz <albornoz.catalina@hotmail.com>
Co-authored-by: Catalina Albornoz <catalina@xanadu.ai>
Co-authored-by: Nathan Killoran <co9olguy@users.noreply.github.com>