
Differentiable parameter-shift gradient transform #1479

Merged: 76 commits into master, Aug 3, 2021
Conversation

@josh146 (Member) commented Jul 27, 2021

Context: As part of the roadmap for supporting differentiable batch execution, gradient logic will be moved out of the tape subclasses and into a module of pure functions. This is the second PR after #1476; here, we move the parameter-shift logic out of QubitParamShiftTape and into the new gradients package.

Description of the Change:

  • Adds a function expval_param_shift for computing the parameter-shift gradient of a tape terminating in expectation values. Directly equivalent to QubitParamShiftTape.parameter_shift.

  • Adds a function var_param_shift for computing the parameter-shift gradient of a tape terminating in one or more variances. Directly equivalent to QubitParamShiftTape.parameter_shift_var.

  • Adds a wrapper function param_shift, which does the following:

    • Performs basic input validation
    • Performs static analysis of the tape parameters to determine which of them support the parameter-shift rule
    • Falls back to finite_diff for any unsupported parameters
    • Dispatches to expval_param_shift or var_param_shift depending on the structure of the tape
    • Provides a processing function that combines the gradients computed via the fallback and parameter-shift methods
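As a refresher, the two-term shift rule that expval_param_shift implements for standard rotation gates can be sketched numerically (a toy NumPy illustration, not the PennyLane implementation):

```python
import numpy as np

# Two-term parameter-shift rule: for a gate whose generator has
# eigenvalues ±1/2 (e.g. RX, RY, RZ), the derivative of an expectation
# value f(θ) is exact, not a finite-difference approximation:
#     df/dθ = (f(θ + π/2) - f(θ - π/2)) / 2
def param_shift_grad(f, theta, shift=np.pi / 2):
    return (f(theta + shift) - f(theta - shift)) / 2

# Toy "expectation value": <Z> after RY(θ) applied to |0> is cos(θ).
theta = 0.7
grad = param_shift_grad(np.cos, theta)
assert np.isclose(grad, -np.sin(theta))  # matches the exact derivative
```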

Benefits:

  • The parameter-shift logic is now much more accessible to users and developers
  • The parameter-shift logic is now differentiable, allowing higher-order derivatives to be computed regardless of the gradient recipe.
  • Redundant and zero terms are automatically removed from gradient recipes
  • If the output value of the unshifted input tape is known, and the gradient recipe contains an unshifted component, it can be provided to the parameter-shift rule to reduce the number of evaluations required.
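The higher-order claim above can be illustrated with the same toy rule: because the shift rule produces a combination of expectation values, it can be applied to its own output (a NumPy sketch, assuming the ideal two-term rule):

```python
import numpy as np

def shift_rule(f, shift=np.pi / 2):
    """Return a function computing df/dθ via the two-term shift rule."""
    return lambda theta: (f(theta + shift) - f(theta - shift)) / 2

f = np.cos                       # toy expectation value
d2f = shift_rule(shift_rule(f))  # apply the rule twice -> second derivative
theta = 0.3
assert np.isclose(d2f(theta), -np.cos(theta))  # d²cos/dθ² = -cos
```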

Possible Drawbacks:

  • See drawbacks in Differentiable finite-difference gradient transform #1476
  • For variance support, the only non-involutory observable supported is qml.Hermitian. We should consider extending support to qml.Hamiltonian, now that expval(H) is supported.
  • We require JacobianTape as input, since we continue to rely on that subclass's static gradient analysis methods
  • To cache the static gradient analysis, it is appended to the input tape; a better approach should be considered.

Related GitHub Issues:

@josh146 josh146 added the WIP 🚧 Work-in-progress label Jul 27, 2021
@github-actions (bot) commented:

Hello. You may have forgotten to update the changelog!
Please edit .github/CHANGELOG.md with:

  • A one-to-two sentence description of the change. You may include a small working example for new features.
  • A link back to this PR.
  • Your name (or GitHub username) in the contributors section.

@codecov bot commented Jul 27, 2021

Codecov Report

Merging #1479 (3009847) into master (f2bb1a0) will increase coverage by 0.01%.
The diff coverage is 99.36%.


@@            Coverage Diff             @@
##           master    #1479      +/-   ##
==========================================
+ Coverage   98.32%   98.34%   +0.01%     
==========================================
  Files         180      181       +1     
  Lines       12741    12899     +158     
==========================================
+ Hits        12528    12685     +157     
- Misses        213      214       +1     
Impacted Files Coverage Δ
pennylane/gradients/parameter_shift.py 99.35% <99.35%> (ø)
pennylane/gradients/__init__.py 100.00% <100.00%> (ø)
pennylane/gradients/finite_difference.py 100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Base automatically changed from finite-diff to master July 29, 2021 08:19
@josh146 (Member, Author) commented Jul 29, 2021

@antalszava @glassnotes: I fixed the variance rule to take Projectors into account, and added a test, in commits https://github.com/PennyLaneAI/pennylane/pull/1479/files/be46c0ab7019b80cfae4f932c41a2c8028391191..0c5666c7bdb3bb390ca967660d272953973eed1b

@antalszava (Contributor) left a comment:

Just have some questions/suggestions, otherwise it's looking awesome! 🎉

After our chat yesterday, curious: how would we handle template/tape expansions?

As you've suggested, the following now works like a charm:

import pennylane as qml
from pennylane import numpy as np

n_layers = 4
n_wires = 4

dev = qml.device('default.qubit', wires=n_wires)

rng = np.random.default_rng(seed=42)
params = np.array(rng.standard_normal((n_layers, n_wires)))

with qml.tape.JacobianTape() as tape:
    qml.templates.BasicEntanglerLayers(params, wires=range(n_wires)).expand()
    qml.expval(qml.PauliZ(0))

# expand until only device-supported operations remain
tape = tape.expand(
    stop_at=lambda obj: not isinstance(obj, qml.tape.QuantumTape)
    and dev.supports_operation(obj.name)
)

tape.trainable_params = set(range(n_layers * n_wires))
Not sure, however, if it will be intuitive to users to call expand on the template and then call tape.expand too. Maybe it's for another PR?

assert res.shape == (1, 2)

# only called for parameter 0
assert spy.call_args[0][0:2] == (tape, [0])
Contributor replied:

Oh right! So the first two elements are checked for spy.call_args[0], though would we want to make sure that there's no spy.call_args[1] here? Or that the function was called only once as per # only called for parameter 0.

pennylane/gradients/parameter_shift.py (resolved review thread)
_gradient_analysis(tape)
gradient_tapes = []

# TODO: replace the JacobianTape._grad_method_validation
Contributor commented:

Hmm, I think I still don't fully grasp the comment. Would we not deprecate JacobianTape as is and JacobianTape._grad_method_validation with it? How come we'd need to revisit this particular spot?

pennylane/gradients/parameter_shift.py (4 resolved review threads)
tests/gradients/test_parameter_shift.py (1 resolved review thread)
@josh146 (Member, Author) commented Jul 29, 2021

Not sure, however, if it will be intuitive to users to call expand on the template and then call tape.expand too. Maybe it's for another PR?

@antalszava yes exactly 🙂 This PR simply adds the low-level tape transform. A future PR will add support via QNodes, which is always the user-facing level in PL.

@glassnotes (Contributor) left a comment:

I've made it through all the test cases; mostly just caught typos this time around.

One question I have is what happens to the gates that use the four-term shift rule, like the controlled rotations. They're included in some test cases, and the docs for param_shift are clear that the default shift value is only valid for the two-term rule, yet the controlled-rotation tests don't pass any custom shift values or gradient recipes.

pennylane/gradients/parameter_shift.py (5 resolved review threads, 4 outdated)
qml.RY(-0.654, wires=[0])
qml.expval(qml.PauliZ(0))

gradient_recipes = [[[-1e7, 1, 0], [1e7, 1, 1e-7]], [[-1e7, 1, 0], [1e7, 1, 1e-7]]]
Contributor commented:

Why are these numbers so large? 😨

@josh146 (Member, Author) replied:

oh, this is forward finite differences haha. So the coefficients are ±1/h = ±1/1e-7 = ±1e7 😆
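Spelled out: each recipe term is a (coefficient, multiplier, shift) triple, combined as Σᵢ cᵢ · f(aᵢθ + sᵢ), and the numbers above encode forward finite differences with step h = 1e-7 (a plain-Python sketch; the function names are illustrative, not PennyLane API):

```python
import math

# Forward finite differences: f'(x) ≈ (f(x + h) - f(x)) / h with h = 1e-7,
# i.e. the two recipe terms -1e7 * f(x) + 1e7 * f(x + 1e-7).
recipe = [(-1e7, 1, 0), (1e7, 1, 1e-7)]  # (coefficient, multiplier, shift)

def apply_recipe(f, x, recipe):
    return sum(c * f(a * x + s) for c, a, s in recipe)

grad = apply_recipe(math.sin, 0.5, recipe)
assert abs(grad - math.cos(0.5)) < 1e-5  # truncation error is O(h)
```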


def test_independent_parameters_analytic(self):
"""Test the case where expectation values are independent of some parameters. For those
parameters, the gradient should be evaluated to zero without executing the device."""
Contributor commented:

This is actually such an awesome feature 😁

@josh146 (Member, Author) replied:

It's a contentious feature! It relies on networkx, which can be slow 😬

It depends on your perspective: do you want to save quantum compute at the expense of classical compute?
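The trade-off being discussed: detecting that a measurement is independent of a parameter costs a classical graph traversal but saves device executions. A pure-Python sketch of the idea (hypothetical toy circuit, standing in for the networkx-based analysis):

```python
from collections import defaultdict

# Toy dependency graph: RX(θ0) acts on wire 0, which is measured;
# RY(θ1) acts on wire 1, which is never measured.
edges = defaultdict(list)
edges["RX(t0)"].append("measure(0)")  # RY(t1) has no path to measure(0)

def reaches(graph, src, dst):
    """Depth-first reachability check in the circuit graph."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph[node])
    return False

assert reaches(edges, "RX(t0)", "measure(0)")      # gradient tape needed
assert not reaches(edges, "RY(t1)", "measure(0)")  # gradient is exactly 0
```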

tests/gradients/test_parameter_shift.py (2 resolved review threads, outdated)
assert len(tapes) == 4

res = fn(dev.batch_execute(tapes))
assert res.shape == (5, 2)
Contributor commented:

I don't quite follow how this output shape is obtained, is it because there is 1 expectation value and 4 output probabilities, with a gradient for each of the two parameters?

@josh146 (Member, Author) replied:

Yep! The Jacobian is simply reshaped to be a 2D array, so the 1 expval and the 4 probs = 5 outputs. Coupled with the 2 parameters, the output Jacobian has shape (5, 2).
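To make the shape concrete, here is a NumPy-only sketch (a toy function standing in for the circuit) of how 1 expectation value plus 4 probabilities and 2 parameters yield a (5, 2) Jacobian:

```python
import numpy as np

# Hypothetical circuit output: 1 expectation value followed by a 4-entry
# probability vector, as a function of 2 parameters.
def outputs(params):
    a, b = params
    expval = np.cos(a) * np.cos(b)                               # 1 value
    probs = np.full(4, 0.25) + 0.01 * np.array([a, -a, b, -b])   # 4 values
    return np.concatenate([[expval], probs])                     # 5 outputs

def jacobian(f, params, h=1e-6):
    """Central finite-difference Jacobian with shape (n_outputs, n_params)."""
    params = np.asarray(params, dtype=float)
    cols = []
    for i in range(len(params)):
        shift = np.zeros_like(params)
        shift[i] = h
        cols.append((f(params + shift) - f(params - shift)) / (2 * h))
    return np.stack(cols, axis=1)

jac = jacobian(outputs, [0.1, 0.2])
assert jac.shape == (5, 2)  # 1 expval + 4 probs = 5 outputs, 2 parameters
```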

josh146 and others added 2 commits July 29, 2021 21:50
Co-authored-by: Olivia Di Matteo <2068515+glassnotes@users.noreply.github.com>
@glassnotes (Contributor) left a comment:

Thanks @josh146 , looks like all the changes are in, happy to approve!

tests/gradients/test_parameter_shift.py (resolved review thread, outdated)
Co-authored-by: Olivia Di Matteo <2068515+glassnotes@users.noreply.github.com>
@josh146 (Member, Author) commented Jul 30, 2021

[ch7869]

@@ -413,13 +442,15 @@ def test_variance_gradients_agree_finite_differences(self, tol):
qml.CNOT(wires=[1, 0])
qml.RX(params[2], wires=[0])
qml.CNOT(wires=[0, 1])
-    qml.expval(qml.PauliZ(0)), qml.var(qml.PauliX(1))
+    qml.expval(qml.PauliZ(0)), qml.var(qml.PauliZ(0) @ qml.PauliX(1))
Contributor commented:

How come this was changed?

@josh146 (Member, Author) replied:

I don't recall 🤔

@antalszava (Contributor) left a comment:

Looks good to me! 💪 😍 Excited for this!

@josh146 josh146 merged commit ab71001 into master Aug 3, 2021
@josh146 josh146 deleted the parameter-shift branch August 3, 2021 17:03
Labels
review-ready 👌 PRs which are ready for review by someone from the core team.