
Fix bug where non-differentiable NumPy tensors were not unwrapped #941

Merged
merged 21 commits into master from fix-autograd-unwrapping on Dec 9, 2020

Conversation

@josh146 (Member) commented Nov 28, 2020

Context:

The autograd interface was not unwrapping tensors with requires_grad=False. This was intentional, since the np.tensor ndarray subclass, due to duck typing, should be accepted everywhere an ndarray would be.

However, recent changes to the np.tensor class have caused its behaviour to deviate from standard ndarrays (e.g., indexing an element returns a scalar tensor rather than a Python float), leaving some devices unable to process gate arguments correctly.
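
For illustration only (not part of the PR), a minimal sketch of the deviation described above, assuming PennyLane's wrapped NumPy is imported as np:

from pennylane import numpy as np

x = np.tensor([0.1, 0.2], requires_grad=False)

# Indexing returns a zero-dimensional np.tensor rather than a plain Python float,
# which is the behaviour some devices cannot handle as a gate argument.
element = x[0]

# Unwrapping via .numpy() recovers a vanilla NumPy value that any device accepts.
unwrapped = x.numpy()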

Description of the Change:

Like the PyTorch and TF interfaces, constant (non-trainable) np.tensor parameters are now always unwrapped before device execution.

Benefits:

np.tensor objects are now never passed to the device - the device will only receive ndarray objects.

Possible Drawbacks:

This extra unwrapping may add some overhead. The overhead is likely negligible, but it has not been benchmarked.

Further, some devices worked fine with np.tensor gate arguments (e.g., default.qubit), which is why this issue was not noticed earlier. An argument could be made that this fix should instead be made at the per-device level, only for devices that need it, to avoid the additional overhead; however, that would add maintenance overhead for device developers.

Related GitHub Issues: n/a

@josh146 added the bug 🐛 Something isn't working label Nov 28, 2020
Comment on lines 109 to 110
if return_arraybox:
    return params
Member Author

In order to determine which parameters are trainable, we must not wrap the parameters in an autograd list.

Comment on lines 121 to 126
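# unwrap every np.tensor in the full parameter list to a plain NumPy value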
all_params_unwrapped = [p.numpy() if isinstance(p, np.tensor) else p for p in self._all_parameter_values]

# evaluate the tape
self.set_parameters(all_params_unwrapped, trainable_only=False)
res = self.execute_device(params, device=device)
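# restore the original (wrapped) parameters so the tape is left unchanged for the user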
self.set_parameters(self._all_parameter_values, trainable_only=False)
Member Author

The extra list comprehension and the two new set_parameters calls might add some overhead; worth benchmarking.

@josh146 (Member Author), Nov 28, 2020

However, this is the same solution the TF and Torch interfaces currently implement, so my (gut) feeling is that the overhead is negligible.
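
As a rough illustration of that parallel (paraphrased, not the actual pennylane/tape/interfaces/tf.py source), the TF interface does the equivalent of:

import tensorflow as tf

def unwrap_tf(params):
    # convert any TF variables/tensors to plain NumPy values before device execution
    return [p.numpy() if isinstance(p, (tf.Variable, tf.Tensor)) else p for p in params]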

codecov bot commented Nov 28, 2020

Codecov Report

Merging #941 (9eb5353) into master (687fc7c) will increase coverage by 0.00%.
The diff coverage is 100.00%.


@@           Coverage Diff           @@
##           master     #941   +/-   ##
=======================================
  Coverage   97.85%   97.85%           
=======================================
  Files         147      147           
  Lines       10124    10127    +3     
=======================================
+ Hits         9907     9910    +3     
  Misses        217      217           
Impacted Files Coverage Δ
pennylane/tape/interfaces/autograd.py 100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@speller26 (Contributor) left a comment

This change looks kinda fundamental; is it possible to unit test it?
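
A test along these lines might look roughly as follows. This is a hypothetical sketch, not the test added in this PR; it assumes pytest's monkeypatch fixture and tape mode being enabled via qml.enable_tape() (an assumption about this PennyLane version). The parameter types are recorded at call time, since the tape restores the original wrapped parameters after execution:

import pennylane as qml
from pennylane import numpy as np

def test_non_trainable_params_are_unwrapped(monkeypatch):
    qml.enable_tape()
    dev = qml.device("default.qubit", wires=1)

    seen_types = []
    original_apply = dev.apply

    def recording_apply(operations, *args, **kwargs):
        # record the type of every gate argument at the moment the device receives it
        seen_types.extend(type(p) for op in operations for p in op.parameters)
        return original_apply(operations, *args, **kwargs)

    monkeypatch.setattr(dev, "apply", recording_apply)

    @qml.qnode(dev, interface="autograd")
    def circuit(x):
        qml.RX(x, wires=0)
        return qml.expval(qml.PauliZ(0))

    circuit(np.tensor(0.3, requires_grad=False))

    # the device should only ever see vanilla NumPy/Python values, never np.tensor
    assert np.tensor not in seen_types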

pennylane/tape/interfaces/autograd.py
    if idx in self.trainable_params
]
if return_arraybox:
    return params

return autograd.builtins.list(params)
Contributor

When return_arraybox=False, is the previous behavior bugged, or does params look the same? Just curious because it looks like the default behavior is changing too, with params still being passed to autograd.builtins.list.

Member Author

Hey Cody, this method previously wasn't buggy - while the code here has changed, the (public) logic remains identical.

The main change is that get_parameters() no longer calls self._update_trainable_params(), due to changes in the latter method, which required a bit of a rewrite of the subsequent lines.

So the output of this method should remain the same pre- and post-change.

Contributor

Sweet, thanks!

Comment on lines 128 to 130
self.set_parameters(all_params_unwrapped, trainable_only=False)
res = self.execute_device(params, device=device)
self.set_parameters(self._all_parameter_values, trainable_only=False)
Contributor

Just for my understanding, what are the two new lines for?

@josh146 (Member Author), Nov 28, 2020

Ah! These two additional lines are the main bugfix.

Previously, res = self.execute_device(params, device=device) would execute the device with the (new) trainable parameters. These would be unwrapped, so that an ndarray would be passed to the device rather than an np.tensor.

However, this did not consider the edge case where non-trainable parameters needed to be unwrapped. This explains why requires_grad=True worked, whereas requires_grad=False didn't!

Here, we unwrap all parameters on the tape, execute it, and then revert to the original parameters (so there are no side effects for the user).
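
As a standalone sketch of that pattern (a hypothetical helper, not the PR's actual code; it assumes a tape-like object exposing _all_parameter_values, set_parameters and execute_device, and adds a try/finally so the restore also happens on error):

from pennylane import numpy as np

def execute_with_unwrapped_params(tape, params, device):
    # unwrap every np.tensor on the tape, execute, then restore the originals
    original = tape._all_parameter_values
    unwrapped = [p.numpy() if isinstance(p, np.tensor) else p for p in original]

    tape.set_parameters(unwrapped, trainable_only=False)
    try:
        return tape.execute_device(params, device=device)
    finally:
        # no side effects for the user: the tape keeps its original (wrapped) parameters
        tape.set_parameters(original, trainable_only=False)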

Member Author

I think this can be made more efficient in the long term, by making the two new set_parameters calls the only place where parameter setting happens, and removing the analogous logic (for only the trainable parameters) from execute_device.

josh146 and others added 2 commits November 28, 2020 16:22
Co-authored-by: Cody Wang <speller26@gmail.com>
@speller26 (Contributor) left a comment

Looks great, thanks for tracking this down and fixing it so quickly!

@trbromley (Contributor) left a comment

Thanks @josh146!

pennylane/tape/interfaces/autograd.py
res = self.execute_device(params, device=device)
self.set_parameters(self._all_parameter_values, trainable_only=False)
Contributor

This fix comes with the cost of two more calls to set_parameters? Why do we have to reset the parameters back? Isn't it ok to just permanently cast the requires_grad tensors to NumPy arrays?

Member Author

Good point. The second call to set_parameters, where the unwrapped parameters are replaced with the wrapped ones, is mainly there to avoid surprising the user; that is, it undoes any side effects that tape execution had.

This won't be required when using a QNode, but when using a tape directly, you might otherwise notice that the parameter types have changed after executing the tape on a device 🤔
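
To make the raw-tape case concrete, a rough sketch (the tape construction, AutogradInterface.apply and tape.execute calls below are assumptions about the tape-mode API of this PennyLane version):

import pennylane as qml
from pennylane import numpy as np
from pennylane.tape.interfaces.autograd import AutogradInterface

dev = qml.device("default.qubit", wires=1)

with qml.tape.JacobianTape() as tape:
    qml.RX(np.tensor(0.3, requires_grad=False), wires=0)
    qml.expval(qml.PauliZ(0))

AutogradInterface.apply(tape)

types_before = [type(p) for p in tape.get_parameters(trainable_only=False)]
tape.execute(dev)
types_after = [type(p) for p in tape.get_parameters(trainable_only=False)]

# thanks to the second set_parameters call, the user-visible parameter types are unchanged
assert types_before == types_after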

Member Author

Ideally, after this PR is merged we can revisit and remove all the set_parameters() logic in all three interfaces.

@trbromley (Contributor) left a comment

Thanks @josh146

josh146 and others added 4 commits December 1, 2020 15:51
Co-authored-by: Tom Bromley <49409390+trbromley@users.noreply.github.com>
@trbromley (Contributor) left a comment

Thanks @josh146!

pennylane/tape/interfaces/autograd.py
    op_idx = self._par_info[p_idx]["p_idx"]
    params.append(op.data[op_idx])

return params if return_arraybox else autograd.builtins.list(params)
Contributor

I think I'm not seeing the subtlety: what was the motivation for updating get_parameters() and _update_trainable_params()?

@josh146 (Member Author), Dec 9, 2020

Previously, get_parameters() would call _update_trainable_params() with every call. In retrospect, I don't think this was the right approach:

  • Setting the trainable parameters will in general be called less often than get_parameters() -- the dependency should go the other way. To try and mitigate this overhead, I was setting an instance attribute (params = self._all_parameter_values), which was an ugly side effect that I never liked.

  • It resulted in a get_parameters() method that differed wildly from the base tape method.

In this PR, this has been corrected:

  • Now, _update_trainable_params() calls get_parameters(), to get the latest parameters and determine which are trainable.

  • The contents of the get_parameters() method are now almost identical to the base method, with the exception of the final return statement.

The latter approach has less overhead, with the subtlety that _update_trainable_params() needs the non-autograd-list sequence of parameters in order to determine trainability; it should have been implemented this way from the start.

As to why it was changed in this PR: I can't answer that properly. The change below to _execute (to fix the bug) caused the old approach to stop working -- through trial and error I discovered that this approach (the better one) does work with the change, and the entire PL test suite passes.
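
A paraphrased sketch of that new dependency direction (not the literal diff; the requires_grad-based check is an assumption based on the discussion above):

def _update_trainable_params(self):
    # ask get_parameters for the raw (non-autograd-list) parameter values,
    # then mark as trainable every parameter that reports requires_grad=True
    params = self.get_parameters(trainable_only=False, return_arraybox=True)
    self.trainable_params = {
        idx for idx, p in enumerate(params) if getattr(p, "requires_grad", True)
    }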

@trbromley (Contributor) left a comment

💯

@josh146 merged commit ce7c1de into master Dec 9, 2020
@josh146 deleted the fix-autograd-unwrapping branch December 9, 2020 12:35