Added pytorch autograd backend (minimal changes) #66

Open · wants to merge 15 commits into base: master
Conversation

4 participants
@leonbottou

leonbottou commented Nov 9, 2018

This adds a pytorch backend to pymanopt without changing anything else. The pytorch backend is selected by passing arg=torch.Tensor() when creating the Problem instance, as illustrated in the additional examples.

PyTorch's tape-based differentiation requires us to compute the cost whenever we want the gradient, and to compute the gradient whenever we want the Hessian. To avoid these duplicate computations, the additional arg=torch.Tensor() in fact serves as a container that caches all the computations performed for the latest value of x. Other than that, the implementation is pretty straightforward.
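A minimal sketch of the caching idea described above, under assumed names (CachedCost and its methods are illustrative stand-ins, not the actual pymanopt classes): results computed for the most recent value of x are reused, so asking for the gradient does not recompute the cost.

```python
import numpy as np

class CachedCost:
    """Illustrative cache keyed on the latest x (not the real pymanopt code)."""

    def __init__(self, fn, grad_fn):
        self._fn, self._grad_fn = fn, grad_fn
        self._x = None
        self._x_id = None  # id() of the last ndarray: a cheap identity test
        self._value, self._grad = None, None

    def _refresh(self, x):
        # Same object: cache hit, nothing to recompute.
        if self._x_id == id(x):
            return
        # Different object but identical contents: still a cache hit.
        if self._x is not None and np.array_equal(self._x, x):
            self._x_id = id(x)
            return
        # Genuinely new point: recompute and store everything.
        self._x, self._x_id = x.copy(), id(x)
        self._value = self._fn(x)
        self._grad = self._grad_fn(x)

    def cost(self, x):
        self._refresh(x)
        return self._value

    def grad(self, x):
        self._refresh(x)
        return self._grad
```

With this in place, a solver that calls cost(x) and then grad(x) on the same point triggers only one forward/backward computation.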

@sweichwald


Member

sweichwald commented Nov 9, 2018

Nice, thanks for the PR and the accompanying examples!

This looks good to me -- only the flake8 tests and some minor pedantic remarks to be fixed, then it should be ready to be merged :-)

Resolved review comments on pymanopt/core/problem.py (outdated)
@@ -73,6 +73,7 @@ def precon(x, d):
self._backends = list(
filter(lambda b: b.is_available(), [


@sweichwald

sweichwald Nov 9, 2018

Member

Prefer alphabetical order


@leonbottou

leonbottou Nov 9, 2018

One needs to test for PytorchBackend's availability before AutogradBackend's because both check that the objective is callable and only the former tests the argument.
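The ordering constraint can be illustrated with a hypothetical sketch (all names here, including FakeTensor and pick_backend, are stand-ins, not the actual pymanopt API): both availability checks accept any callable objective, but only the pytorch check also inspects the argument, so the more specific check must come first.

```python
class FakeTensor:
    """Stand-in for torch.Tensor, so the sketch runs without pytorch."""


def autograd_compatible(objective, argument):
    # Autograd's check only verifies the objective is callable.
    return callable(objective)


def pytorch_compatible(objective, argument):
    # The pytorch check additionally inspects the argument.
    return callable(objective) and isinstance(argument, FakeTensor)


def pick_backend(objective, argument):
    # Order matters: test the more specific backend first. Swapping the
    # two entries would make autograd win even when a tensor was passed.
    checks = [("pytorch", pytorch_compatible),
              ("autograd", autograd_compatible)]
    for name, is_compatible in checks:
        if is_compatible(objective, argument):
            return name
```

In this sketch, pick_backend(cost, FakeTensor()) selects "pytorch" while pick_backend(cost, None) falls through to "autograd"; with alphabetical ordering, both would select "autograd".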


@sweichwald

sweichwald Nov 12, 2018

Member

Again related to #55 #56
@nkoep Option 1 outlined in #55 resolves the most pressing issues for now, while it can be seen as groundwork for Option 2 should the need for this more intrusive rework arise.
I agree, time permitting, resolving #55 #56 as outlined and then merging #66 is preferable. (In case none of us finds the time, I think it is fair to merge this in the meanwhile and revisit #55 #56 at a later point.)


@leonbottou

leonbottou Nov 12, 2018

It is clear that using arg=torch.Tensor() to indicate that one wants the pytorch backend is a hack. To make things dirtier, one side effect of this hack is to provide a container for the cache. There are surely cleaner ways to do all this, at the expense of more work as usual. In the end, this is your project and you must decide what makes sense for you.

@nkoep


Member

nkoep commented Nov 9, 2018

I'm not sure I agree with merging this just yet before coming to a final decision regarding #55 and #56. From a cursory glance, it seems the caching mechanism suggested here should be compatible with the rudimentary pytorch backend proposed here: https://github.com/pymanopt/pymanopt/pull/56/files#diff-25bf4ee53fa80a478acbac699286855e

@sweichwald


Member

sweichwald commented Nov 9, 2018

I agree that we should move on with (deciding on) #55 / #56 and a more fundamental rework of the backend for the mid-/long-term, which we have not gotten around to yet.

In my opinion, a viable short-term way to offer pytorch support is to merge this PR, which integrates well with our current backend. This way we are not bottlenecked by the more mid-/long-term plans of reworking the backend more fundamentally. It appears the merge would not hinder any future reworks, while it offers some pytorch support for the current version. At this point, this may help keep more pytorch users engaged, which will also be helpful once the fundamental backend change comes.

@leonbottou


leonbottou commented Nov 9, 2018

Extra points of potential value:

  • I noticed that the other autodiff backends are designed to handle objectives that take a sequence of matrices as input instead of a single matrix, and I copied that behavior. But this is not well tested because none of my use cases involved such manifolds.
  • This was designed to work with the current stable version of pytorch, that is, 0.4.1. I did not test with earlier releases.
  • The caching code works but is far from ideal. Comparing matrices has a nonzero cost. I use the id of the underlying ndarray to speed up the negative answers, but that covers only about a third of the cases. The best approach would be to pass extra arguments to cost() so that it also computes and returns the derivatives, e.g.
   cost(x, egrad=True, ehess=True, ehessdir=u)  --->  (cost, egrad, ehess)

Alas this involves changing the solvers.
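A hedged sketch of the API floated in the last bullet: one cost() call that optionally also returns the derivatives, so nothing is computed twice. The keyword names follow the comment above; the toy objective f(x) = x^T x and its closed-form derivatives are purely illustrative, not part of pymanopt.

```python
import numpy as np

def cost(x, egrad=False, ehess=False, ehessdir=None):
    """Toy objective f(x) = x^T x that can also return its derivatives."""
    value = float(x @ x)
    out = [value]
    if egrad:
        out.append(2.0 * x)           # Euclidean gradient of x^T x
    if ehess:
        out.append(2.0 * ehessdir)    # Hessian (2I) applied to direction u
    return tuple(out) if len(out) > 1 else value

x = np.array([1.0, 2.0])
u = np.array([0.0, 1.0])
value, grad, hess_u = cost(x, egrad=True, ehess=True, ehessdir=u)
```

A solver aware of this convention could request exactly the quantities it needs in a single call, which is why adopting it would require changes on the solver side as noted above.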

@nkoep


Member

nkoep commented Nov 9, 2018

Fair enough. My only worry was that adding a pytorch backend now, and then immediately breaking the API once we move forward with #55 wouldn't be ideal either. The thing is that there's ultimately nothing which holds #55 back barring a decision on which method to adopt. The actual implementation work is very minor.

leonbottou and others added some commits Nov 9, 2018

Update examples/linreg_multiple_pytorch.py
Co-Authored-By: leonbottou <leon@bottou.org>
Update pymanopt/core/problem.py
Co-Authored-By: leonbottou <leon@bottou.org>
Update pymanopt/core/problem.py
Co-Authored-By: leonbottou <leon@bottou.org>
@coveralls


coveralls commented Nov 9, 2018

Coverage Status

Coverage increased (+1.6%) to 48.749% when pulling 7d8c46f on leonbottou:master into 3ca07ab on pymanopt:master.
