
Added pytorch autograd backend (minimal changes) #66

Closed
wants to merge 15 commits into from

Conversation

leonbottou

This adds a pytorch backend to pymanopt without changing anything else. The pytorch backend is selected by passing arg=torch.Tensor() when creating the Problem instance, as illustrated in the additional examples.

PyTorch's tape-based differentiation requires computing the cost whenever we want the gradient, and computing the gradient whenever we want the Hessian. To avoid all these duplicate computations, the additional arg=torch.Tensor() in fact serves as a container that caches all the computations performed for the latest value of x. Other than that, the implementation is pretty straightforward.
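The caching idea described above can be sketched in plain Python. This is a hypothetical illustration, not the PR's actual code: all names are made up, a toy quadratic stands in for a real pytorch computation, and object identity stands in for the PR's comparison of the underlying ndarray.

```python
# Hypothetical sketch of the caching idea: the extra "arg" object doubles
# as a container that remembers the most recent point x and every quantity
# computed for it, so asking for the gradient after the cost (or the cost
# after the gradient) does not trigger a recomputation.

class CachedComputations:
    def __init__(self):
        self._x_id = None      # id() of the last point seen
        self._values = {}      # cached results for that point
        self.evaluations = 0   # number of actual computations (for testing)

    def lookup(self, x, key, compute):
        # Fast negative answer: a different object identity means a new
        # point, so the whole cache is invalidated.
        if id(x) != self._x_id:
            self._x_id = id(x)
            self._values = {}
        if key not in self._values:
            self.evaluations += 1
            self._values[key] = compute(x)
        return self._values[key]


# Toy cost f(x) = x^2 with gradient 2x, standing in for a pytorch graph.
cache = CachedComputations()

def cost(x):
    return cache.lookup(x, "cost", lambda x: x * x)

def grad(x):
    return cache.lookup(x, "grad", lambda x: 2 * x)

x = 3.0
print(cost(x), grad(x))   # -> 9.0 6.0  (two computations)
print(cost(x))            # -> 9.0      (served from the cache)
print(cache.evaluations)  # -> 2
```

Note that id()-based invalidation is only safe while the caller keeps the point alive; the hedged point in the thread below is precisely that such identity checks only cover part of the cases.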

@sweichwald
Member

Nice, thanks for the PR and the accompanying examples!

This looks good to me -- only the flake8 tests and some minor pedantic remarks to be fixed, then it should be ready to be merged :-)

pymanopt/core/problem.py (outdated review threads, resolved)
@@ -73,6 +73,7 @@ def precon(x, d):
self._backends = list(
filter(lambda b: b.is_available(), [
Member

Prefer alphabetical order

Author

One needs to test for PytorchBackend's availability before AutogradBackend's because both check that the objective is callable and only the former tests the argument.
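The ordering constraint can be illustrated with a minimal sketch. These are not pymanopt's real classes; the availability checks are simplified stand-ins (in particular, the Tensor type test avoids importing torch), and all names are illustrative.

```python
# Illustrative sketch of why the order of backend availability checks
# matters when several backends accept a plain callable objective.

class AutogradBackend:
    @staticmethod
    def is_available(objective, arg=None):
        # Accepts any callable, regardless of the extra argument.
        return callable(objective)

class PytorchBackend:
    @staticmethod
    def is_available(objective, arg=None):
        # Also requires the sentinel argument to look like a torch tensor
        # (checked by type name here to keep the sketch self-contained).
        return callable(objective) and type(arg).__name__ == "Tensor"

def pick_backend(objective, arg, backends):
    # First available backend wins, mirroring the filter() in problem.py.
    for backend in backends:
        if backend.is_available(objective, arg):
            return backend.__name__
    return None

cost = lambda x: x
# With PytorchBackend tested first, a plain callable still falls through
# to AutogradBackend; the reverse order would shadow PytorchBackend even
# when a tensor argument is supplied.
print(pick_backend(cost, None, [PytorchBackend, AutogradBackend]))  # -> AutogradBackend
```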

Member

Again related to #55 and #56.
@nkoep Option 1 outlined in #55 resolves the most pressing issues for now, while it can be seen as groundwork for Option 2 should the need for this more intrusive rework arise.
I agree that, time permitting, resolving #55/#56 as outlined and then merging #66 is preferable. (In case none of us finds the time, I think it is fair to merge this in the meanwhile and revisit #55/#56 at a later point.)

Author

It is clear that using arg=torch.Tensor() to indicate that one wants the pytorch backend is a hack. To make things dirtier, one side effect of this hack is to provide a container for the cache. There are surely cleaner ways to do all this, at the expense of more work as usual. In the end, this is your project and you must decide what makes sense for you.

@nkoep
Member

nkoep commented Nov 9, 2018

I'm not sure I agree with merging this just yet before coming to a final decision regarding #55 and #56. From a cursory glance, it seems the caching mechanism suggested here should be compatible with the rudimentary pytorch backend proposed here: https://github.com/pymanopt/pymanopt/pull/56/files#diff-25bf4ee53fa80a478acbac699286855e

@sweichwald
Member

sweichwald commented Nov 9, 2018

I agree that we should move on with (deciding on) #55 / #56 and a more fundamental rework of the backend for the mid-/long-term, which we have not gotten around to yet.

In my opinion, a viable solution to offer pytorch support in the short term is merging this PR, which integrates well with our current backend. This way we are not bottlenecked by the more mid-/long-term plans of reworking the backend more fundamentally. It appears the merge would not hinder any future reworks, while it offers some pytorch support for the current version. At this point, this may help keep more pytorch users engaged, which will also be helpful once the fundamental backend change comes.

@leonbottou
Author

leonbottou commented Nov 9, 2018

Extra points of potential value:

  • I noticed that the other autodiff backends are designed to handle objectives that take a sequence of matrices as inputs instead of a single matrix, and I copied that behavior. But this is not well tested because none of my use cases involved such manifolds.
  • This was designed to work with the current stable version of pytorch, that is, 0.4.1. I did not test with earlier releases.
  • The caching code works but is far from ideal. Comparing matrices has a nonzero cost. I use the id of the underlying ndarray to speed up the negative answers, but that only covers about a third of the cases. Best would be to pass extra arguments to cost() so it also computes and returns derivatives, e.g.
    cost(x, egrad=True, ehess=True, ehessdir=u)  ->  (cost, egrad, ehess)

Alas this involves changing the solvers.
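The API change floated above can be sketched as follows. This is a hedged illustration of the proposed calling convention only: a toy quadratic with hand-written derivatives stands in for a real autodiff backend, and all names beyond the proposed keyword arguments are made up.

```python
# Sketch of a cost() that returns the requested derivatives in one call,
# letting the solver decide what gets computed in a single pass instead
# of relying on a cache. Toy function f(x) = x^2.

def cost(x, egrad=False, ehess=False, ehessdir=None):
    out = [x * x]                 # f(x)
    if egrad:
        out.append(2 * x)         # f'(x)
    if ehess:
        # Hessian-vector product f''(x) * u for the supplied direction.
        out.append(2 * ehessdir)
    return tuple(out) if len(out) > 1 else out[0]

print(cost(3.0))                                         # -> 9.0
print(cost(3.0, egrad=True))                             # -> (9.0, 6.0)
print(cost(3.0, egrad=True, ehess=True, ehessdir=1.0))   # -> (9.0, 6.0, 2.0)
```

As the comment notes, adopting such a signature would require the solvers to request derivatives explicitly, which is exactly the change deemed out of scope for this PR.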

@nkoep
Member

nkoep commented Nov 9, 2018

Fair enough. My only worry was that adding a pytorch backend now, and then immediately breaking the API once we move forward with #55 wouldn't be ideal either. The thing is that there's ultimately nothing which holds #55 back barring a decision on which method to adopt. The actual implementation work is very minor.

leonbottou and others added 5 commits November 9, 2018 11:31
Co-Authored-By: leonbottou <leon@bottou.org>
@coveralls

coveralls commented Nov 9, 2018


Coverage increased (+1.6%) to 48.749% when pulling 7d8c46f on leonbottou:master into 3ca07ab on pymanopt:master.

@leonbottou
Author

Ping?

@nkoep
Member

nkoep commented Jun 14, 2019 via email

@leonbottou
Author

leonbottou commented Jun 14, 2019

Niklas: please do not take my previous message as a request for immediate action. Pymanopt really helped us understand our problem. We have since made analytical progress and no longer rely on manifold optimization at all, so there is no urgency on our side. I was just curious whether the idea of a pytorch backend was still alive. Anyway, this is not a big change. The only nontrivial issues were the caching and the backend selection method, and my patch does nothing for the latter...

@nkoep
Member

nkoep commented Jun 16, 2019

Oh, no worries, Leon. It's been bugging me for a while that I had to abandon the rewrite somewhere in the middle. Adding support for pytorch is still very much of interest. Thanks for clarifying the situation though.

@nkoep nkoep closed this Feb 12, 2020