
Issue1077 chambolle pock #1092

Merged: 33 commits into odlgroup:master, Sep 19, 2017

Conversation

mehrhardt (Contributor)

I renamed Chambolle-Pock to PDHG / primal-dual hybrid gradient in many files. I hope I have not overlooked any.

Moreover, I changed the meaning of "gamma" in PDHG. This is now replaced with gamma_primal. In addition, one can do dual acceleration with gamma_dual.

There are still 3 small issues that should be solved before merging.

  • The documentation is updated, but I think there are two links that still contain the name "Chambolle-Pock", and I didn't know how to change them since I didn't know where exactly they are pointing.
  • The gamma_primal part is tested in a new example of L2-TV denoising (ROF). Here I would like to compute the objective function values over the iterations to show that convergence gets faster with gamma_primal > 0. How would I do that? Moreover, I wanted to compare different versions (different assignments of the functionals f and g). Somehow they converge to different objective function values (which they should not ...).
  • A simple example where gamma_dual > 0 makes sense would be something with Huberized TV. Is this or something similar available in ODL?
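One way to record the objective function values over the iterations is a small callback object that the solver invokes once per iterate; a minimal sketch with plain-Python stand-ins (the class name `CallbackStoreObjective` and the scalar `f`, `g`, `L` below are hypothetical illustrations, not ODL API):

```python
class CallbackStoreObjective:
    """Store f(x) + g(L(x)) each time the solver calls us with an iterate."""

    def __init__(self, f, g, L):
        self.f = f
        self.g = g
        self.L = L
        self.values = []

    def __call__(self, x):
        self.values.append(self.f(x) + self.g(self.L(x)))


# Toy usage with scalar stand-ins for the functionals and the operator
f = lambda x: 0.5 * x ** 2
g = lambda y: abs(y)
L = lambda x: 2.0 * x

cb = CallbackStoreObjective(f, g, L)
for x in [4.0, 2.0, 1.0]:  # pretend these are solver iterates
    cb(x)

print(cb.values)  # [16.0, 6.0, 2.5]
```

Plotting such values for gamma_primal = 0 versus gamma_primal > 0 would show the speedup in question.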

@adler-j (Member) commented Aug 22, 2017

Try having a look at the failing tests, I'll review.

@adler-j (Member) left a comment

Overall a good change I'd say. With that said, the name is quite long; I'd probably prefer pdhg. You should also search the documentation for any residual "chambolle_pock" mentions (e.g. in "getting started"). Those all need to be updated.

Could also need @kohr-h input.

#####################

The `chambolle_pock_solver` was introduced in 2011 by Chambolle and Pock in the paper `A first-order primal-dual algorithm for convex problems with applications to imaging
The `primal_dual_hybrid_gradient_solver` was studied in 2011 by Chambolle and Pock in the paper `A first-order primal-dual algorithm for convex problems with applications to imaging
Member

I'd scrap _solver from the function name now. Too long as is.

Perhaps we could even go with pdhg?

@@ -0,0 +1,124 @@
"""Total variation denoising using the primal-dual hybrid gradient algorithm.
Member

Not quite clear what this function adds? I thought we had something similar before.

Contributor Author

The current version of this file runs 3 algorithms for the ROF (L2^2 + TV) problem. 1) the algorithm of the old example 2) a variant that most people would use for this and 3) the 1/k^2 convergence version that makes use of strong convexity.
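For reference, the ROF model referred to here is, in its standard form (notation assumed rather than taken from the example file, with data g and regularization parameter alpha):

```latex
\min_{x} \; \frac{1}{2} \|x - g\|_2^2 + \alpha \|\nabla x\|_1
```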

Contributor Author

It also shows how to compare algorithms with ODL. Thereby this adds some features that other examples don't have.

Member

Sounds good!

# Make separable sum of functionals, order must correspond to the operator K
f = l1_norm

# Data fit with non-negativity constraint
Member

This could use a word on what is happening here.

Contributor Author

This file is removed and the primal acceleration bit is on the algorithmic comparison

# Gradient operator
gradient = odl.Gradient(space, method='forward')

# Matrix of operators
Member

No matrix to be seen here?

Contributor Author

removed

# Original image
orig = space.element(image)

# Add noise
Member

Comments like this add very little. What you do is obvious from the code here

Contributor Author

This is your or Holger's comment. I am happy to remove it :)

Member

Feel free to remove old cruft :-)

''.format(gamma_dual_in))

if gamma_primal is not None and gamma_dual is not None:
raise ValueError('Only one acceleration parameter can be used')
Member

Any reason for this? Also this should be mentioned in the function definition.

Contributor Author

Done. Depending on which part is strongly convex, the step sizes get either multiplied or divided by theta. Doing both doesn't make sense.
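For context, the accelerated step-size update in question (the primal variant from Chambolle & Pock 2011, Algorithm 2) looks roughly like this; a sketch only, with hypothetical names, where the dual variant would swap the roles of tau and sigma:

```python
import math

def accelerate_primal(tau, sigma, gamma_primal):
    # theta < 1 shrinks the primal step and grows the dual step;
    # the dual variant would divide tau and multiply sigma instead.
    theta = 1.0 / math.sqrt(1.0 + 2.0 * gamma_primal * tau)
    return theta, tau * theta, sigma / theta

theta, tau, sigma = accelerate_primal(tau=1.0, sigma=1.0, gamma_primal=0.5)
```

Note that tau * sigma is unchanged by the update, so a step-size condition of the form tau * sigma * ||L||^2 <= 1 stays satisfied while the iteration accelerates.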

@mehrhardt (Contributor Author)

All changes are done and Travis is happy. @adler-j, do you want to have another look? Maybe @kohr-h wants to add anything?

@mehrhardt (Contributor Author)

Before I forget: the third point I raised initially, about dual acceleration, is still not met. I think it would be great if one looked at Huberized ROF and compared two algorithms: 1) PDHG with dual acceleration and 2) linearly convergent PDHG.

I won't be able to do that this week as this needs some non-trivial coding I suppose.

@kohr-h (Member) left a comment

Looks (very) good. I have a few comments regarding documentation mostly, but otherwise this can go in for sure. Well done!

@@ -220,7 +220,7 @@ Improvements
* Major review of minor style issues. (:pull:`534`)
* Typeset math in proximals. (:pull:`580`)

- Improved installation docs and update of Chambolle-Pock documentation. (:pull:`121`)
- Improved installation docs and update of PDHG documentation. (:pull:`121`)
Member

Please revert the changes in old release notes (I guess you used search-and-replace)

Member

Agree with @kohr-h. We shouldn't edit old release notes

making use of the strong convexity of the problem.

For further details and a description of the solution method used, see
:ref:`PDHG` in the ODL documentation.
Member

I'm not sure this link will work like this. (Well, currently all links are broken :-/, but that's a different story). Perhaps link to a URL in the documentation, you may need to derive the link address from the old one and the new name of the document.

Contributor Author

Better now?

Member

Looking good to me.

self.obj_function_values_ergodic = []

def __call__(self, x):
# Fill in proper calls to functionals here
@kohr-h (Member), Aug 26, 2017

I guess you can remove this comment, now that the proper calls are used :-)

Contributor Author

Done

@@ -0,0 +1,221 @@
"""Total variation denoising using PDHG.
Member

I think this is a very well done and nicely documented example. Great!

reg_param = 0.3

# l2-squared data matching
factr = 1./reg_param * 0.5
Member

pep8

Contributor Author

Neither Travis nor Spyder complained about pep8 here. Can you be more specific what you think is violating pep8?

Member

I guess he's getting at `1./reg_param` missing spaces.

With that said, why not 0.5/reg_param?

Member

Travis doesn't run the examples and thus also no PEP8 check. And indeed, Spyder doesn't complain.

autopep8 seems to do a better (more complete) job here. It's available via pip, and if you run it with the -d flag it gives you a nice diff with suggested changes. You can also run it with -i to apply the changes in place.

# show images
plt.figure(0)
ax1 = plt.subplot(231)
ax1.imshow(orig, clim=[0, 1])
Member

Usually one needs a conversion from ODL's 'xy' axes to matplotlib's 'ij' axes. Probably the images end up correctly oriented like this, but if you do, e.g., orig.show() the image will be rotated by 90 degrees.

When wrapping a raw array representing an image in 'ij' convention, the conversion is np.rot90(img, -1), and the other way around it is np.rot90(odl_elem, 1).

It's up to you if you consider that useful here.
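The 'ij' <-> 'xy' round trip described above can be sanity-checked with plain NumPy (toy array, not an ODL element):

```python
import numpy as np

img = np.arange(6).reshape(2, 3)  # an image in matplotlib's 'ij' convention
as_xy = np.rot90(img, -1)         # wrap it for ODL's 'xy' convention
back = np.rot90(as_xy, 1)         # rotate back before plotting

assert np.array_equal(back, img)  # the round trip is the identity
```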

Member

I would personally really suggest simply using space.element(img).show() here. Guaranteed to get everything correct

Contributor Author

I would like to use img.show(), but for this example it is most important to create plots that are next to each other. Of course the user can drag them there, but I really would not recommend that to anyone. As far as I know, img.show() will always open a new figure, so it cannot be used here. Any ideas how to combine these two goals?

@adler-j (Member), Aug 30, 2017

Well you have a point there, and no, I don't have any idea on how to solve it. I guess this way is fine. One way would be to create a utility:

def convert_to_plt(x):
    # rotate from ODL's 'xy' axes to matplotlib's 'ij', e.g. as above:
    return np.rot90(np.asarray(x), 1)

obj_opt = min(obj_alg1 + obj_alg2 + obj_alg3)


def rel_fun(x): return (np.array(x) - obj_opt)/(x[0] - obj_opt)
Member

Function body on a new line.

"""Chambolle-Pock algorithm for non-smooth convex optimization problems.
def pdhg(x, f, g, L, tau, sigma, niter, **kwargs):
"""Primal-dual hybrid gradient algorithm for non-smooth convex optimization
problems.
Member

This should fit on one line (perhaps simply remove "problems")

Member

"Primal-dual hybrid gradient algorithm for convex optimization" is fine imo

Contributor Author

Done

@@ -39,12 +40,12 @@ def chambolle_pock_solver(x, f, g, L, tau, sigma, niter, **kwargs):

where ``L`` is an operator and ``F`` and ``G`` are functionals.
Member

The F and G references should all be lowercase, could you please fix that while at it?

Contributor Author

done

``F^*`` to be uniformly convex.
with ``tau`` and ``sigma`` as initial values. Requires ``G`` to be
strongly convex. ``gamma_primal`` is the strong convexity constant of
``G``. Acceleration can either be done on the primal part or the dual
Member

I don't quite understand this. It seems impossible that gamma_primal is both a derived property of G and a user choice.
Is the strong convexity constant maybe an upper bound for the choice of acceleration parameter?

Contributor Author

True. In practice, one wants to have gamma as large as possible. Also the use of "strong convexity constant" is often not unique. Some authors refer to it as the largest constant such that property X holds, others as any constant such that X holds.

@adler-j (Member) left a comment

Some minor comments, overall very nice!

@@ -220,7 +220,7 @@ Improvements
* Major review of minor style issues. (:pull:`534`)
* Typeset math in proximals. (:pull:`580`)

- Improved installation docs and update of Chambolle-Pock documentation. (:pull:`121`)
- Improved installation docs and update of PDHG documentation. (:pull:`121`)
Member

Agree with @kohr-h. We shouldn't edit old release notes


# Add noise and convert to space element
image += np.random.normal(0, 0.1, shape)
noisy = space.element(image)
Member

noisy = orig + 0.1 * odl.phantom.white_noise(space)

Contributor Author

Done

# Run algorithms 2 and 3
x_alg2 = x_start.copy()
callback.reset()
callback.callbacks[1].reset()
Member

This should not be needed, given the line above?

Contributor Author

Done.

# show images
plt.figure(0)
ax1 = plt.subplot(231)
ax1.imshow(orig, clim=[0, 1])
Member

I would personally really suggest simply using space.element(img).show() here. Guaranteed to get everything correct

"""Chambolle-Pock algorithm for non-smooth convex optimization problems.
def pdhg(x, f, g, L, tau, sigma, niter, **kwargs):
"""Primal-dual hybrid gradient algorithm for non-smooth convex optimization
problems.
Member

"Primal-dual hybrid gradient algorithm for convex optimization" is fine imo

@mehrhardt (Contributor Author)

Did the corrections. I would propose to merge it for now and we add the dual acceleration example at some other point. For instance when implementing Stochastic PDHG.

@kohr-h (Member) commented Aug 30, 2017

Did the corrections. I would propose to merge it for now and we add the dual acceleration example at some other point. For instance when implementing Stochastic PDHG.

I agree. Could you open a new issue for this so we have it on the agenda?

@mehrhardt (Contributor Author)

Done, see #1101

@kohr-h (Member) commented Aug 30, 2017

I ran the examples and found some issues, PR coming up.

@kohr-h (Member) commented Aug 30, 2017

See mehrhardt#1

There's one issue left, and I don't know the source: The TGV tomography example craps out after 1 iteration. If you set niter = 1 the result is still OK (not NaN), but the next iteration makes everything NaN.
Any idea where this comes from? I haven't checked against the old version of the example, so I can't say if this PR introduces the issue.

@mehrhardt (Contributor Author)

Interestingly, if I run the example the nan values sometimes occur and sometimes not. @kohr-h, can you confirm this or do you always get nan?

@kohr-h (Member) commented Aug 30, 2017

Interestingly, if I run the example the nan values sometimes occur and sometimes not. @kohr-h, can you confirm this or do you always get nan?

I'll check again, but if I remember correctly I ran it about 5 times (also varying `tau` and `sigma`) and got NaN in the second iteration every time.

@kohr-h (Member) commented Aug 30, 2017

For me it seems to reliably fail. The reason is apparently the derivatives: after 1 iteration I get this (the white parts are NaN values):

[images: derivatives_part_0, derivatives_part_1]

@kohr-h (Member) commented Aug 30, 2017

But actually, on the master branch we have the same issue (I just checked), so this PR is not the cause.

So IMO it's fine that we just merge this and make a new issue for the TGV example.

@mehrhardt (Contributor Author)

So, what needs to be done here? There seems to be a conflict in two files. How do we resolve this? I am happy to do it if you tell me how.

@adler-j (Member) commented Sep 19, 2017

Basically what you need to do is:

git checkout master
git pull
git checkout Issue1077_ChambollePock
git rebase master

Then you'll get some conflicts that you need to solve, and once that is done we can merge this.

@adler-j (Member) commented Sep 19, 2017

So something went very wrong in the rebase here :O

@mehrhardt (Contributor Author)

Can you be more specific? What is it that you don't like?

@mehrhardt (Contributor Author) commented Sep 19, 2017

What I did was pretty much what you said:

git checkout upstream/master
git pull
git checkout Issue1077_ChambollePock
git rebase upstream/master

I resolved the conflicts by choosing my files and deleting some others. Then continued with the rebase and pushed the final result.

@kohr-h (Member) commented Sep 19, 2017

Seems to be somehow rebased on the wrong commit. There are old commits apparently added as new ones, which can be seen from the numerous items in the history on this page.

You can undo the whole thing as follows:

git reflog                                # find the last thing before the git rebase
git checkout <last_good_commit_hash>      # check out that commit
git branch -D Issue1077_ChambollePock     # delete the botched branch
git checkout -b Issue1077_ChambollePock   # make a good version again

Then start from that commit again.

It might be that the first part

git checkout upstream/master
git pull

didn't do what you expected it to do. It's good that you rebase on upstream/master, not your local one, since that's also a source of error. My recommended workflow is like this:

git fetch upstream  # updates the remote branches
git rebase upstream/master

Simpler, less error-prone.

@mehrhardt (Contributor Author)

I used a slightly different strategy:

	$ git reset --hard 053f4b251fb408d1944acf97c107b1b0fe6cb78d
	$ git rebase upstream/master

which seems to have done exactly what was needed. Is that OK? Otherwise I can always undo this since I have not pushed these changes. While my branch seems to be exactly what I wanted, I am a bit puzzled that I did not receive conflicts.

I would now push my changes to the remote branch and thereby getting rid of the noise above. Is that fine?

@kohr-h (Member) commented Sep 19, 2017

I used a slightly different strategy:

$ git reset --hard 053f4b2
$ git rebase upstream/master

which seems to have done exactly what was needed. Is that OK? Otherwise I can always undo this since I have not pushed these changes. While my branch seems to be exactly what I wanted, I am a bit puzzled that I did not receive conflicts.

I would now push my changes to the remote branch and thereby getting rid of the noise above. Is that fine?

If that works for you, all good. It's possible that you didn't get conflicts because they could be auto-merged. If you check the messages during rebase and see something like auto-merge then that's it. Usually this happens when the same file has been edited upstream, but in a different place.

Just go ahead and push, there's nothing that can't be fixed 😉

@mehrhardt (Contributor Author)

Yes, there were auto-merges. I double checked the changes of the current version of this pull request and it looks good to me. I could not find anything that was missing or incorrect.

@kohr-h (Member) left a comment

One minor thing that slipped; after that we can merge.


where :math:`\|K\|` is the operator norm of :math:`K`.
where :math:`\|K\|` is the operator norm of :math:`L`.
Member

Should be L in the first one, too

@mehrhardt (Contributor Author)

Done. I found even another "K" that should have been an "L". :)

@kohr-h (Member) commented Sep 19, 2017

Done. I found even another "K" that should have been an "L". :)

Perfect. Just that nasty PEP8 check blew up :-)

@kohr-h (Member) commented Sep 19, 2017

Ok, I'll merge like this, no need to rebase on master again. Thanks @matthiasje!

@kohr-h kohr-h merged commit cc909ef into odlgroup:master Sep 19, 2017
@mehrhardt mehrhardt deleted the Issue1077_ChambollePock branch September 19, 2017 14:13
@adler-j (Member) commented Sep 19, 2017

Now this is a breaking change for some downstream stuff (e.g. all my machine learning) that needs to be fixed.

We need to have a look at that.

@kohr-h (Member) commented Sep 19, 2017

Maybe you should use a pinned git version of ODL in the requirements.txt? 😛
