Shifted Composite L1 Norm #123

Closed

MaxenceGollier

Implementation of the proximal operator of a composite term $\|Ax+b\|_1$. It differs from the other operators in this library in that shifts are no longer simply $\psi(x+s)$ for some nondifferentiable function $\psi$.
Hence, some abstract types are added in ShiftedProximalOperators.jl.

In practice, such proximal operators are useful when we want to model a penalty term $x \mapsto \|c(x)\|_1$. The first-order model of the penalty term is then $s \mapsto \|c(x) + J(x)s\|_1$, where $J$ is the Jacobian of $c$.

As mentioned above, the shift is no longer simply $\psi(x+s)$. Hence, we add a CompositeNormL1 struct, which mimics the unshifted function and implements $x \mapsto \|c(x)\|_1$. Shifting it yields a ShiftedCompositeNormL1 struct, which implements $s \mapsto \|c(x)+J(x)s\|_1$.
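
To make the distinction concrete, here is a minimal sketch of the two functions. The map `c`, its Jacobian `J`, and the helper names `penalty` and `model` are hypothetical and chosen only for illustration; this is not the code added by the PR.

```julia
using LinearAlgebra

# Hypothetical smooth map c : R^2 -> R^2 and its Jacobian (illustration only).
c(x) = [x[1]^2 - 1.0, x[1] * x[2]]
J(x) = [2.0*x[1] 0.0; x[2] x[1]]

# Unshifted penalty: x -> ||c(x)||_1
penalty(x) = norm(c(x), 1)

# First-order model around x: s -> ||c(x) + J(x) s||_1
model(x, s) = norm(c(x) + J(x) * s, 1)

x = [2.0, 1.0]
s = [0.1, -0.1]

penalty(x)         # value of the penalty at x
model(x, zero(s))  # coincides with penalty(x): the model is exact at s = 0
model(x, s)        # first-order approximation of penalty(x + s)
```

The model is a function of the step `s` with `x` held fixed, which is why it cannot be written as $\psi(x+s)$ for a fixed $\psi$.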

	new file:   src/CompositeNormL1.jl
	new file:   src/ShiftedCompositeNormL1.jl
	modified:   src/ShiftedProximalOperators.jl
	modified:   test/runtests.jl
Member

dpo left a comment

I think things would be clearer if you defined a type

abstract type AbstractCompositeNorm <: CompositeProximableFunction end

and then

CompositeNormL1 <: AbstractCompositeNorm
CompositeNormL2 <: AbstractCompositeNorm
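
As a rough sketch of what this hierarchy buys, assuming minimal stand-in structs: the field `λ` and the function `weight` below are hypothetical, and the real structs would also carry `c!`, `J!`, `A` and `b`.

```julia
# Stand-in for the package's existing abstract type (name taken from the comment above).
abstract type CompositeProximableFunction end

# Proposed intermediate type shared by all composite norms.
abstract type AbstractCompositeNorm <: CompositeProximableFunction end

# Hypothetical minimal structs, for illustration only.
struct CompositeNormL1{T <: Real} <: AbstractCompositeNorm
  λ::T
end

struct CompositeNormL2{T <: Real} <: AbstractCompositeNorm
  λ::T
end

# A method written once against the abstract type applies to both norms.
weight(ψ::AbstractCompositeNorm) = ψ.λ

weight(CompositeNormL1(1.0))
weight(CompositeNormL2(0.5))
```

Code common to every composite norm can then dispatch on `AbstractCompositeNorm`, leaving only the norm-specific parts to each concrete type.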

MaxenceGollier and others added 9 commits March 10, 2024 11:16
Co-authored-by: Dominique <dominique.orban@gmail.com>
Co-authored-by: Dominique <dominique.orban@gmail.com>
Co-authored-by: Dominique <dominique.orban@gmail.com>
Co-authored-by: Dominique <dominique.orban@gmail.com>
Co-authored-by: Dominique <dominique.orban@gmail.com>
Co-authored-by: Dominique <dominique.orban@gmail.com>
should no longer need a NormL1 struct to be constructed.
MaxenceGollier and others added 3 commits March 13, 2024 21:26
Co-authored-by: Dominique <dominique.orban@gmail.com>
Co-authored-by: Dominique <dominique.orban@gmail.com>
Co-authored-by: Dominique <dominique.orban@gmail.com>
&J(x) : \mathbb{R}^n \xrightarrow[]{} \mathbb{R}^{m\times n}
\end{aligned}
```
such that J is the Jacobian of c. A and b should respectively be a matrix and a vector which can respectively store the values of J and c.
Member

It would be useful to state how c!() and J!() are expected to be called and what they should return. In particular, is A expected to be dense or sparse? In NLPModels, there is no Jacobian method that fills in a matrix, i.e., something like jac!(nlp, x, A) does not exist. However, there is jac_coord!(nlp, x, vals), which fills the array vals with the nonzero elements of the Jacobian in sparse coordinate format.
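
For instance, one convention the documentation could describe is sketched below. The argument order of `c!`, the helper `J_vals!`, and the example functions are assumptions for illustration, not what the PR implements.

```julia
using SparseArrays

# Assumed convention: callbacks work in place and return their first argument.
function c!(b, x)
  b[1] = x[1]^2 - 1.0
  b[2] = x[1] * x[2]
  return b
end

# Sparse variant: the sparsity pattern is fixed once, and only the nonzero
# values are overwritten, in the spirit of jac_coord!(nlp, x, vals).
rows = [1, 2, 2]
cols = [1, 1, 2]
function J_vals!(vals, x)
  vals[1] = 2.0 * x[1]  # entry (1, 1)
  vals[2] = x[2]        # entry (2, 1)
  vals[3] = x[1]        # entry (2, 2)
  return vals
end

x = [2.0, 1.0]
b = zeros(2)
vals = zeros(3)

c!(b, x)
J_vals!(vals, x)
A = sparse(rows, cols, vals, 2, 2)  # assemble the Jacobian from coordinates
```

With this convention, `A` is sparse and only `vals` needs updating when `x` changes. A dense `A` would instead need a callback that fills the whole matrix.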

Author

I added a commit, see MaxenceGollier:affineNormL1@561c415
I only changed the documentation. Do you expect me to enforce the type of A in the constructor? I don't know whether that is necessary.

@@ -0,0 +1,85 @@
export ShiftedCompositeNormL1

mutable struct ShiftedCompositeNormL1{
Member

This needs some documentation to explain that $c$ will be linearized; otherwise, we would expect the shifted function to be $s \to \Vert c(x+s) \Vert_1$.

Author

See MaxenceGollier:affineNormL1@bb3faaa

end

function (ψ::CompositeProximableFunction)(y)
z = similar(ψ.b)
Member

Preallocate.

Author

If I add z as an argument, there will be problems later with R2. More generally, most solvers evaluate $\psi(y)$ at some point without passing an extra argument.
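
One possible compromise, sketched here with a hypothetical `L1Model` type (not the struct from this PR): store the workspace as a field, so the call signature stays `ψ(y)` and evaluation reuses the same buffer.

```julia
using LinearAlgebra

# Hypothetical struct for illustration; z is a preallocated workspace of length m.
struct L1Model{T}
  A::Matrix{T}
  b::Vector{T}
  z::Vector{T}
end

# Allocate the workspace once, at construction time.
L1Model(A, b) = L1Model(A, b, similar(b))

# Evaluate ||A*y + b||_1 without allocating a new vector on each call.
function (ψ::L1Model)(y)
  copyto!(ψ.z, ψ.b)              # z = b
  mul!(ψ.z, ψ.A, y, true, true)  # z = A*y + z
  return norm(ψ.z, 1)
end

A = [1.0 0.0 -1.0 0.0; 0.0 2.0 0.0 1.0]  # m = 2, n = 4
b = [1.0, -2.0]
ψ = L1Model(A, b)

ψ(ones(4))
ψ(zeros(4))
```

Solvers that call `ψ(y)` with a single argument are unaffected. The cost is that the object carries mutable state, so one instance should not be shared across concurrent evaluations.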

ψ = CompositeOp(λ,c!,J!,A,b)

# test non shifted operator
@test ψ(ones(Float64,4)) == h([1,2])
Member

Why does $\psi$ take an input of length 4 but $h$ takes an input of length 2 here?

Author

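In our implementation, $\psi$ represents $\psi : \mathbb{R}^n \to \mathbb{R}$, $s \mapsto \|c(x)+J(x)s\|_1$, with $c(x) \in \mathbb{R}^m$ and $J(x) \in \mathbb{R}^{m \times n}$. In this test, $n=4$ and $m=2$, so $\psi$ takes an input of length 4 while $h$ is applied to a vector of length $m = 2$.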
