Stiefel Manifold #264

Open
Fengcheng-Pei opened this issue Mar 22, 2024 · 6 comments

Fengcheng-Pei commented Mar 22, 2024

Hi,

In my problem, the optimization variable A (3x2) lies on the Stiefel manifold. My original problem P1 is:

max z
s.t.  g(A) >= z
      f(A) >= Q
      A lies on the Stiefel manifold St(3, 2)

I then use the exact penalty method to transform the original problem into a standard unconstrained manifold optimization problem P2:

minimize -z + a*b*log[1+exp((z-g(A))/b)] + a*b*log[1+exp((Q-f(A))/b)]
s.t.      A lies on the Stiefel manifold St(3, 2)

where z and Q are constants, a is the penalty factor and b is the smoothing factor. To solve this problem, I iteratively optimize A while increasing a and decreasing b until convergence. However, the objective function of P2 just grows as the penalty factor a increases and does not converge to -z. The original problem is to find a feasible solution; I do not know whether my transformation works, and I cannot understand why the objective of P2 keeps increasing until it reaches NaN.
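The penalty term above is the standard log-sum-exp smoothing of the hinge max(0, ·). A minimal numpy sketch (the function name and test values here are mine, not from the post); note that one possible source of the NaN is that evaluating log(1+exp(x)) naively overflows to inf once the argument gets large, which happens exactly when the smoothing factor b is driven toward zero, so a numerically stable form such as np.logaddexp is advisable:

```python
import numpy as np

def smoothed_penalty(violation, b):
    """Smooth surrogate for max(0, violation): b * log(1 + exp(violation / b)).

    As b -> 0 this tends to max(0, violation); for violations much larger
    than b it grows linearly, which is what makes the penalty (almost) exact.
    np.logaddexp(0, x) computes log(1 + exp(x)) without overflow, whereas
    the naive formula returns inf (and then NaN downstream) once x > ~700.
    """
    return b * np.logaddexp(0.0, violation / b)

# The smoothing factor b controls how closely we track the hinge max(0, v):
for b in (1.0, 0.1, 0.001):
    print(smoothed_penalty(2.0, b), smoothed_penalty(-2.0, b))
# -> approaches (2.0, 0.0) as b shrinks
```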

I really need your help and hope for your response!

Best regards,
Fengcheng Pei

Fengcheng-Pei commented Mar 24, 2024

Hi,

I also tried a simple Stiefel manifold optimization with the exact penalty method, and it does not converge either. However, with the Sphere manifold and the same exact penalty method it does converge. I do not understand why and would appreciate your advice. I have attached my code below.

# Excerpt from a class; requires numpy, torch and pymanopt.
import numpy as np
import torch
import pymanopt
from pymanopt.manifolds import Stiefel

def arraysteering(self, beam, ra, rd):
    W = torch.tensor(beam)
    Du = torch.tensor(self.Du)
    D = torch.tensor(self.D)
    rou = 1.0                                # penalty factor (rho)
    rou_max = 2 ** 20
    theta_rou = 2.0                          # growth rate of the penalty factor
    miu = 1.0                                # smoothing factor (mu)
    min_miu = 10 ** (-6)
    theta_miu = (min_miu / miu) ** (1 / 30)  # decay rate of the smoothing factor
    diff = 1.0
    ts = 0.01                                # convergence tolerance
    r1 = rd.H
    r2 = ra.H

    while diff > ts:
        # warm-start from the current (r1, r2)
        init_ori = np.zeros(shape=(3, 2))
        init_ori[:, 0] = np.array(r1).squeeze()
        init_ori[:, 1] = np.array(r2).squeeze()
        manifold = Stiefel(3, 2)

        @pymanopt.function.pytorch(manifold)
        def cost(A):
            # steering vectors a[i, j] built from the two columns of A
            a = torch.tensor(1j * np.zeros(shape=(self.T + self.K, self.N)))
            for i in range(self.T + self.K):
                for j in range(self.N):
                    a[i, j] = (self.alpha
                               * torch.sin(torch.arccos(A[:, 0] @ Du[:, i]))
                               * torch.complex(torch.cos(j * torch.pi * Du[:, i] @ A[:, 1]),
                                               torch.sin(-j * torch.pi * Du[:, i] @ A[:, 1])))

            obj_2 = 0
            for i in range(self.T):
                for j in range(self.K):
                    obj_2 = obj_2 - torch.linalg.norm(a[self.K + i, :] @ W[:, j]) ** 2
            for i in range(self.K):
                sig_p = torch.linalg.norm(1 / self.npower * torch.sqrt(torch.tensor(self.beta))
                                          / torch.linalg.norm(D[:, i]) * a[i, :] @ W[:, i]) ** 2
                int_p = -sig_p
                for j in range(self.K):
                    int_p = int_p + torch.linalg.norm(1 / self.npower * torch.sqrt(torch.tensor(self.beta))
                                                      / torch.linalg.norm(D[:, i]) * a[i, :] @ W[:, j]) ** 2
                # smoothed exact-penalty term for the QoS constraint
                obj_2 = obj_2 + rou * miu * torch.log2(1 + torch.exp((self.QoS - sig_p / (int_p + 1)) / miu))

            return obj_2

        prob_2 = pymanopt.Problem(manifold, cost)
        optimizer = pymanopt.optimizers.ConjugateGradient()
        Ori = np.matrix(optimizer.run(prob_2, initial_point=init_ori).point)
        r1 = np.matrix(np.copy(Ori[:, 0]))
        r2 = np.matrix(np.copy(Ori[:, 1]))

        diff = np.linalg.norm(Ori[:, 0] - init_ori[:, 0]) + np.linalg.norm(Ori[:, 1] - init_ori[:, 1])
        rou = min(theta_rou * rou, rou_max)   # tighten the penalty
        miu = max(theta_miu * miu, min_miu)   # sharpen the smoothing

    Ori_opt = np.matrix(np.zeros(shape=(3, 2)))
    Ori_opt[:, 0] = np.matrix(np.copy(r1))
    Ori_opt[:, 1] = np.matrix(np.copy(r2))
    return Ori_opt

Best regards,
Fengcheng Pei

@kellertuer

I do not work much with the Python package here (I am the Julia-variant person ;))

But the first thing I do not understand is why A only appears in the constraints. Doesn't that make A a parameter one can just choose, after which finding max z is easy? Or are you maximising over all A? Then I do not understand why your cost would be the constant z.
Maybe you can clarify.

For your switch of the manifolds, you would then switch from
Stiefel(3,2) – that is, two orthonormal vectors in R^3 stored column-wise in a matrix – to
Sphere(3,2) – a 3x2 matrix whose two columns are unit vectors.
So this is a completely different set you are optimising over, since for the second manifold there is nothing connecting the two columns (unlike orthogonality in the first). So why did you try the other manifold?

For the code I can sadly not contribute much; as I said, I do not program in Python (much), and I have never used pytorch.

Fengcheng-Pei commented Mar 24, 2024

Hi,

Thank you very much for your response! Let me explain my problem more clearly. My original problem P1 is:

max min(g_l(w,r1,r2)) for l=1,...,L
s.t.  f_k(w,r1,r2) >= Q for k=1,...,K
      ||r1||=1
      ||r2||=2
      r1.T*r2=0

So r1 and r2 lie on the Stiefel manifold. Problem P1 is equivalent to problem P2:

max z
s.t.  g_l(w,r1,r2)>=z for l=1,...,L
      f_k(w,r1,r2) >= Q for k=1,...,K
      ||r1||=1
      ||r2||=2
      r1.T*r2=0

With block coordinate descent, I optimize z and w using convex optimization. The remaining problem P3, for optimizing r1 and r2, is:

max z
s.t.  g_l(w,A)>=z for l=1,...,L
      f_k(w,A) >= Q for k=1,...,K
      A in Stiefel(3,2)

where z and w are constants optimized in the previous step. To solve P3, I use the exact penalty method and get problem P4:

min -z + m*(Σ_l n*log(1+exp((z-g_l)/n)) + Σ_k n*log(1+exp((Q-f_k)/n)))
s.t.  A in Stiefel(3,2)

where m and n are the penalty factor and the smoothing factor, respectively. I start with a small m and iteratively increase m while optimizing A until convergence. However, the objective function of P4 keeps increasing during the iterations because of the growth of m.
Previously, I solved P4 by optimizing r1 and r2 separately on the Sphere manifold, and the exact penalty method worked well; I kept the orthogonality by using the null-space trick. However, with the Stiefel manifold it does not converge as the penalty factor m increases. I am just very confused why this happens.
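For what it's worth, one plausible reading of the "null-space trick" mentioned above (my reconstruction, not the poster's actual code): fix r1 on the sphere, build an orthonormal basis N of the null space of r1.T, and parametrize r2 = N u with u on a lower-dimensional sphere, so that r1.T*r2 = 0 holds by construction:

```python
import numpy as np
from scipy.linalg import null_space  # orthonormal basis of the null space

rng = np.random.default_rng(0)
r1 = rng.standard_normal(3)
r1 /= np.linalg.norm(r1)          # r1 on the unit sphere in R^3

N = null_space(r1[None, :])       # 3x2; columns orthonormal, orthogonal to r1
u = rng.standard_normal(2)
u /= np.linalg.norm(u)            # u on the unit sphere in R^2
r2 = N @ u                        # unit norm and r1.T @ r2 == 0 by construction

print(abs(r1 @ r2))               # ~0 (orthogonality)
print(np.linalg.norm(r2))         # ~1 (rescale if ||r2|| = 2 is really meant)
```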

Best regards,
Fengcheng Pei


kellertuer commented Mar 24, 2024

Hi,
just two further short comments.

I see that your r2 has to have norm 2, but both Stiefel (second column) and Sphere (second component) would yield norm 1; does that make a difference?
Besides that, I am confused that P1 reads "max min"; is that a typo?

orthogonality by using the trick of the null space.

I do not understand this from such a short description, since there might just be too many things called a "trick".

Overall I would probably prefer to go for P1 directly; I am not sure why you rephrase it.

Just as a side remark – If you want to have something to compare to, there is

@Fengcheng-Pei (Author)

Hi,

Thank you very much for your help! I have solved my problem: I had made a mistake with the dimension of the output of the optimization problem. However, I found that the simulation using the Stiefel manifold takes much longer than using the Sphere manifold. Is that because the retraction step on the Stiefel manifold is more complex than on the Sphere manifold?

Best regards,
Fengcheng Pei

@kellertuer

I still have not understood the "orthogonality trick", but well, great that it works now!

"Much longer" is a bit vague and hard to judge. Does it take many more steps, or the same number of steps?
It also depends a lot on the choice of the retraction. One way to see retractions is as approximations of the exponential map, so some do that less accurately. On the sphere you probably use the exponential map, since it is easy enough to compute; on Stiefel several retractions are usually available (but not the exponential map).
The same of course holds for the inverse retraction (or logarithmic map).
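To make the comparison concrete, here is a numpy-only sketch (not pymanopt's internals) of the closed-form exponential map on the sphere versus a QR-based retraction on Stiefel, which is one common choice there:

```python
import numpy as np

def sphere_exp(x, v):
    """Exponential map on the sphere: exact geodesic step, cheap to compute."""
    nv = np.linalg.norm(v)
    if nv < 1e-15:
        return x
    return np.cos(nv) * x + np.sin(nv) * (v / nv)

def stiefel_qr_retraction(X, V):
    """QR-based retraction on Stiefel: a first-order approximation of the
    exponential map, returned as the Q factor of a QR decomposition."""
    Q, R = np.linalg.qr(X + V)
    # fix column signs (R with positive diagonal) so the map is continuous
    return Q * np.sign(np.sign(np.diag(R)) + 0.5)

rng = np.random.default_rng(1)
X, _ = np.linalg.qr(rng.standard_normal((3, 2)))   # point on Stiefel(3, 2)
W = rng.standard_normal((3, 2))
V = W - X @ (X.T @ W + W.T @ X) / 2                # project W to the tangent space
Y = stiefel_qr_retraction(X, 0.1 * V)
print(np.allclose(Y.T @ Y, np.eye(2)))             # True: stays on the manifold

x = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 0.3, 0.0])                      # tangent at x (v @ x == 0)
print(np.linalg.norm(sphere_exp(x, v)))            # stays on the sphere
```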
