How to avoid learning a trivial solution? #356

Closed · AneleNcube opened this issue Aug 15, 2021 · 5 comments
@AneleNcube commented Aug 15, 2021

Hi! My question is probably redundant because it is very similar to issue #321. However, I am still having trouble with my output after trying some of the suggestions given in the answers there. The particular problem I am looking at is the one-dimensional Schrödinger equation with an infinite square-well potential of width l:

-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} = E\,\psi(x), \qquad 0 < x < l, \qquad \psi(0) = \psi(l) = 0

For l=1, the exact eigenfunctions and eigenvalues are:

\psi_n(x) = \sqrt{2}\,\sin(n\pi x), \qquad E_n = \frac{n^2\pi^2\hbar^2}{2m}, \qquad n = 1, 2, 3, \ldots

I wanted to find out whether I can compute the eigenfunction for n=1 using PINNs, by setting E = \pi^2/2 and \hbar = m = 1. The boundary conditions for this problem are \psi(0) = \psi(1) = 0. This is a snippet of my application of your DeepXDE library to this forward problem, where I tried a few different approaches (including hard constraints) to impose the boundary conditions:

import numpy as np

import deepxde as dde
from deepxde.backend import tf


def main():
    N = 1
    E = (N * np.pi) ** 2 / 2  # exact eigenvalue E_n = n^2 pi^2 / 2 for hbar = m = l = 1

    def pde(x, psi):
        # Residual of -(1/2) psi'' = E psi, rewritten as (1/2) psi'' + E psi = 0
        psi_xx = dde.grad.hessian(psi, x)
        return 0.5 * psi_xx + E * psi

    def func(x):
        # Exact eigenfunction for n = N
        return np.sqrt(2) * np.sin(N * np.pi * x)

    geom = dde.geometry.Interval(0, 1)

    def boundary_l(x, on_boundary):
        return on_boundary and np.isclose(x[0], 0)

    def boundary_r(x, on_boundary):
        return on_boundary and np.isclose(x[0], 1)

    # Boundary conditions I tried (unused when the hard constraint below is active)
    # bc1 = dde.DirichletBC(geom, func, boundary_l)
    # bc2 = dde.DirichletBC(geom, func, boundary_r)
    # bc = dde.PointSetBC(np.array([[0.0], [1.0]]), np.array([[0.0], [0.0]]), component=0)

    data = dde.data.PDE(
        geom,
        pde,
        [],
        num_domain=100,
        num_boundary=2,
        solution=func,
        num_test=2500,
    )
    net = dde.maps.FNN([1] + [20] * 3 + [1], "tanh", "Glorot uniform")
    # net.apply_output_transform(lambda x, y: (1 - tf.exp(-x)) * (1 - tf.exp(x - 1)) * y)  # parametric trick from Jin & Protopapas (2020)
    net.apply_output_transform(lambda x, y: x * (x - 1) * y)  # hard constraint: psi(0) = psi(1) = 0

    model = dde.Model(data, net)
    model.compile("adam", lr=0.001, metrics=["l2 relative error"])
    losshistory, train_state = model.train(epochs=10000)
    dde.saveplot(losshistory, train_state, issave=True, isplot=True)


if __name__ == "__main__":
    main()

It seems that the neural network prefers the trivial solution whenever the solution at the boundary is zero. When I shrink the domain slightly to [0.1, 0.9] and enforce the values of the exact solution, np.sqrt(2)*np.sin(N*np.pi*x), as the Dirichlet boundary conditions, I do get the expected solution from the neural network.

Is there a way to force the neural network to give a non-trivial solution even when the Dirichlet boundary conditions are zero? I appreciate your assistance with this issue.

@lululxvi (Owner)

Just to clarify, is the zero trivial solution a correct solution or not?

@AneleNcube (Author)

> Just to clarify, is the zero trivial solution a correct solution or not?

Yes, it is a correct solution, but it is non-physical for this problem.

@lululxvi (Owner)

If it is indeed a correct solution, then the NN will find it with high probability, because the zero solution is easy to find. If we try to prevent this, there are a few things you can do:

  • If you know some other property of the solution you want, for example a known point value such as u(x0) = u0, then you can use this as a loss term. By adding this extra constraint, the zero solution is no longer a valid solution (see the sketch after this list).
  • If you don't have this, then you could use an "artificial" loss to keep the NN away from the zero solution. For example, you may use an L2 regularization, but with a negative weight, so that it forces the NN away from all zeros.

There could be other ways, but these two are some ideas you can try.
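
For concreteness, here is a minimal sketch of the first suggestion applied to the problem above (the anchor point x0 = 0.5 and the value sqrt(2) = psi_1(0.5) are taken from the exact solution quoted earlier; everything else mirrors the posted script, so this is an illustration rather than a tested fix):

import numpy as np

import deepxde as dde

N = 1
E = (N * np.pi) ** 2 / 2

def pde(x, psi):
    # Residual of (1/2) psi'' + E psi = 0
    psi_xx = dde.grad.hessian(psi, x)
    return 0.5 * psi_xx + E * psi

geom = dde.geometry.Interval(0, 1)

# Anchor constraint psi(0.5) = sqrt(2): the zero solution cannot satisfy this,
# so it is no longer a valid minimizer of the total loss.
anchor = dde.PointSetBC(np.array([[0.5]]), np.array([[np.sqrt(2)]]), component=0)

data = dde.data.PDE(geom, pde, [anchor], num_domain=100, num_boundary=2)

net = dde.maps.FNN([1] + [20] * 3 + [1], "tanh", "Glorot uniform")
net.apply_output_transform(lambda x, y: x * (x - 1) * y)  # hard BC: psi(0) = psi(1) = 0

model = dde.Model(data, net)
model.compile("adam", lr=0.001)
model.train(epochs=10000)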

@AneleNcube (Author) commented Aug 31, 2021

Thank you for your assistance, @lululxvi.

It definitely makes sense that the NN finds the zero solution with higher probability than the more complex solutions. For the above problem I am given the reference solution, so I can apply your first suggestion by using the solution at one point in the domain as an added constraint for the NN. However, there are other cases I am looking at where the reference solution is not provided, and there I would have to consider your second suggestion. Is the negative weight a way to increase the complexity of the NN model?

Regarding other alternatives, in another issue discussion (#237) I came across an article, https://arxiv.org/abs/2010.05075, that describes unsupervised neural networks for solving quantum eigenvalue problems. One of their examples is the above differential equation, where the PyTorch package is used to build the neural networks. In that case, the loss function consists of the PDE residual, but also has regularization functions, which include:

L_f = \frac{1}{f(x)^2}, \qquad L_{\text{drive}} = e^{-E + c}

These terms enable the NN to avoid trivial eigenfunctions and eigenvalues, respectively. Can these terms be added easily to the loss function in DeepXDE? Solving the differential equations with the PyTorch NN seems to consume a lot of RAM; thus, if I could have the extra regularization functions in DeepXDE, I could solve the eigenvalue problem with minimal RAM usage. How can one embed these terms in the loss function? I appreciate your thoughts on this.

@lululxvi (Owner) commented Sep 4, 2021

My intuition is that a negative weight would increase the complexity of the NN model, but I didn't try this.

It is very easy to do. You can simply assume you have another "PDE" whose residual is 1/f^2.
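
A minimal sketch of how this could look in DeepXDE, assuming the setup from the script above: the pde function returns a list of residuals so that each becomes its own loss term. The epsilon guard and the loss_weights values are illustrative choices of mine, not from the thread:

import numpy as np

import deepxde as dde
from deepxde.backend import tf

N = 1
E = (N * np.pi) ** 2 / 2

def pde(x, psi):
    psi_xx = dde.grad.hessian(psi, x)
    residual = 0.5 * psi_xx + E * psi
    # Extra "PDE": a term that blows up as psi -> 0, penalizing the trivial
    # solution. The batch mean mimics the paper's L_f = 1/f^2 regularizer;
    # the small epsilon avoids division by zero early in training.
    reg = tf.ones_like(psi) / (tf.reduce_mean(psi ** 2) + 1e-6)
    return [residual, reg]

geom = dde.geometry.Interval(0, 1)
data = dde.data.PDE(geom, pde, [], num_domain=100, num_boundary=2)

net = dde.maps.FNN([1] + [20] * 3 + [1], "tanh", "Glorot uniform")
net.apply_output_transform(lambda x, y: x * (x - 1) * y)  # hard BC: psi(0) = psi(1) = 0

model = dde.Model(data, net)
# One loss weight per residual returned by pde: the true residual, then the regularizer.
model.compile("adam", lr=0.001, loss_weights=[1.0, 0.1])
model.train(epochs=10000)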
