How to avoid learning a trivial solution? #356

Closed · AneleNcube opened this issue Aug 15, 2021 · 5 comments
@AneleNcube commented Aug 15, 2021

Hi! My question is probably redundant because it is very similar to issue #321. However, I am still having trouble with my output after trying some of the suggestions given in the answers there. The particular problem I am looking at is the one-dimensional Schrödinger equation with an infinite square-well potential of width l:

-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} = E\,\psi(x), \qquad 0 < x < l, \qquad \psi(0) = \psi(l) = 0

For l=1, the exact eigenfunctions and eigenvalues are:

\psi_n(x) = \sqrt{2}\,\sin(n\pi x), \qquad E_n = \frac{n^2\pi^2\hbar^2}{2m}, \qquad n = 1, 2, 3, \ldots

I wanted to find out whether I can compute the eigenfunction for n=1 using PINNs, by setting E = \pi^2/2 and \hbar = m = 1. The boundary conditions for this problem are \psi(0) = \psi(1) = 0. This is a snippet of my application of your DeepXDE library to this forward problem, where I tried a few different approaches (including hard constraints) to impose the boundary conditions:

import numpy as np

import deepxde as dde
from deepxde.backend import tf


def main():
    N = 1
    E = (N * np.pi) ** 2 / 2  # exact eigenvalue E_n = n^2 pi^2 / 2 for hbar = m = l = 1

    def pde(x, psi):
        # Residual of -(1/2) psi'' = E psi, rewritten as (1/2) psi'' + E psi = 0
        psi_xx = dde.grad.hessian(psi, x)
        return 0.5 * psi_xx + E * psi

    def func(x):
        # Exact eigenfunction for n = N
        return np.sqrt(2) * np.sin(N * np.pi * x)

    geom = dde.geometry.Interval(0, 1)

    def boundary_l(x, on_boundary):
        return on_boundary and np.isclose(x[0], 0)

    def boundary_r(x, on_boundary):
        return on_boundary and np.isclose(x[0], 1)

    # Boundary conditions I tried (unused when the hard constraint below is active)
    # bc1 = dde.DirichletBC(geom, func, boundary_l)
    # bc2 = dde.DirichletBC(geom, func, boundary_r)
    # bc = dde.PointSetBC(np.array([[0.0], [1.0]]), np.array([[0.0], [0.0]]), component=0)

    data = dde.data.PDE(
        geom,
        pde,
        [],
        num_domain=100,
        num_boundary=2,
        solution=func,
        num_test=2500,
    )
    net = dde.maps.FNN([1] + [20] * 3 + [1], "tanh", "Glorot uniform")
    # net.apply_output_transform(lambda x, y: (1 - tf.exp(-x)) * (1 - tf.exp(x - 1)) * y)  # parametric trick from Jin & Protopapas (2020)
    net.apply_output_transform(lambda x, y: x * (x - 1) * y)  # hard constraint: psi(0) = psi(1) = 0

    model = dde.Model(data, net)
    model.compile("adam", lr=0.001, metrics=["l2 relative error"])
    losshistory, train_state = model.train(epochs=10000)
    dde.saveplot(losshistory, train_state, issave=True, isplot=True)


if __name__ == "__main__":
    main()

It seems that the neural network prefers the trivial solution whenever the solution at the boundary is zero. When I shrink the domain slightly to [0.1, 0.9] and enforce the values of the exact solution, np.sqrt(2)*np.sin(N*np.pi*x), as the Dirichlet boundary conditions, I do get the expected solution from the neural network.

Is there a way to force the neural network to give a non-trivial solution even when the Dirichlet boundary conditions are zero? I appreciate your assistance with this issue.

@lululxvi (Owner)

Just to clarify, is the zero trivial solution a correct solution or not?

@AneleNcube (Author)

> Just to clarify, is the zero trivial solution a correct solution or not?

Yes, it is a correct solution, but it is non-physical for this problem.

@lululxvi (Owner)

If it is indeed a correct solution, then the NN will find it with high probability, because the zero solution is easy to find. If we try to prevent this, there are a few things you can do:

  • If you know some other property of the solution you want, for example a known point value such as u(x0) = u0, then you can use this as a loss term. By adding this extra constraint, the zero solution is no longer a valid solution (see the sketch after this list).
  • If you don't have this, then you could use an "artificial" loss to keep the NN away from the zero solution. For example, you may use an L2 regularization, but with a negative weight, so that it forces the NN away from all zeros.

There could be other ways, but these two are some ideas you can try.
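
For concreteness, here is a minimal sketch of the first suggestion applied to the problem above (the anchor point x0 = 0.5 and the value sqrt(2) = psi_1(0.5) are taken from the exact solution quoted earlier; everything else mirrors the posted script, so this is an illustration rather than a tested fix):

import numpy as np

import deepxde as dde

N = 1
E = (N * np.pi) ** 2 / 2

def pde(x, psi):
    # Residual of (1/2) psi'' + E psi = 0
    psi_xx = dde.grad.hessian(psi, x)
    return 0.5 * psi_xx + E * psi

geom = dde.geometry.Interval(0, 1)

# Anchor constraint psi(0.5) = sqrt(2): the zero solution cannot satisfy this,
# so it is no longer a valid minimizer of the total loss.
anchor = dde.PointSetBC(np.array([[0.5]]), np.array([[np.sqrt(2)]]), component=0)

data = dde.data.PDE(geom, pde, [anchor], num_domain=100, num_boundary=2)

net = dde.maps.FNN([1] + [20] * 3 + [1], "tanh", "Glorot uniform")
net.apply_output_transform(lambda x, y: x * (x - 1) * y)  # hard BC: psi(0) = psi(1) = 0

model = dde.Model(data, net)
model.compile("adam", lr=0.001)
model.train(epochs=10000)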

@AneleNcube (Author) commented Aug 31, 2021

Thank you for your assistance, @lululxvi.

It definitely makes sense that the NN finds the zero solution with higher probability than the more complex solutions. For the above problem I am given the reference solution, so I can apply your first suggestion by using the solution at one point in the domain as an added constraint for the NN. However, there are other cases I am looking at where the reference solution is not provided, and there I would have to consider your second suggestion. Is the negative weight a way to increase the complexity of the NN model?

Regarding other alternatives, in another issue discussion (#237) I came across an article, https://arxiv.org/abs/2010.05075, that describes unsupervised neural networks for solving quantum eigenvalue problems. One of their examples is the above differential equation, where the PyTorch package is used to build the neural networks. In that case, the loss function consists of the PDE residual, but also has regularization functions, which include:

L_f = \frac{1}{f(x)^2}, \qquad L_{\text{drive}} = e^{-E + c}

These terms enable the NN to avoid trivial eigenfunctions and eigenvalues, respectively. Can these terms be added easily to the loss function in DeepXDE? Solving the differential equations with the PyTorch NN seems to consume a lot of RAM; thus, if I could have the extra regularization functions in DeepXDE, I could solve the eigenvalue problem with minimal RAM usage. How can one embed these terms in the loss function? I appreciate your thoughts on this.

@lululxvi (Owner) commented Sep 4, 2021

My intuition is that a negative weight would increase the complexity of the NN model, but I didn't try this.

It is very easy to do. You can simply assume you have another "PDE" whose residual is 1/f^2.
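
A minimal sketch of how this could look in DeepXDE, assuming the setup from the script above: the pde function returns a list of residuals so that each becomes its own loss term. The epsilon guard and the loss_weights values are illustrative choices of mine, not from the thread:

import numpy as np

import deepxde as dde
from deepxde.backend import tf

N = 1
E = (N * np.pi) ** 2 / 2

def pde(x, psi):
    psi_xx = dde.grad.hessian(psi, x)
    residual = 0.5 * psi_xx + E * psi
    # Extra "PDE": a term that blows up as psi -> 0, penalizing the trivial
    # solution. The batch mean mimics the paper's L_f = 1/f^2 regularizer;
    # the small epsilon avoids division by zero early in training.
    reg = tf.ones_like(psi) / (tf.reduce_mean(psi ** 2) + 1e-6)
    return [residual, reg]

geom = dde.geometry.Interval(0, 1)
data = dde.data.PDE(geom, pde, [], num_domain=100, num_boundary=2)

net = dde.maps.FNN([1] + [20] * 3 + [1], "tanh", "Glorot uniform")
net.apply_output_transform(lambda x, y: x * (x - 1) * y)  # hard BC: psi(0) = psi(1) = 0

model = dde.Model(data, net)
# One loss weight per residual returned by pde: the true residual, then the regularizer.
model.compile("adam", lr=0.001, loss_weights=[1.0, 0.1])
model.train(epochs=10000)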
