NNODE fails on simple logistic curve #634

Closed
sdwfrost opened this issue Jan 25, 2023 · 10 comments

@sdwfrost

NeuralPDE.jl fails on this simple example of a logistic curve. I'm not sure whether changing the neural net would help (the output is between 0 and 1), though I've tried many architectures, as well as refining the optimization. An Adam+BFGS combination should work fine for this example.

using ModelingToolkit
using OrdinaryDiffEq
using NeuralPDE
using DomainSets
using Flux
using Optimization
using OptimizationOptimJL
using OptimizationOptimisers
using Plots

@parameters t
@variables x(..)
Dt = Differential(t)
eqs = [Dt(x(t)) ~ x(t)*(1-x(t))]

@named ode_sys = ODESystem(eqs)
ode_prob = ODEProblem(ode_sys, [0.01], (0.0,10.0), [])
ode_sol = solve(ode_prob, Tsit5(), saveat=0.1)
plot(ode_sol)

bcs = [x(0) ~ 0.01]
domains = [t ∈ Interval(0.0, 10.0)];
@named pde_sys = PDESystem(eqs, bcs, domains, [t], [x(t)])

numhid = 32
chain = [Flux.Chain(Flux.Dense(1, numhid, Flux.σ), Flux.Dense(numhid, 1))]
grid_strategy = NeuralPDE.GridTraining(0.1)
discretization = NeuralPDE.PhysicsInformedNN(chain, grid_strategy)
pde_prob = NeuralPDE.discretize(pde_sys, discretization)

global i=1
callback = function (p,l)
    println("Epoch $i: Current loss is: $l")
    global i += 1
    return false
end

res = Optimization.solve(pde_prob, OptimizationOptimisers.Adam(0.1); callback=callback, maxiters=5000, abstol = 1e-10, reltol = 1e-10)
pde_prob = remake(pde_prob, u0 = res.u)
res = Optimization.solve(pde_prob, OptimizationOptimJL.BFGS(); callback=callback, maxiters=5000, abstol = 1e-10, reltol = 1e-10)
phi = discretization.phi
ts = [infimum(d.domain):0.1:supremum(d.domain) for d in domains][1]
xpred  = hcat([phi[1]([t],res.u) for t in ts]...)'
plot(ode_sol)
plot!(ts,xpred)
@lhoenig commented Jun 24, 2023

I think the problem is not, strictly speaking, with NeuralPDE, but that depending on the initialization the NN sometimes gets stuck in local minima. With networks this small, I think this is a real concern. Your code sometimes actually trains successfully for me using Adam, BFGS, or both. You could try making the network wider or deeper to increase the chances of training success, e.g. as sketched below.
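
For illustration, a wider and deeper variant of the original chain could look like this (a sketch only; the layer sizes are arbitrary, and it drops into the original script in place of the chain definition):

# Hypothetical wider/deeper chain; sizes chosen arbitrarily for illustration
numhid = 64
chain = [Flux.Chain(Flux.Dense(1, numhid, Flux.σ),
                    Flux.Dense(numhid, numhid, Flux.σ),  # extra hidden layer
                    Flux.Dense(numhid, 1))]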

@sdwfrost (Author)

I've tried wider and deeper networks (and both at once), and sometimes I get a sigmoidish curve, but it's still way off the true solution, and with many more parameters than a simple basis-function approach would require. I was hoping this example wouldn't require anything large or complex on the neural-net side. The kinds of problems I'm looking at will have the same issue as this simple logistic curve.

@lhoenig commented Jun 25, 2023

I'm more confused after playing around with it some more. Sometimes the loss is really low at the end (around 1e-10) but the solution is totally wrong; for successful runs the loss is more like 1e-7.

Sometimes I get NaN solutions / parameters without an error.

I also tried using Lux instead of Flux, since the docs recommend it for double precision. But with Lux I was never able to train successfully: the loss decreased more consistently, but the solutions just don't seem to work.

Another thing I just realized is that the issue title references NNODE (https://docs.sciml.ai/NeuralPDE/stable/manual/ode/#NeuralPDE.NNODE), but the code doesn't actually use NNODE (or is it used internally?). Maybe the NNODE specialization would work better, as this is an ODE, but I could not get NNODE to work based on the example code there. I always get "Optimization algorithm not found. Either the chosen algorithm is not a valid solver choice for the OptimizationProblem, or the Optimization solver library is not loaded." even though everything is indeed loaded.

@lhoenig commented Jun 25, 2023

> sometimes I get a sigmoidish curve, but still way off the true solution

FWIW, the successful runs do come very close to the sigmoid function for me. I was just able to get a successful solution using Lux, and it looks like this:

[plot: Lux solution closely matching the logistic curve]
But I am not really sure about the loss values. Sometimes it works, sometimes it doesn't; sometimes that is reflected in the loss, other times it doesn't seem to be correlated. Maybe it becomes clearer when looking at the individual loss terms (data and physics loss); see the sketch below.
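
A sketch of how that could be done, assuming the symbolic_discretize API described in the NeuralPDE debugging docs and the pde_sys/discretization (and the (p, l) callback signature) from the original post:

# Build the symbolic problem to get access to the individual loss terms
sym_prob = NeuralPDE.symbolic_discretize(pde_sys, discretization)
pde_losses = sym_prob.loss_functions.pde_loss_functions  # one term per equation
bc_losses = sym_prob.loss_functions.bc_loss_functions    # one term per boundary/initial condition

callback = function (p, l)
    println("total: $l  pde: ", map(f -> f(p), pde_losses), "  bc: ", map(f -> f(p), bc_losses))
    return false
end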

@sdwfrost (Author)

Sorry, the title comes from a discussion of this issue on the Julia Discourse.

@lhoenig commented Jun 26, 2023

I got NNODE to run; I was missing the Optimisers package. But it doesn't really fit the function more reliably than using a PDESystem + PhysicsInformedNN on this problem; for me, actually less so. I really don't understand these highly variable loss values of the PINNs and how they relate to solution quality. With NNODE I get the same results as described by @sdwfrost, i.e. never really an exact fit like I sometimes got with the PDESystem + PhysicsInformedNN code.

Here's my code for using the NNODE:

using Flux
using NeuralPDE
using OrdinaryDiffEq, Optimisers
using Plots
import Lux, OptimizationOptimisers, OptimizationOptimJL

eq = (u, p, t) -> u .* (1 .- u)
tspan = (0.0, 10.0)
u0 = 0.01
prob = ODEProblem(eq, u0, tspan)
ode_sol = solve(prob, Tsit5(), saveat=0.1)

function run()
    chain = Flux.Chain(Flux.Dense(1, 20, σ), Flux.Dense(20, 1))  # unused Flux chain, left from earlier experiments; NNODE below uses luxchain
    luxchain = Lux.Chain(Lux.Dense(1, 20, Lux.σ), Lux.Dense(20, 1))
    opt = OptimizationOptimJL.BFGS()
    # opt = OptimizationOptimisers.Adam(0.1)
    nnode_sol = solve(prob, NeuralPDE.NNODE(luxchain, opt), dt=1 / 10.0, verbose=true, abstol=1.0e-10, maxiters=5000)
    ts = tspan[1]:0.01:tspan[2]
    xpred = [nnode_sol(t) for t in ts]
    plot(ode_sol, label="ODE solution")
    plot!(ts, xpred, label="NNODE solution")
end

run()

@sdesai1287 (Contributor) commented Aug 18, 2023

I've been playing with this and applying a new training strategy to it, and I'm getting an absolute mean difference between the NNODE solution and the ODE solution of about 0.02. With some other strategies this difference is sometimes very large, which confuses me for such a simple problem. Here is the code; I am going to try to make this better.

using NeuralPDE, OrdinaryDiffEq, Optimisers
using Plots, Statistics
import Lux, OptimizationOptimisers, OptimizationOptimJL

eq = (u, p, t) -> u .* (1 .- u)
tspan = (0.0, 10.0)
u0 = 0.01
prob = ODEProblem(eq, u0, tspan)
ode_sol = solve(prob, Tsit5(), saveat=0.01)

N = 4
func = Lux.σ
luxchain = Lux.Chain(Lux.Dense(1, N, func), Lux.Dense(N, N, func), Lux.Dense(N, 1,func))
# opt = OptimizationOptimJL.BFGS()
opt = OptimizationOptimisers.Adam(0.01)
weights = [1/14, 4/14, 4/14, 4/14, 1/14]  # sampling weights over five equal sub-intervals of tspan
samples = 200
alg = NeuralPDE.NNODE(luxchain, opt, autodiff = false, strategy = NeuralPDE.WeightedIntervalTraining(weights, samples))
nnode_sol = solve(prob, alg, verbose=true, maxiters=50000, saveat=0.01)
ts = tspan[1]:0.01:tspan[2]
xpred = [nnode_sol(t) for t in ts]
println(abs(mean(ode_sol .- nnode_sol)))  # absolute mean difference between the two solutions

plot(ode_sol, label="ODE solution")
plot!(ts, xpred, label="NNODE solution")

Here is the plot:
[plot: NNODE (WeightedIntervalTraining) solution vs. ODE solution]

@sdesai1287 (Contributor)

Quadrature training seems to work pretty well for this; I am getting the mean difference between the solutions to be 0.01 or less fairly consistently. Here are the code and a graph.

using OrdinaryDiffEq, Optimisers, NeuralPDE
using Plots, Statistics
import Lux, OptimizationOptimisers, OptimizationOptimJL

eq = (u, p, t) -> u .* (1 .- u)
tspan = (0.0, 10.0)
u0 = 0.01
prob = ODEProblem(eq, u0, tspan)
ode_sol = solve(prob, Tsit5(), saveat=0.01)

N = 4
func = Lux.σ
luxchain = Lux.Chain(Lux.Dense(1, N, func), Lux.Dense(N, N - 1, func), Lux.Dense(N - 1, N - 2, func), Lux.Dense(N - 2, 1, func))
opt = OptimizationOptimisers.Adam(0.001)
alg = NeuralPDE.NNODE(luxchain, opt, autodiff = false, strategy = NeuralPDE.QuadratureTraining())
nnode_sol = solve(prob, alg, verbose=true, abstol = 1e-10, maxiters=100000, saveat=0.01)

ts = tspan[1]:0.01:tspan[2]
xpred = [nnode_sol(t) for t in ts]
xreal = [ode_sol(t) for t in ts]
println(abs(mean(xpred .- xreal)))
println(abs(mean(nnode_sol .- ode_sol)))

plot(ode_sol, label="ODE solution")
plot!(ts, xpred, label="NNODE solution")

[plot: NNODE (QuadratureTraining) solution vs. ODE solution]

@ChrisRackauckas (Member)

👍 I think we should change the docs to use QuadratureTraining in the tutorials and then close this. GridTraining shouldn't be used (as the docs say in other places).
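
For reference, applying that to the original PDESystem script would be a one-line change (a sketch, with QuadratureTraining's default settings assumed):

# Swap GridTraining(0.1) for adaptive quadrature sampling of the residual
discretization = NeuralPDE.PhysicsInformedNN(chain, NeuralPDE.QuadratureTraining())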

@sathvikbhagavan (Member)

@ChrisRackauckas, can we close this? #729 addresses using QuadratureTraining in the docs.
