Shallow Water Equations and Riemann Problems #247
Comments
The training loss is very small, but the solution is not correct. If the code is correct, then one possible reason is that your solution is O(0.001). Try to rescale your problem so that it is O(1).
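A quick check of why this rescaling is safe here (based on the fluxes in the code below): with $\tilde h = s\,h$ and $\tilde g = g/s$, the mass flux becomes $\tilde h u = s\,hu$ and the momentum flux $\tilde h u^2 + \tfrac{1}{2}\tilde g \tilde h^2 = s\,(h u^2 + \tfrac{1}{2} g h^2)$, so with the zero source terms used here every residual is multiplied by the same constant $s$, and the scaled system keeps the original form while the depth becomes O(1).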
Thank you for the recommendation, @lululxvi, the quality of the solution improved dramatically. Now, since there is still a significant gap between the NN outcome and the analytic solution, I'm trying to replicate a procedure presented in https://doi.org/10.1016/j.cma.2019.112789 for a Riemann problem, the very same problem I'm facing. The problem is that I can't get the L-BFGS-B optimizer to iterate up to that epsilon condition: it stops way before. I tried

```python
model.compile('adam', lr=0.0005)
model.train(epochs=8000)

model.compile('L-BFGS-B')
early_stopping = EarlyStopping(min_delta=1e-16, patience=200000)
model.train(epochs=200000, callbacks=[early_stopping])
```

but it didn't work. What's wrong? Thank you,
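For context, a likely reason L-BFGS-B stops early (consistent with the fix described later in this thread): scipy's L-BFGS-B terminates as soon as its own `ftol`/`gtol` tolerances are met, regardless of how many epochs are requested, and in older DeepXDE versions these options were fixed inside the training code. A sketch of the scipy-style options that would need to be tightened (names from `scipy.optimize.minimize(method="L-BFGS-B")`):

```python
# Options controlling when scipy's L-BFGS-B stops; where to set them depends
# on the DeepXDE version (see later in this thread).
lbfgs_options = {
    "maxiter": 200000,  # maximum number of iterations
    "maxfun": 200000,   # maximum number of objective evaluations
    "ftol": 1e-20,      # stop when the relative decrease of the loss falls below this
    "gtol": 1e-20,      # stop when the projected gradient norm falls below this
    "maxls": 50,        # maximum line-search steps per iteration
}
```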
@Ryszard2

@lululxvi

@katayooneshkofti See FAQ.
Yes, here's the code. Beware that, even if it seems pretty correct to me, I can't get to the scenario where the NN learns the correct solution with monster precision.

```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import matplotlib.pyplot as plt
import numpy as np
import deepxde as dde
from deepxde.backend import tf
from deepxde.callbacks import EarlyStopping

dde.config.real.set_float64()

# 1D dam-break (Riemann problem) on [X_min, X_max], with the depth rescaled
# by scale_h so that the solution is O(1).
dim_input = 2
dim_output = 2
scale_h = 1000.0
Time = 6.0
X_min = 0.0
X_max = 10.0
X_0 = 5.0                # position of the initial discontinuity
h_L = 0.005 * scale_h    # left water depth (scaled)
h_R = 0.001 * scale_h    # right water depth (scaled)
g = 9.81 / scale_h       # gravity rescaled consistently with the depth


def pde(x, y):
    # Shallow water equations in conservative form: U_t + E_x = S.
    h = y[:, 0:1]
    u = y[:, 1:2]
    U1 = h
    U2 = h * u
    E1 = h * u
    E2 = h * u * u + 0.5 * g * h * h
    E1_x = tf.gradients(E1, x)[0][:, 0:1]
    E2_x = tf.gradients(E2, x)[0][:, 0:1]
    U1_t = tf.gradients(U1, x)[0][:, 1:2]
    U2_t = tf.gradients(U2, x)[0][:, 1:2]
    # Bed-slope and friction source terms (both zero in this setup).
    Sgx = 0.0
    Sfx = 0.0
    S1 = 0.0
    S2 = g * h * (Sgx - Sfx)
    equaz_1 = U1_t + E1_x - S1
    equaz_2 = U2_t + E2_x - S2
    return [equaz_1, equaz_2]


def on_initial(_, on_initial):
    return on_initial


def boundary(_, on_boundary):
    return on_boundary


def boundary_0(x, on_boundary):
    return on_boundary and np.isclose(x[0], X_min)


def boundary_L(x, on_boundary):
    return on_boundary and np.isclose(x[0], X_max)


def func_IC_h(X):
    # Piecewise-constant initial depth: h_L for x <= X_0, h_R otherwise.
    x = tf.cast(X[:, 0:1], dtype=tf.float64)
    c1 = tf.math.less_equal(x, X_0)
    f1 = tf.ones_like(x) * h_L
    f2 = tf.ones_like(x) * h_R
    return tf.where(c1, f1, f2)


def func_IC_u(x):
    return 0.0


def func_BC_h1(x):
    return h_L


def func_BC_h2(x):
    return h_R


def func_BC_u(x):
    return 0.0


geom = dde.geometry.Interval(X_min, X_max)
timedomain = dde.geometry.TimeDomain(0.0, Time)
geomtime = dde.geometry.GeometryXTime(geom, timedomain)

IC_h = dde.IC(geomtime, func_IC_h, on_initial, component=0)
IC_u = dde.IC(geomtime, func_IC_u, on_initial, component=1)
BC_h1 = dde.DirichletBC(geomtime, func_BC_h1, boundary_0, component=0)
BC_h2 = dde.DirichletBC(geomtime, func_BC_h2, boundary_L, component=0)
BC_u = dde.DirichletBC(geomtime, func_BC_u, boundary, component=1)
BC = [IC_h, IC_u, BC_h1, BC_h2, BC_u]

data = dde.data.TimePDE(
    geomtime, pde, BC,
    num_domain=6000,
    num_boundary=20,
    num_initial=200)
# My original script also passed anchors=TOT_an here; TOT_an (the extra
# anchor points) was defined elsewhere and is omitted in this excerpt.

net = dde.maps.FNN([dim_input] + [30] * 4 + [dim_output],
                   activation="tanh",
                   kernel_initializer="Glorot uniform")
model = dde.Model(data, net)

model.compile('adam', lr=0.0005)
model.train(epochs=8000)

model.compile('L-BFGS-B')
early_stopping = EarlyStopping(min_delta=1e-16, patience=200000)
model.train(epochs=200000, callbacks=[early_stopping])

# Plot the predicted water depth at the first and last plotting times.
plot_Time = np.linspace(0.0, Time, 2)
N_nn = 500
X_nn = np.linspace(X_min, X_max, N_nn).reshape(-1, 1)
for t in plot_Time:
    T_nn = np.ones((N_nn, 1)) * t
    Q_nn = np.hstack((X_nn, T_nn))
    W_nn = model.predict(Q_nn)
    Z_nn = W_nn[:, 0] / scale_h  # undo the depth rescaling
    plt.figure()
    plt.plot(X_nn, Z_nn, 'r-', lw=3., label='NN Solution')
    plt.legend(loc='best')
    plt.title(r'T $= {:.2f}$ [s]'.format(t))
    plt.grid(True)
    plt.show()
```
It worked; the tweaks to the source code were crucial to extend the L-BFGS-B iterations as much as I wanted. I still couldn't get a great result despite the mammoth 120 x 4 architecture. No problem, I'll settle for a solution that is pretty much acceptable around the shock. I have no more questions; thank you for your valuable help.
@Ryszard2 I have recently downloaded the latest version of DeepXDE and I can't find train.py in the deepxde folder. How can I change the options for L-BFGS-B?
Use …
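In recent DeepXDE versions the L-BFGS options can be set programmatically instead of editing train.py; a sketch (keyword names as in recent releases, to be checked against your installed version), called before compiling:

```python
import deepxde as dde

# Raise the iteration caps and tighten the stopping tolerances, then compile.
dde.optimizers.set_LBFGS_options(
    maxcor=100, ftol=0, gtol=1e-12, maxiter=200000, maxfun=200000, maxls=50)
model.compile('L-BFGS-B')  # `model` as built earlier in this thread
model.train()
```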
Thank you for your contribution!
Just to make sure, do I use this line before or after compiling L-BFGS?
Hello @engsbk, I'll try to explain what I've done and what I got. I did it with an honest setup:

```python
data = dde.data.TimePDE(
    geomtime, pde, IC_BC,
    num_domain=6000,
    num_boundary=10,
    num_initial=1000)

net = dde.maps.FNN(
    layer_sizes=[dim_input] + [60] * 4 + [dim_output],
    activation="tanh",
    kernel_initializer="Glorot uniform")

model = dde.Model(data, net)
model.compile('adam', lr=0.001, loss_weights=[1, 1, 1e4, 5e4])
model.restore(my_path, verbose=1)  # my_path points to a previously saved checkpoint
```

The results are decent but not great. In order to improve this model, I'd recommend additional training, say 300,000 epochs with Adam, this time with a smaller learning rate, say 1e-4 or 1e-5 or even smaller, depending on how well the convergence improves. That's just my opinion :)
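For concreteness, the suggested continuation might look like this (a sketch reusing the names from the snippet above):

```python
# Continue training from the restored checkpoint with a smaller learning rate.
model.compile('adam', lr=1e-4, loss_weights=[1, 1, 1e4, 5e4])
model.train(epochs=300000)
```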
Thanks a lot!! I did not use the exact same approach you did, but I increased the IC weight, modified the gtol for L-BFGS, and also increased the maximum number of L-BFGS iterations. Results improved drastically! For some reason it was L-BFGS that made the difference for me, not Adam: the Adam loss would get stuck at one value around 20,000 epochs and would not improve even when I used 300,000 epochs.
How about setting the non-IC loss weights to 0?
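For concreteness, that pre-training on the IC alone might look like this (a sketch using the `model` from your script; the weight vector must match your own loss ordering, which in DeepXDE is the PDE residuals first, then the ICs/BCs in the order they were passed):

```python
# Hypothetical ordering [PDE_1, PDE_2, IC_h, IC_u, BC_u]; adapt to your setup.
model.compile('adam', lr=0.001, loss_weights=[0, 0, 1, 1, 0])  # fit the IC only
model.train(epochs=10000)
model.compile('adam', lr=0.001, loss_weights=[1, 1, 100, 100, 1])  # all terms back
model.train(epochs=100000)
```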
I'm trying this option now, and I will update you with the results, but wouldn't this ignore training the FNN for the BC and PDE?
Yes, other optimizers are supported. Or, you may consider a hard constraint for the IC.
I suggest using a hard constraint for the IC.
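A minimal sketch of what a hard IC constraint could look like for the dam-break problem above (assuming the TF backend and the names `h_L`, `h_R`, `X_0`, `net` from the earlier code; not necessarily the exact construction intended here):

```python
# Build the IC into the network output so it holds exactly at t = 0:
# h(x, 0) = h0(x) (the dam-break step) and u(x, 0) = 0.
def output_transform(x, y):
    t = x[:, 1:2]
    h0 = tf.where(x[:, 0:1] <= X_0,
                  h_L * tf.ones_like(t),
                  h_R * tf.ones_like(t))
    h = h0 + t * y[:, 0:1]  # reduces to h0(x) at t = 0
    u = t * y[:, 1:2]       # reduces to 0 at t = 0
    return tf.concat([h, u], axis=1)

net.apply_output_transform(output_transform)
```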
Thanks @lululxvi for your reply. I will test using hard constraints for the IC. Also, how can I add this optimizer to the set of optimizers supported in DeepXDE: https://github.com/juntang-zhuang/Adabelief-Optimizer? I think it would be a good idea to test with it as well.
Modify the code at https://github.com/lululxvi/deepxde/tree/master/deepxde/optimizers
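Before wiring it into deepxde/optimizers, a standalone sanity check of the optimizer itself might look like this (a sketch assuming the `adabelief_tf` package from the repo linked above, on a toy Keras model):

```python
import numpy as np
import tensorflow as tf
from adabelief_tf import AdaBeliefOptimizer  # pip install adabelief-tf

# Toy regression, just to verify AdaBelief runs as a drop-in Keras optimizer.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(2,))])
model.compile(optimizer=AdaBeliefOptimizer(learning_rate=1e-3), loss="mse")
x = np.random.rand(64, 2)
y = x.sum(axis=1, keepdims=True)
model.fit(x, y, epochs=5, verbose=0)
```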
Thanks! I started modifying it. I added the optimizer there, and I got an error. So I used a different call, based on how the optimizer is defined, but then I received a `tape`-related error. While reading about the error, it seems that the tape required here is the tape that computed the loss, but I don't know where that is. It would be great if we worked together to add this optimizer to the DeepXDE library; it has a reputable performance. Thanks again @lululxvi!
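For reference, the TF2 pattern that error points at, as a minimal runnable sketch with a toy model (not DeepXDE internals): when the loss handed to an optimizer is a Tensor rather than a callable, the `GradientTape` that recorded it must be available, or the gradients must be computed explicitly:

```python
import tensorflow as tf

# Toy model and data, just to illustrate the tape pattern.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(2,))])
x = tf.random.normal((8, 2))
y = tf.zeros((8, 1))
optimizer = tf.keras.optimizers.Adam()  # stand-in for the AdaBelief optimizer

with tf.GradientTape() as tape:
    loss = tf.reduce_mean((model(x) - y) ** 2)  # Tensor loss recorded on the tape
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))
```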
Which backend do you use, TF v1 or v2?
I'm using TF 2.7, but the beginning of my code says:
Thanks @lululxvi!
Great. You may also send a PR.
Thank you for the valuable recommendations! Sure.
Which backend did you use, TF v1 or v2?
I was using TF v2.7 with the "tensorflow" backend.
What is the error of …?
This was the error I got.
Use the TF v2 backend.
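One way to switch backends (per the DeepXDE docs; verify for your version) is to set the `DDE_BACKEND` environment variable before running, e.g. `DDE_BACKEND=tensorflow python main.py`, or to change the default once with `python -m deepxde.backend.set_default_backend tensorflow`.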
But I still get this error when trying to use it. However, I can import and run AdaBelief when using the other library.
I prefer tensorflow-addons over adabelief_tf, because tensorflow-addons is well maintained and kept up to date, while adabelief_tf has not been updated since Dec 2020. How about you open a PR for the modified tensorflow-addons-based code? I will look at the code there and figure out the issue.
I've made the pull requests. I have modified the optimizers.py code for both the TFv1 and TFv2 backends.
My general suggestion is that you start with a similar but simpler problem (smaller domain, shorter time, low-frequency solution, etc.); after you have solved the simple one, then move to the current one.
@lululxvi … is available? And should c be 0.1 or 10?
See FAQ.
Hi @Ryszard2 :),
Hello Riccardo! @Ryszard2
Hello Lu Lu,
DeepXDE is great and I am trying to use it to deal with a complex dam-break problem: the 2D shallow water system I wrote is a purely hyperbolic problem, without any sort of Laplacian contribution or viscosity.
I set up the training points only with the anchors, so that I can control the interval in every dimension; that results in 150,801 training points (a sketch of one way to build such anchors follows).
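A sketch of one way to build such a uniform anchor grid (the original code is not shown here; note that, for instance, a 501 × 301 grid in x and t gives exactly 150,801 points):

```python
import numpy as np

X_min, X_max, Time = 0.0, 10.0, 6.0  # domain bounds as in the code above

# Uniform anchor points over the x-t domain (hypothetical resolution).
x = np.linspace(X_min, X_max, 501)
t = np.linspace(0.0, Time, 301)
X, T = np.meshgrid(x, t)
TOT_an = np.vstack((X.ravel(), T.ravel())).T  # shape (150801, 2)
```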
The NN is a 2 + [40]*8 + 2 architecture, trained with Adam for 10,000 epochs. Then it takes only 13 steps with L-BFGS-B and I get the message:

```
INFO:tensorflow:Optimization terminated with:
Message: b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
Objective function value: 0.000000
Number of iterations: 11
Number of functions evaluations: 13
```
The losses at step 10013 are, more precisely:

```
[1.03e-08, 4.94e-10, 4.20e-07, 1.98e-09, 9.10e-10, 2.34e-09, 9.91e-10]
```
Even though I used a lot of training points and a very deep NN, obtaining a solution that fits the analytic one seems pretty much out of reach for me so far.
I tried to add a Laplacian with a little bit of viscosity in the momentum equation, but I couldn't get a better result.
I noticed that applying a hard constraint for the IC of a Riemann problem does not work, because the initial discontinuity does not move from its initial position. Essentially, I can't get the Riemann problem to evolve if I hard-constrain the IC.
What is wrong with my approach? What should be tweaked?
Thank you,
Riccardo