
Big Rescaling + O(0.001) IC + Complicated-Shape Domain #345

Closed
Ryszard2 opened this issue Jul 23, 2021 · 18 comments

@Ryszard2 commented Jul 23, 2021

Good morning Dr. @lululxvi,

I'm stuck on a problem I already asked about some time ago. It's a Riemann problem for the shallow water equations, and the domain is complicated and big:

  • Lmax = 16,500 m
  • T = 2,000 s

This is the original IC for the water height:

[Image: IC Original]

I rescaled the problem heavily in a way that keeps the equations unchanged. I did it this way because otherwise coefficients very far from unity appear in the equations.
Now:

  • Lmax = 1 m
  • T = 15.57 s
  • IC unfortunately becomes O(0.001)

[Image: IC Scaled]
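
For reference, a minimal sketch of a classical form-invariant SWE scaling that reproduces these numbers (shown only for orientation; this recipe is an assumption, not necessarily the exact one used): x* = x/L0, t* = t/T0, u* = u T0/L0, h* = h g T0^2/L0^2.

g  = 9.81            # m/s^2
L0 = 16500.0         # length scale (Lmax)
T0 = 2000.0 / 15.57  # time scale implied by T -> 15.57 s, about 128.5 s
U0 = L0 / T0         # velocity scale, about 128.5 m/s
H0 = U0**2 / g       # height scale, about 1.7e3 m
print(2.0 / H0)      # a hypothetical 2 m water height becomes ~1e-3, i.e. O(0.001)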

The problems are that:

  • I can't get the training to converge at all, essentially because the loss for the height IC won't come down
  • I can't hard-constrain the ICs or the BCs, because doing so produces NaNs
  • After a lot of trial and error, I concluded that I need many training points, which small-batch resampling can provide, but the height IC still keeps the loss fairly high
  • The input data all come in the form of matrices, which really seems to be a problem for learning the IC

The code seems correct to me; I can't figure out what I'm missing to get the job done.

Thank you
Riccardo

@lululxvi (Owner)

As you discovered, the first issue could be the training of the IC. Could you try training with only the IC loss, to see whether the IC can be learned well? You can set the PDE to None, or set the weight of the PDE loss to 0.
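
A minimal sketch of the loss_weights route (assuming the usual Model.compile/train API; model comes from the rest of the script, and the number and order of the loss components must match your own PDE/IC/BC setup):

# Zero out every loss component except the troublesome IC term.
# Eight components are assumed here, as in this problem; adjust to your setup.
model.compile("adam", lr=0.001, loss_weights=[0, 0, 0, 1, 0, 0, 0, 0])
model.train(epochs=200000)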

@Ryszard2 (Author)

Thank you @lululxvi

Say I run a couple hundred thousand epochs to learn the one IC that bothers me (there are 3 ICs, but the other 2 are not a big deal), and I end up with a train loss like this:

[0.00e+00, 0.00e+00, 0.00e+00, 4.19e-07, 0.00e+00, 0.00e+00, 0.00e+00, 0.00e+00]

as a result of this loss weights setup:

model.compile('adam', lr=0.001, loss_weights=[0, 0, 0, 1e2, 0, 0, 0, 0])

How do I proceed in the next training phase?
Am I supposed to set the remaining weights so that every component of the loss is O(1e-7) to match the loss I've achieved with the IC component?

By the way, I settled on a [50]*4 NN; I can't afford a bigger one :-(

Thank you
Riccardo

@Ryszard2 (Author)

Another question, @lululxvi

is it reasonable to sample 1,000 training points, train for about 20,000 epochs, then sample 1,000 new training points for another 20,000 epochs, and continue this way up to a few hundred thousand epochs?

@lululxvi (Owner)

You can use the two-stage training in this paper: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007575 with the code https://github.com/alirezayazdani1/SBINNs.

Yes, resampling makes sense. Here is an example: https://github.com/lululxvi/deepxde/blob/master/examples/diffusion_1d_resample.py using dde.callbacks.PDEResidualResampler
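
A minimal sketch of wiring the resampler into training (assuming the usual Model API; model comes from the rest of the script, and the linked example shows the full setup):

# Redraw the training points every 100 iterations during training.
resampler = dde.callbacks.PDEResidualResampler(period=100)
model.compile("adam", lr=0.001)
losshistory, train_state = model.train(epochs=100000, callbacks=[resampler])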

@Ryszard2 (Author)

Thank you @lululxvi

Do the numbers in diffusion_1d_resample.py

num_domain   = 40,
num_boundary = 20,
num_initial  = 10,

resampler = dde.callbacks.PDEResidualResampler(period=100)

make sense also for the small 2D domain I'm working on? Or should the numbers of points be increased, keeping the resampling period fixed at 100?

If this works, it would completely bypass, in a situation with limited computational power, the annoying trade-off between the NN size and the number of points.

@lululxvi (Owner) commented Sep 1, 2021

The resampling period seems OK. You may increase the other three numbers by a factor of 10. It is always good to have more points if the computational cost is not a problem.

@Ryszard2 (Author) commented Sep 1, 2021

Thank you @lululxvi

First of all, I rescaled the IC completely, like this:

[Image: index]

But learning it is still a challenge I have never faced before. So far I have done it without small-batch resampling, only by setting the non-BC loss weights to zero. After 600k epochs, some bothersome non-physical negative water remains, stuck in the narrow coves near the border:

[Image: ic1]

Now I'm going all-in for another few hundred thousand epochs, starting from the good-but-not-great model I already obtained after 600k epochs, training on the IC/BC only with this setting:

num_domain   = 0,
num_boundary = 400,
num_initial  = 400,

resampler = dde.callbacks.PDEResidualResampler(period=100)

hoping for faster learning of this absurd IC :-(
I prefer not to move on to learning the PDE outputs until the IC is properly learned.

Edit: It didn't work out; this setting made the learning of the IC much worse :-(

Re-Edit: I noticed that setting the training like this:

num_domain   = 0,
num_boundary = 1000,
num_initial  = 4000,

resampler = dde.callbacks.PDEResidualResampler(period=100)

works a little better. It's not as good as full gradient descent with 10,000 IC points, but it seems that sampling a huge number of IC points every 100 epochs could, for this specific case, lead to something good. I can't figure out why that is.

@lululxvi (Owner) commented Sep 5, 2021

To clarify, you currently only learn the IC and BC, right? Does the full gradient descent with 10,000 IC points work?

@Ryszard2 (Author) commented Sep 8, 2021

@lululxvi I made a simple test and realized that the share of sampled points that goes to the complex, non-homogeneous little sector is less than 10% of the points generated, i.e. 90% of the training points are placed in the easy, homogeneous big sector!

This is a pretty big waste, so I decided to anchor the points by myself with the random_points generator:

[Image: ic]

With this number and distribution of training points, and with the [256]*5 NN I'm finally using, full gradient descent makes total sense, and I'm sure the IC will be learned perfectly.
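
A rough sketch of this kind of manual anchoring (the geometry, the narrow-sector test, and the point counts below are placeholders, not the actual setup):

import numpy as np
import deepxde as dde

# Stand-in geometry for the sketch; the real problem uses a complicated polygon.
geom = dde.geometry.Rectangle([0, 0], [1, 1])

def in_narrow_sector(xy):
    # Hypothetical test for the narrow sector; replace with your own region check.
    return (xy[:, 0] < 0.2) & (xy[:, 1] > 0.6)

# Oversample the domain, then keep a disproportionate share of narrow-sector points.
cand = geom.random_points(100000)
narrow = cand[in_narrow_sector(cand)]
wide = cand[~in_narrow_sector(cand)][:5000]
ic_xy = np.vstack([narrow, wide])
# Place the points on the initial slice t = 0; they can then be passed to
# dde.data.TimePDE via the anchors= argument.
ic_anchors = np.hstack([ic_xy, np.zeros((len(ic_xy), 1))])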

Still, the spatio-temporal problem of the PDE and BC remains.
The highly irregular distribution of the IC forces me to choose the training points; I'll anchor the points myself, but I can't run full gradient descent with that many points, so I'd ask you:

  • say I anchor a zillion points in the spatio-temporal domain: would it be possible for DeepXDE to split that array into small batches? I would like to:

    1. create the giant array of training points for the IC, BC, and domain
    2. ask DeepXDE to take 1,000 points, train on them for 100 epochs, then take the next 1,000, and so on until the end of the array.

Is that possible?
Maybe every small batch should include points from every component (PDE, BC, and IC)?

Honestly, I can't see another way to tackle this problem.

@lululxvi (Owner)

It is great to see that you made some good progress.

If the training points are sampled by yourself, then DeepXDE won't split them into batches; currently DeepXDE only splits the points that it samples itself. Here are a few thoughts:

  • Actually, I think you may not need a zillion points. Usually a relatively dense, fixed set of points works well enough, and there is no need to use lots of batches of different points.
  • If you still want to use batches, then you may have to modify the function:
    def resample_train_points(self):

    to define yourself how the points are resampled each time.
    Then you can use it together with dde.callbacks.PDEResidualResampler.
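
A rough sketch of that idea (the attribute and method names below, e.g. train_x_all and train_next_batch, follow the DeepXDE source of that period; verify them against the version you run):

import deepxde as dde

class PoolResamplingPDE(dde.data.TimePDE):
    """Sketch: draw each resample from a precomputed pool of hand-picked points."""

    def __init__(self, *args, point_pool=None, batch_size=1000, **kwargs):
        self.point_pool = point_pool  # big (N, 3) array of hand-picked (x, y, t) points
        self.batch_size = batch_size
        self._offset = 0
        super().__init__(*args, **kwargs)

    def resample_train_points(self):
        start = self._offset
        self._offset = (self._offset + self.batch_size) % len(self.point_pool)
        # Replace the randomly drawn points with the next slice of the pool,
        # then rebuild the training arrays as the parent class does.
        self.train_x_all = self.point_pool[start:start + self.batch_size]
        self.train_x, self.train_y = self.train_next_batch()
        return self.train_x, self.train_y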

@Ryszard2 (Author) commented Sep 16, 2021

Indeed, @lululxvi, I was thinking about a kind of trade-off:

  • I sample the residual points myself, with the nonuniform distribution I want, ONLY for the IC (and maybe the BC too; I'll think about it)
  • I let DeepXDE do the sampling/resampling of the domain points

The idea is to make sure the complicated IC is learned well. There would be no strict need for my anchoring afterwards, since the hyperbolic nature of the problem moves the solution downstream, where the polygon widens and the training points are naturally more numerous.
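
A rough sketch of this trade-off (assuming the anchors= argument of dde.data.TimePDE and the resampler callback; geomtime, pde, the ICs/BC, net, and my_ic_anchors are placeholders from the rest of the script):

# Hand-picked IC points (t = 0) go in via anchors=, while DeepXDE keeps drawing
# and periodically redrawing its own domain/boundary points.
data = dde.data.TimePDE(
    geomtime, pde, [ic_h, ic_u, ic_v, bc],
    num_domain=10000, num_boundary=1000, num_initial=0,
    anchors=my_ic_anchors,
)
model = dde.Model(data, net)
model.compile("adam", lr=0.001)
resampler = dde.callbacks.PDEResidualResampler(period=100)
model.train(epochs=200000, callbacks=[resampler])

The anchors should be preserved across resamples, since the data object appends them to the freshly drawn random points; this is worth double-checking against your DeepXDE version.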

@Rdfing commented Sep 18, 2021

@Ryszard2 I am very interested in this project. Could you please share a link to your paper once you get DeepXDE working on the 2D SWE? Thanks, Haochen

@Ryszard2 (Author)

@Rdfing actually this is a test case for my MSc thesis, a considerably challenging case.

@Rdfing commented Sep 24, 2021

Wow, that is impressive! I have gotten the SWE to work with PINNs for some simple 2D benchmark cases, but I have not tried this benchmark.

Ryszard2 closed this as completed Nov 7, 2021
@FZUcipher

(quoting @lululxvi's earlier reply about training with only the IC loss)

@lululxvi Excuse me, Lulu. I would like to ask whether to set the weights of all components except the IC to 0 (including the BC and PDE), or just set the weight of the PDE to 0?

@lululxvi (Owner) commented Jun 3, 2022

@FZUcipher See the FAQ entry "Q: I failed to train the network or get the right solution, e.g., large training loss, unbalanced losses."

@thevelvetunderground

(quoting @Ryszard2's comment of Sep 1, 2021 above)

Hello, I'd like to know: how do you visualize the training points randomly sampled in the geometry?

@gongsunlijiu

(quoting @Ryszard2's comment of Sep 8, 2021 above)

@Ryszard2 Hello, I saw your approach to the sampling points, which is very effective.
I am now thinking about the same problem and would like to ask how you customize the sampling points to achieve different sampling densities for the narrow and wide areas. Thank you very much.
