SGD/small-batch #320

Closed
Ryszard2 opened this issue Jun 24, 2021 · 2 comments

Ryszard2 commented Jun 24, 2021

Hello @lululxvi and @smao-astro

I can’t figure out whether SGD / mini-batch training is actually implemented and available in DeepXDE, without any adjustment to the source code by the user.

I’m trying to train a [3] + [128]*4 + [3] NN on a 2D Riemann problem, where I would like to use some tens of thousands of training points. Doing it with full-batch gradient descent would take about one hour for a single block of 1000 epochs (the most powerful hardware available to me is the Google Colab GPU, whatever GPU that is).

I tried a small-batch-like approach by myself, without touching the source code, partly because I’m not sure what I would need to change.
It goes more or less like this:

  1. I train on a small batch of domain/BC/IC points, say 200/100/100, for about 10000 epochs, saving the model with save_better_only.

  2. I restart the training, restoring the saved model, on a brand-new set of 200/100/100 points for another 10000 epochs.

And so on...
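
In code, one round looks roughly like the sketch below (just an illustration: `pde`, `geomtime`, `bc`, `ic` are placeholders for my problem definition, and I’m not sure this is the intended way to use the API):

```python
import deepxde as dde

# Placeholder problem definition: `pde`, `geomtime`, `bc`, `ic` are assumed
# to be defined elsewhere for the 2D Riemann problem.
net = dde.maps.FNN([3] + [128] * 4 + [3], "tanh", "Glorot normal")

# One "round": a fresh random batch of 200/100/100 domain/BC/IC points.
data = dde.data.TimePDE(
    geomtime, pde, [bc, ic],
    num_domain=200, num_boundary=100, num_initial=100,
)
model = dde.Model(data, net)
model.compile("adam", lr=1e-3)

# Save the model whenever the training loss improves.
checker = dde.callbacks.ModelCheckpoint(
    "model/model.ckpt", save_better_only=True, period=1000
)

# First round: train from scratch. On later rounds I rebuild `data` with new
# points and pass model_restore_path pointing at the checkpoint saved in the
# previous round, so training resumes from the stored weights.
model.train(epochs=10000, callbacks=[checker])
```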

One thing I don’t like about this approach is that I would like to save only genuinely improved models, but every time I restart the training, after the first 1000 epochs of the new run I get the message that the train loss improved from inf to the new loss, say:

Epoch 1000: train loss improved from inf to 8.91e-01, saving model to ...

where 8.91e-01 is way higher than the loss I had reached at the end of the previous 10000-epoch run. I feel like I’m mishandling the hyperparameters every time I restart the training.

Another approach I’m testing, inspired by #305, is:

  1. I train on a medium-size set of IC points, say 5000 points for 10000 epochs, setting the loss weights and the numbers of points for the PDE and BC terms to zero. I save the model with save_better_only.
  2. I restart the training, restoring the saved model, on a brand-new set of 0/0/5000 points for another 10000 epochs.

And so on...

  1. When I’m satisfied with the results on the IC, I take on the BC part, always keeping some points for the IC, say 0/5000/100.
  2. When I’m satisfied with the results on the BC, I take on the PDE part, always keeping some points for the BC and IC, say 5000/100/100.
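
A sketch of one stage, reusing `net` and `checker` from the sketch above and assuming the loss terms can be switched off with the loss_weights argument of model.compile (I also assume `pde` returns a single residual; for a system there would be one weight per equation), with the loss ordering [PDE, BC, IC]:

```python
# Stage 1: fit the IC only, by zeroing the PDE and BC loss weights.
# (In my runs I also shrink the PDE/BC point counts, e.g. 0/0/5000; the
# sketch keeps a few so those loss terms stay defined.)
data = dde.data.TimePDE(
    geomtime, pde, [bc, ic],
    num_domain=100, num_boundary=100, num_initial=5000,
)
model = dde.Model(data, net)
model.compile("adam", lr=1e-3, loss_weights=[0, 0, 1])
model.train(epochs=10000, callbacks=[checker])

# Stage 2: turn the BC term back on, keeping some IC points (0/5000/100),
# e.g. loss_weights=[0, 1, 1].
# Stage 3: turn the PDE term back on as well (5000/100/100),
# e.g. loss_weights=[1, 1, 1].
```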

That’s what I’m thinking about.
I’m not sure it is a sound way to train a NN.

I’m still in the middle of it; it still “improves” the train loss from inf to the new loss every time it restarts, but it seems to work better than the first approach.

Thank you
Riccardo

lululxvi (Owner) commented Jun 28, 2021

Yes, SGD/mini-batch is available in DeepXDE. You can use dde.callbacks.PDEResidualResampler(period=100); see the example https://github.com/lululxvi/deepxde/blob/master/examples/diffusion_1d_resample.py. It resamples a new batch of training points every 100 iterations.
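
A minimal usage sketch along the lines of that example (the data, net, and compiled model are assumed to be set up already):

```python
# Resample the training points every 100 iterations, so each block of 100
# iterations trains on a fresh batch of points.
resampler = dde.callbacks.PDEResidualResampler(period=100)
model.train(epochs=100000, callbacks=[resampler])
```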

Ryszard2 (Author) commented Jun 28, 2021

Thank you @lululxvi, I'm running it!
