Job fails when loading previous model #3

pengguo-seismo · 2022-01-02T08:54:57Z

Hi Paul,

I hope you are doing well.

I have a question about running the Python script. It needs to load a previously trained model, './model/checkpoint500.pt'. Can you please tell me how to obtain this model, or how to define the weights/biases for initializing the network?

Many thanks in advance.

paulpuren · 2022-01-02T16:45:33Z

Hello,

Thank you for your interest in our research. Take the 2D Burgers equation as an example. Our goal is to solve the PDE for 1000 time steps. The procedure is as follows. First, initialize all the network parameters with the function initialize_weights, train the model for 100 time steps, and save the trained model as checkpoint100.pt. Second, load checkpoint100.pt as the initial network parameters, train for 200 time steps, and save the result as checkpoint200.pt. Repeating this with progressively longer horizons eventually reaches the milestone of 1000 time steps.
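The staged procedure above can be sketched as a small plan of which checkpoint each stage loads and which one it saves. This is an illustrative sketch only, not code from the repository: the helper name checkpoint_plan and the milestone list are assumptions (the thread mentions 100, 200, 500 and 1000 steps; any other intermediate milestones are not stated).

```python
from typing import List, Optional, Tuple


def checkpoint_plan(
    milestones: List[int], model_dir: str = "./model"
) -> List[Tuple[int, Optional[str], str]]:
    """Return (time_steps, load_path, save_path) for each training stage.

    checkpoint_plan is a hypothetical helper, not part of the repository.
    The first stage has load_path=None, meaning it starts from weights
    created by initialize_weights instead of a saved checkpoint.
    """
    plan = []
    previous = None
    for steps in milestones:
        save_path = f"{model_dir}/checkpoint{steps}.pt"
        plan.append((steps, previous, save_path))
        # The next stage starts from what this stage saved.
        previous = save_path
    return plan


# Illustrative milestones; adjust them to your own pretraining scheme.
plan = checkpoint_plan([100, 200, 500, 1000])
for steps, load_from, save_to in plan:
    source = load_from if load_from is not None else "initialize_weights"
    print(f"{steps:>5} steps: start from {source}, save to {save_to}")
```

Each stage's load path is the previous stage's save path, so only the first stage runs without a pre-existing checkpoint file.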

Hope that answers your question. Thank you!

norery · 2022-01-14T11:15:55Z

Thank you for your reply. I have the same problem. I noticed that the code makes no provision for multiple training rounds. For example, when I train for 100 time steps, what should I change? How do I set the value of 'pre_model_save_path ='?
Thank you in advance!

paulpuren · 2022-01-14T15:35:07Z

Thank you for your question. Yes, we only show the code for 1000 time steps. When training for the first 100 steps, you apply the function initialize_weights directly, and you do not need pre_model_save_path.

LiShenshen123 · 2022-03-08T08:41:21Z

Hello, how do I get the parameter pre_model_save_path? I am very confused and hope to get your help. Thank you very much.

paulpuren · 2022-03-08T15:09:27Z

Thank you for your question. pre_model_save_path is the path to the pretrained model. Take 2D Burgers as an example, and suppose you pretrain the model in stages of 100 steps, then 200 steps, then 500 steps. For the 1st pretraining, you do not have a pre_model_save_path; you train the model directly, with the network parameters initialized by the function initialize_weights. For the 2nd pretraining, you initialize the network parameters with the model learned in the 1st pretraining (this is where pre_model_save_path is used) and then train it further for 200 steps.

LiShenshen123 · 2022-03-09T01:14:55Z

For the first pre-training, how do I train without pre_model_save_path, directly using the network parameters initialized by the function initialize_weights?
It always reports an error: FileNotFoundError: [Errno 2] No such file or directory: './model/checkpoint500.pt'

paulpuren · 2022-03-09T21:29:51Z

The checkpoint500.pt here is the model saved after training for 500 time steps. The code we show trains for 1000 time steps starting from the model pretrained for 500 time steps, which is why pre_model_save_path points to checkpoint500.pt. When you first train for 100 time steps, you can name the saved model checkpoint100.

LiShenshen123 · 2022-03-10T01:39:01Z

I'm still confused, because I still can't run it successfully. From reading your code, the first training also seems to need a pretrained network weight. As for the network weight initialization you mentioned, I don't know how to implement it. I see that a pretrained model is loaded in the train function defined in your code. I'm lost. Could you send me working code that shows how to get the pretrained model in the first step? I really hope to get your help. My mailbox is 2858724272@qq.com. Thank you very much!

paulpuren · 2022-03-10T04:27:11Z

Hi, first, we have tested the code and it works well. The code posted in the repo does not have bugs. You may modify it for your own purpose (e.g., for different pretraining schemes or different PDE systems).

Second, for the first training, you do not need pretrained network parameters (e.g., weights). They are initialized by the function initialize_weights.

Third, the pretrained model is loaded only when an earlier pretraining stage has already been run. Namely, you will only need it after the 1st pretraining.
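One way to express this rule in a script is to make the checkpoint path optional and branch on it. The sketch below is an assumption about how a user might adapt their own copy, not the repository's actual code: prepare_start, init_fn and load_fn are hypothetical names. In a PyTorch script, init_fn might wrap model.apply(initialize_weights) and load_fn might wrap torch.load followed by model.load_state_dict.

```python
import os
from typing import Callable, Optional


def prepare_start(
    pre_model_save_path: Optional[str],
    init_fn: Callable[[], None],
    load_fn: Callable[[str], None],
) -> str:
    """Choose fresh initialization or checkpoint loading for one stage.

    Hypothetical helper: pass None for the first stage, and the path of
    the previously saved checkpoint for every later stage.
    """
    if pre_model_save_path is None:
        # First stage: nothing to load, so initialize from scratch.
        init_fn()
        return "initialized"
    if not os.path.isfile(pre_model_save_path):
        # Fail early with a hint about what is missing.
        raise FileNotFoundError(
            f"{pre_model_save_path} not found: train the previous stage "
            "first, or pass None to start from freshly initialized weights"
        )
    load_fn(pre_model_save_path)
    return "loaded"


# Usage with stand-in callables that only record what was called.
calls = []
first_stage = prepare_start(
    None,
    lambda: calls.append("init"),
    lambda path: calls.append(("load", path)),
)
print(first_stage, calls)
```

With a guard like this, the first 100-step run passes None and never touches the file system, while later stages pass the path saved by the stage before them.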

LiShenshen123 · 2022-03-10T05:18:36Z

Thank you for your reply. This is my first training run, and the error says that a pretrained model is required. Is there any special setup required for the first pretraining? Thank you.

richardliuss · 2022-10-24T15:17:48Z

Hi Dr. Ren,
Would you please give us a detailed tutorial that guides us through the first training, such as how to modify the code and what file structure is needed?
Please forgive my unfamiliarity with your code; my background is in computational fluid dynamics.
Thank you so much.
Richard

paulpuren closed this as completed Mar 10, 2022

paulpuren reopened this Mar 10, 2022
