How to use LM algorithm in a custom train loop with custom loss function? #16
Hi, to define a custom training loop you can create an instance of the `Trainer` class. To use a custom loss function, you can take a look at how I defined some of the losses in this repository, for example `MeanSquaredError`. The residuals have to be defined so that `loss(y_true, y_pred) == mean(residuals(y_true, y_pred) ** 2)`. It does not need to be literally like that; you can implement a more stable and computationally efficient expression, as long as the final result is the one above. Let me know if it helps.
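For reference, here is a minimal sketch of what such a residual-based loss can look like. The class and method names mirror the pattern described above but are assumptions, not necessarily the library's exact API:

```python
import tensorflow as tf

class MyMeanSquaredError:
    """Sketch of a residual-based loss in the style described above."""

    def residuals(self, y_true, y_pred):
        # Defined so that loss(y_true, y_pred) == mean(residuals ** 2).
        return y_true - y_pred

    def __call__(self, y_true, y_pred):
        return tf.reduce_mean(tf.square(self.residuals(y_true, y_pred)))
```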
Thank you for your answer!! It is very helpful to me. To be honest, I am a beginner in the Python language. I want to customize the loss function: its input variables are not only Y_pred and Y_true, but also some fixed constants (which do not need to be differentiated) that are used to calculate the corresponding loss.
In my original code, I pass X, Y_true and the fixed constants to the training loop, and two losses are obtained: one is the MSE loss based on Y_true and Y_pred, the other is the custom loss based on Y_pred and the fixed constants. Then the gradients are computed and Adam is used to minimize the total loss. My goal is to replace the Adam optimizer with your LM optimizer so that I can obtain a much better result (as you know, common gradient descent methods cannot guarantee the global optimum). The idea of my code is sketched below.
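A minimal sketch of such a loop, assuming the hypothetical names `model`, `custom_loss` and `w1` (none of these are from the library; they only stand in for the pieces described above):

```python
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam()
mse = tf.keras.losses.MeanSquaredError()

@tf.function
def train_step(x, y_true, w1):
    # `model` and `custom_loss` are hypothetical placeholders;
    # w1 holds the fixed constants and is not a trainable variable.
    with tf.GradientTape() as tape:
        y_pred = model(x, training=True)        # forward pass
        loss_mse = mse(y_true, y_pred)          # data-fit term
        loss_custom = custom_loss(y_pred, w1)   # constant-dependent term
        loss = loss_mse + loss_custom           # total loss
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```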
So I hope you can give me some suggestions on how to achieve this goal. Thank you again!!
Since w1 is a fixed constant, you do not need to pass it as a parameter of the function; you can save it as a member variable of your custom loss class.
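A sketch of that suggestion, with the combination of the MSE term and the w1-dependent term purely illustrative:

```python
import tensorflow as tf

class CombinedLoss:
    """Keeps the fixed constants as a member variable, as suggested above."""

    def __init__(self, w1):
        self.w1 = tf.constant(w1, dtype=tf.float32)  # fixed, not differentiated

    def residuals(self, y_true, y_pred):
        mse_res = y_true - y_pred       # standard MSE residual
        custom_res = self.w1 * y_pred   # hypothetical custom term
        # Concatenate so that mean(residuals ** 2) covers both terms.
        return tf.concat([mse_res, custom_res], axis=-1)

    def __call__(self, y_true, y_pred):
        return tf.reduce_mean(tf.square(self.residuals(y_true, y_pred)))
```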
Thank you for your patient explanation!! It is very helpful! I will try to implement it.
No problem. Let me know how it goes.
Hi Fabio, I think the model.compile / model_wrapper.compile approach cannot achieve my goal, because it seems to fix the loss before the model is trained, while my goal is to use X_train to calculate the custom loss during the training process. Moreover, the loss used in the compile method inherits from tf.keras.losses, which only accepts the output variables (Y_true and Y_pred). What is your advice?
A simple workaround is to include X_train inside Y_train, so that the loss function receives it as part of y_true. BTW: I do not think you need to use a custom training loop; I think it is easier for you to just use the ModelWrapper.
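Based on the repository README, a ModelWrapper setup looks roughly like the sketch below; the module name `levenberg_marquardt` and the exact compile arguments are assumptions drawn from the README example and should be double-checked, and the training data here is synthetic:

```python
import tensorflow as tf
import levenberg_marquardt as lm

# Synthetic data just to make the sketch self-contained.
x_train = tf.random.uniform((1000, 1))
y_train = tf.sin(10.0 * x_train)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(20, activation='tanh', input_shape=(1,)),
    tf.keras.layers.Dense(1),
])

model_wrapper = lm.ModelWrapper(model)
model_wrapper.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=1.0),
    loss=lm.MeanSquaredError())
model_wrapper.fit(x_train, y_train, epochs=100)
```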
Uhmm..., in my task the data dimensions of X_train and Y_train are not the same, and I first need to normalize the inputs and labels. Then, during the training process, namely the calculation of the custom loss, I need to inverse-normalize the input, the label and the predicted label, so the operation could be complex. Maybe I need to study the ModelWrapper carefully. Anyway, thanks for your advice; I will try it, and I hope I can bring good news to you.
If X and Y have different dimensions, then you can have Y_train be a list or a tuple of tensors (X, Y). In order to use the ModelWrapper, I would consider trying to place all the extra operations that you need to do during training inside the CustomLoss or inside the Model itself.
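A sketch of that tuple trick, assuming a loss class that unpacks the (X, Y) pair packed into y_true; the normalization statistics, the constant `w1` and the custom term are all illustrative placeholders:

```python
import tensorflow as tf

class CustomLossWithInputs:
    """Unpacks the (X, Y) tuple packed into y_true, as suggested above."""

    def __init__(self, y_mean, y_std, w1):
        # Normalization statistics and fixed constants kept as members.
        self.y_mean = y_mean
        self.y_std = y_std
        self.w1 = w1

    def residuals(self, y_true, y_pred):
        x, y = y_true                       # y_true was built as the tuple (X, Y)
        y = y * self.y_std + self.y_mean    # inverse-normalize the label
        y_pred = y_pred * self.y_std + self.y_mean
        mse_res = y - y_pred                # data-fit residual
        custom_res = self.w1 * y_pred       # placeholder for your x-dependent term
        return tf.concat([mse_res, custom_res], axis=-1)

    def __call__(self, y_true, y_pred):
        return tf.reduce_mean(tf.square(self.residuals(y_true, y_pred)))
```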
Hi Fabio, I am trying to implement the code. By the way, there are two questions I want to discuss with you. The second one is: I know the LM algorithm was used in the neural network toolbox of MATLAB in the early days, before the techniques of deep learning and deep neural networks were developed. Nowadays we usually adopt deep networks, and from a Google search I see some people say the LM algorithm is useful for shallow networks with few neurons but performs poorly for deep networks with numerous trainable parameters. I also see that your test example is a simple nonlinear fitting, so have you tested this algorithm in a more complex situation, such as image recognition or natural language processing? I also found an issue about using the algorithm in PINNs, which is an interesting topic, but some physical or engineering problems may need a deep network; in such a situation, can the LM algorithm get a better result than gradient descent? Thanks.
Dear Sir or Madam,
Thanks for your endeavor in developing this code.
I am trying to use it in a custom training loop, and the loss function is also custom (besides y_true and y_pred, it has other input variables). What should I do to solve this problem by modifying your code?
I hope for your response ASAP.
Thanks a lot. @fabiodimarco