How to use LM algorithm in a custom train loop with custom loss function? #16

mrleiyz · 2023-01-10T11:02:34Z

Dear Sir or Madam,
Thanks for your endeavor to develop such code.
And, I try to use it in a custom training loop, and the loss function is also custom (except for y_true, and y_pred, it has other input variables). So, what should I do to solve this problem by modifying your code?

Hope for your response ASAP.
Thanks a lot . @fabiodimarco

fabiodimarco · 2023-01-10T13:41:22Z

Hi,

to define a custom training loop you can create an instance of the Trainer class defined al line 273. Inside your training loop you can then call the member function train_step defined at line 593.

To use a custom loss function you can give a look at how I defined some of them. As an example you can look at: MeanSquaredError line 28 or CategoricalCrossentropy line 80. There are more in the same file that you can look at.
So the new loss function you define needs to have a call method that implements your loss and a residual which is used inside the LM optimiser.

The residual has to be defined so that:
loss = mean(residual^2)

It does not need to be literally like that, you can implement a more stable and computationally efficient expression as long as the final results is the one above.

Let me know if it helps.

mrleiyz · 2023-01-10T14:09:36Z

Thank you for your answer!! It is very helpful to me. To be honest, I am a beginner in the python language.
The situation is I have written a custom training loop according to the tensorflow guide,

Then I want to custom the loss function, namely, the input variables of the loss function are not only Y_pred and Y_true, there are some fixed constants(not need to be differentiated) that are used to calculate the corresponding loss.

My problem is, in my opinion(maybe is wrong),

we cannot pass two or more loss functions (including MSE loss and custom loss) to model.compile and/or model_wrapper.compile
according to your code, we can only pass the train_dataset to model.fit and/or model_wrapper.fit, but how to pass other parameters to it? just like fixed constants?

Because, in my original code, I need to pass X, Y_true, and fixed constants to the training loop, then two losses will be obtained, one is the MSE loss based on Y_true and Y_pred, another is the custom loss based on Y_pred and fixed constants, then the gradient descend will be conducted to get the gradient, finally, Adam is used to minimizing the loss. My goal is to replace the Adam optimizer with your LM optimizer so that I can obtain a much better result (You know the common gradient descent method cannot achieve the global optima). The idea of my code is shown as follows:

def train_step(x,y,w1):
    with tf.GradientTape() as tape:
        y_pred = model(x)
        loss1 = mse(y_pred , y)
        loss2 = custom_loss(y_pred , w1)  **# Note: w1 is a fixed constant**
        loss_tol= loss1+loss2
    gradients = tape.gradient(loss_tol, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))  **# Here, it needs to be replaced with LM optimizer**
    return loss_tol

So, I do hope you can give me some suggestions to realize this goal. Thank you again!!

fabiodimarco · 2023-01-10T14:23:31Z

Since w1 is a fixed constant you do not need to pass it as a parameter of the function, but you can save it as a member variable of your custom loss class.

class CustomLoss(tf.keras.losses.Loss):
    def __init__(self,
                 w1,
                 reduction=tf.keras.losses.Reduction.AUTO,
                 name='custom_loss'):
        super(CustomLoss, self).__init__(
            reduction=reduction,
            name=name)
        self.w1 = w1

    def call(self, y_true, y_pred):
        y_pred = tf.convert_to_tensor(y_pred)
        y_true = tf.cast(y_true, y_pred.dtype)

        loss1 = mse(y_pred , y_true)
        loss2 = custom_loss(y_pred , self.w1)
        loss_tol = loss1 + loss2

        return loss_tol

    def residuals(self, y_true, y_pred):
        y_pred = tf.convert_to_tensor(y_pred)
        y_true = tf.cast(y_true, y_pred.dtype)
        
        # you have to write here the code according to how you custom loss is defined
        # so that: loss = mean(residuals^2)

mrleiyz · 2023-01-10T14:28:12Z

Thank you for your patience indication!! It is very helpful!

I will try to realize it.

fabiodimarco · 2023-01-10T15:02:04Z

No problem. Let me know how it goes.

mrleiyz · 2023-01-11T06:16:48Z

Hi, Fabio,
Following your suggestions, today I try to realize it, I have mastered how to define the custom loss function and use the Class Trainer, but I find a new problem. Yesterday, in my problem statements, I forgot there is an input variable in my custom loss function, namely, the custom loss not only relies on the Y_true, Y_pred, and some fixed constants but also needs the input variable X_train.

So, I think the method of model.compile / model_wrapper.compile cannot achieve my goal, because such method seems to determine the loss before the model training, but my goal is to introduce X_train to calculate the custom loss during the training process, meanwhile, the loss using in compile method inherits from tf.keras.losses which only accept output variables (Y_true and Y_pred).

What about your advice?
Hope for your response ASAP.
Thanks a lot !!~~

fabiodimarco · 2023-01-11T09:08:21Z

A simple workaround is to include X_train in your Y data. So that Y_true is something like tf.stack([X_train, Y_train], axis=-1) or similar. I do not know the dimensionality of your data.

BTW: I do not think you need to use a custom train loop. I think it is easier for you to just use the ModelWrapper.

mrleiyz · 2023-01-11T12:30:39Z

Uhmm..., in my task, the data dimensions of X_train, and Y_train are not the same, and at first, I need to normalize the input and label, then during the training process, namely the calculation of custom loss, I need to inverse-normalize the input, label and predicted label. So the operation could be complex. Maybe I need to study the ModelWrapper carefully.

Anyway, thanks for your advice, I will try it, I hope I can bring good news to you.

fabiodimarco · 2023-01-11T14:21:24Z

If X and Y have a different dimension then you can have Y_train to be a list or a tuple of tensors (X, Y).

In order to use the ModelWrapper, I would consider trying to place all the extra operation that you need to do during training inside the CustomLoss or inside the Model itself.
In that way you can use all the callback provided by keras to save chekpoints, logging etc.

mrleiyz · 2023-01-12T09:56:58Z

Hi, Fabio, I am trying to realize the code. By the way, there are two questions I want to discuss with you.
The first one is, except to include X_train in Y data (as you mentioned above), how to pass other variables to model.fit ? it seems only accept two parameters, X and Y.

The second one is, I know the LM algorithm is used in the neural network toolbox of MATLAB in the early stage, in those years, we don't develop the technique of deep learning or deep neural network. Nowadays, we usually adopt deep networks, by google search, I know some people say the LM algorithm is useful for the shallow network with few neurons, but for the deep network with numerous trainable parameters, the algorithm has poor performance. I also see your test example is a simple nonlinear fitting, so, are you testing this algorithm with a more complex situation, such as image recognition or natural language process?

I also find an issue about using the algorithm in PINN, which is an interesting topic, but some physical problems or engineering problems could need a deep network, for such a situation, whether LM algorithm can get a better result than gradient descent?

Thanks.

fabiodimarco · 2023-01-12T13:28:53Z

No model.fit only accept X and Y, but since your loss is custom you can include into Y all the data you want even a dictionary so that you can access all the data you need to compute the loss (that is the way to do it).
The LM algorithm is mainly used to train shallow network as they have a small number of parameters.
The problem with LM is that the computational complexity is N^3 with respect to the number of parameters or the batch size depending on which one of the two is larger. You can find the details about it in the Memory, Speed and Convergence considerations section in the readme of the my repo.
You can tradeoff the complexity of LM by reducing the batch size, but the more you reduce it the more you are going towards a simple stochastic gradient descent and hence losing the advantages of LM.
So even if LM has better per step convergence rate (and sometimes can also converge to better training losses), the computational speed of computing each step for large models is the main reason gradient descent is still the standard for training large neural networks.

fabiodimarco closed this as completed Jan 26, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to use LM algorithm in a custom train loop with custom loss function? #16

How to use LM algorithm in a custom train loop with custom loss function? #16

mrleiyz commented Jan 10, 2023 •

edited

Loading

fabiodimarco commented Jan 10, 2023

mrleiyz commented Jan 10, 2023 •

edited

Loading

fabiodimarco commented Jan 10, 2023 •

edited

Loading

mrleiyz commented Jan 10, 2023

fabiodimarco commented Jan 10, 2023

mrleiyz commented Jan 11, 2023

fabiodimarco commented Jan 11, 2023 •

edited

Loading

mrleiyz commented Jan 11, 2023 •

edited

Loading

fabiodimarco commented Jan 11, 2023

mrleiyz commented Jan 12, 2023

fabiodimarco commented Jan 12, 2023

How to use LM algorithm in a custom train loop with custom loss function? #16

How to use LM algorithm in a custom train loop with custom loss function? #16

Comments

mrleiyz commented Jan 10, 2023 • edited Loading

fabiodimarco commented Jan 10, 2023

mrleiyz commented Jan 10, 2023 • edited Loading

fabiodimarco commented Jan 10, 2023 • edited Loading

mrleiyz commented Jan 10, 2023

fabiodimarco commented Jan 10, 2023

mrleiyz commented Jan 11, 2023

fabiodimarco commented Jan 11, 2023 • edited Loading

mrleiyz commented Jan 11, 2023 • edited Loading

fabiodimarco commented Jan 11, 2023

mrleiyz commented Jan 12, 2023

fabiodimarco commented Jan 12, 2023

mrleiyz commented Jan 10, 2023 •

edited

Loading

mrleiyz commented Jan 10, 2023 •

edited

Loading

fabiodimarco commented Jan 10, 2023 •

edited

Loading

fabiodimarco commented Jan 11, 2023 •

edited

Loading

mrleiyz commented Jan 11, 2023 •

edited

Loading