
Implement regularization #28

Closed
PetrToman opened this issue Jan 21, 2012 · 15 comments

@PetrToman

Hello,
please consider implementing regularization, as it is essential to deal with the overfitting problem.

I recommend watching the 12-minute video "Regularization and Bias/Variance" from lesson X (Advice for Applying Machine Learning) of the Stanford ML course, at https://class.coursera.org/ml/lecture/preview.

It would also be useful to enhance Encog Analyst - it could split data into 3 sets (training, cross validation, testing) and try to find the optimal regularization parameter automatically.
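For reference, L2 regularization ("weight decay") adds a penalty proportional to the squared weights to the cost function, which turns into an extra lambda * w term in each gradient step. A minimal, framework-free sketch in Java (class and method names here are illustrative, not Encog API):

```java
// Illustration only: one gradient-descent step with L2 regularization,
// applied to a plain weight array, independent of any Encog classes.
public class WeightDecayDemo {

    // w <- w - learningRate * (gradient + lambda * w)
    public static double[] step(double[] weights, double[] gradients,
                                double learningRate, double lambda) {
        double[] updated = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            updated[i] = weights[i]
                    - learningRate * (gradients[i] + lambda * weights[i]);
        }
        return updated;
    }

    public static void main(String[] args) {
        double[] next = step(new double[] {1.0, -2.0},
                             new double[] {0.5, 0.5}, 0.1, 0.01);
        System.out.println(next[0] + " " + next[1]);
    }
}
```

The lambda * weights[i] term is what pulls large weights toward zero; with lambda = 0 this reduces to ordinary gradient descent.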

@seemasingh
Contributor

I will take a look at this to see whether we will include it in Encog 3.1 or 3.2. I am in the process of finalizing features for 3.1, as we want to release it soon and move to a code freeze. Definitely an important feature, though. Encog currently has two methods to combat overfitting: cross-validation and early stopping (new for 3.1). More info here, though these wiki pages are in need of expansion.

http://www.heatonresearch.com/wiki/Overfitting

@PetrToman
Author

Good! Early stopping may be useful too, but regularization should be more powerful. A basic implementation (leaving the Workbench aside) shouldn't be much of a problem, as the regularization term is applied after the gradients are computed.

@ghost

ghost commented Mar 21, 2012

I wrote a piece of Java code for the regularization: I implemented it as a Strategy.
I have only tested it with ResilientPropagation.
Feel free to use it and make remarks.

public class RegularizationStrategy implements Strategy {

    private double lambda; // Weight decay
    private MLTrain train;
    private double[] weights;

    public RegularizationStrategy(double lambda) {
        this.lambda = lambda;
    }

    @Override
    public void init(MLTrain train) {
        this.train = train;
    }

    @Override
    public void preIteration() {
        try {
            weights = ((Propagation) train).getFlatTraining()
                    .getNetwork().getWeights();
        } catch (Exception e) {
            weights = null;
        }
    }

    @Override
    public void postIteration() {
        if (weights != null) {
            double[] newWeights = ((Propagation) train).getFlatTraining()
                    .getNetwork().getWeights();
            for (int i = 0; i < newWeights.length; i++) {
                newWeights[i] -= lambda * weights[i];
            }
            ((Propagation) train).getFlatTraining()
                    .getNetwork().setWeights(newWeights);
        } else {
            System.err.println("Error in RegularizationStrategy, weights are null but should not be.");
        }
    }

}

@PetrToman
Author

poussevinm: I like the idea of implementing it as a Strategy. As for the regularization, I think the old values are not needed, so if I'm not mistaken (I haven't tested it), the above code can be simplified to:

public void postIteration() {
    double[] weights = ((Propagation) train).getFlatTraining()
                       .getNetwork().getWeights();

    for (int i = 0; i < weights.length; i++) {
        weights[i] += lambda * weights[i];   // also using +
    }
}

In Encog 3.1 the weights are copied to the GradientWorkers before postIteration() is called (see Propagation.iteration()), so I guess this code wouldn't work. I suggest introducing a new Strategy method, something like public void postGradient(), to resolve this.

@ghost

ghost commented Mar 22, 2012

My idea was that regularization adds a term to the cost function, and since the gradient is linear, you can apply the influence of regularization in a second step.
So I took the initial weights before they were modified by the part of the gradient computed from the training examples, and let that part of the gradient do its work.
Once it was done, I simply added the gradient of the regularization term.

This is why I needed the initial weights. It also means that the code does not depend on the way you compute the gradient on the training examples.
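The two-step idea can be sketched on plain arrays, outside any Encog types (the data-driven step below is a stand-in for whatever RPROP computes, and all names are illustrative):

```java
// Illustration of the two-step regularization idea: snapshot the weights,
// let the data-driven update run, then subtract lambda times the snapshot
// (the gradient of the L2 term). No Encog classes are used.
public class TwoStepDecay {

    public static double[] update(double[] weights, double[] dataStep,
                                  double lambda) {
        double[] snapshot = weights.clone(); // must be a copy, not an alias
        double[] result = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            // 1) apply the data-driven update, 2) apply the L2 term,
            // computed from the pre-update snapshot
            result[i] = weights[i] + dataStep[i] - lambda * snapshot[i];
        }
        return result;
    }
}
```

The key detail, as the discussion below shows, is that the snapshot must actually be a copy of the weights, not a second reference to the same array.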

@PetrToman
Author

Well, the problem is that weights == newWeights in postIteration(): the array is not cloned, but assigned by reference (try printing out the values).
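A minimal demonstration of this aliasing issue: in Java, assigning an array copies the reference, not the contents, so a "snapshot" taken this way sees every later modification.

```java
// Assigning an array in Java copies the reference, not the contents,
// so the "old" values are lost unless the array is cloned.
public class ArrayAliasingDemo {
    public static void main(String[] args) {
        double[] a = {1.0, 2.0};
        double[] b = a;           // b is an alias of a, not a snapshot
        a[0] = 9.0;
        System.out.println(b[0]); // 9.0 -- the "old" weights changed too

        double[] c = a.clone();   // clone() makes an independent copy
        a[1] = 5.0;
        System.out.println(c[1]); // 2.0 -- the snapshot is preserved
    }
}
```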

@jeffheaton
Owner

Thanks for the contributed code, I will take a look.

@ghost

ghost commented Mar 27, 2012

I see your point, Petr. This is why I used the setWeights(double[]) method in my postIteration() method:
((Propagation) train).getFlatTraining().getNetwork().setWeights(newWeights);

Thanks for your attention to my code.
Do you want me to comment/document it?

@PetrToman
Author

My point was that weights doesn't actually keep the old values. Take a look at Jeff's code (1aa783d); I think that is the way you meant to implement it.

@ghost

ghost commented Mar 27, 2012

Ok, my bad. I see my mistake now.
Thanks.

@jeffheaton
Owner

Okay, I implemented this, with the code fix, in Encog 3.2. I have not played with it much yet. I also added issues #96 and #97 to make this easy to use in the Workbench.

@thomasj02

Actually, I think this code is still incorrect: you don't want to regularize the weights from bias inputs. It's not clear to me, though, how to tell whether a weight comes from a bias input when you have the flat representation.
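Once a weight-to-bias mapping exists, the fix itself is simple. A hypothetical sketch, assuming a boolean mask isBias[] has already been derived from the network's layer structure (how to derive it from Encog's flat weight array is exactly the open question here, and is not shown):

```java
// Hypothetical sketch: weight decay that skips bias weights.
// The isBias[] mask is assumed to be built elsewhere from the layer
// structure; mapping flat weight indices to bias connections is the
// unresolved part of this issue.
public class SelectiveDecay {

    public static void decayNonBias(double[] weights, boolean[] isBias,
                                    double lambda) {
        for (int i = 0; i < weights.length; i++) {
            if (!isBias[i]) {
                weights[i] -= lambda * weights[i]; // decay real weights only
            }
        }
    }
}
```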

@joetanto

joetanto commented Jun 8, 2015

I agree with @thomasj02: the bias terms must not be included when regularizing. I'd appreciate it if someone could fix that. Thank you.

@vincenzodentamaro

Well, allowing large biases gives our networks more flexibility in behaviour: large biases make it easier for neurons to saturate, which is sometimes desirable. So regularizing the biases is not necessary.

@jeffheaton
Owner

Since this was submitted, Encog has added dropout, L1 and L2.
