Implement regularization #28
Hello,

please consider implementing regularization, as it is essential for dealing with the overfitting problem.

I recommend watching the 12 min. video "Regularization and Bias/Variance" from lesson X. ADVICE FOR APPLYING MACHINE LEARNING at https://class.coursera.org/ml/lecture/preview (Stanford ML course).

It would also be useful to enhance the Encog Analyst: it could split the data into three sets (training, cross-validation, testing) and try to find the optimal regularization parameter automatically.
Comments
I will take a look at this to see if we will include it in Encog 3.1 or 3.2. I am in the process of finalizing features for 3.1, as we want to release it soon and move to a code freeze. Definitely an important feature, though. Encog currently has two methods to combat overfitting: cross-validation and early stopping (new for 3.1). More info here, though these wiki pages are in need of expansion.
Good! Early stopping may be useful too, but regularization should be more powerful. A basic implementation (leaving the Workbench aside) shouldn't be much of a problem, as the regularization term is applied after the gradients are computed.
I wrote a piece of code in Java for the regularization; I implemented it as a Strategy.
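(The snippet itself is not preserved in this copy of the thread. As a stand-in, here is a minimal sketch of what such a Strategy could look like, assuming Encog's `Strategy` interface (`init`/`preIteration`/`postIteration`) and plain L2 weight decay on the flat weight array; the class name and the `lambda` parameter are illustrative, and the pre-iteration copy mirrors the "initial weights" idea discussed below:)

```java
import org.encog.ml.train.MLTrain;
import org.encog.ml.train.strategy.Strategy;
import org.encog.neural.networks.BasicNetwork;

// Illustrative sketch, not the original code: applies L2 weight decay
// as a training Strategy, using the weights saved before the iteration.
public class RegularizationStrategy implements Strategy {

    private final double lambda;     // regularization strength (illustrative parameter)
    private double[] weights;        // live flat weight array of the trained network
    private double[] initialWeights; // copy of the weights taken before the iteration

    public RegularizationStrategy(final double lambda) {
        this.lambda = lambda;
    }

    @Override
    public void init(final MLTrain train) {
        final BasicNetwork network = (BasicNetwork) train.getMethod();
        this.weights = network.getFlat().getWeights();
    }

    @Override
    public void preIteration() {
        // Remember the weights as they were before the gradient step.
        this.initialWeights = this.weights.clone();
    }

    @Override
    public void postIteration() {
        // Subtract the gradient of (lambda/2) * ||w||^2, evaluated at the
        // pre-iteration weights, after the data gradients have been applied.
        for (int i = 0; i < this.weights.length; i++) {
            this.weights[i] -= this.lambda * this.initialWeights[i];
        }
    }
}
```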
poussevinm: I like the idea of implementing it as a Strategy. As for the regularization, I think the old values are not needed, so if I'm not mistaken (I haven't tested it), the above code can be simplified to:
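(That snippet is not preserved either; presumably it just dropped the pre-iteration copy from the sketch above and decayed the current weights directly, along these lines:)

```java
@Override
public void postIteration() {
    // Decay the current weights directly; no saved copy of the old values.
    for (int i = 0; i < this.weights.length; i++) {
        this.weights[i] -= this.lambda * this.weights[i];
    }
}
```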
In Encog 3.1 the weights are copied to the GradientWorkers before the gradients are computed.
My idea was that regularization adds a term to the cost function, and since the gradient is linear, you can apply the influence of the regularization in a second pass. This is why I needed the initial weights. This also means that the code does not depend on the way you compute the gradient on the training examples.
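(In symbols, taking L2 as the example: the regularized cost and its gradient are

$$E_{\text{reg}}(w) = E(w) + \frac{\lambda}{2}\lVert w\rVert^2, \qquad \nabla E_{\text{reg}}(w) = \nabla E(w) + \lambda w,$$

so the data gradient $\nabla E(w)$ and the decay term $\lambda w$ are additive and can be applied in separate passes, with $\lambda w$ evaluated at the pre-update weights.)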
Well, the problem is that …
Thanks for the contributed code; I will take a look.
I see your point, Petr. This is why I used the initial weights in my code. Thanks for your attention.
My point was that …
Ok, my bad. I see my mistake now.
Actually, I think this code is still incorrect: you don't want to regularize weights from bias inputs. It's not clear to me, though, how to figure out whether a weight comes from a bias input when you have the flat representation.
I agree with @thomasj02, the bias terms must not be included when regularizing. I'd appreciate it if someone could fix that. Thank you.
Well, allowing large biases gives our networks more flexibility in behaviour: large biases make it easier for neurons to saturate, which is sometimes desirable. So regularizing the biases is not necessary.
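(One way to answer thomasj02's question, sketched under the assumption that Encog's `FlatNetwork` stores layers output-first, keeps each target neuron's incoming weights contiguous, and places a layer's bias and context neurons after its feed neurons; `biasWeightMask` is an illustrative helper, not Encog API:)

```java
import org.encog.neural.flat.FlatNetwork;

// Illustrative helper: marks which positions in the flat weight array
// originate from bias (or context) neurons, so a regularization step
// can skip them.
public final class BiasWeights {

    public static boolean[] biasWeightMask(final FlatNetwork flat) {
        final int[] counts = flat.getLayerCounts();         // neurons per layer, bias included
        final int[] feedCounts = flat.getLayerFeedCounts(); // neurons per layer, bias excluded
        final int[] weightIndex = flat.getWeightIndex();    // first weight of each layer block
        final boolean[] mask = new boolean[flat.getWeights().length];

        // Layers are stored output-first: the weights starting at
        // weightIndex[l] connect layer l + 1 (nearer the input) into layer l.
        for (int l = 0; l < counts.length - 1; l++) {
            int index = weightIndex[l];
            final int fromCount = counts[l + 1];
            final int fromFeed = feedCounts[l + 1];
            final int toFeed = feedCounts[l];
            for (int to = 0; to < toFeed; to++) {
                for (int from = 0; from < fromCount; from++) {
                    // Source neurons past the feed count are bias/context neurons.
                    mask[index++] = from >= fromFeed;
                }
            }
        }
        return mask;
    }
}
```

(The mask could then be consulted in `postIteration` to leave the flagged weights untouched.)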
Since this was submitted, Encog has added dropout, L1, and L2.