Backpropagation

Neural backpropagation (Java) with examples and training.



You can find Java test/example programs in the test directory on GitHub.

👷‍♂️   TesterSimpleNumbers.java is the simplest example: it trains a one-hidden-layer backpropagation network to approximate a result (number) from two input numbers.

👷‍♀️   TesterXOR.java tackles the XOR problem, a classic in artificial neural network research that highlights the major difference between a single-layer perceptron and one with additional layers. The XOR function is not linearly separable, so a single-layer perceptron cannot converge on it; a perceptron with one hidden layer, however, can classify the XOR inputs accurately.
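
For orientation, here is a minimal from-scratch sketch of what TesterXOR demonstrates. It is independent of this library: the class and variable names below are illustrative only and do not reflect this library's API. A tiny 2-3-1 sigmoid network is trained with plain online backpropagation until it reproduces the XOR truth table.

```java
// Minimal from-scratch sketch (NOT this library's API; all names are illustrative):
// a 2-3-1 sigmoid network trained with plain online backpropagation to learn XOR.
import java.util.Random;

public class XorSketch {

    static final int HIDDEN = 3;              // neurons in the hidden layer
    static final double LEARNING_RATE = 0.6;

    public static void main(String[] args) {
        double[][] inputs  = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
        double[]   targets = {0, 1, 1, 0};

        Random rnd = new Random(42);
        double[][] wIH = new double[2][HIDDEN];   // input -> hidden weights
        double[]   bH  = new double[HIDDEN];      // hidden biases
        double[]   wHO = new double[HIDDEN];      // hidden -> output weights
        double     bO  = 0;                       // output bias
        for (int j = 0; j < HIDDEN; j++) {        // random init ("symmetry breaking")
            wHO[j] = rnd.nextDouble() * 2 - 1;
            for (int i = 0; i < 2; i++) wIH[i][j] = rnd.nextDouble() * 2 - 1;
        }

        for (int epoch = 0; epoch < 20000; epoch++) {
            for (int p = 0; p < inputs.length; p++) {
                double[] x = inputs[p];

                // forward pass
                double[] h = new double[HIDDEN];
                for (int j = 0; j < HIDDEN; j++) {
                    double sum = bH[j];
                    for (int i = 0; i < 2; i++) sum += x[i] * wIH[i][j];
                    h[j] = sigmoid(sum);
                }
                double out = bO;
                for (int j = 0; j < HIDDEN; j++) out += h[j] * wHO[j];
                out = sigmoid(out);

                // backward pass: gradients of the squared error, then weight updates
                double deltaO = (out - targets[p]) * out * (1 - out);
                for (int j = 0; j < HIDDEN; j++) {
                    double deltaH = deltaO * wHO[j] * h[j] * (1 - h[j]);
                    wHO[j] -= LEARNING_RATE * deltaO * h[j];
                    bH[j]  -= LEARNING_RATE * deltaH;
                    for (int i = 0; i < 2; i++) wIH[i][j] -= LEARNING_RATE * deltaH * x[i];
                }
                bO -= LEARNING_RATE * deltaO;
            }
        }

        // after training, the outputs should be close to the XOR truth table (0/1)
        for (int p = 0; p < inputs.length; p++) {
            double[] h = new double[HIDDEN];
            for (int j = 0; j < HIDDEN; j++) {
                double sum = bH[j];
                for (int i = 0; i < 2; i++) sum += inputs[p][i] * wIH[i][j];
                h[j] = sigmoid(sum);
            }
            double out = bO;
            for (int j = 0; j < HIDDEN; j++) out += h[j] * wHO[j];
            System.out.printf("%.0f XOR %.0f -> %.3f%n",
                    inputs[p][0], inputs[p][1], sigmoid(out));
        }
    }

    static double sigmoid(double z) { return 1.0 / (1.0 + Math.exp(-z)); }
}
```

Removing the hidden layer turns this back into a single-layer perceptron, which cannot fit the XOR table; with a few hidden sigmoid neurons the four patterns separate cleanly.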

🧑‍🔧   How to use it: Download the newest GitHub release backpropagation_vx.x.x.jar file. Write your own neural network application or start with one of the test program examples, referencing the jar like

java -cp backpropagation_vx.x.x.jar test/TesterSimpleNumbers

where x.x.x is the current version. You need a Java runtime/JDK installed (at least version 17; check on the command line using java -version). To get one: on Linux, simply use your package manager; on Windows/macOS/others, download and install a JDK from here.
You may also build Backpropagation from scratch using Ant and the build.xml file.

Some best-practice hints, drawn from the literature:

Note

Hints can help to solve a problem, but a particular problem may need a different (special) treatment.

  • Depending on the problem to be solved, use one or two hidden layers (three are rare, and four usually offer no advantage over three).

  • The hidden layer (or the first of several hidden layers) should have somewhat more neurons than the input layer (around 10% more, depending on the problem). With many training data sets, the number of neurons may be increased to store more "information".

  • Too many neurons in the hidden layers: the model does not "learn", it merely "stores" the inputs in its weights and increases the amount of computation (time, energy). As a result, new, unseen data will not be recognized (the model has not "grasped" the underlying principle).

  • Not enough neurons: the model is too small to learn "all the data".

  • Sometimes it makes sense to start with a higher learning rate and reduce it after some training (see the sketch after this list).

  • Learning rates are usually in the range of 0.01 to 0.9.

  • High learning rates may skip over an optimum or oscillate around it.

  • Low learning rates may increase computation time considerably, especially in "flat" regions of the optimization landscape, and may get stuck in local minima.
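
As an illustration of the learning-rate hint above (a sketch only, not this library's API), a simple stepwise decay starts with a higher rate and reduces it during training, with a lower bound:

```java
// Sketch of a stepwise learning-rate decay (not this library's API):
// start with a relatively high rate and halve it every fixed number of
// epochs, never going below a minimum.
public class LearningRateDecaySketch {

    public static void main(String[] args) {
        double learningRate = 0.5;        // start relatively high
        final double decayFactor = 0.5;   // halve the rate at each decay step
        final int decayEvery = 1000;      // epochs between decay steps
        final double minRate = 0.01;      // lower bound

        for (int epoch = 1; epoch <= 5000; epoch++) {
            // trainOneEpoch(learningRate);   // placeholder for the actual training step
            if (epoch % decayEvery == 0) {
                learningRate = Math.max(minRate, learningRate * decayFactor);
                System.out.println("epoch " + epoch + ": learning rate is now " + learningRate);
            }
        }
    }
}
```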

In this model, the weights (and biases) are initialized randomly to avoid certain initialization traps (the model uses "symmetry breaking").
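
For illustration (a sketch, not this library's code): if all weights started with the same value, every hidden neuron would compute the same output and receive the same gradient, so the neurons could never differentiate; small random initial values break that symmetry.

```java
// Sketch of symmetry-breaking initialization (not this library's code):
// give every weight a small random value around zero so that the hidden
// neurons start out different and receive different gradients.
import java.util.Arrays;
import java.util.Random;

public class WeightInitSketch {

    public static void main(String[] args) {
        int inputNeurons = 2, hiddenNeurons = 3;
        double[][] weights = new double[inputNeurons][hiddenNeurons];
        Random rnd = new Random();
        for (int i = 0; i < inputNeurons; i++) {
            for (int j = 0; j < hiddenNeurons; j++) {
                weights[i][j] = rnd.nextDouble() - 0.5;   // uniform in [-0.5, 0.5)
            }
        }
        System.out.println(Arrays.deepToString(weights));
    }
}
```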


Apache 2.0 licensed, so it may be used in any other project/program.

🏅   Credits, Kudos and Attribution:

📖   References, algorithms and literature (see also   ☕   Javadoc overview for more clues):

  • David Kriesel, 2007, A Brief Introduction to Neural Networks, www.dkriesel.com
  • R. Rojas, Neural Networks, Springer-Verlag, Berlin

Contributions, examples (or a request 🙂) from any interested party are welcome; please open an issue with a short description.