A description of the repository and usage instructions for the scripts will follow soon.
- Add the Swish activation function to the simulation and, potentially, to the theory. Experiment with ReLU vs. Swish students learning a rule defined by a sigmoidal Erf teacher (see the sketch after this list).
- Drift processes and weight decay.
- Exact theoretical description of the on-line ReLU dynamics for a general learning rate, if possible.
- Hidden-to-output weights, additional layers, tree-like structures.
- Other learning scenarios, such as batch learning.
- Active learning.
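
The Swish experiment from the first item could look roughly like the following sketch: a soft-committee student with a Swish activation trained by plain on-line SGD on fresh Gaussian inputs labelled by a sigmoidal Erf teacher. This is only an illustration under assumed conventions; the function names, the 1/sqrt(N) field scaling and the learning-rate choice are not taken from the repository.

```python
# Hypothetical sketch (not the repository's actual code) of the planned
# ReLU/Swish-vs-Erf experiment: an on-line SGD student with Swish units
# learning labels produced by a sigmoidal Erf teacher.
import numpy as np
from scipy.special import erf


def g_erf(x):
    """Sigmoidal Erf activation used by the teacher."""
    return erf(x / np.sqrt(2.0))


def g_swish(x, beta=1.0):
    """Swish activation: x * sigmoid(beta * x)."""
    return x / (1.0 + np.exp(-beta * x))


def dg_swish(x, beta=1.0):
    """Derivative of Swish, needed for the student's gradient."""
    s = 1.0 / (1.0 + np.exp(-beta * x))
    return s + beta * x * s * (1.0 - s)


def online_sgd_step(w_student, w_teacher, lr, rng):
    """One on-line SGD step on a single fresh Gaussian example."""
    n = w_student.shape[1]
    x = rng.standard_normal(n)                      # fresh input pattern
    y = np.sum(g_erf(w_teacher @ x / np.sqrt(n)))   # teacher (Erf) label
    h = w_student @ x / np.sqrt(n)                  # student pre-activations
    err = np.sum(g_swish(h)) - y                    # student output error
    # Gradient of the squared error 0.5 * err**2 w.r.t. the student weights
    grad = err * dg_swish(h)[:, None] * x[None, :] / np.sqrt(n)
    return w_student - lr * grad


# Example usage: N-dimensional inputs, K student and M teacher hidden units
# (all values here are illustrative assumptions).
rng = np.random.default_rng(0)
N, K, M = 500, 2, 2
w_s = rng.standard_normal((K, N))
w_t = rng.standard_normal((M, N))
for _ in range(10_000):
    w_s = online_sgd_step(w_s, w_t, lr=0.5, rng=rng)
```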