- factor out gradient clipping and NaN checking
- factor out clearing of gradients
- configurable update methods
- LSTMOMP option
- single mat option
- full1 everywhere
- implement LSTM alternative
- implement per-class or per-step weights
- OMP parallel training
- add convolutional layers
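
A rough sketch of what the first two items could look like once factored out. This is not the existing code: `clip_and_check` and `clear_gradients` are hypothetical names, and a flat `std::vector<float>` stands in for whatever gradient storage the network classes actually use.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: clamp each gradient entry to [-limit, limit].
// Returns false as soon as a NaN or infinity is seen (entries after it
// are left untouched), so the caller can skip the weight update.
inline bool clip_and_check(std::vector<float> &grad, float limit) {
  for (float &g : grad) {
    if (!std::isfinite(g)) return false;
    if (g > limit) {
      g = limit;
    } else if (g < -limit) {
      g = -limit;
    }
  }
  return true;
}

// Hypothetical helper: reset all gradient entries to zero.
inline void clear_gradients(std::vector<float> &grad) {
  for (float &g : grad) g = 0.0f;
}
```

Each layer's update step would then call `clip_and_check` before applying gradients and `clear_gradients` afterwards, instead of repeating the loops inline.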