The key features planned for development are:
- Restructure the API to handle both dataframes and matrices.
- Implement regression.
- Implement confidence estimation (classification with probability output).
- Integrate WEKA.
- Implement feature selection (wrap to another library).
- Implement a forkjoin transformer (a variant of split-apply-combine in which the split step is replaced with a generate step).
- Implement more ensemble methods (e.g. heterogeneous AdaBoost).
- Enable parallel / distributed learning.
- Implement more metrics (e.g. AUC).
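
The forkjoin item in the list above can be illustrated with a small sketch. This is not the project's API: `fork_join`, `scaled_copies`, and the parameter names are hypothetical, chosen only to show the idea. Instead of splitting the input into partitions, a generate step produces several derived inputs, each is processed independently, and the results are combined.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")  # type of the original input
U = TypeVar("U")  # type of each generated branch
V = TypeVar("V")  # type of each per-branch result
W = TypeVar("W")  # type of the combined result


def fork_join(
    data: T,
    generate: Callable[[T], List[U]],
    apply: Callable[[U], V],
    combine: Callable[[List[V]], W],
) -> W:
    """Hypothetical generate-apply-combine pipeline.

    Unlike split-apply-combine, the first step does not partition
    `data`; it generates new inputs derived from it.
    """
    branches = generate(data)                 # fork: generate branch inputs
    results = [apply(b) for b in branches]    # apply the same step to each
    return combine(results)                   # join: merge branch results


def scaled_copies(xs: List[int]) -> List[List[int]]:
    """Illustrative generate step: three scaled copies of the input."""
    return [[x * k for x in xs] for k in (1, 2, 3)]


# Generate three variants, sum each one, and keep the largest sum.
result = fork_join([1, 2, 3], scaled_copies, sum, max)
```

In this sketch the branches are processed sequentially; because each branch is independent, the apply step is also a natural place for parallel execution.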