This will be the last release supporting Java 6
- General Bug Fixes
- New Jaccard Distance
- New SMOTE and Borderline-SMOTE algorithms for class imbalance
- New SDCA algorithm for general-purpose linear model solving (classification and regression), with elastic-net regularization
- Added DataWriter interface for CSV, LIBSVM, and JSATData file types. This is particularly useful for cases where you want multiple threads writing out to the file at once.
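
For readers unfamiliar with the Jaccard distance mentioned in the list above, here is a minimal sketch of the set-based definition. It is plain Java and does not use JSAT's API. `JaccardSketch` and its `distance` method are hypothetical names chosen for illustration, and treating two empty sets as distance 0 is an assumption; how JSAT defines the distance on its own vector types is not stated in these notes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative only: JaccardSketch is a hypothetical class, not part of JSAT.
final class JaccardSketch {

    // Jaccard distance: 1 - |A intersect B| / |A union B|.
    // Assumption: two empty sets are treated as identical (distance 0).
    static double distance(Set<Integer> a, Set<Integer> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<Integer> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        // |A union B| = |A| + |B| - |A intersect B|
        int unionSize = a.size() + b.size() - intersection.size();
        return 1.0 - (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> b = new HashSet<>(Arrays.asList(2, 3, 4));
        // The sets share 2 of their 4 distinct items.
        System.out.println(distance(a, b));
    }
}
```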
A small number of commits in this release, but a fair amount packed into them!
- Added new CPM classifier, another fast but non-linear method similar to AMM
- Improved a number of the Vector Collections / Metric spaces
- Fixed bug in RBC
- Huge performance improvements to KDTree
- VPMV avoids pathological worst-case performance
- A number of the collections now support incremental insertion (and a new interface for that feature too)
- CoverTree algorithm added
- RandomUtil added to make it easy to get consistent results / repeatable experiments
- Performance improvement to IntSet, original code moved to a new SortedIntSet class
- Bug fix and better bounds checking for HamerlyKMeans
- Bug fix in SOM
- New DC-SVM and SVMnoBias classes for training SVMs faster than before!
- Speed improvements to Kernel K Means
- Minor improvements to some of the distributions classes
- Improved unit test reliability and some minor bug fixes
- The API now supports weighted data points for clustering. K-means and Kernel K Means have initial support for using the weights.
- Improved some of the tree-based learners. Decision trees (and stumps) now parallelize better. Also added 3 schemes for inferring feature importance from tree models. Random Forest supports these as well, and uses the out-of-bag data points to get better estimates.
- Added an O(n^2) time algorithm for hierarchical clustering. Improved LanceWilliams objects for it
- Improved parallelism of kernelized k-means
- Improved docs for text builders
- Bug fixes.
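
To illustrate what weighted data points mean for k-means style clustering (mentioned in the list above), here is a minimal sketch of a weighted centroid update: each point contributes to the mean in proportion to its weight. It is plain Java and not JSAT's implementation; `WeightedCentroidSketch` and `weightedMean` are hypothetical names, and the sketch assumes at least one point and a positive total weight.

```java
// Illustrative only: a hypothetical helper, not part of JSAT.
final class WeightedCentroidSketch {

    // Weighted mean of the points assigned to one cluster:
    // centroid[j] = sum_i(w_i * x_ij) / sum_i(w_i)
    // Assumes points is non-empty and the weights sum to a positive value.
    static double[] weightedMean(double[][] points, double[] weights) {
        int dim = points[0].length;
        double[] mean = new double[dim];
        double totalWeight = 0.0;
        for (int i = 0; i < points.length; i++) {
            totalWeight += weights[i];
            for (int j = 0; j < dim; j++) {
                mean[j] += weights[i] * points[i][j];
            }
        }
        for (int j = 0; j < dim; j++) {
            mean[j] /= totalWeight;
        }
        return mean;
    }

    public static void main(String[] args) {
        double[][] points = {{0.0, 0.0}, {4.0, 8.0}};
        double[] weights = {1.0, 3.0};
        // The second point carries three times the weight of the first,
        // so the centroid sits three quarters of the way toward it.
        System.out.println(java.util.Arrays.toString(weightedMean(points, weights)));
    }
}
```

With all weights equal to 1, this reduces to the ordinary unweighted centroid.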
New release includes a number of bug fixes. Normalized AdaGrad added. The bigger feature is that JSAT now supports missing values. Currently only Decision Stump, Decision Tree, and Random Forest handle them natively. There is also an Imputer transform, so you can apply it first and then use whatever algorithm you like.
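
As a rough picture of what an imputation step does before handing data to an algorithm that cannot handle missing values, here is a minimal mean-imputation sketch. It is plain Java and not JSAT's Imputer. `MeanImputerSketch` is a hypothetical name, representing missing values as `NaN` is an assumption made for this example, and mean imputation is just one possible strategy.

```java
// Illustrative only: a hypothetical helper, not JSAT's Imputer transform.
final class MeanImputerSketch {

    // Per-column mean over the observed (non-NaN) entries.
    // A column with no observed values keeps a NaN mean.
    static double[] columnMeans(double[][] rows) {
        int dim = rows[0].length;
        double[] sums = new double[dim];
        int[] counts = new int[dim];
        for (double[] row : rows) {
            for (int j = 0; j < dim; j++) {
                if (!Double.isNaN(row[j])) {
                    sums[j] += row[j];
                    counts[j]++;
                }
            }
        }
        double[] means = new double[dim];
        for (int j = 0; j < dim; j++) {
            means[j] = counts[j] == 0 ? Double.NaN : sums[j] / counts[j];
        }
        return means;
    }

    // Replaces each missing (NaN) entry, in place, with its column's mean.
    static void impute(double[][] rows, double[] means) {
        for (double[] row : rows) {
            for (int j = 0; j < row.length; j++) {
                if (Double.isNaN(row[j])) {
                    row[j] = means[j];
                }
            }
        }
    }

    public static void main(String[] args) {
        double[][] data = {{1.0, Double.NaN}, {3.0, 4.0}, {Double.NaN, 8.0}};
        impute(data, columnMeans(data));
        System.out.println(java.util.Arrays.deepToString(data));
    }
}
```

In practice the means would be computed on the training data only and then reused to fill gaps in new data, so the test set does not leak into the fit.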
New release includes the new VisualizationTransform interface, for one-way transformations meant specifically for visualizing higher dimensional datasets. It includes the popular algorithms TSNE (the fast Barnes-Hut approximation), MDS, and IsoMap. Added a new CSV reader/writer and a custom binary format, "JSATData", that is much better suited to long-term storage and performance, as well as minor improvements to the other loaders. All GUI code has been removed (this should make use with Android easier). Also some improvements to various things in the Distribution package, and bug fixes.
The major change is new code that makes it easier to add parameters to a GridSearch for parameter tuning. A RandomSearch counterpart was also added. The new version also includes a new generic batch solver for L1 regularized objectives, ModifiedOWLQN, and various bug fixes.
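
To show the difference between the two tuning strategies in general terms, here is a minimal sketch that searches two parameters of a toy loss function, first over a fixed grid and then by random sampling. It is plain Java and does not use JSAT's GridSearch or RandomSearch classes. `ParamSearchSketch`, `Objective`, and the parameter names `c` and `gamma` are hypothetical; in a real setting the loss would typically be a cross-validated error.

```java
import java.util.Random;

// Illustrative only: hypothetical names, not JSAT's GridSearch/RandomSearch API.
final class ParamSearchSketch {

    // Loss to minimize for a given parameter pair (lower is better).
    interface Objective {
        double loss(double c, double gamma);
    }

    // Grid search: evaluates every combination of the candidate values.
    // Returns {bestC, bestGamma, bestLoss}.
    static double[] gridSearch(double[] cs, double[] gammas, Objective obj) {
        double[] best = {Double.NaN, Double.NaN, Double.POSITIVE_INFINITY};
        for (double c : cs) {
            for (double g : gammas) {
                double loss = obj.loss(c, g);
                if (loss < best[2]) {
                    best = new double[]{c, g, loss};
                }
            }
        }
        return best;
    }

    // Random search: samples each parameter uniformly from its range.
    // A fixed seed makes the run repeatable. Returns {bestC, bestGamma, bestLoss}.
    static double[] randomSearch(double cMin, double cMax, double gMin, double gMax,
                                 int trials, long seed, Objective obj) {
        Random rng = new Random(seed);
        double[] best = {Double.NaN, Double.NaN, Double.POSITIVE_INFINITY};
        for (int t = 0; t < trials; t++) {
            double c = cMin + rng.nextDouble() * (cMax - cMin);
            double g = gMin + rng.nextDouble() * (gMax - gMin);
            double loss = obj.loss(c, g);
            if (loss < best[2]) {
                best = new double[]{c, g, loss};
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy loss with its minimum at c = 2, gamma = 0.5.
        Objective toy = (c, g) -> (c - 2) * (c - 2) + (g - 0.5) * (g - 0.5);
        double[] grid = gridSearch(new double[]{1, 2, 4}, new double[]{0.25, 0.5, 1}, toy);
        double[] rand = randomSearch(0, 4, 0, 1, 50, 42L, toy);
        System.out.println(java.util.Arrays.toString(grid));
        System.out.println(java.util.Arrays.toString(rand));
    }
}
```

Grid search costs one evaluation per combination, so it grows quickly with the number of parameters. Random search keeps the budget fixed at `trials` evaluations regardless of how many parameters are tuned.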