SALSA: Software Lab for Advanced Machine Learning with Stochastic Algorithms is a native Julia implementation of the well known stochastic algorithms for sparse linear modelling, linear and non-linear Support Vector Machines. It is distributed under the GPLv3 license and stemmed from the following algorithmic approaches:
-
Pegasos: S. Shalev-Shwartz, Y. Singer, N. Srebro, Pegasos: Primal Estimated sub-GrAdient SOlver for SVM, in: Proceedings of the 24th international conference on Machine learning, ICML ’07, New York, NY, USA, 2007, pp. 807–814.
-
RDA: L. Xiao, Dual averaging methods for regularized stochastic learning and online optimization, J. Mach. Learn. Res. 11 (2010), pp. 2543–2596.
-
Adaptive RDA: J. Duchi, E. Hazan, Y. Singer, Adaptive subgradient methods for online learning and stochastic optimization, J. Mach. Learn. Res. 12 (2011), pp. 2121–2159.
-
Reweighted RDA: V. Jumutc, J.A.K. Suykens, Reweighted stochastic learning, Neurocomputing Special Issue - ISNN2014, 2015. (In Press)
Pkg.add("SALSA")
- Documentation: http://salsajl.readthedocs.org
using MAT, SALSA
# Load Ripley data
data = matread(joinpath(Pkg.dir("SALSA"),"data","ripley.mat"))
# Train and cross-validate Pegasos algorithm (default) on training data
# and evaluate it on the test data provided as the last function argument
model = salsa(data["X"], data["Y"], data["Xt"])
# Compute accuracy in %
@printf "Accuracy: %.2f%%\n" mean(model.output.Ytest .== data["Yt"])*100
# Or use map_predict function and map data beforehand by the extracted mean/std (default)
@printf "Accuracy: %.2f%%\n" mean(map_predict(model, data["Xt"]) .== data["Yt"])*100
or using Q&A tables
using SALSA
model = salsa_qa(readcsv(joinpath(Pkg.dir("SALSA"),"data","iris.data.csv")))
Do you have any target variable of interest in X (or ENTER for default 'yes')? [y/n]:
Please provide the column number of your target variable (or ENTER for default last column):
Is your problem of the classification type (or ENTER for default 'yes')? [y/n]:
Please select a loss function from options (or ENTER for default)
1 : SALSA.PINBALL (Pinball (quantile) Loss, i.e. l(y,p) = τI(yp>=1)yp + I(yp<1)(1 - yp))
2 : SALSA.HINGE (Hinge Loss, i.e. l(y,p) = max(0,1 - yp)) (default)
3 : SALSA.LEAST_SQUARES (Squared Loss, i.e. l(y,p) = 1/2*(p - y)^2)
4 : SALSA.LOGISTIC (Logistic Loss, i.e. l(y,p) = log(1 + exp(-yp)))
5 : SALSA.MODIFIED_HUBER (Modified Huber Loss, i.e. l(y,p) = -4I(yp<-1)yp + I(yp>=-1)max(0,1 - yp)^2)
6 : SALSA.SQUARED_HINGE (Squared Hinge Loss, i.e. l(y,p) = max(0,1 - yp)^2)
:
Please select a cross-validation (CV) criterion from options (or ENTER for default)
1 : SALSA.AUC (Area Under ROC Curve with 100 thresholds)
2 : SALSA.MISCLASS (Misclassification Rate) (default)
3 : SALSA.MSE (Mean Squared Error)
:
Do you want to perform Nyström (nonlinear) approximation (or ENTER for default)? [y/n]
n : SALSA.LINEAR (default)
y : SALSA.NONLINEAR
:
Please select an algorithm from options (or ENTER for default)
1 : SALSA.DROP_OUT (Dropout Pegasos (experimental))
2 : SALSA.PEGASOS (Pegasos: Primal Estimated sub-GrAdient SOlver for SVM) (default)
3 : SALSA.SIMPLE_SGD (Stochastic Gradient Descent)
4 : SALSA.ADA_L1RDA (Adaptive l1-Regularized Dual Averaging)
5 : SALSA.L1RDA (l1-Regularized Dual Averaging)
6 : SALSA.R_L1RDA (Reweighted l1-Regularized Dual Averaging)
7 : SALSA.R_L2RDA (Reweighted l2-Regularized Dual Averaging)
:
Please select a global optimization method from options (or ENTER for default)
1 : SALSA.CSA (Coupled Simulated Annealing) (default)
2 : SALSA.DS (Directional Search)
:
Computing the model...