-
Notifications
You must be signed in to change notification settings - Fork 11
A stand-alone pure Java library for linear algebra and machine learning
License
MingjieQian/LAML
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
 |  | |||
 |  | |||
 |  | |||
 |  | |||
 |  | |||
 |  | |||
 |  | |||
 |  | |||
 |  | |||
Repository files navigation
LAML (Linear Algebra and Machine Learning) LAML is a stand-alone pure Java library for linear algebra and machine learning. The goal is to build efficient and easy-to-use linear algebra and machine learning libraries. The reason why linear algebra and machine learning are built together is that full control of the basic data structures for matrices and vectors is required to have fast implementation for machine learning methods. Additionally, LAML provides a lot of commonly used matrix functions in the same signature to MATLAB, thus can also be used to manually convert MATLAB code to Java code. The built-in linear algebra library supports well-designed dense and sparse matrices and vectors. Standard compressed sparse column (CSC) and compressed sparse row (CSR) are used to design and implement sparse matrices. The matrix multiplication is competitive or even faster than well known linear algebra libraries in Java such as Commons-Math and Colt. Unlike other linear algebra libraries in Java, the built-in linear algebra library in LAML gives users full control of sparse matrices and vectors (e.g., the interior arrays in sparse matrices), which is crucial to make efficient high level implementations. Carefully designed linear algebra library is the basis for machine learning library. ML library aims to provide fast implementation of mature machine learning methods. For instance, the LinearBinarySVM class re-implements the popular liblinear (in Java). For heart_scale data with C = 1.0 and eps = 1e-2, the average running time is 0.04 seconds using an Intel(R) Core(TM) i7 CPU M620 @ 2.67GHz with 4.00GB memory and 64-bit Windows 7 operating system, even a little faster than liblinear, which costs 0.06 seconds in average given the same parameter. JML v.s. LAML LAML is much faster than JML (more than 3 times faster) due to two implementation considerations. First, LAML allows full control of dense and sparse matrices and vectors. Second, LAML extensively uses in-place matrix and vector operations thus avoids too much memory allocation and garbage collection. JML relies on third party linear algebra library, i.e. Apache Commons-math. Sparse matrices and vectors have been deprecated in Commons-math 3.0+, and will be ultimately eliminated. Whereas LAML has its own built-in linear algebra library. Like JML, LAML also provides a lot of commonly used matrix functions in the same signature to MATLAB, thus can also be used to manually convert MATLAB code to Java code. In short, JML has been replaced by LAML. SourceForge https://sourceforge.net/projects/lamal Github: https://github.com/MingjieQian/LAML Documentation http://web.engr.illinois.edu/~mqian2/upload/projects/java/LAML/doc/index.html Features Stand-alone Java library, completely cross-platform Built-in Linear Algebra (LA) library Full control of matrices and vectors Many general-purpose optimization algorithms Fast implementation of Machine Learning (ML) methods Matrix functions with almost the same signature to MATLAB Well documented source code and friendly API, very easy to use Packages la.decomposition LU, QR, eigenvalue decomposition, and SVD la.matrix Sparse and dense matrix implementation la.vector Sparse and dense vector implementation la.io Functions of saving and loading a matrix, a vector, or a data set ml.utils Efficient functions for array and matrix operations, Matlab-style functions, and Printer ml.classification Linear SVM, Linear multi-class SVM, regularized logistic regression, maximum entropy modeling, and AdaBoost ml.clustering K-means, L1NMF, NMF, and spectral clustering ml.optimization L-BFGS, BoundConstrainedPLBFGS, NonnegativePLBFGS, Projection, ProximalMapping, ShrinkageOperator, accelerated proximal gradient, accelerated gradient descent, general quadratic programming, nonlinear conjugate gradient, LBFGS on simplex, quadratic programming with bound constraint, primal-dual interior-point method ml.sequence Hidden Markov Models (HMM) and Conditional Random Fields (CRF) ml.kernel Commonly used kernel functions ('linear' | 'poly' | 'rbf' | 'cosine') ml.manifold Commonly used manifold learning functions such as computing adjacency matrix, Laplacian matrix, and local learning regularization matrix ml.subspace PCA, kernel PCA, Multi-dimensional Scaling (MDS), Isomap, and Locally Linear Embedding (LLE) ml.regression LASSO and linear regression ml.random Multivariate Gaussian distribution ml.recovery Matrix completion and robust PCA ml.topics LDA ml.graph Minimum spanning tree using Prim's algorithm, shortest path using Dijkstra's algorithm, topological order, all-pairs shortest path using Floyd-Warshall algorithm, Huffman codes, and maximum flow using Ford-Fulkerson algorithm ml.recommendation Factorization machines and structured sparse regression (STSR) Code Examples # Eigenvalue Decomposition (For Real Symmetric Matrices) int m = 4; int n = 4; Matrix A = hilb(m, n); fprintf("A:%n"); disp(A); long start = 0; start = System.currentTimeMillis(); Matrix[] VD = EigenValueDecomposition.decompose(A); System.out.format("Elapsed time: %.4f seconds.%n", (System.currentTimeMillis() - start) / 1000.0); fprintf("*****************************************%n"); Matrix V = VD[0]; Matrix D = VD[1]; fprintf("V:%n"); printMatrix(V); fprintf("D:%n"); printMatrix(D); fprintf("VDV':%n"); disp(V.mtimes(D).mtimes(V.transpose())); fprintf("A:%n"); printMatrix(A); fprintf("V'V:%n"); printMatrix(V.transpose().mtimes((V))); # Output A: 1 0.5000 0.3333 0.2500 0.5000 0.3333 0.2500 0.2000 0.3333 0.2500 0.2000 0.1667 0.2500 0.2000 0.1667 0.1429 Elapsed time: 0.0100 seconds. ***************************************** V: 0.7926 0.5821 -0.1792 -0.0292 0.4519 -0.3705 0.7419 0.3287 0.3224 -0.5096 -0.1002 -0.7914 0.2522 -0.5140 -0.6383 0.5146 D: 1.5002 0.1691 0.0067 0.0001 VDV': 1 0.5000 0.3333 0.2500 0.5000 0.3333 0.2500 0.2000 0.3333 0.2500 0.2000 0.1667 0.2500 0.2000 0.1667 0.1429 A: 1 0.5000 0.3333 0.2500 0.5000 0.3333 0.2500 0.2000 0.3333 0.2500 0.2000 0.1667 0.2500 0.2000 0.1667 0.1429 V'V: 1.0000 -0.0000 -0.0000 -0.0000 -0.0000 1.0000 0.0000 0 -0.0000 0.0000 1.0000 -0.0000 -0.0000 0 -0.0000 1.0000 # -------------------------------------------------------------------------- # # LU Decomposition double[][] data = new double[][] { {1, -2, 3}, {2, -5, 12}, {0, 2, -10} }; Matrix A = new DenseMatrix(data); fprintf("A:%n"); printMatrix(A); Matrix[] LUP = LUDecomposition.decompose(A); Matrix L = LUP[0]; Matrix U = LUP[1]; Matrix P = LUP[2]; fprintf("L:%n"); printMatrix(L); fprintf("U:%n"); printMatrix(U); fprintf("P:%n"); printMatrix(P); fprintf("PA:%n"); printMatrix(P.mtimes(A)); fprintf("LU:%n"); printMatrix(L.mtimes(U)); long start = 0; start = System.currentTimeMillis(); LUDecomposition LUDecomp = new LUDecomposition(A); Vector b = new DenseVector(new double[] {2, 3, 4}); Vector x = LUDecomp.solve(b); fprintf("Solution for Ax = b:%n"); printVector(x); fprintf("b = %n"); printVector(b); fprintf("Ax = %n"); printVector(A.operate(x)); fprintf("A^{-1}:%n"); printMatrix(LUDecomp.inverse()); fprintf("det(A) = %.2f%n", LUDecomp.det()); System.out.format("Elapsed time: %.2f seconds.%n", (System.currentTimeMillis() - start) / 1000F); fprintf("**********************************%n"); A = sparse(A); fprintf("A:%n"); printMatrix(A); LUP = LUDecomposition.decompose(A); L = LUP[0]; U = LUP[1]; P = LUP[2]; fprintf("L:%n"); printMatrix(L); fprintf("U:%n"); printMatrix(U); fprintf("P:%n"); printMatrix(P); fprintf("PA:%n"); printMatrix(P.mtimes(A)); fprintf("LU:%n"); printMatrix(L.mtimes(U)); start = System.currentTimeMillis(); LUDecomp = new LUDecomposition(sparse(A)); b = new DenseVector(new double[] {2, 3, 4}); x = LUDecomp.solve(b); fprintf("Solution for Ax = b:%n"); printVector(x); fprintf("Ax = %n"); printVector(A.operate(x)); fprintf("b = %n"); printVector(b); Matrix B = new DenseMatrix(new double[][] { {2, 4}, {3, 3}, {4, 2} } ); Matrix X = LUDecomp.solve(B); fprintf("Solution for AX = B:%n"); printMatrix(X); fprintf("AX = %n"); printMatrix(A.mtimes(X)); fprintf("B = %n"); printMatrix(B); fprintf("A^{-1}:%n"); printMatrix(LUDecomp.inverse()); fprintf("det(A) = %.2f%n", LUDecomp.det()); System.out.format("Elapsed time: %.2f seconds.%n", (System.currentTimeMillis() - start) / 1000F); # Output A: 1 -2 3 2 -5 12 0 2 -10 L: 1 0 0 0 1 0 0.5000 0.2500 1 U: 2 -5 12 0 2 -10 0 0 -0.5000 P: 0 1 0 0 0 1 1 0 0 PA: 2 -5 12 0 2 -10 1 -2 3 LU: 2 -5 12 0 2 -10 1 -2 3 Solution for Ax = b: 13 7 1 b = 2 3 4 Ax = 2 3 4 A^{-1}: -13 7 4.5000 -10 5 3 -2 1 0.5000 det(A) = -2.00 Elapsed time: 0.02 seconds. ********************************** A: 1 -2 3 2 -5 12 2 -10 L: 1 1 0.5000 0.2500 1 U: 2 -5 12 2 -10 -0.5000 P: 1 1 1 PA: 2 -5 12 2 -10 1 -2 3 LU: 2 -5 12 2 -10 1 -2 3 Solution for Ax = b: 13 7 1 Ax = 2 3 4 b = 2 3 4 Solution for AX = B: 13 -22 7 -19 1 -4 AX = 2 4 3 3 4 2 B = 2 4 3 3 4 2 A^{-1}: -13 7 4.5000 -10 5 3 -2 1 0.5000 det(A) = -2.00 Elapsed time: 0.02 seconds. # -------------------------------------------------------------------------- # # QR Decomposition int m = 4; int n = 3; Matrix A = hilb(m, n); fprintf("When A is full:%n"); fprintf("A:%n"); printMatrix(A); long start = 0; start = System.currentTimeMillis(); Matrix[] QRP = QRDecomposition.decompose(A); Matrix Q = QRP[0]; Matrix R = QRP[1]; Matrix P = QRP[2]; fprintf("Q:%n"); printMatrix(Q); fprintf("R:%n"); printMatrix(R); fprintf("P:%n"); printMatrix(P); fprintf("AP:%n"); printMatrix(A.mtimes(P)); fprintf("QR:%n"); printMatrix(Q.mtimes(R)); fprintf("Q'Q:%n"); printMatrix(Q.transpose().mtimes(Q)); System.out.format("Elapsed time: %.2f seconds.%n", (System.currentTimeMillis() - start) / 1000F); fprintf("**********************************%n"); // fprintf("|AP - QR| = "); A = sparse(hilb(m, n)); fprintf("When A is sparse:%n"); fprintf("A:%n"); printMatrix(A); start = System.currentTimeMillis(); QRP = QRDecomposition.decompose(A); Q = QRP[0]; R = QRP[1]; P = QRP[2]; fprintf("Q:%n"); printMatrix(Q); fprintf("R:%n"); printMatrix(R); fprintf("P:%n"); printMatrix(P); fprintf("AP:%n"); printMatrix(A.mtimes(P)); fprintf("QR:%n"); printMatrix(Q.mtimes(R)); fprintf("Q'Q:%n"); printMatrix(Q.transpose().mtimes(Q)); System.out.format("Elapsed time: %.2f seconds.%n", (System.currentTimeMillis() - start) / 1000F); QRDecomposition QRDecomp = new QRDecomposition((A)); Vector b = new DenseVector(new double[] {2, 3, 4, 9}); Vector x = QRDecomp.solve(b); fprintf("Solution for Ax = b:%n"); printVector(x); fprintf("b = %n"); printVector(b); fprintf("Ax = %n"); printVector(A.operate(x)); # Output When A is full: A: 1 0.5000 0.3333 0.5000 0.3333 0.2500 0.3333 0.2500 0.2000 0.2500 0.2000 0.1667 Q: -0.8381 0.5144 -0.1796 -0.0263 -0.4191 -0.4052 0.7487 0.3157 -0.2794 -0.5351 -0.1132 -0.7892 -0.2095 -0.5338 -0.6280 0.5261 R: -1.1932 -0.4749 -0.6705 0 -0.1258 -0.1184 0 0 0.0059 0 0 0 P: 1 1 1 AP: 1 0.3333 0.5000 0.5000 0.2500 0.3333 0.3333 0.2000 0.2500 0.2500 0.1667 0.2000 QR: 1.0000 0.3333 0.5000 0.5000 0.2500 0.3333 0.3333 0.2000 0.2500 0.2500 0.1667 0.2000 Q'Q: 1.0000 0.0000 -0.0000 0.0000 0.0000 1 -0.0000 0 -0.0000 -0.0000 1.0000 0 0.0000 0 0 1 Elapsed time: 0.05 seconds. ********************************** When A is sparse: A: 1 0.5000 0.3333 0.5000 0.3333 0.2500 0.3333 0.2500 0.2000 0.2500 0.2000 0.1667 Q: -0.8381 0.5144 -0.1796 -0.0263 -0.4191 -0.4052 0.7487 0.3157 -0.2794 -0.5351 -0.1132 -0.7892 -0.2095 -0.5338 -0.6280 0.5261 R: -1.1932 -0.4749 -0.6705 -0.1258 -0.1184 0.0059 P: 1 1 1 AP: 1 0.3333 0.5000 0.5000 0.2500 0.3333 0.3333 0.2000 0.2500 0.2500 0.1667 0.2000 QR: 1.0000 0.3333 0.5000 0.5000 0.2500 0.3333 0.3333 0.2000 0.2500 0.2500 0.1667 0.2000 Q'Q: 1.0000 0.0000 -0.0000 0.0000 0.0000 1 -0.0000 0 -0.0000 -0.0000 1.0000 0 0.0000 0 0 1 Elapsed time: 0.04 seconds. Solution for Ax = b: 117.2346 -719.5017 733.7439 b = 2 3 4 9 Ax = 2.0651 2.2194 5.9516 7.6990 # -------------------------------------------------------------------------- # # Singular Value Decomposition Matrix A = new DenseMatrix(new double[][] { {1d, 2d}, {2d, 0d}, {1d, 7d}}); /*A = new DenseMatrix(new double[][] { {1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12} });*/ // printMatrix(SingularValueDecomposition.bidiagonalize(A)[1]); // A = IO.loadMatrix("SVDInput"); /*fprintf("When A is full:%n%n"); fprintf("A:%n"); printMatrix(A);*/ long start = 0; start = System.currentTimeMillis(); boolean computeUV = !false; Matrix[] USV = SingularValueDecomposition.decompose(A, computeUV); System.out.format("Elapsed time: %.4f seconds.%n", (System.currentTimeMillis() - start) / 1000.0); fprintf("*****************************************%n"); Matrix U = USV[0]; Matrix S = USV[1]; Matrix V = USV[2]; if (computeUV) { fprintf("USV':%n"); disp(U.mtimes(S).mtimes(V.transpose())); fprintf("A:%n"); printMatrix(A); fprintf("U'U:%n"); printMatrix(U.transpose().mtimes((U))); fprintf("V'V:%n"); printMatrix(V.transpose().mtimes((V))); fprintf("U:%n"); printMatrix(U); fprintf("V:%n"); printMatrix(V); } fprintf("S:%n"); printMatrix(S); fprintf("rank(A): %d%n", rank(A)); # Output Elapsed time: 0.0100 seconds. ***************************************** USV': 1.0000 2.0000 2.0000 -0.0000 1.0000 7.0000 A: 1 2 2 0 1 7 U'U: 1.0000 0.0000 0.0000 0.0000 1.0000 0.0000 0.0000 0.0000 1.0000 V'V: 1.0000 -0.0000 -0.0000 1.0000 U: -0.2906 -0.2976 -0.9094 -0.0492 -0.9445 0.3248 -0.9556 0.1391 0.2598 V: -0.1819 -0.9833 -0.9833 0.1819 S: 7.3935 2.0822 rank(A): 2 # -------------------------------------------------------------------------- # # Linear Binary SVM double C = 1.0; double eps = 1e-4; Classifier linearBinarySVM = new LinearBinarySVM(C, eps); int[] pred_labels = null; double[][] data = { {3.5, 4.4, 1.3, 2.3}, {5.3, 2.2, 0.5, 4.5}, {0.2, 0.3, 4.1, -3.1}, {-1.2, 0.4, 3.2, 1.6} }; int[] labels = new int[] {1, 1, -1, -1}; linearBinarySVM.feedData(data); linearBinarySVM.feedLabels(labels); linearBinarySVM.train(); fprintf("W:%n"); printMatrix(linearBinarySVM.W); fprintf("b:%n"); printVector(linearBinarySVM.b); pred_labels = linearBinarySVM.predict(data); getAccuracy(pred_labels, labels); # Output W: 0.2143 0.1312 -0.2407 0.0150 b: -0.0490 Accuracy: 100.00% # -------------------------------------------------------------------------- # # Linear Multi-Class SVM double C = 1.0; double eps = 1e-4; Classifier linearMCSVM = new LinearMCSVM(C, eps); double[][] data = { {3.5, 4.4, 1.3, 2.3}, {5.3, 2.2, 0.5, 4.5}, {0.2, 0.3, 4.1, -3.1}, {-1.2, 0.4, 3.2, 1.6} }; int[] labels = new int[] {1, 2, 3, 4}; linearMCSVM.feedData(data); linearMCSVM.feedLabels(labels); linearMCSVM.train(); fprintf("W:%n"); printMatrix(linearMCSVM.W); fprintf("b:%n"); printVector(linearMCSVM.b); int[] pred_labels = linearMCSVM.predict(data); getAccuracy(pred_labels, labels); # Output .. W: -0.0482 0.1314 0.0630 -0.1462 0.2825 -0.2241 -0.0053 -0.0531 -0.0819 -0.0836 0.0599 0.1056 -0.0838 0.1711 -0.2301 0.1428 b: -0.0248 -0.0089 0.0043 0.0295 Accuracy: 100.00% # -------------------------------------------------------------------------- # # Multi-Class Logistic Regression with Multiple Choices of Regularization double[][] data = { {3.5, 5.3, 0.2, -1.2}, {4.4, 2.2, 0.3, 0.4}, {1.3, 0.5, 4.1, 3.2} }; int[] labels = new int[] {1, 2, 3}; /* * Regularization type. * 0: No regularization * 1: L1 regularization * 2: L2^2 regularization * 3: L2 regularization * 4: Infinity norm regularization */ int regularizationType = 1; double lambda = 0.1; Classifier logReg = new LogisticRegression(regularizationType, lambda); logReg.epsilon = 1e-5; logReg.feedData(data); logReg.feedLabels(labels); // Get elapsed time in seconds tic(); logReg.train(); fprintf("Elapsed time: %.3f seconds.%n", toc()); fprintf("W:%n"); printMatrix(logReg.W); fprintf("b:%n"); printVector(logReg.b); double[][] dataTest = data; fprintf("Ground truth:%n"); printMatrix(logReg.Y); fprintf("Predicted probability matrix:%n"); Matrix Prob_pred = logReg.predictLabelScoreMatrix(dataTest); disp(Prob_pred); fprintf("Predicted label matrix:%n"); Matrix Y_pred = logReg.predictLabelMatrix(dataTest); printMatrix(Y_pred); int[] pred_labels = logReg.predict(dataTest); getAccuracy(pred_labels, labels); # Output # Without regularization L-BFGS converges with norm(Grad) 0.000006 Elapsed time: 0.060 seconds. W: -1.8522 3.1339 -1.2817 3.4138 -1.7282 -1.6856 -1.2455 -1.2630 2.5084 -2.8311 0.5558 2.2753 b: -0.3616 0.2534 0.1082 Ground truth: 1 1 1 Predicted probability matrix: 1.0000 0.0000 0.0000 0.0000 1.0000 0.0000 0.0000 0.0000 1.0000 Predicted label matrix: 1 1 1 Accuracy: 100.00% # L1-norm regularization Accelerated proximal gradient method converges with norm(G_Y_k) 0.000004 Elapsed time: 0.430 seconds. W: 0 0.8678 0 0.9579 0 0 0 0 0.9894 0 0 0 b: 0 0 0 Ground truth: 1 1 1 Predicted probability matrix: 0.8790 0.1143 0.0067 0.1493 0.8263 0.0244 0.0258 0.0494 0.9247 Predicted label matrix: 1 1 1 Accuracy: 100.00% # squared Frobenius norm regularization Accelerated proximal gradient method converges with norm(G_Y_k) 0.000009 Elapsed time: 0.242 seconds. W: -0.2817 0.5368 -0.2551 0.5298 -0.3269 -0.2029 -0.1181 -0.3004 0.4185 -0.3814 0.0449 0.3364 b: -0.0417 0.0266 0.0151 Ground truth: 1 1 1 Predicted probability matrix: 0.8872 0.1028 0.0100 0.1241 0.8299 0.0461 0.0137 0.0514 0.9349 Predicted label matrix: 1 1 1 Accuracy: 100.00% # Frobenius norm regularization Accelerated proximal gradient method converges with norm(G_Y_k) 0.000008 Elapsed time: 0.358 seconds. W: -0.4333 0.7669 -0.3336 0.7402 -0.4956 -0.2447 -0.1262 -0.4227 0.5489 -0.5135 0.0810 0.4326 b: -0.0587 0.0375 0.0212 Ground truth: 1 1 1 Predicted probability matrix: 0.9509 0.0461 0.0029 0.0558 0.9249 0.0192 0.0039 0.0221 0.9740 Predicted label matrix: 1 1 1 Accuracy: 100.00% # Infinity norm regularization Accelerated proximal gradient method converges with norm(G_Y_k) 0.000010 Elapsed time: 0.631 seconds. W: -0.7162 0.7162 -0.7162 0.7162 -0.7162 -0.7162 -0.7162 -0.7162 0.7162 -0.7162 0.7162 0.7162 b: -0.6125 -0.3405 0.5728 Ground truth: 1 1 1 Predicted probability matrix: 0.9821 0.0175 0.0004 0.0179 0.9752 0.0068 0.0000 0.0072 0.9928 Predicted label matrix: 1 1 1 Accuracy: 100.00% # -------------------------------------------------------------------------- # # Maximum Entropy Modeling long start = System.currentTimeMillis(); /* * a 3D {@code double} array, where data[n][i][k] * is the i-th feature value on the k-th class * for the n-th sample */ double[][][] data = new double[][][] { {{1, 0, 0}, {2, 1, -1}, {0, 1, 2}, {-1, 2, 1}}, {{0, 2, 0}, {1, 0, -1}, {0, 1, 1}, {-1, 3, 0.5}}, {{0, 0, 0.8}, {2, 1, -1}, {1, 3, 0}, {-0.5, -1, 2}}, {{0.5, 0, 0}, {1, 1, -1}, {0, 0.5, 1.5}, {-2, 1.5, 1}}, }; /*double [][] labels = new double[][] { {1, 0, 0}, {0, 1, 0}, {0, 0, 1}, {1, 0, 0} };*/ int[] labels = new int[] {1, 2, 3, 1}; MaxEnt maxEnt = new MaxEnt(); maxEnt.feedData(data); maxEnt.feedLabels(labels); maxEnt.train(); double elapsedTime = (System.currentTimeMillis() - start) / 1000d; System.out.format("Elapsed time: %.3f seconds\n", elapsedTime); fprintf("MaxEnt parameters:\n"); display(maxEnt.W); String modelFilePath = "MaxEnt-Model.dat"; maxEnt.saveModel(modelFilePath); maxEnt = new MaxEnt(); maxEnt.loadModel(modelFilePath); fprintf("Predicted probability matrix:\n"); display(maxEnt.predictLabelScoreMatrix(data)); fprintf("Predicted label matrix:\n"); display(full(maxEnt.predictLabelMatrix(data))); fprintf("Predicted labels:\n"); display(maxEnt.predict(data)); # Output L-BFGS converges with norm(Grad) 0.000072 Elapsed time: 0.060 seconds MaxEnt parameters: 7.4697 -1.1159 -2.8338 -1.0426 Model saved. Loading model... Model loaded. Predicted probability matrix: 1.0000 0.0000 0.0000 0.0001 0.9999 0.0000 0.0001 0.0000 0.9999 0.9997 0.0002 0.0001 Predicted label matrix: 1 0 0 0 1 0 0 0 1 1 0 0 Predicted labels: 1 2 3 1 # -------------------------------------------------------------------------- # # AdaBoost double[][] data = { {3.5, 4.4, 1.3}, {5.3, 2.2, 0.5}, {0.2, 0.3, 4.1}, {5.3, 2.2, -1.5}, {-1.2, 0.4, 3.2} }; int[] labels = {1, 1, -1, -1, -1}; Matrix X = new DenseMatrix(data); int T = 10; Classifier[] weakClassifiers = new Classifier[T]; for (int t = 0; t < 10; t++) { weakClassifiers[t] = new LogisticRegression(epsilon); } Classifier adaBoost = new AdaBoost(weakClassifiers); adaBoost.feedData(X); adaBoost.feedLabels(labels); tic(); adaBoost.train(); System.out.format("Elapsed time: %.2f seconds.%n", toc()); Xt = X.copy(); display(full(adaBoost.predictLabelMatrix(Xt))); display(adaBoost.predict(Xt)); accuracy = Classifier.getAccuracy(labels, adaBoost.predict(Xt)); fprintf("Accuracy for AdaBoost with logistic regression: %.2f%%\n", 100 * accuracy); # Output Elapsed time: 0.18 seconds. 1 0 1 0 0 1 0 1 0 1 1 1 -1 -1 -1 Accuracy: 100.00% # -------------------------------------------------------------------------- # # K-means double[][] data = { {3.5, 5.3, 0.2, -1.2}, {4.4, 2.2, 0.3, 0.4}, {1.3, 0.5, 4.1, 3.2} }; KMeansOptions options = new KMeansOptions(); options.nClus = 2; options.verbose = true; options.maxIter = 100; KMeans KMeans= new KMeans(options); KMeans.feedData(data); // KMeans.initialize(null); Matrix initializer = null; initializer = new SparseMatrix(3, 2); initializer.setEntry(0, 0, 1); initializer.setEntry(1, 1, 1); initializer.setEntry(2, 0, 1); KMeans.clustering(initializer); // Use null for random initialization System.out.println("Indicator Matrix:"); printMatrix(full(KMeans.getIndicatorMatrix())); # Output Iter 1: mse = 9.534 (0.000 secs) KMeans complete. Indicator Matrix: 0 1 0 1 1 0 # -------------------------------------------------------------------------- # # NMF double[][] data = { {3.5, 4.4, 1.3}, {5.3, 2.2, 0.5}, {0.2, 0.3, 4.1}, {1.2, 0.4, 3.2} }; KMeansOptions options = new KMeansOptions(); options.nClus = 2; options.verbose = true; options.maxIter = 100; KMeans KMeans= new KMeans(options); KMeans.feedData(data); KMeans.initialize(null); KMeans.clustering(); Matrix G0 = KMeans.getIndicatorMatrix(); NMFOptions NMFOptions = new NMFOptions(); NMFOptions.nClus = 2; NMFOptions.maxIter = 50; NMFOptions.verbose = true; NMFOptions.calc_OV = false; NMFOptions.epsilon = 1e-5; Clustering NMF = new NMF(NMFOptions); NMF.feedData(data); // NMF.initialize(null); NMF.clustering(G0); // If null, KMeans will be used for initialization System.out.println("Basis Matrix:"); printMatrix(full(NMF.getCenters())); System.out.println("Indicator Matrix:"); printMatrix(full(NMF.getIndicatorMatrix())); # Output Iter 1: mse = 13.060 (0.000 secs) Iter 2: mse = 2.875 (0.000 secs) KMeans complete. Iteration 10, delta G: 0.001012 Converge successfully! Basis Matrix: 5.0577 3.6774 0.5237 0.2326 0.3013 4.4328 Indicator Matrix: 0.8543 0.2004 0.8928 0 0.0000 0.9251 0.1520 0.7014 # -------------------------------------------------------------------------- # # L1NMF String dataMatrixFilePath = "CNN - DocTermCount.txt"; tic(); Matrix X = loadMatrixFromDocTermCountFile(dataMatrixFilePath); X = Matlab.getTFIDF(X); X = Matlab.normalizeByColumns(X); X = X.transpose(); KMeansOptions kMeansOptions = new KMeansOptions(); kMeansOptions.nClus = 10; kMeansOptions.maxIter = 50; kMeansOptions.verbose = true; KMeans KMeans = new KMeans(kMeansOptions); KMeans.feedData(X); // KMeans.initialize(null); KMeans.clustering(); Matrix G0 = KMeans.getIndicatorMatrix(); // Matrix X = Data.loadSparseMatrix("X.txt"); G0 = loadDenseMatrix("G0.txt"); L1NMFOptions L1NMFOptions = new L1NMFOptions(); L1NMFOptions.nClus = 10; L1NMFOptions.gamma = 1 * 0.0001; L1NMFOptions.mu = 1 * 0.1; L1NMFOptions.maxIter = 50; L1NMFOptions.verbose = true; L1NMFOptions.calc_OV = !true; L1NMFOptions.epsilon = 1e-5; Clustering L1NMF = new L1NMF(L1NMFOptions); L1NMF.feedData(X); // L1NMF.initialize(G0); L1NMF.clustering(G0); // Use null for random initialization System.out.format("Elapsed time: %.3f seconds\n", toc()); # Output Iter 1: mse = 1.524 (0.030 secs) Iter 2: mse = 0.816 (0.030 secs) Iter 3: mse = 0.806 (0.030 secs) Iter 4: mse = 0.805 (0.040 secs) KMeans complete. Iteration 10, delta G: 0.046591 Iteration 20, delta G: 0.047140 Iteration 30, delta G: 0.020651 Iteration 40, delta G: 0.010017 Iteration 50, delta G: 0.007973 Maximal iterations Elapsed time: 3.933 seconds # -------------------------------------------------------------------------- # # Spectral Clustering tic(); int nClus = 2; boolean verbose = false; int maxIter = 100; String graphType = "nn"; double graphParam = 2; String graphDistanceFunction = "euclidean"; String graphWeightType = "heat"; double graphWeightParam = 1; ClusteringOptions options = new SpectralClusteringOptions( nClus, verbose, maxIter, graphType, graphParam, graphDistanceFunction, graphWeightType, graphWeightParam); Clustering spectralClustering = new SpectralClustering(options); double[][] data = { {3.5, 5.3, 0.2, -1.2}, {4.4, 2.2, 0.3, 0.4}, {1.3, 0.5, 4.1, 3.2} }; spectralClustering.feedData(data); spectralClustering.clustering(null); display(full(spectralClustering.getIndicatorMatrix())); System.out.format("Elapsed time: %.3f seconds\n", toc()); # Output Computing directed adjacency graph... Creating the adjacency matrix. Nearest neighbors, N = 2. KMeans complete. Spectral clustering complete. 0 1 0 1 1 0 Elapsed time: 0.070 seconds # -------------------------------------------------------------------------- # # Computing Adjacency Matrix, Graph Laplacian and Local Learning Regularization String filePath = "CNN - DocTermCount.txt"; Matrix X = loadMatrixFromDocTermCountFile(filePath); int NSample = Math.min(20, X.getColumnDimension()); X = X.getSubMatrix(0, X.getRowDimension() - 1, 0, NSample - 1); System.out.println(String.format("%d samples loaded", X.getColumnDimension())); X = X.transpose(); GraphOptions options = new GraphOptions(); options.graphType = "nn"; String type = options.graphType; double NN = options.graphParam; System.out.println(String.format("Graph type: %s with NN: %d", type, (int)NN)); // Parameter setting for text data options.kernelType = "cosine"; options.graphDistanceFunction = "cosine"; // Parameter setting for image data /*options.kernelType = "rbf"; options.graphDistanceFunction = "euclidean";*/ options.graphNormalize = true; options.graphWeightType = "heat"; boolean show = true && !false; // Test adjacency function - pass tic(); String DISTANCEFUNCTION = options.graphDistanceFunction; Matrix A = adjacency(X, type, NN, DISTANCEFUNCTION); System.out.format("Elapsed time: %.2f seconds.%n", toc()); String adjacencyFilePath = "adjacency.txt"; saveMatrix(adjacencyFilePath, A); if (show) disp(A.getSubMatrix(0, 4, 0, 4)); // Test laplacian function - pass tic(); Matrix L = laplacian(X, type, options); System.out.format("Elapsed time: %.2f seconds.%n", toc()); String LaplacianFilePath = "Laplacian.txt"; saveMatrix(LaplacianFilePath, L); if (show) disp(L.getSubMatrix(0, 4, 0, 4)); // Test local learning regularization - pass NN = options.graphParam; String DISTFUNC = options.graphDistanceFunction; String KernelType = options.kernelType; double KernelParam = options.kernelParam; double lambda = 0.001; tic(); Matrix LLR_text = calcLLR(X, NN, DISTFUNC, KernelType, KernelParam, lambda); System.out.format("Elapsed time: %.2f seconds.%n", toc()); String LLRFilePath = "localLearningRegularization.txt"; saveMatrix(LLRFilePath, LLR_text); if (show) display(LLR_text.getSubMatrix(0, 4, 0, 4)); # Output 20 samples loaded Graph type: nn with NN: 6 Computing directed adjacency graph... Creating the adjacency matrix. Nearest neighbors, N = 6. Elapsed time: 0.28 seconds. Data matrix file written: adjacency.txt (2, 1) 0.8162 (3, 1) 0.8841 (1, 2) 0.8162 (3, 2) 0.9041 (4, 2) 0.9074 (1, 3) 0.8841 (2, 3) 0.9041 (2, 4) 0.9074 (5, 4) 0.9558 (4, 5) 0.9558 Computing Graph Laplacian... Computing directed adjacency graph... Creating the adjacency matrix. Nearest neighbors, N = 6. Elapsed time: 0.22 seconds. Data matrix file written: Laplacian.txt (1, 1) 1 (2, 1) -0.1522 (3, 1) -0.1118 (1, 2) -0.1522 (2, 2) 1 (3, 2) -0.1276 (4, 2) -0.1521 (1, 3) -0.1118 (2, 3) -0.1276 (3, 3) 1 (2, 4) -0.1521 (4, 4) 1 (5, 4) -0.1499 (4, 5) -0.1499 (5, 5) 1 Computing directed adjacency graph... Creating the adjacency matrix. Nearest neighbors, N = 6. Elapsed time: 0.26 seconds. Data matrix file written: localLearningRegularization.txt (1, 1) 1.0562 (2, 1) -0.2926 (3, 1) -0.1244 (4, 1) 0.0132 (1, 2) -0.2926 (2, 2) 1.0353 (3, 2) -0.1167 (4, 2) -0.1202 (5, 2) 0.0026 (1, 3) -0.1244 (2, 3) -0.1167 (3, 3) 1.0305 (4, 3) 0.0045 (1, 4) 0.0132 (2, 4) -0.1202 (3, 4) 0.0045 (4, 4) 1.0583 (5, 4) -0.0052 (2, 5) 0.0026 (4, 5) -0.0052 (5, 5) 1.0042 # -------------------------------------------------------------------------- # # Matrix Completion int m = 6; int r = 1; int p = (int) Math.round(m * m * 0.3); Matrix L = randn(m, r); Matrix R = randn(m, r); Matrix A_star = mtimes(L, R.transpose()); int[] indices = randperm(m * m); minusAssign(indices, 1); indices = linearIndexing(indices, colon(0, p - 1)); Matrix Omega = zeros(size(A_star)); linearIndexingAssignment(Omega, indices, 1); Matrix D = zeros(size(A_star)); linearIndexingAssignment(D, indices, linearIndexing(A_star, indices)); Matrix E_star = D.minus(A_star); logicalIndexingAssignment(E_star, Omega, 0); // Run matrix completion MatrixCompletion matrixCompletion = new MatrixCompletion(); matrixCompletion.feedData(D); matrixCompletion.feedIndices(Omega); tic(); matrixCompletion.run(); fprintf("Elapsed time: %.2f seconds.%n", toc()); // Output Matrix A_hat = matrixCompletion.GetLowRankEstimation(); fprintf("A*:\n"); disp(A_star, 4); fprintf("A^:\n"); disp(A_hat, 4); fprintf("D:\n"); disp(D, 4); fprintf("rank(A*): %d\n", rank(A_star)); fprintf("rank(A^): %d\n", rank(A_hat)); fprintf("||A* - A^||_F: %.4f\n", norm(A_star.minus(A_hat), "fro")); # Output Elapsed time: 0.07 seconds. A*: 1.1683 -0.4309 1.7763 -0.3742 -0.4096 -0.6490 -2.9765 1.0978 -4.5257 0.9533 1.0437 1.6536 -0.4342 0.1601 -0.6602 0.1391 0.1522 0.2412 1.6696 -0.6158 2.5386 -0.5347 -0.5854 -0.9276 -0.6291 0.2320 -0.9566 0.2015 0.2206 0.3495 0.4162 -0.1535 0.6329 -0.1333 -0.1459 -0.2312 A^: 1.1683 -0.4309 -0.0000 -0.3742 -0.4096 -0.6490 -2.9765 1.0978 0.0000 0.9533 1.0437 1.6536 -0.4342 0.1601 0.0000 0.1391 0.1522 0.2412 1.6696 -0.6158 -0.0000 -0.5347 -0.5854 -0.9276 -0.6291 0.2320 0.0000 0.2015 0.2206 0.3495 0.4162 -0.1535 -0.0000 -0.1333 -0.1459 -0.2312 D: 0 0 0 0 0 -0.6490 -2.9765 0 0 0 1.0437 1.6536 0 0 0 0.1391 0.1522 0 1.6696 -0.6158 0 -0.5347 0 0 0 0 0 0 0.2206 0 0 0 0 -0.1333 0 0 rank(A*): 1 rank(A^): 1 ||A* - A^||_F: 5.6420 # -------------------------------------------------------------------------- # # Robust PCA int m = 8; int r = m / 4; Matrix L = randn(m, r); Matrix R = randn(m, r); Matrix A_star = mtimes(L, R.transpose()); Matrix E_star = zeros(size(A_star)); int[] indices = randperm(m * m); int nz = m * m / 20; int[] nz_indices = new int[nz]; for (int i = 0; i < nz; i++) { nz_indices[i] = indices[i] - 1; } Matrix E_vec = vec(E_star); setSubMatrix(E_vec, nz_indices, new int[] {0}, (minus(rand(nz, 1), 0.5).times(100))); E_star = reshape(E_vec, size(E_star)); // Input Matrix D = A_star.plus(E_star); double lambda = 1 * Math.pow(m, -0.5); // Run Robust PCA RobustPCA robustPCA = new RobustPCA(lambda); robustPCA.feedData(D); tic(); robustPCA.run(); fprintf("Elapsed time: %.2f seconds.%n", toc()); // Output Matrix A_hat = robustPCA.GetLowRankEstimation(); Matrix E_hat = robustPCA.GetErrorMatrix(); fprintf("A*:\n"); disp(A_star, 4); fprintf("A^:\n"); disp(A_hat, 4); fprintf("E*:\n"); disp(E_star, 4); fprintf("E^:\n"); disp(E_hat, 4); fprintf("rank(A*): %d\n", rank(A_star)); fprintf("rank(A^): %d\n", rank(A_hat)); fprintf("||A* - A^||_F: %.4f\n", norm(A_star.minus(A_hat), "fro")); fprintf("||E* - E^||_F: %.4f\n", norm(E_star.minus(E_hat), "fro")); # Output Elapsed time: 0.05 seconds. A*: 0.4542 1.4239 -0.7884 0.9424 -1.9555 0.3413 -0.9643 1.9974 3.2620 -0.5927 -1.7637 0.6375 1.4598 -1.2037 1.9060 -1.8263 -1.9222 -0.4894 1.3415 -0.8509 0.3416 0.4260 -0.4386 -0.1774 -1.3192 0.7056 0.5454 0.0062 -1.2580 0.6442 -1.1511 1.4349 -2.2590 0.3961 1.2266 -0.4496 -0.9903 0.8288 -1.3082 1.2432 3.2288 1.0411 -2.3324 1.5534 -0.8878 -0.6416 0.5579 0.6255 1.7459 -0.1703 -0.9969 0.4244 0.5708 -0.5946 0.9002 -0.7579 1.0295 -1.0952 -0.2294 -0.3135 1.7622 -0.6867 1.3429 -1.9339 A^: 0.4296 1.4239 -0.7884 0.9422 -1.9322 0.3282 -0.9459 1.9804 3.1412 -0.5927 -1.7637 0.6375 1.4596 -1.2037 1.9060 -1.8263 -1.8337 -0.4894 1.3369 -0.8509 0.3325 0.4261 -0.4385 -0.1774 -1.3192 -0.4311 0.5454 -0.1480 0.2049 0.6442 -0.7934 0.4356 -2.1752 0.3960 1.2266 -0.4496 -0.9903 0.8287 -1.3082 1.2432 2.6885 1.0320 -1.9370 1.2690 -0.8878 -0.6340 0.5579 0.5161 1.6787 -0.1704 -0.9966 0.4247 0.5725 -0.5947 0.9002 -0.7579 1.0103 -1.0939 -0.2342 -0.3135 1.7505 -0.6867 1.3428 -1.9330 E*: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -20.5412 0 0 34.4119 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 20.5941 0 0 0 0 0 0 0 0 0 0 0 E^: 0.0245 0 0 0.0002 -0.0233 0.0131 -0.0184 0.0170 0.1208 0 0 0 0.0001 -0.0001 0.0000 0 -0.0885 0 0.0046 -0.0000 0.0091 -0.0001 -0.0001 0 0 -19.4046 0 0.1542 32.9490 0 -0.3577 0.9993 -0.0839 0.0001 0.0000 0 0 0.0000 0.0000 0.0000 0.5403 0.0091 -0.3953 0.2844 0 -0.0076 0 0.1094 0.0672 0.0001 -0.0004 -0.0003 20.5923 0.0000 0 0 0.0193 -0.0014 0.0048 0 0.0117 0 0.0001 -0.0009 rank(A*): 2 rank(A^): 4 ||A* - A^||_F: 2.2716 ||E* - E^||_F: 2.2716 # -------------------------------------------------------------------------- # # LASSO double[][] data = {{1, 2, 3, 2}, {4, 2, 3, 6}, {5, 1, 2, 1}}; double[][] depVars = {{3, 2}, {2, 3}, {1, 4}}; Options options = new Options(); options.maxIter = 600; options.lambda = 0.05; options.verbose = !true; options.calc_OV = !true; options.epsilon = 1e-5; Regression LASSO = new LASSO(options); LASSO.feedData(data); LASSO.feedDependentVariables(depVars); tic(); LASSO.train(); fprintf("Elapsed time: %.3f seconds\n\n", toc()); fprintf("Projection matrix:\n"); display(LASSO.W); Matrix Yt = LASSO.predict(data); fprintf("Predicted dependent variables:\n"); display(Yt); # Output Elapsed time: 0.060 seconds Projection matrix: -0.2295 0.5994 0 0 1.1058 0.5858 -0.0631 -0.1893 Predicted dependent variables: 2.9618 1.9782 2.0209 3.0191 1.0009 3.9791 # -------------------------------------------------------------------------- # # Linear Regression double[][] data = { {1, 2, 3, 2}, {4, 2, 3, 6}, {5, 1, 4, 1} }; double[][] depVars = { {3, 2}, {2, 3}, {1, 4} }; Options options = new Options(); options.maxIter = 600; options.lambda = 0.1; options.verbose = !true; options.calc_OV = !true; options.epsilon = 1e-5; Regression LR = new LinearRegression(options); LR.feedData(data); LR.feedDependentVariables(depVars); tic(); LR.train(); fprintf("Elapsed time: %.3f seconds\n\n", toc()); fprintf("Projection matrix:\n"); display(LR.W); fprintf("Bias vector:\n"); display(((LinearRegression)LR).B); Matrix Yt = LR.predict(data); fprintf("Predicted dependent variables:\n"); display(Yt); # Output Elapsed time: 0.025 seconds Projection matrix: -0.4700 0.4049 0.5621 0.1216 0.6117 0.4454 0.1163 -0.0513 Bias vector: 0.2345 0.1131 Predicted dependent variables: 2.9561 1.9949 2.0111 3.0043 1.0093 3.9895 # -------------------------------------------------------------------------- # # Basic Conditional Random Field (CRF) // Number of data sequences int D = 1000; // Minimal length for the randomly generated data sequences int n_min = 4; // Maximal length for the randomly generated data sequences int n_max = 6; // Number of feature functions int d = 10; // Number of states int N = 2; // Sparseness for the feature matrices double sparseness = 0.2; // Randomly generate labeled sequential data for CRF Object[] dataSequences = CRF.generateDataSequences(D, n_min, n_max, d, N, sparseness); Matrix[][][] Fs = (Matrix[][][]) dataSequences[0]; int[][] Ys = (int[][]) dataSequences[1]; // Train a CRF model for the randomly generated sequential data with labels double epsilon = 1e-4; CRF CRF = new CRF(epsilon); CRF.feedData(Fs); CRF.feedLabels(Ys); CRF.train(); // Save the CRF model String modelFilePath = "CRF-Model.dat"; CRF.saveModel(modelFilePath); fprintf("CRF Parameters:\n"); display(CRF.W); // Prediction CRF = new CRF(); CRF.loadModel(modelFilePath); int ID = new Random().nextInt(D); int[] Yt = Ys[ID]; Matrix[][] Fst = Fs[ID]; fprintf("True label sequence:\n"); display(Yt); fprintf("Predicted label sequence:\n"); display(CRF.predict(Fst)); # Output Initial ofv: 46.6525 Iter 1, ofv: 45.9559, norm(Grad): 0.475827 Iter 2, ofv: 33.8123, norm(Grad): 0.472433 Iter 3, ofv: 25.6766, norm(Grad): 0.335332 Objective function value doesn't decrease, iteration stopped! Iter 4, ofv: 25.6766, norm(Grad): 0.0901032 Model saved. CRF Parameters: -5.7255 6.0560 5.8877 -7.7452 11.2735 -4.5474 -0.6763 3.6997 -4.3791 0.6701 Loading model... Model loaded. True label sequence: 1 0 1 0 Predicted label sequence: P*(YPred|x) = 0.624515 1 1 1 0 # -------------------------------------------------------------------------- # # Hidden Markov Model (HMM) int numStates = 3; int numObservations = 2; double epsilon = 1e-8; int maxIter = 10; double[] pi = new double[] {0.33, 0.33, 0.34}; double[][] A = new double[][] { {0.5, 0.3, 0.2}, {0.3, 0.5, 0.2}, {0.2, 0.4, 0.4} }; double[][] B = new double[][] { {0.7, 0.3}, {0.5, 0.5}, {0.4, 0.6} }; // Generate the data sequences for training int D = 10000; int T_min = 5; int T_max = 10; int[][][] data = HMM.generateDataSequences(D, T_min, T_max, pi, A, B); int[][] Os = data[0]; int[][] Qs = data[1]; boolean trainHMM = !false; if (trainHMM){ HMM HMM = new HMM(numStates, numObservations, epsilon, maxIter); HMM.feedData(Os); HMM.feedLabels(Qs); HMM.train(); fprintf("True Model Parameters: \n"); fprintf("Initial State Distribution: \n"); display(pi); fprintf("State Transition Probability Matrix: \n"); display(A); fprintf("Observation Probability Matrix: \n"); display(B); fprintf("Trained Model Parameters: \n"); fprintf("Initial State Distribution: \n"); display(HMM.pi); fprintf("State Transition Probability Matrix: \n"); display(HMM.A); fprintf("Observation Probability Matrix: \n"); display(HMM.B); String HMMModelFilePath = "HMMModel.dat"; HMM.saveModel(HMMModelFilePath); } // Predict the single best state path int ID = new Random().nextInt(D); int[] O = Os[ID]; HMM HMMt = new HMM(); HMMt.loadModel("HMMModel.dat"); int[] Q = HMMt.predict(O); fprintf("Observation sequence: \n"); HMMt.showObservationSequence(O); fprintf("True state sequence: \n"); HMMt.showStateSequence(Qs[ID]); fprintf("Predicted state sequence: \n"); HMMt.showStateSequence(Q); double p = HMMt.evaluate(O); System.out.format("P(O|Theta) = %f\n", p); # Output Iter: 1, log[P(O|Theta)]: -51857.728975 Iter: 2, log[P(O|Theta)]: -51857.679272 Iter: 3, log[P(O|Theta)]: -51857.633951 Iter: 4, log[P(O|Theta)]: -51857.592261 Iter: 5, log[P(O|Theta)]: -51857.553794 Iter: 6, log[P(O|Theta)]: -51857.518197 Iter: 7, log[P(O|Theta)]: -51857.485156 Iter: 8, log[P(O|Theta)]: -51857.454400 Iter: 9, log[P(O|Theta)]: -51857.425690 Iter: 10, log[P(O|Theta)]: -51857.398814 True Model Parameters: Initial State Distribution: 0.3300 0.3300 0.3400 State Transition Probability Matrix: 0.5000 0.3000 0.2000 0.3000 0.5000 0.2000 0.2000 0.4000 0.4000 Observation Probability Matrix: 0.7000 0.3000 0.5000 0.5000 0.4000 0.6000 Trained Model Parameters: Initial State Distribution: 0.3309 0.3269 0.3422 State Transition Probability Matrix: 0.5000 0.3067 0.1934 0.2959 0.5031 0.2010 0.2037 0.3940 0.4023 Observation Probability Matrix: 0.7030 0.2970 0.4957 0.5043 0.3956 0.6044 Model saved. Loading model... Model loaded. Observation sequence: 1 0 1 0 1 0 0 True state sequence: 1 1 1 1 1 0 2 Predicted state sequence: 1 1 1 1 1 0 0 P(O|Theta) = 0.007996 # -------------------------------------------------------------------------- # # Isomap double[][] data = {{0, 2, 3, 4}, {2, 0, 4, 5}, {3, 4.1, 5, 6}, {2, 7, 1, 6}}; Matrix X = new DenseMatrix(data); X = X.transpose(); int K = 3; int r = 3; Matrix R = Isomap.run(X, K, r); disp("Original Data:"); disp(X); disp("Reduced Data:"); disp(R); # Output Computing directed adjacency graph... Creating the adjacency matrix. Nearest neighbors, N = 3. Original Data: 0 2 3 2 2 0 4.1000 7 3 4 5 1 4 5 6 6 Reduced Data: 2.2473 2.5240 -0.7279 -3.4592 2.1429 0.4972 2.9742 -1.3689 0.7943 -1.7622 -3.2981 -0.5637 # -------------------------------------------------------------------------- # # Kernel PCA double[][] data = { {0, 2, 3, 4}, {2, 0, 4, 5}, {3, 4.1, 5, 6}, {2, 7, 1, 6} }; Matrix X = new DenseMatrix(data).transpose(); int r = 3; Matrix R = KernelPCA.run(X, r); disp("Original Data:"); disp(X); disp("Reduced Data:"); disp(R); # Output Original Data: 0 2 3 2 2 0 4.1000 7 3 4 5 1 4 5 6 6 Reduced Data: 0.5011 0.0000 0.7063 -0.4987 -0.7080 -0.0001 0.4990 0.0018 -0.7078 -0.5013 0.7062 0.0017 # -------------------------------------------------------------------------- # # Locally Linear Embedding (LLE) double[][] data = { {0, 2, 3, 4}, {2, 0, 4, 5}, {3, 4.1, 5, 6}, {2, 7, 1, 6} }; Matrix X = new DenseMatrix(data).transpose(); int K = 3; int r = 3; Matrix R = LLE.run(X, K, r); disp("Original Data:"); disp(X); disp("Reduced Data:"); disp(R); # Output Computing directed adjacency graph... Creating the adjacency matrix. Nearest neighbors, N = 3. Original Data: 0 2 3 2 2 0 4.1000 7 3 4 5 1 4 5 6 6 Reduced Data: -1.0360 -0.8728 1.0793 1.0282 -1.1127 -0.8395 -0.9630 0.8727 -1.1450 0.9707 1.1128 0.9051 # -------------------------------------------------------------------------- # # Multi-dimensional Scaling (MDS) double[][] data = { {0, 2, 3, 4}, {2, 0, 4, 5}, {3, 4.1, 5, 6}, {2, 7, 1, 6} }; Matrix O = new DenseMatrix(data).transpose(); Matrix D = l2Distance(O, O); Matrix X = MDS.run(D, 3); disp("Reduced X:"); disp(X); # Output Reduced X: 2.2473 2.5240 -0.7279 -3.4592 2.1429 0.4972 2.9742 -1.3689 0.7943 -1.7622 -3.2981 -0.5637 # -------------------------------------------------------------------------- # # Principal Component Analysis (PCA) double[][] data = { {0, 2, 3, 4}, {2, 0, 4, 5}, {3, 4.1, 5, 6}, {2, 7, 1, 6} }; Matrix X = new DenseMatrix(data).transpose(); int r = 3; Matrix R = PCA.run(X, r); disp("Original Data:"); disp(X); disp("Reduced Data:"); disp(R); # Output Original Data: 0 2 3 2 2 0 4.1000 7 3 4 5 1 4 5 6 6 Reduced Data: -2.2473 2.5240 0.7279 3.4592 2.1429 -0.4972 -2.9742 -1.3689 -0.7943 1.7622 -3.2981 0.5637 # -------------------------------------------------------------------------- # # General Quadratic Programming /* * min 2 \ x' * Q * x + c' * x * s.t. A * x = b * B * x <= d */ /* * Number of unknown variables */ int n = 5; /* * Number of inequality constraints */ int m = 6; /* * Number of equality constraints */ int p = 3; Matrix x = rand(n, n); Matrix Q = x.mtimes(x.transpose()).plus(times(rand(1), eye(n))); Matrix c = rand(n, 1); double HasEquality = 1; Matrix A = times(HasEquality, rand(p, n)); x = rand(n, 1); Matrix b = A.mtimes(x); Matrix B = rand(m, n); double rou = -2; Matrix d = plus(B.mtimes(x), times(rou, ones(m, 1))); QPSolution solution = GeneralQP.solve(Q, c, A, b, B, d); # Output Phase I: Terminate successfully. x_opt: 640.1439 6991.6488 -2975.2211 -6851.4134 -6405.0501 s_opt: 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 lambda for the inequalities s_i >= 0: 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 B * x - d: -2123.5480 -2842.1394 -4496.4653 -1640.3370 -3613.3843 -14582.9937 lambda for the inequalities fi(x) <= s_i: 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 nu for the equalities A * x = b: -0.0000 -0.0000 -0.0000 residual: 1.07222e-11 A * x - b: 0.0000 0.0000 -0.0000 norm(A * x - b, "fro"): 0.000000 fval_opt: 3.65174e-11 The problem is feasible. Computation time: 0.590000 seconds halt execution temporarily in 1 seconds... Phase II: Terminate successfully. residual: 9.83544e-12 Optimal objective function value: 173.050 Optimizer: 1.4528 7.8845 -3.0459 -7.3020 -6.5998 B * x - d: -0.0000 -1.4506 -3.0662 -0.0000 -2.0318 -14.1904 lambda: 52.4785 0.0000 0.0000 137.3581 0.0000 0.0000 nu: -19.7244 -38.6926 -118.0042 norm(A * x - b, "fro"): 0.000000 Computation time: 0.061000 seconds # -------------------------------------------------------------------------- # # Quadratic Programming with Bound Constraints /* * min 2 \ x' * Q * x + c' * x * s.t. l <= x <= u */ int n = 5; Matrix x = rand(n); Matrix Q = minus(x.mtimes(x.transpose()), times(rand(1).getEntry(0, 0), eye(n))); Matrix c = plus(-2, times(2, rand(n, 1))); double l = 0; double u = 1; double epsilon = 1e-6; QPSolution S = QPWithBoundConstraints.solve(Q, c, l, u, epsilon); disp("Q:"); disp(Q); disp("c:"); disp(c); fprintf("Optimum: %g\n", S.optimum); fprintf("Optimizer:\n"); display(S.optimizer.transpose()); # Output Initial ofv: -0.678917 Iter 1, ofv: -1.28937, norm(PGrad): 3.08886 PLBFGS converges with norm(PGrad) 0.000000 Q: 1.6699 1.4719 0.9729 0.9943 1.6691 1.4719 0.5587 0.5777 0.6273 1.1480 0.9729 0.5777 1.4139 0.6709 1.0215 0.9943 0.6273 0.6709 0.4050 0.6021 1.6691 1.1480 1.0215 0.6021 1.0643 c: -0.3002 -1.7387 -1.7054 -1.8646 -1.9046 Optimum: -1.28937 Optimizer: 0 0.0469 0 0.7148 0 ************************************************** For other general-purpose optimization algorithms such as L-BFGS and accelerated proximal gradient, please refer to the documentation. # -------------------------------------------------------------------------- # # LDA int[][] documents = { {1, 4, 3, 2, 3, 1, 4, 3, 2, 3, 1, 4, 3, 2, 3, 6}, {2, 2, 4, 2, 4, 2, 2, 2, 2, 4, 2, 2}, {1, 6, 5, 6, 0, 1, 6, 5, 6, 0, 1, 6, 5, 6, 0, 0}, {5, 6, 6, 2, 3, 3, 6, 5, 6, 2, 2, 6, 5, 6, 6, 6, 0}, {2, 2, 4, 4, 4, 4, 1, 5, 5, 5, 5, 5, 5, 1, 1, 1, 1, 0}, {5, 4, 2, 3, 4, 5, 6, 6, 5, 4, 3, 2} }; LDAOptions LDAOptions = new LDAOptions(); LDAOptions.nTopic = 2; LDAOptions.iterations = 5000; LDAOptions.burnIn = 1500; LDAOptions.thinInterval = 200; LDAOptions.sampleLag = 10; LDAOptions.alpha = 2; LDAOptions.beta = 0.5; LDA LDA = new LDA(LDAOptions); LDA.readCorpus(documents); LDA.train(); fprintf("Topic--term associations: \n"); display(LDA.topicMatrix); fprintf("Document--topic associations: \n"); display(LDA.indicatorMatrix); # Output Topic--term associations: 0.1258 0.0176 0.1531 0.0846 0.0327 0.3830 0.0418 0.1835 0.0360 0.2514 0.2713 0.0505 0.3393 0.0294 Document--topic associations: 0.2559 0.7441 0.1427 0.8573 0.8573 0.1427 0.6804 0.3196 0.5491 0.4509 0.4420 0.5580 # -------------------------------------------------------------------------- # # Factorization Machines (FM) String trainFilePath = "Train.txt"; String testFilePath = "Test.txt"; String outputFilePath = "FM-YijPredOnTest.txt"; // Load training data int idxStart = 0; FM.feedTrainingData(trainFilePath, idxStart); // Initialization FM.allocateResource(k); FM.feedParams(maxIter, lambda); FM.initialize(); // Train FM model parameters by training data FM.train(); // Prediction: generate and save FM-YijPredOnTest.txt DataSet testData = FM.loadData(testFilePath, 0); double[] Yij_pred = FM.predict(testData.X); ml.utils.IO.save(Yij_pred, outputFilePath); # -------------------------------------------------------------------------- # # STructured Sparse Regression (STSR) double lambda = 0.01; double nu = 0.00001; int maxIter = 30; String trainFilePath = "Train.txt"; String testFilePath = "Test.txt"; String outputFilePath = "STSR-YijPredOnTest.txt"; // Load training data int idxStart = 0; STSR.feedTrainingData(trainFilePath, idxStart); // Build tree structured pair groups /* * featureSize format: * User[\t]383 * Item[\t]1175 * Event[\t]1 */ String featureSizeFilePath = "FeatureSize.txt"; /* * Each line is a group feature index pair (idx1, idx2) separated by * a tab character, e.g. * (157, 158) * (157, 236) * (24, 157)[\t](157, 158)[\t](157, 236) */ String userFeatureGroupListFilePath = "UserTreeStructuredPairGroupList.txt"; String itemFeatureGroupListFilePath = "ItemTreeStructuredPairGroupList.txt"; STSR.buildTreeStructuredPairGroupList( featureSizeFilePath, userFeatureGroupListFilePath, itemFeatureGroupListFilePath ); // Initialization STSR.allocateResource(); STSR.feedParams(maxIter, lambda, nu); STSR.initialize(); // Train STSR model parameters by training data STSR.train(); // Prediction: generate and save STSR-YijPredOnTest.txt DataSet testData = STSR.loadData(testFilePath, idxStart); double[] Yij_pred = STSR.predict(testData.X); ml.utils.IO.save(Yij_pred, outputFilePath); # -------------------------------------------------------------------------- # ----------------------------------- Author: Mingjie Qian Version: 1.6.5 Date: April 25th, 2016
About
A stand-alone pure Java library for linear algebra and machine learning
Topics
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published