
Commit

Updated documentation
Tim Thatcher committed May 1, 2016
1 parent 80bf791 commit ac816d7
Showing 4 changed files with 183 additions and 26 deletions.
145 changes: 143 additions & 2 deletions README.md
@@ -33,6 +33,147 @@ are quadratics:
<p align="center"><img alt="Quadratic Discriminant Analysis" src="doc/visualization/qda.png" /></p>


## Getting Started

If the package has not yet been installed, run the following line of code:

```julia
Pkg.add("DiscriminantAnalysis")
```

Sample code is available in [example.jl](example/example.jl). To get started,
run the following block of code:

```julia
using DataFrames, DiscriminantAnalysis;

iris_df = readtable("iris.csv");
pool!(iris_df, [:Species]); # Convert the species column to a pooled data vector
```

Although [DataFrames.jl](https://github.com/JuliaStats/DataFrames.jl) is not
required, it helps to stage the data. The package expects a data matrix with
observations in either row-major or column-major order. We then define the
data matrix ``X`` and the class reference vector ``y`` (a vector of 1-based
class indicators):

```julia
X = convert(Array{Float64}, iris_df[[:PetalLength, :PetalWidth, :SepalLength, :SepalWidth]]);
y = iris_df[:Species].refs; # Class indices
```

A Linear Discriminant model can be created using the ``lda`` function:

```julia
julia> model1 = lda(X, y)
Linear Discriminant Model

Regularization Parameters:
γ = N/A

Class Priors:
Class 1 Probability: 33.3333%
Class 2 Probability: 33.3333%
Class 3 Probability: 33.3333%

Class Means:
Class 1 Mean: [1.464, 0.244, 5.006, 3.418]
Class 2 Mean: [4.26, 1.326, 5.936, 2.77]
Class 3 Mean: [5.552, 2.026, 6.588, 2.974]
```

Once the model is defined, we can use it to classify new data with the
``classify`` function:

```julia
julia> y_pred1 = classify(model1, X);

julia> accuracy1 = sum(y_pred1 .== y)/length(y)
0.98
```
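
The underlying discriminant scores can also be computed with the unexported
``discriminants`` function (used by
[visualization.jl](example/visualization.jl)); it returns one column of scores
per class, and ``classify`` effectively selects the class with the largest
score:

```julia
δ = DiscriminantAnalysis.discriminants(model1, X)   # n×k matrix of class scores
```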

By default, the LDA model assumes that the data matrix is in row-major ordering
of observations, that the class prior probabilities are equal, and that the
regularization parameter ``gamma`` is unspecified. Each of these defaults can
be overridden:

```julia
julia> model2 = lda(X', y, order = Val{:col}, gamma = 0.1, priors = [2/5; 2/5; 1/5])
Linear Discriminant Model

Regularization Parameters:
γ = 0.1

Class Priors:
Class 1 Probability: 40.0%
Class 2 Probability: 40.0%
Class 3 Probability: 20.0%

Class Means:
Class 1 Mean: [1.464, 0.244, 5.006, 3.418]
Class 2 Mean: [4.26, 1.326, 5.936, 2.77]
Class 3 Mean: [5.552, 2.026, 6.588, 2.974]



julia> y_pred2 = classify(model2, X');

julia> accuracy2 = sum(y_pred2 .== y)/length(y)
0.98
```

We may also fit a Quadratic Discriminant model using the ``qda`` function;
unlike LDA, a QDA model estimates a separate covariance matrix for each class:

```julia
julia> model3 = qda(X, y, lambda = 0.1, gamma = 0.1, priors = [1/3; 1/3; 1/3])
Quadratic Discriminant Model

Regularization Parameters:
γ = 0.1
λ = 0.1

Class Priors:
Class 1 Probability: 33.3333%
Class 2 Probability: 33.3333%
Class 3 Probability: 33.3333%

Class Means:
Class 1 Mean: [1.464, 0.244, 5.006, 3.418]
Class 2 Mean: [4.26, 1.326, 5.936, 2.77]
Class 3 Mean: [5.552, 2.026, 6.588, 2.974]



julia> y_pred3 = classify(model3, X);

julia> accuracy3 = sum(y_pred3 .== y)/length(y)
0.9866666666666667
```

Finally, if the number of classes is less than or equal to the number of
predictor variables, a Canonical Discriminant model may be used in place of a
Linear Discriminant model to simultaneously fit the classifier and reduce its
dimensionality:

```julia
julia> model4 = cda(X, y, gamma = Nullable{Float64}(), priors = [1/3; 1/3; 1/3])
Canonical Discriminant Model

Regularization Parameters:
γ = N/A

Class Priors:
Class 1 Probability: 33.3333%
Class 2 Probability: 33.3333%
Class 3 Probability: 33.3333%

Class Means:
Class 1 Mean: [1.464, 0.244, 5.006, 3.418]
Class 2 Mean: [4.26, 1.326, 5.936, 2.77]
Class 3 Mean: [5.552, 2.026, 6.588, 2.974]



julia> y_pred4 = classify(model4, X);

julia> accuracy4 = sum(y_pred4 .== y)/length(y)
0.98
```
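
Since the whitening matrix ``W`` of a CDA model also encodes the
dimensionality reduction, the canonical coordinates can be recovered by
projecting the data through it. A minimal sketch, assuming row-major data (the
exact centering applied to ``X`` may differ by implementation):

```julia
Z = X * model4.W   # with 3 classes, at most 3 - 1 = 2 canonical coordinates
```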
44 changes: 30 additions & 14 deletions doc/index.rst
@@ -175,9 +175,11 @@ LDA), a whitening matrix :math:`\mathbf{W}_k` is computed such that:
= \mathbf{W}_k^{\intercal} \mathbf{\Sigma}_k \mathbf{W}_k
= I \quad \implies \quad \mathbf{W} = \mathbf{\Sigma}^{-1/2}
This is accomplished using a QR or singular value decomposition of the data
matrix where possible. When the covariance matrix must be calculated directly,
the Cholesky decomposition is used to whiten the data instead.
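
For example, if :math:`\mathbf{\Sigma} = \mathbf{U}^{\intercal}\mathbf{U}` is
the Cholesky decomposition of the covariance matrix, then
:math:`\mathbf{W} = \mathbf{U}^{-1}` satisfies
:math:`\mathbf{W}^{\intercal}\mathbf{\Sigma}\mathbf{W} = \mathbf{I}`. A minimal
sketch of this approach (not the package's internal code):

.. code-block:: julia

    Σ = cov(X)    # p×p covariance matrix (observations in rows)
    U = chol(Σ)   # upper-triangular factor with Σ = U'U (Julia 0.4 syntax)
    W = inv(U)    # whitening matrix: W'ΣW ≈ I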

Once the whitening matrix has been computed, we can then use the transformation:

.. math::
@@ -203,23 +205,35 @@ Package Interface

.. note::

Data matrices may be stored in either row-major or column-major ordering of
observations. Row-major ordering means that each row corresponds to an
observation, and column-major ordering means that each column corresponds to
an observation:

.. math:: \mathbf{X}_{row} =
\begin{bmatrix}
\leftarrow \mathbf{x}_1 \rightarrow \\
\leftarrow \mathbf{x}_2 \rightarrow \\
\vdots \\
\leftarrow \mathbf{x}_n \rightarrow
\end{bmatrix}

\qquad
\mathbf{X}_{col} =
\begin{bmatrix}
\uparrow & \uparrow & & \uparrow \\
\mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_n \\
\downarrow & \downarrow & & \downarrow
\end{bmatrix}

In DiscriminantAnalysis.jl, the input data matrix ``X`` is assumed to be
stored in the same format as a `design matrix`_ in statistics (row-major) by
default. This ordering can be switched between row-major and column-major by
setting the ``order`` argument to ``Val{:row}`` and ``Val{:col}``,
respectively.
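
For example, fitting the same model from row-major and column-major data
(mirroring the README example)::

    model_row = lda(X, y)                      # X is n×p; one observation per row
    model_col = lda(X', y, order = Val{:col})  # X' is p×n; one observation per column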

.. _design matrix: https://en.wikipedia.org/wiki/Design_matrix

.. function:: lda(X, y [; order, M, priors, gamma])

Fit a regularized linear discriminant model based on data ``X`` and class
identifier ``y``. ``X`` must be a matrix of floats and ``y`` must be a
@@ -248,24 +262,25 @@ Package Interface
object returned by the ``lda`` function:

========== =====================================================
Field      Description
========== =====================================================
``is_cda`` Boolean value; the model is a CDA model if ``true``
``W`` The whitening matrix used to decorrelate observations
``order`` The ordering of observations in the data matrix
``M`` A matrix of class means; one per row
``priors`` A vector of class prior probabilities
``gamma`` The regularization parameter as defined above.
========== =====================================================
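
For example, the fitted class means and priors can be read directly from the
returned object (a sketch based on the field table above)::

    model = lda(X, y)
    model.M       # matrix of class means (one row per class)
    model.priors  # vector of class prior probabilities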


.. function:: cda(X, y [; order, M, priors, gamma])

Fit a regularized canonical discriminant model based on data ``X`` and class
identifier ``y``. The CDA model is identical to an LDA model, except that
dimensionality reduction is included in the whitening transformation matrix.
See the ``lda`` documentation for information on the arguments.

.. function:: qda(X, y [; order, M, priors, gamma, lambda])

Fit a regularized quadratic discriminant model based on data ``X`` and class
identifier ``y``. ``X`` must be a matrix of floats and ``y`` must be a
@@ -300,9 +315,10 @@ Package Interface
object returned by the ``qda`` function:

========== =====================================================
Field      Description
========== =====================================================
``W_k`` The vector of whitening matrices (one per class)
``order`` The ordering of observations in the data matrix
``M`` A matrix of class means; one per row
``priors`` A vector of class prior probabilities
``gamma`` The regularization parameter as defined above.
18 changes: 9 additions & 9 deletions example/example.jl
@@ -1,34 +1,34 @@
using DataFrames, DiscriminantAnalysis;

iris_df = readtable("iris.csv");
pool!(iris_df, [:Species]); # Convert the species column to a pooled data vector

X = convert(Array{Float64}, iris_df[[:PetalLength, :PetalWidth, :SepalLength, :SepalWidth]]);
y = iris_df[:Species].refs; # Class indices


#== Fitting the LDA model ==#

model1 = lda(X, y)
y_pred1 = classify(model1, X);
accuracy1 = sum(y_pred1 .== y)/length(y)

# The following illustrates column-major ordering with gamma-regularization of 0.1 and non-uniform
# class priors:
model2 = lda(X', y, order = Val{:col}, gamma = 0.1, priors = [2/5; 2/5; 1/5])
y_pred2 = classify(model2, X');
accuracy2 = sum(y_pred2 .== y)/length(y)


#== Fitting the QDA model ==#

model3 = qda(X, y, lambda = 0.1, gamma = 0.1, priors = [1/3; 1/3; 1/3])
y_pred3 = classify(model3, X);
accuracy3 = sum(y_pred3 .== y)/length(y)


#== Fitting the CDA model ==#

model4 = cda(X, y, gamma = Nullable{Float64}(), priors = [1/3; 1/3; 1/3])
y_pred4 = classify(model4, X);
accuracy4 = sum(y_pred4 .== y)/length(y)
2 changes: 1 addition & 1 deletion example/visualization.jl
@@ -21,7 +21,7 @@ function boundary(model, xrange, yrange, is_quad::Bool = false)
vec(Float64[y for x in xrange, y in yrange]))
δ = DiscriminantAnalysis.discriminants(model, is_quad ? hcat(Z, Z.^2, Z[:,1] .* Z[:,2]) : Z)
Z = reshape(δ[:,1] - δ[:,2], length(xrange), length(yrange))
Contour.coordinates(Contour.lines(Contour.contour(xrange, yrange, Z, 0.0))[1])
end

limits(X::Matrix) = (minimum(X[:,1]), maximum(X[:,1]), minimum(X[:,2]), maximum(X[:,2]))
