# Checking fixed-effects rank

A `MixedModels.LinearMixedModel` can be constructed without being fit.

For the examples below, load the datasets stored with the `MixedModels` package.

In [1]:
using DataFrames, MixedModels, RData
const dat = convert(Dict{Symbol,DataFrame},
    load(Pkg.dir("MixedModels", "test", "dat.rda")))



Dict{Symbol,DataFrames.DataFrame} with 61 entries:
  :bs10          => 1104×6 DataFrames.DataFrame…
  :Genetics      => 60×5 DataFrames.DataFrame…
  :Contraception => 1934×6 DataFrames.DataFrame…
  :Mmmec         => 354×6 DataFrames.DataFrame…
  :kb07          => 1790×10 DataFrames.DataFrame…
  :Rail          => 18×2 DataFrames.DataFrame…
  :KKL           => 53765×24 DataFrames.DataFrame…
  :Bond          => 21×3 DataFrames.DataFrame…
  :VerbAgg       => 7584×9 DataFrames.DataFrame…
  :ergoStool     => 36×3 DataFrames.DataFrame…
  :s3bbx         => 2449×6 DataFrames.DataFrame…
  :cake          => 270×5 DataFrames.DataFrame…
  :Cultivation   => 24×4 DataFrames.DataFrame…
  :Pastes        => 60×4 DataFrames.DataFrame…
  :Exam          => 4059×5 DataFrames.DataFrame…
  :Socatt        => 1056×9 DataFrames.DataFrame…
  :WWheat        => 60×3 DataFrames.DataFrame…
  :Pixel         => 102×5 DataFrames.DataFrame…
  :Arabidopsis   => 625×8 DataFrames.DataFrame…
  :TeachingII    => 96×14 DataFra

A model for the `sleepstudy` data with random effects for the intercept and for the days of sleep deprivation, `U`, by subject, `G`, is declared as

In [2]:
m = lmm(@formula(Y ~ 1 + U + (1+U|G)), dat[:sleepstudy]);

The fields in `m`

In [3]:
fieldnames(m)

6-element Array{Symbol,1}:
 :formula
 :trms   
 :sqrtwts
 :A      
 :L      
 :optsum 

include `trms`, a vector of `AbstractTerm`s, and two blocked arrays, `A` and `L`. The last two terms are always `MatrixTerm`s representing the fixed-effects model matrix, $\bf X$, and the response vector, $\bf y$. The terms preceding these two are always random-effects terms.

In [4]:
typeof.(m.trms)

3-element Array{DataType,1}:
 MixedModels.VectorFactorReTerm{Float64,String,UInt8}
 MixedModels.MatrixTerm{Float64,Array{Float64,2}}    
 MixedModels.MatrixTerm{Float64,Array{Float64,2}}    

The `A` field is a blocked matrix

In [5]:
typeof(m.A)

BlockArrays.BlockArray{Float64,2,AbstractArray{Float64,2}}

whose rows and columns of blocks correspond to the terms.

In [6]:
nblocks(m.A)

(3, 3)
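To make the block layout concrete, the following sketch writes out the 3×3 arrangement for this model. It assumes that `A` holds the cross-products of the concatenated matrix $[{\bf Z}\;{\bf X}\;{\bf y}]$, where ${\bf Z}$ (a symbol not used elsewhere in this section) stands for the model matrix of the single random-effects term. Only the position of ${\bf X}'{\bf X}$ is confirmed by the output shown here.

$$
{\bf A} =
\begin{bmatrix}
{\bf Z}'{\bf Z} & {\bf Z}'{\bf X} & {\bf Z}'{\bf y} \\
{\bf X}'{\bf Z} & {\bf X}'{\bf X} & {\bf X}'{\bf y} \\
{\bf y}'{\bf Z} & {\bf y}'{\bf X} & {\bf y}'{\bf y}
\end{bmatrix}
$$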

The second-to-last diagonal block in `A` is $\bf X'X$.

In [7]:
m.A[Block(2,2)]

2×2 Array{Float64,2}:
 180.0   810.0
 810.0  5130.0

This can be verified by inspecting the data and forming $\bf X'X$ directly from the model matrix.

In [8]:
dat[:sleepstudy]

 Row  Y         U    G
 1    249.56    0.0  308
 2    258.7047  1.0  308
 3    250.8006  2.0  308
 4    321.4398  3.0  308
 5    356.8519  4.0  308
 6    414.6901  5.0  308
 7    382.2038  6.0  308
 8    290.1486  7.0  308
 9    430.5853  8.0  308
 10   466.3535  9.0  308


In [9]:
X = m.trms[end - 1].x   # the model matrix X

180×2 Array{Float64,2}:
 1.0  0.0
 1.0  1.0
 1.0  2.0
 1.0  3.0
 1.0  4.0
 1.0  5.0
 1.0  6.0
 1.0  7.0
 1.0  8.0
 1.0  9.0
 1.0  0.0
 1.0  1.0
 1.0  2.0
 ⋮       
 1.0  8.0
 1.0  9.0
 1.0  0.0
 1.0  1.0
 1.0  2.0
 1.0  3.0
 1.0  4.0
 1.0  5.0
 1.0  6.0
 1.0  7.0
 1.0  8.0
 1.0  9.0

In [10]:
X'X

2×2 Array{Float64,2}:
 180.0   810.0
 810.0  5130.0
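The entries of this matrix can also be checked by hand. The sketch below assumes the balanced layout suggested by the output above: 180 rows made up of 18 subjects, each observed on days $U = 0, 1, \dots, 9$. The count of 18 subjects is inferred from those dimensions rather than stated in the text.

$$
{\bf X}'{\bf X} =
\begin{bmatrix}
n & \sum_i U_i \\
\sum_i U_i & \sum_i U_i^2
\end{bmatrix}
=
\begin{bmatrix}
180 & 18 \cdot 45 \\
18 \cdot 45 & 18 \cdot 285
\end{bmatrix}
=
\begin{bmatrix}
180 & 810 \\
810 & 5130
\end{bmatrix},
$$

since $\sum_{k=0}^{9} k = 45$ and $\sum_{k=0}^{9} k^2 = 285$.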

A pivoted Cholesky decomposition of this matrix returns the computational rank, which is also the rank of $\bf X$.

In [11]:
Lpiv = cholfact(Symmetric(m.A[Block(2,2)], :L), Val{true})

Base.LinAlg.CholeskyPivoted{Float64,Array{Float64,2}}([71.624 810.0; 11.3091 7.2184], 'L', [2, 1], 2, 0.0, 0)
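The printed factorization can be reproduced by hand, which may help in reading the output. The pivot vector `[2, 1]` indicates that the larger diagonal element, 5130, was moved to the first position. Writing ${\bf P}$ for the corresponding permutation matrix (notation introduced only for this sketch), the factor satisfies

$$
{\bf P}'({\bf X}'{\bf X}){\bf P} =
\begin{bmatrix}
5130 & 810 \\
810 & 180
\end{bmatrix}
= {\bf L}{\bf L}',
\qquad
{\bf L} =
\begin{bmatrix}
\sqrt{5130} & 0 \\
810/\sqrt{5130} & \sqrt{180 - 810^2/5130}
\end{bmatrix}
\approx
\begin{bmatrix}
71.624 & 0 \\
11.3091 & 7.2184
\end{bmatrix}.
$$

Because the factorization is stored with `'L'`, only the lower triangle of the printed array is meaningful. The `810.0` in the upper-right corner is left over from the input matrix. Both diagonal elements of ${\bf L}$ are well away from zero, which is why the reported rank (the fourth field in the printed output) is 2.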

In [12]:
rank(Lpiv)

2

An optional argument, `tol`, in the call to `cholfact` specifies the tolerance for determining the rank. It defaults to zero, which flags only exact rank deficiency. A more reasonable value may be $\sqrt{\epsilon}$, where $\epsilon$ is the relative machine precision.

In [13]:
√eps()

1.4901161193847656e-8

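As explained in the documentation for [`dpstrf`](https://software.intel.com/en-us/mkl-developer-reference-fortran-pstrf), a non-negative `tol` is used as an absolute tolerance on the pivots, which usually is not what you want. The value should be negated to request a relative tolerance: according to that documentation, a negative `tol` makes the routine use $n \epsilon \max_k A_{kk}$ as the threshold.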

In [14]:
Lpiv = cholfact(Symmetric(m.A[Block(2,2)], :L), Val{true}, tol = -√eps())

Base.LinAlg.CholeskyPivoted{Float64,Array{Float64,2}}([71.624 810.0; 11.3091 7.2184], 'L', [2, 1], 2, -1.4901161193847656e-8, 0)
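The sleepstudy model matrix has full column rank, so the tolerance makes no difference to the result above. As a minimal sketch of the rank-deficient case, the code below adds a redundant column to a matrix laid out like $\bf X$ and recovers the rank from a pivoted Cholesky factorization. It is written against the `LinearAlgebra` standard library of Julia 1.8 or later (`cholesky` with `RowMaximum()`), not the older `cholfact` interface used in the cells above. The helper `cholrank`, its `rtol` keyword, and the matrix `Xdef` are hypothetical names introduced for illustration and are not part of `MixedModels`. Rather than relying on the sign convention for `tol`, the sketch scales a positive threshold by the largest diagonal element so that it behaves as a relative tolerance.

```julia
using LinearAlgebra

# Hypothetical helper (not part of MixedModels): computational rank of X
# obtained from a pivoted Cholesky factorization of X'X.
function cholrank(X::AbstractMatrix; rtol::Real = sqrt(eps()))
    XtX = Symmetric(X'X, :L)
    # A positive tol is an absolute threshold on the pivots, so scale it by
    # the largest diagonal element to make it act as a relative tolerance.
    tol = rtol * maximum(diag(XtX))
    # check = false returns the factorization even if it stops early
    # because the matrix is rank deficient.
    return rank(cholesky(XtX, RowMaximum(); tol = tol, check = false))
end

U = repeat(0.0:9.0, 18)               # days 0:9 for each of 18 subjects (assumed layout)
X = hcat(ones(length(U)), U)          # intercept and U columns
Xdef = hcat(X, X[:, 1] .+ X[:, 2])    # third column is the sum of the first two

println((cholrank(X), cholrank(Xdef)))
```

Both calls should report a rank of 2: the two-column matrix is of full rank, while the three-column matrix has one redundant column that the pivoted factorization leaves for last and then rejects.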