# Testing Mixed Models
## Jupyter Example with Python
This example shows how to fit a linear mixed model on a CAS server from Python, using the SWAT package and the mixed action set.

**PREPARE THE ENVIRONMENT**

**LINEAR MIXED MODEL**

**CHECK RESULTS**

## Prepare the Environment

### Import Packages

In [1]:
import swat
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import display, HTML  # importing these from IPython.core.display is deprecated
from swat.render import render_html  # renders CAS results as HTML
%matplotlib inline

### CAS Server Connection Details

In [2]:
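import os

# CA certificate bundle used to verify the server's TLS certificate
os.environ["CAS_CLIENT_SSL_CA_LIST"] = '/etc/pki/tls/certs/trustedcerts.pem'

# Arguments: hostname, port, username, password (credentials masked)
conn = swat.CAS('viya4-node-4.globalhls.sashq-d.openstack.sas.com', 32232, 'XXXXX', 'XXXXX')
out = conn.serverstatus()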

NOTE: Grid node action status report: 1 nodes, 8 total actions executed.
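
The connection cell passes the username and password as literals (masked as XXXXX). As a minimal sketch of one alternative, the settings could be collected from environment variables instead. The variable names CAS_HOST, CAS_PORT, CAS_USER, and CAS_PASSWORD and the helper `cas_settings_from_env` are hypothetical; they are not defined by SWAT or by this notebook.

```python
import os


def cas_settings_from_env(env=None):
    """Collect CAS connection settings from environment variables.

    The variable names used here are hypothetical; choose whatever
    names suit your deployment.
    """
    env = os.environ if env is None else env
    required = ("CAS_HOST", "CAS_PORT", "CAS_USER", "CAS_PASSWORD")
    missing = [name for name in required if name not in env]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "hostname": env["CAS_HOST"],
        "port": int(env["CAS_PORT"]),  # the port must be an integer
        "username": env["CAS_USER"],
        "password": env["CAS_PASSWORD"],
    }
```

Under these assumptions the connection would be opened with `conn = swat.CAS(**cas_settings_from_env())`.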


### Import Action Sets

In [3]:
conn.loadactionset(actionset="dataStep")
conn.loadactionset(actionset="dataPreprocess")
conn.loadactionset(actionset="mixed")

NOTE: Added action set 'dataStep'.
NOTE: Added action set 'dataPreprocess'.
NOTE: Added action set 'mixed'.


## Access the Data

In [4]:
School=conn.CASTable(name='SchoolSample', caslib='Public')
conn.summary(School)

Column,Min,Max,N,NMiss,Mean,Sum,Std,StdErr,Var,USS,CSS,CV,TValue,ProbT,Skewness,Kurtosis
SchoolID,1.0,1000.0,400000.0,0.0,500.5,200200000.0,288.675351,0.456436,83333.46,133533400000.0,33333300000.0,57.677393,1096.539738,0.0,-1.268599e-17,-1.200002
nID,1.0,50.0,400000.0,0.0,25.5,10200000.0,14.430888,0.022817,208.2505,343400000.0,83300000.0,56.591717,1117.576158,0.0,5.723501e-17,-1.20096
Neighborhood,1.0,15035.0,400000.0,0.0,7518.0,3007200000.0,4330.154313,6.846575,18750240.0,30108210000000.0,7500076000000.0,57.597158,1098.067262,0.0,1.708937e-15,-1.199976
bInt,0.000447,14.999512,400000.0,0.0,7.487396,2994958.0,4.327243,0.006842,18.72504,29914440.0,7489996.0,57.793702,1094.332966,0.0,0.004498134,-1.202599
bTime,3.3e-05,14.99972,400000.0,0.0,7.500519,3000207.0,4.350453,0.006879,18.92644,30073670.0,7570556.0,58.002022,1090.402555,0.0,0.004663856,-1.204541
bTime2,1.4e-05,1.0,400000.0,0.0,0.499903,199961.4,0.287651,0.000455,0.08274286,133058.4,33097.06,57.541225,1099.134624,0.0,-0.003931826,-1.193717
sID,1.0,2.0,400000.0,0.0,1.5,600000.0,0.500001,0.000791,0.2500006,1000000.0,100000.0,33.333375,1897.364224,0.0,9.989713e-17,-2.00001
Time,1.0,4.0,400000.0,0.0,2.5,1000000.0,1.118035,0.001768,1.250003,3000000.0,500000.0,44.721415,1414.211795,0.0,-7.631934e-17,-1.360002
Math,-2.761392,91.265275,400000.0,0.0,29.987198,11994880.0,17.141076,0.027102,293.8165,477219100.0,117526300.0,57.161312,1106.439839,0.0,0.7888394,-0.005196
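
Several columns of the summary output are simple functions of N, Mean, and Std. The sketch below recomputes three of them for the Math row using the standard textbook definitions. The input values are copied from the printed output, so agreement is only expected up to the display precision.

```python
import math

# Values copied from the Math row of the summary output
n, mean, std = 400000.0, 29.987198, 17.141076

std_err = std / math.sqrt(n)  # standard error of the mean
cv = 100.0 * std / mean       # coefficient of variation, in percent
t_value = mean / std_err      # t statistic for H0: mean = 0

print(round(std_err, 6), round(cv, 4), round(t_value, 2))
```

The results can be compared with the StdErr, CV, and TValue columns of the same row.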


## Linear Mixed Model

In [5]:
res = conn.mixed(
    table=School,
    classVars=['Neighborhood', 'SchoolID'],
    # Fixed effects: intercept, Time, and Time*Time
    model={'depVars': 'Math',
           'effects': ['Time',
                       {'vars': ['Time', 'Time'], 'interact': 'CROSS'}],
           'printsol': True},
    # Random intercept, Time, and Time*Time within each Neighborhood*SchoolID subject
    random=[{'effects': ['Time',
                         {'vars': ['Time', 'Time'], 'interact': 'CROSS'}],
             'noint': False,
             # A list (not a set) keeps the variable order fixed
             'subject': [{'vars': ['Neighborhood', 'SchoolID'], 'interact': 'CROSS'}],
             'type': 'RANDOM',
             'covType': 'UN',
             'printsol': True}]
)

NOTE: Convergence criterion (GCONV=1E-8) satisfied.
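
Read as a model equation, the call to conn.mixed appears to specify a quadratic growth curve in Time with subject-specific deviations. The notation below is introduced here for illustration and is not printed by the action; the normality assumptions are the usual ones for a linear mixed model.

```latex
\begin{aligned}
\text{Math}_{ij} &= \beta_0 + \beta_1\,\text{Time}_{ij} + \beta_2\,\text{Time}_{ij}^2
  + b_{0i} + b_{1i}\,\text{Time}_{ij} + b_{2i}\,\text{Time}_{ij}^2 + \varepsilon_{ij}, \\
(b_{0i},\, b_{1i},\, b_{2i})^\top &\sim N(\mathbf{0}, \mathbf{G}), \qquad
\varepsilon_{ij} \sim N(0, \sigma^2)
\end{aligned}
```

Here i indexes the Neighborhood*SchoolID subjects, j indexes observations within a subject, and G is a 3 x 3 unstructured covariance matrix (covType UN). Because noint is false, the random intercept is kept alongside the random Time and Time*Time effects.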


## Check Results

In [6]:
print(res.ModelInfo)

Model Information

         RowId                Description                                 Value
0         DATA                Data Source                          SCHOOLSAMPLE
1  RESPONSEVAR          Response Variable                                  Math
2    ESTMETHOD          Estimation Method  Restricted Maximum Likelihood (REML)
3     DFMETHOD  Degrees of Freedom Method                              Residual
4     DMMETHOD       Design Matrix Method                                 Dense


In [7]:
print(res.OptInfo)

Optimization Information

          RowId                 Description                        Value
0     TECHNIQUE      Optimization Technique  Newton-Raphson with Ridging
1       HESSIAN     Hessian in Optimization                        Exact
2    PARAMETERS  Parameters in Optimization                            6
3        LOWERB            Lower Boundaries                            3
4        UPPERB            Upper Boundaries                            0
5      RESIDVAR           Residual Variance                     Profiled
6  STARTVALFROM        Starting Values From                         Data


In [8]:
print(res.ClassInfo)

Class Level Information

          Class   Levels                                             Values
0  Neighborhood  15035.0  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 1...
1      SchoolID   1000.0  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 1...


In [9]:
print(res.NObs)

Number of Observations

   RowId                  Description     Value
0  NREAD  Number of Observations Read  400000.0
1  NUSED  Number of Observations Used  400000.0


In [10]:
print(res.Dimensions)

Dimensions

       RowId                   Description  Value
0  GCOVPARMS  G-side Covariance Parameters      6
1  RCOVPARMS  R-side Covariance Parameters      1
2      NCOLX                  Columns in X      3
3      NCOLZ      Columns in Z per Subject      3
4  NSUBJECTS        Subjects (Blocks in V)  50000
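
The counts in the Dimensions table can be cross-checked with a little arithmetic. A symmetric q x q unstructured covariance matrix has q(q+1)/2 free parameters, and dividing the number of observations by the number of subjects gives the average cluster size. The helper name `n_unstructured_params` is hypothetical.

```python
def n_unstructured_params(q):
    """Free parameters in a q x q unstructured (symmetric) covariance matrix."""
    return q * (q + 1) // 2


z_cols_per_subject = 3             # "Columns in Z per Subject"
n_obs, n_subjects = 400000, 50000  # observations used, subjects (blocks in V)

g_params = n_unstructured_params(z_cols_per_subject)
obs_per_subject = n_obs / n_subjects  # average observations per subject

print(g_params, obs_per_subject)  # prints: 6 8.0
```

The first number matches the six G-side covariance parameters reported by the action.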


In [11]:
print(res.ConvergenceStatus)

Convergence Status

                                          Reason  Status  pdG
0  Convergence criterion (GCONV=1E-8) satisfied.       0    1


In [12]:
print(res.CovParms)

Covariance Parameter Estimates

   RowId   CovParm                Subject   Estimate
0      1   UN(1,1)  Neighborhood*SchoolID  18.796421
1      2   UN(2,1)  Neighborhood*SchoolID  -0.121460
2      3   UN(2,2)  Neighborhood*SchoolID  18.987352
3      4   UN(3,1)  Neighborhood*SchoolID  -0.000438
4      5   UN(3,2)  Neighborhood*SchoolID  -0.008045
5      6   UN(3,3)  Neighborhood*SchoolID   0.083937
6      7  Residual                          0.999797
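
The UN(i,j) rows are the lower triangle of the random-effects covariance matrix G. The sketch below rebuilds the full symmetric matrix in plain Python and converts it to correlations. It assumes that rows 1 to 3 correspond to the random intercept, Time, and Time*Time in that order, which the output does not state explicitly.

```python
import math

# Lower-triangle estimates UN(i,j), copied from the table (1-based indices)
un = {(1, 1): 18.796421, (2, 1): -0.121460, (2, 2): 18.987352,
      (3, 1): -0.000438, (3, 2): -0.008045, (3, 3): 0.083937}

q = 3
# Fill the symmetric matrix G from its lower triangle
G = [[un[(max(i, j), min(i, j))] for j in range(1, q + 1)]
     for i in range(1, q + 1)]

# Convert covariances to correlations: r_ij = g_ij / sqrt(g_ii * g_jj)
corr = [[G[i][j] / math.sqrt(G[i][i] * G[j][j]) for j in range(q)]
        for i in range(q)]

for row in corr:
    print(["{:+.4f}".format(v) for v in row])
```

With these estimates every off-diagonal correlation is below 0.01 in absolute value, so the three random effects are close to uncorrelated.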


In [13]:
print(res.FitStatistics)

Fit Statistics

           RowId               Description         Value
0  LOGLIKELIHOOD     -2 Res Log Likelihood  1.681870e+06
1            AIC  AIC  (smaller is better)  1.681884e+06
2           AICC  AICC (smaller is better)  1.681884e+06
3            BIC  BIC  (smaller is better)  1.681945e+06
4           CAIC  CAIC (smaller is better)  1.681952e+06
5           HQIC  HQIC (smaller is better)  1.681903e+06
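
Each information criterion is the -2 residual log likelihood plus a penalty. The sketch below applies the textbook penalty formulas with d = 7 covariance parameters (6 G-side plus 1 R-side). Which sample size n the action uses is an assumption here: setting n to the number of subjects (50000) reproduces the printed differences to within the rounding of the display, but the output itself does not confirm this choice.

```python
import math

neg2_res_ll = 1.681870e+06  # "-2 Res Log Likelihood", as printed (7 significant digits)
d = 7                       # 6 G-side + 1 R-side covariance parameters
n = 50000                   # ASSUMPTION: effective sample size = number of subjects

# Textbook penalty terms added to -2 log likelihood
penalties = {
    "AIC": 2 * d,
    "AICC": 2 * d * n / (n - d - 1),
    "BIC": d * math.log(n),
    "CAIC": d * (math.log(n) + 1),
    "HQIC": 2 * d * math.log(math.log(n)),
}

# Values copied from the Fit Statistics table
printed = {"AIC": 1.681884e+06, "AICC": 1.681884e+06, "BIC": 1.681945e+06,
           "CAIC": 1.681952e+06, "HQIC": 1.681903e+06}

for name, penalty in penalties.items():
    # Compare the computed penalty with the printed difference
    print(name, round(penalty, 2), printed[name] - neg2_res_ll)
```

Because the printed values are rounded to the nearest unit, each printed difference can be off by about one from the computed penalty.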


In [14]:
print(res.ParameterEstimates)

Solution for Fixed Effects

      Effect  Estimate    StdErr        DF      tValue  Probt
0  Intercept  7.496057  0.021293  399997.0  352.035143    0.0
1       Time  7.493864  0.021077  399997.0  355.549379    0.0
2  Time*Time  0.500864  0.002044  399997.0  245.032794    0.0
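
The fixed-effect table can be checked and used directly. Each t value is the estimate divided by its standard error, and the three estimates define the population-average growth curve. The DF of 399997 equals 400000 observations minus 3 fixed-effect columns, which is consistent with the Residual degrees-of-freedom method. The sketch below works from the printed, rounded values, and the helper `fitted_mean` is hypothetical.

```python
# Estimates and standard errors copied from the Solution for Fixed Effects table
fixed = {"Intercept": (7.496057, 0.021293),
         "Time": (7.493864, 0.021077),
         "Time*Time": (0.500864, 0.002044)}

# t statistic = estimate / standard error
t_values = {name: est / se for name, (est, se) in fixed.items()}


def fitted_mean(time):
    """Population-average Math score implied by the fixed effects."""
    b0, b1, b2 = (fixed[k][0] for k in ("Intercept", "Time", "Time*Time"))
    return b0 + b1 * time + b2 * time ** 2


# Evaluate the curve at the four observed time points and average it
curve = {t: fitted_mean(t) for t in (1, 2, 3, 4)}
overall = sum(curve.values()) / len(curve)

print(t_values)
print(curve, overall)
```

Averaging the fitted values over Time = 1, 2, 3, 4 gives about 29.9872, which is close to the Math mean reported in the summary output.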
