# Replicating Thas et al. 2012
To get a feel for the probabilistic index model (PIM), we try to replicate the results obtained in Thas et al. (2012). In that paper, the researchers simulated data under the normal linear model for varying parameter values. They then estimated the beta coefficients with the PIM and calculated (among other things) the average of the beta estimates obtained from the semi-parametric PIM theory, together with their sample variance. For now we restrict ourselves to the setting in which the assumption of homoscedastic variance is met.
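
The value that the PIM coefficient should recover can be worked out from the normal linear model itself. The following is a sketch, assuming two independent observations $(X, Y)$ and $(X', Y')$ with $Y = \alpha X + \varepsilon$ and $\varepsilon \sim N(0, \sigma^2)$; the symbols $\varepsilon$, $X'$ and $Y'$ are notation introduced here for illustration.

$$
\begin{aligned}
Y' - Y \mid X, X' &\sim N\big(\alpha (X' - X),\; 2\sigma^2\big) \\
P(Y \le Y' \mid X, X') &= \Phi\!\left(\frac{\alpha (X' - X)}{\sqrt{2}\,\sigma}\right) \\
\Rightarrow \quad \beta &= \frac{\alpha}{\sqrt{2}\,\sigma}
\end{aligned}
$$

Under a probit link with a difference model, the coefficient on $X' - X$ is therefore $\alpha / (\sqrt{2}\,\sigma)$, which is the `beta` column computed in the results cell further down.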

### Global parameters

In [6]:
# Library
library(pim)
library(dplyr)





In [3]:
# Seed
set.seed(1990)

# Run 1000 simulations per setting; the settings vary alpha, u, sigma and n
nsim <- 1000
alpha <- c(1,10)
sigma <- c(1,5)
u <- c(1,10)
n <- c(25,50,200)

# All combinations of the parameters 
combinations <- expand.grid('alpha' = alpha,'sigma' = sigma,'u' = u, 'n' = n)

# Empty matrix for the estimates: one row per simulation, one column per setting
BetaValues <- array(NA, dim = c(nsim, dim(combinations)[1]))

### Loop over all combinations

In [None]:
# loop over the combinations
for(c in 1:dim(combinations)[1]){
  # Set the parameters for this setting
  nSim <- combinations[c,'n']
  uSim <- combinations[c,'u']
  alphaSim <- combinations[c,'alpha']
  sigmaSim <- combinations[c,'sigma']

  # Generate predictor
  X <- runif(n = nSim, min = 0.1, max = uSim)

  # Fit the model nsim times
  for(i in 1:nsim){
    # Generate data
    Y <- alphaSim*X + rnorm(n = nSim, mean = 0, sd = sigmaSim)

    # PIM package beta parameter: note that we skip this iteration if estimation fails
    value <- try(pim(formula = Y ~ X, link = 'probit', model = 'difference')@coef, silent = TRUE)
    if(inherits(value, 'try-error')){
      print(paste0('Error in sim ',i, ' c = ', c, '. Message = ', conditionMessage(attr(value,"condition"))))
      next
    }else{
      BetaValues[i,c] <- value
    }
  }
}

The results over 1000 simulations per setting:

In [8]:
# Average beta hat and variance
combinations <- combinations %>% mutate(beta = round(alpha/(sqrt(2) * sigma), digits = 3))
ReplResults <- data.frame(combinations, AvBetaHat = round(colMeans(BetaValues), digits = 5),
              VarBetaHat = round(apply(BetaValues, 2, var), digits = 5))
ReplResults

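| alpha | sigma | u | n | beta | AvBetaHat | VarBetaHat |
|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 25 | 0.707 | 0.78883 | 0.50118 |
| 10 | 1 | 1 | 25 | 7.071 | 7.41565 | 1.8844 |
| 1 | 5 | 1 | 25 | 0.141 | 0.16077 | 0.35292 |
| 10 | 5 | 1 | 25 | 1.414 | 1.5446 | 0.6004 |
| 1 | 1 | 10 | 25 | 0.707 | 0.7432 | 0.01653 |
| 10 | 1 | 10 | 25 | 7.071 | 1.01708 | 0.0 |
| 1 | 5 | 10 | 25 | 0.141 | 0.14716 | 0.00313 |
| 10 | 5 | 10 | 25 | 1.414 | | |
| 1 | 1 | 1 | 50 | 0.707 | 0.72004 | 0.22838 |
| 10 | 1 | 1 | 50 | 7.071 | 7.26259 | 0.96681 |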
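
The simulation loop leaves `NA` in `BetaValues` whenever a fit fails, and `colMeans()` and `var()` return `NA` for a column that contains even one missing value. A minimal sketch of an NA-aware summary is given below. It runs on a small made-up matrix (`toy`, a hypothetical stand-in for `BetaValues`), so the numbers are illustrative only and are not simulation results.

```r
# Toy stand-in for BetaValues: 5 simulations (rows) x 3 settings (columns).
# The values are invented for illustration; matrix() fills column by column.
toy <- matrix(c(0.70, 0.72, NA,   0.69, 0.71,   # setting 1: one failed fit
                1.40, 1.45, 1.38, 1.42, 1.41,   # setting 2: no failures
                NA,   NA,   NA,   NA,   NA),    # setting 3: every fit failed
              nrow = 5, ncol = 3)

summary_toy <- data.frame(
  nFailed    = colSums(is.na(toy)),               # failed fits per setting
  AvBetaHat  = colMeans(toy, na.rm = TRUE),       # mean over successful fits
  VarBetaHat = apply(toy, 2, var, na.rm = TRUE)   # variance over successful fits
)
summary_toy
```

Reporting `nFailed` next to the averages makes it visible when a summary rests on only a handful of converged fits, or on none at all.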


> Why does the combination alpha = 10 and u = 10 perform so poorly?

Note: most of the time, the solver for the system of nonlinear equations warns that no good solution has been found!
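
One possible explanation for the poor results with alpha = 10 and u = 10, offered here as an untested assumption, is that the outcomes become almost perfectly ordered by the predictor. Under the normal linear model the probabilistic index for a pair is `pnorm(alpha * (x2 - x1) / (sqrt(2) * sigma))`, so a large slope combined with a wide predictor range pushes it to 0 or 1 for most pairs. The sketch below measures the share of such saturated pairs. The helpers `true_pi` and `saturated`, the seed and the cut-off `eps` are all hypothetical choices made for illustration.

```r
# Probabilistic index P(Y1 <= Y2 | x1, x2) under Y = alpha * X + N(0, sigma^2) noise
true_pi <- function(alpha, sigma, x1, x2) {
  pnorm(alpha * (x2 - x1) / (sqrt(2) * sigma))
}

# Share of pairs whose index is within eps of 0 or 1 (eps is an arbitrary cut-off)
saturated <- function(p, eps = 1e-6) mean(p < eps | p > 1 - eps)

set.seed(1)                              # illustrative seed, not the one used above
x <- runif(25, min = 0.1, max = 10)      # predictor for the setting u = 10, n = 25

# All unordered pairs (i, j) with i < j
pairs <- expand.grid(i = seq_along(x), j = seq_along(x))
pairs <- pairs[pairs$i < pairs$j, ]

pi_easy <- true_pi(alpha = 1,  sigma = 1, x[pairs$i], x[pairs$j])
pi_hard <- true_pi(alpha = 10, sigma = 1, x[pairs$i], x[pairs$j])

c(easy = saturated(pi_easy), hard = saturated(pi_hard))
```

If nearly all pairs are saturated, the estimating equations carry little information about the size of beta, which would be consistent with the solver warnings noted above. Checking this against the actual fits is left open.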