Daniele Durante
This tutorial implementation focuses on assessing the maximization performance and the computational efficiency of the different algorithms for the estimation of latent class models with covariates. In particular, this assessment considers the dataset election from the R library poLCA.
The analyses reproduce those discussed in Section 3 of the paper: Durante, D., Canale, A. and Rigon, T. (2018). A nested expectation-maximization algorithm for latent class models with covariates [arXiv:1705.03864v5], where we propose a novel nested EM algorithm for improved maximum likelihood estimation of latent class models with covariates. Additional assessments and results not discussed in the paper are also considered.
The functions implementing the different algorithms (including the nested EM, and the popular competitors currently considered in the literature), along with detailed comments on each maximization routine, are available in the source file LCA-Covariates-Algorithms.R.
Let us load this source file, along with the election dataset.
rm(list = ls())
source("LCA-Covariates-Algorithms.R")
data(election)
str(election[, c(1:12,17)])
## 'data.frame': 1785 obs. of 13 variables:
## $ MORALG : Factor w/ 4 levels "1 Extremely well",..: 3 4 1 2 2 2 2 2 2 1 ...
## $ CARESG : Factor w/ 4 levels "1 Extremely well",..: 1 3 2 2 4 3 NA 2 2 1 ...
## $ KNOWG : Factor w/ 4 levels "1 Extremely well",..: 2 4 2 2 2 3 2 2 2 1 ...
## $ LEADG : Factor w/ 4 levels "1 Extremely well",..: 2 3 1 2 3 2 2 2 3 1 ...
## $ DISHONG: Factor w/ 4 levels "1 Extremely well",..: 3 2 3 2 2 2 4 3 4 4 ...
## $ INTELG : Factor w/ 4 levels "1 Extremely well",..: 2 2 2 2 2 2 2 2 2 1 ...
## $ MORALB : Factor w/ 4 levels "1 Extremely well",..: 1 NA 2 2 1 2 NA 3 2 2 ...
## $ CARESB : Factor w/ 4 levels "1 Extremely well",..: 1 NA 2 3 1 NA 3 4 3 4 ...
## $ KNOWB : Factor w/ 4 levels "1 Extremely well",..: 2 2 2 2 2 3 2 4 2 2 ...
## $ LEADB : Factor w/ 4 levels "1 Extremely well",..: 2 3 3 2 2 2 2 4 2 3 ...
## $ DISHONB: Factor w/ 4 levels "1 Extremely well",..: 4 NA 3 3 3 2 4 3 3 3 ...
## $ INTELB : Factor w/ 4 levels "1 Extremely well",..: 2 1 2 1 2 2 NA 4 2 2 ...
## $ PARTY : num 5 3 1 3 7 1 6 1 1 1 ...
Since the main focus is on comparing the computational performance of the different algorithms, for simplicity let us remove all the statistical units with missing values.
election <- na.omit(election)
Consistent with the analyses in Section 3 of the paper: Durante, D., Canale, A. and Rigon, T. (2018). A nested expectation-maximization algorithm for latent class models with covariates [arXiv:1705.03864v5], we focus on the latent class model for the J = 12 categorical evaluations of the candidates Al Gore and George Bush, with the variable PARTY (party affiliation) as a covariate in the multinomial logistic regression for the latent classes. Using the syntax of the library poLCA, this model can be defined as follows:
f_election <- cbind(MORALG, CARESG, KNOWG, LEADG, DISHONG, INTELG,
MORALB, CARESB, KNOWB, LEADB, DISHONB, INTELB) ~ PARTY
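In formulas, and using generic notation not taken verbatim from the paper, this model assumes that unit i belongs to latent class si ∈ {1, ..., R} with covariate-dependent probabilities, and that the J categorical responses are conditionally independent given the class:

$$p(y_i \mid x_i) = \sum_{r=1}^{R} \pi_r(x_i) \prod_{j=1}^{J} p(y_{ij} \mid s_i = r), \qquad \pi_r(x_i) = \frac{\exp(x_i^{\top}\beta_r)}{\sum_{l=1}^{R}\exp(x_i^{\top}\beta_l)},$$

with $x_i = (1, \mathrm{PARTY}_i)^{\top}$ and $\beta_1$ conventionally fixed to zero for identifiability, as in poLCA.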
We perform estimation of the parameters in the above latent class model with covariates under different computational routines (including our novel nested EM algorithm), and compare maximization performance along with computational efficiency.
Although the assessment with R = 2 classes is not discussed in the paper, let us first consider this simple scenario and then move to R = 3.
n_c <- 2
To provide a detailed computational assessment, we perform estimation under the different algorithms for Rep_Tot = 100 runs at varying initializations. For each run, the algorithms are all initialized at the same starting values, which are controlled by a seed (changing across the different runs). Let us, therefore, define these 100 seeds and the variance sigma of the zero-mean Gaussian variables from which the initial values of the β parameters are generated.
Rep_Tot <- 100
seed_rep <- c(1:100)
sigma <- 0.5
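Note that the calls below pass sigma_init = sqrt(sigma), suggesting that the source functions parameterize the initialization by a standard deviation. A minimal sketch of how such starting values could be drawn, assuming an intercept plus the single covariate PARTY and n_c − 1 non-baseline classes (hypothetical code; the actual draw happens inside the source functions):
set.seed(seed_rep[1])
# One column of coefficients (intercept, PARTY) for each non-baseline class,
# drawn from zero-mean Gaussians with variance sigma.
beta_init <- matrix(rnorm(2 * (n_c - 1), mean = 0, sd = sqrt(sigma)),
                    nrow = 2, ncol = n_c - 1)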
1. EM algorithm with Newton-Raphson as in Bandeen-Roche et al. (1997) (one-step)
Here we consider the one-step EM algorithm with Newton-Raphson methods proposed by Bandeen-Roche et al. (1997), and discussed in Section 1.1 of our paper. This requires the function newton_em() in the source file LCA-Covariates-Algorithms.R we loaded before.
Let us first create the quantities to be monitored for each run. These include the number of iterations to reach convergence, the log-likelihood sequence, and a vector monitoring the presence (1) or absence (0) of drops in the log-likelihood sequence at each iteration.
# Create the allocation matrices for the quantities to be monitored.
iter_NR_EM <- rep(0, Rep_Tot)
llik_NR_EM <- matrix(0, Rep_Tot, 1000)
llik_decrement_NR_EM <- matrix(0, Rep_Tot, 1000)
# 1000 means that the maximum number of iterations in the EM algorithm is fixed to 1000.
Finally, let us perform the Rep_Tot = 100 runs of the one-step EM algorithm with Newton-Raphson methods. We also monitor the computational time via the function system.time().
# Perform the algorithm.
time_NR_EM <- system.time(
for (rep in 1:Rep_Tot){
fit_NR_EM <- newton_em(f_election, election, nclass = n_c, seed = seed_rep[rep],sigma_init=sqrt(sigma))
iter_NR_EM[rep] <- fit_NR_EM[[1]]
llik_NR_EM[rep,] <- fit_NR_EM[[2]]
llik_decrement_NR_EM[rep,] <- fit_NR_EM[[3]]}
)[3]
2. EM algorithm with Newton-Raphson as in Formann (1992) and Van der Heijden et al. (1996) (one-step)
As discussed in the paper, the algorithm proposed by Bandeen-Roche et al. (1997) leverages a Newton-Raphson step based on a plug-in estimate of the Hessian of the full-model log-likelihood, instead of maximizing the expectation of the complete log-likelihood. A more formal approach, implemented in Formann (1992) and Van der Heijden et al. (1996), considers instead a Newton-Raphson step aimed at maximizing a quadratic approximation of the expectation of the complete log-likelihood. Implementation of this routine requires the function newton_em_Q1() in the source file LCA-Covariates-Algorithms.R we loaded before.
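Schematically, and in generic notation not taken from the source file, both strategies update the class-membership coefficients at iteration t through a Newton-Raphson step of the form

$$\beta^{(t+1)} = \beta^{(t)} - \big[H^{(t)}\big]^{-1} s^{(t)},$$

where in Bandeen-Roche et al. (1997) the score $s^{(t)}$ and Hessian $H^{(t)}$ are those of the full-model log-likelihood with plug-in estimates of the class membership probabilities, whereas in Formann (1992) and Van der Heijden et al. (1996) they are those of the expected complete log-likelihood given the current parameter values.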
# Create the allocation matrices for the quantities to be monitored.
iter_NR_EM_Q1 <- rep(0, Rep_Tot)
llik_NR_EM_Q1 <- matrix(0, Rep_Tot, 1000)
llik_decrement_NR_EM_Q1 <- matrix(0, Rep_Tot, 1000)
# Perform the algorithm.
time_NR_EM_Q1 <- system.time(
for (rep in 1:Rep_Tot){
fit_NR_EM_Q1 <- newton_em_Q1(f_election, election, nclass = n_c, seed = seed_rep[rep],sigma_init=sqrt(sigma))
iter_NR_EM_Q1[rep] <- fit_NR_EM_Q1[[1]]
llik_NR_EM_Q1[rep,] <- fit_NR_EM_Q1[[2]]
llik_decrement_NR_EM_Q1[rep,] <- fit_NR_EM_Q1[[3]]}
)[3]
3. Re-scaled EM algorithm with Newton-Raphson (one-step)
Here we consider the re-scaled version of the above one-step EM algorithm with Newton-Raphson methods. This modification is discussed in Section 1.1 of our paper, and its general version can be found in Section 1.5.6 of McLachlan and Krishnan (2007). This algorithm also requires the function newton_em_Q1() in the source file LCA-Covariates-Algorithms.R we loaded before. However, the step length 0 < α < 1 must now be specified so as to reduce concerns about drops in the log-likelihood sequence. Here we consider the case with α = 0.5.
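In the schematic notation above, the re-scaled modification simply shortens the Newton-Raphson step,

$$\beta^{(t+1)} = \beta^{(t)} - \alpha \big[H^{(t)}\big]^{-1} s^{(t)}, \qquad 0 < \alpha < 1,$$

trading slower progress for a reduced risk of overshooting the maximum and causing drops in the log-likelihood.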
# Create the allocation matrices for the quantities to be monitored.
iter_NR_EM_Q1_0.5 <- rep(0, Rep_Tot)
llik_NR_EM_Q1_0.5 <- matrix(0, Rep_Tot, 1000)
llik_decrement_NR_EM_Q1_0.5 <- matrix(0, Rep_Tot, 1000)
# Perform the algorithm.
time_NR_EM_Q1_0.5 <- system.time(
for (rep in 1:Rep_Tot){
fit_NR_EM_Q1_0.5 <- newton_em_Q1(f_election, election, alpha=0.5,nclass = n_c, seed = seed_rep[rep],sigma_init=sqrt(sigma))
iter_NR_EM_Q1_0.5[rep] <- fit_NR_EM_Q1_0.5[[1]]
llik_NR_EM_Q1_0.5[rep,] <- fit_NR_EM_Q1_0.5[[2]]
llik_decrement_NR_EM_Q1_0.5[rep,] <- fit_NR_EM_Q1_0.5[[3]]}
)[3]
4. MM-EM algorithm based on the lower-bound routine in Bohning (1992) (one-step)
None of the previous algorithms is guaranteed to provide a monotone log-likelihood sequence. A more robust routine, proposed by Bohning (1992), replaces the Hessian in the above Newton-Raphson updates with a matrix that dominates it globally, thus guaranteeing monotone and more stable convergence. Although Bohning (1992) focuses on a multinomial logistic regression with observed responses, his procedure can be adapted to our latent class model with covariates, as done in the function MM_em() in the source file LCA-Covariates-Algorithms.R.
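For reference, in a multinomial logistic regression with R categories and design matrix X, Bohning (1992) shows that the negative Hessian of the log-likelihood is globally dominated by the fixed matrix

$$B = \frac{1}{2}\left(I_{R-1} - \frac{1}{R}\,\mathbf{1}_{R-1}\mathbf{1}_{R-1}^{\top}\right) \otimes X^{\top}X,$$

so that the update $\beta^{(t+1)} = \beta^{(t)} + B^{-1} s(\beta^{(t)})$ can never decrease the log-likelihood in his observed-response setting (the notation here is generic and not taken from the source file).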
# Create the allocation matrices for the quantities to be monitored.
iter_MM <- rep(0, Rep_Tot)
llik_MM <- matrix(0, Rep_Tot, 1000)
llik_decrement_MM <- matrix(0, Rep_Tot, 1000)
# Perform the algorithm.
time_MM <- system.time(
for (rep in 1:Rep_Tot){
fit_MM <- MM_em(f_election, election, nclass = n_c, seed = seed_rep[rep],sigma_init=sqrt(sigma))
iter_MM[rep] <- fit_MM[[1]]
llik_MM[rep,] <- fit_MM[[2]]
llik_decrement_MM[rep,] <- fit_MM[[3]]}
)[3]
5. Classical 3-step algorithm
Here we consider the classical three-step strategy to estimate latent class models with covariates (e.g. Clogg 1995). As discussed in Section 1.2 of our paper, this algorithm consists of three steps.
1. Estimate a latent class model without covariates, defined by f_election_unconditional <- cbind(MORALG, CARESG, KNOWG, LEADG, DISHONG, INTELG, MORALB, CARESB, KNOWB, LEADB, DISHONB, INTELB) ~ 1. This requires the function unconditional_em() (in LCA-Covariates-Algorithms.R).
2. Using the estimates in 1, predict the latent classes si, i = 1, ..., n, by assigning each unit i to the class r with the highest predicted probability.
3. Using the R function multinom in the library nnet, estimate the coefficients β1, ..., βR from a multinomial logistic regression with the predicted s1, ..., sn as responses.
The code to implement this routine and save the relevant quantities is:
# Create the allocation matrix for the full--model log-likelihood sequence.
llik_3_step_classical <- rep(0, Rep_Tot)
# Define the useful quantities to compute the full--model log-likelihood sequence.
f_election_3_step <- cbind(MORALG, CARESG, KNOWG, LEADG, DISHONG, INTELG,
MORALB, CARESB, KNOWB, LEADB, DISHONB, INTELB) ~ PARTY
nclass = n_c
mframe_election_3_step <- model.frame(f_election_3_step, election)
y_election_3_step <- model.response(mframe_election_3_step)
x_election_3_step <- model.matrix(f_election_3_step, mframe_election_3_step)
R_election_3_step <- nclass
# Perform the three step algorithm.
time_3_step_classical <- system.time(
for (rep in 1:Rep_Tot){
#---------------------------------------------------------------------------------------------------
# 1] Estimate a latent class model without covariates
#---------------------------------------------------------------------------------------------------
f_election_unconditional <- cbind(MORALG, CARESG, KNOWG, LEADG, DISHONG, INTELG,
MORALB, CARESB, KNOWB, LEADB, DISHONB, INTELB) ~ 1
fit_unconditional <- unconditional_em(f_election_unconditional, election, nclass = n_c, seed = seed_rep[rep])
#---------------------------------------------------------------------------------------------------
# 2] Predict the latent class of each unit via modal assignment
#---------------------------------------------------------------------------------------------------
pred_class <- apply(fit_unconditional[[4]],1, which.max)
#---------------------------------------------------------------------------------------------------
# 3] Estimate the beta coefficients from a multinomial logit with the predicted classes as responses
#---------------------------------------------------------------------------------------------------
b <- c(t(summary(multinom(pred_class ~ election$PARTY, trace = FALSE))$coefficients))
# Compute the log-likelihood of the full model
prior <- poLCA:::poLCA.updatePrior(b, x_election_3_step, R_election_3_step)
llik_3_step_classical[rep] <- sum(log(rowSums(prior * poLCA:::poLCA.ylik.C(fit_unconditional[[3]],
y_election_3_step))))}
)[3]
6. Bias-corrected 3-step algorithm
Here we implement the modification of the classical three-step method proposed by Vermunt (2010), in order to reduce the bias of the estimators. This strategy is discussed in Section 1.2 of our paper, and proceeds as follows:
1. Estimate a latent class model without covariates, defined by f_election_unconditional <- cbind(MORALG, CARESG, KNOWG, LEADG, DISHONG, INTELG, MORALB, CARESB, KNOWB, LEADB, DISHONB, INTELB) ~ 1. This requires the function unconditional_em() (in LCA-Covariates-Algorithms.R).
2. Using the estimates in 1, predict the latent classes si, i = 1, ..., n, by assigning each unit i to the class r with the highest predicted probability. Compute also the classification error by applying equation (6) in Vermunt (2010); see the displayed formula after this list.
3. Following equation (19) in Vermunt (2010), estimate the coefficients β1, ..., βR from a latent class model with covariates, where the predicted latent classes from 2 act as the only categorical variable available, and its probability mass function within each class is fixed and equal to the classification error. This implementation requires the function correction_em() in LCA-Covariates-Algorithms.R.
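In the notation of Vermunt (2010), with $w_i$ the modal assignment of unit i and $\hat{p}_{ir}$ the estimated posterior probability of class r from step 1, the classification error computed in step 2 of the code below corresponds to

$$\hat{P}(W = s \mid X = r) = \frac{n^{-1}\sum_{i=1}^{n} \hat{p}_{ir}\,1(w_i = s)}{\hat{\pi}_r},$$

where $\hat{\pi}_r$ denotes the estimated marginal probability of class r. This is our reading of equation (6) in Vermunt (2010), and it matches the computation of class_err below.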
The code to implement this routine and save the relevant quantities is:
# Create the allocation matrix for the full--model log-likelihood sequence.
llik_3_step_corrected <- rep(0, Rep_Tot)
# Define the useful quantities to compute the full--model log-likelihood sequence.
f_election_3_step <- cbind(MORALG, CARESG, KNOWG, LEADG, DISHONG, INTELG,
MORALB, CARESB, KNOWB, LEADB, DISHONB, INTELB) ~ PARTY
nclass = n_c
mframe_election_3_step <- model.frame(f_election_3_step, election)
y_election_3_step <- model.response(mframe_election_3_step)
x_election_3_step <- model.matrix(f_election_3_step, mframe_election_3_step)
R_election_3_step <- nclass
# Perform the three step algorithm.
time_3_step_corrected <- system.time(
for (rep in 1:Rep_Tot){
#---------------------------------------------------------------------------------------------------
# 1] Estimate a latent class model without covariates
#---------------------------------------------------------------------------------------------------
f_election_unconditional <- cbind(MORALG, CARESG, KNOWG, LEADG, DISHONG, INTELG,
MORALB, CARESB, KNOWB, LEADB, DISHONB, INTELB) ~ 1
fit_unconditional <- unconditional_em(f_election_unconditional, election, nclass = n_c, seed = seed_rep[rep])
#---------------------------------------------------------------------------------------------------
# 2] Predict the latent class of each unit via modal assignment and compute the classification error
#---------------------------------------------------------------------------------------------------
pred_class <- apply(fit_unconditional[[4]], 1, which.max)
class_err <- matrix(0, nclass, nclass)
rownames(class_err) <- paste("W", c(1:nclass), sep="")
colnames(class_err) <- paste("X", c(1:nclass), sep="")
for (r in 1:nclass){
class_err[,r] <- (t(dummy(pred_class))%*%as.matrix(fit_unconditional[[4]][,r],
dim(election)[1],1)/dim(election)[1])/fit_unconditional[[5]][1,r]}
class_err <- t(class_err)
#---------------------------------------------------------------------------------------------------
# 3] Estimate the beta coefficients from the correction procedure in Vermunt (2010)
#---------------------------------------------------------------------------------------------------
f_election_3_step_correct <- cbind(pred_class) ~ PARTY
fit_correct <- correction_em(f_election_3_step_correct, election, nclass = n_c, seed = seed_rep[rep],
classification_error = class_err,sigma_init=0.1)
# Compute the log-likelihood of the full model
prior <- poLCA:::poLCA.updatePrior(fit_correct[[3]], x_election_3_step, R_election_3_step)
llik_3_step_corrected[rep] <- sum(log(rowSums(prior * poLCA:::poLCA.ylik.C(fit_unconditional[[3]],
y_election_3_step))))}
)[3]
7. Nested EM algorithm (one-step)
We now implement our nested EM algorithm for improved one-step estimation of latent class models with covariates. This routine is carefully described in Section 2.2 of our paper, and leverages the recently developed Pólya-Gamma data augmentation (Polson et al. 2013). The implementation requires the function nested_em() in the source file LCA-Covariates-Algorithms.R.
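As a reminder, the augmentation rests on the integral identity in Theorem 1 of Polson et al. (2013): for b > 0 and κ = a − b/2,

$$\frac{(e^{\psi})^{a}}{(1+e^{\psi})^{b}} = 2^{-b}\, e^{\kappa\psi} \int_{0}^{\infty} e^{-\omega\psi^{2}/2}\, p(\omega)\, d\omega,$$

where $p(\omega)$ is the density of a Pólya-Gamma PG(b, 0) random variable. Within each M-step, this identity turns the logistic part of the expected complete log-likelihood into a quadratic function of the β coefficients, which can be maximized in closed form.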
# Create the allocation matrices for the quantities to be monitored.
iter_NEM <- rep(0, Rep_Tot)
llik_NEM <- matrix(0, Rep_Tot, 1000)
llik_decrement_NEM <- matrix(0, Rep_Tot, 1000)
# Perform the algorithm.
time_NEM <- system.time(
for (rep in 1:Rep_Tot){
fit_NEM <- nested_em(f_election, election, nclass = n_c, seed = seed_rep[rep],sigma_init=sqrt(sigma))
iter_NEM[rep] <- fit_NEM[[1]]
llik_NEM[rep,] <- fit_NEM[[2]]
llik_decrement_NEM[rep,] <- fit_NEM[[3]]}
)[3]
8. Hybrid nested EM algorithm (one-step)
Here we consider a more efficient hybrid version of the nested EM algorithm, which reaches a neighborhood of the maximum using the more stable nested EM and then switches to Newton-Raphson methods to speed up convergence. This routine is described in Section 2.3 of our paper, and requires the function hybrid_em() in the source file LCA-Covariates-Algorithms.R.
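A minimal sketch of the switching logic, with placeholder names (theta_init, nested_em_step(), loglik(), converged(), newton_step()) that are not part of the source file; the actual criterion, tuned by the argument epsilon, is implemented inside hybrid_em():
# Hypothetical sketch: iterate the monotone nested EM updates until the
# log-likelihood increment falls below epsilon, then switch to the faster
# (but less stable) Newton-Raphson updates near the mode.
theta <- theta_init
llik_old <- -Inf
repeat {
  theta <- nested_em_step(theta)
  llik_new <- loglik(theta)
  if (llik_new - llik_old < epsilon) break
  llik_old <- llik_new
}
while (!converged(theta)) theta <- newton_step(theta)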
# Create the allocation matrices for the quantities to be monitored.
iter_HYB <- rep(0, Rep_Tot)
llik_HYB <- matrix(0, Rep_Tot, 1000)
llik_decrement_HYB <- matrix(0, Rep_Tot, 1000)
# Perform the algorithm.
time_HYB <- system.time(
for (rep in 1:Rep_Tot){
fit_HYB <- hybrid_em(f_election, election, nclass = n_c, seed = seed_rep[rep], epsilon = 0.01,sigma_init=sqrt(sigma))
iter_HYB[rep] <- fit_HYB[[1]]
llik_HYB[rep,] <- fit_HYB[[2]]
llik_decrement_HYB[rep,] <- fit_HYB[[3]]}
)[3]
Once the parameters have been estimated under the computational routines implemented above, we compare the maximization performance and the computational efficiency of the different algorithms. In particular, we consider the following quantities, computed for each run of every routine:
Maximization Performance
- Number of runs with a drop in the log-likelihood sequence.
- Number of runs converging to values which are not the maximum log-likelihood.
- For the runs reaching a local mode, we also compute the quartiles of the absolute difference between the log-likelihoods in the local modes and the maximum one. In the paper, only the median is discussed.
Computational Efficiency
- Quartiles of the number of iterations for convergence, computed only for the runs reaching the maximum. In the paper, only the median is discussed.
- Averaged computational time for each run.
Consistent with the above goal, let us create a function which computes the measures of performance from the output of the different algorithms.
performance_algo <- function(max_loglik, n_rep, loglik_seq, loglik_decay, n_iter, time, delta){
#----------------------------------------------------------------------------------------------
# Number of runs with a drop in the log-likelihood sequence.
#----------------------------------------------------------------------------------------------
n_drops <- 0
for (rep in 1:n_rep){
n_drops <- n_drops + (sum(loglik_decay[rep,1:n_iter[rep]]) > 0)*1}
#----------------------------------------------------------------------------------------------
# Number of runs converging to values which are not the maximum log-likelihood.
#----------------------------------------------------------------------------------------------
n_l_modes <- sum(abs(max_loglik - loglik_seq[cbind(1:n_rep,n_iter)]) > delta)
#----------------------------------------------------------------------------------------------
# Quartiles of the difference between log-likelihoods in local modes and the maximum one.
#----------------------------------------------------------------------------------------------
sel_modes <- which(abs(max_loglik - loglik_seq[cbind(1:n_rep,n_iter)]) > delta)
diff_llik <- quantile(abs(max_loglik - loglik_seq[cbind(1:n_rep,n_iter)])[sel_modes])[2:4]
#----------------------------------------------------------------------------------------------
# Quartiles of the number of iterations to reach convergence to maximum log-likelihood.
#----------------------------------------------------------------------------------------------
sel_convergence <- which(abs(max_loglik - loglik_seq[cbind(1:n_rep,n_iter)]) <= delta)
iter_convergence <- quantile(n_iter[sel_convergence])[2:4]
#----------------------------------------------------------------------------------------------
# Averaged computational time for each run.
#----------------------------------------------------------------------------------------------
averaged_time <- time/n_rep
#----------------------------------------------------------------------------------------------
#----------------------------------------------------------------------------------------------
# Create the vector with the measures to be saved.
output <- c(n_drops, n_l_modes, diff_llik, iter_convergence, averaged_time)
return(output)
}
In performing our assessments, let us first define the correct maximum log-likelihood max_llik, and a control quantity delta defining the minimum deviation from max_llik which is indicative of a local mode.
max_llik <- c(-11102.72)
delta <- 0.01
Finally, let us create a matrix Table_Performance which contains the performance measures for the different algorithms.
Table_Performance <- matrix(NA,9,8)
rownames(Table_Performance) <- c("N. Decays",
"N. Local Modes",
"Q1 Log-L in Local Modes",
"Q2 Log-L in Local Modes",
"Q3 Log-L in Local Modes",
"Q1 N. Iterat. Converge max(Log-L)",
"Q2 N. Iterat. Converge max(Log-L)",
"Q3 N. Iterat. Converge max(Log-L)",
"Averaged Time")
colnames(Table_Performance) <- c("NR EM","NR EM Q1","NR EM Q1 0.5","MM EM",
"CLASSIC. 3-STEP","CORREC. 3-STEP","NESTED EM","HYBRID EM")
We can now compute the different performance measures for our algorithms.
1. Performance EM algorithm with Newton-Raphson as in Bandeen-Roche et al. (1997)
Table_Performance[,1] <- performance_algo(max_llik, Rep_Tot, llik_NR_EM,
llik_decrement_NR_EM, iter_NR_EM,
time_NR_EM, delta)
2. Performance EM algorithm with Newton-Raphson as in Formann (1992) and Van der Heijden et al. (1996)
Table_Performance[,2] <- performance_algo(max_llik, Rep_Tot, llik_NR_EM_Q1,
llik_decrement_NR_EM_Q1, iter_NR_EM_Q1,
time_NR_EM_Q1, delta)
3. Performance Re-scaled EM algorithm with Newton-Raphson
Table_Performance[,3] <- performance_algo(max_llik, Rep_Tot, llik_NR_EM_Q1_0.5,
llik_decrement_NR_EM_Q1_0.5, iter_NR_EM_Q1_0.5,
time_NR_EM_Q1_0.5, delta)
4. Performance MM-EM algorithm based on the lower-bound routine in Bohning (1992)
Table_Performance[,4] <- performance_algo(max_llik, Rep_Tot, llik_MM,
llik_decrement_MM, iter_MM,
time_MM, delta)
5. Performance Classical 3-step algorithm
As discussed in the paper, since all the three-step runs converge systematically to local modes, we do not study the number of iterations to reach convergence: these routines never reach the maximum log-likelihood. The number of drops in the log-likelihood sequence is also of limited relevance for evaluating the three-step methods, since these routines are based on two separate maximizations in steps 1 and 3, which are not directly related to the full-model log-likelihood.
Table_Performance[,5] <- performance_algo(max_llik, Rep_Tot, llik_3_step_classical,
matrix(0,Rep_Tot,1000), rep(0,Rep_Tot),
time_3_step_classical, delta)
Table_Performance[1,5] <- NA
6. Performance Bias-corrected 3-step algorithm
Also in this case, all the runs converge systematically to local modes; therefore we do not study the number of iterations to reach convergence. As above, the number of drops in the log-likelihood sequence is of limited relevance for evaluating the three-step methods, since these routines are based on two separate maximizations in steps 1 and 3, which are not directly related to the full-model log-likelihood.
Table_Performance[,6] <- performance_algo(max_llik, Rep_Tot, llik_3_step_corrected,
matrix(0,Rep_Tot,1000), rep(0,Rep_Tot),
time_3_step_corrected, delta)
Table_Performance[1,6] <- NA
7. Performance nested EM algorithm
Table_Performance[,7] <- performance_algo(max_llik, Rep_Tot, llik_NEM, llik_decrement_NEM,
iter_NEM, time_NEM, delta)
8. Performance hybrid nested EM algorithm
Table_Performance[,8] <- performance_algo(max_llik, Rep_Tot, llik_HYB, llik_decrement_HYB,
iter_HYB, time_HYB, delta)
Let us finally visualize the performance table.
In particular, the maximization performance and the computational efficiency of the EM algorithm with one Newton-Raphson step, along with those of the re-scaled modifications, are:
library(knitr)
kable(Table_Performance[,1:4])
| | NR EM | NR EM Q1 | NR EM Q1 0.5 | MM EM |
|---|---|---|---|---|
| N. Decays | 48.00000 | 18.00000 | 4.00000 | 0.00000 |
| N. Local Modes | 60.00000 | 19.00000 | 3.00000 | 0.00000 |
| Q1 Log-L in Local Modes | 560.75513 | 778.58745 | 778.58745 | NA |
| Q2 Log-L in Local Modes | 645.78560 | 778.58745 | 778.58745 | NA |
| Q3 Log-L in Local Modes | 831.04045 | 1373.92438 | 778.58745 | NA |
| Q1 N. Iterat. Converge max(Log-L) | 81.25000 | 76.00000 | 99.00000 | 122.75000 |
| Q2 N. Iterat. Converge max(Log-L) | 104.00000 | 106.00000 | 122.00000 | 139.00000 |
| Q3 N. Iterat. Converge max(Log-L) | 108.00000 | 110.00000 | 126.00000 | 144.00000 |
| Averaged Time | 0.048 | 0.076 | 0.099 | 0.098 |
The maximization performance and the computational efficiency of the three-step estimation algorithms, along with those of our nested EM and its hybrid modification, are instead:
library(knitr)
kable(Table_Performance[,5:8])
| | CLASSIC. 3-STEP | CORREC. 3-STEP | NESTED EM | HYBRID EM |
|---|---|---|---|---|
| N. Decays | NA | NA | 0.00000 | 0.00000 |
| N. Local Modes | 100.00000 | 100.00000 | 0.00000 | 0.00000 |
| Q1 Log-L in Local Modes | 17.67188 | 12.19968 | NA | NA |
| Q2 Log-L in Local Modes | 17.67189 | 12.19970 | NA | NA |
| Q3 Log-L in Local Modes | 17.67189 | 12.19970 | NA | NA |
| Q1 N. Iterat. Converge max(Log-L) | NA | NA | 93.75000 | 90.00000 |
| Q2 N. Iterat. Converge max(Log-L) | NA | NA | 109.00000 | 106.00000 |
| Q3 N. Iterat. Converge max(Log-L) | NA | NA | 113.00000 | 110.25000 |
| Averaged Time | 0.074 | 0.063 | 0.107 | 0.095 |
We conclude the analysis by providing the code to obtain a plot which compares, for a selected run sel <- 8, the log-likelihood sequences obtained under the different algorithms for one-step estimation. Let us first load some useful libraries and choose the run on which to focus.
library(ggplot2)
library(reshape)
sel <- 8
We then create a dataset data_plot with five columns containing the log-likelihood sequences of the one-step maximization routines for the selected run.
data_plot <- cbind(llik_NR_EM[sel,],llik_NR_EM_Q1[sel,],llik_NR_EM_Q1_0.5[sel,],llik_MM[sel,],llik_NEM[sel,])
data_plot <- data_plot[c(1:max(iter_NR_EM[sel],iter_NR_EM_Q1[sel],iter_NR_EM_Q1_0.5[sel],iter_MM[sel],iter_NEM[sel])),]
for (m in which(apply(apply(data_plot,2,is.na),2,sum)>0)){
data_plot[which(is.na(data_plot[,m]))[1]:dim(data_plot)[1],m] <- data_plot[which(is.na(data_plot[,m]))[1]-1,m]}
data_plot[c((iter_NR_EM[sel] + 1): max(iter_NEM[sel],iter_NR_EM[sel])),2] <- data_plot[iter_NR_EM[sel],2]
data_plot <- rbind(rep(-20000,5),data_plot)
Finally we create the figure.
data_ggplot <- melt(data_plot)
data_ggplot <- as.data.frame(data_ggplot)
data_ggplot$app <- "ELECTION DATA: Two Latent Classes"
data_ggplot$col <- c(rep(1,dim(data_ggplot)[1]*3/5),rep(2,dim(data_ggplot)[1]*2/5))
data_ggplot$line <- c(rep(1,dim(data_ggplot)[1]/5),rep(4,dim(data_ggplot)[1]/5),rep(5,dim(data_ggplot)[1]/5),rep(2,dim(data_ggplot)[1]/5),rep(1,dim(data_ggplot)[1]/5))
plot <- ggplot(data = data_ggplot, aes(x = X1-1, y = value, group = X2)) +
geom_line(aes(color = as.factor(col),linetype=as.factor(line))) +
coord_cartesian(ylim = c(-17500, -11000), xlim = c(0, 15)) + theme_bw() +
labs(x = "iteration", y = expression(paste(l,"(", theta^{(t)}, ";",x,",",y,")"))) +
theme(legend.position="none", strip.text.x = element_text(size = 11, face = "bold")) +
facet_wrap(~ app)
plot
We now consider the case with R = 3 classes and reproduce the results in Table 1 of the paper: Durante, D., Canale, A. and Rigon, T. (2018). A nested expectation-maximization algorithm for latent class models with covariates [arXiv:1705.03864v5]. The implementation proceeds as before, with the only difference that now n_c <- 3 instead of n_c <- 2. Hence we summarize below only the code to perform maximization under the different routines. For a finer description of these routines, refer to the analysis above.
n_c <- 3
To provide a detailed computational assessment, we perform estimation under the different algorithms for Rep_Tot = 100 runs at varying initializations. For each run, the algorithms are all initialized at the same starting values, which are controlled by a seed (changing across the different runs). Let us, therefore, define these 100 seeds and the variance sigma of the zero-mean Gaussian variables from which the initial values of the β parameters are generated.
Rep_Tot <- 100
seed_rep <- c(1:100)
sigma <- 0.5
1. EM algorithm with Newton-Raphson as in Bandeen-Roche et al. (1997) (one-step)
# Create the allocation matrices for the quantities to be monitored.
iter_NR_EM <- rep(0, Rep_Tot)
llik_NR_EM <- matrix(0, Rep_Tot, 1000)
llik_decrement_NR_EM <- matrix(0, Rep_Tot, 1000)
# Perform the algorithm.
time_NR_EM <- system.time(
for (rep in 1:Rep_Tot){
fit_NR_EM <- newton_em(f_election, election, nclass = n_c, seed = seed_rep[rep],sigma_init=sqrt(sigma))
iter_NR_EM[rep] <- fit_NR_EM[[1]]
llik_NR_EM[rep,] <- fit_NR_EM[[2]]
llik_decrement_NR_EM[rep,] <- fit_NR_EM[[3]]}
)[3]
2. EM algorithm with Newton-Raphson as in Formann (1992) and Van der Heijden et al. (1996) (one-step)
# Create the allocation matrices for the quantities to be monitored.
iter_NR_EM_Q1 <- rep(0, Rep_Tot)
llik_NR_EM_Q1 <- matrix(0, Rep_Tot, 1000)
llik_decrement_NR_EM_Q1 <- matrix(0, Rep_Tot, 1000)
# Perform the algorithm.
time_NR_EM_Q1 <- system.time(
for (rep in 1:Rep_Tot){
fit_NR_EM_Q1 <- newton_em_Q1(f_election, election, nclass = n_c, seed = seed_rep[rep],sigma_init=sqrt(sigma))
iter_NR_EM_Q1[rep] <- fit_NR_EM_Q1[[1]]
llik_NR_EM_Q1[rep,] <- fit_NR_EM_Q1[[2]]
llik_decrement_NR_EM_Q1[rep,] <- fit_NR_EM_Q1[[3]]}
)[3]
3. Re-scaled EM algorithm with Newton-Raphson (one-step)
# Create the allocation matrices for the quantities to be monitored.
iter_NR_EM_Q1_0.5 <- rep(0, Rep_Tot)
llik_NR_EM_Q1_0.5 <- matrix(0, Rep_Tot, 1000)
llik_decrement_NR_EM_Q1_0.5 <- matrix(0, Rep_Tot, 1000)
# Perform the algorithm.
time_NR_EM_Q1_0.5 <- system.time(
for (rep in 1:Rep_Tot){
fit_NR_EM_Q1_0.5 <- newton_em_Q1(f_election, election, alpha=0.5,nclass = n_c, seed = seed_rep[rep],sigma_init=sqrt(sigma))
iter_NR_EM_Q1_0.5[rep] <- fit_NR_EM_Q1_0.5[[1]]
llik_NR_EM_Q1_0.5[rep,] <- fit_NR_EM_Q1_0.5[[2]]
llik_decrement_NR_EM_Q1_0.5[rep,] <- fit_NR_EM_Q1_0.5[[3]]}
)[3]
4. MM-EM algorithm based on the lower-bound routine in Bohning (1992) (one-step)
# Create the allocation matrices for the quantities to be monitored.
iter_MM <- rep(0, Rep_Tot)
llik_MM <- matrix(0, Rep_Tot, 1000)
llik_decrement_MM <- matrix(0, Rep_Tot, 1000)
# Perform the algorithm.
time_MM <- system.time(
for (rep in 1:Rep_Tot){
fit_MM <- MM_em(f_election, election, nclass = n_c, seed = seed_rep[rep],sigma_init=sqrt(sigma))
iter_MM[rep] <- fit_MM[[1]]
llik_MM[rep,] <- fit_MM[[2]]
llik_decrement_MM[rep,] <- fit_MM[[3]]}
)[3]
5. Classical 3-step algorithm
# Create the allocation matrix for the full--model log-likelihood sequence.
llik_3_step_classical <- rep(0, Rep_Tot)
# Define the useful quantities to compute the full--model log-likelihood sequence.
f_election_3_step <- cbind(MORALG, CARESG, KNOWG, LEADG, DISHONG, INTELG,
MORALB, CARESB, KNOWB, LEADB, DISHONB, INTELB) ~ PARTY
nclass = n_c
mframe_election_3_step <- model.frame(f_election_3_step, election)
y_election_3_step <- model.response(mframe_election_3_step)
x_election_3_step <- model.matrix(f_election_3_step, mframe_election_3_step)
R_election_3_step <- nclass
# Perform the three step algorithm.
time_3_step_classical <- system.time(
for (rep in 1:Rep_Tot){
#---------------------------------------------------------------------------------------------------
# 1] Estimate a latent class model without covariates
#---------------------------------------------------------------------------------------------------
f_election_unconditional <- cbind(MORALG, CARESG, KNOWG, LEADG, DISHONG, INTELG,
MORALB, CARESB, KNOWB, LEADB, DISHONB, INTELB) ~ 1
fit_unconditional <- unconditional_em(f_election_unconditional, election, nclass = n_c, seed = seed_rep[rep])
#---------------------------------------------------------------------------------------------------
# 2] Predict the latent class of each unit via modal assignment
#---------------------------------------------------------------------------------------------------
pred_class <- apply(fit_unconditional[[4]],1, which.max)
#---------------------------------------------------------------------------------------------------
# 3] Estimate the beta coefficients from a multinomial logit with the predicted classes as responses
#---------------------------------------------------------------------------------------------------
b <- c(t(summary(multinom(pred_class ~ election$PARTY, trace = FALSE))$coefficients))
# Compute the log-likelihood of the full model
prior <- poLCA:::poLCA.updatePrior(b, x_election_3_step, R_election_3_step)
llik_3_step_classical[rep] <- sum(log(rowSums(prior * poLCA:::poLCA.ylik.C(fit_unconditional[[3]],
y_election_3_step))))}
)[3]
6. Bias-corrected 3-step algorithm
# Create the allocation matrix for the full--model log-likelihood sequence.
llik_3_step_corrected <- rep(0, Rep_Tot)
# Define the useful quantities to compute the full--model log-likelihood sequence.
f_election_3_step <- cbind(MORALG, CARESG, KNOWG, LEADG, DISHONG, INTELG,
MORALB, CARESB, KNOWB, LEADB, DISHONB, INTELB) ~ PARTY
nclass = n_c
mframe_election_3_step <- model.frame(f_election_3_step, election)
y_election_3_step <- model.response(mframe_election_3_step)
x_election_3_step <- model.matrix(f_election_3_step, mframe_election_3_step)
R_election_3_step <- nclass
# Perform the three step algorithm.
time_3_step_corrected <- system.time(
for (rep in 1:Rep_Tot){
#---------------------------------------------------------------------------------------------------
# 1] Estimate a latent class model without covariates
#---------------------------------------------------------------------------------------------------
f_election_unconditional <- cbind(MORALG, CARESG, KNOWG, LEADG, DISHONG, INTELG,
MORALB, CARESB, KNOWB, LEADB, DISHONB, INTELB) ~ 1
fit_unconditional <- unconditional_em(f_election_unconditional, election, nclass = n_c, seed = seed_rep[rep])
#---------------------------------------------------------------------------------------------------
# 2] Predict the latent class of each unit via modal assignment and compute the classification error
#---------------------------------------------------------------------------------------------------
pred_class <- apply(fit_unconditional[[4]], 1, which.max)
class_err <- matrix(0, nclass, nclass)
rownames(class_err) <- paste("W", c(1:nclass), sep="")
colnames(class_err) <- paste("X", c(1:nclass), sep="")
for (r in 1:nclass){
class_err[,r] <- (t(dummy(pred_class))%*%as.matrix(fit_unconditional[[4]][,r],
dim(election)[1],1)/dim(election)[1])/fit_unconditional[[5]][1,r]}
class_err <- t(class_err)
#---------------------------------------------------------------------------------------------------
# 3] Estimate the beta coefficients from the correction procedure in Vermunt (2010)
#---------------------------------------------------------------------------------------------------
f_election_3_step_correct <- cbind(pred_class) ~ PARTY
fit_correct <- correction_em(f_election_3_step_correct, election, nclass = n_c, seed = seed_rep[rep],
classification_error = class_err,sigma_init=0.1)
# Compute the log-likelihood of the full model
prior <- poLCA:::poLCA.updatePrior(fit_correct[[3]], x_election_3_step, R_election_3_step)
llik_3_step_corrected[rep] <- sum(log(rowSums(prior * poLCA:::poLCA.ylik.C(fit_unconditional[[3]],
y_election_3_step))))}
)[3]
7. Nested EM algorithm (one-step)
# Create the allocation matrices for the quantities to be monitored.
iter_NEM <- rep(0, Rep_Tot)
llik_NEM <- matrix(0, Rep_Tot, 1000)
llik_decrement_NEM <- matrix(0, Rep_Tot, 1000)
# Perform the algorithm.
time_NEM <- system.time(
for (rep in 1:Rep_Tot){
fit_NEM <- nested_em(f_election, election, nclass = n_c, seed = seed_rep[rep],sigma_init=sqrt(sigma))
iter_NEM[rep] <- fit_NEM[[1]]
llik_NEM[rep,] <- fit_NEM[[2]]
llik_decrement_NEM[rep,] <- fit_NEM[[3]]}
)[3]
8. Hybrid nested EM algorithm (one-step)
# Create the allocation matrices for the quantities to be monitored.
iter_HYB <- rep(0, Rep_Tot)
llik_HYB <- matrix(0, Rep_Tot, 1000)
llik_decrement_HYB <- matrix(0, Rep_Tot, 1000)
# Perform the algorithm.
time_HYB <- system.time(
for (rep in 1:Rep_Tot){
fit_HYB <- hybrid_em(f_election, election, nclass = n_c, seed = seed_rep[rep], epsilon = 0.01,sigma_init=sqrt(sigma))
iter_HYB[rep] <- fit_HYB[[1]]
llik_HYB[rep,] <- fit_HYB[[2]]
llik_decrement_HYB[rep,] <- fit_HYB[[3]]}
)[3]
As before, let us create a function which computes the measures of performance from the output of the different algorithms.
performance_algo <- function(max_loglik, n_rep, loglik_seq, loglik_decay, n_iter, time, delta){
#----------------------------------------------------------------------------------------------
# Number of runs with a drop in the log-likelihood sequence.
#----------------------------------------------------------------------------------------------
n_drops <- 0
for (rep in 1:n_rep){
n_drops <- n_drops + (sum(loglik_decay[rep,1:n_iter[rep]]) > 0)*1}
#----------------------------------------------------------------------------------------------
# Number of runs converging to values which are not the maximum log-likelihood.
#----------------------------------------------------------------------------------------------
n_l_modes <- sum(abs(max_loglik - loglik_seq[cbind(1:n_rep,n_iter)]) > delta)
#----------------------------------------------------------------------------------------------
# Quartiles of the difference between log-likelihoods in local modes and the maximum one.
#----------------------------------------------------------------------------------------------
sel_modes <- which(abs(max_loglik - loglik_seq[cbind(1:n_rep,n_iter)]) > delta)
diff_llik <- quantile(abs(max_loglik - loglik_seq[cbind(1:n_rep,n_iter)])[sel_modes])[2:4]
#----------------------------------------------------------------------------------------------
# Quartiles of the number of iterations to reach convergence to maximum log-likelihood.
#----------------------------------------------------------------------------------------------
sel_convergence <- which(abs(max_loglik - loglik_seq[cbind(1:n_rep,n_iter)]) <= delta)
iter_convergence <- quantile(n_iter[sel_convergence])[2:4]
#----------------------------------------------------------------------------------------------
# Averaged computational time for each run.
#----------------------------------------------------------------------------------------------
averaged_time <- time/n_rep
#----------------------------------------------------------------------------------------------
#----------------------------------------------------------------------------------------------
# Create the vector with the measures to be saved.
output <- c(n_drops, n_l_modes, diff_llik, iter_convergence, averaged_time)
return(output)
}
In reproducing (and extending) the results in Table 1, let us first define the correct maximum log-likelihood max_llik, and a control quantity delta defining the minimum deviation from max_llik which is indicative of a local mode.
max_llik <- c(-10670.94)
delta <- 0.01
Finally, let us create a matrix Table_Performance which contains the performance measures for the different algorithms.
Table_Performance <- matrix(NA,9,8)
rownames(Table_Performance) <- c("N. Decays",
"N. Local Modes",
"Q1 Log-L in Local Modes",
"Q2 Log-L in Local Modes",
"Q3 Log-L in Local Modes",
"Q1 N. Iterat. Converge max(Log-L)",
"Q2 N. Iterat. Converge max(Log-L)",
"Q3 N. Iterat. Converge max(Log-L)",
"Averaged Time")
colnames(Table_Performance) <- c("NR EM","NR EM Q1","NR EM Q1 0.5","MM EM",
"CLASSIC. 3-STEP","CORREC. 3-STEP","NESTED EM","HYBRID EM")
We can now compute the different performance measures for our algorithms.
1. Performance EM algorithm with Newton-Raphson as in Bandeen-Roche et al. (1997)
Table_Performance[,1] <- performance_algo(max_llik, Rep_Tot, llik_NR_EM,
llik_decrement_NR_EM, iter_NR_EM,
time_NR_EM, delta)
2. Performance EM algorithm with Newton-Raphson as in Formann (1992) and Van der Heijden et al. (1996)
Table_Performance[,2] <- performance_algo(max_llik, Rep_Tot, llik_NR_EM_Q1,
llik_decrement_NR_EM_Q1, iter_NR_EM_Q1,
time_NR_EM_Q1, delta)
3. Performance Re-scaled EM algorithm with Newton-Raphson
Table_Performance[,3] <- performance_algo(max_llik, Rep_Tot, llik_NR_EM_Q1_0.5,
llik_decrement_NR_EM_Q1_0.5, iter_NR_EM_Q1_0.5,
time_NR_EM_Q1_0.5, delta)
4. Performance MM-EM algorithm based on the lower-bound routine in Bohning (1992)
Table_Performance[,4] <- performance_algo(max_llik, Rep_Tot, llik_MM,
llik_decrement_MM, iter_MM,
time_MM, delta)
5. Performance Classical 3-step algorithm
As discussed in the paper, since all the three-step runs converge systematically to local modes, we do not study the number of iterations to reach convergence: these routines never reach the maximum log-likelihood. The number of drops in the log-likelihood sequence is also of limited relevance for evaluating the three-step methods, since these routines are based on two separate maximizations in steps 1 and 3, which are not directly related to the full-model log-likelihood.
Table_Performance[,5] <- performance_algo(max_llik, Rep_Tot, llik_3_step_classical,
matrix(0,Rep_Tot,1000), rep(0,Rep_Tot),
time_3_step_classical, delta)
Table_Performance[1,5] <- NA
6. Performance Bias-corrected 3-step algorithm
Also in this case, all the runs converge systematically to local modes; therefore we do not study the number of iterations to reach convergence. As above, the number of drops in the log-likelihood sequence is of limited relevance for evaluating the three-step methods, since these routines are based on two separate maximizations in steps 1 and 3, which are not directly related to the full-model log-likelihood.
Table_Performance[,6] <- performance_algo(max_llik, Rep_Tot, llik_3_step_corrected,
matrix(0,Rep_Tot,1000), rep(0,Rep_Tot),
time_3_step_corrected, delta)
Table_Performance[1,6] <- NA
7. Performance nested EM algorithm
Table_Performance[,7] <- performance_algo(max_llik, Rep_Tot, llik_NEM, llik_decrement_NEM,
iter_NEM, time_NEM, delta)
8. Performance hybrid nested EM algorithm
Table_Performance[,8] <- performance_algo(max_llik, Rep_Tot, llik_HYB, llik_decrement_HYB,
iter_HYB, time_HYB, delta)
Let us finally visualize the performance table, which reproduces (and extends) Table 1 in the paper: Durante, D., Canale, A. and Rigon, T. (2018). A nested expectation-maximization algorithm for latent class models with covariates [arXiv:1705.03864v5].
In particular, the maximization performance and the computational efficiency of the EM algorithm with one Newton-Raphson step, along with those of the re-scaled modifications, are:
library(knitr)
kable(Table_Performance[,1:4])
| | NR EM | NR EM Q1 | NR EM Q1 0.5 | MM EM |
|---|---|---|---|---|
| N. Decays | 78.00000 | 37.0000 | 12.0000000 | 0.0000000 |
| N. Local Modes | 94.00000 | 51.0000 | 24.0000000 | 27.0000000 |
| Q1 Log-L in Local Modes | 688.22053 | 1081.7374 | 0.1070165 | 0.3757734 |
| Q2 Log-L in Local Modes | 830.04608 | 1105.6276 | 4.8945293 | 0.6445302 |
| Q3 Log-L in Local Modes | 1092.57963 | 1805.7044 | 1082.1905381 | 3.4021476 |
| Q1 N. Iterat. Converge max(Log-L) | 135.75000 | 127.0000 | 128.0000000 | 179.0000000 |
| Q2 N. Iterat. Converge max(Log-L) | 146.00000 | 151.0000 | 161.0000000 | 229.0000000 |
| Q3 N. Iterat. Converge max(Log-L) | 151.00000 | 192.0000 | 224.5000000 | 307.0000000 |
| Averaged Time | 0.073 | 0.182 | 0.230 | 0.283 |
The maximization performance and the computational efficiency of the three-step estimation algorithms, along with those of our nested EM and its hybrid modification, are instead:
library(knitr)
kable(Table_Performance[,5:8])
| | CLASSIC. 3-STEP | CORREC. 3-STEP | NESTED EM | HYBRID EM |
|---|---|---|---|---|
| N. Decays | NA | NA | 0.0000000 | 0.0000000 |
| N. Local Modes | 100.00000 | 100.00000 | 24.0000000 | 25.0000000 |
| Q1 Log-L in Local Modes | 42.22420 | 39.10380 | 0.1070165 | 0.1070165 |
| Q2 Log-L in Local Modes | 42.22421 | 39.10381 | 0.6445302 | 0.6445302 |
| Q3 Log-L in Local Modes | 42.23407 | 39.10381 | 0.6445302 | 0.6445302 |
| Q1 N. Iterat. Converge max(Log-L) | NA | NA | 126.7500000 | 117.5000000 |
| Q2 N. Iterat. Converge max(Log-L) | NA | NA | 171.0000000 | 166.0000000 |
| Q3 N. Iterat. Converge max(Log-L) | NA | NA | 210.2500000 | 205.5000000 |
| Averaged Time | 0.229 | 0.212 | 0.359 | 0.265 |
Also here, we conclude the analysis by providing the code to obtain a plot which compares, for a selected run sel <- 72, the log-likelihood sequences obtained under the different algorithms for one-step estimation. Let us first load some useful libraries and choose the run on which to focus.
library(ggplot2)
library(reshape)
sel <- 72
We then create a dataset data_plot with five columns containing the log-likelihood sequences of the one-step maximization routines for the selected run.
data_plot <- cbind(llik_NR_EM[sel,],llik_NR_EM_Q1[sel,],llik_NR_EM_Q1_0.5[sel,],llik_MM[sel,],llik_NEM[sel,])
data_plot <- data_plot[c(1:max(iter_NR_EM[sel],iter_NR_EM_Q1[sel],iter_NR_EM_Q1_0.5[sel],iter_MM[sel],iter_NEM[sel])),]
for (m in which(apply(apply(data_plot,2,is.na),2,sum)>0)){
data_plot[which(is.na(data_plot[,m]))[1]:dim(data_plot)[1],m] <- data_plot[which(is.na(data_plot[,m]))[1]-1,m]}
data_plot[c((iter_NR_EM[sel] + 1): max(iter_NEM[sel],iter_NR_EM[sel])),2] <- data_plot[iter_NR_EM[sel],2]
data_plot <- rbind(rep(-20000,5),data_plot)
Finally we create the figure.
data_ggplot <- melt(data_plot)
data_ggplot <- as.data.frame(data_ggplot)
data_ggplot$app <- "ELECTION DATA: Three Latent Classes"
data_ggplot$col <- c(rep(1,dim(data_ggplot)[1]*3/5),rep(2,dim(data_ggplot)[1]*2/5))
data_ggplot$line <- c(rep(1,dim(data_ggplot)[1]/5),rep(4,dim(data_ggplot)[1]/5),rep(5,dim(data_ggplot)[1]/5),rep(2,dim(data_ggplot)[1]/5),rep(1,dim(data_ggplot)[1]/5))
plot <- ggplot(data = data_ggplot, aes(x = X1-1, y = value, group = X2)) +
geom_line(aes(color = as.factor(col),linetype=as.factor(line))) +
coord_cartesian(ylim = c(-19500, -10500), xlim = c(0, 15)) + theme_bw() +
labs(x = "iteration", y = expression(paste(l,"(", theta^{(t)}, ";",x,",",y,")"))) +
theme(legend.position="none", strip.text.x = element_text(size = 11, face = "bold")) +
facet_wrap(~ app)
plot