<img src="https://github.com/IKNL/guidelines/blob/master/resources/logos/iknl_nl.png?raw=true" width=200 align="right">

# `validation`
In this notebook, we validate our proposed federated GLM by comparing its results against those of `R`'s `glm()` fitted on the pooled data (i.e., the centralized analysis).

The families that are shown in this notebook are: 

* `gaussian(link = "identity")`: Linear regression
* `poisson(link = "log")`: Poisson regression
* `binomial(link = "logit")`: Logistic regression
* `rs.poi`: Custom GLM relative survival model with Poisson error


## Requirements
To run this notebook, make sure that the following packages are installed:

* `vtg` - Basic `vantage6` tools. These will allow you to use the mock client. They can be installed with the command:

   `devtools::install_github('IKNL/vtg')`
   
   
   
* `vtg.glm` - The actual federated GLM algorithm. It can be installed with the command:

   `devtools::install_github(repo='IKNL/vantage6-algorithms', ref='glm', subdir='models/glm/src')`
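
Before running the rest of the notebook, it can help to confirm that the packages are actually available. The snippet below is a minimal sketch: the package names are the ones used in this notebook (`vtg`, `vtg.glm`, `stargazer`, `jsonlite`), and it only reports what is missing without installing anything.

```r
# Packages this notebook relies on (names taken from the text and code cells).
required <- c("vtg", "vtg.glm", "stargazer", "jsonlite")

# requireNamespace() returns FALSE instead of raising an error
# when a package is not installed.
is_installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- required[!is_installed]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are available.")
}
```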

## Preliminaries

In [1]:
# Imports
library('stargazer')


Please cite as: 


 Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.

 R package version 5.2.3. https://CRAN.R-project.org/package=stargazer 




## Data
The `.csv` files for the data can be generated by running the accompanying (Python) script `create_data.py`. Each family (i.e., model type) has its own data. In all cases, they represent a horizontally-partitioned scenario (i.e., all parties hold the same features for different individuals).
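
To make the horizontally-partitioned setting concrete, the sketch below builds a small simulated data set and splits its rows over three parties. It does not reproduce `create_data.py`; the sample size, the coefficients, and the name `toy_parties` are assumptions chosen only for illustration.

```r
set.seed(42)

# Simulated pooled data set: 300 individuals, the same three columns for everyone.
n <- 300
pooled <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
pooled$y <- 0.1 + 0.2 * pooled$x1 + 0.5 * pooled$x2 + rnorm(n)

# Assign every row (individual) to one of three hypothetical parties.
party <- sample(rep(1:3, length.out = n))
toy_parties <- unname(split(pooled, party))

# Each party holds the same columns but a disjoint set of rows.
sapply(toy_parties, ncol)
sapply(toy_parties, nrow)

# Stacking the parties back together recovers every row of the pooled data.
toy_combined <- do.call(rbind.data.frame, toy_parties)
nrow(toy_combined) == n
```

The last step mirrors how `datasets_combined` is built for the centralized analyses in this notebook.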

## Validation

#### The `validate()` function
This function compares the centralized and federated results. The most important values (coefficients, standard errors, $p$-values, and $z$-values, or $t$-values for the Gaussian family) are printed to the console using `stargazer`. Additionally, it checks whether other values (such as convergence, formula, family, etc.) are equal between the two results (shown as `TRUE`) as a sanity check.

In [2]:
validate <- function(results_centralized, results_federated_plain){
  
  results_centralized_summary <- summary.glm(results_centralized)
  results_federated <- vtg.glm::as.GLM(results_federated_plain)
  
  precision <- 10
  options(digits=precision)
  my_round <- function(x) {round(x, digits = precision)}
  
  validation_result <- list()
    
  # Coefficients
  coeffs_centralized = sapply(results_centralized$coefficients, my_round, USE.NAMES = TRUE)    
  coeffs_federated = sapply(results_federated_plain$coefficients, my_round, USE.NAMES = TRUE)
  
  df = data.frame(coeff_c = c(coeffs_centralized),
                  coeff_f = c(coeffs_federated))  
    
  # Standard error
  stderr_centralized = sapply(results_centralized_summary$coefficients[, 'Std. Error'], my_round, USE.NAMES = TRUE)
  stderr_federated = sapply(results_federated_plain$Std.Error, my_round, USE.NAMES = TRUE)
    
  df['stderr_c'] <- c(stderr_centralized)
  df['stderr_f'] <- c(stderr_federated)
  

  if(results_centralized_summary$family$family %in% c('poisson', 'binomial')) {letter='z'} 
  else if(results_centralized_summary$family$family=='gaussian') {letter='t'}
  
  # p-value
  pval_centralized = sapply(results_centralized_summary$coefficients[, sprintf("Pr(>|%s|)", letter)], my_round, USE.NAMES = TRUE)
  pval_federated = sapply(results_federated_plain$pvalue, my_round, USE.NAMES = TRUE)
   
  df['pvalues_c'] <- c(pval_centralized)
  df['pvalues_f'] <- c(pval_federated)
    
    
  # z-value
  zval_centralized = sapply(results_centralized_summary$coefficients[, sprintf("%s value", letter)], my_round, USE.NAMES = TRUE)
  zval_federated = sapply(results_federated_plain$zvalue, my_round, USE.NAMES = TRUE)
  
  df['zvalues_c'] <- c(zval_centralized)
  df['zvalues_f'] <- c(zval_federated)
    
  print("LaTeX")
  stargazer(df, type="latex", summary=FALSE)
  print("Text")
  stargazer(df, type="text", summary=FALSE)
    
    
  # Converged
  validation_result['converged'] = results_federated_plain$converged==results_centralized$converged
  print(validation_result['converged'])
    
  # Dispersion
  FL_dispersion = sapply(results_federated_plain$dispersion, my_round, USE.NAMES = TRUE)
  central_dispersion = sapply(results_centralized_summary$dispersion, my_round, USE.NAMES = TRUE)
  validation_result['dispersion'] = FL_dispersion==central_dispersion
  
  # Formula
  validation_result['formula'] = results_federated_plain$formula==results_centralized$formula
  
  # Family
  FL_family = jsonlite::toJSON(results_federated_plain$family, auto_unbox = TRUE, force = T)
  central_family = jsonlite::toJSON(results_centralized$family, auto_unbox = TRUE, force = T)
  validation_result['family'] = FL_family==central_family
  
  # null.deviance
  FL_nulldeviance = sapply(results_federated_plain$null.deviance, my_round, USE.NAMES = TRUE)
  central_nulldeviance = sapply(results_centralized$null.deviance, my_round, USE.NAMES = TRUE)
  validation_result['null.deviance'] = FL_nulldeviance==central_nulldeviance
  
  # DoF
  FL_dof = sapply(results_federated$df.residual, my_round, USE.NAMES = TRUE)
  central_dof = sapply(results_centralized$df.residual, my_round, USE.NAMES = TRUE)
  validation_result['df.residual'] = FL_dof==central_dof
  
  # null.dof
  FL_null = sapply(results_federated$df.null, my_round, USE.NAMES = TRUE)
  central_null = sapply(results_centralized$df.null, my_round, USE.NAMES = TRUE)
  validation_result['df.null'] = FL_null==central_null
    
  validation_result
}
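
`validate()` rounds both results to 10 decimals and then compares them with `==`. Exact equality after rounding can still fail when two values that differ by a tiny amount fall on opposite sides of a rounding boundary. One possible alternative, sketched below, is a tolerance-based comparison with `all.equal()`. The helper name `nearly_equal` and the tolerance are assumptions for illustration and are not part of `vtg.glm`.

```r
# Hypothetical helper: TRUE when two numeric vectors agree within a
# relative tolerance (names are dropped so only the values are compared).
nearly_equal <- function(a, b, tolerance = 1e-8) {
  isTRUE(all.equal(unname(a), unname(b), tolerance = tolerance))
}

# Toy values standing in for a centralized and a federated estimate.
central <- c(0.12345678905, 1.5)
federated <- central + c(1e-13, 0)

# Rounding followed by exact comparison, as validate() does.
round(central, 10) == round(federated, 10)

# Tolerance-based comparison of the same values.
nearly_equal(central, federated)
```

With this approach, the tolerance states explicitly how much disagreement between the centralized and federated fits is considered acceptable.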

### Linear
#### Centralized analysis

In [3]:
# Housekeeping
if(exists('results_centralized')){
    rm(results_centralized)
}
if(exists('results_federated_plain')){
    rm(results_federated_plain)
}

datasets <- list(
  read.csv('../data/linear_party1.csv'),
  read.csv('../data/linear_party2.csv'),
  read.csv('../data/linear_party3.csv')
)
datasets_combined <- do.call(rbind.data.frame, datasets)

results_centralized <- glm(data=datasets_combined, formula = y ~ x1 + x2, family=gaussian(link = 'identity'))

#### Federated analysis
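
The log of the federated run alternates between node steps and master steps. A common way to fit a GLM on horizontally-partitioned data is iteratively reweighted least squares (IRLS) in which each node computes its local cross-products and the master sums them and solves for the coefficients. The sketch below illustrates that general idea for a Poisson model on simulated data. It is not the `vtg.glm` implementation; `make_party`, `node_step`, and `master_step` are hypothetical names, and the data are made up.

```r
set.seed(1)

# Simulated data for three hypothetical parties.
make_party <- function(n) {
  x1 <- rnorm(n)
  x2 <- rnorm(n)
  data.frame(x1 = x1, x2 = x2, y = rpois(n, exp(0.5 + 0.3 * x1 + 0.4 * x2)))
}
parties <- list(make_party(100), make_party(150), make_party(120))

# Node step (hypothetical): local IRLS cross-products for the current beta.
node_step <- function(data, beta) {
  X <- model.matrix(y ~ x1 + x2, data)
  eta <- as.vector(X %*% beta)
  mu <- exp(eta)                    # inverse of the log link
  w <- mu                           # IRLS weights for Poisson with log link
  z <- eta + (data$y - mu) / mu     # working response
  list(xtwx = crossprod(X, w * X), xtwz = crossprod(X, w * z))
}

# Master step (hypothetical): sum the node contributions and solve for beta.
master_step <- function(parts) {
  xtwx <- Reduce(`+`, lapply(parts, `[[`, "xtwx"))
  xtwz <- Reduce(`+`, lapply(parts, `[[`, "xtwz"))
  as.vector(solve(xtwx, xtwz))
}

beta <- c(0, 0, 0)
for (iter in 1:25) {
  beta_new <- master_step(lapply(parties, node_step, beta = beta))
  converged <- max(abs(beta_new - beta)) < 1e-10
  beta <- beta_new
  if (converged) break
}

# Reference fit on the pooled data.
pooled <- do.call(rbind.data.frame, parties)
fit <- glm(y ~ x1 + x2, data = pooled, family = poisson(link = "log"))

# Largest absolute difference between the two sets of coefficients.
max(abs(beta - coef(fit)))
```

Only the aggregated matrices leave each node in this sketch, never the individual rows, which is what makes the scheme suitable for a federated setting.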

In [4]:
client <- vtg::MockClient$new(datasets, 'vtg.glm')
results_federated_plain <- vtg.glm::dglm(client, formula = y ~ x1 + x2, family=gaussian(link = 'identity'), tol=1e-08, maxit=25)

-----------------------------------------
  Welcome to the vantage6 Infrastructure
-----------------------------------------




INFO  [11:56:30.386]  
INFO  [11:56:30.416] ############################################### 
INFO  [11:56:30.425] # Starting iteration 0 
INFO  [11:56:30.427] ############################################### 
INFO  [11:56:30.429]  
INFO  [11:56:30.431] 0.1 - RPC Node Beta 
[90mDEBUG[39m [11:56:30.434] ** Mocking call to "node_beta" ** 
[90mDEBUG[39m [11:56:30.437] Regular call 
[90mDEBUG[39m [11:56:30.439] there are 3 datasets ..  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:30.468] Log for site 1: 
[90mDEBUG[39m [11:56:30.471] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:30.447] Initializing node beta... 
[90mDEBUG[39m [11:56:30.454] First iteration. Initializing variables. 
[90mDEBUG[39m [11:56:30.457] Calculating the Betas.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:30.486] Log for site 2: 
[90mDEBUG[39m [11:56:30.490] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:30.473] Initializing node beta... 
[90mDEBUG[39m [11:56:30.479] First iteration. Initializing variables. 
[90mDEBUG[39m [11:56:30.483] Calculating the Betas.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:30.506] Log for site 3: 
[90mDEBUG[39m [11:56:30.509] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:30.494] Initializing node beta... 
[90mDEBUG[39m [11:56:30.501] First iteration. Initializing variables. 
[90mDEBUG[39m [11:56:30.504] Calculating the Betas.  
[90mDEBUG[39m [11:56:30.512]   - [DONE] 
INFO  [11:56:30.514] 0.2 - Master beta 
[90mDEBUG[39m [11:56:30.517] Initializing master Beta... 
[90mDEBUG[39m [11:56:30.520] Merging node calculation to update new Betas. 
[90mDEBUG[39m [11:56:30.522] Updating the Betas. 
[90mDEBUG[39m [11:56:30.526]   - [DONE] 
INFO  [11:56:30.529] 0.3 - RPC Node Deviance 
[90mDEBUG[39m [11:56:30.539] ** Mocking call to "node_deviance" ** 
[90mDEBUG[39m [11:56:30.569] Regular call 
[90mDEBUG[39m [11:56:30.572] there are 3 datasets ..  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:30.586] Log for site 1: 
[90mDEBUG[39m [11:56:30.589] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:30.575] Starting node deviance. 
[90mDEBUG[39m [11:56:30.581] First iteration. Initializing variables. 
[90mDEBUG[39m [11:56:30.584] Updating the variables for node deviance.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:30.603] Log for site 2: 
[90mDEBUG[39m [11:56:30.605] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:30.592] Starting node deviance. 
[90mDEBUG[39m [11:56:30.597] First iteration. Initializing variables. 
[90mDEBUG[39m [11:56:30.600] Updating the variables for node deviance.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:30.621] Log for site 3: 
[90mDEBUG[39m [11:56:30.624] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:30.608] Starting node deviance. 
[90mDEBUG[39m [11:56:30.617] First iteration. Initializing variables. 
[90mDEBUG[39m [11:56:30.619] Updating the variables for node deviance.  
[90mDEBUG[39m [11:56:30.630]   - [DONE] 
INFO  [11:56:30.632] 0.4 - Master deviance 
[90mDEBUG[39m [11:56:30.635] Starting master deviance. 
[90mDEBUG[39m [11:56:30.637]   - [DONE] 
INFO  [11:56:30.640] 0.5 - Termination conditions 
[90mDEBUG[39m [11:56:30.643]   - [DONE] 
INFO  [11:56:30.646]  
INFO  [11:56:30.648] ############################################### 
INFO  [11:56:30.653] # Starting iteration 1 
INFO  [11:56:30.655] ############################################### 
INFO  [11:56:30.658]  
INFO  [11:56:30.660] 1.1 - RPC Node Beta 
[90mDEBUG[39m [11:56:30.663] ** Mocking call to "node_beta" ** 
[90mDEBUG[39m [11:56:30.666] R

"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:30.683] Log for site 1: 
[90mDEBUG[39m [11:56:30.685] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:30.671] Initializing node beta... 
[90mDEBUG[39m [11:56:30.679] Calculating the Betas.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:30.696] Log for site 2: 
[90mDEBUG[39m [11:56:30.699] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:30.688] Initializing node beta... 
[90mDEBUG[39m [11:56:30.693] Calculating the Betas.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:30.711] Log for site 3: 
[90mDEBUG[39m [11:56:30.716] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:30.701] Initializing node beta... 
[90mDEBUG[39m [11:56:30.708] Calculating the Betas.  
[90mDEBUG[39m [11:56:30.718]   - [DONE] 
INFO  [11:56:30.720] 1.2 - Master beta 
[90mDEBUG[39m [11:56:30.723] Initializing master Beta... 
[90mDEBUG[39m [11:56:30.725] Merging node calculation to update new Betas. 
[90mDEBUG[39m [11:56:30.728] Updating the Betas. 
[90mDEBUG[39m [11:56:30.732]   - [DONE] 
INFO  [11:56:30.735] 1.3 - RPC Node Deviance 
[90mDEBUG[39m [11:56:30.738] ** Mocking call to "node_deviance" ** 
[90mDEBUG[39m [11:56:30.741] Regular call 
[90mDEBUG[39m [11:56:30.744] there are 3 datasets ..  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:30.755] Log for site 1: 
[90mDEBUG[39m [11:56:30.757] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:30.747] Starting node deviance. 
[90mDEBUG[39m [11:56:30.752] Updating the variables for node deviance.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:30.773] Log for site 2: 
[90mDEBUG[39m [11:56:30.775] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:30.760] Starting node deviance. 
[90mDEBUG[39m [11:56:30.770] Updating the variables for node deviance.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:30.786] Log for site 3: 
[90mDEBUG[39m [11:56:30.789] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:30.778] Starting node deviance. 
[90mDEBUG[39m [11:56:30.784] Updating the variables for node deviance.  
[90mDEBUG[39m [11:56:30.791]   - [DONE] 
INFO  [11:56:30.794] 1.4 - Master deviance 
[90mDEBUG[39m [11:56:30.796] Starting master deviance. 
[90mDEBUG[39m [11:56:30.799]   - [DONE] 
INFO  [11:56:30.801] 1.5 - Termination conditions 
[90mDEBUG[39m [11:56:30.804]   - [CONVERGED] 
[90mDEBUG[39m [11:56:30.806] Preparing output 


#### Comparison

In [5]:
val_linear_res = validate(results_centralized, results_federated_plain)
print(val_linear_res)

[1] "LaTeX"

% Table created by stargazer v.5.2.3 by Marek Hlavac, Social Policy Institute. E-mail: marek.hlavac at gmail.com
% Date and time: Thu, May 05, 2022 - 11:56:30
\begin{table}[!htbp] \centering 
  \caption{} 
  \label{} 
\begin{tabular}{@{\extracolsep{5pt}} ccccccccc} 
\\[-1.8ex]\hline 
\hline \\[-1.8ex] 
 & coeff\_c & coeff\_f & stderr\_c & stderr\_f & pvalues\_c & pvalues\_f & zvalues\_c & zvalues\_f \\ 
\hline \\[-1.8ex] 
(Intercept) & $0.069$ & $0.069$ & $0.045$ & $0.045$ & $0.129$ & $0.129$ & $1.520$ & $1.520$ \\ 
x1 & $0.221$ & $0.221$ & $0.017$ & $0.017$ & $0$ & $0$ & $12.763$ & $12.763$ \\ 
x2 & $0.499$ & $0.499$ & $0.018$ & $0.018$ & $0$ & $0$ & $27.212$ & $27.212$ \\ 
\hline \\[-1.8ex] 
\end{tabular} 
\end{table} 
[1] "Text"

            coeff_c coeff_f stderr_c stderr_f pvalues_c pvalues_f zvalues_c zvalues_f
-------------------------------------------------------------------------------------
(Intercept)  0.069   0.069   0.045    0.045     0.129     0.129     1.52

### Poisson
#### Centralized

In [9]:
# Housekeeping
if(exists('results_centralized')){
    rm(results_centralized)
}
if(exists('results_federated_plain')){
    rm(results_federated_plain)
}

datasets <- list(
  read.csv('../data/poisson_party1.csv'),
  read.csv('../data/poisson_party2.csv'),
  read.csv('../data/poisson_party3.csv')
)
datasets_combined <- do.call(rbind.data.frame, datasets)

results_centralized <- glm(data=datasets_combined, formula = y ~ x1 + x2, family=poisson(link = "log"))

#### Federated
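
The log of this run goes through iterations 0 to 4, whereas the linear run stopped at iteration 1. With a Gaussian family and identity link, the IRLS weights are constant and the working response equals `y`, so the first update is already the least-squares solution. For the Poisson family, the weights depend on the current estimates, so several updates are needed. The sketch below shows the same pattern with the centralized `glm()` on simulated data. The data and coefficients are assumptions for illustration, and whether `tol` in `dglm()` has exactly the same meaning as `epsilon` in `glm.control()` is also an assumption.

```r
set.seed(7)
n <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)

# Toy responses; the coefficients are arbitrary illustration values.
y_gauss <- 0.1 + 0.2 * x1 + 0.5 * x2 + rnorm(n)
y_pois <- rpois(n, exp(0.6 + 0.3 * x1 + 0.4 * x2))

# Same tolerance and iteration cap as the dglm() calls in this notebook.
ctrl <- glm.control(epsilon = 1e-8, maxit = 25)
fit_gauss <- glm(y_gauss ~ x1 + x2, family = gaussian(link = "identity"), control = ctrl)
fit_pois <- glm(y_pois ~ x1 + x2, family = poisson(link = "log"), control = ctrl)

# Number of Fisher scoring iterations used by each fit.
c(gaussian = fit_gauss$iter, poisson = fit_pois$iter)
```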

In [10]:
client <- vtg::MockClient$new(datasets, "vtg.glm")
results_federated_plain <- vtg.glm::dglm(client, formula = y ~ x1 + x2, family=poisson(link = "log"), tol=1e-08, maxit=25)

[90mDEBUG[39m [11:56:32.462] Initializing... 
INFO  [11:56:32.465]  
INFO  [11:56:32.467] ############################################### 
INFO  [11:56:32.470] # Starting iteration 0 
INFO  [11:56:32.472] ############################################### 
INFO  [11:56:32.475]  
INFO  [11:56:32.478] 0.1 - RPC Node Beta 
[90mDEBUG[39m [11:56:32.480] ** Mocking call to "node_beta" ** 
[90mDEBUG[39m [11:56:32.483] Regular call 
[90mDEBUG[39m [11:56:32.487] there are 3 datasets ..  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.502] Log for site 1: 
[90mDEBUG[39m [11:56:32.505] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:32.490] Initializing node beta... 
[90mDEBUG[39m [11:56:32.497] First iteration. Initializing variables. 
[90mDEBUG[39m [11:56:32.500] Calculating the Betas.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.522] Log for site 2: 
[90mDEBUG[39m [11:56:32.525] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:32.507] Initializing node beta... 
[90mDEBUG[39m [11:56:32.513] First iteration. Initializing variables. 
[90mDEBUG[39m [11:56:32.515] Calculating the Betas.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.539] Log for site 3: 
[90mDEBUG[39m [11:56:32.542] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:32.528] Initializing node beta... 
[90mDEBUG[39m [11:56:32.534] First iteration. Initializing variables. 
[90mDEBUG[39m [11:56:32.536] Calculating the Betas.  
[90mDEBUG[39m [11:56:32.544]   - [DONE] 
INFO  [11:56:32.546] 0.2 - Master beta 
[90mDEBUG[39m [11:56:32.549] Initializing master Beta... 
[90mDEBUG[39m [11:56:32.552] Merging node calculation to update new Betas. 
[90mDEBUG[39m [11:56:32.555] Updating the Betas. 
[90mDEBUG[39m [11:56:32.558]   - [DONE] 
INFO  [11:56:32.560] 0.3 - RPC Node Deviance 
[90mDEBUG[39m [11:56:32.563] ** Mocking call to "node_deviance" ** 
[90mDEBUG[39m [11:56:32.566] Regular call 
[90mDEBUG[39m [11:56:32.569] there are 3 datasets ..  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.583] Log for site 1: 
[90mDEBUG[39m [11:56:32.585] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:32.571] Starting node deviance. 
[90mDEBUG[39m [11:56:32.577] First iteration. Initializing variables. 
[90mDEBUG[39m [11:56:32.580] Updating the variables for node deviance.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.601] Log for site 2: 
[90mDEBUG[39m [11:56:32.603] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:32.588] Starting node deviance. 
[90mDEBUG[39m [11:56:32.595] First iteration. Initializing variables. 
[90mDEBUG[39m [11:56:32.598] Updating the variables for node deviance.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.621] Log for site 3: 
[90mDEBUG[39m [11:56:32.623] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:32.608] Starting node deviance. 
[90mDEBUG[39m [11:56:32.615] First iteration. Initializing variables. 
[90mDEBUG[39m [11:56:32.618] Updating the variables for node deviance.  
[90mDEBUG[39m [11:56:32.625]   - [DONE] 
INFO  [11:56:32.628] 0.4 - Master deviance 
[90mDEBUG[39m [11:56:32.631] Starting master deviance. 
[90mDEBUG[39m [11:56:32.633]   - [DONE] 
INFO  [11:56:32.636] 0.5 - Termination conditions 
[90mDEBUG[39m [11:56:32.638]   - [DONE] 
INFO  [11:56:32.641]  
INFO  [11:56:32.643] ############################################### 
INFO  [11:56:32.646] # Starting iteration 1 
INFO  [11:56:32.649] ############################################### 
INFO  [11:56:32.651]  
INFO  [11:56:32.654] 1.1 - RPC Node Beta 
[90mDEBUG[39m [11:56:32.656] ** Mocking call to "node_beta" ** 
[90mDEBUG[39m [11:56:32.659] R

"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.674] Log for site 1: 
[90mDEBUG[39m [11:56:32.677] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:32.665] Initializing node beta... 
[90mDEBUG[39m [11:56:32.671] Calculating the Betas.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.690] Log for site 2: 
[90mDEBUG[39m [11:56:32.693] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:32.680] Initializing node beta... 
[90mDEBUG[39m [11:56:32.686] Calculating the Betas.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.709] Log for site 3: 
[90mDEBUG[39m [11:56:32.712] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:32.699] Initializing node beta... 
[90mDEBUG[39m [11:56:32.705] Calculating the Betas.  
[90mDEBUG[39m [11:56:32.715]   - [DONE] 
INFO  [11:56:32.717] 1.2 - Master beta 
[90mDEBUG[39m [11:56:32.720] Initializing master Beta... 
[90mDEBUG[39m [11:56:32.722] Merging node calculation to update new Betas. 
[90mDEBUG[39m [11:56:32.725] Updating the Betas. 
[90mDEBUG[39m [11:56:32.728]   - [DONE] 
INFO  [11:56:32.730] 1.3 - RPC Node Deviance 
[90mDEBUG[39m [11:56:32.733] ** Mocking call to "node_deviance" ** 
[90mDEBUG[39m [11:56:32.736] Regular call 
[90mDEBUG[39m [11:56:32.738] there are 3 datasets ..  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.751] Log for site 1: 
[90mDEBUG[39m [11:56:32.753] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:32.741] Starting node deviance. 
[90mDEBUG[39m [11:56:32.748] Updating the variables for node deviance.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.766] Log for site 2: 
[90mDEBUG[39m [11:56:32.769] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:32.755] Starting node deviance. 
[90mDEBUG[39m [11:56:32.762] Updating the variables for node deviance.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.782] Log for site 3: 
[90mDEBUG[39m [11:56:32.784] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:32.772] Starting node deviance. 
[90mDEBUG[39m [11:56:32.779] Updating the variables for node deviance.  
[90mDEBUG[39m [11:56:32.789]   - [DONE] 
INFO  [11:56:32.792] 1.4 - Master deviance 
[90mDEBUG[39m [11:56:32.795] Starting master deviance. 
[90mDEBUG[39m [11:56:32.798]   - [DONE] 
INFO  [11:56:32.800] 1.5 - Termination conditions 
[90mDEBUG[39m [11:56:32.803]   - [DONE] 
INFO  [11:56:32.806]  
INFO  [11:56:32.810] ############################################### 
INFO  [11:56:32.812] # Starting iteration 2 
INFO  [11:56:32.814] ############################################### 
INFO  [11:56:32.817]  
INFO  [11:56:32.819] 2.1 - RPC Node Beta 
[90mDEBUG[39m [11:56:32.821] ** Mocking call to "node_beta" ** 
[90mDEBUG[39m [11:56:32.823] Regular call 
[90mDEBUG[39m [11:56:32.827] there are 3 datasets ..  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.842] Log for site 1: 
[90mDEBUG[39m [11:56:32.845] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:32.830] Initializing node beta... 
[90mDEBUG[39m [11:56:32.839] Calculating the Betas.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.857] Log for site 2: 
[90mDEBUG[39m [11:56:32.860] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:32.848] Initializing node beta... 
[90mDEBUG[39m [11:56:32.854] Calculating the Betas.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.872] Log for site 3: 
[90mDEBUG[39m [11:56:32.875] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:32.862] Initializing node beta... 
[90mDEBUG[39m [11:56:32.869] Calculating the Betas.  
[90mDEBUG[39m [11:56:32.881]   - [DONE] 
INFO  [11:56:32.884] 2.2 - Master beta 
[90mDEBUG[39m [11:56:32.886] Initializing master Beta... 
[90mDEBUG[39m [11:56:32.888] Merging node calculation to update new Betas. 
[90mDEBUG[39m [11:56:32.891] Updating the Betas. 
[90mDEBUG[39m [11:56:32.894]   - [DONE] 
INFO  [11:56:32.897] 2.3 - RPC Node Deviance 
[90mDEBUG[39m [11:56:32.900] ** Mocking call to "node_deviance" ** 
[90mDEBUG[39m [11:56:32.902] Regular call 
[90mDEBUG[39m [11:56:32.905] there are 3 datasets ..  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.917] Log for site 1: 
[90mDEBUG[39m [11:56:32.920] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:32.907] Starting node deviance. 
[90mDEBUG[39m [11:56:32.914] Updating the variables for node deviance.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.931] Log for site 2: 
[90mDEBUG[39m [11:56:32.933] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:32.922] Starting node deviance. 
[90mDEBUG[39m [11:56:32.928] Updating the variables for node deviance.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.945] Log for site 3: 
[90mDEBUG[39m [11:56:32.948] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:32.936] Starting node deviance. 
[90mDEBUG[39m [11:56:32.942] Updating the variables for node deviance.  
[90mDEBUG[39m [11:56:32.950]   - [DONE] 
INFO  [11:56:32.952] 2.4 - Master deviance 
[90mDEBUG[39m [11:56:32.955] Starting master deviance. 
[90mDEBUG[39m [11:56:32.957]   - [DONE] 
INFO  [11:56:32.960] 2.5 - Termination conditions 
[90mDEBUG[39m [11:56:32.962]   - [DONE] 
INFO  [11:56:32.967]  
INFO  [11:56:32.970] ############################################### 
INFO  [11:56:32.973] # Starting iteration 3 
INFO  [11:56:32.975] ############################################### 
INFO  [11:56:32.978]  
INFO  [11:56:32.980] 3.1 - RPC Node Beta 
[90mDEBUG[39m [11:56:32.982] ** Mocking call to "node_beta" ** 
[90mDEBUG[39m [11:56:32.985] Regular call 
[90mDEBUG[39m [11:56:32.988] there are 3 datasets ..  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:32.999] Log for site 1: 
[90mDEBUG[39m [11:56:33.002] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:32.990] Initializing node beta... 
[90mDEBUG[39m [11:56:32.996] Calculating the Betas.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:33.013] Log for site 2: 
[90mDEBUG[39m [11:56:33.016] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:33.004] Initializing node beta... 
[90mDEBUG[39m [11:56:33.011] Calculating the Betas.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:33.029] Log for site 3: 
[90mDEBUG[39m [11:56:33.031] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:33.019] Initializing node beta... 
[90mDEBUG[39m [11:56:33.026] Calculating the Betas.  
[90mDEBUG[39m [11:56:33.033]   - [DONE] 
INFO  [11:56:33.036] 3.2 - Master beta 
[90mDEBUG[39m [11:56:33.038] Initializing master Beta... 
[90mDEBUG[39m [11:56:33.041] Merging node calculation to update new Betas. 
[90mDEBUG[39m [11:56:33.043] Updating the Betas. 
[90mDEBUG[39m [11:56:33.046]   - [DONE] 
INFO  [11:56:33.052] 3.3 - RPC Node Deviance 
[90mDEBUG[39m [11:56:33.055] ** Mocking call to "node_deviance" ** 
[90mDEBUG[39m [11:56:33.059] Regular call 
[90mDEBUG[39m [11:56:33.065] there are 3 datasets ..  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:33.079] Log for site 1: 
[90mDEBUG[39m [11:56:33.084] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:33.068] Starting node deviance. 
[90mDEBUG[39m [11:56:33.076] Updating the variables for node deviance.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:33.104] Log for site 2: 
[90mDEBUG[39m [11:56:33.109] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:33.087] Starting node deviance. 
[90mDEBUG[39m [11:56:33.097] Updating the variables for node deviance.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:33.123] Log for site 3: 
[90mDEBUG[39m [11:56:33.127] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:33.113] Starting node deviance. 
[90mDEBUG[39m [11:56:33.120] Updating the variables for node deviance.  
[90mDEBUG[39m [11:56:33.130]   - [DONE] 
INFO  [11:56:33.133] 3.4 - Master deviance 
[90mDEBUG[39m [11:56:33.137] Starting master deviance. 
[90mDEBUG[39m [11:56:33.139]   - [DONE] 
INFO  [11:56:33.143] 3.5 - Termination conditions 
[90mDEBUG[39m [11:56:33.150]   - [DONE] 
INFO  [11:56:33.156]  
INFO  [11:56:33.161] ############################################### 
INFO  [11:56:33.164] # Starting iteration 4 
INFO  [11:56:33.166] ############################################### 
INFO  [11:56:33.169]  
INFO  [11:56:33.172] 4.1 - RPC Node Beta 
[90mDEBUG[39m [11:56:33.179] ** Mocking call to "node_beta" ** 
[90mDEBUG[39m [11:56:33.182] Regular call 
[90mDEBUG[39m [11:56:33.187] there are 3 datasets ..  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:33.204] Log for site 1: 
[90mDEBUG[39m [11:56:33.207] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:33.191] Initializing node beta... 
[90mDEBUG[39m [11:56:33.199] Calculating the Betas.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:33.220] Log for site 2: 
[90mDEBUG[39m [11:56:33.222] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:33.210] Initializing node beta... 
[90mDEBUG[39m [11:56:33.217] Calculating the Betas.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:33.235] Log for site 3: 
[90mDEBUG[39m [11:56:33.238] Running a (mere) regular container. 
Calling RPC_node_beta 
[90mDEBUG[39m [11:56:33.224] Initializing node beta... 
[90mDEBUG[39m [11:56:33.231] Calculating the Betas.  
[90mDEBUG[39m [11:56:33.240]   - [DONE] 
INFO  [11:56:33.243] 4.2 - Master beta 
[90mDEBUG[39m [11:56:33.246] Initializing master Beta... 
[90mDEBUG[39m [11:56:33.248] Merging node calculation to update new Betas. 
[90mDEBUG[39m [11:56:33.250] Updating the Betas. 
[90mDEBUG[39m [11:56:33.253]   - [DONE] 
INFO  [11:56:33.255] 4.3 - RPC Node Deviance 
[90mDEBUG[39m [11:56:33.258] ** Mocking call to "node_deviance" ** 
[90mDEBUG[39m [11:56:33.261] Regular call 
[90mDEBUG[39m [11:56:33.264] there are 3 datasets ..  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:33.281] Log for site 1: 
[90mDEBUG[39m [11:56:33.284] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:33.267] Starting node deviance. 
[90mDEBUG[39m [11:56:33.279] Updating the variables for node deviance.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:33.295] Log for site 2: 
[90mDEBUG[39m [11:56:33.298] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:33.286] Starting node deviance. 
[90mDEBUG[39m [11:56:33.292] Updating the variables for node deviance.  


"the condition has length > 1 and only the first element will be used"


[90mDEBUG[39m [11:56:33.311] Log for site 3: 
[90mDEBUG[39m [11:56:33.314] Running a (mere) regular container. 
Calling RPC_node_deviance 
[90mDEBUG[39m [11:56:33.301] Starting node deviance. 
[90mDEBUG[39m [11:56:33.308] Updating the variables for node deviance.  
[90mDEBUG[39m [11:56:33.316]   - [DONE] 
INFO  [11:56:33.319] 4.4 - Master deviance 
[90mDEBUG[39m [11:56:33.321] Starting master deviance. 
[90mDEBUG[39m [11:56:33.323]   - [DONE] 
INFO  [11:56:33.327] 4.5 - Termination conditions 
[90mDEBUG[39m [11:56:33.330]   - [CONVERGED] 
[90mDEBUG[39m [11:56:33.333] Preparing output 


#### Comparison

In [11]:
val_poisson_res = validate(results_centralized, results_federated_plain)
print(val_poisson_res)

[1] "LaTeX"

% Table created by stargazer v.5.2.3 by Marek Hlavac, Social Policy Institute. E-mail: marek.hlavac at gmail.com
% Date and time: Thu, May 05, 2022 - 11:56:33
\begin{table}[!htbp] \centering 
  \caption{} 
  \label{} 
\begin{tabular}{@{\extracolsep{5pt}} ccccccccc} 
\\[-1.8ex]\hline 
\hline \\[-1.8ex] 
 & coeff\_c & coeff\_f & stderr\_c & stderr\_f & pvalues\_c & pvalues\_f & zvalues\_c & zvalues\_f \\ 
\hline \\[-1.8ex] 
(Intercept) & $0.595$ & $0.595$ & $0.021$ & $0.021$ & $0$ & $0$ & $28.465$ & $28.465$ \\ 
x1 & $0.269$ & $0.269$ & $0.007$ & $0.007$ & $0$ & $0$ & $37.220$ & $37.220$ \\ 
x2 & $0.446$ & $0.446$ & $0.007$ & $0.007$ & $0$ & $0$ & $62.566$ & $62.566$ \\ 
\hline \\[-1.8ex] 
\end{tabular} 
\end{table} 
[1] "Text"

            coeff_c coeff_f stderr_c stderr_f pvalues_c pvalues_f zvalues_c zvalues_f
-------------------------------------------------------------------------------------
(Intercept)  0.595   0.595   0.021    0.021       0         0      28.465    2

### Logistic

#### Centralized

In [6]:
# Housekeeping
if(exists('results_centralized')){
    rm(results_centralized)
}
if(exists('results_federated_plain')){
    rm(results_federated_plain)
}

datasets <- list(
  read.csv('../data/logistic_party1.csv'),
  read.csv('../data/logistic_party2.csv'),
  read.csv('../data/logistic_party3.csv')
)
datasets_combined <- do.call(rbind.data.frame, datasets)
results_centralized <- glm(data=datasets_combined, formula = y ~ x1 + x2, family=binomial(link = "logit"))

#### Federated
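
The log printed by the next cell alternates between node-side and master-side steps (`RPC Node Beta`, `Master beta`, `RPC Node Deviance`, `Master deviance`, `Termination conditions`). One plausible reading, assuming `dglm` follows the same iteratively reweighted least squares (IRLS) scheme as `glm()`, is sketched below. The symbols $X_k$, $W_k$ and $z_k$ (design matrix, working weights and working response at party $k$, with $K = 3$ parties here) and the per-party deviance $D_k$ are notation introduced for this sketch and are not taken from the `vtg.glm` source:

$$
\beta^{(t+1)} = \Big(\sum_{k=1}^{K} X_k^\top W_k^{(t)} X_k\Big)^{-1} \sum_{k=1}^{K} X_k^\top W_k^{(t)} z_k^{(t)},
\qquad
D^{(t+1)} = \sum_{k=1}^{K} D_k\big(\beta^{(t+1)}\big)
$$

Under this assumption, the parties would only exchange the aggregated terms inside the sums, never individual rows. For reference, `glm()` declares convergence when $|D^{(t+1)} - D^{(t)}| / (|D^{(t+1)}| + 0.1) < \epsilon$; whether `tol` plays exactly the role of $\epsilon$ in `dglm` is likewise an assumption.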

In [7]:
client <- vtg::MockClient$new(datasets, "vtg.glm")
results_federated_plain <- vtg.glm::dglm(client, formula = y ~ x1 + x2, family=binomial(link = "logit"), tol=1e-08, maxit=25)

DEBUG [11:56:31.097] Initializing...
INFO  [11:56:31.100]
INFO  [11:56:31.102] ###############################################
INFO  [11:56:31.104] # Starting iteration 0
INFO  [11:56:31.107] ###############################################
INFO  [11:56:31.110]
INFO  [11:56:31.112] 0.1 - RPC Node Beta
DEBUG [11:56:31.115] ** Mocking call to "node_beta" **
DEBUG [11:56:31.117] Regular call
DEBUG [11:56:31.120] there are 3 datasets ..


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.138] Log for site 1:
DEBUG [11:56:31.141] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:31.126] Initializing node beta...
DEBUG [11:56:31.133] First iteration. Initializing variables.
DEBUG [11:56:31.135] Calculating the Betas.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.156] Log for site 2:
DEBUG [11:56:31.158] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:31.144] Initializing node beta...
DEBUG [11:56:31.150] First iteration. Initializing variables.
DEBUG [11:56:31.153] Calculating the Betas.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.183] Log for site 3:
DEBUG [11:56:31.188] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:31.161] Initializing node beta...
DEBUG [11:56:31.168] First iteration. Initializing variables.
DEBUG [11:56:31.170] Calculating the Betas.
DEBUG [11:56:31.192]   - [DONE]
INFO  [11:56:31.195] 0.2 - Master beta
DEBUG [11:56:31.199] Initializing master Beta...
DEBUG [11:56:31.202] Merging node calculation to update new Betas.
DEBUG [11:56:31.205] Updating the Betas.
DEBUG [11:56:31.208]   - [DONE]
INFO  [11:56:31.211] 0.3 - RPC Node Deviance
DEBUG [11:56:31.213] ** Mocking call to "node_deviance" **
DEBUG [11:56:31.216] Regular call
DEBUG [11:56:31.218] there are 3 datasets ..


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.237] Log for site 1:
DEBUG [11:56:31.240] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:31.221] Starting node deviance.
DEBUG [11:56:31.232] First iteration. Initializing variables.
DEBUG [11:56:31.235] Updating the variables for node deviance.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.253] Log for site 2:
DEBUG [11:56:31.256] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:31.242] Starting node deviance.
DEBUG [11:56:31.248] First iteration. Initializing variables.
DEBUG [11:56:31.251] Updating the variables for node deviance.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.271] Log for site 3:
DEBUG [11:56:31.274] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:31.258] Starting node deviance.
DEBUG [11:56:31.265] First iteration. Initializing variables.
DEBUG [11:56:31.268] Updating the variables for node deviance.
DEBUG [11:56:31.277]   - [DONE]
INFO  [11:56:31.279] 0.4 - Master deviance
DEBUG [11:56:31.282] Starting master deviance.
DEBUG [11:56:31.284]   - [DONE]
INFO  [11:56:31.287] 0.5 - Termination conditions
DEBUG [11:56:31.289]   - [DONE]
INFO  [11:56:31.291]
INFO  [11:56:31.294] ###############################################
INFO  [11:56:31.297] # Starting iteration 1
INFO  [11:56:31.299] ###############################################
INFO  [11:56:31.302]
INFO  [11:56:31.304] 1.1 - RPC Node Beta
DEBUG [11:56:31.307] ** Mocking call to "node_beta" **
DEBUG [11:56:31.310] R

"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.329] Log for site 1:
DEBUG [11:56:31.332] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:31.316] Initializing node beta...
DEBUG [11:56:31.325] Calculating the Betas.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.344] Log for site 2:
DEBUG [11:56:31.347] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:31.334] Initializing node beta...
DEBUG [11:56:31.340] Calculating the Betas.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.360] Log for site 3:
DEBUG [11:56:31.363] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:31.350] Initializing node beta...
DEBUG [11:56:31.356] Calculating the Betas.
DEBUG [11:56:31.366]   - [DONE]
INFO  [11:56:31.367] 1.2 - Master beta
DEBUG [11:56:31.370] Initializing master Beta...
DEBUG [11:56:31.372] Merging node calculation to update new Betas.
DEBUG [11:56:31.374] Updating the Betas.
DEBUG [11:56:31.377]   - [DONE]
INFO  [11:56:31.380] 1.3 - RPC Node Deviance
DEBUG [11:56:31.383] ** Mocking call to "node_deviance" **
DEBUG [11:56:31.386] Regular call
DEBUG [11:56:31.391] there are 3 datasets ..


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.406] Log for site 1:
DEBUG [11:56:31.409] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:31.395] Starting node deviance.
DEBUG [11:56:31.403] Updating the variables for node deviance.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.426] Log for site 2:
DEBUG [11:56:31.428] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:31.414] Starting node deviance.
DEBUG [11:56:31.422] Updating the variables for node deviance.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.439] Log for site 3:
DEBUG [11:56:31.441] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:31.430] Starting node deviance.
DEBUG [11:56:31.436] Updating the variables for node deviance.
DEBUG [11:56:31.444]   - [DONE]
INFO  [11:56:31.446] 1.4 - Master deviance
DEBUG [11:56:31.449] Starting master deviance.
DEBUG [11:56:31.452]   - [DONE]
INFO  [11:56:31.454] 1.5 - Termination conditions
DEBUG [11:56:31.456]   - [DONE]
INFO  [11:56:31.459]
INFO  [11:56:31.462] ###############################################
INFO  [11:56:31.464] # Starting iteration 2
INFO  [11:56:31.468] ###############################################
INFO  [11:56:31.472]
INFO  [11:56:31.478] 2.1 - RPC Node Beta
DEBUG [11:56:31.483] ** Mocking call to "node_beta" **
DEBUG [11:56:31.488] Regular call
DEBUG [11:56:31.492] there are 3 datasets ..


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.509] Log for site 1:
DEBUG [11:56:31.513] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:31.498] Initializing node beta...
DEBUG [11:56:31.505] Calculating the Betas.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.545] Log for site 2:
DEBUG [11:56:31.550] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:31.516] Initializing node beta...
DEBUG [11:56:31.541] Calculating the Betas.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.566] Log for site 3:
DEBUG [11:56:31.569] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:31.553] Initializing node beta...
DEBUG [11:56:31.562] Calculating the Betas.
DEBUG [11:56:31.571]   - [DONE]
INFO  [11:56:31.575] 2.2 - Master beta
DEBUG [11:56:31.578] Initializing master Beta...
DEBUG [11:56:31.585] Merging node calculation to update new Betas.
DEBUG [11:56:31.591] Updating the Betas.
DEBUG [11:56:31.595]   - [DONE]
INFO  [11:56:31.601] 2.3 - RPC Node Deviance
DEBUG [11:56:31.606] ** Mocking call to "node_deviance" **
DEBUG [11:56:31.613] Regular call
DEBUG [11:56:31.620] there are 3 datasets ..


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.636] Log for site 1:
DEBUG [11:56:31.639] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:31.625] Starting node deviance.
DEBUG [11:56:31.632] Updating the variables for node deviance.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.652] Log for site 2:
DEBUG [11:56:31.654] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:31.642] Starting node deviance.
DEBUG [11:56:31.649] Updating the variables for node deviance.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.670] Log for site 3:
DEBUG [11:56:31.672] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:31.658] Starting node deviance.
DEBUG [11:56:31.667] Updating the variables for node deviance.
DEBUG [11:56:31.675]   - [DONE]
INFO  [11:56:31.678] 2.4 - Master deviance
DEBUG [11:56:31.681] Starting master deviance.
DEBUG [11:56:31.683]   - [DONE]
INFO  [11:56:31.686] 2.5 - Termination conditions
DEBUG [11:56:31.688]   - [DONE]
INFO  [11:56:31.691]
INFO  [11:56:31.693] ###############################################
INFO  [11:56:31.696] # Starting iteration 3
INFO  [11:56:31.699] ###############################################
INFO  [11:56:31.701]
INFO  [11:56:31.703] 3.1 - RPC Node Beta
DEBUG [11:56:31.707] ** Mocking call to "node_beta" **
DEBUG [11:56:31.710] Regular call
DEBUG [11:56:31.713] there are 3 datasets ..


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.723] Log for site 1:
DEBUG [11:56:31.726] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:31.715] Initializing node beta...
DEBUG [11:56:31.720] Calculating the Betas.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.742] Log for site 2:
DEBUG [11:56:31.744] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:31.730] Initializing node beta...
DEBUG [11:56:31.738] Calculating the Betas.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.761] Log for site 3:
DEBUG [11:56:31.764] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:31.747] Initializing node beta...
DEBUG [11:56:31.757] Calculating the Betas.
DEBUG [11:56:31.767]   - [DONE]
INFO  [11:56:31.770] 3.2 - Master beta
DEBUG [11:56:31.772] Initializing master Beta...
DEBUG [11:56:31.775] Merging node calculation to update new Betas.
DEBUG [11:56:31.778] Updating the Betas.
DEBUG [11:56:31.780]   - [DONE]
INFO  [11:56:31.783] 3.3 - RPC Node Deviance
DEBUG [11:56:31.785] ** Mocking call to "node_deviance" **
DEBUG [11:56:31.788] Regular call
DEBUG [11:56:31.791] there are 3 datasets ..


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.807] Log for site 1:
DEBUG [11:56:31.809] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:31.794] Starting node deviance.
DEBUG [11:56:31.803] Updating the variables for node deviance.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.822] Log for site 2:
DEBUG [11:56:31.824] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:31.812] Starting node deviance.
DEBUG [11:56:31.819] Updating the variables for node deviance.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.837] Log for site 3:
DEBUG [11:56:31.840] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:31.827] Starting node deviance.
DEBUG [11:56:31.835] Updating the variables for node deviance.
DEBUG [11:56:31.842]   - [DONE]
INFO  [11:56:31.845] 3.4 - Master deviance
DEBUG [11:56:31.848] Starting master deviance.
DEBUG [11:56:31.851]   - [DONE]
INFO  [11:56:31.857] 3.5 - Termination conditions
DEBUG [11:56:31.860]   - [DONE]
INFO  [11:56:31.863]
INFO  [11:56:31.866] ###############################################
INFO  [11:56:31.868] # Starting iteration 4
INFO  [11:56:31.870] ###############################################
INFO  [11:56:31.873]
INFO  [11:56:31.875] 4.1 - RPC Node Beta
DEBUG [11:56:31.879] ** Mocking call to "node_beta" **
DEBUG [11:56:31.883] Regular call
DEBUG [11:56:31.885] there are 3 datasets ..


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.900] Log for site 1:
DEBUG [11:56:31.903] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:31.888] Initializing node beta...
DEBUG [11:56:31.897] Calculating the Betas.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.915] Log for site 2:
DEBUG [11:56:31.918] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:31.905] Initializing node beta...
DEBUG [11:56:31.912] Calculating the Betas.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.932] Log for site 3:
DEBUG [11:56:31.934] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:31.921] Initializing node beta...
DEBUG [11:56:31.929] Calculating the Betas.
DEBUG [11:56:31.937]   - [DONE]
INFO  [11:56:31.939] 4.2 - Master beta
DEBUG [11:56:31.941] Initializing master Beta...
DEBUG [11:56:31.945] Merging node calculation to update new Betas.
DEBUG [11:56:31.949] Updating the Betas.
DEBUG [11:56:31.955]   - [DONE]
INFO  [11:56:31.958] 4.3 - RPC Node Deviance
DEBUG [11:56:31.961] ** Mocking call to "node_deviance" **
DEBUG [11:56:31.964] Regular call
DEBUG [11:56:31.966] there are 3 datasets ..


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.978] Log for site 1:
DEBUG [11:56:31.982] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:31.969] Starting node deviance.
DEBUG [11:56:31.975] Updating the variables for node deviance.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:31.996] Log for site 2:
DEBUG [11:56:31.998] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:31.985] Starting node deviance.
DEBUG [11:56:31.993] Updating the variables for node deviance.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:32.010] Log for site 3:
DEBUG [11:56:32.013] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:32.001] Starting node deviance.
DEBUG [11:56:32.006] Updating the variables for node deviance.
DEBUG [11:56:32.017]   - [DONE]
INFO  [11:56:32.020] 4.4 - Master deviance
DEBUG [11:56:32.023] Starting master deviance.
DEBUG [11:56:32.027]   - [DONE]
INFO  [11:56:32.029] 4.5 - Termination conditions
DEBUG [11:56:32.032]   - [DONE]
INFO  [11:56:32.035]
INFO  [11:56:32.037] ###############################################
INFO  [11:56:32.039] # Starting iteration 5
INFO  [11:56:32.042] ###############################################
INFO  [11:56:32.045]
INFO  [11:56:32.050] 5.1 - RPC Node Beta
DEBUG [11:56:32.053] ** Mocking call to "node_beta" **
DEBUG [11:56:32.056] Regular call
DEBUG [11:56:32.059] there are 3 datasets ..


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:32.072] Log for site 1:
DEBUG [11:56:32.075] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:32.062] Initializing node beta...
DEBUG [11:56:32.069] Calculating the Betas.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:32.088] Log for site 2:
DEBUG [11:56:32.090] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:32.078] Initializing node beta...
DEBUG [11:56:32.084] Calculating the Betas.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:32.114] Log for site 3:
DEBUG [11:56:32.117] Running a (mere) regular container.
Calling RPC_node_beta
DEBUG [11:56:32.096] Initializing node beta...
DEBUG [11:56:32.109] Calculating the Betas.
DEBUG [11:56:32.120]   - [DONE]
INFO  [11:56:32.123] 5.2 - Master beta
DEBUG [11:56:32.125] Initializing master Beta...
DEBUG [11:56:32.128] Merging node calculation to update new Betas.
DEBUG [11:56:32.131] Updating the Betas.
DEBUG [11:56:32.133]   - [DONE]
INFO  [11:56:32.136] 5.3 - RPC Node Deviance
DEBUG [11:56:32.139] ** Mocking call to "node_deviance" **
DEBUG [11:56:32.143] Regular call
DEBUG [11:56:32.147] there are 3 datasets ..


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:32.161] Log for site 1:
DEBUG [11:56:32.164] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:32.149] Starting node deviance.
DEBUG [11:56:32.158] Updating the variables for node deviance.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:32.182] Log for site 2:
DEBUG [11:56:32.184] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:32.167] Starting node deviance.
DEBUG [11:56:32.178] Updating the variables for node deviance.


"the condition has length > 1 and only the first element will be used"


DEBUG [11:56:32.198] Log for site 3:
DEBUG [11:56:32.201] Running a (mere) regular container.
Calling RPC_node_deviance
DEBUG [11:56:32.187] Starting node deviance.
DEBUG [11:56:32.194] Updating the variables for node deviance.
DEBUG [11:56:32.203]   - [DONE]
INFO  [11:56:32.206] 5.4 - Master deviance
DEBUG [11:56:32.210] Starting master deviance.
DEBUG [11:56:32.213]   - [DONE]
INFO  [11:56:32.215] 5.5 - Termination conditions
DEBUG [11:56:32.218]   - [CONVERGED]
DEBUG [11:56:32.221] Preparing output


#### Comparison

In [8]:
val_binomial_res = validate(results_centralized, results_federated_plain)
print(val_binomial_res)

[1] "LaTeX"

% Table created by stargazer v.5.2.3 by Marek Hlavac, Social Policy Institute. E-mail: marek.hlavac at gmail.com
% Date and time: Thu, May 05, 2022 - 11:56:32
\begin{table}[!htbp] \centering 
  \caption{} 
  \label{} 
\begin{tabular}{@{\extracolsep{5pt}} ccccccccc} 
\\[-1.8ex]\hline 
\hline \\[-1.8ex] 
 & coeff\_c & coeff\_f & stderr\_c & stderr\_f & pvalues\_c & pvalues\_f & zvalues\_c & zvalues\_f \\ 
\hline \\[-1.8ex] 
(Intercept) & $0.104$ & $0.104$ & $0.064$ & $0.064$ & $0.106$ & $0.106$ & $1.617$ & $1.617$ \\ 
x1 & $2.593$ & $2.593$ & $0.093$ & $0.093$ & $0$ & $0$ & $27.802$ & $27.802$ \\ 
x2 & $$-$0.054$ & $$-$0.054$ & $0.051$ & $0.051$ & $0.291$ & $0.291$ & $$-$1.055$ & $$-$1.055$ \\ 
\hline \\[-1.8ex] 
\end{tabular} 
\end{table} 
[1] "Text"

            coeff_c coeff_f stderr_c stderr_f pvalues_c pvalues_f zvalues_c zvalues_f
-------------------------------------------------------------------------------------
(Intercept)  0.104   0.104   0.064    0.064     0.106 

### Custom relative survival (with Poisson error)
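
As a reading aid, the link implemented by `modperiod.link` in the cell below can be written out explicitly. Here $d^*$ stands for the `dstar` argument, $\mu$ for the mean and $\eta$ for the linear predictor. The usual interpretation of $d^*$ in relative survival models is the expected number of deaths, but that interpretation is an assumption and is not stated in this notebook:

$$
\eta = g(\mu) = \log(\mu - d^*), \qquad
\mu = g^{-1}(\eta) = e^{\eta} + d^*, \qquad
\frac{d\mu}{d\eta} = e^{\eta}
$$

The three expressions correspond to `linkfun`, `linkinv` and `mu.eta`, respectively. The domain check `validmu` encodes the requirement $\mu > d^*$, which keeps the logarithm defined.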

In [12]:
# Housekeeping
if(exists('results_centralized')){
    rm(results_centralized)
}
if(exists('results_federated_plain')){
    rm(results_federated_plain)
}

# Custom link for the relative survival model: eta = log(mu - dstar)
modperiod.link <- function(dstar) {
  structure(
    list(
      # Link
      linkfun = function(mu) log(mu - dstar),

      # Inverse link
      linkinv = function(eta) exp(eta) + dstar,

      # Derivative of the inverse link (d_mu/d_eta)
      mu.eta = function(eta) exp(eta),

      # Functions for domain checking (each must return a single logical)
      valideta = function(eta) TRUE,
      validmu = function(mu) all(mu > dstar),

      #validmu = function(mu) all(is.finite(mu)) && all(mu > 0),
      name = "modperiod.link"
    ),
    class = "link-glm"
  )
}

# Cast the columns named in `types` to the requested type.
# For factors, rows with levels outside `specs$levels` are dropped.
format_data <- function(data, types) {
  column_names <- names(types)
  for (i in seq_along(types)) {
    column_name <- column_names[i]
    specs <- types[[i]]
    type_ <- specs$type
    if (type_ == "numeric") {
      data[[column_name]] <- as.numeric(data[[column_name]])
    }
    if (type_ == "factor") {
      data <- data[data[[column_name]] %in% specs$levels, ]
      data[[column_name]] <- factor(data[[column_name]], levels = specs$levels)
      # Optionally set the reference level
      if (!is.null(specs$ref)) data[[column_name]] <- relevel(data[[column_name]], ref = specs$ref)
    }
  }
  data
}
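# NOTE: `datasets` and `datasets_combined` are not reloaded in this cell, so they
# still hold the data from the previously run section. The formula below needs
# the column `x3` and `types` refers to the column `prog`.
types <- list(prog=list(type='factor', levels=c('Vocational', 'General', 'Academic'), ref=NULL))
data <- format_data(datasets_combined, types)
mustart <- pmax(data$y, data$y) + 0.1

results_centralized <- glm(data=data, formula=y ~ x1 + x2 + x3, family = poisson(link = modperiod.link(data$y)), mustart=mustart)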

client <- vtg::MockClient$new(datasets, "vtg.glm")
results_federated_plain <- vtg.glm::dglm(client, formula=y ~ x1 + x2 + x3, types=types, family='rs.poi', dstar="x2", tol=1e-08, maxit=25)

val_customglm_res = validate(results_centralized, results_federated_plain)
print(val_customglm_res)
rm(results_centralized)
rm(results_federated_plain)

ERROR: Error in eval(predvars, data, env): object 'x3' not found


## References
For more information, please refer to our paper:

> M. Cellamare, A.J. van Gestel, H. Alradhi, F. Martin, A. Moncada-Torres. "A Federated Generalized Linear Model for Privacy-Preserving Analysis". Under review.