Convert netlm to formula-based input #76

jhollway · 2021-01-17T13:18:28Z

Note that we want this function to be consistent with the following interfaces:

base (a list of matrices); this can be the default
igraph graphs (as input only, not as output)
tidygraph (as input, and perhaps also as parsnip/broom-consistent output?)

See https://tidymodels.github.io/model-implementation-principles/function-interfaces.html for more
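
One way the three input types could be funnelled into a common representation is an S3 generic that coerces everything to matrices before fitting. The following is only a minimal sketch under assumptions: the name `as_matrix_input` is hypothetical and not part of the package, the igraph method assumes the igraph package is installed, and it handles one-mode adjacency only.

```r
# Hypothetical coercion generic (the name is an assumption, not package API).
as_matrix_input <- function(x, ...) UseMethod("as_matrix_input")

# A matrix is already in the base representation.
as_matrix_input.matrix <- function(x, ...) x

# A list is coerced element by element, keeping its names.
as_matrix_input.list <- function(x, ...) lapply(x, as_matrix_input, ...)

# igraph input: assumes igraph is installed; one-mode adjacency only.
# tidygraph's tbl_graph objects also carry the "igraph" class, so they
# should reach this method as well.
as_matrix_input.igraph <- function(x, ...) {
  igraph::as_adjacency_matrix(x, sparse = FALSE)
}

# Anything else is rejected with an informative error.
as_matrix_input.default <- function(x, ...) {
  stop("unsupported input class: ", paste(class(x), collapse = "/"))
}

m <- matrix(1:4, nrow = 2)
mats <- as_matrix_input(list(a = m, b = m * 2))
```

Keeping the coercion in one generic would let the fitting code assume a named list of matrices regardless of how the user supplied the data.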

BBieri · 2021-01-21T10:27:26Z

@jhollway Here is my attempt at recreating the function with the "tidy" formula syntax, which I just pushed. I would be grateful for your feedback, particularly on how to handle the possible presence of intercepts, factors multiplying the dependent matrices, logs, etc. I implemented it this way because, as far as I understood the original netlm() function, these were not an issue there.

netlm2 <- function(formula, data, names, rep = 1000){
  
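  # Default effect labels (x1, x2, ...), one per right-hand-side variable.
  # Could be automated by reusing the names of the selected list of independent variables IV (after selection).
  # IV is not defined yet at this point, so the variables are counted from the formula.
  if(missing(names)){
    names <- paste0("x", seq_along(all.vars(as.formula(formula)[[3]])))
  }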
  # Decomposing the formula into its components.
  
  formula <- as.formula(formula)
  tn <- as.character(formula[[1]]) # ~
  yn <- as.character(formula[[2]]) # DV (response)
  # all.vars() does not depend on spacing or line-wrapping in the deparsed formula
  xn <- all.vars(formula[[3]]) # IV names
  
  
  
  #Selecting the matrices in the data list.
  # Base subsetting keeps the order given in the formula and needs no pipe or purrr.
  IV <- data[xn]
  DV <- data[[yn]]

  #Permutation, list of matrices.
  rbperm <- function (m) {
    n <- sample(1:dim(m)[1])
    o <- sample(1:dim(m)[2])
    p <- matrix(data = m[n, o], nrow = dim(m)[1], ncol = dim(m)[2])
    p
  }

  nIV <- length(IV)
  M.fit <- lm(as.numeric(unlist(DV)) ~ Reduce(cbind,
                                              lapply(1:length(IV), function(x) unlist(IV[x][1]))))
  M.coeff <- M.fit$coefficients

  permDist <- matrix(0, rep, (nIV+1))

  for(i in 1:rep){
    tempDV <- rbperm(DV)
    permDist[i,] <- (lm(as.numeric(unlist(tempDV)) ~
                          Reduce(cbind,lapply(1:length(IV),
                                              function(x) unlist(IV[x][1])))))$coefficients
  }

  # Permutation p-values: share of permuted coefficients at or below the
  # observed one (lower tail). Computed once and reused below.
  pvals <- vapply(seq_len(nIV + 1),
                  function(x) unname(ecdf(permDist[, x])(M.coeff[x])),
                  numeric(1))
  sig <- ifelse(pvals < 0.001, "***",
                ifelse(pvals < 0.01, "**",
                       ifelse(pvals < 0.05, "*", "")))

  resTable <- data.frame(Effect = c("Intercept", names),
                         Coefficients = formatC(M.coeff, format = "f", digits = 2),
                         Pvalue = signif(pvals, digits = 2),
                         Sig = sig)
  rownames(resTable) <- NULL
  print(resTable)
  # Turn this into a print function

  cat("\nMultiple R-squared: ", formatC(summary(M.fit)$r.squared),
      ",\tAdjusted R-squared: ", formatC(summary(M.fit)$adj.r.squared),
      "\n", sep="")

  # Returned object: same table, with the coefficients as numbers.
  obj <- list()
  obj$results <- resTable
  obj$results$Coefficients <- as.numeric(obj$results$Coefficients)
  obj$r.squared <- formatC(summary(M.fit)$r.squared)
  obj$adj.r.squared <- formatC(summary(M.fit)$adj.r.squared)
  invisible(obj)
}
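
On the question of intercepts and transformed terms such as logs or scaled matrices, one possible approach is to let `terms()` parse the formula and then evaluate each variable against the data list, rather than splitting the deparsed formula by hand. The following is a minimal sketch, not the package's implementation: `parse_netlm_formula` is a hypothetical helper, it handles main effects only (no interactions), and it assumes every function used in the formula lives in base R.

```r
# Hypothetical helper: evaluate each formula variable against a named list
# of matrices and return flattened columns ready for lm().
parse_netlm_formula <- function(formula, data) {
  tt <- terms(as.formula(formula))
  resp <- attr(tt, "response")
  if (resp == 0) stop("formula needs a response on the left-hand side")
  # attr(tt, "variables") is the call list(y, x1, log(x2), ...); drop `list`.
  vars <- as.list(attr(tt, "variables"))[-1]
  labels <- vapply(vars, function(v) paste(deparse(v), collapse = " "),
                   character(1))
  # Names are looked up in `data`; functions such as log() or I() in base.
  values <- lapply(vars, function(v) as.numeric(eval(v, data, baseenv())))
  names(values) <- labels
  list(y = values[[resp]],
       X = do.call(cbind, values[-resp]),
       intercept = attr(tt, "intercept") == 1)
}

set.seed(1)
d <- list(y  = matrix(rnorm(6), nrow = 2),
          x1 = matrix(rnorm(6), nrow = 2),
          x2 = matrix(runif(6, min = 1, max = 2), nrow = 2))
p <- parse_netlm_formula(y ~ x1 + log(x2), d)
fit <- if (p$intercept) lm(p$y ~ p$X) else lm(p$y ~ 0 + p$X)
```

With this approach the effect labels could default to `colnames(p$X)`, and a formula such as `y ~ 0 + x1` would be detected as having no intercept instead of always reporting one.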

jhollway self-assigned this Jan 17, 2021

jhollway added a commit that referenced this issue Jan 19, 2021

Re #76 and #77 readded netlm and started new formula-based logic

767b674

jhollway assigned BBieri Jan 19, 2021

jhollway added a commit that referenced this issue Jan 21, 2021

Closed #76 by converting netlm to formula-based input

abfbbca

henriquesposito mentioned this issue Jan 27, 2021

Added netlm and centralization functions #78

Merged

16 tasks

jhollway closed this as completed in #78 Feb 6, 2021

jhollway commented Jan 17, 2021 • edited by BBieri