-
Notifications
You must be signed in to change notification settings - Fork 32
varPro Phase 3: gg_udependent — dependency graph for uvarpro fits #86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
16 commits
Select commit
Hold shift + click to select a range
dbcd51e
docs: add varPro Phase 3 (gg_udependent) design spec
ehrlinger 2f52528
docs: add varPro Phase 3 (gg_udependent) implementation plan
ehrlinger 428748f
chore: open 2.7.3.9004 dev cycle; add ggraph to Suggests
ehrlinger 8924970
feat(P3-T1): gg_udependent extractor + print/summary/autoplot (TDD)
ehrlinger f7945f4
fix(P3-T1): threshold validation — any positive value is valid (not j…
ehrlinger 389ab51
fix(P3-T1): summary.gg_udependent returns invisibly without side-effe…
ehrlinger 099dcc3
refactor(P3-T1): move S3 companions to shared method files; document …
ehrlinger 24ea32f
feat(P3-T2): plot.gg_udependent ggraph network renderer (TDD)
ehrlinger d859a90
fix(P3-T2): add importFrom igraph; use match for edge-weight backfill
ehrlinger e1982c7
test(P3-T3): add vdiffr snapshot test stubs (ggraph not in dev env)
ehrlinger cf3190f
docs(P3-T4): update NEWS.md for v2.7.3.9004 / gg_udependent Phase 3
ehrlinger d19c42f
fix(P3-T4): move igraph to Imports, guard donttest example, prune sta…
ehrlinger 3902477
chore(P3): remove dead requireNamespace(igraph) guard — igraph now in…
ehrlinger 3fe730f
fix: CI failures — lint, vdiffr guard, undirected mode
ehrlinger 0e1568c
fix: add gg_udependent and plot.gg_udependent to _pkgdown.yml index
ehrlinger 661a388
fix: address Copilot review — undirected symmetry, summary API, print…
ehrlinger File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,227 @@ | ||
| ##============================================================================= | ||
| #' Variable dependency graph from a uvarpro model | ||
| #' | ||
| #' Extracts cross-variable dependency scores from a fitted \code{uvarpro} | ||
| #' object using \code{\link[varPro]{get.beta.entropy}} and | ||
| #' \code{\link[varPro]{sdependent}}, and returns a tidy list suitable for | ||
| #' \code{plot.gg_udependent}. | ||
| #' | ||
| #' @param object A fitted \code{uvarpro} object (required). | ||
| #' @param threshold Numeric; positive dependency threshold passed to | ||
| #' \code{sdependent()}. An edge \eqn{i \to j} is drawn when | ||
| #' \code{I[i, j] >= threshold}. Default \code{0.25}. | ||
| #' @param q.signal Quantile threshold (0--1) for signal variable selection; | ||
| #' passed to \code{sdependent()}. Default \code{0.75}. | ||
| #' @param directed Logical; \code{TRUE} (default) builds a directed igraph. | ||
| #' @param min.degree Integer or \code{NULL}. When non-\code{NULL}, only nodes | ||
| #' with degree \eqn{\ge} \code{min.degree} are retained in \code{$nodes}, | ||
| #' \code{$edges}, and \code{$graph}. | ||
| #' @param ... Additional arguments forwarded to \code{varPro::sdependent()}. | ||
| #' | ||
| #' @return A named list of class \code{"gg_udependent"} with elements: | ||
| #' \describe{ | ||
| #' \item{\code{$edges}}{Data frame: \code{variable_from}, \code{variable_to}, | ||
| #' \code{weight} (raw cross-importance value).} | ||
| #' \item{\code{$nodes}}{Data frame: \code{variable} (factor, levels by | ||
| #' descending degree), \code{degree} (integer; out-degree when | ||
| #' \code{directed = TRUE}, total degree when \code{directed = FALSE}), | ||
| #' \code{selected} (logical, \code{TRUE} if in \code{sdependent}'s | ||
| #' signal set).} | ||
| #' \item{\code{$graph}}{igraph object. \code{NULL} if no dependencies | ||
| #' detected.} | ||
| #' } | ||
| #' A \code{"provenance"} attribute carries \code{threshold}, \code{q.signal}, | ||
| #' \code{directed}, \code{min.degree}, \code{xvar.names}, and \code{n}. | ||
| #' | ||
| #' @seealso \code{\link{plot.gg_udependent}} | ||
| #' | ||
| #' @examples | ||
| #' \donttest{ | ||
| #' set.seed(42) | ||
| #' uv <- varPro::uvarpro(iris[, -5], ntree = 50) | ||
| #' gg <- gg_udependent(uv) | ||
| #' print(gg) | ||
| #' } | ||
| #' | ||
| #' @importFrom varPro get.beta.entropy sdependent | ||
| #' @importFrom igraph graph_from_adjacency_matrix degree delete_vertices as_data_frame V | ||
| #' @export | ||
| gg_udependent <- function(object, | ||
| threshold = 0.25, | ||
| q.signal = 0.75, | ||
| directed = TRUE, | ||
| min.degree = NULL, | ||
| ...) { | ||
| .validate_udep_inputs(object, threshold, directed) | ||
|
|
||
| ## ---- Compute cross-variable dependency matrix ---------------------------- | ||
| imp_mat <- varPro::get.beta.entropy(object) | ||
|
|
||
| ## ---- Helper: build and return an empty gg_udependent result --------------- | ||
| .empty_result <- function(msg) { | ||
| warning("gg_udependent: ", msg, | ||
| "\nReturning empty structure. Consider lowering threshold.", | ||
| call. = FALSE) | ||
| empty_edges <- data.frame(variable_from = character(0), | ||
| variable_to = character(0), | ||
| weight = numeric(0), | ||
| stringsAsFactors = FALSE) | ||
| empty_nodes <- data.frame(variable = factor(character(0)), | ||
| degree = integer(0), | ||
| selected = logical(0), | ||
| stringsAsFactors = FALSE) | ||
| result <- structure( | ||
| list(edges = empty_edges, nodes = empty_nodes, graph = NULL), | ||
| class = c("gg_udependent", "list") | ||
| ) | ||
| attr(result, "provenance") <- .udep_provenance(object, threshold, q.signal, | ||
| directed, min.degree) | ||
| result | ||
| } | ||
|
|
||
| ## ---- Build adjacency from threshold; short-circuit if empty -------------- | ||
| adj_mat <- (imp_mat >= threshold) * 1 | ||
| diag(adj_mat) <- 0 | ||
| if (sum(adj_mat) == 0L) { | ||
| return(.empty_result( | ||
| paste0("no edges found at threshold=", threshold) | ||
| )) | ||
| } | ||
|
|
||
| ## ---- Call sdependent for signal detection -------------------------------- | ||
| sdep <- varPro::sdependent(imp_mat, threshold = threshold, | ||
| q.signal = q.signal, directed = directed, | ||
| min.degree = min.degree, plot = FALSE, ...) | ||
|
|
||
| ## ---- Handle empty graph (sdependent may also return character) ----------- | ||
| if (is.character(sdep)) { | ||
| return(.empty_result(sdep)) | ||
| } | ||
|
|
||
| ## ---- Build igraph from adjacency ----------------------------------------- | ||
| ## For undirected, symmetrise first so edge existence = max(I[i,j], I[j,i]) | ||
| ## and mode = "undirected" is valid (igraph >= 1.6.0 requires symmetry). | ||
| if (!isTRUE(directed)) { | ||
| adj_mat <- pmax(adj_mat, t(adj_mat)) | ||
| } | ||
| g <- igraph::graph_from_adjacency_matrix( | ||
| adj_mat, | ||
| mode = if (isTRUE(directed)) "directed" else "undirected", | ||
| diag = FALSE | ||
| ) | ||
| isolated <- igraph::degree(g, mode = "all") == 0 | ||
| g <- igraph::delete_vertices(g, which(isolated)) | ||
|
|
||
| ## ---- Build tidy edge data frame with raw weights ------------------------- | ||
| edge_df <- igraph::as_data_frame(g, what = "edges") | ||
| if (nrow(edge_df) > 0L) { | ||
| if (isTRUE(directed)) { | ||
| edge_df$weight <- mapply(function(i, j) imp_mat[i, j], | ||
| edge_df[[1L]], edge_df[[2L]]) | ||
| } else { | ||
| ## Undirected: weight = max of both directions | ||
| edge_df$weight <- mapply( | ||
| function(i, j) max(imp_mat[i, j], imp_mat[j, i]), | ||
| edge_df[[1L]], edge_df[[2L]]) | ||
| } | ||
| } else { | ||
| edge_df$weight <- numeric(0) | ||
| } | ||
| names(edge_df)[1:2] <- c("variable_from", "variable_to") | ||
|
|
||
| ## ---- Build tidy node data frame ------------------------------------------ | ||
| vnames <- igraph::V(g)$name | ||
| ## degree: out-degree for directed (matches sdependent's signal.vars logic), | ||
| ## total degree for undirected | ||
| deg_vec <- if (isTRUE(directed)) { | ||
| igraph::degree(g, mode = "out")[vnames] | ||
| } else { | ||
| igraph::degree(g)[vnames] | ||
| } | ||
|
|
||
| signal_set <- if (is.null(sdep$signal.vars)) character(0) else sdep$signal.vars | ||
| node_df <- data.frame( | ||
| variable = factor(vnames, levels = vnames[order(-deg_vec)]), | ||
| degree = as.integer(deg_vec), | ||
| selected = vnames %in% signal_set, | ||
| stringsAsFactors = FALSE, | ||
| row.names = NULL | ||
| ) | ||
|
|
||
| ## ---- Apply min.degree node filtering (user-requested subsetting) --------- | ||
| if (!is.null(min.degree)) { | ||
| keep <- node_df$degree >= min.degree | ||
| keep_names <- as.character(node_df$variable)[keep] | ||
| drop_names <- as.character(node_df$variable)[!keep] | ||
| g <- igraph::delete_vertices(g, drop_names) | ||
| edge_df <- edge_df[ | ||
| edge_df$variable_from %in% keep_names & | ||
| edge_df$variable_to %in% keep_names, , drop = FALSE] | ||
| node_df <- node_df[keep, , drop = FALSE] | ||
| rownames(edge_df) <- NULL | ||
| rownames(node_df) <- NULL | ||
| } | ||
|
|
||
| ## ---- Set igraph node attributes ------------------------------------------ | ||
| if (length(igraph::V(g)) > 0L) { | ||
| igraph::V(g)$degree <- node_df$degree[ | ||
| match(igraph::V(g)$name, as.character(node_df$variable))] | ||
| igraph::V(g)$selected <- node_df$selected[ | ||
| match(igraph::V(g)$name, as.character(node_df$variable))] | ||
| } | ||
|
|
||
| ## ---- Set igraph edge weights (order-insensitive for undirected) ----------- | ||
| if (length(igraph::E(g)) > 0L && nrow(edge_df) > 0L) { | ||
| el <- igraph::as_data_frame(g, what = "edges") | ||
| if (isTRUE(directed)) { | ||
| idx <- match(paste(el$from, el$to), | ||
| paste(edge_df$variable_from, edge_df$variable_to)) | ||
| } else { | ||
| key_g <- paste(pmin(el$from, el$to), pmax(el$from, el$to)) | ||
| key_e <- paste(pmin(edge_df$variable_from, edge_df$variable_to), | ||
| pmax(edge_df$variable_from, edge_df$variable_to)) | ||
| idx <- match(key_g, key_e) | ||
| } | ||
| igraph::E(g)$weight <- edge_df$weight[idx] | ||
| } | ||
|
|
||
| ## ---- Assemble result ------------------------------------------------------ | ||
| result <- structure( | ||
| list(edges = edge_df, nodes = node_df, graph = g), | ||
| class = c("gg_udependent", "list") | ||
| ) | ||
| attr(result, "provenance") <- .udep_provenance(object, threshold, q.signal, | ||
| directed, min.degree) | ||
| result | ||
| } | ||
|
|
||
| ## ---- Internal helpers ------------------------------------------------------- | ||
|
|
||
| #' @keywords internal | ||
| .validate_udep_inputs <- function(object, threshold, directed) { | ||
| if (missing(object) || is.null(object)) { | ||
| stop("'object' must be a fitted uvarpro object.", call. = FALSE) | ||
| } | ||
| if (!inherits(object, "uvarpro")) { | ||
| stop("'object' must be a uvarpro fit (class \"uvarpro\").", call. = FALSE) | ||
| } | ||
| if (!is.numeric(threshold) || length(threshold) != 1L || threshold <= 0) { | ||
| stop("'threshold' must be a single positive numeric value.", call. = FALSE) | ||
| } | ||
| if (!is.logical(directed) || length(directed) != 1L) { | ||
| stop("'directed' must be a single logical value.", call. = FALSE) | ||
| } | ||
| invisible(NULL) | ||
| } | ||
|
|
||
| #' @keywords internal | ||
| .udep_provenance <- function(object, threshold, q.signal, directed, min.degree) { | ||
| list( | ||
| threshold = threshold, | ||
| q.signal = q.signal, | ||
| directed = directed, | ||
| min.degree = min.degree, | ||
| xvar.names = object$xvar.names, | ||
| n = nrow(object$x) | ||
| ) | ||
| } | ||
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.