GitHub Actions mermaid diagram #52
Pull request overview
Adds documentation and tooling to visualize the repo’s GitHub Actions CI/CD pipeline as a Mermaid diagram, surfaced as a new pkgdown article and supported by a new exported helper.
Changes:
- Introduces `nemo_gha_mermaid()` to generate a Mermaid flowchart from `deploy.yaml` plus reusable workflow YAMLs.
- Adds a new Quarto article (`gha.qmd`) and links it in the pkgdown navbar; adds Mermaid CSS variable overrides in `extra.scss`.
- Small refactor in parse examples/tests/docs to reduce repeated `file.path(...)` construction; adds stronger runtime checks/error handling in `nemo_uml()`.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| R/gha.R | New Mermaid diagram generator for GHA workflow visualization |
| man/nemo_gha_mermaid.Rd | New Rd docs for nemo_gha_mermaid() |
| vignettes/gha.qmd | New pkgdown article rendering the Mermaid diagram |
| pkgdown/_pkgdown.yml | Adds “CI/CD Workflow” article + updates navbar labels + adds function to reference index |
| pkgdown/extra.scss | Adds Mermaid theme CSS variable overrides |
| R/uml.R | Improves dependency checks and plantuml error handling |
| R/parse.R | Refactors roxygen examples to use a small path helper |
| man/parse_file.Rd | Syncs generated docs with updated examples |
| man/parse_file_keyvalue.Rd | Syncs generated docs with updated examples |
| tests/testthat/test-roxytest-testexamples-parse.R | Syncs roxytest-generated tests with updated examples/line numbers |
| NAMESPACE | Exports nemo_gha_mermaid |
```r
  job_chain <- if (length(job_names) > 1) {
    paste0(i2, job_names[-length(job_names)], " --> ", job_names[-1])
  } else {
    character(0)
  }

  paste(
    c(
      "flowchart TD",
      "",
      paste0(i1, "subgraph BUMP [\"\U0001F527 Bump Version\"]"),
      bump_m$lines,
      paste0(i1, "end"),
      "",
      paste0(i1, 'BUMP --> DEPLOY'),
      "",
      paste0(i1, "subgraph DEPLOY [\"\U0001F527 conda-docs\"]"),
      deploy_subgraph_lines,
      "",
      job_chain,
      paste0(i1, "end")
```
The job-to-job edges are rendered as `condarise --> tag` (and `BUMP --> DEPLOY`). `condarise`, `tag`, `BUMP`, and `DEPLOY` are subgraph identifiers here, not node IDs, so Mermaid will typically create separate implicit nodes with those IDs rather than connecting the actual step nodes inside each subgraph. To reflect the real flow, connect the last step node ID of the upstream subgraph to the first step node ID of the downstream subgraph (you already have `m$ids` from `.gha_render_steps`).
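A minimal sketch of that wiring, assuming each job's step node IDs are available as a character vector (for example collected from `m$ids` inside the loop). The helper name `link_jobs()` and the ID values below are hypothetical and not part of the PR.

```r
# Hypothetical stand-in for the per-job step IDs; the real code would collect
# these from the `ids` element returned by .gha_render_steps().
step_ids <- list(
  condarise = c("J1S1", "J1S2", "J1S3"),
  tag = c("J2S1", "J2S2")
)

# Link the last step of each job to the first step of the next job.
link_jobs <- function(step_ids, indent = "    ") {
  # Jobs without any steps cannot be linked, so drop them first.
  step_ids <- Filter(length, step_ids)
  n <- length(step_ids)
  if (n < 2) {
    return(character(0))
  }
  last_ids <- vapply(step_ids[-n], \(x) x[length(x)], character(1))
  first_ids <- vapply(step_ids[-1], \(x) x[1], character(1))
  paste0(indent, last_ids, " --> ", first_ids)
}

link_jobs(step_ids)
```

With the example IDs above this yields a single edge from `J1S3` to `J2S1`; the same idea would apply to the edge between the bump subgraph and the first deploy job.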
```r
  steps <- y$jobs[[1]]$steps
  nms <- steps |> purrr::map_chr("name", .default = "")
```
`.gha_read_steps()` assumes `y$jobs[[1]]$steps` exists. If a workflow file has multiple jobs, or the first job is itself a reusable-workflow call (`uses:`) with no `steps:`, this will error. Consider iterating over all jobs and collecting their step names where present, or at least guarding for missing `steps` and returning a meaningful error/message.
```diff
-  steps <- y$jobs[[1]]$steps
-  nms <- steps |> purrr::map_chr("name", .default = "")
+  jobs <- y$jobs
+  if (is.null(jobs) || !length(jobs)) {
+    return(character(0))
+  }
+  nms <- jobs |>
+    purrr::map(\(job) {
+      steps <- job[["steps"]]
+      if (is.null(steps) || !length(steps)) {
+        return(character(0))
+      }
+      purrr::map_chr(steps, "name", .default = "")
+    }) |>
+    unlist(use.names = FALSE)
```
```r
nemo_gha_mermaid <- function(actions_url, deploy_yaml) {
  dep <- yaml::read_yaml(deploy_yaml)
  jobs <- dep$jobs
  job_names <- names(jobs)
  job_labs <- job_names |>
    purrr::map_chr(\(j) {
      nm <- jobs[[j]][["name"]]
      if (is.null(nm) || is.na(nm)) j else as.character(nm)
    })
  job_wf_files <- job_names |>
    purrr::map_chr(\(j) {
      uses <- jobs[[j]][["uses"]] %||% paste0(j, ".yaml")
      basename(sub("@.*$", "", uses))
    })
  wf_url <- function(x) paste0(actions_url, "/", x)
  bump_steps <- .gha_read_steps(wf_url("bump.yaml"))
  deploy_steps <- purrr::map(job_wf_files, \(f) .gha_read_steps(wf_url(f))) |>
    purrr::set_names(job_names)

  i1 <- " "
  i2 <- " "

  bump_m <- .gha_render_steps(bump_steps, "B", i1)

  deploy_subgraph_lines <- character(0)
  for (i in seq_along(job_names)) {
    j <- job_names[i]
    prefix <- paste0(toupper(substring(j, 1, 1)), "S")
    m <- .gha_render_steps(deploy_steps[[j]], prefix, i2)
    deploy_subgraph_lines <- c(
      deploy_subgraph_lines,
      paste0(i2, "subgraph ", j, ' ["', job_labs[i], '"]'),
      m$lines,
      paste0(i2, "end")
    )
  }

  job_chain <- if (length(job_names) > 1) {
    paste0(i2, job_names[-length(job_names)], " --> ", job_names[-1])
  } else {
    character(0)
  }

  paste(
    c(
      "flowchart TD",
      "",
      paste0(i1, "subgraph BUMP [\"\U0001F527 Bump Version\"]"),
      bump_m$lines,
      paste0(i1, "end"),
      "",
      paste0(i1, 'BUMP --> DEPLOY'),
      "",
      paste0(i1, "subgraph DEPLOY [\"\U0001F527 conda-docs\"]"),
      deploy_subgraph_lines,
      "",
      job_chain,
      paste0(i1, "end")
    ),
    collapse = "\n"
  )
}
```
This is a new exported function, but there are no `@testexamples` (and therefore no roxytest-generated tests) exercising its diagram generation. Since `actions_url` can point to a local directory, you can add a minimal `@testexamples` block that writes small temporary YAML workflow files and asserts key substrings are present in the returned Mermaid text, avoiding any network dependency.
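One way to set up such fixtures is sketched below using base R only. The helper `make_gha_fixture()`, the job names, the step names, and the `uses:` paths are all made up for illustration; only the file naming (`bump.yaml` plus one file per job) follows the function body shown above.

```r
# Hypothetical fixture helper (not part of the PR): writes a tiny deploy.yaml
# plus the reusable workflow files it refers to into a temporary directory.
make_gha_fixture <- function(dir = tempfile("gha_")) {
  dir.create(dir, recursive = TRUE)
  write_wf <- function(file, lines) {
    path <- file.path(dir, file)
    writeLines(lines, path)
    path
  }
  # Minimal single-job, single-step workflow body.
  step_wf <- function(step) {
    c("jobs:", "  main:", "    steps:", paste0("      - name: ", step))
  }
  write_wf("bump.yaml", step_wf("Bump version"))
  write_wf("condarise.yaml", step_wf("Build conda package"))
  write_wf("tag.yaml", step_wf("Create tag"))
  # The `uses:` values are illustrative; only the basename before "@" matters
  # to the file lookup in the reviewed code.
  deploy <- write_wf(
    "deploy.yaml",
    c(
      "jobs:",
      "  condarise:",
      "    uses: example-org/actions/.github/workflows/condarise.yaml@main",
      "  tag:",
      "    uses: example-org/actions/.github/workflows/tag.yaml@main"
    )
  )
  list(actions_url = dir, deploy_yaml = deploy)
}

fx <- make_gha_fixture()
list.files(fx$actions_url)
```

Inside a `@testexamples` block, the result of `nemo_gha_mermaid(fx$actions_url, fx$deploy_yaml)` could then be checked with something like `expect_true(grepl("flowchart TD", x, fixed = TRUE))`, assuming `.gha_read_steps()` accepts local file paths as the comment above suggests.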
```r
#' @param deploy_yaml (`character(1)`)\cr
#' Path to the `.github/workflows/deploy.yaml` file.
#' @param actions_url (`character(1)`)\cr
#' Base URL for raw reusable workflow YAML files in the tidywf/actions repo (see
#' example).
```
The roxygen `@param` order is `deploy_yaml` then `actions_url`, but the function signature is `nemo_gha_mermaid(actions_url, deploy_yaml)`. Keeping these aligned makes the rendered docs easier to read and reduces confusion for users relying on the parameter list order.
```diff
-#' @param deploy_yaml (`character(1)`)\cr
-#' Path to the `.github/workflows/deploy.yaml` file.
-#' @param actions_url (`character(1)`)\cr
-#' Base URL for raw reusable workflow YAML files in the tidywf/actions repo (see
-#' example).
+#' @param actions_url (`character(1)`)\cr
+#' Base URL for raw reusable workflow YAML files in the tidywf/actions repo (see
+#' example).
+#' @param deploy_yaml (`character(1)`)\cr
+#' Path to the `.github/workflows/deploy.yaml` file.
```
```r
diagram <- nemo_gha_mermaid(
  actions_url = actions_url,
  deploy_yaml = here::here(".github/workflows/deploy.yaml")
)
```
This vignette chunk fetches workflow YAML from a remote GitHub raw URL at render time. That makes pkgdown/article builds brittle (network outages/rate limits) and breaks offline rendering. Consider caching the generated diagram (or the YAML inputs) in the repo and falling back to the cached copy when network access is unavailable, or gating the fetch behind an env flag used only in the docs build pipeline.
```diff
-diagram <- nemo_gha_mermaid(
-  actions_url = actions_url,
-  deploy_yaml = here::here(".github/workflows/deploy.yaml")
-)
+cache_file <- here::here("vignettes/gha-mermaid.mmd")
+allow_remote_fetch <- identical(
+  tolower(Sys.getenv("NEMO_DOCS_ALLOW_REMOTE_GHA", "false")),
+  "true"
+)
+diagram <- NULL
+if (allow_remote_fetch) {
+  diagram <- tryCatch(
+    nemo_gha_mermaid(
+      actions_url = actions_url,
+      deploy_yaml = here::here(".github/workflows/deploy.yaml")
+    ),
+    error = function(e) NULL
+  )
+  if (!is.null(diagram)) {
+    dir.create(dirname(cache_file), recursive = TRUE, showWarnings = FALSE)
+    writeLines(diagram, cache_file, useBytes = TRUE)
+  }
+}
+if (is.null(diagram) && file.exists(cache_file)) {
+  diagram <- paste(readLines(cache_file, warn = FALSE), collapse = "\n")
+}
+if (is.null(diagram)) {
+  stop(
+    "No cached Mermaid diagram found at ", cache_file,
+    ". Set NEMO_DOCS_ALLOW_REMOTE_GHA=true in the docs build pipeline ",
+    "to refresh the cache."
+  )
+}
```
```r
  deploy_subgraph_lines <- character(0)
  for (i in seq_along(job_names)) {
    j <- job_names[i]
    prefix <- paste0(toupper(substring(j, 1, 1)), "S")
```
Step node IDs are generated from only the first letter of the job name (e.g., "condarise" -> "CS1"). If two jobs start with the same letter, Mermaid node IDs will collide across subgraphs and the diagram will render incorrectly. Use a unique prefix per job (e.g., include the full job id, or include the job index) to guarantee uniqueness.
```diff
-    prefix <- paste0(toupper(substring(j, 1, 1)), "S")
+    prefix <- paste0("J", i, "S")
```
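The collision can be reproduced in isolation. The sketch below compares the two prefix schemes for a pair of hypothetical job IDs that share a first letter; the job names are assumptions chosen only to trigger the clash.

```r
# Hypothetical job ids that happen to start with the same letter.
job_names <- c("condarise", "conda-docs")

# Prefix from the first letter only, as in the reviewed code.
letter_prefix <- paste0(toupper(substring(job_names, 1, 1)), "S")

# Prefix from the job index, as in the suggested change.
index_prefix <- paste0("J", seq_along(job_names), "S")

anyDuplicated(letter_prefix) > 0  # TRUE: both jobs get "CS"
anyDuplicated(index_prefix) > 0   # FALSE: "J1S" and "J2S"
```

Index-based prefixes stay unique regardless of how the jobs are named, at the cost of node IDs that no longer hint at the job they belong to.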
So that I can start keeping some sort of track:
