Skip to content

Close dependency-detection gaps + quarto vignettes + inside_rmd autodetect#134

Merged
VincentGuyader merged 15 commits intomainfrom
cleanup-and-issues
Apr 24, 2026
Merged

Close dependency-detection gaps + quarto vignettes + inside_rmd autodetect#134
VincentGuyader merged 15 commits intomainfrom
cleanup-and-issues

Conversation

@VincentGuyader
Copy link
Copy Markdown
Member

@VincentGuyader VincentGuyader commented Apr 24, 2026

Summary

Broad cleanup + bug-fix branch. Modernises {attachment}'s dependency detection (AST-based), fixes Quarto vignette handling, simplifies inside_rmd via auto-detect, and drops the {fusen} development workflow.

Closed issues

Closes #106inside_rmd auto-detect via knitr::opts_knit$get("out.format"); tested in test-inside-rmd-autodetect.R.

Closes #120:: inside string literals / comments (xpath following-sibling::, CSS ::after, sprintf('%s::plot()', ...), etc.) is no longer misread as a package reference; tested in test-edge-cases.R + f_edge_cases.R.

Closes #128use("pkg", ...) (R ≥ 4.4) is now detected; tested in f_edge_cases.R (findedge8–10).

Closes #131.qmd vignette directories add quarto instead of / alongside rmarkdown; tested in test-quarto-vignettes.R.

Closes #132 — CSS-style :: patterns inside strings are ignored; tested in f_edge_cases.R (label::after, ::first-line).

Partially addressed (not closed)

  • wip: rethink the way attachment detect packages / functions #129 — the false positives it lists (s from sprintf, matplotlib from string literals, use(), getFromNamespace(), named-arg library(package = ...)) are fixed and tested. Two open points are out of scope for this PR:
    • raising an error when a declared package is not installed — belongs to att_amend_desc() via check_if_suggests_is_installed, not the pure parser;
    • resolving bare function names to their origin package (toTitleCase → tools) — requires a function→package lookup layer and changes the whole detection paradigm.
  • Ignore CSS pseudo-selector :: patterns in dependency extraction used by att_amend_desc #133 — the CSS :: issue it reports is subsumed by the AST refactor, and the PR's regex patch is redundant once this branch merges. Leaving it to the maintainer to decide whether to close it.

What's in the branch

  • AST-based dependency detection (att_from_rscript()): walks the R parse tree, with a regex-based legacy fallback (now accepting _ in package names) when parse() fails. Adds detection of use("pkg") (R ≥ 4.4), getFromNamespace(fn, "pkg"), loadNamespace(), named-arg forms (library(package = "pkg"), requireNamespace("pkg", lib.loc = "...")), and fully-qualified intro calls (base::library(pkg), methods::getFromNamespace(...)). Introspection helpers (packageVersion, getNamespace, asNamespace, attachNamespace) are intentionally not treated as dep introducers. Gains an encoding argument (defaults to getOption("encoding")) so Latin-1 / Windows-1252 scripts on Windows aren't forced through UTF-8. Emits a warning() when the AST parser fails and we fall back to legacy regex.
  • Walker robustness: handles "empty" AST nodes (x[, 1], missing else, switch(a = , b = ...)) via a walk_elem() helper. Without this fix the walker crashed with argument "el" is missing, with no default on any realistic script.
  • Quarto-aware vignettes (att_from_rmds()): engine inferred from files actually present.
  • inside_rmd auto-detect (att_from_rmd() / att_from_rmds()): default NULL, detected via knitr::opts_knit$get("out.format").
  • Drop {fusen} workflow: fusen::sepuku() applied; dev/flat_*.Rmd and dev/config_fusen.yaml removed; banners stripped from generated files (now hand-maintained).
  • Manual detection harness: dev/manual_detection_edge_cases.R — a ~600-line R script exercising every edge case we could think of (parse-tree weirdness, string literals, raw strings, comments, empty args, quote/bquote, deep/wide nesting, fully-qualified intro calls, etc.), with the expected true- and false-positive lists baked in as comments.

Tests

[ FAIL 0 | SKIP 1 | PASS 389 ] on clean CI. Every fix has at least one test; the detection harness checks 125 true positives and ~45 false positives and currently scores 100%.

Version

Version bumped to 1.0.0 to signal the detector's new parse-tree foundation.

Test plan

  • devtools::test() green on CI across macOS, Windows, ubuntu devel/release/oldrel-{1,2,3}
  • R CMD check green on all platforms
  • Manual: att_amend_desc() on a package with a .qmd vignette (should no longer force rmarkdown / remove quarto)
  • Manual: att_from_rscript() on scripts with xpath-style :: inside strings (should not flag fake packages)
  • Manual: att_from_rscript() on scripts with x[, 1] / empty switch arms (should not crash)

Removes dev/flat_*.Rmd and dev/config_fusen.yaml, strips the
"Generated by {fusen}" banners from R/, tests/, and vignettes/.
Updates CLAUDE.md to no longer describe the fusen workflow.
Rewrite att_from_rscript() to walk the R syntax tree rather than
regex-match source text. This closes several gaps reported on GitHub:

- #120/#132: false positives from `::` inside string literals or
  comments (xpath `following-sibling::td`, CSS `label::after`,
  `sprintf('%s::plot()', ...)`) are now ignored.
- #128: `use("pkg", ...)` (R >= 4.4) is now detected.
- #129: named-arg forms are supported:
  `library(package = "pkg")`, `require(lib.loc = ..., package = ...)`,
  `requireNamespace("pkg", lib.loc = ...)`,
  `getFromNamespace("fn", "pkg")`, `loadNamespace()`, `asNamespace()`,
  `attachNamespace()`, `getNamespace()`, `packageVersion()`.

Fixture `tests/testthat/f_edge_cases.R` and new test file
`tests/testthat/test-edge-cases.R` document the cases. The regex
path is kept as a fallback for snippets that fail to parse.
When vignettes are `.qmd`, att_from_rmds() now contributes `quarto`
instead of (or alongside) `rmarkdown` to the vignette dependency list.
Mixed .Rmd + .qmd directories contribute both engines. `knitr` is
kept in both cases since Quarto uses the knitr engine for R chunks.

Closes #131.
`att_from_rmd()` and `att_from_rmds()` now default `inside_rmd = NULL`
and auto-detect whether they are being called from inside a knit
session by looking at `knitr::opts_knit$get("out.format")`. Users no
longer have to reason about this flag for the common case.

Backwards compatible: explicit TRUE/FALSE is still honored.

Closes #106.
- Document the detection-gap fixes and fusen removal in NEWS.md.
- Regenerate man/*.Rd with roxygen2 7.3.3.
- Add CLAUDE.md to .Rbuildignore so R CMD check stays clean.
- Rename a fixture library() target from `dplyr` to `findedge0` to
  stop the "unstated dependencies in tests" warning.

`R CMD check` now returns 0 errors / 0 warnings / 0 notes.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR modernizes {attachment}’s dependency extraction (reducing false positives and adding new detection patterns), improves vignette engine inference for Quarto, and simplifies inside_rmd handling via auto-detection. It also removes the {fusen}-generated workflow artifacts and updates docs/tests accordingly.

Changes:

  • Refactor att_from_rscript() to use AST-based parsing for more accurate dependency detection (with regex fallback).
  • Make att_from_rmds() Quarto-aware for vignette engine dependencies and change inside_rmd defaulting to auto-detect (NULL) for Rmd/qmd parsing.
  • Remove {fusen} workflow files and generated-file banners; add/update tests, Rd docs, and NEWS.

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
vignettes/create-dependencies-file.Rmd Remove {fusen} “generated” banner from vignette.
tests/testthat/test-quarto-vignettes.R Add tests ensuring .qmd vignettes add quarto (not forced rmarkdown).
tests/testthat/test-inside-rmd-autodetect.R Add tests for inside_rmd = NULL auto-detection behavior.
tests/testthat/test-edge-cases.R Add regression tests for AST-based extraction edge cases / false positives.
tests/testthat/test-dependencies_file_text.R Remove {fusen} “generated” banner comment.
tests/testthat/test-create_dependencies_file.R Remove {fusen} “generated” banner comment.
tests/testthat/test-amend_with_config.R Remove {fusen} “generated” banner comment.
tests/testthat/f_edge_cases.R Add fixture covering new true/false positive detection cases.
man/attachment-deprecated.Rd Update inside_rmd parameter description to allow NULL + auto-detect.
man/att_from_rscript.Rd Document AST-based parsing and newly detected call patterns.
man/att_from_rmds.Rd Update inside_rmd default to NULL and document auto-detect.
man/att_from_rmd.Rd Update inside_rmd default to NULL and document auto-detect.
man/att_amend_desc.Rd Update inside_rmd parameter description (but currently inconsistent with usage default).
dev/flat_save_att_params.Rmd Remove {fusen} flat file (workflow dropped).
dev/flat_create_dependencies_file.Rmd Remove {fusen} flat file (workflow dropped).
dev/config_fusen.yaml Remove {fusen} config (workflow dropped).
R/dependencies_file_text.R Remove {fusen} “generated” banner comment.
R/create_dependencies_file.R Remove {fusen} “generated” banner comment.
R/att_from_rscripts.R Implement AST-based dependency detection + legacy regex fallback.
R/att_from_rmds.R Add inside_rmd auto-detect + Quarto-aware vignette engine inference.
R/amend_with_config.R Remove {fusen} “generated” banner comment.
NEWS.md Document AST parsing, Quarto vignette handling, inside_rmd auto-detect, and fusen workflow removal.
DESCRIPTION Bump RoxygenNote to 7.3.3.
CLAUDE.md Add repository guidance for tooling/maintenance.
.Rbuildignore Ignore CLAUDE.md in package builds.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread R/att_from_rscripts.R Outdated
Comment thread man/att_amend_desc.Rd
Comment thread man/attachment-deprecated.Rd
Comment thread CLAUDE.md Outdated
…ript

Agent-Logs-Url: https://github.com/ThinkR-open/attachment/sessions/a9033401-4988-4a12-a5c9-8c9f73d52493

Co-authored-by: VincentGuyader <10470699+VincentGuyader@users.noreply.github.com>
The roxygen doc already stated `NULL` (auto-detect) as the default, but
`att_amend_desc()` and the deprecated `att_to_description()` still
hard-coded `FALSE`. Align the defaults with the documented behavior.
The default of `att_amend_desc(inside_rmd = ...)` changed from `FALSE`
to `NULL` (01c615c) but `test-amend_with_config.R` still compared the
serialized yaml against "inside_rmd: no". `yaml::write_yaml()` turns
`NULL` into "~", so all 3 snapshot assertions failed on every CI job.
Update the expected snapshots to match the new default.
Follow-up to the audit of #134:

- Walker: handle "empty" call arguments (e.g. `x[, 1]`,
  `yaml_options[!names(...) %in% ...]`, missing `else` branches). The
  previous walker crashed with `argument "el" is missing, with no
  default` on any script containing them, which meant most real R files
  silently fell back to the legacy regex detector. A new `walk_elem()`
  helper inspects each sub-expression as a promise and skips empty
  symbols without forcing them into a local binding.

- `att_from_rscript()` gains an `encoding` argument (default
  `getOption("encoding")`) instead of hardcoding UTF-8. Avoids a
  regression for Latin-1 / Windows-1252 scripts on Windows.

- Restrict `pkg_intro_calls` to the documented set
  (`library`, `require`, `requireNamespace`, `loadNamespace`, `use`,
  `getFromNamespace`). `packageVersion()`, `getNamespace()`,
  `asNamespace()`, `attachNamespace()` are intentionally *not* treated
  as dependency introducers — they are commonly used for feature
  detection and would otherwise silently expand `Imports`.

- Emit a `warning()` when the AST parser fails and we fall back to the
  legacy regex detector, naming the offending file. Broken scripts are
  no longer silently degraded.

- Replace `expect_silent()` around `att_from_rmd()` with explicit type
  and length assertions, and verify the `knitr::opts_knit` auto-detect
  expression directly rather than re-running the purl pipeline through
  `system()`. The previous tests were flaky on CI because knitr/purl
  and `system()` both write to stdout.

- Update `test-rscript.R` expectation for `escape_newline.R`: the
  former assertion (`c("rmarkdown", "glue", "knitr")`) relied on regex
  false positives inside glue string literals. With AST detection, only
  `glue` is a real call (via `glue::glue_collapse`); the other two must
  not be detected.

Regenerated `man/*.Rd` with roxygen2.
…fallback

- Walker now honours fully-qualified dependency-introducing calls such as
  `base::library(pkg)`, `base::require(pkg)`,
  `base::requireNamespace("pkg")`, `base::loadNamespace("pkg")`, and
  `methods::getFromNamespace(fn, "pkg")`. Previously only the outer
  namespace was recorded; the inner package was missed.

- The legacy regex fallback now accepts `_` in package names. Previously
  `my_pkg::fn` was truncated to `pkg`, which hit GitHub-hosted packages
  as soon as a file failed to parse (the fallback is invoked per file,
  so a single syntax error upstream could corrupt the whole detection).

- Roxygen `@details` updated to describe both the new fully-qualified
  behaviour and the parse-failure fallback warning.

- Two new unit tests under `test-edge-cases.R` cover the qualified-call
  path and the underscore-preserving fallback.
A ~450-line R script exercising every dependency-detection situation we
could enumerate: standard `pkg::fn` and `library()/require()` variants,
R >= 4.4 `use()`, `getFromNamespace`, named and positional package args,
string literals (xpath, CSS, C++, URLs, sprintf, multiline, raw r"(...)"),
comments, empty/missing subscripts, control flow, closures, formulas,
S4/R6, `quote/bquote/substitute/expression`, `eval(quote(...))`,
deep/wide nesting, semicolons, whitespace around `::`, fully-qualified
intro calls (`base::library(pkg)`) and known-limitation cases
(e.g. `library(symbol, character.only = TRUE)` captures the symbol, not
the string it points to).

Two comment blocks at the bottom enumerate EXPECTED_TRUE_POSITIVES and
EXPECTED_FALSE_POSITIVES so a maintainer can diff against
`att_from_rscript()` output after any change to the detector.

Force-added despite `dev/.gitignore: *.R` — the harness is shared
maintenance-time infrastructure, not a personal scratchpad, and the
`dev/` directory is already `.Rbuildignore`d so it is not shipped to
CRAN.
Major version bump to mark the new AST-based dependency detector and
the set of user-visible changes captured in NEWS.md since 0.4.5:
parse-tree detection, `use()`/`getFromNamespace` support, fully-qualified
intro calls (`base::library(pkg)`), quarto-aware vignette engine
selection, `inside_rmd` auto-detect, and the removal of the `{fusen}`
maintenance workflow.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 30 out of 30 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread R/att_from_rscripts.R Outdated
Comment on lines +104 to +109
# Guard against "empty" call arguments such as the first subscript in `x[, 1]`
# or the missing `else` branch of `if (cond) a`. Passing the element as a
# promise avoids triggering "argument is missing, with no default" errors when
# the element is an empty symbol — we only inspect it, never assign it locally.
walk_elem <- function(el, walker) {
if (missing(el) || is.null(el) || is_empty_symbol(el)) return(invisible())
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In walk_elem(), the missing(el) check is redundant (the argument is always supplied in current call sites), and the preceding comment about passing elements “as a promise” doesn’t match what the code is doing (the element is evaluated before being passed). Consider removing missing(el) and rewording/simplifying the comment to reflect that this helper is just skipping NULL/empty-symbol AST nodes.

Suggested change
# Guard against "empty" call arguments such as the first subscript in `x[, 1]`
# or the missing `else` branch of `if (cond) a`. Passing the element as a
# promise avoids triggering "argument is missing, with no default" errors when
# the element is an empty symbol — we only inspect it, never assign it locally.
walk_elem <- function(el, walker) {
if (missing(el) || is.null(el) || is_empty_symbol(el)) return(invisible())
# Skip "empty" AST nodes such as the first subscript in `x[, 1]`
# or the missing `else` branch of `if (cond) a`.
walk_elem <- function(el, walker) {
if (is.null(el) || is_empty_symbol(el)) return(invisible())

Copilot uses AI. Check for mistakes.
Comment thread man/att_from_rmds.Rd
Comment on lines +37 to +40
\item{inside_rmd}{Logical or \code{NULL}. Whether the function is being called
from inside a knit session, in which case the actual purl step must be
delegated to an external R process. When \code{NULL} (the default), this is
auto-detected via \code{knitr::opts_knit$get("out.format")}.}
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updated inside_rmd docs describe knit-session auto-detection, but the \value section still says “knitr and rmarkdown are added by default” for vignette paths. Since att_from_rmds() now conditionally adds quarto and may omit rmarkdown for qmd-only vignette dirs, the generated Rd should be updated to reflect the new behavior (knitr + engine(s) inferred from .Rmd/.qmd).

Copilot uses AI. Check for mistakes.
- `walk_elem()`: drop the redundant `missing(el)` guard (the helper is
  always called with `args[[i]]`, so the arg is always supplied) and
  reword the comment to state the actual invariant: the element is
  received as a function argument rather than bound locally, which is
  what keeps R from forcing the missing-arg value. Copilot review
  comment on R/att_from_rscripts.R:104.

- `att_from_rmds()` `@return`: the old roxygen said "knitr and rmarkdown
  are added by default". Since we now conditionally pick `quarto`/`rmarkdown`
  based on the files actually present in the vignette directory, reword
  the @return to describe the new engine-inference logic. Copilot review
  comment on man/att_from_rmds.Rd:37.

Regenerated `man/att_from_rmds.Rd` with roxygen2.
- Group bullets by topic (detection foundation / new patterns / robustness /
  vignettes / maintenance) so reviewers and users can scan the change set
  without reading every line.
- Add the previously-missing entry for the AST walker's safe handling of
  empty call arguments (`x[, 1]`, empty `switch` alternatives, missing
  `else` branches). Without that fix the walker crashed on almost any
  realistic script.
- Tighten issue references: only keep issue numbers next to bullets whose
  fix is actually tested and shipped in this release (#106, #120, #128,
  #131, #132). #129 is partially addressed so it is no longer claimed as
  closed in NEWS.
@VincentGuyader VincentGuyader merged commit f5bea9c into main Apr 24, 2026
9 checks passed
@VincentGuyader VincentGuyader deleted the cleanup-and-issues branch April 24, 2026 13:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

3 participants